Index lookup method, system, electronic device and readable storage medium

By encoding the index into a binary search tree data structure and employing the binary search method, the problem of low index lookup performance in KV storage is solved, and the efficiency of index lookup is improved.

CN116401412B · Active · Publication Date: 2026-06-19 · ALIBABA (CHINA) CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: ALIBABA (CHINA) CO LTD
Filing Date: 2023-03-10
Publication Date: 2026-06-19

Abstract

This application provides an index lookup method, system, electronic device, and readable storage medium. The method includes obtaining a target index sequence containing an index to be searched; at least a portion of the indexes in the target index sequence are string type indexes; encoding the index to be searched into a data structure of the index stored in a node of a binary search tree corresponding to the target index sequence, and determining a first value corresponding to the encoded data structure; and using a binary search method to match the first value with a second value corresponding to the data structure of the node in the binary search tree to search for the index to be searched in the binary search tree.

Description

Technical Field

[0001] This application relates to the field of data storage technology, and in particular to an index lookup method, apparatus, electronic device, and readable storage medium.

Background Technology

[0002] In the field of data storage technology, key-value (KV) storage is a commonly used method. KV storage organizes data in the form of key-value pairs, using the key as an index to implement data storage, modification, querying, and deletion functions.

[0003] In KV storage, the performance of create, read, update, and delete (CRUD) operations relies heavily on index lookups. To improve lookup performance, various algorithms and data structures have been designed, such as binary trees, B-trees, B+ trees, and Bw trees. Even with these advances, index lookup remains the most time-consuming operation and calls for further optimization.

Summary of the Invention

[0004] This application provides an index lookup method, the method comprising:

[0005] Obtain a target index sequence containing the index to be searched; at least a portion of the indexes in the target index sequence are string type indices.

[0006] The index to be searched is encoded into a data structure of the index stored in a node of a binary search tree corresponding to the target index sequence, and a first value corresponding to the encoded data structure is determined; wherein, the nodes of the binary search tree corresponding to the target index sequence are respectively used to store each index contained in the target index sequence; the data structure of the index stored in the node of the binary search tree includes several bits for filling at least part of the index content of the index stored in the node, and several bits for filling the length of the common character prefix of the index stored in the node relative to the index stored in the parent node corresponding to the node;

[0007] A binary search method is used to match the first value with the second value corresponding to the data structure of the node in the binary search tree, so as to find the index to be searched in the binary search tree.

[0008] Optionally, the database maintains an index tree for storing data;

[0009] The step of obtaining the target index sequence containing the index to be searched includes:

[0010] The target index sequence containing the index to be searched is found from the index tree maintained by the database for storing data using a binary search method.

[0011] Optionally, before encoding the index to be searched into a data structure of indexes stored in nodes of a binary search tree corresponding to the target index sequence, the method further includes:

[0012] Construct a binary search tree corresponding to the obtained target index sequence in real time;

[0013] Calculate the length of the common character prefix of the index stored in the node on the binary search tree relative to the index stored in the parent node corresponding to that node;

[0014] Based on the calculated length of the common character prefix and the index stored in the node of the binary search tree, the indexes stored in the node of the binary search tree are encoded into the data structure according to the index content corresponding to the part after the common character prefix of the index stored in the parent node of the node.

[0015] Optionally, at least one bit in the data structure is a preset first flag bit; the first flag bit is used to indicate the bit in the data structure used to fill the length of the common character prefix;

[0016] Based on the calculated length of the common character prefix and the index stored in the node of the binary search tree, the indexes stored in the node of the binary search tree are encoded into the data structure corresponding to the partial index content after the common prefix relative to the index stored in the parent node of that node, including:

[0017] Based on the first flag bit, determine the bit position in the data structure to fill the length of the common character prefix, and fill the calculated length of the common character prefix into the determined bit position; and store the index content corresponding to the part after the common character prefix of the index stored by the node on the binary search tree relative to the index stored by the parent node of the node into the remaining bit positions other than the determined bit position.

[0018] Optionally, the value corresponding to the first flag bit includes a first value and a second value; wherein, if the length of the common character prefix does not exceed a preset threshold, the value of the first flag bit is the first value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is N and the positions of the N bits in the data structure; if the length of the common character prefix exceeds the preset threshold, the value of the first flag bit is the second value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is M and the positions of the M bits in the data structure; the value of M is greater than the value of N.

[0019] Optionally, the indexes in the target index sequence include composite indexes; the composite index is an index composed of at least two indexes, including an index of the string type.

[0020] Optionally, at least two bits in the data structure are preset second flag bits; the second flag bits are used to mark the index identifier of the composite index stored in the node on the binary search tree, relative to the index identifier of the common index of the composite index stored in the parent node corresponding to the node.

[0021] Based on the calculated length of the common character prefix and the index stored in the node of the binary search tree, the indexes stored in the node of the binary search tree are encoded into the data structure relative to the index of the parent node of that node, after the common character prefix. This includes:

[0022] Determine the composite index stored in the node on the binary search tree, the index identifier of the common index of the composite index stored relative to the parent node corresponding to that node, and at least one non-common index;

[0023] Based on the index identifier of the common index, the length of the common character prefix of the non-common index of the node on the binary search tree relative to the parent node of the node, and the index content corresponding to the part after that common character prefix, the indexes stored in the nodes on the binary search tree are respectively encoded into the data structure.

[0024] Optionally, the data structure includes 64 bits; the 64 bits include 4 bits for representing the first flag bit and the second flag bit; the first flag bit corresponds to 1 bit of the 4 bits; the second flag bit corresponds to 3 bits of the 4 bits.

[0025] Optionally, the high 4 bits or the low 4 bits of the 64-bit array are used to represent the first flag bit and the second flag bit.

[0026] Optionally, the preset threshold is 4096; the value of N is 12; and the value of M is 20.

[0027] Wherein, if the length of the common character prefix does not exceed 4096, the value of the first flag bit is 1, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 12, and the 12 bits are either the lower 12 bits or the higher 12 bits of the 64 bits; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 48, and the 48 bits are either the lower 12-59 bits or the higher 12-59 bits of the 64 bits;

[0028] If the length of the common character prefix exceeds 4096, the value of the first flag bit is 0, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 20; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 40, and the 40 bits are the lower 20-59 bits or the higher 20-59 bits of the 64 bits.
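The 64-bit layout described in [0024] to [0028] can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of this application: it assumes the 4 flag bits occupy the high 4 bits (bits 60-63), with the first flag bit topmost, the length field occupies the lowest 12 or 20 bits, and the content field occupies the bits in between (bits 12-59 or 20-59). The names `pack` and `unpack` are hypothetical. Because a 12-bit field holds values up to 4095, the sketch uses the short form only when the length fits in 12 bits; how a length of exactly 4096 is handled is an assumption.

```python
SHORT_LEN_BITS = 12   # N in the text
LONG_LEN_BITS = 20    # M in the text
PAYLOAD_BITS = 60     # 64 bits minus the 4 flag bits (assumed to be bits 60-63)


def pack(prefix_len: int, content: int, second_flag: int = 0) -> int:
    """Pack flags, common-prefix length, and content bits into one 64-bit value."""
    if not 0 <= second_flag < 8:
        raise ValueError("second flag must fit in 3 bits")
    short = 0 <= prefix_len < (1 << SHORT_LEN_BITS)
    len_bits = SHORT_LEN_BITS if short else LONG_LEN_BITS
    content_bits = PAYLOAD_BITS - len_bits          # 48 or 40
    if not 0 <= prefix_len < (1 << len_bits):
        raise ValueError("prefix length does not fit in the length field")
    if not 0 <= content < (1 << content_bits):
        raise ValueError("content does not fit in the content field")
    first_flag = 1 if short else 0                  # 1 selects the 12-bit length field
    flags = (first_flag << 3) | second_flag
    return (flags << PAYLOAD_BITS) | (content << len_bits) | prefix_len


def unpack(value: int):
    """Return (first_flag, second_flag, prefix_len, content) from a packed value."""
    flags = value >> PAYLOAD_BITS
    first_flag = flags >> 3
    second_flag = flags & 0b111
    len_bits = SHORT_LEN_BITS if first_flag == 1 else LONG_LEN_BITS
    prefix_len = value & ((1 << len_bits) - 1)
    content = (value >> len_bits) & ((1 << (PAYLOAD_BITS - len_bits)) - 1)
    return first_flag, second_flag, prefix_len, content
```

Reading the first flag bit before anything else is what lets a decoder know whether the length field is 12 or 20 bits wide, and therefore where the content field starts.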

[0029] This application also provides an index lookup device, the device comprising:

[0030] An index sequence acquisition unit is used to acquire a target index sequence containing the index to be searched; at least a portion of the indexes in the target index sequence are string type indices.

[0031] The lookup index encoding unit is used to encode the lookup index into a data structure of the index stored in a node of a binary search tree corresponding to the target index sequence, and to determine a first value corresponding to the encoded data structure; wherein, the nodes of the binary search tree corresponding to the target index sequence are used to store each index contained in the target index sequence; the data structure of the index stored in the node of the binary search tree includes several bits for filling at least a portion of the index content of the index stored in the node, and several bits for filling the length of the common character prefix of the index stored in the node relative to the index stored in the parent node corresponding to the node;

[0032] The index matching unit is used to match the first value with the second value corresponding to the data structure of the node on the binary search tree using a binary search method, so as to find the index to be searched on the binary search tree.

[0033] This application also provides an index lookup system that performs the above-described method.

[0034] This application also provides an electronic device, including a communication interface, a processor, a memory, and a bus, wherein the communication interface, the processor, and the memory are interconnected via the bus;

[0035] The memory stores machine-readable instructions, and the processor executes the above method by invoking the machine-readable instructions.

[0036] This application also provides a computer-readable storage medium storing machine-readable instructions that, when called and executed by a processor, implement the above-described method.

[0037] In the above embodiments, the index content in the target index sequence is converted into a preset data structure, and the traditional string matching process is converted into a value comparison process. This reduces both the storage overhead of the index and the number of index lookup operations, and improves the efficiency of index lookup.

Attached Figure Description

[0038] Figure 1 is a schematic diagram illustrating the architecture of an index lookup method in an exemplary embodiment.

[0039] Figure 2 is a schematic flowchart illustrating an index lookup method in an exemplary embodiment.

[0040] Figure 3 is a schematic diagram of a binary search tree corresponding to a target index sequence, as shown in an exemplary embodiment.

[0041] Figure 4 is a schematic diagram of a binary search tree corresponding to another target index sequence, as shown in an exemplary embodiment.

[0042] Figure 5 is a schematic diagram of a binary search tree corresponding to a target index sequence containing a composite index, as shown in an exemplary embodiment.

[0043] Figure 6 is a schematic diagram of the hardware structure of an electronic device in which an index lookup device is located, as shown in an exemplary embodiment.

[0044] Figure 7 is a block diagram illustrating an index lookup device in an exemplary embodiment.

Detailed Implementation

[0045] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0046] It should be noted that the steps of the corresponding methods are not necessarily performed in the order shown and described in this specification in other embodiments. In some other embodiments, the methods may include more or fewer steps than described in this specification. Furthermore, a single step described in this specification may be broken down into multiple steps in other embodiments; and multiple steps described in this specification may be combined into a single step in other embodiments.

[0047] Figure 1 is a schematic diagram of the architecture of an index lookup method provided in an exemplary embodiment. As shown in Figure 1, the system may include a network 10, a server 11, and several clients, such as desktop computers 12, laptops 13, and mobile phones 14.

[0048] Server 11 can be a physical server containing an independent host, or a virtual server, cloud server, etc., hosted by a host cluster. Clients 12-14 are just some of the types of electronic devices that users can use. Users can also use electronic devices such as tablets, laptops, PDAs (Personal Digital Assistants), and wearable devices (such as smart glasses and smartwatches), and one or more embodiments in this specification do not limit this. Network 10 can include various types of wired or wireless networks.

[0049] In one embodiment, the server 11 can cooperate with clients 12-14: clients 12-14 obtain the index to be searched and the target index sequence containing it, and upload both to the server 11 through network 10; the server 11 then performs the search based on the index lookup method of this specification. In another embodiment, clients 12-14 can implement the index lookup method of this specification independently: clients 12-14 obtain the index to be searched and the target index sequence containing it, and perform the search themselves.

[0050] The index lookup scheme of this specification will be described in detail below with reference to the accompanying drawings.

[0051] Figure 2 is a flowchart illustrating an index lookup method provided in an exemplary embodiment. As shown in Figure 2, the method may include the following steps:

[0052] Step 202: Obtain a target index sequence containing the index to be searched; at least a portion of the indexes in the target index sequence are string type indexes.

[0053] Step 204: Encode the index to be searched into a data structure of the index stored in a node of a binary search tree corresponding to the target index sequence, and determine the first value corresponding to the encoded data structure; wherein, the nodes of the binary search tree corresponding to the target index sequence are used to store each index contained in the target index sequence; the data structure of the index stored in the node of the binary search tree includes several bits for filling at least part of the index content of the index stored in the node, and several bits for filling the length of the common character prefix of the index stored in the node relative to the index stored in the parent node corresponding to the node.

[0054] Step 206: Using a binary search method, the first value is matched with the second value corresponding to the data structure of the node on the binary search tree, so as to find the index to be searched on the binary search tree.

[0055] In this specification, the target index sequence can be an index sequence containing the index to be searched, wherein at least some of the indices in the target index sequence are string type indices. In practical applications, the indices in the target index sequence may all be string type, or they may include both integer and string type indices; this specification does not impose specific limitations. The method for obtaining the target index sequence is not specifically limited in this specification. For example, the target index sequence can be obtained through user input, or it can be automatically generated by the database system detecting user operations on the database, etc.

[0056] The index to be searched is the index that needs to be matched. This specification does not limit how the index to be searched is obtained. For example, it can be obtained through user input, detected automatically by the database system from user operations on the database, or derived by the database system from user-entered SQL.

[0057] It should be noted that, in this specification, the target index sequence is a sorted index sequence. In practical applications, key-value storage sorts by key value, allowing for fast searching using binary search or other methods.

[0058] For example, “ABCDEFG, ABCDXY, ABCDXYZ, ABCOXYA, ABCOXYY, ABCOXYZ, ABCXYZ” can be used as a target index sequence. The user needs to search for the index “ABCOXYZ” in the target index sequence. “ABCOXYZ” can be used as the index to be searched.
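For reference, a plain binary search over the sorted example sequence can be sketched as follows. It compares strings directly, which is the baseline that the value-based matching described later aims to speed up. The helper name `find_index` is hypothetical, and the use of Python's `bisect` module is an illustrative choice.

```python
from bisect import bisect_left

# The sorted target index sequence from the example above.
target_index_sequence = [
    "ABCDEFG", "ABCDXY", "ABCDXYZ", "ABCOXYA", "ABCOXYY", "ABCOXYZ", "ABCXYZ",
]


def find_index(sequence, key):
    """Return the position of key in the sorted sequence, or -1 if absent."""
    pos = bisect_left(sequence, key)
    if pos < len(sequence) and sequence[pos] == key:
        return pos
    return -1


position = find_index(target_index_sequence, "ABCOXYZ")
```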

[0059] In one implementation, the database maintains an index tree for storing data. This index tree is a storage structure designed for key-value (KV) storage. In practical applications, KV storage typically requires a large amount of space, thus necessitating a proper design for data storage; otherwise, the efficiency of searching, deleting, and inserting data would be very low. The index tree is a core data structure design that improves performance by reducing the amount of data to be operated on.

[0060] In this specification, a binary search can also be used to find the target index sequence containing the index to be searched from the index tree maintained by the database for storing data.

[0061] In practical applications, the index tree can include B-trees, B+ trees, Bw trees, and other index trees. The non-leaf nodes of these index trees all store continuous index sequences. When actually matching the index to be searched, a binary search is performed from the root of the index tree to the leaf nodes. The index sequence stored in the non-leaf nodes containing the index to be searched can then be the target index sequence.

[0062] In this specification, after obtaining the target index sequence, a binary search tree can be further constructed based on the target index sequence, or the target index sequence that has already been constructed into the binary search tree can be directly obtained. No specific limitation is made in this specification.

[0063] A binary search tree, also known as an ordered binary tree or a binary sort tree, is a binary tree in which: if the left subtree of any node is not empty, the values of all nodes in the left subtree are not greater than the value of that node; if the right subtree of any node is not empty, the values of all nodes in the right subtree are not less than the value of that node; and the left and right subtrees of any node are themselves binary search trees.

[0064] In this specification, the specific method for constructing a binary search tree from the target index sequence may follow related technologies and is not specifically limited.
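As one possible construction, a balanced binary search tree can be built from the sorted sequence by recursively taking the middle element as the subtree root. The specification leaves the construction method open, so this middle-element strategy, the `Node` class, and the `build_bst` function are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    index: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_bst(sorted_indexes: Sequence[str]) -> Optional[Node]:
    """Build a balanced binary search tree from an already sorted sequence."""
    if not sorted_indexes:
        return None
    mid = len(sorted_indexes) // 2
    return Node(
        index=sorted_indexes[mid],
        left=build_bst(sorted_indexes[:mid]),
        right=build_bst(sorted_indexes[mid + 1:]),
    )


root = build_bst(
    ["ABCDEFG", "ABCDXY", "ABCDXYZ", "ABCOXYA", "ABCOXYY", "ABCOXYZ", "ABCXYZ"]
)
```

For the example sequence this yields "ABCOXYA" at the root, "ABCDXY" as its left child, and "ABCDEFG" as the left child of that node.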

[0065] In this specification, each node on the binary search tree corresponding to the target index sequence is used to store each index contained in the target index sequence, wherein the index is encoded as a preset data structure, and each node on the binary search tree stores the value corresponding to the data structure.

[0066] In one implementation, the nodes of the binary search tree may store both the values corresponding to the data structure and the content of the index itself. The values corresponding to the data structure can be used for fast index lookup, and the content of the index itself can be used for further querying the data corresponding to the index. It should be noted that when both are stored, the index can also be compressed. For example, common character prefixes of string-type indexes can be deleted, or portions of the index exceeding a preset length can be deleted, etc. This specification does not impose specific limitations on these methods.

[0067] The data structure of the index stored in the node of the binary search tree consists of several bits. A portion of the bits is used to fill at least part of the index content stored in the node, and another portion of the bits is used to fill the length of the common character prefix of the index stored in the node relative to the index stored in the corresponding parent node.

[0068] For the root node of a binary search tree, several bits of the data structure are directly used to fill part of the index content corresponding to the index stored in the root node; for non-root nodes of a binary search tree, in the data structure, a portion of the bits are used to fill the length of the common character prefix of the index stored in the node relative to the index stored in the corresponding parent node, and another portion of the bits are used to fill the part of the index content after the common character prefix of the index stored in the node relative to the index stored in the corresponding parent node.

[0069] It should be noted that the number of bits in the data structure can be set according to the actual application. The specific positions of the bits used to fill the length of the common character prefix and the bits used to fill part of the index content can also be set according to the actual application. No specific limitation is made in this specification.

[0070] In practical applications, the data structure is typically a 32-bit, 64-bit, or 128-bit integer data type. Each node in the binary search tree can directly store the value of the integer data type.

[0071] Figure 3 is a schematic diagram of a binary search tree corresponding to a target index sequence, as shown in an exemplary embodiment. The binary search tree shown in Figure 3 is formed from the target index sequence "ABCDEFG, ABCDXY, ABCDXYZ, ABCOXYA, ABCOXYY, ABCOXYZ, ABCXYZ" in the example above.

[0072] For example, the root node in Figure 3 stores the index "ABCOXYA". All bits in the data structure stored in the root node are used to fill index content. Taking a 64-bit data structure as an example, filling the entire content of this index requires 56 bits (7 characters of 8 bits each). The index content can be filled into the high 56 bits, and the remaining bits are padded with 0.

[0073] Therefore, the data structure corresponding to the root node can be "01000001010000100100001101001111…", where 01000001 is the binary value of character A, 01000010 is the binary value of character B, and so on. Accordingly, the actual content stored in the node of the binary search tree is the value corresponding to this data structure.
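Filling index content into the high bits of a 64-bit value and padding the rest with 0 can be sketched as follows, using the 7-character index "ABCDEFG" from the example sequence. The function name `encode_full` is hypothetical, and ASCII characters of 8 bits each are assumed, as in the bit patterns quoted above.

```python
def encode_full(index: str, width_bytes: int = 8) -> int:
    """Fill the index content into the high bits and pad the low bits with 0."""
    raw = index.encode("ascii")[:width_bytes]   # keep only what fits in the value
    return int.from_bytes(raw.ljust(width_bytes, b"\x00"), "big")


value = encode_full("ABCDEFG")
bits = format(value, "064b")   # 64-character binary string, high bit first
```

Because the characters occupy the high bits in order, comparing two such values as integers gives the same result as comparing the (ASCII, at most 8-character) strings lexicographically.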

[0074] It should be noted that if the number of bits in the data structure is insufficient to fill the entire index content, only a portion of the index content can be filled without affecting the subsequent matching process.

[0075] For example, the left child node "ABCDXY" of the root node of the binary search tree shown in Figure 3 has the common character prefix "ABC" relative to the root node "ABCOXYA". The length of the common character prefix is therefore 3, and the part after the common character prefix is "DXY".

[0076] In the data structure stored by a non-root node, a portion of the bits are used to fill the length of the common character prefix between the index stored by that node and the index stored by its corresponding parent node. Another portion of the bits are used to fill the index content after the common character prefix. Taking a 64-bit data structure as an example, the high 3 bits can be used to fill the length of the common character prefix, and the remaining 61 bits are used to fill the index content after the common character prefix.

[0077] Therefore, the data structure corresponding to the left child node of the root node can be "011010001000101100001011001 ...", where 011 represents the length of the common character prefix (3), 01000100 represents the binary value of character D, 01011000 represents the binary value of character X, 01011001 represents the binary value of character Y, and the remaining bits are padded with 0.

[0078] For example, the left child of the left child of the root node of the binary search tree shown in Figure 3 is “ABCDEFG”. Its common character prefix relative to its parent node “ABCDXY” is “ABCD”, so the length of the common character prefix is 4 and the part after the common character prefix is “EFG”.

[0079] Therefore, the data structure corresponding to this node can be "100010001010100011001000111 ...", where 100 represents the length of the common character prefix (4), 01000101 represents the binary value of character E, 01000110 represents the binary value of character F, and 01000111 represents the binary value of character G, with the remaining bits padded with 0. Accordingly, the actual content stored in the node of the binary search tree above is the value corresponding to this data structure.
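The two non-root encodings above can be reproduced with a short sketch. It assumes the layout of the example (high 3 bits for the common prefix length, remaining 61 bits for the content after the prefix) and ASCII characters; the names `common_prefix_len` and `encode_relative` are hypothetical.

```python
LEN_BITS = 3        # high 3 bits hold the common prefix length (per the example)
CONTENT_BITS = 61   # remaining bits hold the content after the prefix


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common character prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_relative(index: str, parent: str) -> int:
    """Encode index relative to the index stored in its parent node."""
    lcp = common_prefix_len(index, parent)
    if lcp >= (1 << LEN_BITS):
        raise ValueError("prefix length does not fit in 3 bits")
    max_bytes = CONTENT_BITS // 8                       # 7 whole characters fit
    raw = index[lcp:].encode("ascii")[:max_bytes]
    content = int.from_bytes(raw.ljust(max_bytes, b"\x00"), "big")
    content <<= CONTENT_BITS - 8 * max_bytes            # left-align in the 61 bits
    return (lcp << CONTENT_BITS) | content


left_child = format(encode_relative("ABCDXY", "ABCOXYA"), "064b")
grandchild = format(encode_relative("ABCDEFG", "ABCDXY"), "064b")
```

The first 27 bits of `left_child` and `grandchild` correspond to the bit strings quoted in [0077] and [0079]. With only 3 length bits, prefixes longer than 7 characters cannot be represented, which is why the flag-bit layouts with 12 or 20 length bits are described elsewhere in this specification.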

[0080] It should be noted that in practical applications, the total number of bits, the number of bits used to fill the length of the common character prefix, and the positions of the bits used to fill the index content can be set according to actual needs and are not specifically limited in this specification. For example, the length of the common character prefix can be filled into the high 3 bits, the low 3 bits, or the high 4 bits and the low 4 bits, etc.

[0081] By using the above data structure, the index content is stored in several bits, and the index matching process is transformed into a value matching process. This not only reduces space overhead but also significantly improves the index matching efficiency.

[0082] In this specification, after obtaining the index to be searched, the index can be encoded into the data structure, and the first value corresponding to the encoded data structure can be determined. The encoding process is similar to the encoding process of each node in the binary search tree described above, and will not be repeated here.

[0083] The first value is matched with the second value corresponding to the data structure of the node on the binary search tree to find the index to be searched on the binary search tree.

[0084] There are multiple scenarios for matching the first value with the second value corresponding to the data structure of the node in the binary search tree. The matching process of the index to be searched in the binary search tree is described in detail below.

[0085] In scenario one, when the index to be searched is matched for the first time with an index in the target index sequence, that is, when the index to be searched is matched with the root node of the binary search tree corresponding to the target index sequence, since several bits in the data structure corresponding to the index stored in the root node of the binary search tree are used to fill the index content, the index content of the index to be searched can also be directly filled into several bits in the data structure.

[0086] After encoding the index to be searched into the data structure, the first value corresponding to the data structure can be further determined, and it can be determined whether the first value matches the second value corresponding to the data structure corresponding to the root node of the binary search tree.

[0087] If a match is found, it means that the index to be searched is the root node of the binary search tree, and the index to be searched can be found directly.

[0088] If they do not match, compare the magnitudes of the first value and the second value.
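The comparison against the root node in scenario one can be sketched as a single integer comparison. This is an illustration only: `encode_full` and `compare_with_root` are hypothetical names, ASCII indexes are assumed, and equality of the two values identifies the index only when the whole index fits in the 64 bits.

```python
def encode_full(index: str, width_bytes: int = 8) -> int:
    """Fill the index content into the high bits and pad the low bits with 0."""
    raw = index.encode("ascii")[:width_bytes]
    return int.from_bytes(raw.ljust(width_bytes, b"\x00"), "big")


def compare_with_root(query: str, root_index: str) -> str:
    """Match the query's first value against the root's second value."""
    first_value = encode_full(query)
    second_value = encode_full(root_index)   # in practice already stored in the node
    if first_value == second_value:
        return "found"
    # A smaller value continues in the left subtree, a larger one in the right.
    return "left" if first_value < second_value else "right"
```

For example, `compare_with_root("ABCDXY", "ABCOXYA")` returns `"left"`, because the two values first differ at the fourth character and D sorts before O.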

[0089] In scenario two, if the first value is smaller than the second value, the index to be searched is re-encoded into the data structure relative to the root node of the binary search tree, and the third value corresponding to the re-encoded data structure is determined and matched with the fourth value corresponding to the data structure stored in the left child node of the root node.

[0090] In scenario three, if the match succeeds, the index to be searched is the index stored in the left child node of the root node. If the match fails, the length of the common character prefix stored in the data structure of the index to be searched is further compared with the length of the common character prefix stored in the data structure of the left child node.

[0091] Case 4: If the length of the common character prefix stored in the data structure of the index to be searched is greater than the length of the common character prefix stored in the data structure of the left child node, then the index to be searched can be encoded into the data structure relative to the left child node, the fifth value corresponding to the data structure can be further determined, and it can be matched with the sixth value of the data structure corresponding to the index stored in the right child node of the left child node.

[0092] Since the length of the common character prefix stored in the data structure of the index to be searched is greater than the length of the common character prefix stored in the data structure of the left child node, it indicates that the index to be searched has a longer common character prefix relative to the root node. Since the left child node is smaller than the root node, it indicates that the left child node is smaller than the index to be searched. Therefore, the index to be searched can be matched with the right child node of the left child node.

[0093] Case 5: If the length of the common character prefix stored in the data structure of the index to be searched is less than the length of the common character prefix stored in the data structure of the left child node, the third value can be directly matched with the seventh value of the data structure corresponding to the index stored in the left child node of the left child node.

[0094] Since the length of the common character prefix stored in the data structure of the index to be searched is less than the length of the common character prefix stored in the data structure of the left child node, it indicates that the index to be searched has a shorter common character prefix relative to the root node. Therefore, the index to be searched is less than the left child node, and the third value corresponding to the data structure of the index to be searched relative to the root node is valid, and it is directly matched with the left child node of the left child node.

[0095] If the length of the common character prefix stored in the data structure of the index to be searched is equal to the length of the common character prefix stored in the data structure of the left child node, then the comparison can be based on the part after the common character prefix.

[0096] Case 6: If the value of the part after the common character prefix in the data structure corresponding to the index to be searched is greater than the value of the part after the common character prefix in the data structure corresponding to the index stored in the left child node, then the index to be searched can be encoded into the data structure relative to the left child node, the fifth value corresponding to the data structure can be further determined, and matched with the sixth value of the data structure corresponding to the index stored in the right child node of the left child node.

[0097] Case 7: If the value of the part after the common character prefix in the data structure corresponding to the index to be searched is less than the value of the part after the common character prefix in the data structure corresponding to the index stored in the left child node, then relative to the left child node, the index to be searched is encoded into the data structure, the fifth value corresponding to the data structure is further determined, and it is matched with the seventh value of the data structure corresponding to the index stored in the left child node of the left child node.

[0098] Case 8: If the first value is greater than the second value, relative to the root node of the binary search tree, the index to be searched is further encoded into the data structure, and the third value corresponding to the data structure is further determined and matched with the eighth value corresponding to the data structure stored in the right child node corresponding to the root node of the binary search tree.

[0099] Case 9: If a match is successful, it means that the index to be searched is the right child node corresponding to the root node of the binary search tree. If a match fails, further determine the length of the common character prefix stored in the data structure of the index to be searched and the length of the common character prefix stored in the data structure of the right child node.

[0100] Case 10: If the length of the common character prefix stored in the data structure of the index to be searched is greater than the length of the common character prefix stored in the data structure of the right child node, then the index to be searched can be encoded into the data structure relative to the right child node, the ninth value corresponding to the data structure can be further determined, and it can be matched with the eleventh value of the data structure corresponding to the index stored in the left child node of the right child node.

[0101] Since the length of the common character prefix stored in the data structure of the index to be searched is greater than the length of the common character prefix stored in the data structure of the right child node, it indicates that the index to be searched has a longer common character prefix relative to the root node. Since the right child node is greater than the root node, it indicates that the right child node is greater than the index to be searched. Therefore, the index to be searched can be matched with the left child node of the right child node.

[0102] Case 11: If the length of the common character prefix stored in the data structure of the index to be searched is less than the length of the common character prefix stored in the data structure of the right child node, the third value can be directly matched with the twelfth value of the data structure corresponding to the index stored in the right child node of the right child node.

[0103] Since the length of the common character prefix stored in the data structure of the index to be searched is less than the length of the common character prefix stored in the data structure of the right child node, it indicates that the index to be searched has a shorter common character prefix relative to the root node. Therefore, the index to be searched is greater than the right child node, and the third value corresponding to the data structure of the index to be searched relative to the root node is valid, and it is directly matched with the right child node of the right child node.

[0104] If the length of the common character prefix stored in the data structure of the index to be searched is equal to the length of the common character prefix stored in the data structure of the right child node, then a comparison can be made based on the part after the common character prefix.

[0105] Case 12: If the value of the part after the common character prefix in the data structure corresponding to the index to be searched is greater than the value of the part after the common character prefix in the data structure corresponding to the index stored in the right child node, then the index to be searched can be encoded into the data structure relative to the right child node, the ninth value corresponding to the data structure can be further determined, and matched with the twelfth value of the data structure corresponding to the index stored in the right child node of the right child node.

[0106] Case 13: If the value of the part after the common character prefix in the data structure corresponding to the index to be searched is less than the value of the part after the common character prefix in the data structure corresponding to the index stored in the right child node, then relative to the right child node, the index to be searched is encoded into the data structure, the ninth value corresponding to the data structure is further determined, and it is matched with the eleventh value of the data structure corresponding to the index stored in the left child node of the right child node.
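The branching rules in Cases 2 to 13 can be condensed into a single loop. The following Python sketch is illustrative only and is not the claimed implementation. It assumes each encoded value is kept as a tuple (common prefix length, remainder) instead of a packed 64-bit integer, so no index content is truncated; the names `encode`, `Node`, `insert`, and `lookup` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Encoded = Tuple[int, str]  # (common-prefix length, remainder after the prefix)


def encode(key: str, ref: str) -> Encoded:
    """Encode `key` relative to `ref`; stands in for the packed integer."""
    n = 0
    while n < min(len(key), len(ref)) and key[n] == ref[n]:
        n += 1
    return n, key[n:]


@dataclass
class Node:
    key: str
    enc: Optional[Encoded] = None  # relative to the parent; None for the root
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: str) -> Node:
    """Plain (unbalanced) BST insertion that records each relative encoding."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key, encode(key, node.key)))
            return root
        node = child


def lookup(root: Optional[Node], key: str) -> Optional[Node]:
    if root is None or key == root.key:       # Case 1: direct match on the root
        return root
    on_left = key < root.key                  # Case 2 (left) or Case 8 (right)
    enc = encode(key, root.key)
    node = root.left if on_left else root.right
    while node is not None:
        if enc == node.enc:                   # Cases 3 and 9: values match
            return node
        (k_len, k_rest), (n_len, n_rest) = enc, node.enc
        if k_len != n_len:
            # Left branch: a longer shared prefix means key > node (Case 4).
            # Right branch: a longer shared prefix means key < node (Case 10).
            key_greater = (k_len > n_len) == on_left
        else:
            key_greater = k_rest > n_rest     # Cases 6, 7, 12 and 13
        if k_len >= n_len:
            enc = encode(key, node.key)       # re-encode relative to this node
        # Otherwise (Cases 5 and 11) the existing encoding is still valid.
        on_left = not key_greater
        node = node.left if on_left else node.right
    return None
```

For example, after inserting "ABCD", "ABCOXYZ", and "ABCXYZ" in that order, `lookup(root, "ABCXYZ")` follows Case 8 and then Case 12 before matching. In Cases 5 and 11 the key is not re-encoded: because it diverges from the parent earlier than the child does, its encoding relative to the child is identical to its encoding relative to the parent.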

[0107] The matching process described above is illustrated below with a concrete example. Please refer to Figure 4, which is a schematic diagram of another binary search tree corresponding to a target index sequence, according to an exemplary embodiment. As shown in Figure 4, the length of the common character prefix of each non-root node relative to its parent node, as well as the index content after the common character prefix, is marked near that node.

[0108] For example, taking the binary search tree shown in Figure 4 and a 64-bit data structure, it can be specified that the high 3 bits of the 64 bits are used to fill the length of the common character prefix, and the remaining 61 bits are used to fill the index content after the common character prefix.
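As a minimal sketch of this 3-bit / 61-bit layout, the following Python code packs a prefix length and the bytes after the prefix into one integer. The function names `pack` and `unpack` are hypothetical, and the choice to left-align the content and drop whatever does not fit in 61 bits is an assumption; the bit strings quoted in paragraphs [0113] and [0115] are consistent with it.

```python
LEN_BITS = 3
CONTENT_BITS = 64 - LEN_BITS  # 61 bits of index content


def pack(prefix_len: int, rest: bytes) -> int:
    """Pack a prefix length and the bytes after the prefix into 64 bits."""
    if not 0 <= prefix_len < (1 << LEN_BITS):
        raise ValueError("prefix length does not fit in 3 bits")
    # Left-align the first 8 bytes of `rest`, then keep the top 61 bits.
    window = int.from_bytes(rest[:8].ljust(8, b"\x00"), "big")
    content = window >> (64 - CONTENT_BITS)
    return (prefix_len << CONTENT_BITS) | content


def unpack(word: int) -> tuple:
    """Return (prefix length, 61-bit content) from a packed value."""
    return word >> CONTENT_BITS, word & ((1 << CONTENT_BITS) - 1)
```

Because the prefix length occupies the most significant bits, comparing two packed integers compares the prefix lengths first and the content only when the lengths are equal, which is the order the matching cases rely on.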

[0109] Assume the index to be searched is “ABCXYZ”. When the index to be searched is matched against the target index sequence for the first time, Case 1 above applies, and the data structure corresponding to the index to be searched is “01000001010000100100001101011000…”.

[0110] The data structure of the root node of the binary search tree is "01000001010000100100001101000100…".

[0111] The first value of the data structure does not match the second value of the root node of the binary search tree, and the first value is greater than the second value, so Case 8 above applies. The index to be searched is therefore encoded into a new data structure relative to the root node of the binary search tree, and a third value is generated to match against the right child node "ABCOXYZ" of the root node.

[0112] Relative to the root node of the binary search tree, the common character prefix is "ABC", the length of the common character prefix is 3, and the part after the common character prefix is "XYZ".

[0113] Therefore, the data structure corresponding to the index to be searched is "011010110000101100101011010…".

[0114] The common character prefix of the right child node relative to the root node is "ABCOXY", the length of the common character prefix is 6, and the part after the common character prefix is "Z".

[0115] Therefore, the data structure of the right child node of the root node is "110010110100000000000000000…".

[0116] Since the value corresponding to the data structure of the index to be searched does not match the value corresponding to the data structure of the right child node of the root node, the length of the common character prefix stored in the data structure of the index to be searched and the length of the common character prefix stored in the data structure of the right child node are further determined.

[0117] The length of the common character prefix filled in the data structure of the index to be searched is 3, and the length of the common character prefix filled in the data structure of the right child node is 6.

[0118] Since the length of the common character prefix filled in the data structure of the index to be searched is less than that filled in the data structure of the right child node, Case 11 above applies, and the value corresponding to the data structure of the index to be searched, “011010110000101100101011010…”, is matched against the value corresponding to the right child node of the right child node.

[0119] The common character prefix of the right child node of the right child node, relative to the right child node, is "ABC", the length of the common character prefix is 3, and the part after the common character prefix is "XYZ".

[0120] Therefore, the data structure corresponding to the right child node of the right child node is “011010110000101100101011010…”, and its value is equal to the value of the data structure corresponding to the index to be searched, so the match is successful.

[0121] In one implementation, after the target index sequence is obtained, a binary search tree corresponding to the target index sequence can be constructed in real time. After the binary search tree is constructed, the length of the common character prefix of the index stored in each node relative to the index stored in that node's parent node can be calculated. Then, based on the calculated length of the common character prefix and the index content after the common character prefix, the index stored in each node of the binary search tree is encoded into the data structure.

[0122] The method for calculating the common character prefix is not specifically limited in this specification. For example, the length of the common character prefix of each node in the binary search tree relative to its parent node can be determined by comparing the two strings character by character. Alternatively, it can be calculated using preset regular expressions, preset string matching functions, or the like.
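The two approaches mentioned here can be sketched in Python as follows. The function names are hypothetical; `os.path.commonprefix` is a standard-library function that compares strings character by character and is used here only as one example of an existing string matching function.

```python
import os.path


def common_prefix_len(a: str, b: str) -> int:
    """Count matching characters from the start, one position at a time."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_prefix_len_stdlib(a: str, b: str) -> int:
    """Same result using an existing library function."""
    return len(os.path.commonprefix([a, b]))
```

Both functions return 3 for the pair "ABCXYZ" and "ABCD", which is the prefix length used in the worked example above.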

[0123] For details on how the binary search tree is encoded into the data structure, please refer to the above text, which will not be repeated here.

[0124] In one embodiment, at least one bit in the data structure is a preset first flag bit; the first flag bit is used to indicate the bit in the data structure used to fill the length of the common character prefix.

[0125] Before each node of the binary search tree is encoded into the data structure, the number and position of the bits used to fill the length of the common character prefix, and the number and position of the bits used to fill the index content after the common character prefix, can be determined based on the value of the first flag bit. The calculated length of the common character prefix is then filled into the determined bits, and the index content after the common character prefix of the index stored in the node, relative to the index stored in that node's parent node, is filled into the remaining bits.

[0126] For example, taking a 64-bit data structure, one bit can be used as the preset first flag bit, located at the lowest bit position. When the first flag bit is 0, the high 3 bits can be used to fill the length of the common character prefix, and the remaining 60 bits can be used to fill the index content after the common character prefix. When the first flag bit is 1, the high 4 bits can be used to fill the length of the common character prefix, and the remaining 59 bits can be used to fill the index content after the common character prefix.
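The flag-dependent layout in this example can be sketched as below. This is an illustration under assumptions, not the specified encoding: the rule for choosing the flag (0 when the length fits in 3 bits, 1 when it needs 4) and the names `pack_flagged` and `unpack_flagged` are hypothetical, and the content is assumed to be left-aligned in its field.

```python
def pack_flagged(prefix_len: int, rest: bytes) -> int:
    """Layout: [length: 3 or 4 high bits][content: 60 or 59 bits][flag: lowest bit]."""
    if not 0 <= prefix_len < 16:
        raise ValueError("prefix length does not fit in 4 bits")
    flag = 0 if prefix_len < 8 else 1      # assumed selection rule
    len_bits = 4 if flag else 3
    content_bits = 63 - len_bits           # 60 when flag is 0, 59 when flag is 1
    window = int.from_bytes(rest[:8].ljust(8, b"\x00"), "big")
    content = window >> (64 - content_bits)
    return (prefix_len << (64 - len_bits)) | (content << 1) | flag


def unpack_flagged(word: int) -> tuple:
    """Read the flag first, since it decides where the other fields are."""
    flag = word & 1
    len_bits = 4 if flag else 3
    content_bits = 63 - len_bits
    prefix_len = word >> (64 - len_bits)
    content = (word >> 1) & ((1 << content_bits) - 1)
    return flag, prefix_len, content
```

A decoder has to read the flag bit before anything else, because the boundary between the length field and the content field depends on it.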

[0127] In one implementation, the value corresponding to the first flag bit includes a first value and a second value; wherein, if the length of the common character prefix does not exceed a preset threshold, the value of the first flag bit is the first value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is N and the positions of the N bits in the data structure; if the length of the common character prefix exceeds the preset threshold, the value of the first flag bit is the second value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is M and the positions of the M bits in the data structure; the value of M is greater than the value of N.

[0128] The preset threshold for the length of the common character prefix can be the maximum value of the common character prefixes of all indexes in the database, or it can be further set according to the specific needs of the actual application. This specification does not make any specific limitation.

[0129] In practical applications, the values of M and N can be set according to the specific needs of the application, and are not specifically limited in this specification.

[0130] In one implementation, the indexes in the target index sequence further include composite indexes. A composite index is composed of at least two indexes, at least one of which is a string type index. For example, "ABCDEFG-ZXY-123456" is a composite index made up of three indexes, which can be numbered from left to right: "ABCDEFG" is index 1, "ZXY" is index 2, and 123456 is index 3.

[0131] In this data structure, at least one bit can be a second flag bit, which marks the index identifier of the common index shared by the composite index stored in a node of the binary search tree and the composite index stored in that node's parent node.

[0132] For example, suppose the root node of a binary search tree stores the composite index "ABCDEFG-XYZ-123456" and its left child node stores the composite index "ABCDEFG-XY-123456". Relative to the composite index stored in the root node, the common index of the left child node is "ABCDEFG", the index identifier of the common index is 1, and the non-common part is "XY-123456".

[0133] Before each node in the binary search tree is encoded into the data structure, the following can be determined for the composite index stored in that node, relative to the composite index stored in its parent node: the index identifier of the common index, and at least one non-common index.

[0134] Based on the index identifier of the common index, the length of the common character prefix of the non-common index of the node relative to its parent node, and the index content after that common character prefix, the indexes stored in the nodes of the binary search tree are respectively encoded into the data structure.

[0135] It should be noted that the composite index may contain both string and integer type indexes. For string type indexes, the encoding data structure can be found above and will not be repeated here.

[0136] For integer indexes, if the number of bits in the integer is less than the number of bits in the data structure used to fill the integer index, the integer index is directly filled into the data structure, and any excess bits are padded with 0. If the number of bits in the integer is greater than the number of bits in the data structure used to fill the integer index, the lower bits of the integer are discarded, and the integer is converted into the integer of the number of bits used to fill the integer index in the data structure.

[0137] For example, suppose the data structure is a 64-bit integer whose high 3 bits are composite index flags, so that 61 bits are available for an integer type index. When the index is a 32-bit integer, it is filled directly into the low 61 bits of the data structure, and the extra bits are padded with 0. When the index is a 64-bit integer, its low 3 bits are discarded, and the remaining 61 bits are filled into the low 61 bits of the data structure.
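The two integer cases can be sketched as follows. The names `fit_integer` and `pack_integer` are hypothetical, and the sketch assumes non-negative integers of a known bit width.

```python
FLAG_BITS = 3
FIELD_BITS = 64 - FLAG_BITS  # 61 bits available for the integer index


def fit_integer(value: int, width: int) -> int:
    """Fit a `width`-bit non-negative integer into the 61-bit field."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit the stated width")
    if width <= FIELD_BITS:
        return value                        # fits as is; upper bits stay 0
    return value >> (width - FIELD_BITS)    # drop the low bits that do not fit


def pack_integer(flags: int, value: int, width: int) -> int:
    """Place the 3 flag bits above the 61-bit integer field."""
    if not 0 <= flags < (1 << FLAG_BITS):
        raise ValueError("flags do not fit in 3 bits")
    return (flags << FIELD_BITS) | fit_integer(value, width)
```

Discarding low bits keeps the ordering of the stored values consistent with the ordering of the original integers, but two 64-bit integers that differ only in their low 3 bits map to the same 61-bit value, so an equal packed value does not by itself prove the original integers are equal.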

[0138] The following example illustrates in detail how to encode a composite index into the aforementioned data structure.

[0139] Please refer to Figure 5, which is a schematic diagram of a binary search tree corresponding to a target index sequence containing composite indexes, according to an exemplary embodiment.

[0140] For example, the root node in Figure 5 stores the index "ABCDEFG-XYZ-123456". All bits in the data structure stored in the root node are used to fill the index content of the index stored in that node. Taking a 64-bit data structure as an example, 64 bits are clearly insufficient to hold the entire composite index. Therefore, in practical applications, it can be specified in advance that the index content of the first index in the composite index is filled first, followed by the index content of the remaining indexes in order. Alternatively, a fixed number of bits can be reserved for the index content of each index in the composite index: some bits for the first index, some for the second, and so on.

[0141] Here, we take the example of filling the data structure corresponding to the root node with the index content of the first index in the composite index, and then filling the index content of the remaining indexes in turn.

[0142] Therefore, the data structure corresponding to the root node can be "01000001 01000010 0100001101000100…", where 01000001 is the binary value of character A, 01000010 is the binary value of character B, and so on. Accordingly, the actual content stored in the nodes of the binary search tree is the value corresponding to this data structure.

[0143] For example, the left child node of the root node of the binary search tree shown in Figure 5 stores “ABCDEFG-XY-123456”. Compared with the root node “ABCDEFG-XYZ-123456”, the common index is “ABCDEFG”, the index identifier of the common index is 1, the first non-common index is “XYZ”, the common character prefix of the non-common index is “XY”, so the length of the common character prefix is 2, and the part after the common character prefix is “Z”.

[0144] Taking a 64-bit data structure as an example, the high 2 bits are used to fill the index identifier of the common index in the composite index, the high 3-5 bits can be used to fill the length of the common character prefix of the non-common index, and the remaining bits are used to fill the part of the index content after the common character prefix of the non-common index.

[0145] Therefore, the data structure corresponding to the left child node of the root node can be “010100101101000000000…” where 01 represents the index identifier 1 of the common index, 010 represents the length 2 of the common character prefix of the non-common index, 01011010 represents the binary value of the character Z, and the remaining bits are padded with 0.

[0146] For example, the left child of the left child of the root node of the binary search tree shown in Figure 5 stores “ABCDEF-XY-123456”. Relative to its parent node “ABCDEFG-XY-123456”, the common index is empty, the first non-common index is “ABCDEFG”, the common character prefix of the non-common index is “ABCDEF”, the length of the common character prefix is 6, and the part after the common character prefix is “F”.

[0147] Therefore, the data structure corresponding to the left child of the left child of the root node can be “001100100011000000000…”, where 00 represents the index identifier 0 of the common index (that is, there is no common index), 110 represents the length 6 of the common character prefix of the non-common index, 01000110 represents the binary value of the character F, and the remaining bits are padded with 0.

[0148] The matching process for composite indexes is generally similar to the matching process described above. The index to be searched can be matched with the root node in advance. If the match fails, the index to be searched is encoded into the data structure relative to the root node, and the left and right child nodes of the root node are further matched.

[0149] Before matching with the left and right child nodes of the root node, the value of the second flag bit in the data structure corresponding to the index to be searched can be compared with the value of the second flag bit in the data structure corresponding to the index stored in the left or right child node.

[0150] When the index to be searched is less than the index stored in the root node, it can be matched with the left child node of the root node.

[0151] If the value of the second flag bit in the data structure corresponding to the index to be searched is less than the value of the second flag bit in the data structure corresponding to the index stored in the left child node, it means that the left child node and the root node have the same common index. The index to be searched is smaller than the root node. Therefore, the index to be searched is smaller than the left child node. The value of the index to be searched is valid relative to the data structure corresponding to the root node, and it is directly matched with the left child node of the left child node.

[0152] If the value of the second flag bit in the data structure corresponding to the index to be searched is greater than the value of the second flag bit in the data structure corresponding to the index stored in the left child node, it means that the index to be searched has the same common index as the root node. Since the left child node is smaller than the root node, it means that the left child node is smaller than the index to be searched. Therefore, the index to be searched can be encoded into the data structure relative to the left child node, and further matched with the right child node of the left child node.

[0153] When the index to be searched is greater than the index stored in the root node, it can be matched with the right child node of the root node.

[0154] If the value of the second flag bit in the data structure corresponding to the index to be searched is less than the value of the second flag bit in the data structure corresponding to the index stored in the right child node, it means that the right child node and the root node have the same common index. The index to be searched is larger than the root node. Therefore, the index to be searched is larger than the right child node. The value of the index to be searched is valid relative to the data structure corresponding to the root node, and it is directly matched with the right child node of the right child node.

[0155] If the value of the second flag bit in the data structure corresponding to the index to be searched is greater than the value of the second flag bit in the data structure corresponding to the index stored in the right child node, it means that the index to be searched has the same common index as the root node. Since the right child node is larger than the root node, it means that the right child node is larger than the index to be searched. Therefore, the index to be searched can be encoded into the data structure relative to the right child node, and further matched with the left child node of the right child node.

[0156] If the value of the second flag in the data structure corresponding to the index to be searched is equal to the value of the second flag in the data structure corresponding to the index stored in the left or right child node, it means that the common index is successfully matched. Further matching can be performed based on the matching method of the length of the common prefix characters mentioned above, which will not be elaborated here.
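The second-flag comparison and the prefix-length comparison follow the same pattern, so they can be expressed as one decision function. The sketch below is illustrative and rests on several assumptions: a composite index is modelled as a tuple of sub-index strings, all keys have the same number of sub-indexes, keys are ordered sub-index by sub-index, and the second flag holds the number of leading sub-indexes shared with the parent. The names `encode_rel` and `key_is_greater` are hypothetical.

```python
from typing import Tuple

Composite = Tuple[str, ...]
Encoded = Tuple[int, int, Tuple[str, ...]]  # (shared sub-indexes, prefix length, remainder)


def encode_rel(key: Composite, ref: Composite) -> Encoded:
    """Encode `key` relative to `ref` (the parent it is being compared under)."""
    count = 0
    while count < len(key) and count < len(ref) and key[count] == ref[count]:
        count += 1
    if count == len(key) or count == len(ref):
        return count, 0, key[count:]
    a, b = key[count], ref[count]
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return count, n, (a[n:],) + key[count + 1:]


def key_is_greater(key_enc: Encoded, child_enc: Encoded, on_left: bool) -> bool:
    """Decide whether the key sorts after the child.

    Both encodings must be relative to the same parent, and `on_left` says
    whether the key (and therefore the child) is smaller than that parent.
    """
    for k, c in zip(key_enc[:2], child_enc[:2]):
        if k != c:
            # Sharing more with the parent means being closer to it:
            # greater than the child on the left branch, smaller on the right.
            return (k > c) == on_left
    return key_enc[2] > child_enc[2]
```

The second flag is consulted first and the prefix length second, so a difference in the number of shared sub-indexes decides the direction before any characters are compared.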

[0157] The following example illustrates the matching process of the composite index described above. Taking the binary search tree shown in Figure 5, which corresponds to a target index sequence containing composite indexes, and a 64-bit data structure as an example, it can be specified that the high 2 bits of the 64 bits are used to fill the index identifier of the common index, the high 3-5 bits are used to fill the length of the common character prefix, and the remaining bits are used to fill the index content after the common character prefix.

[0158] Assume the index to be searched is “ABCDEF-XY-123456”. When the index to be searched is matched against the target index sequence for the first time, the index content of the index to be searched is filled into the bits of the data structure. The data structure corresponding to the index to be searched is “01000001010000100100001101000100…”.

[0159] If the value of the data structure corresponding to the index to be searched is less than the value of the data structure corresponding to the root node of the binary search tree, the index to be searched is encoded into a new data structure relative to the root node and matched against the left child node of the root node.

[0160] Relative to the root node of the binary search tree, the index to be searched has an empty common index; the first non-common index is "ABCDEF", the common character prefix of the non-common index is "ABCDEF", the length of the common character prefix is 6, and the part after the common character prefix is empty.

[0161] Therefore, the data structure corresponding to the index to be searched is "00110000000000000000000000000…".

[0162] The left child node has a common index of "ABCDEFG" relative to the root node of the binary search tree. The index identifier of the common index is 1. The first non-common index is "XY". The common character prefix of the non-common index is "XY". The length of the common character prefix is 2. The part after the common character prefix is empty.

[0163] Therefore, the data structure corresponding to the left child node is "010100000000000000000000000000…".

[0164] Since the value of the second flag bit in the data structure corresponding to the index to be searched is less than the value of the second flag bit in the data structure corresponding to the index stored in the left child node, it indicates that the left child node and the root node have the same common index. The index to be searched is smaller than the root node. Therefore, the index to be searched is smaller than the left child node. The value of the index to be searched is valid relative to the data structure corresponding to the root node, and it is directly matched with the left child node of the left child node.

[0165] Relative to the left child node, the left child node of the left child node has an empty common index; the first non-common index is "ABCDEF", the common character prefix of the non-common index is "ABCDEF", the length of the common character prefix is 6, and the part after the common character prefix is empty.

[0166] Therefore, the data structure corresponding to the left child node of the left child node is “00110000000000000000000000000…”, and its value is equal to the value of the data structure corresponding to the index to be searched, so the match is successful.
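The encoded values in this walk-through can be reproduced with a short sketch of the 2-bit / 3-bit / 59-bit layout. The function name `pack_composite`, the "-" separator, the ASCII assumption, and the interpretation of the index identifier as the number of leading sub-indexes shared with the parent are all assumptions made for illustration. Only the remainder of the first non-common sub-index is stored, so this simplified version does not distinguish keys that differ only in later sub-indexes.

```python
ID_BITS, LEN_BITS = 2, 3
CONTENT_BITS = 64 - ID_BITS - LEN_BITS  # 59 bits of index content


def pack_composite(key: str, parent: str, sep: str = "-") -> int:
    """Encode a composite index relative to its parent as one 64-bit value."""
    k_parts, p_parts = key.split(sep), parent.split(sep)
    common = 0
    while (common < min(len(k_parts), len(p_parts))
           and k_parts[common] == p_parts[common]):
        common += 1
    # First sub-index that differs, on each side (empty if none is left).
    a = k_parts[common] if common < len(k_parts) else ""
    b = p_parts[common] if common < len(p_parts) else ""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    if common >= (1 << ID_BITS) or n >= (1 << LEN_BITS):
        raise ValueError("field does not fit the assumed layout")
    window = int.from_bytes(a[n:].encode("ascii")[:8].ljust(8, b"\x00"), "big")
    content = window >> (64 - CONTENT_BITS)
    return (common << (LEN_BITS + CONTENT_BITS)) | (n << CONTENT_BITS) | content
```

With these assumptions, encoding "ABCDEFG-XY-123456" relative to "ABCDEFG-XYZ-123456" yields a value beginning with 01010, and encoding "ABCDEF-XY-123456" relative to either the root node or the left child node yields a value beginning with 00110, in line with paragraphs [0161], [0163], and [0166].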

[0167] In the above embodiments, a composite key composed of multiple keys is converted into the preset data structure and stored as a value. During matching, traditional string matching is likewise converted into value matching, which reduces the cost of index lookup operations and improves the efficiency of index lookup.

[0168] In one embodiment, the data structure described above typically includes 64 bits; the 64 bits include 4 bits for representing the first flag bit and the second flag bit; the first flag bit corresponds to 1 bit of the 4 bits; and the second flag bit corresponds to 3 bits of the 4 bits.

[0169] It should be noted that in practical applications, the above data structure is only one possible implementation. The positions of the first flag bit and the second flag bit can be arbitrarily chosen. For example, the high 4 bits or the low 4 bits of the 64 bits can be used to represent the first flag bit and the second flag bit. No specific limitation is made in this specification.

[0170] When the data structure is 64 bits, the preset threshold for the length of the common character prefix can usually be set to 4096; correspondingly, the value of N can be 12; and the value of M can be 20.

[0171] If the length of the common character prefix does not exceed 4096, the value of the first flag bit is 1, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 12, and the 12 bits are either the lower 12 bits or the higher 12 bits of the 64 bits; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 48, and the 48 bits are either the lower 12-59 bits or the higher 12-59 bits of the 64 bits.

[0172] If the length of the common character prefix exceeds 4096, the value of the first flag bit is 0, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 20; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 40, and the 40 bits are the lower 20-59 bits or the higher 20-59 bits of the 64 bits.
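A minimal sketch of this 64-bit layout with N = 12 and M = 20 is shown below. Several details are assumptions rather than statements of the specification: the 3-bit second flag is placed in the top three bits and the 1-bit first flag in the fourth, the length field occupies the low bits with the content above it, and the names `pack_word` and `unpack_word` are hypothetical. Since 12 bits can represent at most 4095, the sketch routes a length of exactly 4096 to the 20-bit layout.

```python
THRESHOLD = 4096
SHORT_LEN_BITS, LONG_LEN_BITS = 12, 20  # N and M


def pack_word(common_id: int, prefix_len: int, rest: bytes) -> int:
    """Layout: [second flag: 3][first flag: 1][content: 48 or 40][length: 12 or 20]."""
    if not 0 <= common_id < 8:
        raise ValueError("second flag must fit in 3 bits")
    if not 0 <= prefix_len < (1 << LONG_LEN_BITS):
        raise ValueError("prefix length must fit in 20 bits")
    # Assumed boundary handling: 4096 itself needs 13 bits, so it goes long.
    first_flag = 1 if prefix_len < THRESHOLD else 0
    len_bits = SHORT_LEN_BITS if first_flag else LONG_LEN_BITS
    n_bytes = (60 - len_bits) // 8          # 6 content bytes or 5
    content = int.from_bytes(rest[:n_bytes].ljust(n_bytes, b"\x00"), "big")
    return (common_id << 61) | (first_flag << 60) | (content << len_bits) | prefix_len


def unpack_word(word: int) -> tuple:
    """Return (second flag, first flag, prefix length, content)."""
    common_id = word >> 61
    first_flag = (word >> 60) & 1
    len_bits = SHORT_LEN_BITS if first_flag else LONG_LEN_BITS
    prefix_len = word & ((1 << len_bits) - 1)
    content = (word >> len_bits) & ((1 << (60 - len_bits)) - 1)
    return common_id, first_flag, prefix_len, content
```

The two layouts trade content capacity for prefix range: the short form keeps 6 bytes of content after the prefix, while the long form keeps 5 bytes but can describe prefixes of up to 1,048,575 characters.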

[0173] For example, taking a 64-bit data structure as an example, the high 4 bits of the 64 bits represent the first flag bit and the second flag bit. The high 1-3 bits of the high 4 bits represent the first flag bit, and the 4th bit of the high 4 bits represents the second flag bit.

[0174] As shown in Figure 5, take as an example the left child node of the root node of the binary search tree corresponding to the target index sequence containing the composite index. Relative to the root node of the binary search tree, the left child node has a common index of "ABCDEFG", and the index flag bit of the common index is set to 1. The first non-common index is "XY", the common character prefix of the non-common index is "XY", the length of the common character prefix is 2, and the part after the common character prefix is empty. The length of the common character prefix does not exceed the threshold, so the value of the first flag bit is 1. The high 12-59 bits are used to fill the part after the common character prefix, and the low 0-11 bits are used to fill the length of the common character prefix.

[0175] Therefore, the data structure corresponding to the left child node is "00110 ...
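As an illustration of the example above, the following sketch packs and unpacks a 64-bit word. It is a sketch under stated assumptions, not the layout fixed by this specification: the flags sit in the high 4 bits, with the 3-bit second flag above the 1-bit first flag (the reading consistent with the 1-bit / 3-bit split in [0168]); the prefix length sits in the low bits; the content is truncated to the field width and zero-padded. The names `pack` and `unpack` are hypothetical.

```python
def pack(second_flag: int, prefix_len: int, content: bytes) -> int:
    """Pack one node into a 64-bit integer (assumed layout, see lead-in)."""
    first_flag = 1 if prefix_len < 4096 else 0       # short form fits 12 bits
    len_bits = 12 if first_flag else 20
    if not 0 <= second_flag < 8:
        raise ValueError("second flag must fit in 3 bits")
    if not 0 <= prefix_len < (1 << len_bits):
        raise ValueError("prefix length does not fit in the length field")
    nbytes = (60 - len_bits) // 8                    # 6 bytes or 5 bytes
    chunk = content[:nbytes].ljust(nbytes, b"\x00")  # truncate, zero-pad
    content_val = int.from_bytes(chunk, "big")
    flags = (second_flag << 1) | first_flag
    return (flags << 60) | (content_val << len_bits) | prefix_len


def unpack(word: int) -> tuple[int, int, int, int]:
    """Return (second_flag, first_flag, prefix_len, content_value)."""
    flags = word >> 60
    first_flag = flags & 1
    second_flag = flags >> 1
    len_bits = 12 if first_flag else 20
    prefix_len = word & ((1 << len_bits) - 1)
    content_val = (word >> len_bits) & ((1 << (60 - len_bits)) - 1)
    return second_flag, first_flag, prefix_len, content_val


# Left child node from the Figure 5 example: common index flag 1,
# common character prefix length 2, empty remainder.
word = pack(second_flag=1, prefix_len=2, content=b"")
print(format(word, "064b"))
```

Under these assumptions the packed word is 0x3000000000000002, whose binary form begins with "00110".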

[0176] Corresponding to the embodiments of the index lookup method described above, this specification also provides embodiments of the index lookup device.

[0177] Referring to Figure 6, Figure 6 is a schematic diagram of the hardware structure of an electronic device containing an index lookup device according to an exemplary embodiment. At the hardware level, the device includes a processor 602, an internal bus 604, a network interface 606, memory 608, and non-volatile memory 610, and may also include other necessary hardware. One or more embodiments of this specification can be implemented in software; for example, the processor 602 reads the corresponding computer program from the non-volatile memory 610 into memory 608 and then runs it. Of course, besides software implementation, one or more embodiments of this specification do not exclude other implementation methods, such as logic devices or a combination of hardware and software. That is to say, the execution entity of the following processing flow is not limited to individual logic units, but can also be hardware or logic devices.

[0178] Referring to Figure 7, Figure 7 is a block diagram of an index lookup device according to an exemplary embodiment. The index lookup device can be applied to the electronic device shown in Figure 6 to implement the technical solution of this specification. The aforementioned index lookup device may include:

[0179] The index sequence acquisition unit 702 is used to acquire a target index sequence containing the index to be searched; at least a portion of the indexes in the target index sequence are string type indexes.

[0180] The lookup index encoding unit 704 is used to encode the lookup index into a data structure of the index stored in a node of a binary search tree corresponding to the target index sequence, and to determine a first value corresponding to the encoded data structure; wherein, the nodes of the binary search tree corresponding to the target index sequence are used to store each index contained in the target index sequence; the data structure of the index stored in the node of the binary search tree includes several bits for filling at least a portion of the index content of the index stored in the node, and several bits for filling the length of the common character prefix of the index stored in the node relative to the index stored in the parent node corresponding to the node;

[0181] The index matching unit 706 is used to match the first value with the second value corresponding to the data structure of the node on the binary search tree using a binary search method, so as to find the index to be searched on the binary search tree.
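The cooperation of the three units can be sketched as follows. This is a simplified illustration, not the device itself: the binary search tree is taken implicitly as the midpoints of a sorted list, the encoding is a `(prefix_len, content)` pair rather than a packed 64-bit word, node encodings are recomputed on each step instead of being stored, and a full key comparison is used as a fallback whenever the encoded values cannot decide. The names `common_prefix_len`, `encode` and `lookup` are hypothetical.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Length of the common prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode(key: bytes, parent: bytes) -> tuple[int, int]:
    """Encode key relative to parent: prefix length plus the next 6 bytes."""
    p = common_prefix_len(key, parent)
    content = int.from_bytes(key[p:p + 6].ljust(6, b"\x00"), "big")
    return p, content


def lookup(sorted_keys: list[bytes], target: bytes) -> int:
    """Return the position of target in sorted_keys, or -1 if absent."""
    lo, hi = 0, len(sorted_keys) - 1
    parent = b""                      # the root has no parent
    while lo <= hi:
        mid = (lo + hi) // 2
        node = sorted_keys[mid]
        t_len, t_val = encode(target, parent)   # "first value"
        n_len, n_val = encode(node, parent)     # "second value"
        if t_len == n_len and t_val != n_val:
            # Same shared prefix with the parent, so the integer
            # comparison of the content decides the order.
            less = t_val < n_val
        else:
            # Fallback (assumption of this sketch): compare full keys.
            if target == node:
                return mid
            less = target < node
        if less:
            hi = mid - 1
        else:
            lo = mid + 1
        parent = node
    return -1
```

The fast path is safe because equal prefix lengths relative to the same parent mean the two keys agree on that prefix, so the zero-padded content that follows it orders them the same way as a full byte-wise comparison whenever the two content values differ.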

[0182] In this embodiment, the database maintains an index tree for storing data;

[0183] The index sequence acquisition unit is further configured to search for a target index sequence containing the index to be searched from the index tree maintained by the database for storing data, using a binary search method.

[0184] In this embodiment, the index tree includes any one of the following index trees:

[0185] B-tree, B+ tree, Bw tree.

[0186] In this embodiment, the device further includes: a binary search tree construction unit, used to construct a binary search tree corresponding to the obtained target index sequence in real time;

[0187] Calculate the length of the common character prefix of the index stored in the node on the binary search tree relative to the index stored in the parent node corresponding to that node;

[0188] Based on the calculated length of the common character prefix, and on the index content corresponding to the part after the common character prefix of the index stored in the node on the binary search tree relative to the index stored in the parent node of the node, the indexes stored in the nodes of the binary search tree are respectively encoded into the data structure.
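The construction steps in [0186] to [0188] can be sketched as below. This is a minimal sketch under assumptions: the target index sequence is already sorted, the tree is built by recursive midpoint selection, and each node keeps the prefix length and the remaining content as separate fields rather than as a packed word. The names `Node` and `build` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: bytes
    prefix_len: int                 # common prefix length vs. the parent key
    suffix: bytes                   # part of the key after that prefix
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Length of the common prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build(keys: list[bytes], lo: int, hi: int,
          parent_key: bytes = b"") -> Optional[Node]:
    """Build a balanced binary search tree over keys[lo..hi] (sorted)."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    key = keys[mid]
    p = common_prefix_len(key, parent_key)
    node = Node(key, p, key[p:])
    node.left = build(keys, lo, mid - 1, key)
    node.right = build(keys, mid + 1, hi, key)
    return node


keys = [b"ABC", b"ABCD", b"ABD"]
root = build(keys, 0, len(keys) - 1)
```

For the three keys above, the root stores "ABCD" with prefix length 0, the left child stores "ABC" with prefix length 3 and an empty remainder, and the right child stores "ABD" with prefix length 2 and remainder "D".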

[0189] In this embodiment, at least one bit in the data structure is a preset first flag bit; the first flag bit is used to indicate the bit in the data structure used to fill the length of the common character prefix;

[0190] The binary search tree construction unit is further configured to determine the bit position for filling the length of the common character prefix in the data structure based on the first flag bit, and fill the calculated length of the common character prefix into the determined bit position; and to store the index content corresponding to the part after the common character prefix of the index stored by the node on the binary search tree relative to the index stored by the parent node of the node into the remaining bit position other than the determined bit position.

[0191] In this embodiment, the value corresponding to the first flag bit includes a first value and a second value; wherein, if the length of the common character prefix does not exceed a preset threshold, the value of the first flag bit is the first value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is N and the positions of the N bits in the data structure; if the length of the common character prefix exceeds the preset threshold, the value of the first flag bit is the second value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is M and the positions of the M bits in the data structure; the value of M is greater than the value of N.

[0192] In this embodiment, the indexes in the target index sequence include composite indexes; the composite index is an index composed of at least two indexes, including an index of the string type.

[0193] In this embodiment, at least two bits in the data structure are preset second flag bits; the second flag bits are used to mark the index identifier of the composite index stored in the node on the binary search tree, relative to the index identifier of the common index of the composite index stored in the parent node corresponding to the node.

[0194] The binary search tree construction unit is further configured to determine the composite index stored in the node on the binary search tree, the index identifier of the common index of the composite index stored relative to the parent node corresponding to the node, and at least one non-common index.

[0195] Based on the index identifier of the common index, the length of the common character prefix of the non-common index of the node on the binary search tree relative to the parent node of the node, and the index content corresponding to the part after the common character prefix of the non-common index of the node on the binary search tree relative to the parent node of the node, the indexes stored in the nodes on the binary search tree are respectively encoded into the data structure.
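For composite indexes, the decomposition described above can be sketched as follows. The sketch is illustrative only and rests on two assumptions: a composite index is modeled as a tuple of byte strings, and the index identifier of the common index is taken to be the number of leading components that are identical in the node and its parent (which must stay below 8 to fit the 3-bit second flag). The name `split_composite` and the parent value used in the example are hypothetical.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Length of the common prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def split_composite(node: tuple[bytes, ...],
                    parent: tuple[bytes, ...]) -> tuple[int, int, bytes]:
    """Return (common_count, prefix_len, remainder) for a composite index.

    common_count: leading components equal in node and parent.
    prefix_len:   common character prefix length of the first non-common
                  component relative to the parent's matching component.
    remainder:    the part of that component after the common prefix.
    """
    common = 0
    for a, b in zip(node, parent):
        if a != b:
            break
        common += 1
    if common >= len(node):
        return common, 0, b""          # every component is shared
    first = node[common]
    other = parent[common] if common < len(parent) else b""
    p = common_prefix_len(first, other)
    return common, p, first[p:]


# Values mirroring the Figure 5 example; the parent's second component
# b"XYZ" is a made-up value for illustration.
result = split_composite((b"ABCDEFG", b"XY"), (b"ABCDEFG", b"XYZ"))
```

With these inputs the result is one common component, a common character prefix of length 2, and an empty remainder, matching the values described for the left child node in the Figure 5 example.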

[0196] In this embodiment, the data structure includes 64 bits; the 64 bits include 4 bits for representing the first flag bit and the second flag bit; the first flag bit corresponds to 1 bit of the 4 bits; the second flag bit corresponds to 3 bits of the 4 bits.

[0197] In this embodiment, the high 4 bits or the low 4 bits of the 64 bits are used to represent the first flag bit and the second flag bit.

[0198] In this embodiment, the preset threshold is 4096; the value of N is 12; and the value of M is 20.

[0199] Wherein, if the length of the common character prefix does not exceed 4096, the value of the first flag bit is 1, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 12, and the 12 bits are either the lower 12 bits or the higher 12 bits of the 64 bits; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 48, and the 48 bits are either the lower 12-59 bits or the higher 12-59 bits of the 64 bits;

[0200] If the length of the common character prefix exceeds 4096, the value of the first flag bit is 0, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 20; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 40, and the 40 bits are the lower 20-59 bits or the higher 20-59 bits of the 64 bits.

[0201] The specific implementation process of the functions and roles of each unit in the above device can be found in the implementation process of the corresponding steps in the above method, and will not be repeated here.

[0202] For the device embodiments, since they basically correspond to the method embodiments, the relevant parts can be referred to in the description of the method embodiments. The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of the solution in this specification according to actual needs. Those skilled in the art can understand and implement this without creative effort.

[0203] Corresponding to the embodiments of the index lookup method described above, this specification also provides embodiments of the index lookup system.

[0204] The aforementioned index lookup system may include:

[0205] An index sequence acquisition subsystem is used to acquire a target index sequence containing the index to be searched; at least a portion of the indexes in the target index sequence are string type indexes.

[0206] A lookup index encoding subsystem is used to encode the lookup index into a data structure of the index stored in a node of a binary search tree corresponding to the target index sequence, and to determine a first value corresponding to the encoded data structure; wherein, the nodes of the binary search tree corresponding to the target index sequence are used to store each index contained in the target index sequence; the data structure of the index stored in the node of the binary search tree includes several bits for filling at least a portion of the index content of the index stored in the node, and several bits for filling the length of the common character prefix of the index stored in the node relative to the index stored in the parent node corresponding to the node;

[0207] The index matching subsystem is used to match the first value with the second value corresponding to the data structure of the node on the binary search tree using a binary search method, so as to find the index to be searched on the binary search tree.

[0208] The implementation process of the functions and roles of each subsystem in the above system is detailed in the implementation process of the corresponding steps in the above method, and will not be repeated here.

[0209] For system embodiments, since they basically correspond to method embodiments, relevant details can be found in the descriptions of the method embodiments. The system embodiments described above are merely illustrative. The subsystems described as separate components may or may not be physically separate, and the components shown as subsystems may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of the solution described in this specification, depending on actual needs. Those skilled in the art can understand and implement this without creative effort. This specification also provides an embodiment of a computer-readable storage medium. The computer-readable storage medium stores machine-readable instructions, which, when called and executed by a processor, can implement the index lookup method provided in any embodiment of this specification.

[0210] The computer-readable storage media provided in the embodiments of this specification may include, but are not limited to, any type of disk (including floppy disk, hard disk, optical disk, CD-ROM, and magneto-optical disk), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. In other words, readable storage media include readable media capable of storing or transmitting information.

[0211] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer, which can take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email sending and receiving device, game console, tablet computer, wearable device, or any combination of these devices.

[0212] In a typical configuration, a computer includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0213] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0214] Computer-readable media include both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0215] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0216] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.

[0217] The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to limit the scope of one or more embodiments of this specification. The singular forms "a," "said," and "the" as used in one or more embodiments of this specification and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more associated listed items.

[0218] It should be understood that although the terms first, second, third, etc., may be used to describe various information in one or more embodiments of this specification, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, first information may also be referred to as second information without departing from the scope of one or more embodiments of this specification, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "at the time of," "when," or "in response to a determination."

[0219] The above description is merely a preferred embodiment of one or more embodiments of this specification and is not intended to limit the scope of one or more embodiments of this specification. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of one or more embodiments of this specification should be included within the protection scope of one or more embodiments of this specification.

Claims

1. An index lookup method, the method comprising: obtaining a target index sequence containing the index to be searched, wherein at least a portion of the indexes in the target index sequence are string type indexes; encoding the index to be searched into a data structure of the index stored in a node of a binary search tree corresponding to the target index sequence, and determining a first value corresponding to the encoded data structure; wherein the nodes of the binary search tree corresponding to the target index sequence are respectively used to store each index contained in the target index sequence; the data structure of the index stored in the node of the binary search tree includes several bits for filling at least part of the index content of the index stored in the node, and several bits for filling the length of the common character prefix of the index stored in the node relative to the index stored in the parent node corresponding to the node; and using a binary search method to match the first value with the second value corresponding to the data structure of the node on the binary search tree, so as to find the index to be searched on the binary search tree; wherein the bits used to fill the length of the common character prefix of the index stored in the node relative to the index stored in the corresponding parent node are determined in either of the following ways: the number of bits used to fill the length of the common character prefix in the data structure is predetermined; or a first flag bit is set in the data structure, the first flag bit being used to mark the bits in the data structure used to fill the length of the common character prefix.

2. The method according to claim 1, wherein the database maintains an index tree for storing data; and the obtaining of the target index sequence containing the index to be searched includes: using a binary search method to find the target index sequence containing the index to be searched from the index tree maintained by the database for storing data.

3. The method according to claim 1, further comprising, before encoding the index to be searched into a data structure of the index stored in the nodes of the binary search tree corresponding to the target index sequence: constructing, in real time, a binary search tree corresponding to the obtained target index sequence; calculating the length of the common character prefix of the index stored in the node on the binary search tree relative to the index stored in the parent node corresponding to that node; and, based on the calculated length of the common character prefix, and on the index content corresponding to the part after the common character prefix of the index stored in the node on the binary search tree relative to the index stored in the parent node of the node, respectively encoding the indexes stored in the nodes of the binary search tree into the data structure.

4. The method according to claim 3, wherein at least one bit in the data structure is a preset first flag bit; and the encoding of the indexes stored in the nodes of the binary search tree into the data structure, based on the calculated length of the common character prefix and on the index content corresponding to the part after the common character prefix of the index stored in the node relative to the index stored in the parent node of that node, includes: determining, based on the first flag bit, the bit positions in the data structure for filling the length of the common character prefix, and filling the calculated length of the common character prefix into the determined bit positions; and storing the index content corresponding to the part after the common character prefix, of the index stored in the node on the binary search tree relative to the index stored in the parent node of the node, into the remaining bit positions other than the determined bit positions.

5. The method according to claim 4, wherein the value corresponding to the first flag bit includes a first value and a second value; wherein, If the length of the common character prefix does not exceed a preset threshold, the first flag bit takes a first value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is N and the positions of the N bits in the data structure; if the length of the common character prefix exceeds the preset threshold, the first flag bit takes a second value, indicating that the number of bits used to fill the length of the common character prefix in the data structure is M and the positions of the M bits in the data structure; the value of M is greater than the value of N.

6. The method according to claim 3, wherein the indexes in the target index sequence include composite indexes; the composite index is an index composed of at least two indexes, including an index of the string type.

7. The method according to claim 6, wherein at least two bits in the data structure are preset second flag bits; the second flag bits are used to mark the index identifier of the common index of the composite index stored in the node on the binary search tree relative to the composite index stored in the parent node corresponding to the node; and the encoding of the indexes stored in the nodes of the binary search tree into the data structure, based on the calculated length of the common character prefix and on the index content corresponding to the part after the common character prefix of the index stored in the node relative to the index stored in the parent node of that node, includes: determining, for the composite index stored in the node on the binary search tree relative to the composite index stored in the parent node corresponding to that node, the index identifier of the common index and at least one non-common index; and, based on the index identifier of the common index, the length of the common character prefix of the non-common index of the node on the binary search tree relative to the parent node of the node, and the index content corresponding to the part after the common character prefix of the non-common index of the node on the binary search tree relative to the parent node of the node, respectively encoding the indexes stored in the nodes on the binary search tree into the data structure.

8. The method according to claim 7, wherein the data structure comprises 64 bits; the 64 bits include 4 bits for representing the first flag bit and the second flag bit; the first flag bit corresponds to 1 bit of the 4 bits; and the second flag bit corresponds to 3 bits of the 4 bits.

9. The method according to claim 8, wherein the high 4 bits or the low 4 bits of the 64 bits are used to represent the first flag bit and the second flag bit.

10. The method according to claim 9, wherein the preset threshold is 4096; N is 12; and M is 20; wherein, if the length of the common character prefix does not exceed 4096, the value of the first flag bit is 1, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 12, and the 12 bits are either the lower 12 bits or the higher 12 bits of the 64 bits; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 48, and the 48 bits are either the lower 12-59 bits or the higher 12-59 bits of the 64 bits; and if the length of the common character prefix exceeds 4096, the value of the first flag bit is 0, indicating that the number of bits used to fill the length of the common character prefix in the data structure is 20; correspondingly, the number of bits used to fill the index content corresponding to the non-common character prefix in the data structure is 40, and the 40 bits are the lower 20-59 bits or the higher 20-59 bits of the 64 bits.

11. An index lookup system that performs the method as described in any one of claims 1-10.

12. An electronic device, comprising a communication interface, a processor, a memory, and a bus, wherein the communication interface, the processor, and the memory are interconnected via the bus; The memory stores machine-readable instructions, and the processor executes the method according to any one of claims 1-10 by invoking the machine-readable instructions.

13. A computer-readable storage medium storing machine-readable instructions that, when invoked and executed by a processor, implement the method of any one of claims 1-10.