Adaptive indexing method and device

An adaptive indexing technology, applied in the database field, that addresses slow database convergence, high resource consumption in query processing, and low query efficiency. It reduces resource consumption while improving the convergence speed of the database.

Inactive Publication Date: 2015-04-15
NEC (CHINA) CO LTD
2 Cites 4 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0007] A database using the HCC algorithm continuously updates the result data block set through range data queries, and continuously updates the index of the result data block set, to gradually improve the query efficiency of the database. The index records the data range of each result data block, which improves the efficiency of locating a result data block. However, after the result data block is located through the index, the query efficiency is s...

Method used

In the embodiment of the invention, a cost condition is set so that only the result data blocks whose sorting cost meets the requirement are sorted. This reduces resource consumption in the query process during the early stage of the database and improves query efficiency at the same time. Moreover, sorting the result data blocks that satisfy the cost condition gradually makes the result data blocks in the result data block set ordered. Because one sort can achieve a convergence effect that would otherwise require several splits, it improves query efficiency more effectively, so the convergence speed of the database is improved. Therefore, the convergence speed of the database is improved while resource consumption in the early-stage query process is reduced.

Abstract

The invention discloses an adaptive indexing method and an adaptive indexing device, and belongs to the field of database technology. The adaptive indexing method includes: receiving a query request that carries a range query condition; obtaining at least one first result data block corresponding to the query request according to the range query condition; among the first result data blocks, sorting the data within those blocks whose data is unordered and which meet a preset cost condition; and updating a result data block set according to the sorted result data blocks and the unsorted result data blocks, and updating an index of the result data block set. With the adaptive indexing method and device, the convergence rate of a database is improved while resource consumption during the early query process of the database is reduced.

Examples

Example Embodiment

[0030] Embodiment one
[0031] The embodiment of the present invention provides an adaptive indexing method in which the result data block set of the database is indexed. As shown in Figure 1, the processing flow of the method may include the following steps:
[0032] Step 101, receiving a query request carrying range query conditions.
[0033] Step 102, according to the range query condition, at least one first result data block corresponding to the query request is obtained.
[0034] Step 103: among the first result data blocks corresponding to the query request, perform intra-block data sorting on the result data blocks whose intra-block data is unordered and which satisfy the preset cost condition.
[0035] Step 104: Update the result data block set according to the sorted first result data block and the unsorted first result data block, and update the index of the result data block set.
[0036] In the embodiment of the invention, a cost condition is set so that only the result data blocks whose sorting cost meets the requirement are sorted, which reduces resource consumption in the query process during the early stage of the database. Sorting the result data blocks that satisfy the cost condition gradually makes the result data blocks in the result data block set ordered. Because one sort can achieve a convergence effect that would otherwise require several splits, it improves query efficiency more effectively, so the convergence speed of the database is improved. Therefore, the convergence speed of the database is improved while resource consumption in the early-stage query process is reduced.

[0037] Embodiment two
[0038] The embodiment of the present invention provides an adaptive indexing method. The result data block set of the database has an index, and the data range of each result data block in the result data block set can be recorded in the index. Preferably, an AVL tree can be used as the index of the result data block set, with each leaf node of the AVL tree recording the data range of one result data block. The method may be executed by a server or a terminal device in which a database is established.
[0039] The processing flow shown in Figure 1 is described in detail below in combination with specific implementations:
[0040] Step 101, receiving a query request carrying range query conditions.
[0041] Wherein, the range query condition is a query condition for querying data within a certain data range, for example, the range query condition may be greater than a and less than b, or greater than c, and so on.
[0042] In implementation, when performing large-scale data analysis, a database can be established for the data to be analyzed, and analysts can set range query conditions according to the needs of the analysis and send the corresponding query requests to the database. For example, to query all data in the database within the range 1-100, the corresponding range query condition can be [1,100].
[0043] Alternatively, an application program can establish a database locally on a terminal. While using the application program, the user can query range data locally on the terminal by sending a query request with a range query condition to the terminal. For example, when using instant messaging, the user can query friends aged 20-30, and the corresponding range query condition can be [20,30].
[0044] Step 102, according to the range query condition, at least one first result data block corresponding to the query request is obtained.
[0045] Here, the first result data block corresponding to the query request is a data block composed of data satisfying the range query condition of the query request. In this step, one or more first result data blocks can be obtained. Each first result data block obtained by the query may be an original result data block (that is, a result data block taken directly from the result data block set) or a newly generated result data block.
[0046] Specifically, data query may be performed in the initial data block set and/or the result data block set of the database according to the range query condition, to obtain the first result data block corresponding to the query request. In the early stage after the database is established, no query has been performed, and the database contains only an initial data block set and no result data block set, so the query can only be performed in the initial data block set. In the middle stage of database operation, the database contains both the initial data block set and the result data block set, and queries can be performed in both. In the later stage of database operation, after a large number of range data queries, the data in the initial data block set has been transferred to the result data block set through the range data query process, and the database contains only a result data block set and no initial data block set, so the query can only be performed in the result data block set. As the proportion of the result data block set in the database increases, the query efficiency of the database gradually increases.
[0047] The data ranges of the initial data block set and the result data block set do not intersect, and the data range of the result data block set can be recorded. The relationship between the data range of the result data block set and the query range (the data range of the range query condition) determines whether to query the result data block set and whether to query the initial data block set. Correspondingly, when step 102 is executed, different processing can be performed in the following situations:
[0048] Processing 1, if the data range of the result data block set completely includes the query range, then determine at least one first result data block corresponding to the query request according to the data in the result data block set that meets the range query condition.
[0049] Here, the result data block set includes the result data blocks obtained when range data queries are performed on the database, and it can be updated during the range data query process. An index over the result data blocks may be established for the result data block set, and the data range of each result data block may be recorded in the index. Preferably, an AVL tree can be used as the index of the result data block set. In the AVL tree, each leaf node corresponds to one result data block, and the leaf nodes are arranged in the order of the data ranges of the corresponding result data blocks (owing to the way the result data block set is established and updated, the data ranges of the result data blocks do not overlap).
[0050] In implementation, the data range of the result data block set can be compared with the query range. If the query range lies completely within the data range of the result data block set, the data to be queried is in the result data block set and the initial data block set has none of it, so processing 1 applies and only the result data block set is queried. The data range of the result data block set can be obtained from the index (such as the AVL tree) of the result data block set.
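The index described above can be sketched in a few lines. This is only an illustration under stated assumptions: a sorted Python list stands in for the AVL tree, and the names `RangeIndex`, `add`, `overall_range` and `overlapping` are hypothetical, not taken from the patent.

```python
import bisect


class RangeIndex:
    """Sorted list of non-overlapping block ranges (hypothetical stand-in for the AVL tree)."""

    def __init__(self):
        self._ranges = []  # (low, high) tuples, kept sorted by lower bound

    def add(self, low, high):
        # insort keeps the list ordered, mirroring the ordered leaf nodes.
        bisect.insort(self._ranges, (low, high))

    def overall_range(self):
        # Smallest lower bound to largest upper bound, or None if the set is empty.
        if not self._ranges:
            return None
        return (self._ranges[0][0], self._ranges[-1][1])

    def overlapping(self, q_low, q_high):
        # Linear scan for clarity; a balanced tree would locate these blocks faster.
        return [(lo, hi) for lo, hi in self._ranges if lo <= q_high and hi >= q_low]
```

A balanced tree keeps insertion and lookup logarithmic; the list version trades that for brevity and is only meant to show what the index records.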
[0051] Specifically, according to the relationship between the data range of each result data block in the result data block set and the query range, different methods may be used to obtain the first result data block corresponding to the query request, specifically as follows:
[0052] Case 1: for each result data block in the result data block set, if only part of the data in the block is within the query range, determine whether the data in the block is ordered. If the data in the block is unordered, use the crack (split) method to query, within the block, the data that meets the range query condition, which forms a first result data block corresponding to the query request. If the data in the block is ordered, query the data that meets the range query condition according to the order of the data in the block, which forms a first result data block corresponding to the query request.
[0053] In implementation, the data range of the result data block can be compared with the query range, and if the latter contains a part of the former, it can be determined that a part of the data of the result data block is within the query range. The data range of the result data block can be obtained from the index of the result data block set, and the index records the data range of each result data block in the result data block set.
[0054] For example, if the result data block set includes result data blocks with ranges [1,10], [11,20], [21,30], [31,40], and the range query condition is [5,35], then the result data blocks with ranges [1,10] and [31,40] belong to case 1, and the result data blocks with ranges [11,20] and [21,30] belong to case 2 below.
[0055] For the result data block of case 1, because the result data block contains both data that meets the range query conditions and data that does not meet the range query conditions, it is necessary to query within the block to determine the data that meets the range query conditions.
[0056] There are many ways to determine whether the data in a block is ordered. Preferably, a result data block can be marked as having ordered intra-block data when it is sorted, and any result data block without this mark is determined to have unordered intra-block data.
[0057] When the crack method is used to perform an intra-block query on a result data block, the data in the block that meets the range query condition is extracted from the result data block, and at the same time the remaining data in the result data block is split: the data greater than the range query condition and the data less than the range query condition are placed in different data blocks, yielding two (or one) new result data blocks, and the extracted data meeting the range query condition constitutes the result data block of the query request. In this way, two or three result data blocks are obtained, one of which is the first result data block corresponding to the query request.
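The splitting step of the crack method can be sketched as a single pass over the block. This is a minimal illustration that assumes a block is a plain Python list and the query range is closed; the function name `crack` and the three-list return shape are hypothetical.

```python
def crack(block, q_low, q_high):
    """Split an unordered block around the closed query range [q_low, q_high]."""
    below, matching, above = [], [], []
    for value in block:
        if value < q_low:
            below.append(value)      # smaller than the range query condition
        elif value > q_high:
            above.append(value)      # larger than the range query condition
        else:
            matching.append(value)   # satisfies the range query condition
    return below, matching, above
```

When the query range covers one end of the block, one of `below` or `above` comes back empty, which corresponds to the "two (or one) new result data blocks" wording above.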
[0058] There are many ways to query the data that meets the range query condition according to the order of the data in the block. Preferably, the half (binary search) method can be used. Specifically, the position of each boundary value of the range query condition is first located within the block by the half method; then the data in the block that meets the range query condition is determined from the positions of the boundary values and the order of the data in the block.
[0059] For example, suppose the data range of the result data block is [0,12], the data in the block is in increasing order, and the data range of the range query condition is [8,40]. Take the midpoint 6 of 0 and 12 and compare it with 8; 6 is less than 8, so next compare the midpoint 9 of 6 and 12 with 8, and so on, until the position of 8 in the result data block is determined. Because the data is in increasing order, all the data from the position of 8 onward is taken as the data in the result data block that meets the range query condition.
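The half method on an ordered block can be sketched with Python's standard `bisect` module, which performs the binary search for each boundary. The function name `query_sorted_block` is hypothetical, and the sketch assumes the block is a list in increasing order and the query range is closed.

```python
import bisect


def query_sorted_block(block, q_low, q_high):
    """Return the values of an increasing block that lie in [q_low, q_high]."""
    start = bisect.bisect_left(block, q_low)    # first position with value >= q_low
    end = bisect.bisect_right(block, q_high)    # first position with value > q_high
    return block[start:end]
```

For a block holding the integers 0 to 12 and the query range [8,40], the slice starts at the position of 8 and runs to the end of the block, as in the example above.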
[0060] Case 2: for each result data block in the result data block set, if all the data in the block is within the query range, the block itself is used as a first result data block corresponding to the query request.
[0061] In implementation, the data range of the result data block can be compared with the query range, and if the latter completely includes the former, it can be determined that all the data of the result data block is within the query range.
[0062] For a result data block of case 2, because all of its data is within the query range, no intra-block data query is needed. All the data in the block forms a first result data block corresponding to the query request, that is, the result data block is used directly as a first result data block corresponding to the query request.
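The distinction between case 1 and case 2 reduces to a range comparison. The following sketch assumes closed integer ranges given as `(low, high)` tuples; `classify_block` and its string labels are hypothetical names used only for illustration.

```python
def classify_block(block_range, query_range):
    """Classify a result data block against a closed query range."""
    b_low, b_high = block_range
    q_low, q_high = query_range
    if b_high < q_low or b_low > q_high:
        return "outside"   # no data of the block is within the query range
    if q_low <= b_low and b_high <= q_high:
        return "case 2"    # the whole block is within the query range
    return "case 1"        # only part of the block is within the query range
```

With the ranges from paragraph [0054] and the query [5,35], the blocks [1,10] and [31,40] come out as case 1, and [11,20] and [21,30] as case 2.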
[0063] Processing two, if the data range of the result data block set does not include the query range at all, then determine at least one first result data block corresponding to the query request according to the data in the initial data block set that meets the range query condition.
[0064] Here, the initial data block set includes one or more initial data blocks, and each initial data block is obtained by grouping the original data according to pre-established rules when the database is established. In processing two, the data found in the initial data block set that meets the range query condition preferably forms a single result data block for the query request.
[0065] In implementation, the data range of the result data block set can be compared with the query range. If the two do not intersect, the result data block set contains no data that meets the range query condition, while the initial data block set may contain such data, so processing two applies and only the initial data block set is queried.
[0066] Specifically, in the second process, the crack method can be used to query the data meeting the range query condition in the initial data block set, and the queried data can be combined to obtain the first result data block corresponding to the query request.
[0067] Each initial data block can be queried by the crack method: the data that meets the range query condition is extracted from the initial data block, and at the same time the remaining data in the initial data block is split, that is, the data greater than the range query condition and the data less than the range query condition are placed in different data blocks, yielding two (or one) new initial data blocks. Then, all the data meeting the range query condition extracted from the initial data block set is combined to obtain one first result data block corresponding to the query request.
[0068] Processing three: if the data range of the result data block set includes only part of the query range, determine at least one first result data block corresponding to the query request according to the data in the result data block set that meets the range query condition, and determine at least one first result data block corresponding to the query request according to the data in the initial data block set that meets the range query condition. The first result data blocks determined in the result data block set and those determined in the initial data block set together serve as the first result data blocks corresponding to the query request.
[0069] In implementation, the data range of the result data block set can be compared with the query range. If the former contains part of the latter, the result data block set contains data that meets the range query condition, and the initial data block set may also contain such data, so processing three applies and the initial data block set and the result data block set are queried separately. For the specific processing, refer to processing one and processing two. Then, all the first result data blocks determined in the result data block set and the first result data blocks determined in the initial data block set are used as the first result data blocks corresponding to the query request.
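The choice among the three kinds of processing can be sketched as a comparison of two ranges. The sketch assumes the data range of the result data block set is summarized as one closed interval (or `None` while no result data block set exists yet); that summary, the function name `choose_processing` and the returned labels are assumptions made for illustration.

```python
def choose_processing(result_set_range, query_range):
    """Decide which block sets to query for a closed query range."""
    q_low, q_high = query_range
    if result_set_range is None:
        return "processing two"      # only the initial data block set exists
    r_low, r_high = result_set_range
    if r_low <= q_low and q_high <= r_high:
        return "processing one"      # query only the result data block set
    if r_high < q_low or q_high < r_low:
        return "processing two"      # query only the initial data block set
    return "processing three"        # query both sets and merge the results
```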
[0070] Step 103: among the first result data blocks corresponding to the query request, perform intra-block data sorting on the result data blocks whose intra-block data is unordered and which satisfy the preset cost condition.
[0071] Wherein, the preset cost condition may be a requirement on resource consumption of the current query and/or subsequent query, and the resource may be time, processing resources (processor, memory and other resources) and the like.
[0072] Specifically, the processing of step 103 may be as follows:
[0073] First, in the first result data block corresponding to the query request, a result data block whose data in the block is out of order is selected as the second result data block.
[0074] In step 103, whether to sort the data in a first result data block is decided according to the preset cost condition. Therefore, only the result data blocks whose intra-block data is unordered are selected for subsequent processing, and for result data blocks whose data is already ordered there is no need to decide whether to sort. For the method of determining whether the data in a block is ordered, refer to processing 1 above; it is not repeated here.
[0075] Then, the current income and subsequent income corresponding to each second result data block are obtained.
[0076] Here, the current income corresponding to a result data block is the amount of resources saved in this query when intra-block data sorting is not performed now, that is, the amount of resources this query saves by not sorting the data in the result data block, compared with sorting it. The subsequent income corresponding to the result data block is the amount of resources saved by subsequent queries (for example, the queries processed within a preset period such as one week or one month) when the data in the block is sorted now, that is, the total amount of resources saved in subsequent query processing within the preset period by sorting the data in the result data block during this query, compared with not sorting it.
[0077] A cost function u = F(T) can be set to represent the relationship between the amount of resources consumed and the processing time, where T is the processing time and u is the cost value, which represents the amount of resources consumed. The resource can be time, processing resources, and so on. This function can be considered linear, that is, F(T) = a*T  (1)
[0078] Here, a is a fixed coefficient (it can be calculated from experimental data). If the resource is time, then F(T) = T.
[0079] According to the definition of current income and subsequent income, they can be represented by F(T) as follows:
[0080] Current income: F(T_0 + T_C + T_S) - F(T_0 + T_C)  (2)
[0081] Subsequent income: p*t*(F(T_1 + T_C) - F(T_1 + T_B))  (3)
[0082] Here, T_C is the crack-method query processing time, that is, the time to query a data block by the crack method; T_B is the half-method query processing time, that is, the time to query a data block by the half method; T_S is the sorting processing time, that is, the time to sort a data block; T_0 is the processing time of the work other than the crack query and sorting in the current query process; T_1 is the processing time of the work other than the crack query and the half-method query in each subsequent query process; p is the data query frequency of the result data block; and t is the preset time period for calculating the subsequent income, such as one month or three months. The subsequent income can be regarded as the amount of resources saved over that subsequent period. The data query frequency of the result data block indicates how often the data in the result data block is queried, and it is an estimated value.
[0083] As can be seen, F(T_0 + T_C + T_S) represents, for a given result data block, the amount of resources consumed by this query process if the data in the block is sorted, and F(T_0 + T_C) represents the amount of resources consumed by this query process if the data in the block is not sorted. Their difference is the amount of resources saved in this query process by not sorting the data in the result data block, which is the current income corresponding to the result data block.
[0084] F(T_1 + T_C) approximately represents, for a given result data block, the amount of resources consumed by each subsequent query process if the data in the block is not sorted, and F(T_1 + T_B) approximately represents the amount of resources consumed by each subsequent query process if the data in the block is sorted. Their difference is the amount of resources saved in each subsequent query process by sorting the data in the result data block. Multiplying this difference by p and t gives the subsequent income corresponding to the result data block.
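Formulas (1) to (3) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the default a = 1.0 corresponds to the case where the resource is time.

```python
def cost(T, a=1.0):
    """Linear cost function F(T) = a*T from formula (1)."""
    return a * T


def current_income(T0, TC, TS, a=1.0):
    """Formula (2): resources saved in this query by not sorting now."""
    return cost(T0 + TC + TS, a) - cost(T0 + TC, a)


def subsequent_income(T1, TC, TB, p, t, a=1.0):
    """Formula (3): resources saved over period t by later queries if sorted now."""
    return p * t * (cost(T1 + TC, a) - cost(T1 + TB, a))
```

Because F is linear, T_0 and T_1 cancel out of the two differences, which leaves a*T_S for the current income and p*t*a*(T_C - T_B) for the subsequent income.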
[0085] Specifically, the process of obtaining the current revenue corresponding to each second result data block may be as follows:
[0086] Obtain the sorting processing duration corresponding to each second result data block; determine the current income corresponding to each second result data block according to the sorting processing duration corresponding to each second result data block.
[0087] From formulas (1) and (2), the current income can be expressed as a*T_S, where a is a fixed coefficient. Therefore, the current income corresponding to a given result data block can be determined from the T_S corresponding to that result data block.
[0088] Specifically, the process of obtaining the subsequent income corresponding to each second result data block may be as follows:
[0089] Obtain the query processing time corresponding to each second result data block; obtain the data query frequency of each second result data block; and determine the subsequent income corresponding to each second result data block according to its query processing time and its data query frequency.
[0090] Wherein, the query processing time may include the query processing time of the crack method and the query processing time of the half method.
[0091] From formulas (1) and (3), the subsequent income can be expressed as p*t*a*(T_C - T_B), where a is a fixed coefficient and t is a preset value. Therefore, the subsequent income corresponding to a given result data block can be determined from the T_C, T_B and p corresponding to that result data block.
[0092] Preferably, in the above processing, the sorting processing time corresponding to each second result data block can be determined according to the number of data items (n) in that block, and the query processing time corresponding to each second result data block can likewise be determined according to n. Specifically, both the crack-method query processing time and the half-method query processing time corresponding to each second result data block are determined according to the number of data items in that block.
[0093] For a result data block containing n data items, the computational complexity of a crack-method query is proportional to n, the computational complexity of a half-method query is proportional to log_2 n, and the computational complexity of sorting is proportional to n*log_2 n. Since the processing time is proportional to the computational complexity, the following relationships are obtained:
[0094] T_C = A*n, T_B = B*log_2 n, T_S = C*n*log_2 n  (4)
[0095] Here, A, B and C are fixed coefficients determined by the software and hardware environment of the database. Their values can be determined through experiments: run experiments on the three processes of crack-method query, half-method query and sorting, and then determine A, B and C from the n, T_C, T_B and T_S values in the experimental data.
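One way to determine A, B and C from experimental data is a least-squares fit of each relationship in (4) through the origin. The patent does not specify a fitting procedure, so this choice and the names `fit_through_origin` and `fit_coefficients` are assumptions; each sample is taken to be a tuple (n, T_C, T_B, T_S) measured for one block size, with n greater than 1.

```python
import math


def fit_through_origin(xs, ts):
    """Least-squares slope k for the model t = k*x."""
    return sum(x * t for x, t in zip(xs, ts)) / sum(x * x for x in xs)


def fit_coefficients(samples):
    """Estimate (A, B, C) from measured (n, T_C, T_B, T_S) tuples."""
    ns = [s[0] for s in samples]
    logs = [math.log2(n) for n in ns]
    A = fit_through_origin(ns, [s[1] for s in samples])                  # T_C = A*n
    B = fit_through_origin(logs, [s[2] for s in samples])                # T_B = B*log2(n)
    C = fit_through_origin([n * l for n, l in zip(ns, logs)],
                           [s[3] for s in samples])                      # T_S = C*n*log2(n)
    return A, B, C
```

Since A, B and C depend on the software and hardware environment, the measurements would need to be repeated on each deployment.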
[0096] There are many ways to obtain the data query frequency of a result data block. For example, a preset default value (such as 0.5) can be used, and the default value can be an empirical value. Alternatively, the frequency with which previous queries touched data within the data range of the result data block can be counted and used as the data query frequency of the result data block. Alternatively, the average of the query frequencies of the individual data items in the result data block over previous queries can be calculated and used as the data query frequency of the result data block. Preferably, in the initial stage after the database is established, when the number of historical queries is small, the default value is used as the data query frequency of the result data block; once the number of queries on the database reaches a certain number, a statistical method is used to determine the data query frequency of the result data block.
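The switch between the default value and a statistical estimate can be sketched as follows. The patent leaves the exact statistic open, so several details here are assumptions: past queries are stored as closed `(low, high)` ranges, a query counts as a hit when its range overlaps the block range, the frequency is hits per unit of observed time, and `min_history` is a hypothetical threshold.

```python
def estimate_query_frequency(block_range, past_queries, observed_time,
                             default=0.5, min_history=100):
    """Estimate how often data in block_range is queried per unit of time."""
    if len(past_queries) < min_history:
        return default  # too little history: fall back to the empirical default
    b_low, b_high = block_range
    hits = sum(1 for q_low, q_high in past_queries
               if q_low <= b_high and q_high >= b_low)
    return hits / observed_time
```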
[0097] According to the above formulas, the current income and subsequent income can be expressed as follows:
[0098] The current income is a*C*n*log_2 n, and the subsequent income is p*t*a*(A*n - B*log_2 n).
[0099] Then, for each second result data block, evaluate the preset relationship between its current income and its subsequent income, and select, among all the second result data blocks, those whose preset relationship satisfies the preset condition as the third result data blocks.
[0100] Here, the cost condition in the selection process is a requirement on the preset relationship between the current income and the subsequent income corresponding to the result data block. The preset relationship can be any calculation relationship, such as a ratio, a difference, or a product, and can be set according to actual needs.
[0101] In implementation, the preset condition can be set according to actual needs and can be expressed as an equality or an inequality. For example, when more attention is paid to the subsequent income, the preset relationship can be required to satisfy the following inequality:
[0102] subsequent income / (current income + subsequent income) > α  (5)
[0103] For another example, when paying more attention to the early income, you can set the requirement that the preset relationship satisfy the following inequality:
[0104]
[0105] Here, α and β can be preset values greater than 0 and less than 1, set as needed. A higher value of α indicates a higher requirement on the subsequent income, and a lower value of β indicates a higher requirement on the early-stage income.
[0106] Substitute the expressions obtained above (current income a*C*n*log_2 n, subsequent income p*t*a*(A*n - B*log_2 n)) into the equality or inequality corresponding to the preset condition, and then substitute the n and p of a given result data block. Whether the resulting equality or inequality holds determines whether the preset relationship between the current income and the subsequent income corresponding to that result data block satisfies the preset condition.
[0107] Taking (5) as an example, the following inequality can be obtained:
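[0108] $\dfrac{p \cdot t \cdot a \cdot (A \cdot n - B \cdot \log_2 n)}{a \cdot C \cdot n \cdot \log_2 n + p \cdot t \cdot a \cdot (A \cdot n - B \cdot \log_2 n)} > \alpha$
[0109] That is, $(1-\alpha) \cdot p \cdot t \cdot (A \cdot n - B \cdot \log_2 n) > \alpha \cdot C \cdot n \cdot \log_2 n \qquad (7)$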
[0110] If the inequality (7) holds true, it can be determined that the relationship between the current income and the subsequent income corresponding to the result data block satisfies the preset condition of (5). For other preset conditions, similar methods may be used to calculate and determine them, and the embodiments of the present invention will not repeat them here.
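The check of inequality (7) for a single result data block can be sketched as follows. This is only an illustrative sketch: the function name and the numeric values are hypothetical, and the coefficients t, A, B and C are assumed to have been fixed as described earlier in the description.

```python
import math


def satisfies_inequality_7(n, p, t, A, B, C, alpha):
    """Return True if inequality (7) holds for a block.

    n     -- number of data items in the result data block
    p     -- data query frequency of the block
    alpha -- preset threshold, 0 < alpha < 1
    t, A, B, C -- coefficients from the income expressions (values assumed)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    log_n = math.log2(n)
    left = (1 - alpha) * p * t * (A * n - B * log_n)
    right = alpha * C * n * log_n
    return left > right


# Hypothetical numbers: a frequently queried block passes, a rarely queried one does not.
frequent = satisfies_inequality_7(n=1024, p=0.5, t=100, A=1.0, B=1.0, C=1.0, alpha=0.5)
rare = satisfies_inequality_7(n=1024, p=0.01, t=100, A=1.0, B=1.0, C=1.0, alpha=0.5)
```

Note that the coefficient a cancels when the ratio form is rearranged into (7), so it does not appear as a parameter.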
[0111] Finally, intra-block data sorting is performed on each third result data block.
[0112] There are many methods for data sorting, and any method can be used for data sorting according to requirements, which is not limited in the embodiment of the present invention.
[0113] Preferably, after the intra-block data sorting is performed on the third result data block, the third result data block can be recorded as a result data block with ordered intra-block data.
[0114] After the above sorting process, each first result data block corresponding to the query request may include sorted result data blocks and/or unsorted result data blocks.
[0115] Step 104: Update the result data block set according to the sorted first result data block and the unsorted first result data block, and update the index of the result data block set.
[0116] Specifically, among the first result data blocks corresponding to the query request, for a first result data block whose data was sorted within the block during this query: if the result data block set contains the pre-sort result data block corresponding to this first result data block, replace that pre-sort result data block with the first result data block; if the result data block set contains no such pre-sort result data block, add the first result data block to the result data block set. For a first result data block that was not sorted within the block during this query: if the data of the first result data block is not contained in the result data block set, add the first result data block to the result data block set.
[0117] When the result data block set is updated, the index of the result data block set can be updated based on the updated set, and an index item corresponding to each newly added result data block can be added to the index to record the data range of that block. The index preferably uses an AVL tree, and the AVL tree may be updated in a conventional manner, which is not limited in this embodiment of the present invention.
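The update rules of Step 104 can be sketched as below. This is a minimal sketch under stated assumptions: the `Block` and `ResultBlockSet` names are hypothetical, a block is identified with its pre-sort version by having the same data range, and a sorted list maintained with `bisect` stands in for the AVL tree that the description prefers. Removal of blocks that were split by a crack is not covered.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Block:
    lo: str            # lower bound of the block's data range
    hi: str            # upper bound of the block's data range
    data: list
    is_sorted: bool = False


class ResultBlockSet:
    def __init__(self):
        self.blocks = {}   # (lo, hi) -> Block
        self.index = []    # ordered list of (lo, hi); stand-in for an AVL tree

    def update(self, first_blocks):
        for blk in first_blocks:
            key = (blk.lo, blk.hi)
            if key not in self.blocks:
                # Data not yet in the set: add the block and an index item.
                self.blocks[key] = blk
                bisect.insort(self.index, key)
            elif blk.is_sorted:
                # A pre-sort version exists: replace it with the sorted block.
                self.blocks[key] = blk
            # Unsorted block whose data is already in the set: nothing to do.


rs = ResultBlockSet()
rs.update([Block("c", "i", list("gdcfh"))])              # unsorted block is added
rs.update([Block("c", "i", list("cdfgh"), True)])        # sorted version replaces it
rs.update([Block("a", "b", ["b", "a"])])                 # new range gets a new index item
```

The index stays ordered by data range, so locating the blocks that overlap a query range can use binary search in this sketch, which plays the role of the tree lookup.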
[0118] Figure 2 shows an application example provided by the embodiment of the present invention. When the database is established, the original data is divided according to pre-established division rules to obtain the initial data blocks A1, A2, A3 and A4, which form the initial data block set.
[0119] Query 1 is the first range data query after the database is established, and its processing process can be as follows:
[0120] Step 1: Receive a first query request carrying a range query condition. The range query condition is "c to i".
[0121] Step 2: Use the crack method to query the data in the initial data block set that meets the range query condition, and merge the queried data to obtain the result data block B1.
[0122] Step 3: After judgment, it is determined that the result data block B1 does not meet the preset cost condition, so it is not sorted within the block.
[0123] Step 4, add the result data block B1 to the set of result data blocks.
[0124] The processing of query 2 can be as follows:
[0125] Step 1: Receive a second query request carrying a range query condition. The range query condition is "e to l".
[0126] Step 2: Use the crack method to query the data in the result data block set that meets the range query condition, obtaining the result data block B3 and splitting off the result data block B2; then use the crack method to query the data in the initial data block set that meets the range query condition, and merge the queried data to obtain the result data block B4.
[0127] Step 3: After judgment, it is determined that the result data blocks B3 and B4 both meet the preset cost condition and that the data in these blocks is out of order, so the result data blocks B3 and B4 are sorted to obtain the result data blocks B3' and B4'.
[0128] Step 4, add the result data blocks B3', B4' to the result data block set.
[0129] In the embodiment of the invention, by setting the cost condition, only the result data blocks whose sorting cost meets the requirement are sorted, so the resource consumption of query processing can be reduced in the early stage of the database. In addition, sorting the result data blocks that satisfy the cost condition gradually makes the result data blocks in the result data block set ordered. Because one sort achieves a convergence effect that would otherwise require several splits, query efficiency improves more effectively, and the convergence speed of the database can therefore be improved. Thus the resource consumption of early-stage query processing is reduced while the convergence speed of the database is improved.
[0130] The embodiment of the present invention also provides a data query method that adopts the above adaptive indexing method. In addition to the processing described above, the data query method further includes feeding back the data in the first result data block corresponding to the query request as the query result.
[0131] The result feedback step can be performed after obtaining the first result data block corresponding to the query request, or after the result data block set is updated, or after the index of the result data block set is updated.
[0132] In the embodiment of the invention, by setting the cost condition, only the result data blocks whose sorting cost meets the requirement are sorted, so in the early stage of the database the resource consumption of query processing can be reduced while query efficiency is improved. In addition, sorting the result data blocks that satisfy the cost condition gradually makes the result data blocks in the result data block set ordered. Because one sort achieves a convergence effect that would otherwise require several splits, query efficiency improves more effectively, and the convergence speed of the database can therefore be improved. Thus the resource consumption of early-stage query processing is reduced while the convergence speed of the database is improved.

Example Embodiment

[0133] Embodiment three
[0134] Based on the same technical idea, the embodiment of the present invention also provides an adaptive indexing device, in which the result data block set of the database is indexed. As shown in Figure 3, the device includes:
[0135] A receiving module 310, configured to receive a query request carrying a range query condition;
[0136] An acquisition module 320, configured to acquire at least one first result data block corresponding to the query request according to the range query condition;
[0137] A sorting module 330, configured to, in the first result data block corresponding to the query request, perform intra-block data sorting on the result data blocks whose data in the block is out of order and satisfy a preset cost condition;
[0138] The update module 340 is configured to update the result data block set according to the sorted first result data blocks and the unsorted first result data blocks, and update the index of the result data block set.
[0139] Preferably, the acquiring module 320 is configured to:
[0140] If the data range of the result data block set completely includes the query range, then determine at least one first result data block corresponding to the query request according to the data in the result data block set that meets the range query condition; wherein the query range is the data range of the range query condition;
[0141] If the data range of the result data block set does not include the query range at all, then determine at least one first result data block corresponding to the query request according to the data in the initial data block set that meets the range query condition;
[0142] If the data range of the result data block set includes only part of the query range, then determine at least one first result data block corresponding to the query request according to the data in the result data block set that meets the range query condition, and determine at least one first result data block corresponding to the query request according to the data in the initial data block set that meets the range query condition; the first result data blocks determined in the result data block set and those determined in the initial data block set together serve as the first result data blocks corresponding to the query request.
[0143] Preferably, the acquiring module 320 is configured to:
[0144] For each result data block in the result data block set, if only part of the data in the block is within the query range, judge whether the data in the block is ordered. If the data in the block is out of order, use the crack (split) method to query the data in the block that meets the range query condition, forming a first result data block corresponding to the query request; if the data in the block is ordered, query the data that meets the range query condition according to the order of the data in the block, forming a first result data block corresponding to the query request;
[0145] For each result data block in the set of result data blocks, if all the data in the block is within the query range, it is used as the first result data block corresponding to the query request.
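The per-block handling described in the two cases above can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name is hypothetical, the query range is assumed to be closed at both ends, the block's data range is assumed to be available from the index, and the crack is written as a one-pass three-way split into new lists rather than an in-place reorganization.

```python
import bisect


def acquire_from_block(data, is_sorted, blk_lo, blk_hi, q_lo, q_hi):
    """Return (below, hit, above); `hit` forms the first result data block."""
    # Case 1: all data in the block is within the query range.
    if q_lo <= blk_lo and blk_hi <= q_hi:
        return [], list(data), []
    # Case 2a: ordered block, so the matching run is found by binary search.
    if is_sorted:
        left = bisect.bisect_left(data, q_lo)
        right = bisect.bisect_right(data, q_hi)
        return data[:left], data[left:right], data[right:]
    # Case 2b: out-of-order block, so crack it with a single scan.
    below, hit, above = [], [], []
    for item in data:
        if item < q_lo:
            below.append(item)
        elif item > q_hi:
            above.append(item)
        else:
            hit.append(item)
    return below, hit, above


# Query range "e to l" against a block covering "c to h".
cracked = acquire_from_block(list("gdcfh"), False, "c", "h", "e", "l")
ordered = acquire_from_block(list("cdfgh"), True, "c", "h", "e", "l")
whole = acquire_from_block(list("gf"), False, "f", "g", "e", "l")
```

For an ordered block the lookup cost grows with the logarithm of the block size, while the crack scans every item, which is the difference the cost condition weighs against the one-time cost of sorting.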
[0146] Preferably, the acquiring module 320 is configured to:
[0147] In the initial data block set, use the crack method to query data that meets the range query condition, and combine the queried data to obtain the first result data block corresponding to the query request.
[0148] Preferably, the sorting module 330 is used for:
[0149] Among the first result data blocks corresponding to the query request, select the result data blocks whose data in the block is out of order as the second result data blocks;
[0150] Obtain the current income and the subsequent income corresponding to each second result data block, calculate the preset relationship between the current income and the subsequent income corresponding to each second result data block, and select, from all the second result data blocks, the result data blocks whose preset relationship satisfies the preset condition as the third result data blocks; wherein the current income corresponding to a second result data block is the amount of resources saved by this query when the data in the block is not sorted now, and the subsequent income corresponding to the second result data block is the amount of resources saved by subsequent queries when the data in the block is sorted now;
[0151] Intra-block data sorting is performed on each third result data block.
[0152] Preferably, the sorting module 330 is used for:
[0153] According to the number of data items in each second result data block, determine the sorting processing duration corresponding to each second result data block;
[0154] According to the sorting processing duration corresponding to each second result data block, the current income corresponding to each second result data block is determined.
[0155] Preferably, the sorting module 330 is used for:
[0156] Determine the query processing duration corresponding to each second result data block according to the number of data in each second result data block;
[0157] Obtain the data query frequency of each second result data block;
[0158] The subsequent revenue corresponding to each second result data block is determined according to the query processing time corresponding to each second result data block and the data query frequency of each second result data block.
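The processing attributed to the sorting module 330 can be sketched end to end as below. This is a sketch under assumptions: the class and function names are hypothetical, the income expressions are the ones quoted in paragraph [0106], the selection uses the ratio form shown in paragraph [0108], and the coefficient values are placeholders rather than values from the description.

```python
import math
from dataclasses import dataclass


@dataclass
class Block:
    data: list
    is_sorted: bool
    query_freq: float   # p, the data query frequency of the block


def current_income(n, a, C):
    # Resources saved now by not sorting: a * C * n * log2(n).
    return a * C * n * math.log2(n)


def subsequent_income(n, p, t, a, A, B):
    # Resources saved by later queries if sorted now: p * t * a * (A*n - B*log2(n)).
    return p * t * a * (A * n - B * math.log2(n))


def sort_selected_blocks(blocks, *, a, t, A, B, C, alpha):
    for blk in blocks:
        n = len(blk.data)
        # Second result data blocks: out of order (assumption: skip blocks with < 2 items).
        if blk.is_sorted or n < 2:
            continue
        cur = current_income(n, a, C)
        sub = subsequent_income(n, blk.query_freq, t, a, A, B)
        total = cur + sub
        # Third result data blocks: ratio of subsequent income to total exceeds alpha.
        if total > 0 and sub / total > alpha:
            blk.data.sort()
            blk.is_sorted = True
    return blocks


hot = Block(list("hgfedcba") * 16, False, 0.5)     # 128 items, queried often
cold = Block(list("hgfedcba") * 16, False, 0.001)  # 128 items, queried rarely
done = Block([1, 2, 3], True, 0.9)                 # already ordered
sort_selected_blocks([hot, cold, done], a=1.0, t=100, A=1.0, B=1.0, C=1.0, alpha=0.5)
```

With these placeholder values only the frequently queried block is sorted; the rarely queried block stays out of order and remains a candidate for later queries.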
[0159] In the embodiment of the invention, by setting the cost condition, only the result data blocks whose sorting cost meets the requirement are sorted, so the resource consumption of query processing can be reduced in the early stage of the database. In addition, sorting the result data blocks that satisfy the cost condition gradually makes the result data blocks in the result data block set ordered. Because one sort achieves a convergence effect that would otherwise require several splits, query efficiency improves more effectively, and the convergence speed of the database can therefore be improved. Thus the resource consumption of early-stage query processing is reduced while the convergence speed of the database is improved.
[0160] An embodiment of the present invention also provides a data query device, the data query device includes the above-mentioned self-adaptive indexing device, and the data query device further includes:
[0161] A feedback module, configured to feed back the data in the first result data block corresponding to the query request as a query result.
[0162] In the embodiment of the invention, by setting the cost condition, only the result data blocks whose sorting cost meets the requirement are sorted, so in the early stage of the database the resource consumption of query processing can be reduced while query efficiency is improved. In addition, sorting the result data blocks that satisfy the cost condition gradually makes the result data blocks in the result data block set ordered. Because one sort achieves a convergence effect that would otherwise require several splits, query efficiency improves more effectively, and the convergence speed of the database can therefore be improved. Thus the resource consumption of early-stage query processing is reduced while the convergence speed of the database is improved.
[0163] It should be noted that, when the adaptive indexing device provided by the above embodiment updates the index, the division into the above functional modules is used only as an example. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the adaptive indexing device provided in the above embodiment and the adaptive indexing method embodiment belong to the same idea; the specific implementation process is detailed in the method embodiment and is not repeated here.
[0164] The serial numbers of the above embodiments of the present invention are for description only, and do not represent the advantages and disadvantages of the embodiments.