A method and apparatus for secure data statistics, a storage medium and an electronic device

By performing sequence partitioning and parallel statistics on dense-state data, the problem of network interaction latency in dense-state data statistics is solved, and a more efficient statistical process is achieved.

CN116232919B · Active · Publication Date: 2026-06-19 · ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
Filing Date
2022-12-31
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, when dense-state data statistics must be performed multiple times over several different statistical ranges, the operations run serially and network interaction latency accumulates, resulting in low efficiency.

Method used

A combined serial and parallel approach is adopted. First, the dense-state data is arranged to obtain a sequence. The sequence is then divided according to the preset segmentation rules and edge segment rules, and dense-state statistics are performed in parallel to obtain the statistical values of each subsequence and each edge segment. Finally, the overall statistical result is determined.

Benefits of technology

Parallel processing reduces network interaction latency, decreases serialization, and improves the efficiency of dense data statistics.

✦ Generated by Eureka AI based on patent content.


Abstract

This specification discloses a method, apparatus, storage medium, and electronic device for dense-state data statistics. The method includes: first, arranging the determined dense-state data to be statistically analyzed to obtain a dense-state data sequence; then, for each preset segmentation rule, dividing the dense-state data sequence into subsequences according to that rule; and performing dense-state statistics on the dense-state data contained in each subsequence in parallel to obtain the segmented statistical value of each subsequence. Next, for each preset edge segmentation rule, dividing the dense-state data sequence into edges according to that rule; and determining the edge statistical value of each edge segment in parallel based on the segmented statistical value of each subsequence corresponding to each segmentation rule; and finally, determining the statistical result of the dense-state statistics for each dense-state data based on the edge statistical value. Using a combined serial and parallel approach for dense-state data statistics can reduce serialization and decrease network interaction latency.

Description

Technical Field

[0001] This specification relates to the field of computer technology, and in particular to a method, apparatus, storage medium, and electronic device for dense data statistics.

Background Technology

[0002] With the development of trusted encrypted computing technology, encrypted computing has been widely used. By using encrypted data for computation in a trusted environment, privacy data can be better protected. Trusted encrypted computing is a trusted privacy computing technology that performs computation, storage, and transfer of data in encrypted form in a high-speed interconnected trusted node cluster.

[0003] Dense-state data statistics is a computational method that uses a specific strategy to obtain output results from input dense-state data. Typically, dense-state data statistics are not performed only once within a fixed statistical range, but rather multiple times across several different statistical ranges. For example, given a set of data numbered 0-3, and needing to perform statistics on the dense-state data in the ranges 0-1, 2-3, and 0-3, one can perform statistics on all three ranges simultaneously, or one can first perform statistics on the dense-state data in the ranges 0-1 and 2-3, then perform statistics on the dense-state data in the range 0-3. When performing statistics on the dense-state data in the range 0-3, the statistical results for the range 0-3 can be derived from the statistical results in the ranges 0-1 and 2-3.
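The reuse described in this example can be sketched as follows. This is a minimal illustration that assumes the statistic is a sum and uses plaintext integers as stand-ins for dense-state values; the names `stat` and `combine` are hypothetical and do not come from the specification.

```python
# Plaintext stand-ins for the dense-state data numbered 0-3 (assumption:
# a real system would hold ciphertexts or secret shares instead).
data = [5, 7, 11, 13]


def stat(values):
    """Hypothetical statistic over a range; a sum is assumed here."""
    return sum(values)


def combine(left, right):
    """Merge the statistics of two adjacent ranges (valid for sums)."""
    return left + right


# First round: ranges 0-1 and 2-3 are independent and could run in parallel.
stat_0_1 = stat(data[0:2])
stat_2_3 = stat(data[2:4])

# Second round: range 0-3 is derived from the two earlier results
# instead of touching the four data items again.
stat_0_3 = combine(stat_0_1, stat_2_3)
```

The second round depends on the first, which is why the two rounds are serial while the work inside each round can be concurrent.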

[0004] Meanwhile, since the encrypted data may be stored on different nodes, encrypted data statistics operations require network interaction with the nodes storing the encrypted data to obtain it. For example, if encrypted statistics need to be performed on elements 1 to 3, but these three elements are stored on nodes 1, 2, and 3 respectively, statistical requests need to be sent to nodes 1 to 3 to perform network interaction and count the elements 1 to 3 stored on nodes 1 to 3. The network interaction latency has a significant impact on serial encrypted operations. For instance, assuming a network interaction latency of 1ms, 1000 serial encrypted operations would require a 1s latency, while 1000 concurrent operations might only require a 2ms latency.
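The arithmetic in this example can be made explicit with a rough model in which each round of network interaction costs one fixed latency and operations issued concurrently share a round. The function names and the choice of two concurrent rounds are illustrative assumptions, not figures derived in the specification.

```python
def serial_latency_ms(num_ops: int, rtt_ms: float) -> float:
    """Each serial operation waits for its own network round trip."""
    return num_ops * rtt_ms


def concurrent_latency_ms(num_rounds: int, rtt_ms: float) -> float:
    """Operations issued together share a round; only rounds add up."""
    return num_rounds * rtt_ms


serial = serial_latency_ms(1000, 1.0)     # 1000 ms, i.e. 1 s
batched = concurrent_latency_ms(2, 1.0)   # 2 ms if two rounds suffice
```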

[0005] Therefore, how to perform dense-state data statistics to reduce serialization is an urgent problem to be solved.

Summary of the Invention

[0006] This specification provides a method, apparatus, storage medium, and electronic device for dense data statistics, in order to partially solve the aforementioned problems existing in the prior art.

[0007] The following technical solution is adopted in this specification:

[0008] This specification provides a method for dense-state data statistics, including:

[0009] The dense state data to be statistically analyzed are determined, and the dense state data are arranged to obtain a dense state data sequence;

[0010] For each preset segmentation rule, the dense data sequence is divided according to the segmentation rule to obtain each subsequence corresponding to the segmentation rule. Dense state statistics are performed on the dense data contained in each subsequence corresponding to the segmentation rule in parallel to obtain the segmentation statistics value of each subsequence corresponding to the segmentation rule.

[0011] For each preset edge segmentation rule, the dense data sequence is divided according to that edge segmentation rule to obtain each edge segment corresponding to that edge segmentation rule. Based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to that edge segmentation rule are determined in parallel. The edge segmentation rule includes: rules that take the start position of the dense data sequence as the start position of the edge segmentation and the length of each edge segment being a specified length corresponding to that edge segmentation rule; or rules that take the end position of the dense data sequence as the end position of the edge segmentation and the length of each edge segment being a specified length corresponding to that edge segmentation rule.

[0012] Based on the edge segment statistics of each edge segment corresponding to each edge segment division rule, the statistical results of dense state statistics for each dense state data are determined.

[0013] Optionally, for each preset segmentation rule, the dense-state data sequence is divided according to that segmentation rule to obtain sub-sequences corresponding to that segmentation rule, specifically including:

[0014] For the i-th segmentation rule, obtain the preset specified value k;

[0015] The dense-state data sequence is divided into subsequences of length $k^i$, where i is a positive integer.
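As a minimal sketch of this division step, the function below splits a sequence into consecutive subsequences whose length is k raised to the power i. The function name is hypothetical, plain integers stand in for dense-state data, and the sketch assumes the sequence length is already a multiple of that subsequence length.

```python
def split_into_subsequences(seq, k, i):
    """Split seq into consecutive subsequences of length k**i.

    Assumes len(seq) is a multiple of k**i (padding is handled elsewhere).
    """
    if i < 1:
        raise ValueError("i must be a positive integer")
    length = k ** i
    if length > len(seq) or len(seq) % length != 0:
        raise ValueError("k**i must evenly divide the sequence length")
    return [seq[s:s + length] for s in range(0, len(seq), length)]


sequence = list(range(16))  # stand-in for dense-state data 0..15
level_2 = split_into_subsequences(sequence, k=2, i=2)
```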

[0016] Optionally, dense-state statistics are performed in parallel on the dense-state data contained in each subsequence corresponding to this segmentation rule, specifically including:

[0017] For the i-th segmentation rule, query the segmentation statistics of each subsequence corresponding to the (i-1)-th segmentation rule;

[0018] If found, then based on the segmentation statistics of each subsequence corresponding to the (i-1)th segmentation rule, determine the segmentation statistics of each subsequence corresponding to the i-th segmentation rule in parallel.

[0019] If no match is found, then for each subsequence corresponding to the i-th segmentation rule, perform dense state statistics based on the dense state data contained in the subsequence to obtain the segmentation statistics value of the subsequence.
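The found and not-found branches above can be sketched as follows. This is an illustration under stated assumptions: the statistic is a sum, plaintext integers stand in for dense-state data, the `cache` dictionary and the function name are hypothetical, and a thread pool merely stands in for whatever concurrent mechanism a real deployment uses.

```python
from concurrent.futures import ThreadPoolExecutor


def level_stats(data, k, i, cache):
    """Return statistics for subsequences of length k**i, reusing level i-1.

    `cache` maps a level index to its list of segment statistics. The
    statistic is assumed to be a sum over plaintext stand-in values, and
    len(data) is assumed to be a multiple of k**i.
    """
    length = k ** i
    prev = cache.get(i - 1)
    if prev is not None:
        # Found: each level-i value merges k adjacent level-(i-1) values.
        groups = [prev[s:s + k] for s in range(0, len(prev), k)]
    else:
        # Not found: compute directly from the underlying data.
        groups = [data[s:s + length] for s in range(0, len(data), length)]
    with ThreadPoolExecutor() as pool:
        result = list(pool.map(sum, groups))
    cache[i] = result
    return result


data = list(range(16))
cache = {}
direct = level_stats(data, k=2, i=2, cache=cache)   # no level 1 available
cache = {}
level_stats(data, k=2, i=1, cache=cache)
reused = level_stats(data, k=2, i=2, cache=cache)   # built from level 1
```

Both paths yield the same values; the reuse path touches only the eight level-1 results rather than all sixteen data items.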

[0020] Optionally, dividing the dense-state data sequence into subsequences of length $k^i$ specifically includes:

[0021] Determine whether the number of dense-state data items contained in the dense-state data sequence is $k^n$, where n is any positive integer;

[0022] If so, the dense-state data sequence is divided into subsequences of length $k^i$;

[0023] Otherwise, empty data is added to the dense-state data sequence so that the number of dense-state data items in the supplemented dense-state data sequence is $k^n$, and the supplemented dense-state data sequence is then divided into subsequences of length $k^i$.
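The padding step can be sketched as below. The function name is hypothetical, and the sketch assumes that "empty data" is a value neutral to the statistic (0 for a sum) and that the sequence is extended to the smallest power of k that is at least its current length.

```python
def pad_to_power_of_k(seq, k, empty=0):
    """Append `empty` items until len(seq) equals k**n for some n >= 1.

    `empty` should be neutral for the statistic (0 is assumed for a sum).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    target = k
    while target < len(seq):
        target *= k
    return list(seq) + [empty] * (target - len(seq))


padded = pad_to_power_of_k(list(range(1, 14)), k=2)  # 13 items -> 16 items
```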

[0024] Optionally, for each preset edge segment partitioning rule, the dense-state data sequence is partitioned according to that rule to obtain the edge segments corresponding to that rule, specifically including:

[0025] For the i-th edge segment partitioning rule, determine the step size $k^{n-i-1}$ corresponding to the i-th edge segment partitioning rule, and, according to the step size $k^{n-i-1}$, determine $m \times k^{n-i-1}$ as the specified lengths corresponding to the i-th edge segment partitioning rule, where m is a positive integer;

[0026] The dense-state data sequence is divided with the starting position of the dense-state data sequence as the starting position of each edge segment and the specified lengths corresponding to the i-th edge segment partitioning rule as the edge segment lengths, to obtain the edge segments corresponding to the i-th edge segment partitioning rule. Alternatively, the dense-state data sequence is divided with the ending position of the dense-state data sequence as the ending position of each edge segment and the specified lengths corresponding to the i-th edge segment partitioning rule as the edge segment lengths, to obtain the edge segments corresponding to the i-th edge segment partitioning rule.
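A minimal sketch of this partitioning is given below. It takes the step as k raised to the power n-i-1 and the lengths as multiples of that step. The function name is hypothetical, and the range of m (every multiple up to the full sequence length) is an assumption, since the text only states that m is a positive integer.

```python
def edge_segments(total_len, k, n, i, from_start=True):
    """List (start, end) index pairs, end exclusive, for the i-th edge rule.

    The step is k**(n - i - 1) and the lengths are m * step for m = 1, 2, ...
    (assumed here to run up to the full sequence length).
    """
    exponent = n - i - 1
    if exponent <= 0:
        raise ValueError("this sketch only covers n - i - 1 > 0")
    step = k ** exponent
    lengths = range(step, total_len + 1, step)
    if from_start:
        # Fixed start at position 0, variable end.
        return [(0, length) for length in lengths]
    # Fixed end at the last position, variable start.
    return [(total_len - length, total_len) for length in lengths]


prefixes = edge_segments(16, k=2, n=4, i=1)                     # step 4
suffixes = edge_segments(16, k=2, n=4, i=2, from_start=False)   # step 2
```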

[0027] Optionally, based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to this edge segment partitioning rule are determined in parallel, specifically including:

[0028] For the i-th edge segment partitioning rule, query the edge segment statistics of each edge segment corresponding to the (i-1)-th edge segment partitioning rule;

[0029] If found, the edge segment statistics of each edge segment corresponding to the i-th edge segment partitioning rule are determined in parallel based on the segmentation statistics of each subsequence corresponding to each segmentation rule and the edge segment statistics of each edge segment corresponding to the (i-1)-th edge segment partitioning rule.

[0030] If not found, the edge segment statistics of each edge segment corresponding to the i-th edge segment partitioning rule are determined in parallel based on the segmentation statistics of each subsequence corresponding to each segmentation rule.
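One way to read these two branches is sketched below for edge segments anchored at the start of the sequence. The sketch assumes a sum as the statistic, plaintext stand-ins for dense-state data, and that the previous edge rule uses a step k times larger than the current one (which follows from the step formula for consecutive rules). The function name and the dictionary layout are hypothetical.

```python
def edge_stats_for_step(seg_stats, step, total_len, k, prev_edges=None):
    """Prefix statistics for lengths step, 2*step, ... up to total_len.

    seg_stats:  statistics of the consecutive subsequences of length `step`.
    prev_edges: {length: statistic} from the previous edge rule, whose step
                is assumed to be step * k; None if nothing was found.
    A sum is assumed, so merging two statistics is plain addition.
    """
    prev_edges = prev_edges or {}
    coarse = step * k
    edges = {}
    # Each length is computed independently of the others in this loop,
    # so a real system could evaluate them concurrently.
    for length in range(step, total_len + 1, step):
        base_len = (length // coarse) * coarse
        if base_len in prev_edges:
            # Found: start from the longest previous edge that fits.
            total, start = prev_edges[base_len], base_len
        else:
            # Not found: rely on segment statistics alone.
            total, start = 0, 0
        for idx in range(start // step, length // step):
            total += seg_stats[idx]
        edges[length] = total
    return edges


data = list(range(16))
seg4 = [sum(data[s:s + 4]) for s in range(0, 16, 4)]
seg2 = [sum(data[s:s + 2]) for s in range(0, 16, 2)]
edges4 = edge_stats_for_step(seg4, 4, 16, 2)                     # not found
edges2 = edge_stats_for_step(seg2, 2, 16, 2, prev_edges=edges4)  # found
```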

[0031] Optionally, determining the step size $k^{n-i-1}$ corresponding to the i-th edge segment partitioning rule specifically includes:

[0032] When $n-i-1$ is not greater than 0, $mk+j$ is determined as the specified lengths corresponding to the i-th edge segment partitioning rule, where j is a positive integer less than k.

[0033] Optionally, based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to this edge segment partitioning rule are determined in parallel, specifically including:

[0034] When $n-i-1$ is not greater than 0, for each edge segment corresponding to the i-th edge segment partitioning rule, the subsequence contained in the edge segment is determined as the specified subsequence, and the dense-state data contained in the edge segment other than the subsequence is determined as the specified dense-state data.

[0035] The edge segment statistics of the edge segment are determined based on the segmentation statistics of the specified subsequence and the specified dense-state data.
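The case in which an edge segment length is not a multiple of k can be sketched as below. This is only one possible reading: the edge segment is covered greedily by the largest aligned subsequences whose segmentation statistics are available, and the few remaining items are added individually. A sum is assumed as the statistic, plaintext integers stand in for dense-state data, the function name is hypothetical, and level 1 statistics are assumed to be present.

```python
def mixed_edge_stat(data, levels, k, length):
    """Statistic of the prefix data[0:length] where length = m*k + j, 0 < j < k.

    levels[i] holds the segment statistics for subsequences of length k**i.
    """
    total, pos = 0, 0
    while length - pos >= k:
        # Largest level whose aligned subsequence still fits in the prefix.
        i = max(lv for lv in levels
                if k ** lv <= length - pos and pos % (k ** lv) == 0)
        total += levels[i][pos // k ** i]
        pos += k ** i
    # The remaining j < k items are the "specified dense-state data".
    return total + sum(data[pos:length])


data = list(range(16))
levels = {i: [sum(data[s:s + 2 ** i]) for s in range(0, 16, 2 ** i)]
          for i in (1, 2)}
stat_len_7 = mixed_edge_stat(data, levels, k=2, length=7)
```

For the length-7 prefix, the cover is subsequence 0-3, subsequence 4-5, and the single item 6.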

[0036] This specification provides an apparatus for dense-state data statistics, comprising:

[0037] The determination module is used to determine each dense state data to be statistically analyzed, and to arrange the dense state data to obtain a dense state data sequence;

[0038] The segmentation module is used to divide the dense data sequence according to each preset segmentation rule to obtain each subsequence corresponding to the segmentation rule, and to perform dense state statistics on the dense data contained in each subsequence corresponding to the segmentation rule in parallel to obtain the segmentation statistics value of each subsequence corresponding to the segmentation rule.

[0039] The edge segment module is used to divide the dense data sequence according to each preset edge segmentation rule to obtain each edge segment corresponding to the edge segmentation rule. Based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the module determines the edge segment statistics of each edge segment corresponding to the edge segmentation rule in parallel. The edge segmentation rule includes: rules that take the start position of the dense data sequence as the start position of the edge segment and the length of each edge segment is a specified length corresponding to the edge segmentation rule; or rules that take the end position of the dense data sequence as the end position of the edge segment and the length of each edge segment is a specified length corresponding to the edge segmentation rule.

[0040] The results module is used to determine the statistical results of dense state statistics for each dense state data based on the edge statistics values of each edge segment corresponding to each edge segment division rule.

[0041] Optionally, the segmentation module is specifically used to: obtain a preset specified value k for the i-th segmentation rule; and divide the dense data sequence into subsequences of length $k^i$, where i is a positive integer.

[0042] Optionally, the segmentation module is specifically used to: query the segmentation statistics of each subsequence corresponding to the (i-1)th segmentation rule for the i-th segmentation rule; if found, determine the segmentation statistics of each subsequence corresponding to the i-th segmentation rule in parallel based on the segmentation statistics of each subsequence corresponding to the (i-1)th segmentation rule; if not found, perform dense state statistics on each subsequence corresponding to the i-th segmentation rule in parallel based on the dense state data contained in the subsequence to obtain the segmentation statistics of the subsequence.

[0043] Optionally, the segmentation module is specifically used to: determine whether the number of dense-state data items contained in the dense-state data sequence is $k^n$, where n is any positive integer; if so, divide the dense-state data sequence into subsequences of length $k^i$; otherwise, add empty data to the dense-state data sequence so that the number of dense-state data items in the supplemented dense-state data sequence is $k^n$, and then divide the supplemented dense-state data sequence into subsequences of length $k^i$.

[0044] Optionally, the edge segment module is specifically used to: determine the step size $k^{n-i-1}$ corresponding to the i-th edge segment partitioning rule, and, according to the step size $k^{n-i-1}$, determine $m \times k^{n-i-1}$ as the specified lengths corresponding to the i-th edge segment partitioning rule, where m is a positive integer; partition the dense data sequence with the starting position of the dense data sequence as the starting position of the partitioned edge segments and the lengths corresponding to the i-th edge segment partitioning rule as the specified lengths, to obtain the edge segments corresponding to the i-th edge segment partitioning rule; or partition the dense data sequence with the ending position of the dense data sequence as the ending position of the partitioned edge segments and the lengths corresponding to the i-th edge segment partitioning rule as the specified lengths, to obtain the edge segments corresponding to the i-th edge segment partitioning rule.

[0045] Optionally, the edge segment module is specifically used to: query the edge segment statistics of each edge segment corresponding to the (i-1)th edge segment division rule for the i-th edge segment division rule; if found, determine the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel based on the segment statistics of each subsequence corresponding to each segmentation rule and the edge segment statistics of each edge segment corresponding to the (i-1)th edge segment division rule; if not found, determine the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel based on the segment statistics of each subsequence corresponding to each segmentation rule.

[0046] Optionally, the edge segment module is specifically used to determine $mk+j$ as the specified lengths corresponding to the i-th edge segment division rule when $n-i-1$ is not greater than 0, where j is a positive integer less than k.

[0047] Optionally, the edge segment module is specifically used to, when $n-i-1$ is not greater than 0, in parallel for each edge segment corresponding to the i-th edge segment partitioning rule, determine the subsequence contained in the edge segment as a specified subsequence, and determine the dense state data contained in the edge segment other than the subsequence as a specified dense state data; and determine the edge segment statistics value of the edge segment based on the segmentation statistics value of the specified subsequence and the specified dense state data.

[0048] This specification provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method for dense data statistics.

[0049] This specification provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the above-described method for dense data statistics.

[0050] The above-mentioned technical solutions adopted in this specification can achieve the following beneficial effects:

[0051] The method for dense-state data statistics provided in this specification first determines the dense-state data to be statistically analyzed and arranges them to obtain a dense-state data sequence. Then, for each preset segmentation rule, the dense-state data sequence is divided according to that rule to obtain subsequences corresponding to that segmentation rule. Dense-state statistics are then performed on the dense-state data contained in each subsequence corresponding to that segmentation rule in parallel to obtain the segmentation statistical value for each subsequence corresponding to that segmentation rule. Next, for each preset edge segmentation rule, the dense-state data sequence is divided according to that rule to obtain each edge segment corresponding to that rule. Based on the segmentation statistical value of each subsequence corresponding to each segmentation rule, the edge segment statistical value of each edge segment corresponding to that edge segmentation rule is determined in parallel. Finally, based on the edge segment statistical values of each edge segment corresponding to each edge segmentation rule, the statistical result of the dense-state statistics for each dense-state data is determined.

[0052] As can be seen from the above method, this method first arranges the determined dense-state data to be statistically analyzed to obtain a dense-state data sequence. For each preset segmentation rule, the dense-state data sequence is divided into subsequences according to that rule. Dense-state statistics are then performed in parallel on the dense-state data contained in each subsequence to obtain the segmented statistical values of each subsequence. Then, for each preset edge segmentation rule, the dense-state data sequence is divided into edges according to that rule. Based on the segmented statistical values of each subsequence corresponding to each segmentation rule, the edge statistical values of each edge are determined in parallel. Finally, based on the edge statistical values, the statistical results of the dense-state statistics for each dense-state data are determined. Using a combined serial and parallel approach for dense-state data statistics can reduce serialization and decrease network interaction latency.

Attached Figure Description

[0053] The accompanying drawings, which are included to provide a further understanding of this specification and form part of this specification, illustrate exemplary embodiments and their descriptions, serving to explain this specification and do not constitute an undue limitation thereof.

[0054] In the drawings:

[0055] Figure 1 is a flowchart illustrating a method for dense-state data statistics provided in this specification;

[0056] Figure 2 is a schematic diagram illustrating the segmentation of a dense-state data sequence provided in this specification;

[0057] Figure 3 is a schematic diagram illustrating the segmentation of a dense-state data sequence provided in this specification;

[0058] Figure 4 is a schematic diagram illustrating a method for filtering the divided edge segments provided in this specification;

[0059] Figure 5 is a schematic diagram of a dense-state data sequence after it has been partitioned, as provided in this specification;

[0060] Figure 6 is a schematic diagram of an apparatus for dense-state data statistics provided in this specification;

[0061] Figure 7 is a schematic diagram of an electronic device corresponding to Figure 1, as provided in this specification.

Detailed Implementation

[0062] To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this specification will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this specification, and not all of them. Based on the embodiments in this specification, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this specification.

[0063] The embodiments of this specification provide a method, apparatus, storage medium, and electronic device for dense data statistics. The technical solutions provided by the embodiments of this specification are described in detail below with reference to the accompanying drawings.

[0064] Figure 1 is a flowchart illustrating a method for dense-state data statistics provided in this specification, which specifically includes the following steps:

[0065] S100: Determine each dense state data to be statistically analyzed, and arrange the dense state data to obtain a dense state data sequence.

[0066] Encrypted data may be stored on different nodes, with each node storing only a portion of the encrypted data. When performing statistical analysis on the encrypted data, network interaction with the nodes storing the encrypted data is required to retrieve the encrypted data stored on different nodes. Encrypted data refers to data where plaintext information has been encrypted using any encryption technology.

[0067] The device used for dense-state data statistics determines each dense-state data to be statistically analyzed and arranges the dense-state data to obtain a dense-state data sequence. The device used for dense-state data statistics can be a node that stores the dense-state data to be analyzed, or it can be a node that does not store the dense-state data to be analyzed. For ease of explanation, the following description uses a node as the execution subject. When arranging the dense-state data to obtain the dense-state data sequence, the dense-state data can be arranged randomly, or it can be arranged according to preset rules; this specification does not specify a particular limitation.

[0068] For example, suppose the dense data to be counted is data 0 to 15. The node determines each dense data to be counted (i.e. data 0 to 15) and arranges the data 0 to 15 to obtain a dense data sequence. Suppose that the data 0 to 15 are arranged in the order of 0 to 15 to obtain a dense data sequence with sequence number 0 to 15.

[0069] S102: For each preset segmentation rule, the dense data sequence is divided according to the segmentation rule to obtain each subsequence corresponding to the segmentation rule. Dense state statistics are performed on the dense data contained in each subsequence corresponding to the segmentation rule in parallel to obtain the segmentation statistics value of each subsequence corresponding to the segmentation rule.

[0070] For each preset segmentation rule, the node divides the dense-state data sequence into the subsequences corresponding to that segmentation rule and, in parallel, performs dense-state statistics on the dense-state data contained in each subsequence, obtaining the segmentation statistics value for each subsequence. Specifically, for the i-th segmentation rule, the node can obtain a preset specified value k and divide the dense-state data sequence into subsequences of length $k^i$. Dense-state statistics are then performed in parallel on the dense-state data contained in each subsequence corresponding to the i-th segmentation rule, resulting in the segmentation statistics value for each subsequence corresponding to the i-th segmentation rule. Here, i is a positive integer, and when the dense-state data sequence is divided into subsequences of length $k^i$, the value of $k^i$ cannot exceed the length of the dense-state data sequence, that is, the number of dense-state data items. k can be 2 or any other positive integer. For ease of explanation, this specification uses k = 2 as an example in the following description.

[0071] Continuing with the previous example and referring to Figure 2, which is a schematic diagram illustrating the segmentation of a dense-state data sequence provided in this specification: assuming the preset specified value k is 2, the node can obtain the preset specified value k (i.e., 2) for the i-th segmentation rule and divide the dense-state data sequence into subsequences of length $2^i$. Dense-state statistics are performed in parallel on the dense-state data contained in each subsequence corresponding to the i-th segmentation rule to obtain the segmentation statistics value of each such subsequence. For example, when i is 2, for the second segmentation rule, the dense-state data sequence is divided into subsequences of length $2^2$ (i.e., 4), which are the four subsequences 0-3, 4-7, 8-11 and 12-15 in Figure 2. Dense-state statistics are performed in parallel on the dense-state data contained in each of these four subsequences to obtain the segmentation statistics value of each subsequence corresponding to the second segmentation rule.
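The index ranges named in this example can be reproduced with a short helper. The function name is hypothetical; it only computes positions for k = 2 and a 16-item sequence and does not touch any data.

```python
def subsequence_ranges(total_len, k, i):
    """Inclusive (first, last) index pairs of the length-k**i subsequences."""
    length = k ** i
    return [(s, s + length - 1) for s in range(0, total_len, length)]


ranges_i2 = subsequence_ranges(16, k=2, i=2)
labels = [f"{first}-{last}" for first, last in ranges_i2]
```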

[0072] When performing dense-state statistics on the dense-state data contained in each subsequence corresponding to a segmentation rule in parallel, a node can query the segmentation statistics value of each subsequence corresponding to the (i-1)th segmentation rule for the i-th segmentation rule. If found, the node determines the segmentation statistics value of each subsequence corresponding to the i-th segmentation rule in parallel based on the segmentation statistics value of each subsequence corresponding to the (i-1)th segmentation rule. If not found, the node performs dense-state statistics on the dense-state data contained in each subsequence corresponding to the i-th segmentation rule in parallel to obtain the segmentation statistics value of that subsequence. Continuing with the previous example, a node can query the segmentation statistics value of each subsequence corresponding to the first segmentation rule for the second segmentation rule. If found, the node determines the segmentation statistics value of each subsequence corresponding to the second segmentation rule in parallel based on the segmentation statistics value of each subsequence corresponding to the first segmentation rule. If no match is found, then for each subsequence corresponding to the second segmentation rule, dense state statistics are performed in parallel based on the dense state data contained in the subsequence to obtain the segmentation statistics value of the subsequence.

[0073] Assume the node first performs dense-state statistics in parallel on the dense-state data contained in each subsequence corresponding to the first segmentation rule (i.e., the eight subsequences 0-1, 2-3, 4-5, 6-7, 8-9, 10-11, 12-13, and 14-15 in Figure 2), obtaining the segmentation statistics value of each subsequence corresponding to the first segmentation rule. The node can then find these values when it queries them and, based on them, determine in parallel the segmentation statistics value of each subsequence corresponding to the second segmentation rule. For example, the segmentation statistics value of subsequence 0-3 can be determined from the segmentation statistics values of subsequences 0-1 and 2-3 (i.e., subsequences corresponding to the first segmentation rule).

[0074] Conversely, suppose that before performing dense-state statistics for the second segmentation rule, the node has not performed dense-state statistics on the dense-state data contained in each subsequence corresponding to the first segmentation rule, and therefore has no segmentation statistics values for those subsequences. The node then cannot find the segmentation statistics values of the subsequences corresponding to the first segmentation rule. In that case, for each subsequence corresponding to the second segmentation rule (i.e., subsequences 0-3, 4-7, 8-11, and 12-15), it performs dense-state statistics in parallel on the dense-state data contained in the subsequence to obtain the segmentation statistics value of the subsequence.

[0075] The node can also determine the order of segmentation rules according to each preset segmentation rule, and then divide the dense data sequence according to each segmentation rule to obtain the subsequences corresponding to that segmentation rule. In parallel, dense statistics are performed on the dense data contained in each subsequence corresponding to that segmentation rule to obtain the segmentation statistics value of each subsequence corresponding to that segmentation rule. The process of dividing the dense data sequence according to segmentation rules and performing dense-state statistics on the dense data contained in each subsequence in parallel is similar to the process described above. The only difference is that the nodes divide the dense data sequence according to each segmentation rule in sequence, and then perform dense-state statistics on the dense data contained in each subsequence in parallel to obtain the segmentation statistics value of each subsequence. Therefore, when performing dense-state statistics on the dense data contained in each subsequence in parallel, except for the subsequences corresponding to the first segmentation rule, the segmentation statistics value of each subsequence corresponding to other segmentation rules can be determined in parallel using the segmentation statistics value of each subsequence corresponding to the previous segmentation rule. The order of the segmentation rules can be determined based on the value of i; the segmentation rules can be sorted in ascending order of i to obtain the order of the segmentation rules.
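Processing the segmentation rules in ascending order of i can be sketched as a bottom-up build in which only the first level reads the data and every later level merges values from the level before it. The function name is hypothetical, a sum over plaintext stand-ins is assumed as the statistic, and the sequence length is assumed to be a multiple of k raised to the highest level.

```python
def build_all_levels(data, k, max_i):
    """Compute segment statistics for i = 1..max_i in ascending order.

    Level 1 is computed from the data; every later level merges k adjacent
    values of the previous level.
    """
    levels = {}
    prev = list(data)
    for i in range(1, max_i + 1):
        # The merges within one level are independent of each other,
        # so a real system could issue them concurrently in one round.
        prev = [sum(prev[s:s + k]) for s in range(0, len(prev), k)]
        levels[i] = prev
    return levels


levels = build_all_levels(list(range(16)), k=2, max_i=4)
```

With 16 items and k = 2, this takes four serial rounds, one per level, instead of one serial step per data item.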

[0076] S104: For each preset edge segmentation rule, the dense state data sequence is divided according to the edge segmentation rule to obtain each edge segment corresponding to the edge segmentation rule. Based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to the edge segmentation rule are determined in parallel.

[0077] The edge segment division rule includes: a rule that takes the starting position of the dense data sequence as the starting position of the edge segment and the length of each edge segment is a specified length corresponding to the edge segment division rule; or a rule that takes the ending position of the dense data sequence as the ending position of the edge segment and the length of each edge segment is a specified length corresponding to the edge segment division rule.

[0078] For each preset edge segmentation rule, the node divides the dense data sequence according to that rule, obtaining each edge segment corresponding to that rule. Based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the node determines the edge statistics of each edge segment corresponding to that rule in parallel. The edge segmentation rule can be a rule that takes the start position of the dense data sequence as the start position of the edge segment and the length of each edge segment as a specified length corresponding to that rule, or a rule that takes the end position of the dense data sequence as the end position of the edge segment and the length of each edge segment as a specified length corresponding to that rule.

[0079] It should be noted that the edge segment mentioned in this specification refers to a sequence segment with a fixed start position and a variable end position, or a sequence segment with a fixed end position and a variable start position. The fixed start position of the edge segment is the start position of the dense data sequence, and the fixed end position of the edge segment is the end position of the dense data sequence.

[0080] Specifically, for the i-th edge segment division rule, the node can determine the step size $k^{n-i-1}$ corresponding to that rule. According to the step size $k^{n-i-1}$, each value $m \times k^{n-i-1}$ is determined as a specified length corresponding to the i-th edge segment division rule, where m is a positive integer and n can be determined from the number of dense-state data contained in the dense-state data sequence, that is, from $k^{n}$. The length $m \times k^{n-i-1}$ cannot exceed the length of the dense-state data sequence, that is, the number of dense-state data it contains. The range of values for m can therefore be determined from the rule that $m \times k^{n-i-1}$ is not greater than the length of the dense-state data sequence.

[0081] Subsequently, the node can divide the dense data sequence using the starting position of the dense data sequence as the starting position of each edge segment, with lengths equal to the specified lengths corresponding to the i-th edge segment division rule, to obtain the edge segments corresponding to the i-th edge segment division rule. The node can also divide the dense data sequence using the ending position of the dense data sequence as the ending position of each edge segment, with lengths equal to the specified lengths corresponding to the i-th edge segment division rule, to obtain the edge segments corresponding to the i-th edge segment division rule. The node can also apply both of the above methods at once to obtain the edge segments corresponding to the i-th edge segment division rule. The remainder of this specification is explained using the case in which both methods are applied. Then, based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule are determined in parallel.

[0082] Continuing with the previous example and referring to Figure 3, which is a schematic diagram of dividing a dense data sequence into edge segments as provided in this specification: n can be determined from the number of dense data in the dense data sequence, $k^{n}$; that is, from $2^{n} = 16$ we get n = 4. Assuming i = 1, the node can determine the step size corresponding to the first edge segment division rule as $2^{4-1-1} = 4$. Since the step size is 4, the values m×4 are determined as the specified lengths corresponding to the first edge segment division rule. Based on the rule that m×4 is no greater than the length of the dense data sequence (i.e., 16), the value range of m is determined to be 1 to 4. Therefore, the specified lengths are 4, 8, 12, and 16. The node can then use the starting position of the dense data sequence (i.e., the position with sequence number 0) as the starting position of each edge segment and divide the dense data sequence with the specified lengths (i.e., 4, 8, 12, and 16) to obtain the edge segments corresponding to the first edge segment division rule (i.e., subsequences 0-3, 0-7, 0-11, and 0-15 corresponding to the first edge segment division rule in Figure 3). The node can also use the ending position of the dense data sequence (i.e., the position with sequence number 15) as the ending position of each edge segment and divide the dense data sequence with the specified lengths (4, 8, 12, and 16) to obtain the edge segments corresponding to the first edge segment division rule (i.e., subsequences 12-15, 8-15, 4-15, and 0-15 corresponding to the first edge segment division rule in Figure 3).
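The arithmetic of this example can be reproduced with a short sketch. The function name `edge_segments` and the use of inclusive (start, end) index pairs are illustrative assumptions; the sketch covers only the case n-i-1 > 0.

```python
def edge_segments(k, n, i):
    """Specified lengths plus prefix and suffix edge segments for edge rule i.

    Segments are inclusive (start, end) pairs over a sequence of k**n items.
    """
    exponent = n - i - 1
    if exponent <= 0:
        raise ValueError("this sketch only covers n - i - 1 > 0")
    total = k ** n
    step = k ** exponent
    # m ranges over the positive integers with m * step <= total
    lengths = [m * step for m in range(1, total // step + 1)]
    prefixes = [(0, length - 1) for length in lengths]
    suffixes = [(total - length, total - 1) for length in lengths]
    return lengths, prefixes, suffixes


lengths, prefixes, suffixes = edge_segments(k=2, n=4, i=1)
print(lengths)   # [4, 8, 12, 16]
print(prefixes)  # [(0, 3), (0, 7), (0, 11), (0, 15)]
print(suffixes)  # [(12, 15), (8, 15), (4, 15), (0, 15)]
```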

[0083] Some of these edge segments coincide with subsequences that were already obtained in the segmentation process above and whose dense state data have already been counted. For example, subsequence 0-3 was already obtained under the second segmentation rule and its dense state data counted, and subsequence 0-1 was already obtained under the first segmentation rule and its dense state data counted. For such subsequences, dense state statistics are not performed again here. In this example, it is not necessary to perform statistics on subsequences 0-3, 0-7, 0-15, 8-15, and 12-15; only the data contained in subsequences 0-11 and 4-15 need to be counted, as shown in Figure 4, which is a schematic diagram of the edge segments after filtering, as provided in this specification. The edge segment statistics of each remaining edge segment (i.e., subsequences 0-11 and 4-15) corresponding to the first edge segment division rule can be determined in parallel based on the segmentation statistics of each subsequence corresponding to each segmentation rule. For example, the edge segment statistics of subsequence 0-11 corresponding to the first edge segment division rule can be determined based on the segmentation statistics of subsequence 0-7 corresponding to the third segmentation rule and the segmentation statistics of subsequence 8-11 corresponding to the second segmentation rule.
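A minimal sketch of this filtering step, under several assumptions: integers and addition stand in for dense-state data and the dense-state statistic, segmentation rules i = 1 to n are assumed to have been applied already (so that 0-15 is cached), and the decomposition of 4-15 into 4-7 and 8-15 is one possible choice that the text does not spell out.

```python
k, n = 2, 4
data = list(range(k ** n))  # toy values 0..15

# Cached segmentation statistics, keyed by inclusive (start, end) range.
cache = {}
for i in range(1, n + 1):
    size = k ** i
    for start in range(0, len(data), size):
        cache[(start, start + size - 1)] = sum(data[start:start + size])

# Edge segments of the first edge segment division rule (prefixes, then suffixes).
edges = [(0, 3), (0, 7), (0, 11), (0, 15), (12, 15), (8, 15), (4, 15)]

# Keep only the edge segments that have not been counted yet.
todo = [edge for edge in edges if edge not in cache]

# The remaining edge segments are independent of each other, so these two
# combinations could run in parallel.
cache[(0, 11)] = cache[(0, 7)] + cache[(8, 11)]
cache[(4, 15)] = cache[(4, 7)] + cache[(8, 15)]
print(todo)
```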

[0084] When determining the edge segment statistics of each edge segment in parallel based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the node can, for the i-th edge segment division rule, query the edge segment statistics of each edge segment corresponding to the (i-1)-th edge segment division rule. If the query succeeds, the node determines the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel, based on the segmentation statistics of each subsequence corresponding to each segmentation rule and the edge segment statistics of each edge segment corresponding to the (i-1)-th edge segment division rule. If the query fails, the node determines the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel, based only on the segmentation statistics of each subsequence corresponding to each segmentation rule.

[0085] Continuing with the previous example, assuming i is 2, a node can query the edge statistics of each edge corresponding to the first edge partitioning rule for the second edge partitioning rule. If the query finds a match, the node determines the edge statistics of each edge corresponding to the second edge partitioning rule in parallel, based on the segment statistics of each subsequence corresponding to each partitioning rule and the edge statistics of each edge corresponding to the first edge partitioning rule. If the query does not find a match, the node determines the edge statistics of each edge corresponding to the second edge partitioning rule in parallel, based on the segment statistics of each subsequence corresponding to each partitioning rule.

[0086] Assuming the node has already determined the edge segment statistics of each edge segment corresponding to the first edge segment division rule in parallel, the node's query for those statistics succeeds. Based on the segmentation statistics of each subsequence corresponding to each segmentation rule and the edge segment statistics of each edge segment corresponding to the first edge segment division rule, the node can determine the edge segment statistics of each edge segment corresponding to the second edge segment division rule in parallel. For example, based on the segmentation statistics of subsequence 2-3 and the edge segment statistics of subsequence 4-15 corresponding to the first edge segment division rule, the node can determine the edge segment statistics of subsequence 2-15 corresponding to the second edge segment division rule.

[0087] Assuming that a node does not determine the edge statistics of each edge corresponding to the first edge partitioning rule in parallel before determining the edge statistics of each edge corresponding to the second edge partitioning rule, the node cannot find the edge statistics of each edge corresponding to the first edge partitioning rule. It can determine the edge statistics of each edge corresponding to the second edge partitioning rule in parallel based on the segment statistics of each subsequence corresponding to each segmentation rule. For example, the edge statistics of subsequence 0 to 5 corresponding to the second edge partitioning rule can be determined based on the segment statistics of subsequence 4 to 5 corresponding to the first segmentation rule and the segment statistics of subsequence 0 to 3 corresponding to the second segmentation rule.
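The two cases for the second edge segment division rule (query succeeds, query fails) can be sketched as below. Integers and addition are stand-ins for dense-state data and the dense-state statistic; the helper names and the fallback decomposition of 2-15 into 2-3, 4-7, and 8-15 are assumptions made for illustration.

```python
data = list(range(16))


def total(start, end):
    """Stand-in for a previously computed statistic over an inclusive range."""
    return sum(data[start:end + 1])


seg_cache = {r: total(*r) for r in [(0, 3), (2, 3), (4, 5), (4, 7), (8, 15)]}


def stat_2_15(seg_cache, edge_cache):
    """Suffix edge segment 2-15 under the second edge segment division rule."""
    if (4, 15) in edge_cache:
        # Query succeeded: reuse the first rule's edge segment statistic.
        return seg_cache[(2, 3)] + edge_cache[(4, 15)]
    # Query failed: fall back to segmentation statistics only.
    return seg_cache[(2, 3)] + seg_cache[(4, 7)] + seg_cache[(8, 15)]


def stat_0_5(seg_cache):
    """Prefix edge segment 0-5, built from segmentation statistics only."""
    return seg_cache[(0, 3)] + seg_cache[(4, 5)]


with_edges = stat_2_15(seg_cache, {(4, 15): total(4, 15)})
without_edges = stat_2_15(seg_cache, {})
print(with_edges, without_edges, stat_0_5(seg_cache))
```

Both paths give the same value; the difference is that reusing the earlier edge segment statistic needs one combination fewer.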

[0088] The node can also determine the order of edge segmentation rules according to each preset edge segmentation rule, and then divide the dense data sequence according to each edge segmentation rule in turn, so as to obtain each edge segment corresponding to the edge segmentation rule. Based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the node can determine the edge segment statistics of each edge segment corresponding to the edge segmentation rule in parallel.

[0089] The process of dividing the dense data sequence according to the edge segment division rules and determining the edge segment statistics of each edge segment in parallel is similar to the process described above. The difference is that the node divides the dense data sequence according to each edge segment division rule in the order of the edge segment division rules and, based on the segmentation statistics of each subsequence corresponding to each segmentation rule, determines the edge segment statistics of each edge segment in parallel. Therefore, except for the edge segments corresponding to the first edge segment division rule, the edge segment statistics of the edge segments corresponding to each other edge segment division rule can be determined in parallel by using the edge segment statistics of the edge segments corresponding to the previous edge segment division rule together with the segmentation statistics of each subsequence corresponding to each segmentation rule. The edge segment statistics of the edge segments corresponding to the first edge segment division rule are determined based on the segmentation statistics of each subsequence corresponding to each segmentation rule. The order of the edge segment division rules can be determined based on the value of i: the rules can be sorted in ascending order of i to obtain the order of the edge segment division rules.

[0090] S106: Determine the statistical results of dense state statistics for each dense state data based on the edge segment statistical values corresponding to each edge segment division rule.

[0091] Nodes can determine the statistical results of dense state statistics for each dense state data based on the edge segment statistics values corresponding to each edge segment division rule. Continuing with the previous example, for each of the 16 dense state data, the node can determine the statistical result over the range from the starting position to that data and the statistical result over the range from that data to the ending position, based on the edge segment statistics values corresponding to each edge segment division rule. In other words, the 32 statistical results, two for each dense state data, are taken as the statistical results of the dense state statistics.
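To make the shape of the final output concrete, the following sequential reference computes the same 32 values for toy data. It does not use the segmented parallel scheme at all; it only shows which results the method is expected to produce, assuming integers and addition as stand-ins for dense-state data and the dense-state statistic.

```python
from itertools import accumulate

data = list(range(16))

# prefix[p]: statistic over the range from the starting position to item p
prefix = list(accumulate(data))
# suffix[p]: statistic over the range from item p to the ending position
suffix = list(accumulate(reversed(data)))[::-1]

# Two results per item, 32 in total for 16 items.
results = {p: (prefix[p], suffix[p]) for p in range(len(data))}
print(results[5])
```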

[0092] Nodes can also determine the statistical results of dense state statistics for each dense state data based on the segmentation statistics values of each subsequence corresponding to each segmentation rule alone, or based on both those segmentation statistics values and the edge segment statistics values of each edge segment corresponding to each edge segment division rule.

[0093] As can be seen from the above method, this method first determines the dense-state data to be statistically analyzed and arranges them to obtain a dense-state data sequence. Then, for each preset segmentation rule, the dense-state data sequence is divided according to that rule to obtain the subsequences corresponding to that segmentation rule. In parallel, dense-state statistics are performed on the dense-state data contained in each subsequence corresponding to that segmentation rule to obtain the segmentation statistical value of each subsequence corresponding to that segmentation rule. Next, for each preset edge segmentation rule, the dense-state data sequence is divided according to that rule to obtain the edge segments corresponding to that edge segmentation rule. Based on the segmentation statistical values of each subsequence corresponding to each segmentation rule, the edge segment statistical values of each edge segment corresponding to that edge segmentation rule are determined in parallel. Finally, based on the edge segment statistical values of each edge segment corresponding to each edge segmentation rule, the statistical results of the dense-state statistics for each dense-state data are determined. By combining serial and parallel processing for dense-state data statistics, the degree of serialization can be reduced and the network interaction latency decreased.

[0094] In step S102 above, when dividing the dense data sequence into subsequences of length $k^{i}$, the node needs to determine whether the number of dense state data contained in the dense state data sequence is $k^{n}$, where n is any positive integer. If the number of dense state data is $k^{n}$, the dense data sequence is divided into subsequences of length $k^{i}$. Otherwise, empty data is added to the dense data sequence so that the number of dense data in the supplemented dense data sequence is $k^{n}$, and the supplemented dense-state data sequence is then divided into subsequences of length $k^{i}$. The supplemented empty data can be either dense data or plaintext data; this specification does not specify a particular type. The statistical results of dense-state statistics on the dense-state data sequence after supplementing the empty data must be consistent with those on the sequence before supplementing; that is, the supplemented empty data must not affect the statistical results. Furthermore, for ease of statistical analysis, the difference between the number of dense-state data after supplementing the empty data, $k^{n}$, and the number before supplementing should be as small as possible. That is, n is the smallest integer for which $k^{n}$ is greater than the number of dense state data before the empty data is added.

[0095] Continuing with the previous example, suppose there are 15 dense state data to be counted. Obviously, 15 is not of the form $2^{n}$, so empty data can be added to the dense data sequence so that the number of dense data in the supplemented dense data sequence is $2^{n}$. In this example, one empty data item can be added to the dense data sequence, so that the number of dense data in the supplemented dense data sequence is $2^{n}$, that is, 16.
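The padding rule can be sketched as follows. The function name `pad_to_power` is hypothetical, and using 0 as the empty data assumes that the statistic is a sum, for which 0 leaves the result unchanged; other statistics would need a different neutral value.

```python
def pad_to_power(data, k, empty=0):
    """Append `empty` items until the length is the smallest k**n (n >= 1)
    that is not less than the original length."""
    size = k
    while size < len(data):
        size *= k
    return list(data) + [empty] * (size - len(data))


original = list(range(15))          # 15 items, not a power of 2
padded = pad_to_power(original, k=2)
print(len(padded))                  # 16
print(sum(padded) == sum(original)) # the empty item does not change the sum
```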

[0096] When the number of dense state data is not $k^{n}$, in step S102 above, for each preset segmentation rule, the dense data sequence is divided according to that segmentation rule to obtain subsequences corresponding to that segmentation rule. If the starting position of a divided subsequence is not less than the ending position of the dense data sequence, the subsequence is discarded and no dense state statistics are performed on it. If the starting position of the divided subsequence is less than the ending position of the dense data sequence, and the ending position of the divided subsequence is not greater than the ending position of the dense data sequence, the node can use the subsequence as a subsequence corresponding to that segmentation rule and perform dense state statistics on it. If the starting position of the divided subsequence is less than the ending position of the dense data sequence, and the ending position of the divided subsequence is greater than the ending position of the dense data sequence, the ending position of the subsequence is modified to the ending position of the dense data sequence, the modified subsequence is used as a subsequence corresponding to that segmentation rule, and dense state statistics are performed on the modified subsequence.
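The three cases above (discard, keep, clip) can be sketched as below. The sketch assumes half-open ranges, so that the "ending position" is one past the last item; the function name and the `padded_length` parameter (the $k^{n}$ length over which subsequences are nominally laid out) are illustrative assumptions.

```python
def clipped_subsequences(length, size, padded_length):
    """Half-open (start, end) subsequences of width `size`, clipped to `length`."""
    result = []
    for start in range(0, padded_length, size):
        end = start + size
        if start >= length:
            # Lies entirely past the real data: discard, no statistics.
            continue
        # Keep as is when it fits, otherwise move its end to the sequence end.
        result.append((start, min(end, length)))
    return result


print(clipped_subsequences(length=11, size=4, padded_length=16))
# [(0, 4), (4, 8), (8, 11)]
```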

[0097] In step S104 above, for each preset edge segment partitioning rule, the dense data sequence is partitioned according to that rule to obtain the corresponding edge segments. If the starting position of the partitioned edge segment is not less than the ending position of the dense data sequence, the edge segment is discarded and no dense state statistics are performed on it. If the starting position of the partitioned edge segment is less than the ending position of the dense data sequence, and the ending position of the partitioned edge segment is not greater than the ending position of the dense data sequence, the node can use the edge segment as the edge segment corresponding to that rule and perform dense state statistics on it. If the starting position of the partitioned edge segment is less than the ending position of the dense data sequence, and the ending position of the partitioned edge segment is greater than the ending position of the dense data sequence, the ending position of the edge segment is modified to the ending position of the dense data sequence, the modified edge segment is used as the edge segment corresponding to that rule, and dense state statistics are performed on the modified edge segment.

[0098] In step S104 above, when determining the step size $k^{n-i-1}$ corresponding to the i-th edge segment division rule, if n-i-1 is not greater than 0, the node can determine the values mk+j as the specified lengths corresponding to the i-th edge segment division rule. It then divides the dense data sequence according to these specified lengths to obtain the edge segments corresponding to the i-th edge segment division rule. Correspondingly, when determining the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the node can, in parallel for each edge segment corresponding to the i-th edge segment division rule, determine the subsequence contained in the edge segment as the specified subsequence, and determine the dense data contained in the edge segment other than that subsequence as the specified dense data. Based on the segmentation statistics of the specified subsequence and the specified dense data, the node determines the edge segment statistics of the edge segment. Here, mk+j cannot be greater than the length of the dense data sequence, that is, it cannot be greater than the number of dense data, and j is a positive integer less than k. The range of values for m and j can be determined accordingly. Since the segments and edge segments with even lengths have already been counted in steps S102 and S104 above, to avoid repeated counting, only the values of m and j for which mk+j is odd need to be determined. In the following description, mk+j is taken to be odd.

[0099] Continuing with the previous example, based on the number of dense data (i.e., 16) and the requirements that 2m+j is not greater than the number of dense data and that 2m+j is odd, it follows that j is 1 and that m can take values from 1 to 7. Assuming i is 3, the node determines whether n-3-1 is not greater than 0. Obviously, 4-3-1 equals 0. The node can therefore determine the values 2m+1 as the specified lengths corresponding to the third edge segment division rule, namely 3, 5, 7, 9, 11, 13, and 15. Then, based on each specified length, the dense data sequence is divided to obtain the edge segments corresponding to the third edge segment division rule, as shown in Figure 5, which is a schematic diagram of a dense data sequence after edge segment division, as provided in this specification. For example, subsequence 0-2 corresponding to the third edge segment division rule in Figure 5 is obtained with a specified length of 3, and subsequence 0-4 corresponding to the third edge segment division rule in Figure 5 is obtained with a specified length of 5. Correspondingly, when determining the edge segment statistics of each edge segment corresponding to this edge segment division rule in parallel based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the node can, in parallel for each edge segment corresponding to the third edge segment division rule, determine the subsequence contained in the edge segment as the specified subsequence, and determine the dense data contained in the edge segment other than that subsequence as the specified dense data. Based on the segmentation statistics of the specified subsequence and the specified dense data, the edge segment statistics of the edge segment are determined. For example, if one of the edge segments corresponding to the third edge segment division rule is subsequence 1-15, the subsequence contained within it, subsequence 2-15, is taken as the specified subsequence, and the dense data contained in the edge segment outside that subsequence, i.e., data 1, is taken as the specified dense data. Based on the statistics of subsequence 2-15 and data 1, the edge segment statistics of subsequence 1-15 are determined.
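The odd-length case can be sketched as follows, again with integers and addition standing in for dense-state data and the dense-state statistic. The cached value for subsequence 2-15 is computed directly here only to keep the sketch self-contained; in the method it would come from an earlier edge segment division rule.

```python
k, n = 2, 4
length = k ** n
data = list(range(length))

# Specified lengths m*k + j with 1 <= j < k and m*k + j <= length.
# For k = 2 this gives j = 1 and the odd lengths 3, 5, ..., 15.
odd_lengths = [
    m * k + j
    for m in range(1, length // k)
    for j in range(1, k)
    if m * k + j <= length
]

# Stand-in for statistics already available from earlier rules.
cached = {(2, 15): sum(data[2:16]), (0, 1): sum(data[0:2])}

# Edge segment 1-15 = specified dense data (item 1) + specified subsequence 2-15.
edge_1_15 = data[1] + cached[(2, 15)]
# Edge segment 0-2 = specified subsequence 0-1 + specified dense data (item 2).
edge_0_2 = cached[(0, 1)] + data[2]
print(odd_lengths, edge_1_15, edge_0_2)
```

Each odd-length edge segment needs only one extra combination on top of an already available even-length result, and the edge segments are independent of one another.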

[0100] The above describes a method for dense-state data statistics provided by one or more embodiments of this specification. Based on the same idea, this specification also provides a corresponding apparatus for dense-state data statistics, as shown in Figure 6.

[0101] Figure 6 is a schematic diagram of an apparatus for dense-state data statistics provided in this specification, specifically including:

[0102] The determining module 200 is used to determine each dense state data to be statistically analyzed, and to arrange the dense state data to obtain a dense state data sequence;

[0103] The segmentation module 202 is used to divide the dense data sequence according to each preset segmentation rule to obtain each subsequence corresponding to the segmentation rule, and to perform dense state statistics on the dense data contained in each subsequence corresponding to the segmentation rule in parallel to obtain the segmentation statistics value of each subsequence corresponding to the segmentation rule.

[0104] The edge segment module 204 is used to divide the dense data sequence according to each preset edge segmentation rule to obtain each edge segment corresponding to the edge segmentation rule. Based on the segmentation statistics of each subsequence corresponding to each segmentation rule, the module determines the edge segment statistics of each edge segment corresponding to the edge segmentation rule in parallel. The edge segmentation rule includes: rules that take the start position of the dense data sequence as the start position of the edge segment and the length of each edge segment being a specified length corresponding to the edge segmentation rule; or rules that take the end position of the dense data sequence as the end position of the edge segment and the length of each edge segment being a specified length corresponding to the edge segmentation rule.

[0105] The result module 206 is used to determine the statistical results of dense state statistics for each dense state data based on the edge segment statistical values of each edge segment corresponding to each edge segment division rule.

[0106] Optionally, the segmentation module 202 is specifically used to: obtain a preset specified value k for the i-th segmentation rule; and divide the dense data sequence into subsequences of length $k^{i}$, where i is a positive integer.

[0107] Optionally, the segmentation module 202 is specifically used to: for the i-th segmentation rule, query the segmentation statistics of each subsequence corresponding to the (i-1)-th segmentation rule; if found, determine the segmentation statistics of each subsequence corresponding to the i-th segmentation rule in parallel based on the segmentation statistics of each subsequence corresponding to the (i-1)-th segmentation rule; if not found, perform dense state statistics on each subsequence corresponding to the i-th segmentation rule in parallel based on the dense state data contained in the subsequence to obtain the segmentation statistics of the subsequence.

[0108] Optionally, the segmentation module 202 is specifically used to determine whether the number of dense state data contained in the dense state data sequence is $k^{n}$, where n is any positive integer; if so, divide the dense data sequence into subsequences of length $k^{i}$; otherwise, add empty data to the dense data sequence so that the number of dense data in the supplemented dense data sequence is $k^{n}$, and divide the supplemented dense-state data sequence into subsequences of length $k^{i}$.

[0109] Optionally, the edge segment module 204 is specifically used to: for the i-th edge segment division rule, determine the step size $k^{n-i-1}$ corresponding to the i-th edge segment division rule; according to the step size $k^{n-i-1}$, determine the values $m \times k^{n-i-1}$ as the specified lengths corresponding to the i-th edge segment division rule, where m is a positive integer; divide the dense data sequence with the starting position of the dense data sequence as the starting position of each edge segment and the specified lengths corresponding to the i-th edge segment division rule as the lengths, to obtain the edge segments corresponding to the i-th edge segment division rule; or, divide the dense data sequence with the ending position of the dense data sequence as the ending position of each edge segment and the specified lengths corresponding to the i-th edge segment division rule as the lengths, to obtain the edge segments corresponding to the i-th edge segment division rule.

[0110] Optionally, the edge segment module 204 is specifically used to: query the edge segment statistics of each edge segment corresponding to the (i-1)th edge segment division rule for the i-th edge segment division rule; if found, determine the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel based on the segment statistics of each subsequence corresponding to each segmentation rule and the edge segment statistics of each edge segment corresponding to the (i-1)th edge segment division rule; if not found, determine the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel based on the segment statistics of each subsequence corresponding to each segmentation rule.

[0111] Optionally, the edge segment module 204 is specifically used to determine the values mk+j as the specified lengths corresponding to the i-th edge segment division rule when n-i-1 is not greater than 0, where j is a positive integer less than k.

[0112] Optionally, the edge segment module 204 is specifically used to, when n-i-1 is not greater than 0, in parallel for each edge segment corresponding to the i-th edge segment division rule, determine the subsequence contained in the edge segment as the specified subsequence, and determine the dense state data contained in the edge segment other than that subsequence as the specified dense state data; and determine the edge segment statistics value of the edge segment based on the segmentation statistics value of the specified subsequence and the specified dense state data.

[0113] This specification also provides a computer-readable storage medium storing a computer program that can be used to execute the method for dense-state data statistics shown in Figure 1 above.

[0114] This specification also provides a schematic structural diagram of the electronic device, shown in Figure 7. As shown in Figure 7, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, memory, and non-volatile storage, and may also include other hardware required for its services. The processor reads the corresponding computer program from the non-volatile storage into memory and then runs it to implement the method for dense-state data statistics shown in Figure 1 above.

[0115] Of course, in addition to software implementation, this specification does not exclude other implementation methods, such as logic devices or a combination of hardware and software. In other words, the execution subject of the following processing flow is not limited to each logic unit, but can also be hardware or logic devices.

[0116] In the 1990s, improvements to a technology could be clearly distinguished as either hardware improvements (e.g., improvements to the circuit structure of diodes, transistors, switches, etc.) or software improvements (improvements to the method flow). However, with technological advancements, many improvements to method flows today can be considered direct improvements to the hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into the hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented using hardware physical modules. For example, a Programmable Logic Device (PLD) (such as a Field Programmable Gate Array (FPGA)) is an integrated circuit whose logic function is determined by the user programming the device. Designers can program and "integrate" a digital system onto a PLD themselves, without needing chip manufacturers to design and manufacture dedicated integrated circuit chips. Furthermore, nowadays, instead of manually manufacturing integrated circuit chips, this programming is mostly implemented using "logic compiler" software. This software is similar to the software compiler used in program development, and the source code to be compiled must be written in a specific programming language, called a Hardware Description Language (HDL). There are many HDLs, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). Currently, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should understand that by simply performing some logic programming on the method flow using one of these hardware description languages and programming it into an integrated circuit, the hardware circuit implementing the logical method flow can be easily obtained.

[0117] The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art will also recognize that, in addition to implementing the controller in purely computer-readable program code form, the same functionality can be achieved by logically programming the method steps to make the controller take the form of logic gates, switches, ASICs, programmable logic controllers, and embedded microcontrollers. Therefore, such a controller can be considered a hardware component, and the means included therein for implementing various functions can also be considered as structures within the hardware component. Alternatively, the means for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.

[0118] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or any combination of these devices.

[0119] For ease of description, the above devices are described in terms of function, divided into various units. Of course, in implementing this specification, the functions of each unit can be implemented in one or more software and / or hardware components.

[0120] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0121] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each process and / or box of the flowchart illustrations and / or block diagrams, and combinations of processes and / or boxes in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0122] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0123] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0124] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0125] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0126] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0127] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0128] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this specification may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0129] This specification can be described in the general context of computer-executable instructions that are executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. This specification can also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.

[0130] The various embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so their description is relatively brief; for relevant parts, refer to the descriptions in the method embodiments.

[0131] The above description is merely an embodiment of this specification and is not intended to limit this specification. Various modifications and variations can be made to this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this specification should be included within the scope of the claims of this specification.

Claims

1. A method for dense-state data statistics, comprising: determining each dense-state data to be statistically analyzed, and arranging the dense-state data to obtain a dense-state data sequence; for each preset segmentation rule, dividing the dense-state data sequence according to the segmentation rule to obtain each subsequence corresponding to the segmentation rule; performing dense-state statistics in parallel on the dense-state data contained in each subsequence corresponding to the segmentation rule to obtain the segment statistics of each subsequence corresponding to the segmentation rule; for each preset edge segment division rule, dividing the dense-state data sequence according to the edge segment division rule to obtain each edge segment corresponding to the edge segment division rule, wherein an edge segment refers to a sequence segment with a fixed start position and a variable end position, or a sequence segment with a fixed end position and a variable start position, the fixed start position of the edge segment being the start position of the dense-state data sequence, and the fixed end position of the edge segment being the end position of the dense-state data sequence; determining in parallel, based on the segment statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to the edge segment division rule, wherein the edge segment division rule includes: a rule that takes the start position of the dense-state data sequence as the start position of the edge segments and takes each specified length corresponding to the edge segment division rule as the length of the edge segments; or a rule that takes the end position of the dense-state data sequence as the end position of the edge segments and takes each specified length corresponding to the edge segment division rule as the length of the edge segments; and determining, based on the edge segment statistics of each edge segment corresponding to each edge segment division rule, the statistical result of performing dense-state statistics on the dense-state data.

2. The method as described in claim 1, wherein, for each preset segmentation rule, dividing the dense-state data sequence according to the segmentation rule to obtain each subsequence corresponding to the segmentation rule specifically includes: for the i-th segmentation rule, obtaining a preset specified value k; and dividing the dense-state data sequence into subsequences of length $k^i$, where i is a positive integer.

3. The method as described in claim 2, wherein performing dense-state statistics in parallel on the dense-state data contained in each subsequence corresponding to the segmentation rule specifically includes: for the i-th segmentation rule, querying the segment statistics of each subsequence corresponding to the (i-1)-th segmentation rule; if found, determining in parallel the segment statistics of each subsequence corresponding to the i-th segmentation rule based on the segment statistics of each subsequence corresponding to the (i-1)-th segmentation rule; if not found, for each subsequence corresponding to the i-th segmentation rule, performing dense-state statistics based on the dense-state data contained in the subsequence to obtain the segment statistics of the subsequence.
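
For illustration only, and not forming part of the claims, the two branches above can be sketched in plaintext. The sketch assumes that the statistic is a sum, that plain integers stand in for dense-state data, and that the sequence length is a power of k; the function names are hypothetical.

```python
from typing import List


def segment_stats_direct(data: List[int], k: int, i: int) -> List[int]:
    """Not-found branch: compute each length-k**i subsequence from the raw data."""
    size = k ** i
    return [sum(data[s:s + size]) for s in range(0, len(data), size)]


def segment_stats_from_previous(prev: List[int], k: int) -> List[int]:
    """Found branch: combine k adjacent (i-1)-th level values into one i-th level value."""
    # Each output element depends on a disjoint slice of prev,
    # so the elements could be computed in parallel.
    return [sum(prev[s:s + k]) for s in range(0, len(prev), k)]
```

For the eight items 1 to 8 with k = 2, the first-level values are [3, 7, 11, 15], and combining them pairwise gives the second-level values [10, 26] without touching the raw data again.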

4. The method as described in claim 2, wherein dividing the dense-state data sequence into subsequences of length $k^i$ specifically includes: determining whether the number of dense-state data contained in the dense-state data sequence is $k^n$, where n is any positive integer; if so, dividing the dense-state data sequence into subsequences of length $k^i$; otherwise, adding empty data to the dense-state data sequence so that the number of dense-state data in the supplemented dense-state data sequence is $k^n$, and dividing the supplemented dense-state data sequence into subsequences of length $k^i$.
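
For illustration only, and not forming part of the claims, the padding and division steps can be sketched as follows. The sketch assumes k is at least 2, picks the smallest positive n for which $k^n$ covers the sequence, and uses 0 as the empty data, which is neutral only if the statistic is a sum. The names `pad_to_power` and `split` are hypothetical, and plain integers stand in for dense-state data.

```python
from typing import List, Tuple


def pad_to_power(data: List[int], k: int, empty: int = 0) -> Tuple[List[int], int]:
    """Append empty data until the length is k**n; return the padded list and n."""
    if k < 2:
        raise ValueError("k must be at least 2")
    size, n = k, 1
    while size < len(data):
        size *= k
        n += 1
    return list(data) + [empty] * (size - len(data)), n


def split(seq: List[int], k: int, i: int) -> List[List[int]]:
    """Divide the sequence into consecutive subsequences of length k**i."""
    length = k ** i
    return [seq[s:s + length] for s in range(0, len(seq), length)]
```

Five items with k = 2 are padded to eight items (n = 3), which then divide evenly at every level i from 1 to 3.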

5. The method as described in claim 4, wherein, for each preset edge segment division rule, dividing the dense-state data sequence according to the edge segment division rule to obtain each edge segment corresponding to the edge segment division rule specifically includes: for the i-th edge segment division rule, determining the step size $k^{n-i-1}$ corresponding to the i-th edge segment division rule; according to the step size $k^{n-i-1}$, determining $m \times k^{n-i-1}$ as each specified length corresponding to the i-th edge segment division rule, where m is a positive integer; dividing the dense-state data sequence with the start position of the dense-state data sequence as the start position of the divided edge segments and each specified length corresponding to the i-th edge segment division rule as the length, to obtain each edge segment corresponding to the i-th edge segment division rule; or dividing the dense-state data sequence with the end position of the dense-state data sequence as the end position of the divided edge segments and each specified length corresponding to the i-th edge segment division rule as the length, to obtain each edge segment corresponding to the i-th edge segment division rule.

6. The method as described in claim 5, wherein determining in parallel, based on the segment statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to the edge segment division rule specifically includes: for the i-th edge segment division rule, querying the edge segment statistics of each edge segment corresponding to the (i-1)-th edge segment division rule; if found, determining in parallel the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule based on the segment statistics of each subsequence corresponding to each segmentation rule and the edge segment statistics of each edge segment corresponding to the (i-1)-th edge segment division rule; if not found, determining in parallel the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule based on the segment statistics of each subsequence corresponding to each segmentation rule.

7. The method as described in claim 5, wherein determining the step size $k^{n-i-1}$ corresponding to the i-th edge segment division rule specifically includes: when $n-i-1$ is not greater than 0, determining $mk+j$ as each specified length corresponding to the i-th edge segment division rule, where j is a positive integer less than k.

8. The method as described in claim 7, wherein determining in parallel, based on the segment statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to the edge segment division rule specifically includes: when $n-i-1$ is not greater than 0, in parallel for each edge segment corresponding to the i-th edge segment division rule, determining the subsequence contained in the edge segment as a specified subsequence, and determining the dense-state data contained in the edge segment other than the subsequence as specified dense-state data; and determining the edge segment statistics of the edge segment based on the segment statistics of the specified subsequence and the specified dense-state data.

9. An apparatus for dense-state data statistics, comprising: a determination module, used to determine each dense-state data to be statistically analyzed, and to arrange the dense-state data to obtain a dense-state data sequence; a segmentation module, used to, for each preset segmentation rule, divide the dense-state data sequence according to the segmentation rule to obtain each subsequence corresponding to the segmentation rule, and to perform dense-state statistics in parallel on the dense-state data contained in each subsequence corresponding to the segmentation rule to obtain the segment statistics of each subsequence corresponding to the segmentation rule; an edge segment module, used to, for each preset edge segment division rule, divide the dense-state data sequence according to the edge segment division rule to obtain each edge segment corresponding to the edge segment division rule, wherein an edge segment refers to a sequence segment with a fixed start position and a variable end position, or a sequence segment with a fixed end position and a variable start position, the fixed start position of the edge segment being the start position of the dense-state data sequence, and the fixed end position of the edge segment being the end position of the dense-state data sequence, and to determine in parallel, based on the segment statistics of each subsequence corresponding to each segmentation rule, the edge segment statistics of each edge segment corresponding to the edge segment division rule, wherein the edge segment division rule includes: a rule that takes the start position of the dense-state data sequence as the start position of the edge segments and takes each specified length corresponding to the edge segment division rule as the length of the edge segments; or a rule that takes the end position of the dense-state data sequence as the end position of the edge segments and takes each specified length corresponding to the edge segment division rule as the length of the edge segments; and a results module, used to determine, based on the edge segment statistics of each edge segment corresponding to each edge segment division rule, the statistical result of performing dense-state statistics on the dense-state data.

10. The apparatus of claim 9, wherein the segmentation module is specifically configured to: obtain a preset specified value k for the i-th segmentation rule; and divide the dense-state data sequence into subsequences of length $k^i$, where i is a positive integer.

11. The apparatus of claim 10, wherein the segmentation module is specifically configured to: query the segmentation statistics of each subsequence corresponding to the (i-1)th segmentation rule for the i-th segmentation rule; if found, determine the segmentation statistics of each subsequence corresponding to the i-th segmentation rule in parallel based on the segmentation statistics of each subsequence corresponding to the (i-1)th segmentation rule; if not found, perform dense state statistics on each subsequence corresponding to the i-th segmentation rule in parallel based on the dense state data contained in the subsequence to obtain the segmentation statistics of the subsequence.

12. The apparatus of claim 10, wherein the segmentation module is specifically configured to determine whether the number of dense-state data contained in the dense-state data sequence is $k^n$, where n is any positive integer; if so, divide the dense-state data sequence into subsequences of length $k^i$; otherwise, add empty data to the dense-state data sequence so that the number of dense-state data in the supplemented dense-state data sequence is $k^n$, and divide the supplemented dense-state data sequence into subsequences of length $k^i$.

13. The apparatus of claim 12, wherein the edge segment module is specifically configured to, for the i-th edge segment division rule, determine the step size $k^{n-i-1}$ corresponding to the i-th edge segment division rule; according to the step size $k^{n-i-1}$, determine $m \times k^{n-i-1}$ as each specified length corresponding to the i-th edge segment division rule, where m is a positive integer; divide the dense-state data sequence with the start position of the dense-state data sequence as the start position of the divided edge segments and each specified length corresponding to the i-th edge segment division rule as the length, to obtain each edge segment corresponding to the i-th edge segment division rule; or divide the dense-state data sequence with the end position of the dense-state data sequence as the end position of the divided edge segments and each specified length corresponding to the i-th edge segment division rule as the length, to obtain each edge segment corresponding to the i-th edge segment division rule.

14. The apparatus of claim 13, wherein the edge segment module is specifically configured to: query the edge segment statistics of each edge segment corresponding to the (i-1)th edge segment division rule for the i-th edge segment division rule; if found, determine the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel based on the segment statistics of each subsequence corresponding to each segmentation rule and the edge segment statistics of each edge segment corresponding to the (i-1)th edge segment division rule; if not found, determine the edge segment statistics of each edge segment corresponding to the i-th edge segment division rule in parallel based on the segment statistics of each subsequence corresponding to each segmentation rule.

15. The apparatus of claim 13, wherein the edge segment module is specifically configured to, when $n-i-1$ is not greater than 0, determine $mk+j$ as each specified length corresponding to the i-th edge segment division rule, where j is a positive integer less than k.

16. The apparatus of claim 15, wherein the edge segment module is specifically configured to, when $n-i-1$ is not greater than 0, in parallel for each edge segment corresponding to the i-th edge segment division rule, determine the subsequence contained in the edge segment as a specified subsequence, and determine the dense-state data contained in the edge segment other than the subsequence as specified dense-state data; and determine the edge segment statistics of the edge segment based on the segment statistics of the specified subsequence and the specified dense-state data.

17. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the method described in any one of claims 1 to 8.

18. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method described in any one of claims 1 to 8.