HPLC intelligent acquisition terminal based on big data acquisition
By combining the value scheduling, acquisition processing, cause-based re-acquisition, and graded output modules of the HPLC intelligent acquisition terminal, the problems of unreasonable node access and abnormal re-acquisition are solved, realizing the refined organization of big data acquisition and targeted repair of results, and improving the integrity and reliability of acquisition results.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HANGZHOU HUALONG ELECTRONIC TECH CO LTD
- Filing Date
- 2026-05-25
- Publication Date
- 2026-06-19
AI Technical Summary
Existing HPLC intelligent acquisition terminals arrange node access unreasonably in big data acquisition scenarios and handle abnormal re-acquisition crudely. Under limited link bandwidth and limited access time slots, this affects the integrity, timeliness and availability of acquisition results.
The value scheduling module organizes the prior information of nodes in a unified manner, ranks node value and arranges the access sequence based on the acquisition value judgment factors, and generates priority acquisition objects and candidate acquisition objects. The acquisition processing module verifies and labels the acquired data. The cause-based re-acquisition module identifies the source of each anomaly and performs differentiated re-acquisition. The graded output module classifies, encapsulates, and compiles the data for upload.
It improves the scheduling accuracy and link resource utilization efficiency in HPLC big data acquisition scenarios, reduces invalid access, enhances the integrity of abnormal data repair and the accuracy of re-acquisition results, and improves the reliability of acquisition results.
Figure CN122247778A_ABST
Description
Technical Field
[0001] This invention relates to the field of power information acquisition technology, and in particular to an HPLC intelligent acquisition terminal based on big data acquisition.
Background Technology
[0002] With the development of electricity information collection, distribution automation, and digital transformation of distribution areas, intelligent acquisition terminals based on high-speed power line carrier communication have gradually evolved from single reading devices into comprehensive acquisition carriers with communication access, status awareness, data aggregation, and edge processing capabilities. HPLC intelligent acquisition terminals rely on existing power line links to complete terminal node connection, operational status feedback, and multi-source data aggregation, demonstrating good application adaptability in low-voltage distribution area multi-node access, distributed data aggregation, and big data acquisition scenarios. Meanwhile, in HPLC big data acquisition scenarios, a large number of terminal nodes share existing power line links, resulting in limited link bandwidth, restricted concurrent access capabilities, and node online status, response status, and accessibility easily affected by fluctuations in on-site operating conditions. Therefore, terminals need to complete node access, abnormal data collection, and data aggregation under limited link resources.
[0003] Existing terminals have shortcomings. Node access arrangements often rely on fixed polling sequences, preset address sequences, or single state conditions, lacking unified organization and comprehensive judgment of historical communication states, historical acquisition states, business event states, and current communication states. This makes it difficult to prioritize high-value nodes, abnormal nodes, and key nodes in HPLC big data acquisition scenarios. In addition, when the first round of acquisition is abnormal, subsequent processing often adopts a unified re-acquisition or coarse-grained supplementary acquisition method, lacking detailed judgment of different anomaly sources and lacking targeted supplementation, labeling, and classified output of re-acquisition results. Because link bandwidth and access time slots are limited, this easily leads to a large number of invalid accesses, excessive repeated re-acquisition, unclear anomaly causes, and insufficient compilation and utilization of results during big data acquisition, affecting the completeness, timeliness, and usability of big data acquisition results.
Summary of the Invention
[0004] In view of the aforementioned existing problems, the present invention is proposed.
[0005] Therefore, this invention provides an HPLC intelligent acquisition terminal based on big data acquisition to solve the problems of unreasonable node access arrangement, crude abnormal re-acquisition, and insufficient compilation and utilization of results.
[0006] To solve the above-mentioned technical problems, the present invention provides the following technical solution:
[0007] This invention provides an HPLC intelligent acquisition terminal based on big data acquisition, comprising: a value scheduling module, used to sort the value of each node and arrange the access sequence based on the node's prior information and acquisition value judgment factors, generating priority acquisition objects, candidate acquisition objects, and corresponding access tags; an acquisition processing module, used to perform HPLC link access and data acquisition on the corresponding nodes according to the priority acquisition objects, candidate acquisition objects, and corresponding access tags, and to verify, organize, and label the acquired data, generating the first round of acquisition results and abnormal node identifiers; a cause-based re-acquisition module, used to identify the abnormal source of abnormal nodes and perform differentiated data re-acquisition based on the first round of acquisition results and abnormal node identifiers, to obtain re-acquisition data, and to supplement the re-acquisition data into the first round of acquisition results, generating enhanced acquisition results and abnormal cause tags; and a hierarchical output module, used to classify and encapsulate the data of each node in the enhanced acquisition results, and to determine the upload and compilation method of each node's data according to the abnormal cause tags, generating big data acquisition results.
[0008] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the value scheduling module includes a prior information management unit, a value ranking unit, and an access orchestration unit, and the value scheduling module is used to perform value scheduling processing;
[0009] The prior information management unit is used to collect and organize the prior information of nodes;
[0010] The value ranking unit is used to determine the collection priority of each node based on the collection value judgment factor;
[0011] The access orchestration unit is used to allocate access order and corresponding access tags according to the collection priority.
[0012] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the acquisition processing module includes a link access unit, a data acquisition unit, and a result generation unit, and the acquisition processing module is used to perform link acquisition processing;
[0013] The link access unit is used to establish HPLC link access to the corresponding node based on the priority acquisition objects, the candidate acquisition objects, and the corresponding access tags;
[0014] The data acquisition unit is used to acquire the data returned by the corresponding node during the HPLC link access process;
[0015] The result generation unit is used to perform integrity verification, time sorting, and anomaly labeling on the collected data, and to generate the first round of collection results and anomaly node identifiers.
[0016] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the cause-based re-acquisition module includes a cause discrimination unit, a re-acquisition execution unit, and an integration and generation unit. The cause-based re-acquisition module is used to perform abnormal cause determination and re-acquisition processing.
[0017] The cause discrimination unit is used to discriminate the source of an anomaly of an abnormal node as a node-related anomaly, a data integrity anomaly, a time status anomaly, or a link access anomaly.
[0018] The re-acquisition execution unit is used to perform corresponding differentiated re-acquisition processing on abnormal nodes based on the abnormal source, node identifier, and corresponding access tag, so as to obtain re-acquisition data;
[0019] The integration and generation unit is used to verify the re-acquisition data, add the re-acquisition data to the first round of acquisition results to form enhanced acquisition results, and generate abnormal cause tags based on the abnormal source.
[0020] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the graded output module includes a classification and encapsulation unit and an upload and compilation unit, and the graded output module is used to perform graded output processing;
[0021] The classification and encapsulation unit is used to classify and encapsulate the data of each node in the enhanced acquisition results;
[0022] The upload and compilation unit is used to determine the upload and compilation method of each node's data based on the anomaly cause label.
[0023] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the prior information of the node includes the node's historical communication status, historical acquisition status, business event status, and current communication status.
[0024] The acquisition value judgment factors include the degree of node change, the degree of abnormal attention, the degree of historical acquisition gaps, and the degree of link occupancy.
[0025] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the value scheduling processing specifically includes the following steps:
[0026] The node prior information of each node undergoes node identifier matching, duplicate record removal, missing record marking, time unification, and state unification to obtain the sorted node prior information;
[0027] Based on the sorted prior information of the nodes, the collection value judgment factors corresponding to each node are extracted, and the collection value judgment factors are uniformly processed and ranked to obtain the value ranking results of each node.
[0028] Based on the value ranking results, the access sequence of each node is arranged to determine the priority collection objects and the candidate collection objects, and corresponding access tags are assigned to the priority collection objects and the candidate collection objects respectively.
[0029] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the specific steps of the link acquisition processing are as follows:
[0030] Based on the priority acquisition objects, the candidate acquisition objects, and the corresponding access tags, HPLC link access and data acquisition are performed on the corresponding nodes in sequence to obtain the raw acquisition data corresponding to each node;
[0031] The original collected data is checked for node correspondence, integrity, time, and duplicate data to obtain the processed collected data.
[0032] Each node is labeled with its status. The collected data corresponding to the nodes in normal status is used to form the first round of collection results, and abnormal node identifiers are generated for nodes in abnormal status.
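The three steps of the link acquisition processing can be sketched as follows. This is a minimal illustration under stated assumptions, not the terminal's actual logic: the frame layout (`node_id`, `ts`, `value`), the `REQUIRED_FIELDS` tuple, the acquisition `window`, and the function name `first_round` are all hypothetical, and a node is treated as abnormal whenever no frame passes every check.

```python
REQUIRED_FIELDS = ("ts", "value")  # hypothetical payload fields


def first_round(expected_ids, frames, window):
    """Return (first-round results, abnormal node identifiers)."""
    start, end = window
    seen, results = set(), {}
    for frame in frames:
        node_id = frame.get("node_id")
        key = (node_id, frame.get("ts"))
        if key in seen:  # duplicate check: same node and timestamp
            continue
        seen.add(key)
        # Node correspondence: the frame must come from a scheduled node.
        corresponds = node_id in expected_ids
        # Integrity: every required field must carry a value.
        complete = all(frame.get(f) is not None for f in REQUIRED_FIELDS)
        # Time: the timestamp must fall inside the acquisition window.
        in_window = complete and start <= frame["ts"] <= end
        if corresponds and in_window:
            results[node_id] = frame
    # Nodes without any valid frame are labeled abnormal.
    abnormal_ids = [n for n in expected_ids if n not in results]
    return results, abnormal_ids


frames = [
    {"node_id": "A", "ts": 10, "value": 1.0},
    {"node_id": "A", "ts": 10, "value": 1.0},  # duplicate of the first frame
    {"node_id": "B", "ts": 12},                # incomplete: no value
    {"node_id": "Z", "ts": 11, "value": 2.0},  # not a scheduled node
]
results, abnormal_ids = first_round(["A", "B", "C"], frames, (0, 60))
```

Here node `C` never answers and node `B` returns an incomplete frame, so both end up in `abnormal_ids` while only `A` enters the first-round results.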
[0033] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the specific steps of the anomaly cause determination and re-acquisition processing are as follows:
[0034] The collected data corresponding to the abnormal state nodes are extracted and analyzed to determine the abnormal source of each abnormal state node and obtain the abnormal source discrimination result.
[0035] For each abnormal state node, perform corresponding differentiated data re-collection, and record and verify the re-collected data accordingly to obtain the re-collected data;
[0036] The re-collected data is added to the first round of collection results to generate enhanced collection results, and anomaly cause labels are generated based on the anomaly sources corresponding to the re-collected data.
[0037] As a preferred embodiment of the HPLC intelligent acquisition terminal based on big data acquisition described in this invention, the specific steps of the graded output processing are as follows:
[0038] The data of each node in the enhanced acquisition results are processed by node-by-node correspondence and classification to obtain the classified data of each node.
[0039] The data of each node after classification is classified and packaged separately, and the corresponding upload and compilation method is determined according to the abnormal reason label to obtain the compiled data of each node.
[0040] The compiled data from each node is summarized and output to generate big data collection results.
[0041] The beneficial effects of this invention are as follows: By uniformly organizing the prior information of nodes and combining it with the collection value judgment factor to complete the value ranking and access sequence arrangement, a refined organization of big data collection tasks under the conditions of limited HPLC link bandwidth and limited concurrent access capability is realized. This enables abnormal nodes, changing nodes, and historically missing nodes to enter the collection process first, reducing unresponsive access, inefficient access, and duplicate access, and improving the scheduling accuracy and link resource utilization efficiency in HPLC big data collection scenarios. By identifying the abnormal source of abnormal nodes and performing corresponding differentiated data re-collection for different abnormal sources, targeted repair and result correlation of the first round of abnormal collection results in the big data collection process are realized. This reduces the invalid link occupation caused by unified re-collection, improves the completeness of abnormal data repair, the corresponding accuracy of re-collection results, and the reliability of big data collection result output, and provides a clear basis for subsequent classification, packaging, and uploading compilation.
Description of the Drawings
[0042] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the following description of the embodiments will be briefly introduced. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0043] Figure 1 is a schematic diagram of an HPLC intelligent acquisition terminal based on big data acquisition.
[0044] Figure 2 is a schematic diagram of the data acquisition terminal.
[0045] Figure 3 is a flowchart of the value scheduling processing.
[0046] Figure 4 is a flowchart of the link acquisition processing, the anomaly cause determination and re-acquisition processing, and the graded output processing.
[0047] Figure 5 is a comparison chart of invalid access ratios.
[0048] Figure 6 is a comparison chart of the completion rate of priority acquisition objects.
Detailed Implementation
[0049] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
[0050] Many specific details are set forth in the following description in order to provide a full understanding of the invention. However, the invention may also be practiced in other ways different from those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the invention. Therefore, the invention is not limited to the specific embodiments disclosed below.
[0051] Secondly, the term "one embodiment" or "embodiment" as used herein refers to a specific feature, structure, or characteristic that may be included in at least one implementation of the present invention. The phrase "in one embodiment" appearing in different places in this specification does not necessarily refer to the same embodiment, nor is it a single or selective embodiment that is mutually exclusive with other embodiments.
[0052] Referring to Figures 1-6, one embodiment of the present invention provides an HPLC intelligent acquisition terminal based on big data acquisition, which operates as follows:
[0053] The value scheduling module is used to sort the value of each node and arrange the access sequence based on the node's prior information and the collection value judgment factor, and generate priority collection objects, candidate collection objects and corresponding access tags.
[0054] The node prior information of each node undergoes node identifier matching, duplicate record removal, missing record marking, time unification, and state unification to obtain the sorted node prior information.
[0055] It should be noted that the pre-saved node prior information of each node is retrieved. The pre-saved node prior information of each node is the information formed and saved by the acquisition terminal during the preceding communication and acquisition process. The node prior information includes at least the node's historical communication status, historical acquisition status, service event status and current communication status. Among them, the current communication status is the communication status most recently recorded by the corresponding node when the node prior information is sorted.
[0056] The prior information of each node is compared and merged record by record according to the node identifier. Prior information with the same node identifier is mapped to the same node record, completing the node identifier mapping. The mapped node prior information is then deduplicated: where several records have the same node identifier and the same content, only one is retained. Any field among the historical communication status, historical acquisition status, business event status, and current communication status that has no value is marked as missing.
[0057] After missing-field marking, time unification is performed on the node prior information. During time unification, time content recorded in different ways is converted into a unified time expression according to the same time field format; the unified time expression is determined by the time record format used for time comparison and sorting in the node prior information. State unification is then performed on the time-unified node prior information. During state unification, state content that is expressed differently but has the same meaning in the node's historical communication status, historical acquisition status, business event status, and current communication status is converted into a unified state expression according to the same state classification result, that is, the state categories formed by uniformly classifying these four kinds of status. The result is the sorted node prior information.
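The organizing steps described above (identifier mapping, deduplication, missing-field marking, time unification, state unification) can be sketched as below. This is an illustrative sketch only: the field names in `FIELDS`, the synonym table `STATE_MAP`, the `MISSING` marker, and the choice of UTC ISO-8601 as the unified time expression are assumptions, since the source does not fix a record format.

```python
from datetime import datetime, timezone

# Hypothetical field names for the four kinds of prior information.
FIELDS = ("hist_comm", "hist_acq", "event", "cur_comm")
# Hypothetical synonym table: different spellings, one state category.
STATE_MAP = {"online": "ok", "up": "ok", "ok": "ok",
             "offline": "fail", "down": "fail", "timeout": "fail"}
MISSING = "MISSING"


def unify_time(value):
    """Convert epoch seconds or ISO-8601 text to one UTC ISO-8601 string."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:  # assumption: naive timestamps are already UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).isoformat()


def organize(records):
    """Group records by node and return cleaned rows per node identifier."""
    by_node = {}
    for rec in records:
        node_id = rec["node_id"].strip().upper()      # identifier mapping
        content = {k: v for k, v in rec.items() if k != "node_id"}
        rows = by_node.setdefault(node_id, [])
        if content not in rows:                       # keep one copy only
            rows.append(content)
    organized = {}
    for node_id, rows in by_node.items():
        organized[node_id] = []
        for row in rows:
            clean = {"time": unify_time(row["time"])}  # time unification
            for field in FIELDS:
                raw = row.get(field)
                if raw is None:
                    clean[field] = MISSING             # missing marking
                else:                                  # state unification
                    text = str(raw).lower()
                    clean[field] = STATE_MAP.get(text, text)
            organized[node_id].append(clean)
    return organized


records = [
    {"node_id": "n1", "time": 0, "cur_comm": "UP"},
    {"node_id": "N1 ", "time": 0, "cur_comm": "UP"},  # same node, same content
    {"node_id": "n2", "time": "2024-01-01T08:00:00+08:00", "hist_comm": "down"},
]
organized = organize(records)
```

The sketch follows the order given in the text, so deduplication compares raw content; running it after unification instead would also merge records that differ only in spelling.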
[0058] Based on the sorted prior information of the nodes, the collection value judgment factors corresponding to each node are extracted, and the collection value judgment factors are uniformly processed and ranked to obtain the value ranking results of each node.
[0059] It should be noted that, for each node, its historical communication status, historical data collection status, service event status, and current communication status are extracted one by one according to the node identifier. The online status, response status, and accessibility status of the node's historical communication status are then compared item by item with those of the current communication status. Items whose status content has changed are extracted, and the degree of node change is formed based on the number and type of changes. The expression is as follows:
[0060] $$D_i = \sum_{k} \mathbb{I}\left(h_{i,k} \neq c_{i,k}\right)$$
[0061] where $D_i$ denotes the degree of change of the $i$-th node; $i$ denotes the node sequence number; $k$ denotes the sequence number of the communication status item; $h_{i,k}$ denotes the $k$-th communication status item in the historical communication state of the $i$-th node; $c_{i,k}$ denotes the $k$-th communication status item in the current communication state of the $i$-th node; and $\mathbb{I}(\cdot)$ denotes an indicator function that takes the value 1 if the condition within the parentheses is true, and 0 otherwise.
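As a small illustration of the degree of node change, the sketch below counts the communication status items that differ between the historical and current communication states, which is what an indicator-function sum over status items amounts to. The item names in `STATUS_ITEMS` and the dictionary representation are assumptions made for the example.

```python
# Hypothetical names for the three communication status items.
STATUS_ITEMS = ("online", "response", "accessible")


def node_change_degree(historical, current):
    """Count status items whose content differs between the two states."""
    return sum(
        1 for item in STATUS_ITEMS
        if historical.get(item) != current.get(item)
    )


historical = {"online": True, "response": True, "accessible": True}
current = {"online": True, "response": False, "accessible": False}
change = node_change_degree(historical, current)
```

In this example two of the three items changed, so the node would rank ahead of a node whose status is unchanged.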
[0062] The event records in the business event status are checked one by one. Event records that cause the current acquisition to be interrupted, delayed, or failed, or that cause data anomalies, are identified as event records that affect the current acquisition; the remaining event records are identified as event records that do not affect it. The degree of abnormal attention is formed from the existence and impact of these event records, expressed as follows:
[0063] ;
[0064] where the terms denote, in order: the degree of abnormal attention of the i-th node; the sequence number of the business event record; and the j-th business event status record of the i-th node.
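The expression itself is not reproduced in this text, so the sketch below is only one plausible reading: it assumes the degree of abnormal attention is the number of event records that affect the current acquisition. The `effect` key and the labels in `AFFECTING` are hypothetical.

```python
# Hypothetical labels for events that affect the current acquisition.
AFFECTING = {"interrupt", "delay", "failure", "data_anomaly"}


def abnormal_attention_degree(events):
    """Count event records classified as affecting the current acquisition."""
    return sum(1 for event in events if event.get("effect") in AFFECTING)


events = [
    {"effect": "delay"},
    {"effect": "none"},      # does not affect the current acquisition
    {"effect": "failure"},
]
attention = abnormal_attention_degree(events)
```

A weighted variant, where each kind of impact contributes a different amount, would fit the same description equally well.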
[0065] The historical acquisition records in the historical acquisition status are arranged chronologically. Records that did not produce valid acquisition results are identified as incomplete acquisition records. Runs of consecutive incomplete acquisition records are extracted, and the degree of historical acquisition gaps is determined from the order and duration of these consecutive incomplete acquisitions. The expression is as follows:
[0066] ;
[0067] where the terms denote, in order: the degree of historical acquisition gaps of the i-th node; the sequence number of the historical acquisition record; and the corresponding historical acquisition status record of the i-th node.
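Because the expression is not reproduced in this text, the following sketch assumes one simple measure of consecutive incomplete acquisition: the length of the longest run of records without a valid result, after chronological ordering. The `(timestamp, valid)` tuple layout and the function name are assumptions.

```python
from itertools import groupby


def historical_gap_degree(records):
    """Length of the longest run of consecutive incomplete records.

    records: iterable of (timestamp, valid) pairs in any order.
    """
    ordered = sorted(records, key=lambda r: r[0])  # chronological order
    runs = [
        sum(1 for _ in group)
        for valid, group in groupby(ordered, key=lambda r: r[1])
        if not valid
    ]
    return max(runs, default=0)


history = [(1, True), (2, False), (3, False), (4, True), (5, False)]
gap = historical_gap_degree(history)
```

Other readings, such as counting only the most recent run or weighting by elapsed time, would need the timestamps themselves rather than just their order.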
[0068] The current communication status, including online status, response status, and accessibility status, is checked item by item. Status entries that can be directly accessed and those that cannot are merged separately. The link occupancy level is then calculated based on the merged results, expressed as follows:
[0069] $$L_i = N - \left(o_i + r_i + a_i\right)$$
[0070] where $L_i$ denotes the degree of link occupancy of the $i$-th node; $o_i$ denotes the online status of the $i$-th node in its current communication state; $r_i$ denotes the response status of the $i$-th node in its current communication state; $a_i$ denotes the accessibility status of the $i$-th node in its current communication state; and $N$ denotes the total number of status items participating in the calculation of the degree of link occupancy of the $i$-th node.
[0071] The total number of status items $N$ in the formula corresponds to the three status items: online status, response status, and accessibility status. The difference between $N$ and the sum of the representation results of the three status items represents the number of status items for which the current node has not met the normal access conditions. The larger $L_i$ is, the worse the communication status of the current node, the higher the link occupancy, and the more difficult it is to directly initiate HPLC link access.
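The description of the link occupancy level (total number of status items minus the number that meet normal access conditions) can be illustrated with a short sketch. Representing each status item as a boolean that is true when the item is normal is an assumption for the example.

```python
def link_occupancy_degree(online, responding, accessible):
    """Number of status items that do not meet normal access conditions."""
    states = (online, responding, accessible)
    return len(states) - sum(1 for state in states if state)


healthy = link_occupancy_degree(True, True, True)
degraded = link_occupancy_degree(True, False, False)
```

A healthy node scores 0 and a node that is online but neither responding nor accessible scores 2, so the second is harder to reach directly over the HPLC link.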
[0072] By summarizing the degree of node change, degree of abnormal attention, degree of historical lack of data collection, and degree of link occupancy, the data collection value judgment factor corresponding to each node is obtained.
[0073] The degree of node change, degree of abnormal attention, degree of historical acquisition gaps, and degree of link occupancy are processed uniformly so that they can be compared under the same priority rule. For the degree of node change, degree of abnormal attention, and degree of historical acquisition gaps, a result that calls for earlier acquisition is oriented to rank earlier in the comparison direction; for the degree of link occupancy, a result that calls for later acquisition is oriented to rank later. The nodes are then compared factor by factor. The degree of node change is compared first; when it distinguishes the order of two nodes, their ranking is determined directly. When it cannot, the degree of abnormal attention is compared; when that cannot, the degree of historical acquisition gaps is compared; and when that still cannot, the degree of link occupancy is compared. The node order after all comparisons are completed is output as the value ranking result of each node.
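The factor-by-factor comparison described above is a lexicographic sort, which can be sketched with a tuple sort key. The `Factors` record and its field names are hypothetical; the sketch assumes larger values of the first three factors rank earlier and a larger link occupancy ranks later.

```python
from typing import NamedTuple


class Factors(NamedTuple):
    node_id: str
    change: int      # degree of node change
    attention: int   # degree of abnormal attention
    gap: int         # degree of historical acquisition gaps
    occupancy: int   # degree of link occupancy


def rank_nodes(nodes):
    """Sort nodes so that earlier positions mean higher acquisition value."""
    # Negating the first three factors orients all four in one comparison
    # direction; later tuple elements only break ties in earlier ones.
    return sorted(
        nodes,
        key=lambda n: (-n.change, -n.attention, -n.gap, n.occupancy),
    )


nodes = [
    Factors("A", 2, 0, 0, 1),
    Factors("B", 2, 1, 0, 3),
    Factors("C", 3, 0, 0, 3),
    Factors("D", 2, 1, 0, 0),
]
ranking = [n.node_id for n in rank_nodes(nodes)]
```

Node `C` leads on the degree of node change alone, while `D` precedes `B` only because the two tie on the first three factors and `D` has the lower link occupancy. Python's `sorted` is stable, so nodes that tie on all four factors keep their input order.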
[0074] Based on the value ranking results, the access sequence of each node is arranged to determine the priority collection objects and the candidate collection objects, and corresponding access tags are assigned to the priority collection objects and the candidate collection objects respectively.
[0075] It should be noted that the access sequence is formed according to the order of nodes in the value ranking result, and the nodes with higher values in the value ranking result correspond to the earlier access positions in the access sequence, and the nodes with lower values in the value ranking result correspond to the later access positions in the access sequence. Based on the access positions of each node in the access sequence, the objects are divided, and the nodes located at the beginning of the access sequence are determined as priority collection objects, and the nodes located at the end of the access sequence are determined as candidate collection objects, thus completing the object division of each node in the access sequence.
[0076] Priority acquisition objects are assigned access tags corresponding to the earlier access positions, and candidate acquisition objects are assigned access tags corresponding to the later access positions. Each access tag represents both the category to which the node belongs (priority acquisition object or candidate acquisition object) and the node's position in the access sequence. Once the access tags have been determined, the priority acquisition objects, candidate acquisition objects, and corresponding access tags are output.
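The object division and tag assignment can be sketched as follows. The source does not state where the boundary between priority and candidate objects lies, so the `priority_count` parameter is a hypothetical cut-off, and the tag layout (a dictionary with category and position) is likewise an assumption.

```python
def assign_access_tags(ranked_ids, priority_count):
    """Split a ranked node list into priority/candidate objects with tags."""
    tags = []
    for position, node_id in enumerate(ranked_ids, start=1):
        category = "priority" if position <= priority_count else "candidate"
        # Each tag carries both the category and the access position.
        tags.append(
            {"node_id": node_id, "category": category, "position": position}
        )
    return tags


tags = assign_access_tags(["C", "D", "B", "A"], priority_count=2)
priority = [t["node_id"] for t in tags if t["category"] == "priority"]
candidate = [t["node_id"] for t in tags if t["category"] == "candidate"]
```

In practice the cut-off might be derived from the number of access time slots available in the current cycle rather than passed as a fixed number.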
[0077] Figure 5 compares the invalid access ratios under different scheduling strategies. As the total number of nodes increases, the invalid access ratio of every scheme increases, but that of the scheme of this invention remains at a low level. This indicates that by uniformly organizing the prior information of nodes and combining the degree of node change, degree of abnormal attention, degree of historical acquisition gaps, and degree of link occupancy to complete the value ranking and access sequence arrangement, access resources can be allocated more effectively to the nodes best suited to the current acquisition, thereby reducing unresponsive access, inefficient access, and duplicate access. Compared with the fixed polling order, the preset address order, and other static sorting methods, the scheme of this invention is more stable across different node scales, indicating that it can effectively reduce the probability of invalid access and improve overall scheduling rationality in big data acquisition scenarios.
[0078] Figure 6 compares how the completion rate of priority acquisition objects changes with the acquisition cycle under different scheduling strategies. The upper panel shows the overall trend of each strategy over the complete cycle: the curve for the present invention is generally higher, with relatively small fluctuations. This indicates that driving the access sequence arrangement by the value ranking results keeps the completion rate of priority acquisition objects high during continuous acquisition. The lower panel is a magnified view of a local interval, where the difference between the scheme of the present invention and the other strategies is clearer in the stages with more pronounced fluctuations. In particular, where the gap between the curves is large, the scheme of the present invention still maintains a high completion rate, indicating better priority processing capability and scheduling accuracy under locally complex working conditions or increased acquisition pressure.
[0079] Figure 5 and Figure 6 The fixed polling order in this context refers to node access being performed sequentially according to a pre-set fixed polling order, without dynamically adjusting based on the node's historical communication status, historical data collection status, business event status, and current communication status. This serves as a contrast scheme to "accessing in a predetermined order." While this scheme is simple to implement, when the number of nodes increases or the node status varies significantly, it can easily prevent abnormal nodes, changing nodes, and nodes with historical missing data from entering the data collection process first, leading to an increase in invalid accesses and insufficient priority processing capabilities.
[0080] Figure 5 and Figure 6The preset address order in the system refers to the arrangement of node access according to node address, number, or pre-registered order. It does not uniformly organize the prior information of nodes, nor does it sort them according to the collection value judgment factor. This serves as a comparison scheme with "access according to static address order". This scheme can maintain a fixed access organization form, but it is difficult to reflect the current collection value difference of nodes. In big data collection scenarios, it is easy for high-value nodes to be suppressed by low-value nodes.
[0081] Figure 5 and Figure 6 The single-state sorting in this context refers to sorting nodes based solely on a single state condition, such as the current communication state or a certain type of abnormal state, without comprehensively comparing the degree of node change, the degree of abnormal attention, the degree of historical missing data collection, and the degree of link occupancy. This serves as the baseline scheme for "single-factor driven access." This scheme is an improvement over the fixed polling order, but due to the single judgment dimension, it is still difficult to fully reflect the comprehensive collection priority of nodes.
[0082] In Figures 5 and 6, the present invention refers to the value scheduling processing method, which proceeds as follows. First, each node's prior information undergoes node identifier matching, duplicate record removal, missing record marking, time unification, and state unification, yielding the organized node prior information. Next, the node change degree, abnormal attention degree, historical missing data degree, and link occupancy degree are extracted from the organized node prior information, and the collection value judgment factors are uniformly processed and compared to obtain the value ranking result. Finally, the access timing is arranged based on the value ranking result, the priority collection objects, candidate collection objects, and corresponding access tags are determined, and node access and data collection are carried out accordingly. This scheme lets high-value nodes and abnormal nodes enter the collection process earlier, which reduces the proportion of invalid accesses and improves both the completion rate of priority collection objects and the overall scheduling accuracy.
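As a non-limiting illustration, the value ranking and access arrangement steps could be sketched as follows. The description does not state how the four collection value judgment factors are combined, so the rank-sum rule used here is an assumption, as are the field names, the convention that a larger factor value means a higher collection value, and the `priority_count` cut-off between priority and candidate collection objects.

```python
from dataclasses import dataclass

# Hypothetical factor names; each is assumed to be oriented so that a
# larger value means the node is more worth collecting.
FACTORS = ("change", "abnormal_attention", "historical_missing", "link_occupancy")


@dataclass(frozen=True)
class NodeFactors:
    node_id: str
    change: float
    abnormal_attention: float
    historical_missing: float
    link_occupancy: float


def value_ranking(nodes):
    """Return node ids from highest to lowest collection value (rank-sum rule)."""
    score = {n.node_id: 0 for n in nodes}
    for factor in FACTORS:
        # Position 0 is the lowest value for this factor.
        ordered = sorted(nodes, key=lambda n: getattr(n, factor))
        for position, n in enumerate(ordered):
            score[n.node_id] += position
    # Higher summed rank first; node id breaks ties deterministically.
    return sorted(score, key=lambda nid: (-score[nid], nid))


def arrange_access(nodes, priority_count):
    """Assign an access tag (1-based access order) and an object kind per node."""
    return [
        {
            "node_id": nid,
            "access_tag": order,
            "kind": "priority" if order <= priority_count else "candidate",
        }
        for order, nid in enumerate(value_ranking(nodes), start=1)
    ]
```

In this sketch the access tag doubles as the access order, which is one possible reading of the access tag representing the access order; a real terminal may carry more information in the tag.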
[0083] It should also be noted that existing technologies typically arrange node access according to a fixed polling order, a preset address order, or a single piece of state information, which leads to disorganized node information, one-sided sorting, and difficulty in prioritizing key nodes. This solution uniformly organizes the prior information of nodes and combines it with the collection value judgment factors to complete the value ranking and access arrangement, determining the priority collection objects, candidate collection objects, and corresponding access tags. This improves the consistency and comparability of node information, brings the collection order closer to the actual state of the nodes, and lets abnormal nodes, changed nodes, and nodes with historical missing data enter the collection process first, which reduces invalid accesses and ordering conflicts and improves the accuracy, relevance, and consistency of access arrangements.
[0084] The data acquisition and processing module is used to perform HPLC link access and data acquisition on the corresponding nodes based on the priority acquisition objects, candidate acquisition objects and corresponding access tags, and to verify, organize and label the acquired data, and generate the first round of acquisition results and abnormal node identifiers.
[0085] Based on the priority acquisition objects, the candidate acquisition objects, and the corresponding access tags, HPLC link access and data acquisition are performed on the corresponding nodes in sequence to obtain the original collected data for each node.
[0086] It should be noted that the priority acquisition objects and candidate acquisition objects are arranged according to the access order represented by their corresponding access tags, forming the access order of the corresponding nodes. This access order serves as the execution order for the subsequent HPLC link accesses. Each node is selected in turn according to the access order, its access tag is read, and an HPLC link access request is initiated to the node based on the node identifier and access tag. When the link response information returned by the node is received, the node identifier, access tag, and access order are recorded in correspondence. After the HPLC link access is completed, a data acquisition request is sent to the node and the collected data returned by the node is received. The collected data is recorded together with the node identifier, access tag, and data collection time, forming the acquisition record of the node. For a node that returns no link response information, the correspondence between its node identifier, access tag, and access order is retained, forming a no-response acquisition record. All acquisition records are then summarized in access order, so that the node identifier, access tag, and collected data of each node remain in correspondence, yielding the original collected data for each node.
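As a non-limiting illustration, the access loop described above could be sketched as follows. The callables `link_access` and `request_data` are hypothetical stand-ins for the HPLC link layer, and the record layout is an assumption made for the example only.

```python
from datetime import datetime, timezone


def first_round_access(plan, link_access, request_data, now=None):
    """Visit nodes in access order and build one acquisition record per node.

    plan: list of dicts with "node_id" and "access_tag" (assumed shape).
    link_access(node_id) -> bool and request_data(node_id) -> payload are
    hypothetical callables, not a real HPLC API.
    """
    now = now or (lambda: datetime.now(timezone.utc))
    records = []
    ordered = sorted(plan, key=lambda entry: entry["access_tag"])
    for order, entry in enumerate(ordered, start=1):
        record = {
            "node_id": entry["node_id"],
            "access_tag": entry["access_tag"],
            "access_order": order,
            "responded": False,   # stays False for a no-response record
            "data": None,
            "collected_at": None,
        }
        if link_access(entry["node_id"]):
            record["responded"] = True
            record["data"] = request_data(entry["node_id"])
            record["collected_at"] = now()
        records.append(record)
    return records
```

A node that does not answer still yields a record, so its node identifier, access tag, and access order stay available to the later anomaly source determination.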
[0087] The original collected data undergoes node correspondence checking, completeness checking, time checking, and duplicate data processing to obtain the organized collected data.
[0088] It should be noted that the original collected data is first checked for node correspondence against the node identifier, access tag, and access order. Original collected data whose node identifier, access tag, and access order correspond to one another is retained as valid original collected data; original collected data for which they do not correspond is kept at its original access order position and marked as node correspondence abnormal data. The completeness of the collected data content is then checked. Original collected data with complete content and continuous field records is retained as complete collected data; original collected data with missing content, interrupted records, or incomplete fields is kept at its original access order position and marked as incomplete collected data.
[0089] The data collection time of the original collected data is then checked. Original collected data whose collection time is present, whose time format is consistent, and whose collection time is consistent with the access order is retained as time-valid collected data; original collected data with a missing collection time, an abnormal time format, or a collection time inconsistent with the access order is kept at its original access order position and marked as abnormal time data. Finally, original collected data with the same node identifier, access tag, and data content is treated as duplicate data: only one copy is retained, and the remaining copies are marked as duplicate records and excluded from subsequent organization. The deduplicated original collected data is rearranged in access order to obtain the organized collected data.
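As a non-limiting illustration, the four checks could be sketched as follows. The required field names, the reading of "consistent with the access order" as non-decreasing collection times, and the decision to skip content checks for no-response records are all assumptions made for the example.

```python
REQUIRED_FIELDS = ("voltage", "energy")  # hypothetical required payload fields


def check_records(records, plan_tags):
    """Mark abnormal records and drop duplicates, keeping access order.

    plan_tags maps node_id -> expected access tag. Payloads are assumed to be
    flat dicts with hashable values.
    """
    seen, last_time, checked = set(), None, []
    for rec in sorted(records, key=lambda r: r["access_order"]):
        if not rec["responded"]:
            # No-response records carry no payload to verify.
            checked.append({**rec, "marks": []})
            continue
        marks = []
        if plan_tags.get(rec["node_id"]) != rec["access_tag"]:
            marks.append("node_correspondence_abnormal")
        data = rec.get("data") or {}
        if any(data.get(field) is None for field in REQUIRED_FIELDS):
            marks.append("incomplete")
        collected_at = rec.get("collected_at")
        if collected_at is None or (last_time is not None and collected_at < last_time):
            marks.append("abnormal_time")
        else:
            last_time = collected_at
        key = (rec["node_id"], rec["access_tag"], tuple(sorted(data.items())))
        if key in seen:
            marks.append("duplicate")
        seen.add(key)
        checked.append({**rec, "marks": marks})
    # Duplicate records take no part in later steps.
    return [r for r in checked if "duplicate" not in r["marks"]]
```

Abnormal records are marked rather than removed, so they remain at their original access order position, as the description requires.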
[0090] Each node is labeled with its status. The collected data corresponding to the nodes in normal status is used to form the first round of collection results, and abnormal node identifiers are generated for nodes in abnormal status.
[0091] It should be noted that the organized collected data is merged node by node according to the node identifier, and the correspondence of the organized collected data under each node is checked against the access tag; organized collected data belonging to the same node and carrying the same access tag is assigned to that node. For each node, two conditions are checked: whether any record under the node is marked as node correspondence abnormal data, incomplete collected data, or abnormal time data, and whether link response information was received during the node's preceding HPLC link access. When no record under the node carries any of these marks and the link response information was received, the node is labeled a normal-state node. When any record under the node carries one of these marks, or the link response information was not received, the node is labeled an abnormal-state node.
[0092] The organized collected data of the normal-state nodes is summarized in access order, with the correspondence between node identifier, access tag, and collected data maintained, to form the first round of collection results. For each abnormal-state node, its node identifier and access tag are retained, and together these constitute the abnormal node identifiers.
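As a non-limiting illustration, the status labeling could be sketched as follows. The record layout (a `marks` list and a `responded` flag per record) is an assumption made for the example.

```python
# Marks that make a node abnormal; the names are hypothetical.
ABNORMAL_MARKS = {"node_correspondence_abnormal", "incomplete", "abnormal_time"}


def label_nodes(checked):
    """Split checked records into first-round results and abnormal node identifiers.

    An abnormal node identifier is represented as a (node_id, access_tag) pair.
    """
    by_node = {}
    for rec in checked:
        by_node.setdefault((rec["node_id"], rec["access_tag"]), []).append(rec)

    first_round, abnormal_ids = [], []
    for key, recs in by_node.items():
        abnormal = any(
            (not r["responded"]) or bool(ABNORMAL_MARKS & set(r["marks"]))
            for r in recs
        )
        if abnormal:
            abnormal_ids.append(key)
        else:
            first_round.extend(recs)
    first_round.sort(key=lambda r: r["access_order"])
    return first_round, abnormal_ids
```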
[0093] The cause-based re-collection module is used to identify anomaly sources and perform differentiated data re-collection based on the first round of collection results and the abnormal node identifiers, and to supplement the re-collected data into the first round of collection results to generate the enhanced collection results and anomaly cause labels.
[0094] The collected data corresponding to the abnormal state nodes are extracted and analyzed to determine the source of the abnormality for each abnormal state node and obtain the result of the abnormality source discrimination.
[0095] It should be noted that, based on the abnormal node identifiers, the organized collected data of each abnormal-state node is extracted one by one. The organized collected data belonging to the same abnormal-state node is merged according to the node identifier and access tag, forming the record to be analyzed for that node.
[0096] Anomaly analysis is then performed on each record to be analyzed. If the record contains node correspondence abnormal data, the anomaly source of the abnormal-state node is determined to be a node correspondence anomaly. If the record contains incomplete collected data, the anomaly source is determined to be a data integrity anomaly. If the record contains abnormal time data, the anomaly source is determined to be a time status anomaly. If the record contains none of these but the node produced a no-response acquisition record during the preceding HPLC link access, the anomaly source is determined to be a link access anomaly. When several types of abnormal data exist in the same record, the anomaly source is determined in the order of node correspondence anomaly, data integrity anomaly, time status anomaly, and link access anomaly. The node identifier, access tag, and determined anomaly source of each abnormal-state node are associated and saved to form the anomaly source discrimination result.
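As a non-limiting illustration, the anomaly source determination could be sketched as follows. Reading the stated order as "the first matching source wins" is an assumption, and the mark and source names are hypothetical.

```python
# (mark on a record, resulting anomaly source), in the order given in the text.
SOURCE_PRIORITY = [
    ("node_correspondence_abnormal", "node_correspondence_anomaly"),
    ("incomplete", "data_integrity_anomaly"),
    ("abnormal_time", "time_status_anomaly"),
]


def anomaly_source(records):
    """Return the anomaly source for one abnormal-state node, or None."""
    marks = {mark for rec in records for mark in rec["marks"]}
    for mark, source in SOURCE_PRIORITY:
        if mark in marks:
            return source
    # No data-level mark: fall back to the no-response acquisition record.
    if any(not rec["responded"] for rec in records):
        return "link_access_anomaly"
    return None
```

For example, a record marked both as incomplete and as abnormal time data resolves to a data integrity anomaly, because that source comes earlier in the order.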
[0097] For each abnormal state node, perform corresponding differentiated data re-collection, and record and verify the re-collected data accordingly to obtain the re-collected data.
[0098] It should be noted that, based on the anomaly source discrimination result, each abnormal-state node is selected one by one according to its node identifier and access tag, and its anomaly source is read. When the anomaly source is a link access anomaly, an HPLC link access request is re-initiated to the node according to the node identifier and access tag; after the link response information is received, a data acquisition request is sent again to obtain the collected data returned by the node. When no link response information is received again, the node identifier, access tag, and anomaly source of the node are retained, forming a re-collection abnormal record. When the anomaly source is a node correspondence anomaly, the correspondence between the collected data and the node is reconfirmed using the node identifier and access tag, and HPLC link access and data acquisition are then initiated again to obtain correctly corresponding collected data. When the anomaly source is a data integrity anomaly, a new data acquisition request is initiated to the node based on the node identifier and access tag to obtain collected data with complete content. When the anomaly source is a time status anomaly, a new data acquisition request is initiated to the node based on the node identifier and access tag to obtain collected data whose collection time is present, whose time format is consistent, and whose collection time is consistent with the access order.
[0099] The collected data obtained in this way is associated with the node identifier, access tag, and anomaly source of the corresponding abnormal-state node to form a differentiated data re-collection record. The collected data in each differentiated data re-collection record is then verified through a node identifier consistency check, a content completeness check, and a collection time validity check. Records that pass all three checks are retained as valid re-collection records; records that fail verification are kept at their original record positions and marked as re-collection abnormal records. The collected data in the valid re-collection records is summarized by node identifier, access tag, and anomaly source to obtain the re-collected data.
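As a non-limiting illustration, one differentiated re-collection attempt could be sketched as follows. The callables `link_access`, `request_data`, and `is_valid` are hypothetical; `is_valid` stands in for the three verification checks, and the source names are assumptions made for the example.

```python
def recollect(node_id, access_tag, source, link_access, request_data, is_valid):
    """Run one re-collection attempt and return a re-collection record.

    The record status is "valid" or "recollection_abnormal".
    """
    record = {
        "node_id": node_id,
        "access_tag": access_tag,
        "source": source,
        "data": None,
        "status": "recollection_abnormal",
    }
    # Link access and node correspondence anomalies repeat the link access;
    # data integrity and time status anomalies only repeat the data request.
    needs_link = source in ("link_access_anomaly", "node_correspondence_anomaly")
    if needs_link and not link_access(node_id):
        return record  # still no link response
    record["data"] = request_data(node_id)
    if is_valid(node_id, record["data"]):
        record["status"] = "valid"
    return record
```

Dispatching on the anomaly source means a node whose only problem was incomplete content is not made to repeat link access, which is the saving the differentiated approach is aiming for.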
[0100] The re-collected data is added to the first round of collection results to generate enhanced collection results, and anomaly cause labels are generated based on the anomaly sources corresponding to the re-collected data.
[0101] It should be noted that each piece of re-collected data is located by node identifier and access tag: its target record is found in the first round of collection results, and the correspondence between the re-collected data and the target record is maintained. The located re-collected data serves as the basis for supplementation. When the target record contains node correspondence abnormal data, incomplete collected data, or abnormal time data, the abnormal data in the target record is replaced with the corresponding re-collected data. When the first round of collection results contains a target record with the same node identifier and access tag but with missing data, the missing content is filled in with the corresponding re-collected data. When the first round of collection results contains no target record with the same node identifier and access tag, the re-collected data is supplemented into the first round of collection results under the corresponding node according to the node identifier, access tag, and access order. Once supplementation is complete, the enhanced collection results, containing both the first round of collection results and the re-collected data, are formed.
[0102] For the re-collected data supplemented into the enhanced collection results, a correspondence is established among the node identifier, access tag, and anomaly source, so that the anomaly source of each piece of re-collected data forms an anomaly cause label tied to the same node identifier and access tag.
[0103] It should also be noted that existing technologies typically perform a unified re-collection after a first-round collection failure or data anomaly. This approach suffers from unclear differentiation of anomaly sources, a single re-collection method, and difficulty in accurately matching the re-collection results to the first round of collection results. This solution identifies the anomaly source, performs the corresponding differentiated data re-collection, supplements the re-collected data into the first round of collection results, and generates anomaly cause labels. It thereby reduces invalid re-collection, makes anomaly handling more targeted, improves the accuracy with which re-collection results are matched, enhances the completeness and reliability of abnormal data repair, and provides a clear basis for subsequent classification, encapsulation, and upload compilation.
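As a non-limiting illustration, supplementation and label generation could be sketched as follows. For brevity the sketch treats replacement, filling, and insertion alike by overwriting the payload of the target record; the record layout and the use of the access tag as the ordering key are assumptions made for the example.

```python
def merge_recollected(first_round, recollected):
    """Return (enhanced results, anomaly cause labels).

    Records and labels are keyed by (node_id, access_tag).
    """
    enhanced = {(r["node_id"], r["access_tag"]): dict(r) for r in first_round}
    labels = {}
    for rec in recollected:
        if rec["status"] != "valid":
            continue  # re-collection abnormal records are not merged
        key = (rec["node_id"], rec["access_tag"])
        target = enhanced.setdefault(
            key, {"node_id": rec["node_id"], "access_tag": rec["access_tag"]}
        )
        target["data"] = rec["data"]
        target["recollected"] = True
        labels[key] = rec["source"]
    ordered = sorted(enhanced.values(), key=lambda r: r["access_tag"])
    return ordered, labels
```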
[0104] The hierarchical output module is used to classify and encapsulate the data of each node in the enhanced acquisition results, and determine the upload and compilation method of the data of each node based on the anomaly cause label, so as to generate big data acquisition results.
[0105] The data of each node in the enhanced acquisition results are processed by node-by-node correspondence and classification to obtain the classified node data.
[0106] It should be noted that the node data in the enhanced acquisition results is extracted one by one according to the node identifier, and node data with the same node identifier is merged under the same node, completing the node-by-node correspondence. The node data for which correspondence has been completed is then classified. Each piece of node data under a node is checked for two things: whether it was formed by supplementation with re-collected data, and whether it corresponds to an anomaly cause label formed from re-collected data. Node data that was neither formed by supplementation nor corresponds to an anomaly cause label is determined to be normal classification data. Node data that was formed by supplementation, or that corresponds to an anomaly cause label, is determined to be abnormal classification data. For both classes, the correspondence between node identifier, access tag, and data content is retained, and the data is classified and merged by node identifier to obtain the classified node data.
[0107] The classified node data is classified and encapsulated, and the corresponding upload and compilation method is determined according to the anomaly cause label to obtain the compiled node data. The compiled node data is then summarized and output to generate the big data collection results.
[0108] It should be noted that normal classification data is encapsulated by node identifier, access tag, and data content, while abnormal classification data is encapsulated by node identifier, access tag, data content, and anomaly cause label, yielding the encapsulated node data.
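As a non-limiting illustration, the classification rule could be sketched as follows. The `recollected` flag and the label dictionary keyed by node identifier and access tag are assumptions made for the example.

```python
def classify(enhanced, labels):
    """Group node data by node id into normal and abnormal classification data.

    labels maps (node_id, access_tag) -> anomaly source.
    """
    classified = {}
    for rec in enhanced:
        key = (rec["node_id"], rec["access_tag"])
        # Abnormal if formed by supplementation or tied to an anomaly cause label.
        kind = "abnormal" if rec.get("recollected") or key in labels else "normal"
        bucket = classified.setdefault(rec["node_id"], {"normal": [], "abnormal": []})
        bucket[kind].append(rec)
    return classified
```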
[0109] With the encapsulated node data as the compilation object, encapsulated node data corresponding to normal classification data is uploaded and compiled in the normal data manner. For encapsulated node data carrying an anomaly cause label, the upload and compilation method is determined by the anomaly source that the label represents. When the anomaly source is a link access anomaly, the pre-upload compilation method is used, and the corresponding data is output before the normal data. When the anomaly source is a node correspondence anomaly, the independent upload compilation method is used, and the corresponding data is output as an independent data segment. When the anomaly source is a data integrity anomaly, the supplementary upload compilation method is used, and the corresponding data is output after the normal data of the corresponding node. When the anomaly source is a time status anomaly, the time-series upload compilation method is used, and the data is output in order of data collection time.
[0110] The encapsulated node data is arranged and grouped according to the determined upload and compilation method to obtain the compiled node data. The compiled node data is then summarized and output by node identifier and upload and compilation method, so that it retains the correspondence between node identifier, access tag, data content, and anomaly cause label, generating the big data collection results.
[0111] In summary, under conditions of limited HPLC link bandwidth and limited concurrent access capability, this invention uniformly organizes the prior information of nodes and combines it with the collection value judgment factors to complete value ranking and access timing arrangement, achieving refined organization of big data acquisition tasks. Abnormal nodes, changed nodes, and nodes with historical missing data thereby enter the acquisition process first, which reduces unresponsive, inefficient, and duplicate accesses and improves scheduling accuracy and link resource utilization in HPLC big data acquisition scenarios. In addition, by identifying the anomaly source of each abnormal-state node and performing the corresponding differentiated data re-collection for each anomaly source, the invention achieves targeted repair of abnormal first-round results and association of the repaired results with the first round. This reduces the invalid link occupation caused by unified re-collection, improves the completeness of abnormal data repair, the matching accuracy of re-collection results, and the reliability of the big data acquisition output, and provides a clear basis for subsequent classification, encapsulation, and upload compilation.
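As a non-limiting illustration, the mapping from anomaly source to upload and compilation method, and one possible output ordering, could be sketched as follows. The packet layout, the source and method names, and the placement of time-series data after the normal and supplementary data are assumptions; the description fixes only the relative rules for each method.

```python
METHOD_BY_SOURCE = {
    "link_access_anomaly": "pre_upload",
    "node_correspondence_anomaly": "independent",
    "data_integrity_anomaly": "supplementary",
    "time_status_anomaly": "time_series",
}


def compile_output(packets):
    """Order encapsulated node data by upload and compilation method.

    packets: dicts with "node_id", "data", "collected_at" and, for abnormal
    classification data, a "cause" key holding the anomaly source.
    """
    tagged = [
        dict(p, method=METHOD_BY_SOURCE.get(p.get("cause"), "normal"))
        for p in packets
    ]
    pre = [p for p in tagged if p["method"] == "pre_upload"]
    # Supplementary data follows the normal data of the same node.
    main = sorted(
        (p for p in tagged if p["method"] in ("normal", "supplementary")),
        key=lambda p: (p["node_id"], p["method"] == "supplementary"),
    )
    timed = sorted(
        (p for p in tagged if p["method"] == "time_series"),
        key=lambda p: p["collected_at"],
    )
    independent = [p for p in tagged if p["method"] == "independent"]
    return {"main_stream": pre + main + timed, "independent_segment": independent}
```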
[0112] It should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.
Claims
1. An HPLC intelligent acquisition terminal based on big data acquisition, characterized in that it comprises: a value scheduling module, used to rank the value of each node and arrange the access sequence based on the node prior information and the collection value judgment factors, and to generate priority collection objects, candidate collection objects, and corresponding access tags; a data acquisition and processing module, used to perform HPLC link access and data acquisition on the corresponding nodes according to the priority collection objects, candidate collection objects, and corresponding access tags, to verify, organize, and label the acquired data, and to generate the first round of collection results and abnormal node identifiers; a cause-based re-collection module, used to identify anomaly sources and perform differentiated data re-collection based on the first round of collection results and the abnormal node identifiers, to obtain re-collected data, and to supplement the re-collected data into the first round of collection results to generate enhanced collection results and anomaly cause labels; and a hierarchical output module, used to classify and encapsulate the data of each node in the enhanced collection results, and to determine the upload and compilation method of the data of each node based on the anomaly cause labels, so as to generate big data collection results.
2. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 1, characterized in that, The value scheduling module includes a prior information management unit, a value ranking unit, and an access orchestration unit. The value scheduling module is used to perform value scheduling processing. The prior information management unit is used to collect and organize the prior information of nodes; The value ranking unit is used to determine the collection priority of each node based on the collection value judgment factors; The access orchestration unit is used to allocate the access order and corresponding access tags according to the collection priority.
3. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 1, characterized in that, The acquisition and processing module includes a link access unit, a data acquisition unit, and a result generation unit. The acquisition and processing module is used to perform link acquisition and processing. The link access unit is used to establish HPLC link access to the corresponding node based on the priority acquisition object, the candidate acquisition object and the corresponding access mark; The data acquisition unit is used to acquire the data returned by the corresponding node during the HPLC link access process; The result generation unit is used to perform integrity verification, time sorting, and anomaly labeling on the collected data, and to generate the first round of collection results and anomaly node identifiers.
4. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 1, characterized in that, The cause-based re-collection module includes a cause discrimination unit, a re-collection execution unit, and an integration and generation unit. The cause-based re-collection module is used to perform anomaly cause discrimination and re-collection processing. The cause discrimination unit is used to determine the anomaly source of an abnormal node as a node correspondence anomaly, a data integrity anomaly, a time status anomaly, or a link access anomaly; The re-collection execution unit is used to perform the corresponding differentiated re-collection processing on abnormal nodes based on the anomaly source, node identifier, and corresponding access tag, so as to obtain re-collected data; The integration and generation unit is used to verify the re-collected data, supplement the re-collected data into the first round of collection results to form enhanced collection results, and generate anomaly cause labels based on the anomaly source.
5. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 1, characterized in that, The hierarchical output module includes a classification and encapsulation unit and an upload and compilation unit. The hierarchical output module is used to perform hierarchical output processing. The classification and encapsulation unit is used to classify and encapsulate the data of each node in the enhanced acquisition results; The upload and compilation unit is used to determine the upload and compilation method of each node's data based on the anomaly cause label.
6. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 2, characterized in that, The node prior information includes the node's historical communication status, historical data collection status, service event status, and current communication status; The factors for determining the value of data collection include the degree of node change, the degree of abnormal attention, the degree of historical data collection gaps, and the degree of link occupancy.
7. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 2, characterized in that, The specific steps of the value scheduling processing are as follows: The node prior information of each node undergoes node identifier matching, duplicate record removal, missing record marking, time unification, and state unification to obtain the organized node prior information; Based on the organized node prior information, the collection value judgment factors corresponding to each node are extracted, and the collection value judgment factors are uniformly processed and compared to obtain the value ranking result of each node; Based on the value ranking result, the access sequence of each node is arranged to determine the priority collection objects and the candidate collection objects, and corresponding access tags are assigned to the priority collection objects and the candidate collection objects respectively.
8. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 3, characterized in that, The specific steps for the link acquisition and processing are as follows: Based on the priority acquisition target, the candidate acquisition target and the corresponding access mark, HPLC link access and data acquisition are performed on the corresponding nodes in sequence to obtain the raw acquisition data corresponding to each node; The original collected data is checked for node correspondence, integrity, time, and duplicate data to obtain the processed collected data. Each node is labeled with its status. The collected data corresponding to the nodes in normal status is used to form the first round of collection results, and abnormal node identifiers are generated for nodes in abnormal status.
9. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 4, characterized in that, The specific steps of the anomaly cause discrimination and re-collection processing are as follows: The collected data corresponding to the abnormal state nodes is extracted and analyzed to determine the anomaly source of each abnormal state node and obtain the anomaly source discrimination result; For each abnormal state node, the corresponding differentiated data re-collection is performed, and the re-collected data is recorded and verified accordingly to obtain the re-collected data; The re-collected data is supplemented into the first round of collection results to generate enhanced collection results, and anomaly cause labels are generated based on the anomaly sources corresponding to the re-collected data.
10. The HPLC intelligent acquisition terminal based on big data acquisition as described in claim 5, characterized in that, The hierarchical output processing involves the following steps: The data of each node in the enhanced acquisition results are processed by node-by-node correspondence and classification to obtain the classified data of each node. The data of each node after classification is classified and packaged separately, and the corresponding upload and compilation method is determined according to the abnormal reason label to obtain the compiled data of each node. The compiled data from each node is summarized and output to generate big data collection results.