Method and device for environmental monitoring based on joint self-learning of wireless sensor network
By symbolizing time series data and using energy-driven aggregation in wireless sensor networks, the balance between detection accuracy and communication cost is addressed, enabling efficient environmental anomaly detection, extending the lifespan of sensor networks, and improving detection accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA UNIV OF GEOSCIENCES (WUHAN)
- Filing Date
- 2024-05-29
- Publication Date
- 2026-06-19
AI Technical Summary
Existing methods for detecting anomalies in wireless sensor networks struggle to balance detection accuracy and communication costs simultaneously. This is especially true on resource-constrained sensor nodes, where traditional methods suffer from either insufficient detection reliability or excessive computational burden.
By performing time series symbolization, anomaly probability calculation, and energy-driven aggregation on the original time series at edge nodes, a discrete finite symbol sequence is generated. Anomaly probability is calculated using a first-order Markov model, and weighted fusion is performed at the convergence node to generate a weighted threshold, thereby achieving local anomaly detection.
While reducing storage and computing costs, it improves detection accuracy, balances the energy consumption of edge nodes, extends the lifespan of sensor networks, and reduces the risk of missed and false detections.
Figure CN118484764B_ABST
Description
Technical Field
[0001] This application relates to the field of Internet of Things (IoT) monitoring, and in particular to an environmental monitoring method and apparatus based on joint self-learning of wireless sensor networks.
Background Technology
[0002] Wireless sensor networks (WSNs) are widely used in real-time environmental monitoring, aiming to accurately and quickly detect changing trends and anomalies in environmental parameters to provide timely warnings. Traditional methods often rely on empirical thresholds for anomaly identification. While these methods offer good real-time performance, their reliability is somewhat lacking, primarily due to the potential for inaccurate or erroneous data caused by circuit faults and environmental noise.
[0003] In addition, machine learning and deep learning have also been applied to the detection of environmental parameter anomalies based on time series data. These methods are highly accurate and reliable, but most of them have a large computational burden and are centralized processing solutions, which are not suitable for real-time applications of wireless sensor networks.
[0004] Another distributed anomaly detection method transforms observation data into symbol sequences and determines the presence of anomalies by evaluating the anomaly probability of these sequences. This method has low computational complexity and is feasible for front-end deployment, but it faces the challenge of establishing an effective multi-node fusion mechanism to simultaneously balance detection accuracy and communication costs.
Summary of the Invention
[0005] The purpose of this invention is to address the problem that existing methods for detecting environmental anomalies in wireless sensor networks cannot simultaneously balance detection accuracy and communication costs, and to provide an environmental monitoring method and apparatus based on joint self-learning of wireless sensor networks.
[0006] The above-mentioned objective of this application is achieved through the following technical solution:
[0007] S1: Obtain the original time series and determine the total number of iterations for each edge node based on the original time series;
[0008] S2: Time series symbolization part: At each edge node, the original time series is normalized, reduced by piecewise aggregate approximation (PAA), and symbolized according to the total number of iterations to generate a discrete finite symbol sequence;
[0009] S3: Edge learning part: Calculate the anomaly probability of the discrete finite symbol sequences to generate a single local anomaly detection threshold;
[0010] S4: Energy-driven aggregation part: At the aggregation node, the single local anomaly detection thresholds of the edge nodes are weighted and fused to obtain a weighted threshold;
[0011] S5: Update the single local anomaly detection threshold with the weighted threshold, repeat step S3 until the iterations end, output the local anomaly detection threshold, and complete the anomaly detection of the environment.
[0012] Optionally, step S1 includes:
[0013] S11: Obtain the original time series of each edge node;
[0014] S12: Determine the step size and sliding window size based on the length of the original time series;
[0015] S13: Based on the step size and the size of the sliding window, calculate the number of iterations to generate the total number of iterations for each edge node. The calculation formula is as follows:
[0016]
[0017] Where α represents the total number of iterations of each edge node, L represents the length of the original time series of each edge node, l represents the size of the sliding window, and Δl represents the step size.
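The sliding-window traversal in S12 and S13 can be sketched as follows. This is only an illustration: the function name `sliding_windows` is hypothetical, and the sketch simply yields every full window of size l when advancing by Δl. Whether the α of the method equals this window count is not confirmed by the text, since the formula itself is not reproduced here.

```python
from typing import Iterator, List, Sequence


def sliding_windows(series: Sequence[float], window: int, step: int) -> Iterator[List[float]]:
    """Yield successive windows of `window` samples, advancing by `step` samples."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Only full windows are produced; a trailing partial window is dropped (assumption).
    for start in range(0, len(series) - window + 1, step):
        yield list(series[start:start + window])


# Toy series of 10 samples, window size 4, step size 2.
windows = list(sliding_windows(list(range(10)), window=4, step=2))
```

Each yielded window would then be passed to the normalization and symbolization steps of S2.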
[0018] Optionally, step S2 includes:
[0019] S21: The original time series of each edge node is fed into the sliding window step by step according to the step size and the total number of iterations; the original time series is normalized to generate, for each edge node, a discrete sequence with a mean of 0 and a standard deviation of 1.
[0020] S22: Perform piecewise aggregate approximation (PAA) on the discrete sequence of each edge node to generate an approximate sequence for each edge node. The PAA formula is as follows:
[0021] $\bar{c}_j = \frac{m}{n} \sum_{i=\frac{n}{m}(j-1)+1}^{\frac{n}{m}j} c_i$
[0022] Where $\bar{c}_j$ represents the j-th element of the approximate sequence, $n$ represents the length of the discrete sequence, $m$ represents the number of elements in the approximate sequence, and $c_i$ represents the i-th element of the discrete sequence.
[0023] S23: Symbolize the approximate sequence of each edge node to generate a discrete finite symbol sequence of each edge node.
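A minimal sketch of the normalization in S21 and the segment averaging in S22, assuming PAA in its commonly used form (the mean of each of m equal-length segments). The function names `z_normalize` and `paa` are hypothetical, and the handling of a constant window is an assumption not taken from the text.

```python
import math
from typing import List, Sequence


def z_normalize(window: Sequence[float]) -> List[float]:
    """Rescale a window to mean 0 and standard deviation 1 (population std)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        # ASSUMPTION: a constant window has no spread, so map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in window]


def paa(values: Sequence[float], m: int) -> List[float]:
    """Average `values` over m equal-length segments (m must divide len(values))."""
    n = len(values)
    if m <= 0 or n % m != 0:
        raise ValueError("m must be positive and divide the sequence length")
    seg = n // m
    return [sum(values[j * seg:(j + 1) * seg]) / seg for j in range(m)]


normalized = z_normalize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
approx = paa(normalized, m=4)
```

Requiring m to divide the window length keeps the sketch short; fractional segment boundaries would need weighted averaging.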
[0024] Optionally, step S23 includes:
[0025] S23a: Because the normalized time series closely follows a Gaussian distribution, determine the breakpoints so that they divide the distribution into equal-probability intervals, that is, discrete intervals of equal area under the Gaussian curve.
[0026] S23b: Based on the variation range of the approximate sequence of each edge node, determine the number of breakpoints and the breakpoint lookup table for each edge node;
[0027] S23c: Determine the number of letters in a string based on the number of breakpoints;
[0028] S23e: Assign a different letter of the string to each interval delimited by the breakpoints; based on the element values in the approximate sequence of each edge node and the lookup table, replace each element with the letter of the interval it falls in, generating a discrete finite symbol sequence for each edge node.
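The breakpoint and letter mapping of S23a to S23e can be sketched as below. It assumes the breakpoints are the quantiles of the standard normal distribution that split it into k equal-probability intervals, computed here at run time instead of read from a stored lookup table. The names `gaussian_breakpoints` and `symbolize` are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist
from string import ascii_lowercase
from typing import List, Sequence


def gaussian_breakpoints(k: int) -> List[float]:
    """Return the k-1 cut points that split N(0, 1) into k equal-probability intervals."""
    if not 2 <= k <= len(ascii_lowercase):
        raise ValueError("k must be between 2 and 26")
    std_normal = NormalDist()
    return [std_normal.inv_cdf(i / k) for i in range(1, k)]


def symbolize(approx: Sequence[float], k: int) -> str:
    """Replace each value with the letter of the interval it falls in."""
    cuts = gaussian_breakpoints(k)
    return "".join(ascii_lowercase[bisect_right(cuts, v)] for v in approx)


# Four values, one in each of the four equal-probability intervals.
word = symbolize([-1.5, -0.3, 0.2, 1.1], k=4)
```

On a constrained node the breakpoints for a fixed k would more likely be stored as constants, which matches the lookup table described in S23b.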
[0029] Optionally, step S3 includes:
[0030] S31: Based on a first-order Markov model, anomaly probabilities are calculated for strings and discrete finite symbol sequences to generate anomalous transition probabilities between symbols. The calculation formula is as follows:
[0031]
[0032] Where $\alpha_i, \alpha_j$ represent elements in the string; $s_t, s_{t+1}$ represent adjacent elements in a discrete finite symbol sequence; $p(s_{t+1}=\alpha_j \mid s_t=\alpha_i)$ represents the transition probability between two adjacent symbols; $\pi_A(\alpha_i, \alpha_j)$ represents the probability of anomalous transitions between symbols in the string;
[0033] The anomalous transition probabilities of discrete finite symbol sequences are calculated using the anomalous transition probabilities between symbols. The calculation formula is as follows:
[0034] $p_A(s) = p(s_1)\,\pi_A(s_1, s_2)\,\pi_A(s_2, s_3) \cdots \pi_A(s_{k-1}, s_k)$
[0035] Where $p(s_1)$ represents the prior probability of the first symbol in the discrete finite symbol sequence, and $p_A(s)$ represents the anomalous transition probability of the discrete finite symbol sequence;
[0036] S32: Based on the total number of iterations, determine the single iteration number, that is, the number of iterations each edge node performs locally in a single round;
[0037] S33: After each edge node has iterated for the single iteration number, a single local anomaly probability matrix is generated for each edge node; the single local anomaly probability matrix is composed of the anomalous transition probabilities of the discrete finite symbol sequences;
[0038] S34: Take the maximum value in the single local anomaly probability matrix of each edge node as the single local anomaly detection threshold of each edge node.
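The edge learning steps S31 to S34 can be sketched as follows, under several stated assumptions. Transition probabilities and symbol priors are estimated by simple counting. Because the expression for the anomalous transition probability is not reproduced in the text, the mapping from a transition probability to its anomalous counterpart is passed in as a function; the `complement` mapping used in the example is purely an assumption. All function names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Sequence, Tuple

Pair = Tuple[str, str]


def transition_probs(words: Sequence[str]) -> Dict[Pair, float]:
    """Estimate p(s_{t+1} = b | s_t = a) by counting adjacent symbol pairs."""
    pair_counts: Counter = Counter()
    from_counts: Counter = Counter()
    for word in words:
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += 1
            from_counts[a] += 1
    return {(a, b): c / from_counts[a] for (a, b), c in pair_counts.items()}


def symbol_priors(words: Sequence[str]) -> Dict[str, float]:
    """ASSUMPTION: the prior p(s) is the relative frequency of each symbol."""
    counts = Counter("".join(words))
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def anomaly_probability(word: str, priors: Dict[str, float], trans: Dict[Pair, float],
                        pi_a: Callable[[float], float]) -> float:
    """Chain product p_A(s) = p(s_1) * pi_A(s_1, s_2) * ... * pi_A(s_{k-1}, s_k)."""
    prob = priors.get(word[0], 0.0)
    for a, b in zip(word, word[1:]):
        prob *= pi_a(trans.get((a, b), 0.0))
    return prob


def complement(p: float) -> float:
    # ASSUMPTION: rare transitions count as anomalous, so pi_A = 1 - p.
    return 1.0 - p


train = ["aab", "abb", "aba"]  # toy symbol sequences from three windows
trans = transition_probs(train)
priors = symbol_priors(train)
scores = [anomaly_probability(w, priors, trans, complement) for w in train]
threshold = max(scores)  # S34: largest value seen during training
```

Taking the maximum as the threshold means a sequence is flagged only when it scores higher than every sequence observed during training.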
[0039] Optionally, step S4 includes:
[0040] S41: Obtain the remaining energy and initial energy of each edge node, calculate the energy factor using the energy factor formula, normalize the initial energy and remaining energy, and generate the energy factor for each edge node. The energy factor formula is as follows:
[0041]
[0042] Where $e_i$ represents the energy factor; $E_{th}$ represents the remaining energy threshold for each edge node, which prevents the rapid depletion of the remaining energy of each edge node; $E_{i,0}$ represents the initial energy of each edge node; $E_i$ represents the remaining energy of each edge node;
[0043] S42: Based on the spatial pattern of each edge node, determine the center of the spatial pattern; based on the distance from each edge node to the center and the remaining energy of each edge node, calculate the fusion weight of each edge node using the fusion factor formula, as follows:
[0044]
[0045] Where $e_i$ represents the energy factor; $w_i$ represents the fusion weight of each edge node; $d_i$ represents the distance from each edge node to the center; $\delta(e_i)$ indicates whether each edge node participates in the fusion. That is, given a random number between 0 and 1, if the random number is less than the energy factor of the edge node, then $\delta(e_i)$ is 1; otherwise, it is 0.
[0046] S43: At the convergence node, based on the fusion weights of each edge node, the single local anomaly detection threshold is weighted and fused to generate a weighted threshold.
[0047] The formula for weighted fusion is as follows:
[0048]
[0049] Where $TH$ represents the weighted threshold, and $TH_i$ represents the single local anomaly detection threshold of each edge node.
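The participation rule and the fusion step can be sketched as below. The fusion weight formula and the weighted fusion formula are not reproduced in the text, so the weights are taken as given inputs, and dividing by the sum of the participating weights is an assumption made for illustration. The names `participates` and `fuse_thresholds` are hypothetical.

```python
import random
from typing import Optional, Sequence


def participates(energy_factor: float, rng: random.Random) -> int:
    """delta(e_i): 1 if a uniform draw in [0, 1) is below the energy factor, else 0."""
    return 1 if rng.random() < energy_factor else 0


def fuse_thresholds(thresholds: Sequence[float], weights: Sequence[float],
                    deltas: Sequence[int]) -> Optional[float]:
    """Weighted average over participating nodes; None if no node participates.

    ASSUMPTION: weights are normalized by the sum of participating weights.
    """
    total = sum(w * d for w, d in zip(weights, deltas))
    if total == 0:
        return None
    return sum(w * d * th for w, d, th in zip(weights, deltas, thresholds)) / total


rng = random.Random(0)  # seeded only to make the sketch repeatable
deltas = [participates(e, rng) for e in (1.0, 0.0)]
fused = fuse_thresholds([0.2, 0.4], [1.0, 3.0], [1, 1])
```

A node with a low energy factor is drawn into the fusion less often, which is how the random draw spreads communication cost away from nearly depleted nodes.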
[0050] Optionally, the step of obtaining the remaining energy and initial energy of each edge node includes:
[0051] Initialize the initial energy of a preset number of edge nodes, the energy consumption of processing data in a single iteration, and the energy consumption of transmitting data in a single transmission;
[0052] Based on the initial energy of the edge nodes, determine the remaining energy threshold $E_{th}$ of the edge nodes;
[0053] Based on the energy consumption of processing data in a single iteration and the energy consumption of transmitting data in a single transmission, the remaining energy of each edge node is calculated from its initial energy. The calculation formula is as follows:
[0054] $E_i = E_{i,0} - [\alpha E_d + 2\beta_i E_t]$
[0055] Where $E_{i,0}$ represents the initial energy of each edge node; $E_i$ represents the remaining energy of each edge node; $E_d$ represents the energy consumption for processing data in a single iteration; $E_t$ represents the energy consumption for a single data transmission; $\alpha$ represents the total number of iterations for each edge node; and $\beta_i$ represents the number of communications between each edge node and the sink node, calculated from the total number of iterations.
[0056] Optionally, step S5 includes:
[0057] The weighted threshold is sent back to each edge node and compared with the single local anomaly detection threshold of each edge node. If the weighted threshold is greater than the single local anomaly detection threshold, the single local anomaly detection threshold is replaced by the weighted threshold. If the weighted threshold is less than or equal to the single local anomaly detection threshold, the single local anomaly detection threshold of the edge node is retained.
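The update rule of step S5 amounts to keeping the larger of the two thresholds. A minimal sketch, with the hypothetical function name `update_local_threshold`:

```python
def update_local_threshold(local_threshold: float, weighted_threshold: float) -> float:
    """Replace the local threshold only when the weighted threshold is strictly larger."""
    if weighted_threshold > local_threshold:
        return weighted_threshold
    # Less than or equal: the edge node keeps its own threshold.
    return local_threshold


raised = update_local_threshold(0.20, 0.21)
kept = update_local_threshold(0.22, 0.21)
```

Because the threshold can only rise or stay the same under this rule, each node's threshold is non-decreasing across communications.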
[0058] An environmental monitoring device based on joint self-learning of wireless sensor networks, the device includes: a sensor module, a main control module, and a communication module;
[0059] The sensor module is connected to the communication module; the main control module is connected to the communication module.
[0060] The sensor module is used to acquire raw time series data.
[0061] The communication module is used to input the acquired raw time series data to the main control module;
[0062] The main control module is used to normalize, perform piecewise aggregation approximation calculations, and symbolize the input original time series to generate a discrete finite symbol sequence.
[0063] The main control module is also used to calculate the anomaly probability of discrete finite symbol sequences and generate a single local anomaly detection threshold.
[0064] The main control module is also used to perform weighted fusion of the single local anomaly detection threshold at the aggregation node to obtain a weighted threshold;
[0065] The main control module is also used to update the single local anomaly detection threshold through weighted threshold, perform edge iterative learning, and output the detection threshold.
[0066] A computer-readable storage medium storing instructions that, when executed, perform an environmental monitoring method based on joint self-learning of a wireless sensor network.
[0067] The beneficial effects of the technical solution provided in this application are:
[0068] 1. It can operate effectively on resource-constrained sensor nodes; it effectively reduces storage and computing costs through time-series symbolization; it achieves inter-node communication through energy-driven aggregation, and uses information from other nodes for iterative training, thereby improving detection accuracy while balancing the energy consumption of edge nodes.
[0069] 2. The energy-driven aggregation component weights and fuses local thresholds into a global threshold, enabling information fusion between edge nodes and improving detection accuracy. During fusion, an energy factor is introduced to balance the energy consumption between edge nodes and extend the lifespan of the sensor network.
[0070] 3. The anomaly probability calculation converts the monitoring of environmental anomalies into the transition probability of each character in the observed discrete finite symbol sequence, thus better distinguishing normal data from data recorded when anomalies occur; a first-order Markov model is used, avoiding the computational burden of higher-order Markov models.
Attached Figure Description
[0071] The present application will be further described below with reference to the accompanying drawings and embodiments. In the accompanying drawings:
[0072] Figure 1 is a block diagram of the environmental monitoring method based on joint self-learning of wireless sensor networks in the embodiments of this application;
[0073] Figure 2 is a flowchart of the steps of the environmental monitoring method in the embodiments of this application;
[0074] Figure 3 is a diagram of the transformation of the original time series into a discrete symbol sequence in the environmental monitoring method in the embodiments of this application;
[0075] Figure 4 is a schematic diagram of the environmental monitoring method in this application, which uses a miniature wireless sensor network to monitor indoor light intensity;
[0076] Figure 5 is a plot of the ambient light intensity information collected by the first edge node in the environmental monitoring method in the embodiments of this application;
[0077] Figure 6 is a plot of the ambient light intensity information collected by the second edge node through its sensor in the environmental monitoring method in the embodiments of this application;
[0078] Figure 7 is a plot of the ambient light intensity information collected by the third edge node in the environmental monitoring method in the embodiments of this application;
[0079] Figure 8 is a diagram of the change in the anomaly detection threshold during the iterative training of the first edge node in the environmental monitoring method in the embodiments of this application;
[0080] Figure 9 is a diagram of the change in the anomaly detection threshold during the iterative training of the second edge node in the environmental monitoring method in the embodiments of this application;
[0081] Figure 10 is a diagram of the change in the anomaly detection threshold during the iterative training of the third edge node in the environmental monitoring method in the embodiments of this application;
[0082] Figure 11 is the final anomaly detection result of the first edge node in the environmental monitoring method in the embodiments of this application;
[0083] Figure 12 is the final anomaly detection result of the second edge node in the environmental monitoring method in the embodiments of this application;
[0084] Figure 13 is the final anomaly detection result of the third edge node in the environmental monitoring method in the embodiments of this application;
[0085] Figure 14 is a plot of the remaining energy of each edge node after iteration in the environmental monitoring method in the embodiments of this application;
[0086] Figure 15 is a graph of the change in the BF value of each edge node of the sensor network during the iteration process of the environmental monitoring method in the embodiments of this application.
Detailed Implementation
[0087] To provide a clearer understanding of the technical features, objectives, and effects of this application, the specific embodiments of this application will now be described in detail with reference to the accompanying drawings.
[0088] Embodiments of this application provide an environmental monitoring method based on joint self-learning of wireless sensor networks.
[0089] Figure 1 is a block diagram of an environmental monitoring method based on joint self-learning of wireless sensor networks in an embodiment of this application.
[0090] Please refer to Figure 2, which is a flowchart illustrating the steps of an environmental monitoring method based on joint self-learning of wireless sensor networks in an embodiment of this application, including:
[0091] S1: Obtain the original time series and determine the total number of iterations for each edge node based on the original time series;
[0092] Step S1 includes:
[0093] S11: Obtain the original time series of each edge node;
[0094] S12: Determine the step size and sliding window size based on the length of the original time series;
[0095] S13: Based on the step size and the size of the sliding window, calculate the number of iterations to generate the total number of iterations for each edge node. The calculation formula is as follows:
[0096]
[0097] Where α represents the total number of iterations of each edge node, L represents the length of the original time series of each edge node, l represents the size of the sliding window, and Δl represents the step size.
[0098] S2: Time series symbolization part: At each edge node, the original time series is normalized, reduced by piecewise aggregate approximation (PAA), and symbolized according to the total number of iterations to generate a discrete finite symbol sequence;
[0099] Step S2 includes:
[0100] S21: The original time series of each edge node is fed into the sliding window step by step according to the step size and the total number of iterations; the original time series is normalized to generate, for each edge node, a discrete sequence with a mean of 0 and a standard deviation of 1.
[0101] S22: Perform piecewise aggregate approximation (PAA) on the discrete sequence of each edge node to generate an approximate sequence for each edge node. The PAA formula is as follows:
[0102] $\bar{c}_j = \frac{m}{n} \sum_{i=\frac{n}{m}(j-1)+1}^{\frac{n}{m}j} c_i$
[0103] Where $\bar{c}_j$ represents the j-th element of the approximate sequence, $n$ represents the length of the discrete sequence, $m$ represents the number of elements in the approximate sequence, and $c_i$ represents the i-th element of the discrete sequence.
[0104] S23: Symbolize the approximate sequence of each edge node to generate a discrete finite symbol sequence of each edge node.
[0105] Step S23 includes:
[0106] S23a: Because the normalized time series closely follows a Gaussian distribution, determine the breakpoints so that they divide the distribution into equal-probability intervals, that is, discrete intervals of equal area under the Gaussian curve.
[0107] S23b: Based on the variation range of the approximate sequence of each edge node, determine the number of breakpoints and the breakpoint lookup table for each edge node;
[0108] S23c: Determine the number of letters in a string based on the number of breakpoints;
[0109] S23e: Assign a different letter of the string to each interval delimited by the breakpoints; based on the element values in the approximate sequence of each edge node and the lookup table, replace each element with the letter of the interval it falls in, generating a discrete finite symbol sequence for each edge node.
[0110] S3: Edge learning part: Calculate the anomaly probability of the discrete finite symbol sequences to generate a single local anomaly detection threshold;
[0111] Step S3 includes:
[0112] S31: Based on a first-order Markov model, anomaly probabilities are calculated for strings and discrete finite symbol sequences to generate anomalous transition probabilities between symbols. The calculation formula is as follows:
[0113]
[0114] Where $\alpha_i, \alpha_j$ represent elements in the string; $s_t, s_{t+1}$ represent adjacent elements in a discrete finite symbol sequence; $p(s_{t+1}=\alpha_j \mid s_t=\alpha_i)$ represents the transition probability between two adjacent symbols; $\pi_A(\alpha_i, \alpha_j)$ represents the probability of anomalous transitions between symbols in the string;
[0115] The anomalous transition probabilities of discrete finite symbol sequences are calculated using the anomalous transition probabilities between symbols. The calculation formula is as follows:
[0116] $p_A(s) = p(s_1)\,\pi_A(s_1, s_2)\,\pi_A(s_2, s_3) \cdots \pi_A(s_{k-1}, s_k)$
[0117] Where $p(s_1)$ represents the prior probability of the first symbol in the discrete finite symbol sequence, and $p_A(s)$ represents the anomalous transition probability of the discrete finite symbol sequence;
[0118] S32: Based on the total number of iterations, determine the single iteration number, that is, the number of iterations each edge node performs locally in a single round;
[0119] S33: After each edge node has iterated for the single iteration number, a single local anomaly probability matrix is generated for each edge node; the single local anomaly probability matrix is composed of the anomalous transition probabilities of the discrete finite symbol sequences;
[0120] S34: Take the maximum value in the single local anomaly probability matrix of each edge node as the single local anomaly detection threshold of each edge node.
[0121] Specifically, because the edge learning part iterates locally at each edge node for the single iteration number, it effectively extracts the environmental information collected by each edge node itself while reducing the number of communications and the energy consumed by communication in the wireless sensor network.
[0122] Specifically, based on the first-order Markov model, the anomalous transition probabilities between the symbols of the string are calculated, and from them the anomalous transition probability of each discrete finite symbol sequence. The maximum anomalous transition probability at each edge node is taken as the anomaly detection threshold of that edge node.
[0123] S4: Energy-driven aggregation part: At the aggregation node, the single local anomaly detection thresholds of the edge nodes are weighted and fused to obtain a weighted threshold;
[0124] Step S4 includes:
[0125] S41: Obtain the remaining energy and initial energy of each edge node, calculate the energy factor using the energy factor formula, normalize the initial energy and remaining energy, and generate the energy factor for each edge node. The energy factor formula is as follows:
[0126]
[0127] Where $e_i$ represents the energy factor; $E_{th}$ represents the remaining energy threshold for each edge node, which prevents the rapid depletion of the remaining energy of each edge node; $E_{i,0}$ represents the initial energy of each edge node; $E_i$ represents the remaining energy of each edge node;
[0128] S42: Based on the spatial pattern of each edge node, determine the center of the spatial pattern; based on the distance from each edge node to the center and the remaining energy of each edge node, calculate the fusion weight of each edge node using the fusion factor formula, as follows:
[0129]
[0130] Where $e_i$ represents the energy factor; $w_i$ represents the fusion weight of each edge node; $d_i$ represents the distance from each edge node to the center; $\delta(e_i)$ indicates whether each edge node participates in the fusion. That is, given a random number between 0 and 1, if the random number is less than the energy factor of the edge node, then $\delta(e_i)$ is 1; otherwise, it is 0.
[0131] S43: At the convergence node, based on the fusion weights of each edge node, the single local anomaly detection threshold is weighted and fused to generate a weighted threshold.
[0132] The formula for weighted fusion is as follows:
[0133]
[0134] Where $TH$ represents the weighted threshold, and $TH_i$ represents the single local anomaly detection threshold of each edge node.
[0135] The steps for obtaining the remaining energy and initial energy of each edge node include:
[0136] Initialize the initial energy of a preset number of edge nodes, the energy consumption of processing data in a single iteration, and the energy consumption of transmitting data in a single transmission;
[0137] Based on the initial energy of the edge nodes, determine the remaining energy threshold $E_{th}$ of the edge nodes;
[0138] Based on the energy consumption of processing data in a single iteration and the energy consumption of transmitting data in a single transmission, the remaining energy of each edge node is calculated from its initial energy. The calculation formula is as follows:
[0139] $E_i = E_{i,0} - [\alpha E_d + 2\beta_i E_t]$
[0140] Where $E_{i,0}$ represents the initial energy of each edge node; $E_i$ represents the remaining energy of each edge node; $E_d$ represents the energy consumption for processing data in a single iteration; $E_t$ represents the energy consumption for a single data transmission; $\alpha$ represents the total number of iterations for each edge node; and $\beta_i$ represents the number of communications between each edge node and the sink node, calculated from the total number of iterations.
[0141] S5: Update the single local anomaly detection threshold with the weighted threshold, repeat step S3 until the iterations end, output the local anomaly detection threshold, and complete the anomaly detection of the environment.
[0142] Step S5 includes:
[0143] The weighted threshold is sent back to each edge node and compared with the single local anomaly detection threshold of each edge node. If the weighted threshold is greater than the single local anomaly detection threshold, the single local anomaly detection threshold is replaced by the weighted threshold. If the weighted threshold is less than or equal to the single local anomaly detection threshold, the single local anomaly detection threshold of the edge node is retained.
[0144] An environmental monitoring device based on joint self-learning of wireless sensor networks, the device includes: a sensor module, a main control module, and a communication module;
[0145] The sensor module is connected to the communication module; the main control module is connected to the communication module.
[0146] The sensor module is used to acquire raw time series data.
[0147] The communication module is used to input the acquired raw time series data to the main control module;
[0148] The main control module is used to normalize, perform piecewise aggregation approximation calculations, and symbolize the input original time series to generate a discrete finite symbol sequence.
[0149] The main control module is also used to calculate the anomaly probability of discrete finite symbol sequences and generate a single local anomaly detection threshold.
[0150] The main control module is also used to perform weighted fusion of the single local anomaly detection threshold at the aggregation node to obtain a weighted threshold;
[0151] The main control module is also used to update the single local anomaly detection threshold through weighted threshold, perform edge iterative learning, and output the detection threshold.
[0152] In this application, the aggregation node can be any type of terminal or a data acquisition node including sensors; the aggregation node can store and process the data acquired by the sensors, and the edge nodes and the aggregation node can communicate with each other.
[0153] For example, Figure 3 shows the transformation of the original time series into a discrete symbol sequence. The length of the original time series is n=128, the size of the symbol sequence is m=8, and the number of characters is k=6; the data size is reduced from 128 to 16, which greatly reduces the storage and computational requirements.
[0154] Please refer to Figure 4. This method is applied to indoor light intensity detection using a miniature wireless sensor network. The sensor network consists of one aggregation node and three edge nodes, connected via Wi-Fi and using the MQTT protocol for data transmission. An ESP32 acts as the aggregation node and is responsible for data communication and aggregation. Each edge node, composed of an ESP32 and a light sensor module, is responsible for data acquisition, time-series symbolization, edge training, and communication. The output voltage range of the light sensor module is 0–3.3 V, corresponding to an analog-to-digital conversion range of 0–4095. The darker the ambient light, the higher the output voltage.
[0155] Figure 5 , Figure 6 , Figure 7 The data presented here are ambient light intensity information collected by sensors from three edge nodes. Each edge node collects 1600 sets of light intensity data at fixed time intervals for this example, with the first 1000 sets used as the training set and the last 600 sets as the test set. During data collection, an anomaly occurred when the three edge nodes collected the 1530th set of data. After the anomaly, the sensors continued to collect 30 more sets of data before restoring the light source. The light intensity curves collected by nodes 1 and 2 were affected by human and environmental factors, resulting in varying degrees of sharp spikes, especially in node 1.
[0156] For example, to evaluate the performance of this method, it is compared with two other methods that are identical to it in time series symbolization and edge learning and differ only in the aggregation step: 1) Method A uses average aggregation and does not consider the residual energy; 2) Method B uses majority voting fusion.
[0157] For example, to better illustrate the impact of the energy system on nodes with different initial energies, in a specific instance the nodes are randomly initialized with different initial energies: extremely low, relatively low, and relatively high. The initial energy of edge node 1 is $E_{1,0} = 7.2$, the initial energy of edge node 2 is $E_{2,0} = 5.1$, and the initial energy of edge node 3 is $E_{3,0} = 3.2$. The remaining energy threshold of each edge node is $E_{th} = 1$, the energy consumption for data processing at an edge node is $E_d = 0.004$, and the energy consumption of a single data transmission at an edge node is $E_t = 0.01$.
[0158] For example, in a specific instance, the window size l = 40, the symbol sequence size m = 8, the number of characters k = 6, the step size Δl = 4, the number of iterations per edge node q = 4, and the number of communications between each edge node and the aggregation node β = 60.
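These parameters can be cross-checked with a short calculation. The sketch below assumes that the total number of iterations is the training length minus the window size, divided by the step size, and that q counts the iterations performed between two consecutive communications. Both are assumptions, chosen because they are consistent with the values listed here (a 1000-sample training set, l = 40, Δl = 4, q = 4, β = 60). The function name is hypothetical.

```python
def total_iterations(series_length, window, step):
    """Assumed form: number of times the window can advance by one step."""
    return (series_length - window) // step


L_TRAIN = 1000   # training samples per edge node (from the example)
WINDOW = 40      # window size l
STEP = 4         # step size Δl
Q = 4            # iterations per communication round (assumed reading of q)
BETA = 60        # communications with the aggregation node

total = total_iterations(L_TRAIN, WINDOW, STEP)
print(total, Q * BETA)
```

Under these assumptions both expressions give 240, so each edge node trains for 240 window positions and reports to the aggregation node after every fourth one.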
[0159] Figures 8, 9, and 10 show the changes in the anomaly detection threshold during iterative training of the three edge nodes. For ease of comparison, Method A and Method B also use the number of communications as the observation interval. The solid line represents the change in the detection threshold when using this method, the dash-dot line represents the change when using Method A, and the dashed line represents the change when using Method B. The horizontal axis represents the number of communications, and the vertical axis represents the anomaly detection threshold.
[0160] For example, in a specific instance, for edge node 1, the detection threshold reaches 0.2182 in the 14th communication and remains unchanged thereafter under this method, Method A, and Method B alike. For edge node 2, the detection threshold using this method reaches 0.2181 in the 42nd communication and remains unchanged thereafter; using Method A it reaches 0.2182 in the 31st communication and remains unchanged thereafter; using Method B it reaches 0.2078 in the 2nd communication and remains unchanged thereafter. For edge node 3, the detection threshold using this method reaches 0.2149 in the 26th communication and remains unchanged thereafter; using Method A it reaches 0.2182 in the 31st communication and remains unchanged thereafter; using Method B it reaches 0.1670 in the 29th communication and remains unchanged thereafter. Because this method uses energy-driven aggregation, the anomaly detection threshold of each edge node evolves differently from that of Methods A and B.
[0161] Figures 11, 12, and 13 show the final anomaly detection results for the three edge nodes. The left side is the original illumination intensity map, where the horizontal axis represents the number of samples and the vertical axis represents the illumination intensity. The right side is the detection map obtained with the time series symbolization method, where the horizontal axis represents the number of samples and the vertical axis represents the anomaly detection threshold. As can be seen in the left image, due to short-term human interference, the illumination intensity maps of edge node 1 and edge node 2 show sharp spikes of different degrees, with edge node 1 showing the more pronounced spikes. In the right image, the solid line represents the anomaly detection threshold when using this method, the dash-dot line represents the threshold when using Method A, and the dashed line represents the threshold when using Method B. The portion exceeding the horizontal line indicates detected anomalies in the ambient illumination.
[0162] For example, in a specific instance, when the three edge nodes complete edge training independently, the final detection thresholds they output are different. Because the illumination intensity data collected by edge node 1 contains significant sharp noise, its output detection threshold is much higher than that of the other two nodes. When the three edge nodes are aggregated using the energy-driven aggregation method, their detection thresholds tend to become similar as they are continuously fused. For edge node 1, since the detection threshold from independent edge training is greater than the detection threshold returned after fusion, the detection threshold obtained using this method is basically the same as that obtained using Methods A and B (0.2182), so the three detection lines in Figure 11 overlap. For edge node 2, since the detection threshold from independent edge training is lower than the detection threshold returned after fusion, the detection thresholds obtained using this method (0.2181) and Method A (0.2182) are higher than that obtained using Method B (0.2078), so in Figure 12 the detection lines of this method and Method A lie above that of Method B. For edge node 3, since the detection threshold from independent edge training is likewise lower than the detection threshold returned after fusion, the detection thresholds obtained using this method (0.2140) and Method A (0.2182) are higher than that obtained using Method B (0.1670), so in Figure 13 the detection lines of this method and Method A lie above that of Method B.
Because this method uses energy-driven aggregation rather than the plain averaging of Method A, the detection thresholds of the three nodes remain different, whereas under Method A they converge to the same value during the training iterations.
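The behaviour described for the individual nodes follows from a simple retain-or-replace rule: a node adopts the fused threshold only when it exceeds the node's own threshold, and otherwise keeps its own. A minimal sketch of that rule is given below; the function name and the numeric inputs are hypothetical and serve only to show the two cases.

```python
def update_threshold(local_threshold, fused_threshold):
    """Adopt the fused threshold only if it is greater than the local one."""
    if fused_threshold > local_threshold:
        return fused_threshold
    return local_threshold


# Hypothetical values: a noisy node whose own threshold is already high
# keeps it, while a quieter node is lifted to the fused value.
noisy_node = update_threshold(local_threshold=0.22, fused_threshold=0.20)
quiet_node = update_threshold(local_threshold=0.15, fused_threshold=0.20)
print(noisy_node, quiet_node)
```

Since the rule never lowers a threshold, each node's threshold is non-decreasing over the communications, which matches the plateau behaviour reported for all three nodes.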
[0163] For example, in specific instances, time-series symbolization and edge learning can, to some extent, reduce the impact of random noise on detection performance (as in Figure 12). However, when there is sharp noise interference with large fluctuations, missed detections may occur (as in Figure 11), and when the fluctuations are relatively gentle, false positives may occur (as in Figure 13). Energy-driven aggregation allows each node to use information from the other nodes to obtain the overall anomaly threshold of the environment. At the same time, energy-driven aggregation connects all edge nodes into a whole: when one edge node detects an anomaly, the entire environment is considered abnormal. This greatly reduces the risk of missed detections and false detections and improves the accuracy of the detection system.
[0164] For example, in a specific instance, to better demonstrate the efficiency of the energy system, a balance factor (BF) is introduced to calculate the overall balance of energy consumption in the system. A higher BF value results in a longer sensor network lifetime. The formula for calculating the BF value is as follows:
[0165]
[0166] Here, N represents the number of nodes, and the other term in the formula represents the remaining energy at each node.
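The BF expression itself does not appear in the text here. One candidate, assumed in the sketch below, is a Jain-style fairness index: the square of the summed remaining energies divided by N times the sum of their squares. This choice is an assumption, not a statement of the formula used in this application. It is adopted here because it depends only on N and the per-node remaining energy, equals 1 when all nodes hold the same energy, and yields values in line with the BF figures reported for this example when applied to the final remaining energies. The function name is hypothetical.

```python
def balance_factor(remaining):
    """Jain-style index (assumed form): (sum E_i)^2 / (N * sum E_i^2)."""
    n = len(remaining)
    total = sum(remaining)
    squares = sum(e * e for e in remaining)
    return total * total / (n * squares)


# Final remaining energies reported for the three edge nodes in this example.
this_method = [5.70, 3.76, 2.04]
method_a = [5.04, 2.94, 1.04]

bf_this = balance_factor(this_method)   # about 0.87
bf_a = balance_factor(method_a)         # about 0.77
print(round(bf_this, 3), round(bf_a, 3))
```

A perfectly balanced system would score 1.0, so the gap between the two results reflects how unevenly Method A drains the node with the lowest initial energy.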
[0167] Figure 14 shows the remaining energy of each edge node after iteration, where the horizontal axis represents the edge node name and the vertical axis represents the remaining energy value. The remaining energy of each node after training is higher when using this method than when using Method A. After training with this method, the remaining energy of edge node 1 is 5.70, that of edge node 2 is 3.76, and that of edge node 3 is 2.04. With Method A, the remaining energy of edge node 1 is 5.04, that of edge node 2 is 2.94, and that of edge node 3 is 1.04.
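As a rough consistency check on these figures, the sketch below backs out how many transmissions each node would have made. It assumes a simple linear energy model: remaining energy equals initial energy, minus the processing cost for every iteration, minus the transmission cost for every message sent. It also assumes 240 iterations per node (q × β = 4 × 60). The model, the iteration count, and the function name are assumptions made for illustration.

```python
E_D = 0.004          # energy per data-processing iteration
E_T = 0.01           # energy per single data transmission
ITERATIONS = 4 * 60  # assumed total iterations per node (q * beta)

INITIAL = {"node 1": 7.2, "node 2": 5.1, "node 3": 3.2}
REMAINING_THIS = {"node 1": 5.70, "node 2": 3.76, "node 3": 2.04}
REMAINING_A = {"node 1": 5.04, "node 2": 2.94, "node 3": 1.04}


def implied_transmissions(initial, remaining):
    """Transmissions implied by the assumed linear energy model."""
    comm_energy = initial - remaining - ITERATIONS * E_D
    return round(comm_energy / E_T)


this_method = {
    name: implied_transmissions(INITIAL[name], REMAINING_THIS[name])
    for name in INITIAL
}
method_a = {
    name: implied_transmissions(INITIAL[name], REMAINING_A[name])
    for name in INITIAL
}
print(this_method)
print(method_a)
```

Under this model every node spends the energy of 120 transmissions with Method A, while with this method the implied counts are 54, 38, and 20 for nodes 1, 2, and 3. That pattern is what one would expect if nodes with less energy take part in fewer fusion rounds, although the exact accounting is not stated here.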
[0168] Figure 15 shows the change in the system's BF value during the iteration process. To relate this to the communication process, the horizontal axis represents the number of communications, and the vertical axis represents the BF value. The solid line represents the change in the system's BF value when using this method, and the dash-dot line represents the change when using Method A. The BF value under this method is consistently higher than under Method A. During the training iterations, the overall BF value of the system remained above 0.85 with this method, whereas it decreased to 0.77 with Method A. This indicates that this method significantly extends the lifetime of the sensor network.
[0169] The above are merely exemplary embodiments of this disclosure and should not be construed as limiting the scope of this disclosure. Any equivalent changes and modifications made in accordance with the teachings of this disclosure shall still fall within the scope of this disclosure. Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure.
[0170] This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not described in this disclosure. The specification and embodiments are to be considered exemplary only, and the scope and spirit of this disclosure are defined by the claims.
Claims
1. An environmental monitoring method based on joint self-learning of wireless sensor networks, characterized in that the method includes the following steps:
S1: Obtain the original time series and determine the total number of iterations for each edge node based on the original time series;
S2: Time series symbolization: at each edge node, according to the total number of iterations, normalize the original time series, perform piecewise aggregation approximation, and symbolize the result to generate a discrete finite symbol sequence;
S3: Edge learning: calculate the anomaly probability of the discrete finite symbol sequence to generate a single local anomaly detection threshold;
S4: Energy-driven aggregation: at the aggregation node, perform weighted fusion of the single local anomaly detection thresholds to obtain a weighted threshold;
Step S4 includes:
S41: Obtain the remaining energy and initial energy of each edge node, and calculate the energy factor of each edge node using the energy factor formula, which normalizes the initial energy and the remaining energy. The terms of the energy factor formula denote the energy factor, the remaining energy threshold of each edge node (set to prevent rapid depletion of the remaining energy of each edge node), the initial energy of each edge node, and the remaining energy of each edge node;
S42: Based on the spatial pattern of the edge nodes, determine the center of the spatial pattern; based on the distance from each edge node to the center and the remaining energy of each edge node, calculate the fusion weight of each edge node using the fusion factor formula. The terms of the fusion factor formula denote the energy factor, the fusion weight of each edge node, the distance from each edge node to the center, and an indicator of whether each edge node participates in the fusion process. Specifically, given a random number between 0 and 1, the indicator is 1 if the random number is less than the energy factor of the edge node, and 0 otherwise;
S43: At the aggregation node, based on the fusion weights of the edge nodes, perform weighted fusion of the single local anomaly detection thresholds to generate the weighted threshold. The terms of the weighted fusion formula denote the weighted threshold and the single local anomaly detection threshold;
S5: Update the single local anomaly detection threshold using the weighted threshold and repeat step S3 until the iterations end; then output the local anomaly detection threshold and complete the anomaly detection of the environment.
2. The environmental monitoring method based on joint self-learning of wireless sensor networks as described in claim 1, characterized in that step S1 includes:
S11: Obtain the original time series of each edge node;
S12: Determine the step size and the sliding window size based on the length of the original time series;
S13: Based on the step size and the sliding window size, calculate the total number of iterations for each edge node. The terms of the calculation formula denote the total number of iterations for each edge node, the original time series length L of each edge node, the sliding window size, and the step size.
3. The method of claim 2, wherein step S2 includes:
S21: Move the original time series of each edge node into the sliding window in stages according to the step size and the total number of iterations, and normalize the original time series to generate, for each edge node, a discrete sequence with a mean of 0 and a standard deviation of 1;
S22: Perform piecewise aggregation approximation on the discrete sequence of each edge node to generate an approximate sequence for each edge node. The terms of the piecewise aggregation approximation formula denote the approximate sequence, the discrete sequence, the number of elements in the discrete sequence, the size of the approximate sequence, and the indexed element of the discrete sequence;
S23: Symbolize the approximate sequence of each edge node to generate a discrete finite symbol sequence for each edge node.
4. The environmental monitoring method based on joint self-learning of wireless sensor networks according to claim 3, characterized in that step S23 includes:
S23a: Based on the property that a normalized time series closely follows a Gaussian distribution, determine the breakpoints so that they divide the distribution into equiprobable discrete intervals, that is, intervals of equal area;
S23b: Based on the variation range of the approximate sequence of each edge node, determine the number of breakpoints and the breakpoint lookup table for each edge node;
S23c: Determine the number of letters in the string based on the number of breakpoints;
S23e: Map the different letters of the string to the different breakpoints; based on the element values in the approximate sequence of each edge node and the lookup table, map the letters at the breakpoints onto the approximate sequence to generate a discrete finite symbol sequence for each edge node.
5. The method of claim 3, wherein step S3 includes:
S31: Based on a first-order Markov model, perform anomaly probability calculation on the string and the discrete finite symbol sequence to generate the anomalous transition probabilities between symbols. The terms of the calculation formula denote the elements in the string, adjacent elements in the discrete finite symbol sequence, the transition probability between two adjacent symbols, and the anomalous transition probability between symbols in the string. The anomalous transition probability of a discrete finite symbol sequence is then calculated from the anomalous transition probabilities between symbols; the terms of that formula denote the prior probability of the first symbol in the discrete finite symbol sequence and the anomalous transition probability of the discrete finite symbol sequence;
S32: Based on the total number of iterations, determine the number of single iterations for each edge node;
S33: After each edge node iterates for the number of single iterations, generate a single local anomaly probability matrix for each edge node; the single local anomaly probability matrix is composed of the anomalous transition probabilities of discrete finite symbol sequences;
S34: Take the maximum value in the single local anomaly probability matrix of each edge node as the single local anomaly detection threshold of that edge node.
6. The environmental monitoring method based on joint self-learning of wireless sensor networks as described in claim 1, characterized in that the steps for obtaining the remaining energy and initial energy of each edge node include:
Initialize, for a preset number of edge nodes, the initial energy, the energy consumption of processing data in a single iteration, and the energy consumption of a single data transmission;
Based on the initial energy of the edge nodes, determine the remaining energy threshold of the edge nodes;
Based on the energy consumption of processing data in a single iteration and the energy consumption of a single data transmission, calculate the remaining energy of each edge node from its initial energy. The terms of the calculation formula denote the initial energy of each edge node, the remaining energy of each edge node, the energy consumption for processing data in a single iteration, the energy consumption of a single data transmission, the total number of iterations for each edge node, and the number of communications between each edge node and the aggregation node, which is calculated from the total number of iterations.
7. The method of claim 1, wherein step S5 includes: The weighted threshold is sent back to each edge node and compared with the single local anomaly detection threshold of that edge node. If the weighted threshold is greater than the single local anomaly detection threshold, the single local anomaly detection threshold is replaced by the weighted threshold. If the weighted threshold is less than or equal to the single local anomaly detection threshold, the single local anomaly detection threshold of the edge node is retained.
8. An environmental monitoring device based on joint self-learning of wireless sensor networks, characterized in that the device includes a sensor module, a main control module, and a communication module;
The sensor module is connected to the communication module, and the main control module is connected to the communication module;
The sensor module is used to acquire raw time series data;
The communication module is used to input the acquired raw time series data to the main control module;
The main control module is used to normalize the input original time series, perform piecewise aggregation approximation, and symbolize the result to generate a discrete finite symbol sequence;
The main control module is also used to calculate the anomaly probability of the discrete finite symbol sequence and generate a single local anomaly detection threshold;
The main control module is also used to perform weighted fusion of the single local anomaly detection thresholds at the aggregation node to obtain a weighted threshold, with the following steps:
S41: Obtain the remaining energy and initial energy of each edge node, and calculate the energy factor of each edge node using the energy factor formula, which normalizes the initial energy and the remaining energy. The terms of the energy factor formula denote the energy factor, the remaining energy threshold of each edge node (set to prevent rapid depletion of the remaining energy of each edge node), the initial energy of each edge node, and the remaining energy of each edge node;
S42: Based on the spatial pattern of the edge nodes, determine the center of the spatial pattern; based on the distance from each edge node to the center and the remaining energy of each edge node, calculate the fusion weight of each edge node using the fusion factor formula. The terms of the fusion factor formula denote the energy factor, the fusion weight of each edge node, the distance from each edge node to the center, and an indicator of whether each edge node participates in the fusion process. Specifically, given a random number between 0 and 1, the indicator is 1 if the random number is less than the energy factor of the edge node, and 0 otherwise;
S43: At the aggregation node, based on the fusion weights of the edge nodes, perform weighted fusion of the single local anomaly detection thresholds to generate the weighted threshold. The terms of the weighted fusion formula denote the weighted threshold and the single local anomaly detection threshold;
The main control module is also used to update the single local anomaly detection threshold using the weighted threshold, perform edge iterative learning, and output the detection threshold.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed by a computer, perform the steps of the method as described in any one of claims 1-7.