A method for dynamically adjusting thresholds of marine observation data based on dynamic confidence intervals

By constructing a marine dynamic constraint confidence boundary model and multi-dimensional anomaly detection, and dynamically adjusting the threshold of marine observation data, the problem of misjudgment caused by fixed thresholds is solved, and efficient and accurate identification of marine observation data is achieved.

CN122196641A · Pending · Publication Date: 2026-06-12 · BEIHAI FORECASTING CENT OF STATE OCEANIC ADMINISTRATION ((QINGDAO MARINE FORECASTING STATION OF STATE OCEANIC ADMINISTRATION) (QINGDAO MARINE ENVIRONMENT MONITORING CENT OF STATE OCEANIC ADMINISTRATION))

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIHAI FORECASTING CENT OF STATE OCEANIC ADMINISTRATION ((QINGDAO MARINE FORECASTING STATION OF STATE OCEANIC ADMINISTRATION) (QINGDAO MARINE ENVIRONMENT MONITORING CENT OF STATE OCEANIC ADMINISTRATION))
Filing Date
2026-05-14
Publication Date
2026-06-12

Abstract

The application discloses a method for adaptively adjusting marine observation data thresholds based on a dynamic confidence interval, and belongs to the technical field of marine observation. Historical data are collected to construct an initial confidence interval, and the seasonal type is labeled. The first-order difference rate of change in the time dimension and the deviation rate in the spatial dimension are calculated for the real-time measurement values. The data are input into a marine dynamic constraint confidence boundary model, which outputs the upper and lower limits of the dynamic confidence interval. When the data exceed the limits in both dimensions, a Rossby number geostrophic balance check is carried out. After dimensionality reduction by principal component analysis, the squared Mahalanobis distance is calculated to confirm multi-parameter coupled abnormal data. The source of the abnormal parameter is located by back-inference through the loading matrix, which starts the sensor compensation and mutual verification programs. The Isolation Forest algorithm is used to calculate the anomaly score, and the confidence interval width is dynamically adjusted according to the parameter error rate to implement closed-loop feedback calibration. The method thereby addresses the technical problem that fixed thresholds for marine observation data cause normal dynamic processes to be misjudged as abnormal, while real equipment failures and sudden environmental changes are difficult to identify in time.
Description

Technical Field

[0001] This invention belongs to the field of marine observation technology, and specifically relates to a method for adaptive adjustment of marine observation data thresholds based on dynamic confidence intervals.

Background Technology

[0002] Quality control of marine observation data is a crucial aspect of marine monitoring systems. Traditional methods employ fixed thresholds based on historical statistics for anomaly detection. Confidence intervals are constructed by calculating the mean and standard deviation of historical data, and measurements exceeding these intervals are marked as anomalous. However, the marine environment exhibits significant seasonal variations and multi-temporal-scale dynamic processes, making fixed thresholds inadequate for handling the drastic parameter fluctuations caused by normal marine phenomena such as mesoscale eddies and frontal processes. In current marine observation station operation and maintenance practices, the lack of consideration for ocean dynamic constraints often leads to rapid changes in temperature and salinity during typhoon passage being misjudged as equipment malfunctions, resulting in extensive manual verification. Conversely, genuine faults such as sensor baseline drift fail to trigger warning thresholds due to slow parameter changes. In other words, existing technologies suffer from the technical problem of fixed thresholds for marine observation data leading to the misjudgment of normal dynamic processes as anomalies, while genuine equipment malfunctions and sudden environmental changes remain difficult to identify in a timely manner.

Summary of the Invention

[0003] In view of this, the present invention provides an adaptive adjustment method for ocean observation data thresholds based on dynamic confidence intervals, which can solve the technical problem in the prior art where the fixed setting of ocean observation data thresholds leads to normal dynamic processes being misjudged as abnormal, while real equipment failures and sudden environmental changes are difficult to identify in a timely manner.

[0004] This invention is implemented as follows: It provides a method for adaptively adjusting the threshold of marine observation data based on dynamic confidence intervals. This method collects historical data from target observation stations, calculates the historical mean and standard deviation to construct an initial confidence interval, and labels the seasonal type. It acquires real-time measurements and collects concurrent data from surrounding observation stations, calculating the first-order difference rate of change in the time dimension and the spatial deviation rate. The real-time measurements and historical data are input into a marine dynamic constraint confidence boundary model, which outputs the upper and lower limits of the dynamic confidence interval and multi-dimensional anomaly metrics. It determines whether the first-order difference rate of change in the time dimension and the spatial deviation rate both exceed their limits at the same time, and if so marks the data as suspected anomalies. It performs marine dynamic constraint verification, calculating the Rossby number to identify normal marine dynamic processes, and performs principal component analysis to reduce the dimensionality of suspected anomalies. It calculates the squared Mahalanobis distance in the principal component space to confirm multi-parameter coupled anomalies. It uses the loading matrix to infer the source of the anomaly parameters, initiates a sensor dynamic response compensation program and a dual-path redundant sensor mutual verification program, and uses the Isolation Forest algorithm to detect outliers in the observation samples, calculating anomaly scores. It then dynamically expands the range of the upper and lower limits of the dynamic confidence interval based on the statistical parameter error rate to achieve closed-loop feedback calibration.

[0005] Specifically, the construction of the initial confidence interval involves extracting the historical mean and historical standard deviation, calculating the historical mean minus 3 times the historical standard deviation to obtain the lower limit of the initial confidence interval, and calculating the historical mean plus 3 times the historical standard deviation to obtain the upper limit of the initial confidence interval.

[0006] Specifically, the calculation of the first-order difference rate of change in the time dimension involves extracting the parameter measurement values at two adjacent moments, calculating the ratio of the difference between the two values to the time interval, and obtaining the parameter change rate per unit time.

[0007] Specifically, the calculation of the spatial dimension deviation rate involves extracting the measured values from the contemporaneous data of surrounding observation stations, calculating the mean of the surrounding observation stations, and then dividing the difference between the real-time measured value and the mean of the surrounding observation stations by the mean of the surrounding observation stations to obtain the spatial dimension deviation rate in percentage form.

[0008] The time and space thresholds are set according to different parameter types: the time threshold for wind speed is 15 m/s/h and the space threshold is 20%; the time threshold for temperature is 3°C/h and the space threshold is 15%; and the time threshold for salinity is 0.5 psu/h and the space threshold is 10%.

[0009] Among them, the seasonal types are divided into four categories: spring, summer, autumn, and winter. For summer, the upper limit of the dynamic confidence interval for temperatures from June to August is increased by 2 to 3°C based on the upper limit of the initial confidence interval. For winter, the lower limit of the dynamic confidence interval for temperatures from December to February of the following year is decreased by 2 to 3°C based on the lower limit of the initial confidence interval.

[0010] The calculation of the Rossby number involves obtaining the characteristic flow velocity, characteristic length scale, and Coriolis parameter of the observation area, and then dividing the characteristic flow velocity by the product of the characteristic length scale and the Coriolis parameter to obtain a dimensionless number representing the ratio of inertial force to Coriolis force.

[0011] Principal component analysis (PCA) dimensionality reduction involves standardizing the high-dimensional data matrix composed of temperature, salinity, dissolved oxygen, turbidity, and chlorophyll concentration, calculating the covariance matrix, solving for eigenvalues and eigenvectors, sorting the eigenvalues by size, selecting the top 3 to 5 principal components, and projecting the original data onto the principal component space to obtain the dimensionality-reduced data.

[0012] Specifically, the calculation of the squared Mahalanobis distance involves calculating the difference vector between the sample point and the mean of all samples in the principal component space, multiplying the transposed difference vector by the inverse of the covariance matrix, and then multiplying the result by the difference vector itself to obtain a distance metric that takes into account the correlation between variables.

[0013] The chi-square distribution critical value is determined based on the number of principal components, with a significance level set to 0.001. When the number of principal components is 3, the chi-square distribution critical value is 16.266, and when the number of principal components is 5, the chi-square distribution critical value is 20.515.

[0014] Specifically, the loading matrix back-inference involves extracting the loading matrix obtained during the dimensionality reduction process of principal component analysis. Each column of the loading matrix represents the weight coefficient of a principal component on each original parameter. By analyzing the weight coefficients with larger absolute values in the columns corresponding to the abnormal principal components, the original parameters that contribute the most to the multi-parameter coupled abnormal data are identified.

[0015] The sensor dynamic response compensation program specifically calls the sensor transfer function parameters obtained from laboratory calibration under different flow rate conditions, performs deconvolution processing on the current sensor output signal to eliminate the hysteresis effect caused by thermal inertia or mechanical inertia, and shortens the response time from 10 to 30 seconds to 4 to 12 seconds.

[0016] The dual-redundant sensor mutual verification program involves the main sensor continuously working and outputting its data in real time, while the backup sensor starts up once a week for short-term measurement and outputs its data. The output difference between the main sensor data and the backup sensor data is compared under the same environment. When the output difference exceeds 5%, it is determined that the main sensor has baseline drift. A mechanical brushing device is then activated to clean the probe surface, and the drift component is separated using a Kalman filter algorithm.

[0017] The isolation tree construction of the Isolation Forest algorithm involves randomly selecting a subset of samples from the high-dimensional sample set, randomly selecting a feature from the feature dimension of the subset, randomly generating a split value between the maximum and minimum values of the feature, dividing the subset into two subsets according to the split value, and recursively repeating this process until each sample is isolated or the maximum tree depth is reached.

[0018] Specifically, the calculation of the normalized average path length involves counting the path length traversed by each sample from the root node to the isolated node in 200 isolation trees, calculating the arithmetic mean of the 200 path lengths, and normalizing the arithmetic mean by dividing it by the expected average path length of the theoretical normal samples to obtain an anomaly score between 0 and 1.

[0019] Specifically, the closed-loop feedback calibration involves collecting all samples marked as multi-parameter coupling anomalies for the current month, calculating the deviation of the actual measured value of each parameter from the upper and lower limits of the dynamic confidence interval, and calculating the parameter error rate as the ratio of the number of deviation samples to the total number of samples. When the parameter error rate exceeds 5%, the upper and lower limits of the dynamic confidence interval of the parameter are automatically multiplied by 1.1 to expand the range.

[0020] This invention achieves adaptive adjustment of the upper and lower limits of the dynamic confidence interval by constructing a marine dynamic constraint confidence boundary model. It also combines this with verification of the geostrophic balance relationship calculated using the Rossby number to distinguish between normal marine dynamic processes and real anomalous events, thus solving the misjudgment problem caused by fixed thresholds. The marine dynamic constraint confidence boundary model employs a network depth dynamic adjustment algorithm based on random depth. During the training phase, the second hidden layer is randomly skipped to construct network structures of different depths, enabling the model to learn multi-scale feature representations. During the inference phase, the complete three-layer structure is maintained, and the difference in expected output is compensated by a scaling factor, enhancing its adaptability to multi-temporal-scale coupled processes such as eddies, internal waves, and tides. The squared Mahalanobis distance value after dimensionality reduction by principal component analysis, combined with the loading matrix back-inference mechanism, accurately locates the source of anomalous parameters. The 200 isolation trees constructed by the Isolation Forest algorithm quantify the outlier degree through the normalized value of the average path length. Closed-loop feedback calibration dynamically expands the confidence interval width based on the parameter error rate. In summary, this invention solves the technical problem mentioned in the background art, where a fixed threshold setting for marine observation data leads to the misjudgment of normal dynamic processes as anomalous events, while real equipment failures and environmental mutations are difficult to identify in a timely manner.

Attached Figure Description

[0021] Figure 1 is a flowchart of the method of the present invention.

[0022] Figure 2 is a comparison chart of the multi-parameter dynamic confidence intervals output by the ocean dynamic constraint confidence boundary model.

[0023] Figure 3 shows the anomaly score distribution of observed samples calculated using the Isolation Forest algorithm.

[0024] Figure 4 shows the convergence curve of the loss function during the training process of the ocean dynamic constraint confidence boundary model.

Detailed Implementation

[0025] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below.

[0026] Figure 1 is a flowchart of the method for adaptively adjusting the threshold of ocean observation data based on dynamic confidence intervals provided by this invention. As shown in Figure 1, the method includes the following steps:
S01. Collect historical data of the target observation station for the same period in the past 3 years, calculate the historical mean and historical standard deviation of temperature, salinity, dissolved oxygen, turbidity, chlorophyll concentration and wind speed, construct the initial confidence interval based on the 3σ principle, and label the seasonal type.
S02. Obtain the real-time measurement values of each parameter of the observation station at the current moment, simultaneously collect the synchronous data of 3 to 5 surrounding observation stations, and calculate the first-order difference rate of change in the time dimension and the deviation rate in the spatial dimension.
S03. Input the real-time measurement values and the historical data into the ocean dynamic constraint confidence boundary model, and output the upper and lower limits of the dynamic confidence interval and the multidimensional anomaly metric. The ocean dynamic constraint confidence boundary model automatically adjusts the upper and lower limits of the dynamic confidence interval according to the seasonal type.
S04. Determine whether the first-order difference rate of change in the time dimension exceeds the time threshold and whether the deviation rate in the spatial dimension exceeds the spatial threshold. If both dimensions exceed their limits at the same time, mark the data as suspected abnormal data and proceed to the verification process.
S05. Perform ocean dynamics constraint verification on the suspected abnormal data, calculate the Rossby number corresponding to the geostrophic balance relationship of the temperature, salinity and density fields, and clear the anomaly flag when the Rossby number is less than 0.1.
S06. Perform principal component analysis to reduce the dimensionality of the suspected abnormal data that was not cleared by the ocean dynamics constraint verification. Calculate the squared Mahalanobis distance in the principal component space. When the squared Mahalanobis distance exceeds the chi-square distribution critical value for the corresponding degrees of freedom, the data are confirmed as multi-parameter coupled abnormal data.
S07. Back-infer the multi-parameter coupled abnormal data to the original parameter space through the loading matrix to locate the source of the abnormal parameters, and start the sensor dynamic response compensation program and the dual-channel redundant sensor mutual verification program.
S08. Use the Isolation Forest algorithm to detect outliers in all observed samples of the current batch, construct a forest of 200 isolation trees, and calculate the normalized average path length of each sample as the anomaly score.
S09. Calculate the parameter error rate of all abnormal data in the current month. When the parameter error rate exceeds 5%, expand the upper and lower limits of the dynamic confidence interval of the corresponding parameter by 10% and update the weight parameters of the ocean dynamic constraint confidence boundary model to achieve closed-loop feedback calibration.
S10. When the first-order difference rate of change of the wind speed in the time dimension is in the range of 12 to 18 m/s/h, optimize the system response speed by adjusting the execution frequency of the ocean dynamics constraint verification to 150% of the original frequency. At the same time, monitor the Rossby number calculation accuracy as a performance index to ensure stable operation and avoid system oscillation caused by frequency changes.

[0027] The steps for constructing the initial confidence interval are as follows: extract the historical mean and the historical standard deviation, calculate the historical mean minus 3 times the historical standard deviation to obtain the lower limit of the initial confidence interval, and calculate the historical mean plus 3 times the historical standard deviation to obtain the upper limit of the initial confidence interval.
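The 3σ construction above can be illustrated with a minimal sketch. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say whether the population or sample estimator is intended.

```python
import statistics


def initial_confidence_interval(history):
    """Return (lower, upper) as historical mean -/+ 3 historical standard deviations."""
    mean = statistics.fmean(history)
    # Assumption: population standard deviation; the text does not specify the estimator.
    std = statistics.pstdev(history)
    return mean - 3.0 * std, mean + 3.0 * std


# Illustrative values only, not data from the source.
lower, upper = initial_confidence_interval([10.0, 12.0, 14.0])
```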

[0028] The calculation steps for the first-order difference rate of change in the time dimension are as follows: extract the parameter measurement values at two adjacent moments, calculate the ratio of the difference between the two to the time interval, and obtain the parameter change rate per unit time.

[0029] The calculation steps for the spatial dimension deviation rate are as follows: extract the measured values from the synchronous data of 3 to 5 surrounding observation stations, calculate the mean of the measured values from the synchronous data of 3 to 5 surrounding observation stations as the mean of surrounding observation stations, calculate the difference between the real-time measured value and the mean of surrounding observation stations, and divide the difference by the mean of surrounding observation stations to obtain the spatial dimension deviation rate in percentage form.

[0030] The time threshold and the spatial threshold are set according to different parameter types. The time threshold for wind speed is 15 m/s/h, the spatial threshold for wind speed is 20%, the time threshold for temperature is 3°C/h, the spatial threshold for temperature is 15%, the time threshold for salinity is 0.5 psu/h, and the spatial threshold for salinity is 10%.
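The two rates and the two-dimensional limit check can be sketched as follows. The threshold values come from the text; the function names are hypothetical, and comparing absolute values against the thresholds is an assumption, because the text does not state how the sign of a change is treated. The sketch also assumes the neighbouring-station mean is nonzero.

```python
# Thresholds from the text: (temporal limit per hour, spatial limit as a fraction).
THRESHOLDS = {
    "wind_speed": (15.0, 0.20),   # m/s per hour, 20 %
    "temperature": (3.0, 0.15),   # degrees C per hour, 15 %
    "salinity": (0.5, 0.10),      # psu per hour, 10 %
}


def temporal_rate(prev_value, curr_value, dt_hours):
    """First-order difference divided by the time interval."""
    return (curr_value - prev_value) / dt_hours


def spatial_deviation(value, neighbour_values):
    """Relative deviation from the mean of the surrounding stations."""
    neighbour_mean = sum(neighbour_values) / len(neighbour_values)
    return (value - neighbour_mean) / neighbour_mean


def is_suspected_anomaly(param, prev_value, curr_value, dt_hours, neighbour_values):
    """Flag only when BOTH the temporal and the spatial limit are exceeded."""
    time_limit, space_limit = THRESHOLDS[param]
    rate = temporal_rate(prev_value, curr_value, dt_hours)
    deviation = spatial_deviation(curr_value, neighbour_values)
    # Assumption: magnitudes are compared, so drops count as well as rises.
    return abs(rate) > time_limit and abs(deviation) > space_limit
```

A reading that changes quickly but agrees with its neighbours is not flagged, which is the behaviour the text relies on to avoid misjudging regional events such as a typhoon passage.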

[0031] The seasonal types are divided into four categories: spring, summer, autumn, and winter. Summer corresponds to June to August, and the upper limit of the dynamic confidence interval for temperature is increased by 2 to 3°C based on the upper limit of the initial confidence interval. Winter corresponds to December to February of the following year, and the lower limit of the dynamic confidence interval for temperature is decreased by 2 to 3°C based on the lower limit of the initial confidence interval.
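As a minimal sketch of the seasonal rule for temperature: the function name is hypothetical, and the default offset of 2.5°C is an assumed midpoint of the 2 to 3°C range given in the text, which does not say how a value within that range is chosen.

```python
def seasonal_temperature_bounds(lower, upper, month, offset_c=2.5):
    """Apply the summer/winter adjustment to an initial temperature interval."""
    if not 2.0 <= offset_c <= 3.0:
        raise ValueError("offset must lie in the 2 to 3 degree range given in the text")
    if month in (6, 7, 8):        # summer: raise the upper limit
        upper += offset_c
    elif month in (12, 1, 2):     # winter: lower the lower limit
        lower -= offset_c
    return lower, upper           # spring and autumn are left unchanged
```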

[0032] The calculation steps for the Rossby number are as follows: obtain the characteristic flow velocity, characteristic length scale, and Coriolis parameter of the observation area, divide the characteristic flow velocity by the product of the characteristic length scale and the Coriolis parameter, and obtain the dimensionless number representing the ratio of inertial force to Coriolis force.
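The Rossby number calculation, Ro = U / (f L), can be sketched as below. The Coriolis parameter formula f = 2Ω sin(latitude) is standard geophysical fluid dynamics; the function names and the example inputs (a 0.1 m/s current over a 100 km scale at 36°N) are illustrative assumptions, not values from the source. The 0.1 limit is the one used in step S05.

```python
import math

EARTH_ROTATION_RATE = 7.2921e-5  # rad/s


def coriolis_parameter(latitude_deg):
    """f = 2 * Omega * sin(latitude)."""
    return 2.0 * EARTH_ROTATION_RATE * math.sin(math.radians(latitude_deg))


def rossby_number(flow_speed, length_scale, coriolis):
    """Ratio of inertial to Coriolis force: U / (|f| * L)."""
    return flow_speed / (abs(coriolis) * length_scale)


def is_normal_dynamic_process(ro, limit=0.1):
    """Step S05 clears the anomaly flag when Ro is below 0.1."""
    return ro < limit


# Illustrative inputs (assumed): U = 0.1 m/s, L = 100 km, latitude 36 degrees N.
f = coriolis_parameter(36.0)
ro = rossby_number(0.1, 1.0e5, f)
```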

[0033] The specific steps of the principal component analysis dimensionality reduction process include: standardizing the high-dimensional data matrix composed of temperature, salinity, dissolved oxygen, turbidity, and chlorophyll concentration; calculating the covariance matrix and solving for eigenvalues and eigenvectors; selecting the top 3 to 5 principal components according to the size of the eigenvalues; and projecting the original data onto the principal component space to obtain the dimensionality-reduced data.
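These steps map onto a short NumPy sketch. The function name and return layout are hypothetical; the use of the sample standard deviation for standardization is an assumption.

```python
import numpy as np


def pca_reduce(data, n_components=3):
    """Standardize, eigendecompose the covariance matrix, and project.

    data: array of shape (n_samples, n_parameters).
    Returns (scores, loadings, eigenvalues) for the retained components.
    """
    # Assumption: sample standard deviation (ddof=1) is used for standardization.
    standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    covariance = np.cov(standardized, rowvar=False)
    # eigh is appropriate for symmetric matrices; eigenvalues come back ascending.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    loadings = eigenvectors[:, order]      # one column per principal component
    scores = standardized @ loadings       # data projected onto the components
    return scores, loadings, eigenvalues[order]
```

The returned loadings are the matrix later used to trace an anomaly back to the original parameters.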

[0034] The calculation steps for the squared Mahalanobis distance are as follows: calculate the difference vector between the sample point and the mean of all samples in the principal component space, multiply the transposed difference vector by the inverse of the covariance matrix, and then multiply the result by the difference vector itself to obtain a distance metric that takes into account the correlation between variables.

[0035] The chi-square distribution critical value is determined by the number of principal components, with a significance level set to 0.001. The chi-square distribution critical value is 16.266 when the number of principal components is 3, and 20.515 when the number of principal components is 5.
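The squared Mahalanobis distance, D² = (x − μ)ᵀ Σ⁻¹ (x − μ), and the comparison with the chi-square critical value can be sketched as follows. The two critical values are those stated in the text; the function names are hypothetical, and estimating the mean and covariance from the same batch being tested is an assumption.

```python
import numpy as np

# Critical values at significance level 0.001, keyed by number of principal components.
CHI2_CRITICAL_0_001 = {3: 16.266, 5: 20.515}


def mahalanobis_squared(scores):
    """Squared Mahalanobis distance of each row from the batch mean."""
    diff = scores - scores.mean(axis=0)
    # Assumption: mean and covariance are estimated from the batch itself.
    cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
    # Row-wise diff^T @ cov_inv @ diff.
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


def coupled_anomaly_mask(scores):
    """True where the distance exceeds the critical value for this dimensionality."""
    n_components = scores.shape[1]
    return mahalanobis_squared(scores) > CHI2_CRITICAL_0_001[n_components]
```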

[0036] The loading matrix back-inference step is as follows: extract the loading matrix obtained during the principal component analysis dimensionality reduction process. Each column of the loading matrix represents the weight coefficient of a principal component on each original parameter. By analyzing the weight coefficients with larger absolute values in the columns corresponding to the abnormal principal components, the original parameters that contribute the most to the multi-parameter coupled abnormal data are identified.
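A minimal sketch of this back-inference, assuming the index of the abnormal principal component is already known (the text does not say how it is chosen). The parameter order and function name are hypothetical.

```python
import numpy as np

# Assumed column order of the original data matrix.
PARAMETERS = ["temperature", "salinity", "dissolved_oxygen", "turbidity", "chlorophyll"]


def locate_anomaly_source(loadings, abnormal_component, top_n=2):
    """Rank original parameters by |loading| on one abnormal principal component.

    loadings: array of shape (n_parameters, n_components).
    Returns a list of (parameter_name, loading) pairs, largest magnitude first.
    """
    column = loadings[:, abnormal_component]
    order = np.argsort(np.abs(column))[::-1][:top_n]
    return [(PARAMETERS[i], float(column[i])) for i in order]
```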

[0037] The specific steps of the sensor dynamic response compensation program include: calling the sensor transfer function parameters obtained from laboratory calibration under different flow rate conditions; performing deconvolution processing on the current sensor output signal to eliminate the hysteresis effect caused by thermal inertia or mechanical inertia; and shortening the response time from the original 10 to 30 seconds to 4 to 12 seconds to improve the ability to capture instantaneous changes.

[0038] The specific steps of the dual-redundant sensor mutual verification procedure include: the main sensor continuously operates and outputs main sensor data in real time; the backup sensor is started once a week to perform short-term measurements and output backup sensor data; the output difference between the main sensor data and the backup sensor data under the same environment is compared; when the output difference exceeds 5%, it is determined that the main sensor has baseline drift, a mechanical brushing device is started to clean the probe surface and the drift component is separated using a Kalman filter algorithm.
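The 5% comparison between the main and backup sensors can be sketched as below. Expressing the difference relative to the backup reading is an assumption; the text gives the 5% limit but not the reference value. The brushing and Kalman filtering steps are not covered here.

```python
def main_sensor_has_drift(main_value, backup_value, limit=0.05):
    """Compare readings taken in the same environment against the 5 % limit."""
    # Assumption: the backup sensor is the reference for the relative difference.
    relative_difference = abs(main_value - backup_value) / abs(backup_value)
    return relative_difference > limit
```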

[0039] The isolation tree construction steps of the Isolation Forest algorithm are as follows: randomly select a subset of samples from the high-dimensional sample set; randomly select a feature from the feature dimensions of the subset of samples; randomly generate a split value between the maximum and minimum values of the feature; divide the subset of samples into two subsets according to the split value and recursively repeat the above process until each sample is isolated or the maximum tree depth is reached.

[0040] The calculation steps for the normalized average path length are as follows: Calculate the path length of each sample from the root node to the isolated node in the 200 isolation trees; calculate the arithmetic mean of the 200 path lengths; normalize the arithmetic mean by dividing it by the expected average path length of the theoretical normal sample to obtain an anomaly score between 0 and 1, where the closer the anomaly score is to 1, the higher the degree of anomaly.
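The normalization can be sketched as follows. Note an assumption: the text describes only the ratio of the mean path length to the expected path length, whereas the published Isolation Forest algorithm maps that ratio r to a score 2^(−r), which is what yields a value in (0, 1] that approaches 1 for easily isolated samples. The sketch uses that standard mapping. The subsample size of 256 in the usage line is also an assumption; it is not given in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329


def expected_path_length(n):
    """c(n): average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA   # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(path_lengths, subsample_size):
    """Standard Isolation Forest score from per-tree path lengths."""
    mean_path = sum(path_lengths) / len(path_lengths)
    ratio = mean_path / expected_path_length(subsample_size)
    return 2.0 ** (-ratio)   # short paths give scores near 1


# 200 trees as in the text; path length 3.0 and subsample size 256 are illustrative.
score = anomaly_score([3.0] * 200, 256)
```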

[0041] The closed-loop feedback calibration steps specifically include: collecting all samples marked as multi-parameter coupling anomalies for the current month; calculating the deviation of the actual measured value of each parameter from the upper and lower limits of the dynamic confidence interval; calculating the parameter error rate as the ratio of the number of deviation samples to the total number of samples; and when the parameter error rate exceeds 5%, considering the current upper and lower limits of the dynamic confidence interval to be set too strictly, and automatically expanding the range by multiplying the width of the dynamic confidence interval of the parameter by 1.1.
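A minimal sketch of this calibration step. The 5% trigger and the factor 1.1 come from the text; widening symmetrically about the interval centre is an assumption, since the text does not say how the extra width is distributed. Function names are hypothetical.

```python
def parameter_error_rate(values, lower, upper):
    """Fraction of samples falling outside the dynamic confidence interval."""
    outside = sum(1 for v in values if v < lower or v > upper)
    return outside / len(values)


def widen_interval(lower, upper, factor=1.1):
    """Scale the interval width by `factor`, keeping the centre fixed (assumed)."""
    centre = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0 * factor
    return centre - half_width, centre + half_width


def calibrate(values, lower, upper, max_error_rate=0.05):
    """Widen the interval only when the error rate exceeds 5 %."""
    if parameter_error_rate(values, lower, upper) > max_error_rate:
        return widen_interval(lower, upper)
    return lower, upper
```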

[0042] The specific structure of the ocean dynamic constraint confidence boundary model is as follows: the input layer receives seven raw parameters, including temperature, salinity, dissolved oxygen, turbidity, chlorophyll concentration, wind speed, and seasonal type encoding; the first hidden layer contains 64 neurons using a rectified linear unit (ReLU) activation function; the second hidden layer contains 128 neurons using a ReLU activation function; the third hidden layer contains 64 neurons using a ReLU activation function; the output layer contains 15 neurons corresponding to the upper and lower limits of the dynamic confidence intervals of the seven parameters and one multidimensional anomaly metric; the ocean dynamic constraint confidence boundary model adopts a network depth dynamic adjustment algorithm based on random depth, in which the second hidden layer is randomly skipped with a probability of 0.5 in each batch during the training phase, and the output of the first hidden layer is directly connected to the third hidden layer, and the complete three-layer hidden layer structure is maintained for forward propagation during the inference phase.

[0043] The steps for establishing the training dataset for the ocean dynamic constraint confidence boundary model specifically include: collecting historical data from 50 typical ocean observation stations worldwide over the past 10 years; selecting data from time periods containing extreme sea state events as positive samples, including observation records during typhoons, cold waves, and red tide outbreaks; selecting data under normal sea states as negative samples; labeling each sample with a true confidence interval boundary value, which is determined by expert experience combined with whether equipment failure or sudden environmental events occur within the following week; and dividing the samples into a training set and a validation set at an 8:2 ratio, with the training set containing 40,000 samples and the validation set containing 10,000 samples.

[0044] The specific steps for training the ocean dynamic constraint confidence boundary model include: initializing the model weight parameters to follow a normal distribution with a mean of 0 and a standard deviation of 0.01; setting the batch size to 128, the learning rate to 0.001, and using the adaptive moment estimation (Adam) optimizer; in each training batch, deciding whether to skip the second hidden layer with a probability of 0.5; if skipped, the 64-dimensional output of the first hidden layer is projected up to 128 dimensions through a linear transformation layer and then directly input into the third hidden layer, with the weights of the linear transformation layer being updated synchronously during training; defining the loss function as the weighted sum of the mean square error of the predicted upper and lower limits of the dynamic confidence interval and the binary cross-entropy of the predicted multidimensional anomaly metric, with weight coefficients of 0.7 and 0.3 respectively; evaluating the performance on the validation set every 10 training batches and recording the model parameters when the validation set loss is minimized; stopping training when the validation set loss does not decrease for 20 consecutive evaluations, and loading the model parameters recorded at the minimum validation set loss as the final model.
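The weighted loss can be sketched in NumPy as below. It assumes the weights 0.7 and 0.3 apply to the mean square error and the binary cross-entropy in the order they are listed, and that the anomaly metric output is a probability in (0, 1), for example after a sigmoid. Neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
import numpy as np


def combined_loss(pred_bounds, true_bounds, pred_anomaly, true_anomaly,
                  w_mse=0.7, w_bce=0.3, eps=1e-7):
    """0.7 * MSE on interval limits + 0.3 * BCE on the anomaly metric."""
    mse = np.mean((pred_bounds - true_bounds) ** 2)
    # Clip so that log() never sees exactly 0 or 1.
    p = np.clip(pred_anomaly, eps, 1.0 - eps)
    bce = -np.mean(true_anomaly * np.log(p) + (1.0 - true_anomaly) * np.log(1.0 - p))
    return w_mse * mse + w_bce * bce
```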

[0045] The specific implementation mechanism of the network depth dynamic adjustment algorithm based on stochastic depth in the marine dynamic constraint confidence boundary model is as follows: In the three-hidden-layer structure of the model, the second hidden layer is designed as a dynamic layer that is not activated in every forward propagation during the training phase. Specifically, at the beginning of each training batch, the algorithm generates a random number between 0 and 1. When the random number is less than the preset skip probability of 0.5, the 128 neurons of the second hidden layer, together with their weight matrix and bias vector, are completely bypassed in this forward propagation. In this case, the 64-dimensional feature vector of the first hidden layer is projected to 128 dimensions through an additional linear transformation layer to match the input dimension required by the third hidden layer. The linear transformation layer contains a 64x128 weight matrix and a 128-dimensional bias vector, both of which are updated synchronously with the other layers through backpropagation during training. When the random number is greater than or equal to 0.5, the model performs forward propagation through the complete three-hidden-layer structure: the second hidden layer receives the output of the first hidden layer and passes the processed 128-dimensional features to the third hidden layer. This random skipping mechanism forces the model to learn feature representations that work effectively at different depths, so that the first and third hidden layers must learn to complete the mapping from input to output even without the assistance of the second hidden layer. During the inference phase, all hidden layers are retained and participate in forward propagation. To compensate for the difference in expected output caused by random skipping during training, the output of the second hidden layer is multiplied by a scaling factor before being passed to the third hidden layer. The scaling factor equals 1 minus the skip probability used during training, i.e., 0.5, which keeps the expected activation values consistent between the training and inference phases.
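
The skip-or-keep forward pass can be sketched in NumPy as below. This is only an illustrative sketch under stated assumptions: the ReLU activation, zero-initialized biases, and the helper names `init`, `relu`, and `forward` are not given in the text, and the input width of 7 follows the "seven raw parameters" mentioned later in the description.

```python
import numpy as np

rng = np.random.default_rng(0)
IN_DIM, H1, H2, H3, OUT_DIM = 7, 64, 128, 64, 15  # layer widths taken from the text
SKIP_P = 0.5                                      # skip probability during training


def init(shape):
    """Weights drawn from N(0, 0.01^2), as in the training description."""
    return rng.normal(0.0, 0.01, size=shape)


params = {
    "W1": init((IN_DIM, H1)), "b1": np.zeros(H1),
    "W2": init((H1, H2)), "b2": np.zeros(H2),
    "Wp": init((H1, H2)), "bp": np.zeros(H2),   # 64x128 linear bypass layer
    "W3": init((H2, H3)), "b3": np.zeros(H3),
    "Wo": init((H3, OUT_DIM)), "bo": np.zeros(OUT_DIM),
}


def relu(x):
    return np.maximum(x, 0.0)  # activation choice is an assumption


def forward(x, p, training, skip=None):
    """Forward pass; `skip` may be forced for testing, otherwise drawn once per batch."""
    h1 = relu(x @ p["W1"] + p["b1"])
    if training:
        if skip is None:
            skip = rng.random() < SKIP_P
        if skip:
            h2 = h1 @ p["Wp"] + p["bp"]          # bypass: linear 64 -> 128 projection
        else:
            h2 = relu(h1 @ p["W2"] + p["b2"])    # full-depth path
    else:
        # Inference: layer always active, output scaled by 1 - SKIP_P.
        h2 = (1.0 - SKIP_P) * relu(h1 @ p["W2"] + p["b2"])
    h3 = relu(h2 @ p["W3"] + p["b3"])
    return h3 @ p["Wo"] + p["bo"]
```

The sketch covers only the forward direction; gradient updates for `Wp` and `bp` would be handled by whatever training framework is used.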

[0046] The network depth dynamic adjustment algorithm based on stochastic depth brings several technical benefits to the entire scheme. First, it alleviates the vanishing gradient problem in deep network training. Randomly skipping the second hidden layer is equivalent to building a shallower network in some training batches, allowing gradients to propagate more directly from the output layer to the first hidden layer. This avoids the exponential decay of gradients caused by successive nonlinear transformations, ensures that the parameters of the first hidden layer, which is closest to the input layer, are also fully updated, accelerates overall training convergence, and reduces the risk of getting trapped in local optima. Second, it improves the robustness and generalization ability of the marine dynamic constraint confidence boundary model. During training, the model must adapt to network structures of different depths, which forces it to learn more fundamental feature representations that do not rely on a particular combination of layers. The regularization effect of the algorithm is similar to the ensemble learning idea of training multiple sub-models with different structures and then fusing them. This makes the final model more resistant to small perturbations and noise in the input data, so that it maintains stable prediction of the upper and lower limits of the dynamic confidence interval when facing issues common in ocean observation data, such as sensor drift and sudden noise interference. Third, it improves training efficiency. Skipping the second hidden layer reduces matrix multiplication operations by about one-third in the forward and backward propagation of the affected training batches. 
Although it increases the additional computation of the linear transformation layer, since the number of parameters of the linear transformation layer is much smaller than that of the complete second hidden layer, the overall training time cost per batch is reduced by about 20% to 30%. In the training scenario of the marine dynamic constraint confidence boundary model with massive historical data, it can significantly shorten the overall cycle from data preparation to model deployment. Finally, from the perspective of ocean dynamic constraints, the network depth dynamic adjustment algorithm based on random depth enhances the adaptability of the ocean dynamic constraint confidence boundary model to ocean phenomena at different spatiotemporal scales. Shallow network paths are more suitable for capturing rapidly changing local features, while deep network paths are better at extracting slowly evolving large-scale patterns. The random switching of depth during the training phase enables the ocean dynamic constraint confidence boundary model to spontaneously form a multi-scale feature extraction mechanism, which is consistent with the physical nature of the multi-spatiotemporal scale coupling processes of eddies, internal waves, and tides in the ocean system. Therefore, the predicted upper and lower limits of the dynamic confidence interval can more accurately distinguish between normal ocean dynamic processes and real equipment anomalies or environmental mutations, reducing the probability of mesoscale eddies being misjudged as multi-parameter coupling anomalies when passing through the observation area, and improving the scientificity and practicality of the entire threshold adaptive adjustment method.

[0047] Furthermore, the present invention provides an optional method implemented by a computer to form an adaptive adjustment system for ocean observation data thresholds based on dynamic confidence intervals. The computer is equipped with a readable storage medium, which stores program instructions. When the program instructions are run in the computer, they execute the above-described method.

[0048] The specific implementation methods of the above steps are described in detail below.

[0049] The specific implementation of step S01 is as follows: First, extract historical data for the same period over the past three years from the database of the target observation station. This historical data includes continuous observation sequences of six parameters: temperature, salinity, dissolved oxygen, turbidity, chlorophyll concentration, and wind speed. Calculate the historical mean and historical standard deviation for each parameter. The historical mean is obtained by summing all observed values and dividing by the number of observations. The historical standard deviation is obtained by taking the square root of the average of the squared differences between each observed value and the historical mean. Initial confidence intervals are then constructed based on the 3σ principle: the lower limit of the initial confidence interval equals the historical mean minus 3 times the historical standard deviation, and the upper limit equals the historical mean plus 3 times the historical standard deviation. The 3σ principle derives from normal distribution theory: under the assumption of a normal distribution, about 99.7% of the data fall within the mean plus or minus 3 standard deviations, so the initial confidence interval effectively covers the parameter fluctuation range of a normal marine environment. At the same time, the season type is marked according to the month of the data collection time: January to March is marked as spring, April to June as summer, July to September as autumn, and October to December as winter. The season type marking provides a basis for subsequent dynamic adjustment.
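
Step S01 can be illustrated with a short standard-library sketch. The function names `initial_interval` and `season_label` are hypothetical; the population standard deviation (`pstdev`) is used because the text divides by the number of observations, and the month-to-season mapping is copied from the paragraph above.

```python
from statistics import fmean, pstdev


def initial_interval(history):
    """Return (lower, upper) of the 3-sigma initial confidence interval."""
    mu = fmean(history)
    sigma = pstdev(history, mu)  # population standard deviation, per the text
    return mu - 3.0 * sigma, mu + 3.0 * sigma


def season_label(month):
    """Map a calendar month (1-12) to the season type used in the text."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return ("spring", "summer", "autumn", "winter")[(month - 1) // 3]
```

With a historical mean of 26.8℃ and a standard deviation of 1.2℃, the same arithmetic gives the interval 23.2℃ to 30.4℃ quoted in the later application scenario.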

[0050] The specific implementation of step S02 involves acquiring real-time measurement values of the parameters at the current moment through the sensor array of the target observation station. The sampling frequency of the real-time measurement values is once per hour. Simultaneously, data from 3 to 5 surrounding observation stations are collected through a data sharing network. The surrounding observation stations are selected from those located 50 to 100 kilometers from the target observation station. To calculate the first-order difference rate of change in the time dimension, the parameter measurement values at two adjacent moments are extracted, their difference is calculated, and the difference is divided by the time interval. The first-order difference rate of change in the time dimension reflects the instantaneous rate of change of the parameter over time. To calculate the spatial dimension deviation rate, the arithmetic mean of the simultaneous measurement values of the surrounding observation stations is first calculated as the surrounding-station mean. The surrounding-station mean is then subtracted from the real-time measurement value of the target observation station, and the result is divided by the surrounding-station mean. The spatial dimension deviation rate reflects the degree of difference between the target observation station and the surrounding area. Using the two rates together identifies anomalies in both the temporal and spatial dimensions and avoids the missed detections of single-dimensional detection.
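
The two rates of step S02 reduce to a few lines of arithmetic. The sketch below uses hypothetical function names and takes the deviation as (target minus surrounding mean) divided by the surrounding mean, which matches the values in the later worked example; the neighbour values in the usage note are invented for illustration because Table 2 is not reproduced here.

```python
def temporal_rate(x_now, x_prev, dt_hours=1.0):
    """First-order difference rate of change: (x_t - x_{t-1}) / dt."""
    return (x_now - x_prev) / dt_hours


def spatial_deviation(x_target, neighbours):
    """Relative deviation of the target station from the surrounding-station mean."""
    mean_nb = sum(neighbours) / len(neighbours)
    return (x_target - mean_nb) / mean_nb
```

For instance, `temporal_rate(28.3, 27.8)` gives about 0.5 ℃/h, and `spatial_deviation(28.3, [27.9, 28.0, 28.2, 28.22])` gives about 0.0078, i.e. 0.78%.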

[0051] The specific implementation of step S03 involves feeding the real-time measurement values obtained in step S02 and the historical data collected in step S01 as input vectors into the marine dynamic constraint confidence boundary model. The input vector contains seven elements: real-time measurements of temperature, salinity, dissolved oxygen, turbidity, chlorophyll concentration, and wind speed, as well as a seasonal type code. The seasonal type code uses one-hot encoding, representing spring, summer, autumn, and winter each as a four-dimensional vector. The model extracts features layer by layer through the first, second, and third hidden layers and ultimately generates 15 output values in the output layer: the first 14 output values correspond to the upper and lower limits of the dynamic confidence intervals of the 7 parameters, and the 15th output value is the multidimensional anomaly metric. The model automatically adjusts the upper and lower limits of the output dynamic confidence intervals according to the input seasonal type. In summer, the upper limit of the dynamic confidence interval for temperature is raised by 2 to 3℃ relative to the upper limit of the initial confidence interval to adapt to the high-temperature environment. In winter, the lower limit of the dynamic confidence interval for temperature is lowered by 2 to 3℃ relative to the lower limit of the initial confidence interval to adapt to the low-temperature environment. This dynamic adjustment mechanism is based on the seasonal characteristics of the marine environment and effectively reduces the false alarm rate caused by seasonal changes.

[0052] The specific implementation of step S04 involves comparing the first-order difference rate of change in the time dimension calculated in step S02 with a preset time threshold, and simultaneously comparing the spatial dimension deviation rate with a preset spatial threshold. The time threshold is set to different values for different parameter types: 15 m/s/h for wind speed, 3℃/h for temperature, and 0.5 psu/h for salinity. The spatial threshold is likewise set by parameter type: 20% for wind speed, 15% for temperature, and 10% for salinity. When the absolute value of the first-order difference rate of change in the time dimension of a parameter exceeds the corresponding time threshold and the absolute value of its spatial dimension deviation rate exceeds the corresponding spatial threshold, both dimensions are determined to exceed their limits simultaneously. The data corresponding to that parameter is marked as suspected abnormal data and enters the subsequent verification process. This dual-dimensional joint judgment mechanism effectively excludes cases in which only the temporal or only the spatial dimension deviates while the overall situation is normal, thereby improving the accuracy of anomaly identification.
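
The joint judgment of step S04 can be sketched as a simple conjunction of two threshold tests. The dictionary keys and the function name `is_suspected` are hypothetical; the threshold values are those listed above, with the spatial thresholds written as fractions.

```python
# Per-parameter thresholds from the text (rate per hour; deviation as a fraction).
TIME_THRESHOLDS = {"wind_speed": 15.0, "temperature": 3.0, "salinity": 0.5}
SPACE_THRESHOLDS = {"wind_speed": 0.20, "temperature": 0.15, "salinity": 0.10}


def is_suspected(param, rate, deviation):
    """Flag a parameter only when BOTH the temporal and spatial limits are exceeded."""
    over_time = abs(rate) > TIME_THRESHOLDS[param]
    over_space = abs(deviation) > SPACE_THRESHOLDS[param]
    return over_time and over_space
```

A rapid change that the neighbouring stations also show (large rate, small deviation) is therefore not flagged, and neither is a steady offset from the neighbours (small rate, large deviation).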

[0053] The specific implementation of step S05 involves verifying the suspected anomalous data marked in step S04 against ocean dynamics constraints. First, temperature and salinity data of the observation area are acquired to calculate the seawater density field distribution. The geostrophic equilibrium relationship is used to verify whether the density field satisfies the dynamic constraints of large-scale ocean flows. Calculating the Rossby number requires the characteristic velocity, the characteristic length scale, and the Coriolis parameter of the observation area. The characteristic velocity is obtained by time-averaging the current meter measurements at the observation station. The characteristic length scale is taken as the horizontal scale of the observation area, usually 10 to 100 kilometers. The Coriolis parameter is calculated from the latitude of the observation station. The Rossby number is obtained by dividing the characteristic velocity by the product of the characteristic length scale and the Coriolis parameter. A Rossby number of less than 0.1 indicates that the flow in the observation area is in geostrophic equilibrium, so the suspected anomalous data is actually a normal ocean dynamic process caused by mesoscale eddies or large-scale flows; the anomaly marker is removed and the data is restored to normal. This verification is based on the basic principles of ocean fluid mechanics and effectively distinguishes real equipment anomalies from normal ocean phenomena, avoiding the misjudgment of dynamic processes such as eddies and internal waves as anomalies.
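
The Rossby number check of step S05 can be sketched as below, assuming the standard Coriolis parameter f = 2Ω sin(latitude) with Ω taken as 7.292e-5 rad/s. The function names are hypothetical, and the latitude used in the usage note (36°N) is an illustrative value, not one given in the text.

```python
import math

OMEGA = 7.292e-5  # Earth's rotational angular velocity in rad/s


def coriolis_parameter(lat_deg):
    """f = 2 * Omega * sin(latitude), in s^-1."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def rossby_number(u_ms, length_m, lat_deg):
    """Ro = U / (f * L); the magnitude of f is used so the result is non-negative."""
    f = abs(coriolis_parameter(lat_deg))
    if f == 0.0:
        raise ValueError("Coriolis parameter vanishes at the equator")
    return u_ms / (f * length_m)


def is_geostrophic(u_ms, length_m, lat_deg, limit=0.1):
    """True when Ro is below the 0.1 limit used in the text."""
    return rossby_number(u_ms, length_m, lat_deg) < limit
```

For a characteristic velocity of 0.1 m/s and a length scale of 50 km at 36°N, the Rossby number is roughly 0.02, so the suspected anomaly marker would be removed.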

[0054] The specific implementation of step S06 involves performing principal component analysis (PCA) dimensionality reduction on the data that is still marked as suspected anomalous after the ocean dynamics constraint verification in step S05. The high-dimensional data matrix composed of five parameters (temperature, salinity, dissolved oxygen, turbidity, and chlorophyll concentration) is standardized to eliminate the influence of the parameters' different units. The covariance matrix of the standardized data is calculated, and its eigenvalues and eigenvectors are solved. The eigenvalues are sorted from largest to smallest and the top 3 to 5 principal components are selected, with the number of principal components chosen so that the cumulative variance contribution rate exceeds 85%. The original high-dimensional data is projected onto the low-dimensional space spanned by the principal components to obtain the dimensionality-reduced data. The squared Mahalanobis distance is then calculated in the principal component space: the transpose of the difference vector between the sample point and the sample mean point is multiplied by the inverse of the covariance matrix and then by the difference vector itself. The squared Mahalanobis distance is compared with the chi-square distribution critical value for the corresponding degrees of freedom. When the number of principal components is 3, the critical value is 16.266, and when it is 5, the critical value is 20.515; both correspond to a significance level of 0.001. If the squared Mahalanobis distance exceeds the critical value, the data is confirmed as multi-parameter coupling anomaly data. The combination of PCA dimensionality reduction and the Mahalanobis distance metric effectively identifies hidden anomalies in which the joint distribution of multiple parameters deviates from the normal state, solving the coupling anomaly problem that traditional parameter-by-parameter detection cannot detect.
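
A compact NumPy sketch of step S06 follows. It is an illustration under stated assumptions rather than the applicant's code: the function names are hypothetical, the reference data used for fitting is assumed to be a matrix of normal historical samples, and the critical value for 4 components (18.467) is added from the standard chi-square table at the 0.001 level because the text quotes only the values for 3 and 5. In principal component space the covariance matrix is diagonal with the retained eigenvalues on the diagonal, so the squared Mahalanobis distance reduces to a sum of squared scores divided by eigenvalues.

```python
import numpy as np

# Chi-square critical values at the 0.001 level; 3 and 5 are quoted in the text,
# 4 is taken from the standard table (assumption of this sketch).
CHI2_CRIT = {3: 16.266, 4: 18.467, 5: 20.515}


def fit_pca(X, min_cum_var=0.85, k_min=3, k_max=5):
    """Standardize X (samples x parameters) and keep the leading principal components."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    Z = (X - mean) / std
    eigval, eigvec = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigval)[::-1]              # sort eigenvalues in descending order
    eigval, eigvec = eigval[order], eigvec[:, order]
    cum = np.cumsum(eigval) / eigval.sum()
    k = int(np.searchsorted(cum, min_cum_var)) + 1  # components needed to reach 85%
    k = min(max(k, k_min), k_max)
    return mean, std, eigvec[:, :k], eigval[:k]


def mahalanobis_sq(x, mean, std, components, eigval):
    """Squared Mahalanobis distance of one sample in principal component space."""
    scores = ((x - mean) / std) @ components       # projected, zero-mean scores
    return float(np.sum(scores ** 2 / eigval))     # diagonal covariance = eigenvalues


def is_coupled_anomaly(x, mean, std, components, eigval):
    return mahalanobis_sq(x, mean, std, components, eigval) > CHI2_CRIT[len(eigval)]
```

A sample equal to the historical mean has distance zero, while a sample many standard deviations away in several parameters at once exceeds the critical value.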

[0055] It should be noted that the key technical ideas of this invention are reflected in three aspects: a joint detection mechanism for spatiotemporal dual-dimensional change rates, a marine dynamic constraint confidence boundary model based on stochastic depth, and a cascaded verification combining ocean dynamics constraint verification with principal component space anomaly measurement. The joint detection mechanism overcomes the limitation of traditional time-only detection in identifying regional anomalies by simultaneously calculating the first-order temporal difference and the spatial deviation rate of each parameter. When data from an observation station changes drastically in time but similar changes are observed at surrounding stations, the small spatial deviation rate does not trigger a false alarm. Conversely, when data from a station deviates from surrounding stations but changes only gradually in time, the spatial deviation alone does not trigger an alarm either. Only when both dimensions exceed their limits simultaneously is the data considered a suspected anomaly. This mechanism significantly reduces the false alarm rate under large-scale extreme weather conditions such as typhoons and cold waves, while maintaining sensitivity to local equipment failures. The marine dynamic constraint confidence boundary model based on stochastic depth randomly skips the second hidden layer during the training phase, so that the model switches randomly between shallow and deep network paths. The shallow path captures high-frequency, rapidly changing features, while the deep path extracts low-frequency, slowly evolving patterns. This multi-scale feature extraction capability is highly compatible with the physical nature of ocean system processes at multiple spatiotemporal scales, such as eddies, internal waves, and tides. 
Compared to fixed-depth networks, this model is more robust to sensor drift and sudden noise, and its training efficiency is improved by approximately 20% to 30%. It can also maintain stable confidence interval prediction performance under extreme sea conditions. The cascaded verification of ocean dynamics constraints and principal component spatial anomaly measurement filters out normal ocean dynamic processes conforming to geostrophic equilibrium using Rossby number determination, and then identifies multi-parameter coupling anomalies through principal component analysis and Mahalanobis distance measurement. This two-stage verification mechanism reduces the misclassification rate of mesoscale eddies from over 40% in traditional methods to below 5%, while simultaneously increasing the detection rate of hidden anomalies such as the coexistence of high temperature and low oxygen to over 85%. The synergistic effect of the three key technical approaches is that spatiotemporal dual-dimensional detection provides initial screening, the ocean dynamic constraint confidence boundary model provides adaptive thresholds, and the cascaded verification mechanism provides accurate confirmation. The multi-level progressive architecture ensures both real-time performance and accuracy, forming a complete anomaly identification chain from coarse to fine and from fast to accurate. Compared with traditional fixed threshold methods, this invention reduces the false alarm rate by more than 70% while keeping the false alarm rate below 3%, significantly improving the level of intelligence in ocean observation data quality control.

[0056] It should be noted that this invention also solves the following technical problem: traditional single-parameter independent detection methods neglect the physical coupling relationship between marine elements, making it difficult to identify multi-parameter linkage anomalies. Parameters such as temperature, salinity, and dissolved oxygen in the marine environment are not independent but tightly coupled through physical mechanisms such as the seawater state equation and density field distribution. Salinity anomalies caused by sensor fouling are inevitably accompanied by density calculation errors, and the surge in chlorophyll concentration caused by red tide outbreaks simultaneously affects dissolved oxygen and turbidity measurements. This invention projects high-dimensional multi-parameter data into a low-dimensional principal component space through principal component analysis. The eigenvalue decomposition of the covariance matrix naturally extracts the correlation structure between parameters. The squared Mahalanobis distance implicitly considers the covariance between variables when quantifying the degree of sample deviation in the principal component space. Compared to the traditional method of setting thresholds for each parameter individually, this multi-dimensional joint detection mechanism can capture abnormal deviations in parameter coupling patterns. The load matrix back-inference further decomposes the abnormal signals in the principal component space to the original parameter dimensions, accurately identifying the root causes of multi-parameter linkage anomalies, thus solving the technical problem of traditional single-parameter independent detection neglecting physical coupling relationships.

[0057] Specifically, the principle of this invention is as follows: The fundamental reason why this invention can solve the technical problem lies in embedding ocean dynamics physical constraints into a data-driven anomaly detection framework. Traditional fixed threshold methods rely solely on historical statistical features and cannot understand the intrinsic physical laws of ocean systems. In contrast, this invention quantifies the ratio of inertial force to Coriolis force using the Rossby number. When the Rossby number is less than 0.1, it is determined to be a normal dynamic process dominated by geostrophic equilibrium, thus eliminating the interference of natural phenomena such as mesoscale eddies from the physical mechanism level. The stochastic depth training strategy of the ocean dynamic constraint confidence boundary model forces the network to complete the input-output mapping at different depths. Shallow paths capture rapidly changing local features corresponding to sudden anomalies, while deep paths extract slowly evolving large-scale patterns corresponding to seasonal patterns. The random switching of depth during the training phase is essentially an implicit multi-scale ensemble learning, enabling the upper and lower limits of the predicted dynamic confidence interval to adaptively distinguish ocean phenomena at different spatiotemporal scales. The squared Mahalanobis distance in the principal component space takes into account the covariance structure among multiple parameters and is better able to identify parameter coupling anomalies than Euclidean distance. The load matrix back-inference decomposes the high-dimensional anomaly signal into the original parameter space, realizing a closed loop from global detection to local localization. This logical chain ensures the scientific nature and traceability of anomaly determination.

[0058] The following provides a specific embodiment 1 of the present invention, and the specific implementation of each step in this embodiment 1 is described in detail below.

[0059] The specific implementation of step S01 is as follows: Collect historical data of the target observation station for the same period over the past three years, calculate the historical mean and historical standard deviation of temperature, salinity, dissolved oxygen, turbidity, chlorophyll concentration, and wind speed, construct an initial confidence interval based on the 3σ principle, and label the seasonal type. The lower limit $L_0$ and the upper limit $U_0$ of the initial confidence interval are calculated as follows: $L_0 = \mu - 3\sigma$; $U_0 = \mu + 3\sigma$; where $\mu$ is the historical mean, with units consistent with the corresponding parameter, and $\sigma$ is the historical standard deviation, with units consistent with the corresponding parameter. $\mu$ is obtained by calculating the arithmetic mean of the data from the same period over the past three years, and $\sigma$ is obtained by calculating the standard deviation of the same data.

[0060] The specific implementation of step S02 is as follows: Obtain the real-time measured values of each parameter of the observation station at the current moment, simultaneously collect concurrent data from 3 to 5 surrounding observation stations, and calculate the first-order difference rate of change in the time dimension and the spatial dimension deviation rate. The first-order difference rate of change in the time dimension $R_t$ is calculated as follows: $R_t = \frac{x_t - x_{t-1}}{\Delta t}$; where $x_t$ is the parameter measurement at time $t$, with units consistent with the corresponding parameter; $x_{t-1}$ is the parameter measurement at time $t-1$, with units consistent with the corresponding parameter; $\Delta t$ is the time interval, in hours (h); and $t$ is the time sequence number. The spatial dimension deviation rate $D_s$ is calculated as follows: $D_s = \frac{x - \bar{x}_s}{\bar{x}_s} \times 100\%$; where $x$ is the real-time measurement, with units consistent with the corresponding parameter, and $\bar{x}_s$ is the mean value of the surrounding observation stations, with units consistent with the corresponding parameter. $\bar{x}_s$ is obtained by calculating the arithmetic mean of the concurrent measurements from the surrounding observation stations, where the number of stations $n$ included in the average ranges from 3 to 5.

[0061] The specific implementation methods of steps S03 and S04 are the same as those described above, and will not be repeated in detail here.

[0062] The specific implementation of step S05 is as follows: Perform ocean dynamics constraint verification on the suspected anomalous data, and calculate the Rossby number corresponding to the geostrophic equilibrium relationship of the temperature, salinity, and density fields. The Rossby number $Ro$ is calculated as follows: $Ro = \frac{U}{fL}$; where $U$ is the characteristic flow velocity, in m/s, obtained in real time by a current meter; $f$ is the Coriolis parameter, in s$^{-1}$; and $L$ is the characteristic length scale, in meters (m), determined from the eddy scale or front width of the observation area. The Coriolis parameter $f$ is calculated as follows: $f = 2\Omega\sin\varphi$; where $\Omega$ is the Earth's rotational angular velocity, with an empirical value of $7.292 \times 10^{-5}$ rad/s; $\varphi$ is the latitude of the observation station in radians, obtained by multiplying the latitude of the observation station in degrees by $\pi/180$; and $\sin$ is the sine function, which is dimensionless. When $Ro$ is less than 0.1, the data is considered to reflect a normal ocean dynamic process, and the anomaly marker is removed.

[0063] The specific implementation of step S06 is as follows: Principal component analysis (PCA) is used to reduce the dimensionality of the suspected anomalous data that remains after the ocean dynamics constraint verification, and the squared Mahalanobis distance is calculated in the principal component space. The PCA dimensionality reduction process includes: standardizing the high-dimensional data matrix composed of temperature, salinity, dissolved oxygen, turbidity, and chlorophyll concentration; calculating the covariance matrix and solving for its eigenvalues and eigenvectors; selecting the top 3 to 5 principal components by eigenvalue size; and projecting the original data into the principal component space to obtain the dimensionality-reduced data. The squared Mahalanobis distance $D^2$ is calculated as follows: $D^2 = (\mathbf{z} - \bar{\mathbf{z}})^{T} \mathbf{S}^{-1} (\mathbf{z} - \bar{\mathbf{z}})$; where $\mathbf{z}$ is the column vector of the sample point in the principal component space, with dimension $k \times 1$; $\bar{\mathbf{z}}$ is the column vector of the mean point of all samples in the principal component space, with dimension $k \times 1$; $k$ is the number of principal components selected, ranging from 3 to 5; $(\mathbf{z} - \bar{\mathbf{z}})^{T}$ is the transpose of the difference column vector, with dimension $1 \times k$; $\mathbf{S}$ is the covariance matrix in the principal component space, with dimension $k \times k$; and $\mathbf{S}^{-1}$ is the inverse of the covariance matrix, with dimension $k \times k$. This formula yields the dimensionless squared Mahalanobis distance through matrix multiplication. Data exceeding the chi-square distribution critical value for the corresponding degrees of freedom is identified as multi-parameter coupled anomalous data.

[0064] The specific implementation method of step S07 is the same as described above, and will not be repeated in detail here.

[0065] The specific implementation of step S08 is as follows: The isolation forest algorithm is used to detect outliers in all observed samples of the current batch, constructing a forest of 200 isolation trees. The normalized average path length value of each sample is calculated as the anomaly score. The anomaly score $s(x, n)$ is calculated as follows: $s(x, n) = 2^{-\frac{E(h(x))}{c(n)}}$; where $E(h(x))$ is the arithmetic mean of the path lengths from the root node to the isolating node for sample $x$ over the 200 isolation trees, a dimensionless path length count; $x$ is the currently observed sample to be tested; $c(n)$ is the expected average path length of a theoretical normal sample, which is dimensionless; and $n$ is the total number of samples in the training set. $c(n)$ is calculated as follows: $c(n) = 2H(n-1) - \frac{2(n-1)}{n}$; where $H(\cdot)$ is the harmonic number, which is dimensionless and is calculated as follows: $H(i) = \ln(i) + \gamma$; where $\ln$ is the natural logarithm function, which is dimensionless, and $\gamma$ is Euler's constant, with an empirical value of 0.5772, which is dimensionless. The closer the anomaly score $s(x, n)$ is to 1, the higher the degree of abnormality.
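
The score normalization of step S08 can be sketched with the standard library alone, assuming the usual isolation forest form s = 2^(-E(h(x))/c(n)) with c(n) = 2H(n-1) - 2(n-1)/n and H(i) ≈ ln(i) + 0.5772. The function names are hypothetical, and the mean path length is taken as an input because building the 200 trees is outside the scope of this sketch.

```python
import math

EULER_GAMMA = 0.5772  # empirical value of Euler's constant quoted in the text


def harmonic(i):
    """Approximate harmonic number H(i) = ln(i) + gamma."""
    return math.log(i) + EULER_GAMMA


def expected_path_length(n):
    """c(n): expected average path length for n training samples (n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length, n):
    """Normalized score in (0, 1]; values near 1 indicate stronger anomalies."""
    return 2.0 ** (-mean_path_length / expected_path_length(n))
```

A sample whose mean path length equals c(n) scores exactly 0.5, and shorter paths push the score toward 1.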

[0066] The specific implementation of step S09 is as follows: Calculate the parameter error rate of all abnormal data for the current month. When the parameter error rate exceeds 5%, expand the upper and lower limits of the dynamic confidence interval for the corresponding parameter by 10%, and update the weight parameters of the marine dynamic constraint confidence boundary model to achieve closed-loop feedback calibration. The parameter error rate $E$ is calculated as follows: $E = \frac{N_e}{N} \times 100\%$; where $N_e$ is the number of erroneous samples, which is dimensionless, and $N$ is the total number of samples, which is dimensionless. When $E$ exceeds 5%, the adjusted lower limit $L'$ and upper limit $U'$ of the dynamic confidence interval are calculated as follows: $L' = C - 1.1\,(C - L)$; $U' = C + 1.1\,(U - C)$; where $L$ is the lower limit of the dynamic confidence interval before adjustment, with units consistent with the corresponding parameter; $U$ is the upper limit of the dynamic confidence interval before adjustment, with units consistent with the corresponding parameter; and $C$ is the center value of the dynamic confidence interval, with units consistent with the corresponding parameter, obtained by calculating $C = \frac{L + U}{2}$.
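
The feedback rule of step S09 can be sketched as follows. The sketch assumes that "expanding by 10%" means widening the interval symmetrically about its center so that each half-width grows by a factor of 1.1; this reading, like the function names, is an assumption rather than something the text confirms.

```python
def error_rate(n_error, n_total):
    """Fraction of erroneous samples among all samples for the month."""
    return n_error / n_total


def widen_interval(lower, upper, factor=1.10):
    """Widen the interval about its center; half-width is multiplied by `factor`."""
    center = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0
    return center - factor * half_width, center + factor * half_width


def calibrate(lower, upper, n_error, n_total, limit=0.05):
    """Apply the 10% widening only when the error rate exceeds the 5% limit."""
    if error_rate(n_error, n_total) > limit:
        return widen_interval(lower, upper)
    return lower, upper
```

For example, an interval of 23.2℃ to 32.9℃ would become roughly 22.7℃ to 33.4℃ after one widening, and would stay unchanged when the monthly error rate is 5% or lower.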

[0067] The specific implementation of step S10 is the same as described above, and will not be repeated in detail here.

[0068] The implementation of the network depth dynamic adjustment algorithm based on stochastic depth for the marine dynamic constraint confidence boundary model is as follows: During the training phase, the second hidden layer is randomly skipped with a probability of 0.5 in each batch, and the output of the first hidden layer is passed to the third hidden layer through the linear transformation layer. During the inference phase, the output of the second hidden layer is multiplied by a scaling factor before being passed to the third hidden layer. The scaling factor $\alpha$ is calculated as follows: $\alpha = 1 - p$; where $p$ is the skip probability during training, empirically set to 0.5, which is dimensionless, and $\alpha$ is the scaling factor, which is dimensionless.

[0069] It should be noted that the variables involved in this invention are explained in detail in Table 1.

[0070] Table 1. Variable Explanation Table

[0071] To better understand and implement this invention, a specific application scenario, Example 2, is given below: A marine observation technology team deployed a marine observation data threshold adaptive adjustment system based on dynamic confidence intervals at a nearshore observation station. The station is equipped with sensors for temperature, salinity, dissolved oxygen, turbidity, chlorophyll concentration, and wind speed to monitor marine environmental parameters in the area. The team first collected historical data from the same period over the past three years. From the historical temperature data, the summer historical mean was calculated as 26.8℃ and the historical standard deviation as 1.2℃. Based on the 3σ principle, an initial confidence interval was constructed with a lower limit of 23.2℃ and an upper limit of 30.4℃. The historical salinity mean was 32.5 psu and the historical standard deviation was 0.8 psu, giving an initial confidence interval with a lower limit of 30.1 psu and an upper limit of 34.9 psu. The historical dissolved oxygen mean was 6.8 mg/L and the historical standard deviation was 0.5 mg/L, giving an initial confidence interval with a lower limit of 5.3 mg/L and an upper limit of 8.3 mg/L. The historical turbidity mean was 12.4 NTU and the historical standard deviation was 2.1 NTU, giving an initial confidence interval with a lower limit of 6.1 NTU and an upper limit of 18.7 NTU. The historical chlorophyll concentration mean was 3.2 μg/L and the historical standard deviation was 0.6 μg/L, giving an initial confidence interval with a lower limit of 1.4 μg/L and an upper limit of 5.0 μg/L. The historical wind speed mean was 8.5 m/s and the historical standard deviation was 2.3 m/s, giving an initial confidence interval with a lower limit of 1.6 m/s and an upper limit of 15.4 m/s.
The technical team marked the current time as summer and obtained real-time measurements: temperature 28.3℃, salinity 33.1 psu, dissolved oxygen 6.5 mg/L, turbidity 15.2 NTU, chlorophyll concentration 3.8 μg/L, and wind speed 12.7 m/s. Simultaneous data from four surrounding observation stations were also collected, as shown in Table 2.
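The 3σ construction of the initial confidence intervals in this example can be reproduced with a short sketch. The means and standard deviations are the values quoted above; the dictionary keys and the function names `initial_interval` and `is_within` are illustrative assumptions.

```python
# Historical (mean, standard deviation) pairs quoted in the example scenario.
HISTORY = {
    "temperature_C": (26.8, 1.2),
    "salinity_psu": (32.5, 0.8),
    "dissolved_oxygen_mg_L": (6.8, 0.5),
    "turbidity_NTU": (12.4, 2.1),
    "chlorophyll_ug_L": (3.2, 0.6),
    "wind_speed_m_s": (8.5, 2.3),
}


def initial_interval(mean, std, k=3.0):
    """Return (lower, upper) = mean -/+ k * std; k = 3 corresponds to the 3-sigma principle."""
    if std < 0:
        raise ValueError("standard deviation must be non-negative")
    return mean - k * std, mean + k * std


def is_within(value, interval):
    """True when the value lies inside the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


if __name__ == "__main__":
    for name, (mean, std) in HISTORY.items():
        lower, upper = initial_interval(mean, std)
        print(f"{name}: [{lower:.1f}, {upper:.1f}]")
```

With the real-time temperature of 28.3℃, `is_within(28.3, initial_interval(26.8, 1.2))` evaluates to true, in line with the normal status reported in the example.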

[0072] Table 2. Concurrent Measurement Data from Surrounding Observation Stations

[0073] The technical team calculated the average values from the surrounding observation stations: temperature 28.08℃, salinity 32.95 psu, dissolved oxygen 6.63 mg/L, turbidity 14.90 NTU, chlorophyll concentration 3.70 μg/L, and wind speed 12.33 m/s. They then calculated the spatial dimension deviation rates: 0.78% for temperature, 0.45% for salinity, 2.27% for dissolved oxygen, 2.01% for turbidity, 2.70% for chlorophyll concentration, and 3.00% for wind speed, all within their respective spatial thresholds. The team also extracted the previous hour's measurements: temperature 27.8℃, salinity 33.0 psu, dissolved oxygen 6.6 mg/L, turbidity 14.8 NTU, chlorophyll concentration 3.6 μg/L, and wind speed 10.2 m/s. The first-order difference rates of change in the time dimension were calculated as 0.5℃/h for temperature, 0.1 psu/h for salinity, 0.1 mg/L/h for dissolved oxygen, 0.4 NTU/h for turbidity, 0.2 μg/L/h for chlorophyll concentration, and 2.5 m/s/h for wind speed, all within the time thresholds. The technical team input the real-time measurements and historical data into the marine dynamic constraint confidence boundary model, which comprises an input layer receiving seven raw parameters, a first hidden layer with 64 neurons, a second hidden layer with 128 neurons, a third hidden layer with 64 neurons, and an output layer with 15 neurons. As shown in Figure 2, the model automatically adjusts the upper limit of the dynamic confidence interval for temperature based on the summer seasonal type, raising it by 2.5℃ from the initial upper limit of 30.4℃ to 32.9℃ while keeping the lower limit unchanged at 23.2℃. The output multi-dimensional anomaly metric is 0.12, indicating that the current observation data is within the normal range.
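The two rates used in this paragraph can be sketched as follows for the temperature and wind speed figures. The function names are illustrative, and taking the absolute value of the spatial difference is an assumption made because the deviation rates in the example are all reported as positive percentages.

```python
def spatial_deviation_pct(value, neighbour_mean):
    """Spatial dimension deviation rate in percent, relative to the mean of surrounding stations.

    ASSUMPTION: the magnitude of the difference is used.
    """
    if neighbour_mean == 0:
        raise ValueError("mean of surrounding stations must be non-zero")
    return abs(value - neighbour_mean) / abs(neighbour_mean) * 100.0


def temporal_rate(current, previous, dt_hours=1.0):
    """First-order difference rate of change per hour between two adjacent measurements."""
    if dt_hours <= 0:
        raise ValueError("time interval must be positive")
    return (current - previous) / dt_hours


if __name__ == "__main__":
    # Temperature: 28.3 now, 27.8 one hour earlier, 28.08 at surrounding stations.
    print(round(spatial_deviation_pct(28.3, 28.08), 2), round(temporal_rate(28.3, 27.8), 2))
    # Wind speed: 12.7 now, 10.2 one hour earlier, 12.33 at surrounding stations.
    print(round(spatial_deviation_pct(12.7, 12.33), 2), round(temporal_rate(12.7, 10.2), 2))
```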

[0074] The technical team continued monitoring. Over the next 3 hours, wind speed rapidly increased from 12.7 m/s to 29.8 m/s, temperature decreased from 28.3℃ to 25.1℃, and salinity increased from 33.1 psu to 34.2 psu. The calculated first-order difference rate of change for wind speed in the time dimension was 17.1 m/s/h, exceeding the time threshold of 15 m/s/h. The average wind speed at surrounding stations was 27.5 m/s, giving a spatial deviation rate of 8.36%, which did not exceed the spatial threshold of 20%. Since only the time dimension exceeded the limit while the spatial dimension did not, the suspected anomaly flag, which requires simultaneous exceedance in both dimensions, was not triggered. Subsequently, wind speed continued to rise to 35.2 m/s, with the average at surrounding stations at 29.8 m/s. The spatial deviation rate increased to 18.12%, still not exceeding the spatial threshold. Meanwhile, the temperature dropped from 28.3℃ to 23.5℃ within 2 hours, with a first-order difference rate of change in the time dimension of 2.4℃/h, which did not exceed the temperature time threshold of 3℃/h. The average temperature at surrounding observation stations was 24.8℃, with a spatial deviation rate of 5.24%, which did not exceed the temperature spatial threshold of 15%. The technical team continued monitoring, and at the 5th hour, the temperature suddenly dropped to 20.8℃, with a first-order difference rate of change in the time dimension of 3.4℃/h, exceeding the time threshold. The average temperature at surrounding observation stations was 23.2℃, with a spatial deviation rate of 10.34%, which did not exceed the spatial threshold. At this time, the first-order difference rate of change of wind speed in the time dimension was 14.2 m/s/h, falling within the range of 12 to 18 m/s/h.
The system automatically increased the frequency of ocean dynamics constraint verification execution from the original once per hour to 1.5 times per hour, optimizing the system response speed.
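The dual-dimension out-of-limit judgment in this paragraph can be sketched as below. The thresholds for wind speed and temperature are those stated in the text; the names `Thresholds`, `exceedance`, `is_suspected`, and `verification_frequency` are hypothetical, and treating the 12 to 18 m/s/h range as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    time_rate: float      # limit on the first-order difference rate, per hour
    spatial_pct: float    # limit on the spatial deviation rate, in percent


THRESHOLDS = {
    "wind_speed": Thresholds(time_rate=15.0, spatial_pct=20.0),
    "temperature": Thresholds(time_rate=3.0, spatial_pct=15.0),
}


def exceedance(param, rate_per_hour, deviation_pct):
    """Return (time_exceeded, space_exceeded) for one parameter."""
    limits = THRESHOLDS[param]
    return abs(rate_per_hour) > limits.time_rate, abs(deviation_pct) > limits.spatial_pct


def is_suspected(param, rate_per_hour, deviation_pct):
    """Suspected anomaly only when BOTH dimensions exceed their limits."""
    return all(exceedance(param, rate_per_hour, deviation_pct))


def verification_frequency(wind_rate_per_hour, base_per_hour=1.0):
    """Raise the verification frequency by a factor of 1.5 when the wind rate is in [12, 18] m/s/h.

    ASSUMPTION: the range is treated as inclusive at both ends.
    """
    if 12.0 <= abs(wind_rate_per_hour) <= 18.0:
        return 1.5 * base_per_hour
    return base_per_hour


if __name__ == "__main__":
    print(exceedance("wind_speed", 17.1, 8.36), is_suspected("wind_speed", 17.1, 8.36))
    print(exceedance("temperature", 3.4, 10.34), is_suspected("temperature", 3.4, 10.34))
    print(verification_frequency(14.2))
```

Requiring both conditions keeps a strong but regionally coherent wind event, which all stations observe, from being flagged on the basis of its rate of change alone.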

[0075] The technical team performed ocean dynamics constraint verification on the temperature data, obtaining the characteristic current velocity of the observation area as 0.45 m/s, the characteristic length scale as 50,000 m, and the Coriolis parameter as follows. The calculated Rossby number was 0.123, which is greater than 0.1, so the event could not be identified as a normal ocean dynamic process and the anomaly marker was retained. The technical team performed principal component analysis (PCA) to reduce the dimensionality of the 5-dimensional data matrix consisting of temperature, salinity, dissolved oxygen, turbidity, and chlorophyll concentration. After standardization, the covariance matrix was calculated and its eigenvalues were solved. The top three principal components, sorted by eigenvalue size, were selected, with a cumulative variance contribution rate of 87.6%. Projecting the original data onto the principal component space yielded the dimensionality-reduced data. The squared Mahalanobis distance between the current sample point and the mean of all samples was calculated to be 18.524, exceeding the chi-square distribution critical value of 16.266 corresponding to 3 degrees of freedom, which confirmed the sample as multi-parameter coupled anomaly data. The technical team used the load matrix to back-calculate to the original parameter space and found that the absolute values of the weighting coefficients of the first principal component were 0.582 for temperature, 0.461 for salinity, 0.398 for dissolved oxygen, 0.312 for turbidity, and 0.289 for chlorophyll concentration, determining that temperature was the main source of the abnormal parameters. The team initiated a sensor dynamic response compensation program, using the sensor transfer function parameters obtained from laboratory calibration at a flow rate of 0.45 m/s.
This program performed deconvolution processing on the current sensor output signal to eliminate the hysteresis effect caused by thermal inertia, reducing the response time from the original 25 seconds to 8 seconds. Simultaneously, a dual-redundant sensor mutual verification program was initiated. The backup sensor measured an output of 23.2℃, which differed from the main sensor's output of 20.8℃ by 10.34%, exceeding the 5% threshold. This indicated baseline drift in the main sensor. A mechanical cleaning device was activated to clean the probe surface, and a Kalman filter algorithm was used to separate the drift component. After correction, the main sensor output was adjusted to 22.9℃.
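The PCA reduction, squared Mahalanobis distance test, and load-matrix back-calculation described in this paragraph can be sketched with NumPy as follows. The sketch runs on synthetic data generated inside the block, which is made up purely for illustration and is not the station data; the function names are hypothetical, and the critical values are the ones quoted in the text.

```python
import numpy as np

# Chi-square critical values at significance 0.001, as quoted in the text
# (key = number of retained principal components).
CHI2_CRITICAL = {3: 16.266, 5: 20.515}
PARAMS = ["temperature", "salinity", "dissolved_oxygen", "turbidity", "chlorophyll"]


def fit_pca(samples, n_components=3):
    """Standardize, eigendecompose the covariance matrix, keep the leading components."""
    mean = samples.mean(axis=0)
    std = samples.std(axis=0, ddof=1)
    z = (samples - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))  # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return {
        "mean": mean,
        "std": std,
        "loadings": eigvecs[:, order],      # load matrix, shape (n_params, k)
        "variances": eigvals[order],        # variance of each retained component
        "explained": float(eigvals[order].sum() / eigvals.sum()),
    }


def mahalanobis_sq(model, x):
    """Squared Mahalanobis distance from the sample mean in principal component space.

    The component scores have zero mean and a diagonal covariance (the eigenvalues),
    so d^T S^-1 d reduces to a sum of squared scores divided by their variances.
    """
    scores = ((x - model["mean"]) / model["std"]) @ model["loadings"]
    return float(np.sum(scores ** 2 / model["variances"]))


def is_coupled_anomaly(model, x):
    k = model["loadings"].shape[1]
    return mahalanobis_sq(model, x) > CHI2_CRITICAL[k]


def dominant_parameter(first_pc_loadings, names=PARAMS):
    """Parameter with the largest absolute weight on the first principal component."""
    return names[int(np.argmax(np.abs(first_pc_loadings)))]


def make_synthetic(n=500, seed=0):
    """SYNTHETIC correlated data (one shared latent factor plus noise), illustration only."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 1))
    return latent + 0.3 * rng.normal(size=(n, 5))


if __name__ == "__main__":
    model = fit_pca(make_synthetic(), n_components=3)
    far_sample = model["mean"] + 5.0 * model["std"]
    print(is_coupled_anomaly(model, far_sample))
    print(dominant_parameter([0.582, 0.461, 0.398, 0.312, 0.289]))
```

Applied to the first-component weights reported in the example, `dominant_parameter` returns `"temperature"`, matching the conclusion in the text.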

[0076] The technical team used the isolation forest algorithm to detect outliers in the current batch of 300 observation samples, constructing a forest of 200 isolation trees. As shown in Figure 3, for the sample with a pre-correction temperature of 20.8℃, the path length from the root node to the isolating node was calculated in each of the 200 isolation trees, and the arithmetic mean was 4.2. The expected average path length of a theoretically normal sample was 9.8, and the normalized anomaly score was 0.429. For the sample with a post-correction temperature of 22.9℃, the average path length was 7.6, and the anomaly score was 0.776, still at a relatively high level. The technical team analyzed all the abnormal data for the month and found that, for the temperature parameter, the number of samples whose actual measurements deviated from the dynamic confidence interval boundary was 18 out of a total sample size of 2160. The parameter error rate was 0.833%, which did not exceed the 5% threshold, so the dynamic confidence interval remained unchanged. However, for the chlorophyll concentration parameter, the number of deviating samples was 127, giving a parameter error rate of 5.88%, which exceeded the 5% threshold. The system automatically adjusted the lower limit of the dynamic confidence interval for chlorophyll concentration from 1.4 μg/L to 1.26 μg/L and the upper limit from 5.0 μg/L to 5.5 μg/L, moving each limit outward by 10%. As shown in Figure 4, the technical team updated the weight parameters of the marine dynamic constraint confidence boundary model and evaluated its performance on the validation set. The mean squared error of the dynamic confidence interval prediction decreased from 0.142 to 0.128, and the binary cross-entropy of the multi-dimensional anomaly metric prediction decreased from 0.086 to 0.079, achieving closed-loop feedback calibration.
The technical team applied the network depth dynamic adjustment algorithm based on stochastic depth to the training of the marine dynamic constraint confidence boundary model. During the training phase, the second hidden layer was randomly skipped in each batch with a probability of 0.5, forcing the model to learn feature representations that work effectively at different depths. During the inference phase, the complete structure of three hidden layers was maintained. The output of the second hidden layer was multiplied by a scaling factor of 0.5 and then passed to the third hidden layer to compensate for the difference in expected output caused by the random skipping during the training phase.
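The arithmetic behind the isolation forest figures and the closed-loop interval adjustment in this paragraph can be checked with a short sketch. The function names are hypothetical. The scores quoted in the example (0.429 and 0.776) equal the ratio of the average path length to the expected path length; the score in the standard isolation forest formulation, 2 raised to the negative of that ratio, is included only for comparison. The multiplicative widening reproduces the chlorophyll figures and assumes a positive lower limit.

```python
def path_length_ratio(mean_path_length, expected_path_length):
    """Average path length divided by the expected path length of a normal sample."""
    if expected_path_length <= 0:
        raise ValueError("expected path length must be positive")
    return mean_path_length / expected_path_length


def standard_iforest_score(mean_path_length, expected_path_length):
    """Standard isolation forest score, 2 ** (-E[h] / c); values near 1 indicate anomalies."""
    return 2.0 ** (-path_length_ratio(mean_path_length, expected_path_length))


def parameter_error_rate_pct(n_deviating, n_total):
    """Share of samples outside the dynamic confidence interval, in percent."""
    if n_total <= 0:
        raise ValueError("total sample size must be positive")
    return n_deviating / n_total * 100.0


def widen_interval(lower, upper, error_rate_pct, limit_pct=5.0, step=0.10):
    """Move each limit outward by `step` when the error rate exceeds the limit.

    ASSUMPTION: limits are scaled multiplicatively, which presumes lower > 0.
    """
    if error_rate_pct <= limit_pct:
        return lower, upper
    return lower * (1.0 - step), upper * (1.0 + step)


if __name__ == "__main__":
    print(round(path_length_ratio(4.2, 9.8), 3), round(path_length_ratio(7.6, 9.8), 3))
    temp_rate = parameter_error_rate_pct(18, 2160)
    chl_rate = parameter_error_rate_pct(127, 2160)
    print(round(temp_rate, 3), widen_interval(23.2, 32.9, temp_rate))
    lower, upper = widen_interval(1.4, 5.0, chl_rate)
    print(round(chl_rate, 2), round(lower, 2), round(upper, 2))
```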

[0077] The advancements of this invention over traditional fixed threshold methods are primarily reflected in two aspects: a dynamic adaptive mechanism and multi-dimensional constraint verification. Traditional methods typically use a fixed 3σ principle to set threshold boundaries, which cannot cope with seasonal changes and regional differences in the marine environment. This leads to normal data being misjudged as anomalies during the high temperatures of summer and truly anomalous data being missed during the low temperatures of winter. This invention, through a marine dynamic constraint confidence boundary model, automatically adjusts the upper and lower limits of the dynamic confidence interval according to seasonal type. In summer, the upper temperature limit is increased by 2 to 3°C, and in winter, the lower temperature limit is decreased by 2 to 3°C, keeping the threshold boundaries synchronized with the actual marine conditions and reducing false alarms and missed alarms caused by the mismatch between fixed thresholds and the actual environment. Traditional methods rely solely on the absolute value of a single parameter, ignoring the physical coupling relationships and spatiotemporal evolution patterns between marine parameters. This easily leads to the misjudgment of normal marine dynamic processes, such as the passage of mesoscale eddies and internal wave activity, as equipment malfunctions. This invention introduces a dual-dimensional out-of-limit judgment based on the first-order difference rate of change in the time dimension and the deviation rate in the spatial dimension. It requires suspected anomaly data to simultaneously meet threshold conditions in both time and space dimensions. Spatial consistency verification is performed using contemporaneous data from surrounding observation stations, reducing the probability of misjudgment due to the unique local environment of isolated observation stations. 
Furthermore, the Rossby number is calculated in the ocean dynamic constraint verification. When the Rossby number is less than 0.1, the event is determined to be a normal ocean dynamic process and the anomaly label is removed. Ocean physical laws such as geostrophic balance are thereby embedded into the anomaly detection process, distinguishing real anomalies from natural changes at the mechanistic level. Principal component analysis dimensionality reduction and Mahalanobis distance calculation enable the identification of multi-parameter coupled anomalies. The source of the anomaly parameters is located by back-calculating through the load matrix, overcoming the limitation of traditional methods, which judge each parameter independently and therefore cannot capture correlated anomalies across parameters. The network depth dynamic adjustment algorithm based on stochastic depth alleviates the vanishing gradient problem in deep networks by randomly skipping the second hidden layer during the training phase, accelerating model convergence. At the same time, its regularization effect enhances the model's robustness to sensor drift and sudden noise, improving the stability and accuracy of dynamic confidence interval prediction. The closed-loop feedback calibration mechanism automatically adjusts the dynamic confidence interval width based on the parameter error rate of the abnormal data for the current month, enabling the system to continuously learn and adapt to the long-term evolution of the observation environment and avoiding the periodic manual recalibration of thresholds required by traditional methods.
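The Rossby number check referred to above can be sketched as follows, using the definition in the description: the characteristic flow velocity divided by the product of the characteristic length scale and the Coriolis parameter. The function names are hypothetical. The Coriolis parameter value used in the worked example is not reproduced in the text, so the demonstration back-computes the value implied by the reported Rossby number of 0.123; that value is an inference, not a quoted figure.

```python
import math

EARTH_ROTATION_RATE = 7.2921e-5   # rad/s, angular velocity of the Earth
ROSSBY_LIMIT = 0.1                # below this, the text treats the process as normal


def coriolis_parameter(latitude_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), in 1/s."""
    return 2.0 * EARTH_ROTATION_RATE * math.sin(math.radians(latitude_deg))


def rossby_number(velocity, length_scale, coriolis):
    """Ro = U / (f * L): ratio of inertial force to Coriolis force."""
    if length_scale <= 0 or coriolis == 0:
        raise ValueError("length scale must be positive and Coriolis parameter non-zero")
    return velocity / (abs(coriolis) * length_scale)


def resolve_flag(suspected, ro):
    """Remove the anomaly label when Ro < 0.1, otherwise keep it."""
    return suspected and ro >= ROSSBY_LIMIT


if __name__ == "__main__":
    u, length = 0.45, 50_000.0            # values from the worked example
    f_implied = u / (length * 0.123)      # INFERRED from the reported Ro, not stated in the text
    ro = rossby_number(u, length, f_implied)
    print(f"f = {f_implied:.3e} 1/s, Ro = {ro:.3f}, flag kept: {resolve_flag(True, ro)}")
```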

[0078] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for adaptively adjusting the threshold of ocean observation data based on dynamic confidence intervals, characterized in that, Historical data from the target observation station is collected to calculate the historical mean and standard deviation, constructing an initial confidence interval and labeling the seasonal type. Real-time measurements are obtained, and concurrent data from surrounding observation stations are collected to calculate the first-order difference rate of change in the time dimension and the spatial dimension deviation rate. The real-time measurements and historical data are input into the ocean dynamic constraint confidence boundary model, which outputs the upper and lower limits of the dynamic confidence interval and multi-dimensional anomaly metrics. It is judged whether the first-order difference rate of change in the time dimension and the spatial dimension deviation rate both exceed their limits, and data exceeding the limits in both dimensions are marked as suspected anomaly data. Ocean dynamic constraint verification is performed, and the Rossby number is calculated to identify normal ocean dynamic processes. Principal component analysis is used to reduce the dimensionality of the suspected anomaly data, and the squared Mahalanobis distance is calculated in the principal component space to confirm multi-parameter coupled anomaly data. The source of the anomaly parameters is located by back-calculating through the load matrix, and the sensor dynamic response compensation program and the dual-redundant sensor mutual verification program are initiated. The isolation forest algorithm is used to detect outliers in the observation samples and calculate the anomaly score. The parameter error rate is counted and used to dynamically expand the range between the upper and lower limits of the dynamic confidence interval to achieve closed-loop feedback calibration.

2. The method according to claim 1, characterized in that, The initial confidence interval is constructed by extracting the historical mean and historical standard deviation, calculating the historical mean minus 3 times the historical standard deviation to obtain the lower limit of the initial confidence interval, and calculating the historical mean plus 3 times the historical standard deviation to obtain the upper limit of the initial confidence interval.

3. The method according to claim 2, characterized in that, The calculation of the first-order difference rate of change in the time dimension specifically involves extracting the parameter measurement values at two adjacent moments, calculating the ratio of the difference between the two values to the time interval, and obtaining the parameter change rate per unit time.

4. The method according to claim 3, characterized in that, The calculation of the spatial dimension deviation rate involves extracting the measured values from the contemporaneous data of surrounding observation stations, calculating the mean of the surrounding observation stations, and then dividing the difference between the real-time measured value and the mean of the surrounding observation stations by the mean of the surrounding observation stations to obtain the spatial dimension deviation rate in percentage form.

5. The method according to claim 4, characterized in that, The time and space thresholds are set according to different parameter types. The time threshold for wind speed is 15 m / s / h and the space threshold is 20%. The time threshold for temperature is 3℃ / h and the space threshold is 15%. The time threshold for salinity is 0.5 psu / h and the space threshold is 10%.

6. The method according to claim 5, characterized in that, Seasonal types are divided into four categories: spring, summer, autumn, and winter. For summer, the upper limit of the dynamic confidence interval for temperatures from June to August is increased by 2 to 3°C from the upper limit of the initial confidence interval. For winter, the lower limit of the dynamic confidence interval for temperatures from December to February of the following year is decreased by 2 to 3°C from the lower limit of the initial confidence interval.

7. The method according to claim 6, characterized in that, The calculation of the Rossby number involves obtaining the characteristic flow velocity, characteristic length scale, and Coriolis parameter of the observation area, and then dividing the characteristic flow velocity by the product of the characteristic length scale and the Coriolis parameter to obtain a dimensionless number representing the ratio of inertial force to Coriolis force.

8. The method according to claim 7, characterized in that, Principal component analysis (PCA) dimensionality reduction involves standardizing a high-dimensional data matrix composed of temperature, salinity, dissolved oxygen, turbidity, and chlorophyll concentration, calculating the covariance matrix, solving for eigenvalues and eigenvectors, sorting the eigenvalues by size, selecting the top 3 to 5 principal components, and projecting the original data onto the principal component space to obtain the dimensionality-reduced data.

9. The method according to claim 8, characterized in that, The calculation of the squared Mahalanobis distance involves calculating the difference vector between the sample point and the mean of all samples in the principal component space, and multiplying the transposed difference vector by the inverse of the covariance matrix and then by the difference vector itself, to obtain a distance metric that takes into account the correlation between variables.

10. The method according to claim 9, characterized in that, The chi-square distribution critical value is determined based on the number of principal components. The significance level is set to 0.001. When the number of principal components is 3, the corresponding chi-square distribution critical value is 16.266, and when the number of principal components is 5, the corresponding chi-square distribution critical value is 20.515.