An intelligent prediction method for fish school integration based on NN-CCRF and LSTM
An intelligent prediction method for fish school integration based on NN-CCRF and LSTM addresses the problem of accurately predicting fish school integration in dynamic water environments. It enables intelligent and efficient fish collection operations, reduces resource waste and reliance on manual labor, and raises the technological added value of the fish collection and transportation system.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Application (China)
- Current Assignee / Owner: HUBEI NORMAL UNIV
- Filing Date: 2026-03-18
- Publication Date: 2026-06-19
AI Technical Summary
Existing fish collection systems struggle to accurately predict fish aggregation levels in dynamic water environments. Fish collection operations therefore rely on experience, which leads to high empty-box rates, resource waste, and low efficiency.
An intelligent prediction method for fish school integration based on NN-CCRF and LSTM is adopted. A time-series dataset is constructed from monitoring data at multiple observation points, fish behavior and water quality features are extracted, and the data are cleaned and balanced with the Tomek Links and K-Means SMOTE algorithms. The NN-CCRF neural conditional random field model and LSTM are combined for feature enhancement in an end-to-end prediction model, which outputs the future fish school integration and provides decision support for intelligent fish collection management.
The method predicts fish school integration accurately, reduces the empty-box rate, improves fish collection efficiency and resource utilization, reduces reliance on human experience, and supports an intelligent and green fish collection and transportation system.
Figure CN122241146A_ABST
Description
Technical Field
[0001] This invention relates to the field of intelligent monitoring and prediction technology for water bodies, and in particular to an intelligent prediction method for fish school integration based on NN-CCRF and LSTM.
Background Technology
[0002] The fish collection and transportation system is a crucial fish protection facility for large-scale water conservancy projects such as the Wudongde Hydropower Station, and it plays an irreplaceable role in maintaining the ecological balance of the basin and sustaining fish resources. The existing Wudongde system operates in a "fixed fish collection stations + mobile fish collection boxes" mode, in which gantry cranes move the mobile fish collection boxes to different tailrace tunnel outlets for fish collection. In practice, this mode faces technical and management bottlenecks that severely restrict its ecological effectiveness and operational efficiency. First, the operation relies on experience and lacks intelligent sensing and decision support. The mobile fish collection boxes are currently not equipped with real-time monitoring devices for fish quantity, species, and key hydrological conditions (such as flow velocity, water temperature, and water depth), so the fish condition within the boxes cannot be grasped promptly. Fish collection is mainly carried out at a fixed frequency and duration, which results in a high "empty box operation" rate and a serious waste of manpower, material resources, and energy. Second, under the dynamic water conditions at the tailrace outlet, where the flow velocity reaches 2 m/s to 5 m/s, fish behavior is complex and hard to predict, and experience alone cannot scientifically determine the best time and place for fish to gather, resulting in a low rate of fish entering the boxes.
[0003] Against this backdrop, accurate and forward-looking prediction of fish school integration (i.e., the degree to which fish congregate in a specific body of water) is key to improving the intelligence level and operational efficiency of fish collection and transportation systems. If changes in fish school integration in the tailrace tunnel area can be predicted in advance, the placement and timing of fish collection boxes can be planned intelligently, turning passive collection into active attraction and thereby significantly reducing empty box rates and improving fish collection effectiveness. However, existing fish behavior analysis and prediction methods do not meet the needs of this complex scenario:
[0004] Traditional statistical or shallow machine learning methods cannot effectively capture the nonlinear temporal evolution of fish schools in dynamic water environments. Conventional time-series prediction models (such as LSTM) can model single-point historical sequences, but they ignore the spatial correlation of fish schools. For example, the gathering of fish schools at a tailrace opening is often affected by multiple factors such as upstream flow, eddies at adjacent openings, and water temperature gradients. Furthermore, graphical models or spatial statistical methods that use predefined spatial relationships (such as simple distance decay) struggle to adaptively characterize the complexity and dynamics of fish behavior interactions under the influence of hydrodynamics.
Summary of the Invention
[0005] To address the shortcomings of existing technologies, this invention provides an intelligent prediction method for fish school integration based on NN-CCRF and LSTM. The aim is to achieve accurate prediction of the future fish school integration in the target area of the fish collection box, providing intelligent operational decision support for the fish collection and transportation system.
[0006] According to an embodiment of the present invention, a method for intelligent prediction of fish swarm integration based on NN-CCRF and LSTM is provided, comprising the following steps:
[0007] S1: Obtain fish monitoring data from multiple observation points in the water area and construct a time-series dataset;
[0008] S2: Extract fish behavior characteristics and water quality indicators;
[0009] S3: Preprocess the data. The Tomek Links algorithm is used to clean the feature dataset and optimize the boundaries between the classes in the dataset, and the K-Means SMOTE algorithm is used to synthesize minority class samples;
[0010] S4: Based on the NN-CCRF neural conditional random field model, perform feature enhancement and context modeling to construct a temporal feature prediction sequence;
[0011] S5: Construct a fish school integration prediction model that integrates LSTM and NN-CCRF, and perform end-to-end model inference. The unary potential and pairwise potential are transformed into neural networks, and the pairwise correlation matrix is learned at the same time;
[0012] S6: Input real-time monitoring data into the trained model and output a predicted value of fish school integration for future periods; assess the future state of the fish school based on the prediction results, providing decision support for intelligent fish collection management.
[0013] Preferably, step S1 includes the following sub-steps:
[0014] S11: Deploy monitoring equipment at multiple observation points, including underwater camera devices, sonar sensors, infrared monitors, and water quality monitoring equipment;
[0015] S12: Collect images of fish schools, their movement trajectories, spatial distribution data, and water quality-related indicators;
[0016] S13: Select monitoring parameters. Based on the needs of intelligent fish collection and the goals of water quality and environmental protection management, select the categories of indicators that need to be monitored;
[0017] S14: Construct a fish school time-series dataset in which each time step pairs a feature matrix with the integration-level label at the corresponding moment.
[0018] More preferably, step S2 includes the following sub-steps:
[0019] S21: Extract the observed values of fish school behavior characteristics and water quality indicators for the selected monitoring parameters, including fish school distribution uniformity, swimming speed, turning frequency, aggregation index, total nitrogen and oxygen content in the water, suspended solids concentration, water turbidity, water pH, and temperature;
[0020] S22: Construct the feature vector sequence, in which each entry is the observed value of one fish school time-series indicator at one moment, indexed by moment and by indicator up to the total number of indicators.
[0021] More preferably, step S3 includes the following sub-steps:
[0022] S31: Perform hybrid class balancing and synthetic sampling on the dataset. For the fish school time-series dataset, first perform preprocessing steps such as missing-value handling and data standardization, then balance the dataset with a combined algorithm consisting of the Tomek Links algorithm (an undersampling technique) and the K-Means SMOTE algorithm (an oversampling technique);
[0023] S32: Use the Tomek Links algorithm to refine the dataset. Euclidean distance is used as the distance metric between samples, and the existence of Tomek Links pairs is determined by calculation. Any two samples $x_i$ and $x_j$ belonging to different classes form a Tomek Links pair if no third sample lies closer to either of them than their mutual distance, which is computed as:
[0024] $$d(x_i, x_j) = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
[0025] where $x_i$ and $x_j$ are two samples from the dataset, $x_{ik}$ and $x_{jk}$ are the state feature values of the $k$-th indicator of samples $i$ and $j$, respectively, and $m$ is the number of features;
[0026] S33: Use the K-Means SMOTE algorithm to synthesize minority class samples. K-Means clustering first partitions the minority class samples spatially and identifies regions with low sample density. Within these low-density regions, new samples with features similar to the original minority class samples are then generated by linear interpolation. The core formula of the algorithm can be expressed as:
[0027] $$x_{new} = x_i + \lambda \, (\hat{x}_i - x_i)$$
[0028] where $x_i$ is a minority class sample, $\hat{x}_i$ is its nearest-neighbor sample, and $\lambda$ is a uniformly distributed random number in $[0, 1]$ that controls where the synthetic sample falls between the two.
[0029] More preferably, step S4 includes the following sub-steps:
[0030] S41: Construct a prediction model based on the NN-CCRF neural conditional random field, and establish an end-to-end mapping model M: X→Y, where X represents the input set of historical fish school integration indices and Y represents the output set of future fish school integration predictions;
[0031] S42: Design an SDAE-based learning module for the unary feature function and the correlation matrix in order to solve the NN-CCRF model. The probability distribution of the NN-CCRF model is defined as follows:
[0032]
[0033] where the unary feature function uses the hidden state of the LSTM network to represent the mapping from the input to the output, as shown in the following formulas:
[0034]
[0035]
[0036] The outputs are preliminary estimates that do not take spatial correlation into account. The activation is the sigmoid function, applied together with a weight matrix and a bias vector. During iterative updating, the hidden state provided within the LSTM component is:
[0037]
[0038] where the two terms are the outputs of the forward LSTM output gate and the memory cell, respectively.
[0039] In addition, the pairwise potential function provides a spatially dependent smoothing term that encourages similar fish school integration levels in correlated regions. It is defined as follows:
[0040]
[0041] In the formula, the coefficient denotes the spatial correlation between two regions, which is used to constrain and smooth the initial estimates;
[0042] S43: Use the improved SDAE framework to learn the spatial correlation matrix used in the pairwise potential function;
[0043] S44: Use the backpropagation algorithm to propagate the calculated error backward along the network topology, so that the error signal is fed back to each neuron in the model. On this basis, the parameters are learned in an end-to-end (E2E) manner.
[0044] More preferably, in step S43, an improved stacked denoising autoencoder (SDAE) framework is used to learn the spatial correlation matrix. An SDAE matches a corrupted input to the true output (ground truth) by sequentially encoding and decoding the original input data. If every encoder-decoder layer uses the same weight matrix, the spatial correlation matrix can be learned by minimizing the following objective function:
[0045]
[0046] In the formula, the reconstructed term is the output of the n-layer encoder-decoder of the improved SDAE. The proposed MPCI model therefore learns its parameters by maximizing the following function:
[0047]
[0048] More preferably, step S5 includes the following sub-steps:
[0049] S51: After constructing the fish school integration prediction model that combines LSTM and NN-CCRF, perform end-to-end model inference, transforming the unary and pairwise potentials into neural networks while computing the pairwise correlation matrix.
[0050] S52: Introduce mean field theory to approximate the distribution in S43;
[0051] The goal of mean field inference is to use a simple distribution $Q$, which can be decomposed into the product of independent marginal distributions, to approximate the original complex distribution $P$, that is:
[0052] $$Q(Y) = \prod_{i=1}^{N} Q_i(y_i)$$
[0053] where $N$ is the total number of regions. To achieve this approximation, the Kullback-Leibler (KL) divergence between $Q$ and $P$ must be minimized:
[0054]
[0055] where h and the second parameter symbol denote the model parameters of the LSTM and SDAE modules, respectively.
[0056] S53: Solve the above function to obtain a compact iterative update formula for model inference:
[0057]
[0058] where the updated quantity is the estimated fish population of the i-th region. During the mean field approximation, the estimate of each region is iteratively updated with the above formula so as to minimize the mean absolute error between the estimate and the true value;
[0059] S54: To learn the parameters of the unary feature function and the corresponding correlation matrix, transform the above mean field inference process into a sequential neural network framework;
[0060] S55: Add a linear combination layer at the end of the process. By applying the neural network to the mean field inference of the CCRF, all model parameters are learned end to end.
[0061] More preferably, in step S53, the mean field approximation process includes the following steps:
[0062] Step 1, preliminary estimation stage: the algorithm first uses the unary feature function (LSTM) to make a preliminary prediction for each region, obtaining an initial estimate;
[0063] Step 2, interactive propagation stage: the algorithm then considers pairwise interactions between regions. Under the constraints of the correlation matrix, the preliminary estimates of all regions undergo multiple rounds of propagation and fusion, so the final estimate of each region combines its own preliminary estimate with the weighted influence of all other regions.
[0064] More preferably, in step S54:
[0065] An LSTM module replaces the unary (single-point) potential to learn the relationship between input and output;
[0066] For the pairwise potential, because there are N regions, the model needs an N×N pairwise correlation matrix to constrain the output values of each pair of regions. SDAE is used to learn this matrix, which is then applied to the pairwise potential;
[0067] During the iterative updates, the correlation remains unchanged, so every SDAE layer is identical and acts as the same matrix, consistent with the logic of the mean field inference algorithm in step S53;
[0068] The number of SDAE layers must equal the number of iterations in the second (interaction propagation) stage of mean field inference. Changing the number of SDAE layers therefore directly controls how deeply pairwise interactions are applied in mean field inference.
[0069] In a further preferred embodiment, step S6 includes the following sub-steps:
[0070] S61: Input real-time monitoring data into the trained model and output the predicted fish population density in region i for the future time period. The formula is as follows:
[0071]
[0072] where the correlation matrix is the one learned in step S54, and the regional estimate is the estimated fish population of the i-th region inferred in step S53;
[0073] S62: Based on the prediction results, assess the future state of the fish population in region i to provide decision support for intelligent fish collection management.
[0074] Compared with the prior art, the present invention has the following beneficial effects:
[0075] This invention breaks through the traditional extensive fish collection model that relies on fixed frequency and manual experience. By constructing an intelligent prediction model that integrates LSTM and NN-CCRF, it not only accurately captures the nonlinear temporal evolution of fish behavior, but also adaptively learns the complex spatial relationships and interaction mechanisms of fish in dynamic water environments through a data-driven approach. This makes the prediction of future fish integration in the tailrace tunnel area more accurate and reliable, thus providing a scientific quantitative basis for the placement, timing, and operating parameters of fish collection boxes, effectively reducing the probability of "empty box operation," and improving the success rate and resource utilization efficiency of fish collection operations.
[0076] This invention combines predictive models with intelligent sensing terminals such as fish monitoring and hydrological sensors to construct a closed-loop intelligent system of "perception-prediction-decision". The system can automatically generate or recommend optimal fish collection operation plans based on real-time monitoring data and future integration degree predictions, significantly reducing the reliance on on-site manual experience in fish collection operations. By driving an autonomously controllable gantry crane electronic control system, intelligent scheduling and precise deployment of fish collection boxes can be achieved, upgrading the traditional cumbersome and inefficient manual operation process into a highly efficient and safe automated operation process, greatly reducing labor costs and operational risks.
[0077] The implementation of this invention applies intelligent and green technologies to ecological protection in traditional hydropower projects. It directly enhances the technological added value and operational efficiency of the fish collection and transportation system, and it strengthens the company's competitiveness in the high-end environmental protection equipment market by forming a complete solution with independent intellectual property rights. The demonstration project can be extended to other water conservancy and hydropower projects, driving technological progress in the industry, aligning with the national sustainable development strategy, and creating social, ecological, and economic benefits for enterprises.
Attached Figure Description
[0078] Figure 1 is a flowchart of the intelligent prediction method for fish school integration based on NN-CCRF and LSTM according to the present invention.
[0079] Figure 2 is a diagram of the prediction model structure in step S41 of the intelligent prediction method for fish school integration based on NN-CCRF and LSTM according to the present invention.
[0080] Figure 3 is a diagram of the sequential neural network framework in step S54 of the intelligent prediction method for fish school integration based on NN-CCRF and LSTM according to the present invention.
Detailed Implementation
[0081] The technical solutions of the present invention will be further described below with reference to the accompanying drawings and embodiments.
[0082] This invention provides an embodiment. As shown in Figure 1, an intelligent prediction method for fish school integration based on NN-CCRF and LSTM includes the following steps:
[0083] S1: Obtain fish school monitoring data from multiple observation points within the water area and construct a time-series dataset; specifically, construct a feature dataset containing t time points;
[0084] In a further embodiment, step S1 includes the following sub-steps:
[0085] S11: Deploy monitoring equipment at multiple observation points, including underwater camera devices, sonar sensors, infrared monitors, and water quality monitoring equipment;
[0086] S12: Collect images of fish schools, their movement trajectories, spatial distribution data, and water quality-related indicators.
[0087] S13: Select monitoring parameters. Based on the needs of intelligent fish collection and the goals of water quality and environmental protection management, select the categories of indicators that need to be monitored.
[0088] S14: Construct a fish school time-series dataset in which each time step pairs a feature matrix with the integration-level label at the corresponding moment;
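The dataset construction in S14 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_dataset`, the window length `history`, and the choice to pair each window of past feature rows with the label of the following time step are all assumptions made for the example, and the toy values are invented.

```python
from typing import List, Sequence, Tuple

Window = Tuple[List[List[float]], float]


def build_dataset(features: Sequence[Sequence[float]],
                  labels: Sequence[float],
                  history: int) -> List[Window]:
    """Pair each window of `history` feature rows with the label that follows it."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    dataset = []
    for t in range(history, len(features)):
        # Rows t-history .. t-1 form the feature matrix for this sample.
        window = [list(row) for row in features[t - history:t]]
        dataset.append((window, labels[t]))
    return dataset


# Toy record: 5 time steps, 2 indicators per step (hypothetical values).
features = [[0.1, 7.2], [0.2, 7.1], [0.4, 7.0], [0.7, 7.0], [0.9, 6.9]]
labels = [0.0, 0.0, 1.0, 1.0, 2.0]
dataset = build_dataset(features, labels, history=3)
```

With five time steps and a history of three, the sketch yields two samples; a longer history gives the model more context per sample at the cost of fewer samples.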
[0089] S2: Extract fish behavior characteristics and water quality indicators, including fish distribution uniformity, swimming speed, turning frequency, aggregation index, total nitrogen and oxygen content, suspended solids concentration, water turbidity, water pH and temperature, etc., to form a feature vector sequence;
[0090] In a further embodiment, step S2 includes the following sub-steps:
[0091] S21: Extract fish behavior characteristics and water quality indicators. Select monitoring parameters through step S13. Specifically, these include fish distribution uniformity, swimming speed, turning frequency, aggregation index, total nitrogen and oxygen in the water, suspended solids concentration, water turbidity, water pH and temperature, etc.
[0092] S22: Construct the feature vector sequence, in which each entry is the observed value of one fish school time-series indicator at one moment, indexed by moment and by indicator up to the total number of indicators;
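As an illustration of how two of the behavioral indicators named in S21 might be derived from a tracked trajectory, the sketch below computes a mean swimming speed and a turning frequency. The patent does not define these indicators numerically, so the definitions here (mean step length per second, and the number of heading changes above a hypothetical 30-degree threshold per second) are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def swimming_speed(track: List[Point], dt: float) -> float:
    """Mean speed along a track whose points are sampled every `dt` seconds."""
    steps = [math.dist(a, b) for a, b in zip(track, track[1:])]
    return sum(steps) / (dt * len(steps))


def turning_frequency(track: List[Point], dt: float,
                      min_angle_deg: float = 30.0) -> float:
    """Turns per second; a turn is a heading change above `min_angle_deg`."""
    headings = [math.atan2(b[1] - a[1], b[0] - a[0])
                for a, b in zip(track, track[1:])]
    turns = 0
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the heading difference into [-pi, pi) before taking its size.
        delta = abs((h1 - h0 + math.pi) % (2 * math.pi) - math.pi)
        if math.degrees(delta) > min_angle_deg:
            turns += 1
    return turns / (dt * (len(track) - 1))


# Hypothetical track: straight for two steps, then one 90-degree turn.
track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
feature_vector = [swimming_speed(track, dt=1.0), turning_frequency(track, dt=1.0)]
```

Each such value would occupy one position of the feature vector for the corresponding moment, alongside the water quality indicators.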
[0093] S3: Data preprocessing; The Tomek Links algorithm is used to clean the feature dataset and optimize the boundary division between each class in the dataset; The K-Means SMOTE algorithm is used to synthesize minority class samples to solve the class imbalance problem in the dataset;
[0094] In a further embodiment, step S3 includes the following sub-steps:
[0095] S31: Perform mixed class balancing and synthetic sampling on the dataset;
[0096] This invention combines undersampling and oversampling in a hybrid algorithm that integrates class balancing with a synthetic sampling strategy. Specifically, for the fish school time-series dataset, preprocessing steps such as missing-value handling and data standardization are performed first to ensure data quality and consistency. A combined algorithm consisting of the Tomek Links algorithm (an undersampling technique) and the K-Means SMOTE algorithm (an oversampling technique) is then used to balance the dataset and reduce the effect of noisy samples.
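The preprocessing mentioned in S31 is not specified further in the text. One common choice, shown here purely as an assumption, is to fill missing readings with the column mean and then standardize each indicator to zero mean and unit variance. The function name and the sample turbidity values are hypothetical.

```python
import math
from typing import List, Optional


def impute_and_standardize(column: List[Optional[float]]) -> List[float]:
    """Fill None with the column mean, then scale to zero mean and unit variance."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in column]
    var = sum((v - mean) ** 2 for v in filled) / len(filled)
    std = math.sqrt(var) or 1.0  # a constant column maps to all zeros
    return [(v - mean) / std for v in filled]


# Hypothetical turbidity readings with one missing value.
turbidity = [2.0, None, 4.0, 6.0]
z = impute_and_standardize(turbidity)
```

Standardizing before the resampling steps matters because both Tomek Links and K-Means SMOTE rely on Euclidean distance, which would otherwise be dominated by indicators with large numeric ranges.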
[0097] S32: First, the Tomek Links algorithm is used to refine the dataset by removing overlapping data samples, thereby reducing noise and sharpening the boundaries between classes. Euclidean distance is used as the distance metric between samples, and the existence of Tomek Links pairs is determined by calculation. Specifically, any two samples $x_i$ and $x_j$ belonging to different classes form a Tomek Links pair if no third sample lies closer to either of them than their mutual distance, which is computed as:
[0098] $$d(x_i, x_j) = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
[0099] where $x_i$ and $x_j$ are two samples from the dataset, $x_{ik}$ and $x_{jk}$ are the state feature values of the $k$-th indicator of samples $i$ and $j$, respectively, and $m$ is the number of features;
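A Tomek Links pair is commonly characterized as two samples of different classes that are each other's nearest neighbor under the chosen metric. The sketch below applies that characterization with Euclidean distance and then drops the majority-class member of each pair. The function name, the toy samples, and the decision to remove only the majority member are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def tomek_links(X: Sequence[Sequence[float]],
                y: Sequence[int]) -> List[Tuple[int, int]]:
    """Index pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    def nearest(i: int) -> int:
        others = [j for j in range(len(X)) if j != i]
        return min(others, key=lambda j: math.dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


# Toy data: class 0 is the majority, class 1 the minority.
X = [[0.0, 0.0], [1.0, 0.0], [1.2, 0.0], [3.0, 0.0], [3.1, 0.0]]
y = [0, 0, 1, 0, 0]
links = tomek_links(X, y)

# Remove the majority-class member of every link to clean the class boundary.
majority = 0
drop = {i if y[i] == majority else j for i, j in links}
cleaned = [(x, c) for k, (x, c) in enumerate(zip(X, y)) if k not in drop]
```

In this toy set only samples 1 and 2 form a link; samples 3 and 4 are mutual nearest neighbors but share a class, so they are left alone.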
[0100] S33: Second, the K-Means SMOTE algorithm is used to synthesize minority class samples and resolve the class imbalance in the dataset. This synthetic sampling technique combines the K-Means clustering algorithm with the SMOTE algorithm and optimizes the class distribution of the dataset by generating new synthetic samples.
[0101] K-Means clustering partitions the minority class samples spatially and identifies regions with low sample density. Within these low-density regions, new samples with features similar to the original minority class samples are generated by linear interpolation, increasing the number of minority class samples. In this way, the K-Means SMOTE algorithm improves the balance of the dataset while maintaining the original feature distribution. The core formula of the algorithm can be expressed as:
[0102] $$x_{new} = x_i + \lambda \, (\hat{x}_i - x_i)$$
[0103] where $x_i$ is a minority class sample, $\hat{x}_i$ is its nearest-neighbor sample, and $\lambda$ is a uniformly distributed random number in $[0, 1]$ that controls where the synthetic sample falls between the two;
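The clustering-then-interpolation idea of S33 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the full K-Means SMOTE procedure: the "low-density region" is taken to be the cluster with the fewest members (at least two), the K-Means loop is a plain Lloyd iteration, and all function names and toy samples are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]


def kmeans(points: List[Vector], k: int, iters: int, rng: random.Random) -> List[int]:
    """Plain Lloyd iterations; returns a cluster index for every point."""
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def interpolate(x: Vector, neighbour: Vector, lam: float) -> Vector:
    """SMOTE step: x_new = x + lam * (neighbour - x)."""
    return [a + lam * (b - a) for a, b in zip(x, neighbour)]


def kmeans_smote(minority: List[Vector], k: int, n_new: int, seed: int = 0) -> List[Vector]:
    rng = random.Random(seed)
    assign = kmeans(minority, k, iters=10, rng=rng)
    clusters = [[p for p, a in zip(minority, assign) if a == c] for c in range(k)]
    usable = [c for c in clusters if len(c) >= 2]
    # Assumption: the usable cluster with the fewest members is the sparse region.
    sparse = min(usable, key=len)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(sparse)
        neighbour = min((p for p in sparse if p is not x),
                        key=lambda p: math.dist(x, p))
        synthetic.append(interpolate(x, neighbour, rng.random()))
    return synthetic


# Hypothetical minority samples: a dense group of three and a sparse group of two.
minority = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0]]
synthetic = kmeans_smote(minority, k=2, n_new=4)
```

Every synthetic sample lies on the segment between two existing minority samples of the sparse cluster, which is why the method preserves the local feature distribution rather than scattering points across the whole minority class.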
[0104] S4: Based on the NN-CCRF (Neural Conditional Random Field) model, feature enhancement and context modeling are performed to construct a temporal feature prediction sequence;
[0105] In a further embodiment, step S4 includes the following sub-steps:
[0106] S41: Construct a prediction model based on the NN-CCRF neural conditional random field, as shown in Figure 2;
[0107] Specifically, an end-to-end mapping model M: X→Y is established, where X represents the input set of historical fish school integration indices and Y represents the output set of future fish school integration predictions. Each node labeled xi in the figure represents the historical fish school integration of region i over T time steps, and each node yi gives the corresponding future fish school integration of region i.
[0108] S42: An SDAE-based learning module for the unary feature function and the correlation matrix is designed in order to solve the NN-CCRF model. The probability distribution of the NN-CCRF model is defined as follows:
[0109]
[0110] where the unary feature function uses the hidden state of the LSTM network to represent the mapping from the input to the output, as shown in the following formulas:
[0111]
[0112]
[0113] The outputs are preliminary estimates that do not take spatial correlation into account. The activation is the sigmoid function, applied together with a weight matrix and a bias vector. During iterative updating, the hidden state provided within the LSTM component is:
[0114]
[0115] where the two terms are the outputs of the forward LSTM output gate and the memory cell, respectively.
[0116] In addition, the pairwise potential function provides a spatially dependent smoothing term that encourages similar fish school integration levels in correlated regions. It is defined as follows:
[0117]
[0118] In the formula, the coefficient denotes the spatial correlation between two regions, which is used to constrain and smooth the initial estimates.
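The formulas of S42 did not survive in this text, so the following sketch does not reproduce them. It assumes the quadratic form that continuous CRFs commonly use: a unary term that penalizes deviation from the LSTM's preliminary estimate, and a pairwise term that penalizes differences between regions in proportion to their correlation. The names `ccrf_energy`, `y_hat`, and `S`, and all numeric values, are hypothetical.

```python
from typing import List


def ccrf_energy(y: List[float], y_hat: List[float], S: List[List[float]]) -> float:
    """Unary squared error plus correlation-weighted pairwise smoothness."""
    unary = sum((yi - hi) ** 2 for yi, hi in zip(y, y_hat))
    pairwise = sum(S[i][j] * (y[i] - y[j]) ** 2
                   for i in range(len(y)) for j in range(i + 1, len(y)))
    return unary + pairwise


# Three regions; regions 1 and 2 are strongly correlated (hypothetical values).
y_hat = [0.2, 0.8, 0.7]
S = [[0.0, 0.1, 0.0],
     [0.1, 0.0, 0.9],
     [0.0, 0.9, 0.0]]

rough = ccrf_energy([0.2, 0.9, 0.5], y_hat, S)     # correlated regions disagree
smooth = ccrf_energy([0.2, 0.75, 0.75], y_hat, S)  # correlated regions agree
```

Under this assumed form, lower energy corresponds to higher probability, so the configuration in which the strongly correlated regions agree is preferred even though it departs slightly from the preliminary estimates.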
[0119] S43: Use the improved SDAE framework to learn the spatial correlation matrix used in the pairwise potential function;
[0120] Specifically, this invention employs an improved stacked denoising autoencoder (SDAE) framework to learn the spatial correlation matrix. An SDAE matches a corrupted input to the true output (ground truth) by sequentially encoding and decoding the original input data. If every encoder-decoder layer uses the same weight matrix, the spatial correlation matrix can be learned by minimizing the following objective function:
[0121]
[0122] In the formula, the reconstructed term is the output of the n-layer encoder-decoder of the improved SDAE. The proposed MPCI model therefore learns its parameters by maximizing the following function:
[0123]
[0124] S44: Finally, the backpropagation algorithm propagates the calculated error backward along the network topology so that the error signal is fed back to each neuron in the model. On this basis, the parameters are learned in an end-to-end (E2E) manner rather than by manually deriving the gradient of each parameter, which avoids extensive gradient derivation and mathematical analysis.
[0125] S5: Construct a fish swarm integration prediction model that integrates LSTM and NN-CCRF, and perform end-to-end model inference. Transform unary potential and pairwise potential into neural networks, and learn the pairwise correlation matrix at the same time.
[0126] In a further embodiment, step S5 includes the following sub-steps:
[0127] S51: After constructing the fish school integration prediction model that combines LSTM and NN-CCRF, perform end-to-end model inference, transforming the unary and pairwise potentials into neural networks while computing the pairwise correlation matrix.
[0128] The NN-CCRF model proposed in this invention changes how the correlation matrix between regions is computed. It transforms the single-point potential and the pairwise potential into neural network components at the same time and, more importantly, replaces the predefined kernel function with stacked denoising autoencoders (SDAE) that learn the pairwise correlation matrix, making spatial relationship modeling more flexible and adaptable to the data.
[0129] Specifically, the end-to-end (E2E) strategy computes the gradient of the error function with respect to the weights and updates the weight parameters by small steps along the negative gradient direction to gradually reduce the prediction error. The error calculation, backpropagation, and weight update steps are repeated up to a preset maximum number of iterations. Training stops when the prediction accuracy of the model meets the predetermined performance requirement, yielding a fish school integration prediction model with stable prediction capability.
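The training loop described above (compute the error, step along the negative gradient, repeat up to an iteration cap, stop once accuracy is sufficient) can be illustrated on a deliberately tiny model. The sketch fits a single weight by gradient descent on mean squared error; the model, learning rate, iteration cap, and stopping tolerance are all assumptions chosen for the example and stand in for the much larger LSTM and SDAE parameter sets.

```python
from typing import List, Tuple


def train(xs: List[float], ys: List[float], lr: float = 0.1,
          max_iters: int = 1000, target_mse: float = 1e-6) -> Tuple[float, int]:
    """Fit y = w * x by gradient descent; returns (w, number of updates applied)."""
    w = 0.0
    for step in range(max_iters):
        errors = [w * x - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / len(xs)
        if mse <= target_mse:
            return w, step  # accuracy requirement met: stop early
        # Gradient of the MSE with respect to w.
        grad = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        w -= lr * grad      # small step along the negative gradient
    return w, max_iters     # iteration cap reached


w, updates = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In a real framework the gradient line is what automatic differentiation supplies, which is the practical meaning of learning the parameters end to end instead of deriving each gradient by hand.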
[0130] S52: Introduce mean field theory to approximate the distribution in S43;
[0131] The goal of mean field inference is to use a simple distribution $Q$, which can be decomposed into the product of independent marginal distributions, to approximate the original complex distribution $P$, that is:
[0132] $$Q(Y) = \prod_{i=1}^{N} Q_i(y_i)$$
[0133] where $N$ is the total number of regions. To achieve this approximation, the Kullback-Leibler (KL) divergence between $Q$ and $P$ must be minimized:
[0134]
[0135] where h and the second parameter symbol denote the model parameters of the LSTM and SDAE modules, respectively.
[0136] S53: Solve the above function to obtain a compact iterative update formula for model inference:
[0137]
[0138] where the updated quantity is the estimated fish population of the i-th region; for regression problems, the goal of the mean field approximation is to iteratively update the estimate of each region so as to minimize the mean absolute error between the estimate and the true value;
[0139] The approximate mean field inference algorithm is as follows. The entire mean field inference process consists of two main stages:
[0140] Step 1, preliminary estimation stage: the algorithm first uses the unary feature function (LSTM) to make a preliminary prediction for each region, obtaining an initial estimate;
[0141] Step 2, interactive propagation stage: the algorithm then considers the pairwise interactions between regions. Under the constraints of the correlation matrix, the preliminary estimates of all regions undergo multiple rounds of propagation and fusion; the final estimate of each region combines its own preliminary estimate with the weighted influence of all other regions.
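The update formula of S53 is missing from this text. As a worked illustration of what the two stages compute, assume a quadratic energy with preliminary estimates $\hat{y}_i$ and a symmetric, non-negative correlation matrix $S$ (both symbols are assumptions introduced here). For such an energy, the mean field update of each region's mean coincides with the coordinate-wise minimizer, which can be derived as:

```latex
\begin{aligned}
E(Y) &= \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 + \sum_{i<j} S_{ij}\,(y_i - y_j)^2,
        \qquad S_{ij} = S_{ji} \ge 0 \\
\frac{\partial E}{\partial y_i} &= 2\,(y_i - \hat{y}_i)
        + 2 \sum_{j \ne i} S_{ij}\,(y_i - y_j) = 0 \\
y_i^{(k+1)} &= \frac{\hat{y}_i + \sum_{j \ne i} S_{ij}\, y_j^{(k)}}
        {1 + \sum_{j \ne i} S_{ij}},
        \qquad y_i^{(0)} = \hat{y}_i
\end{aligned}
```

The initialization $y_i^{(0)} = \hat{y}_i$ corresponds to Step 1, and each application of the last line corresponds to one round of Step 2: the new estimate is a weighted average of the region's own preliminary estimate and the current estimates of the regions correlated with it.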
[0142] S54: In order to learn the parameters of the single-point feature function and the corresponding correlation matrix, the above mean field inference process is transformed into a sequential neural network framework, as shown in Figure 3; the framework contains two main neural network modules:
[0143] The LSTM module replaces and implements the single-point potential and is responsible for learning the complex mapping between the input features X and the output Y; the SDAE module learns and replaces the correlation matrix in the pairwise potential;
[0144] Specifically, the LSTM module replaces the single-point potential to learn the relationship between input and output; for the pairwise potential, since there are N regions, the model needs an N×N pairwise correlation matrix to constrain the output values of each pair of regions; this invention uses the SDAE to learn this matrix, which is then applied to the pairwise potential; notably, the correlation matrix remains unchanged throughout the iterative updates; therefore, in the implementation, every layer of the SDAE is identical and acts as the same single matrix, which is consistent with the logic of the mean field inference algorithm in S53; in addition, the number of SDAE layers must equal the number of iterations of the second stage of mean field inference, the interaction propagation stage; hence, changing the number of SDAE layers directly controls how deeply pairwise interactions are applied in mean field inference.
[0145] S55: Finally, to balance the importance of single-point estimation and pairwise interaction estimation, a linear combination layer is added at the end of the process; by applying the neural network to mean field inference of CCRF, all parameters of the model are learned in an end-to-end manner.
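A minimal sketch of the unrolled structure described in S54 and S55 is given below. It is an illustration under stated assumptions, not the patented network: the class name `UnrolledMeanField` and the parameters `alpha` and `beta` are hypothetical, the per-layer update is the same assumed weighted-average form used for illustration of mean field inference, and the weights are fixed here whereas in the method they are learned end to end.

```python
import numpy as np

class UnrolledMeanField:
    """Mean field rounds unrolled into identical layers plus a linear mix."""

    def __init__(self, S, n_layers, alpha=0.5, beta=0.5):
        S = np.asarray(S, dtype=float).copy()
        np.fill_diagonal(S, 0.0)
        self.S = S                  # one matrix, reused by every layer
        self.n_layers = n_layers    # equals the number of propagation rounds
        self.alpha = alpha          # weight of the single-point estimate
        self.beta = beta            # weight of the pairwise estimate

    def forward(self, h):
        h = np.asarray(h, dtype=float)
        denom = 1.0 + self.S.sum(axis=1)
        y = h.copy()
        for _ in range(self.n_layers):          # identical layers
            y = (h + self.S @ y) / denom
        return self.alpha * h + self.beta * y   # final linear combination
```

For three regions connected in a chain, one layer lets a signal in the first region reach only its direct neighbour, while two layers let it reach the third region, which illustrates how the layer count sets the depth of pairwise interaction.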
[0146] S6: Input real-time monitoring data into the trained model and output the predicted value of fish school integration in the future time period; evaluate the future state of the fish school based on the prediction results and provide decision support for intelligent fish collection management;
[0147] In a further embodiment, step S6 includes the following sub-steps:
[0148] S61: Input real-time monitoring data into the trained model and output the predicted value of fish school integration in region i for the future time period, computed as follows:
[0149]
[0150] where the correlation matrix is the one learned in S54, and the estimated fish population in the i-th region is the one inferred in S53;
[0151] S62: Based on the prediction results, assess the future state of the fish population in region i to provide decision support for intelligent fish collection management.
[0152] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this invention and are not intended to limit it. Although this invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of this invention without departing from their spirit and scope, and all such modifications or substitutions should be covered by the scope of the claims of this invention.
Claims
1. An intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM, characterized in that it includes the following steps: S1, obtain fish monitoring data from multiple observation points in the water area and construct a time-series dataset; S2, extract fish behavior characteristics and water quality indicators; S3, data preprocessing: use the Tomek Links algorithm to clean the feature dataset and optimize the boundaries between the categories in the dataset, and use the K-Means SMOTE algorithm to synthesize minority class samples; S4, based on the NN-CCRF neural conditional random field model, perform feature enhancement and context modeling to construct a temporal feature prediction sequence; S5, construct a fish swarm integration prediction model that integrates LSTM and NN-CCRF and perform end-to-end model inference, transforming the unary potential and the pairwise potential into neural networks while learning the pairwise correlation matrix; S6, input real-time monitoring data into the trained model and output the predicted value of fish school integration in the future period, then use the prediction results to assess the future state of the fish population and provide decision support for intelligent fish collection management.
2. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 1, characterized in that step S1 includes the following sub-steps: S11, deploy multi-point monitoring equipment, including underwater camera devices, sonar sensors, infrared monitors, and water quality monitoring equipment; S12, collect images of fish schools, their movement trajectories, spatial distribution data, and water quality-related indicators; S13, select monitoring parameters: based on the needs of intelligent fish collection and the goals of water quality and environmental protection management, select the categories of indicators to be monitored; S14, construct a fish school time-series dataset in which each entry pairs the feature matrix at a time instant with the integration degree label at the corresponding time instant.
3. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 1, characterized in that step S2 includes the following sub-steps: S21, extract the behavioral characteristics of the fish school and the observed values of the water quality indicators, and select monitoring parameters, including fish school distribution uniformity, swimming speed, turning frequency, aggregation index, total nitrogen and oxygen content in the water, suspended solids concentration, water turbidity, water pH and temperature, etc.; S22, construct the feature vector sequence, each element of which is the observed value of one fish school time-series indicator at one time instant, the length of each vector being the number of indicators.
4. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 1, characterized in that step S3 includes the following sub-steps: S31, perform mixed class balancing and synthetic sampling on the dataset: for the fish swarm time-series dataset, first perform preprocessing steps such as missing value handling and data standardization, and then balance the dataset with a combined algorithm consisting of the Tomek Links algorithm (an undersampling technique) and the K-Means SMOTE algorithm (an oversampling technique); S32, use the Tomek Links algorithm to refine the dataset, with Euclidean distance as the distance metric between samples; whether two samples $x_i$ and $x_j$ belonging to different classes form a Tomek Links pair is determined from the distance $d(x_i, x_j) = \sqrt{\sum_{k=1}^{n} (x_{ik} - x_{jk})^2}$, where $x_i$ and $x_j$ are two samples from the dataset, $x_{ik}$ and $x_{jk}$ are the state characteristic values of the k-th indicator of samples i and j respectively, and n is the number of features; S33, use the K-Means SMOTE algorithm to synthesize minority class samples: K-Means clustering first partitions the minority class samples spatially and identifies regions with low sample density; then, within these low-density regions, new samples with features similar to the original minority class samples are generated through linear interpolation, the core operation being $x_{new} = x_i + \lambda\,(x_{nn} - x_i)$, where $x_i$ is a minority class sample, $x_{nn}$ is its nearest neighbor sample, and $\lambda$ is a uniformly distributed random number that controls the generation of synthetic samples.
5. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 1, characterized in that step S4 includes the following sub-steps: S41, construct a prediction model based on the NN-CCRF neural conditional random field and establish an end-to-end mapping model M: X→Y, where X represents the input set of historical fish school integration indices and Y represents the output set of future fish school integration predictions; S42, based on SDAE, design a learning module for the univariate feature function and the correlation matrix to solve the NN-CCRF model, the probability distribution of the NN-CCRF model being defined in terms of a univariate feature function and a pairwise potential function; the univariate feature function uses the hidden state of the LSTM network to represent the mapping from input to output and yields preliminary estimates that do not take spatial correlation into account; within the LSTM component, the hidden state is obtained from the outputs of the forward LSTM output gate and the memory cell, using the sigmoid function, a weight matrix, and the bias vector that the LSTM component provides at each iterative update; in addition, the pairwise potential function provides a spatially dependent smoothing term that encourages similar fish school integration levels in correlated regions, the spatial correlation between two regions being used to constrain and smooth the initial estimates; S43, use the modified SDAE framework to learn the spatial correlation matrix used in the pairwise potential function; S44, use the backpropagation algorithm to propagate the calculated error information backward along the network topology, ensuring that the error signal is accurately fed back to each neuron in the model; on this basis, each neuron learns its parameters in an end-to-end (E2E) manner.
6. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 1, characterized in that, in step S43, a modified stacked denoising autoencoder (SDAE) framework is used to learn the spatial correlation matrix; an SDAE matches a corrupted input to the true output (ground truth) by sequentially encoding and decoding the original input data; if every encoder-decoder layer uses the same weight matrix, the spatial correlation matrix can be learned by minimizing an objective function defined on the output of the modified n-layer encoder-decoder of the SDAE; the proposed MPCI model thereby achieves parameter learning by maximizing the corresponding function.
7. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 1, characterized in that step S5 includes the following sub-steps: S51, after constructing the fish swarm integration prediction model that combines LSTM and NN-CCRF, perform end-to-end model inference, transforming the unary potential and the pairwise potential into neural networks while calculating the pairwise correlation matrix; S52, introduce mean field theory to approximate the distribution P in S43: the goal of mean field inference is to approximate the original complex distribution P with a simple distribution Q that factorizes into a product of independent marginal distributions over the N regions, where N is the total number of regions; to achieve this approximation, the Kullback-Leibler (KL) divergence between Q and P is minimized with respect to h and the SDAE weights, the model parameters of the LSTM and SDAE modules, respectively; S53, solve the above minimization to obtain a compact iterative update formula for model inference, which yields the estimated fish population in the i-th region; during the mean field approximation, the estimate for each region is iteratively updated with this formula so as to minimize the mean absolute error between the estimated values and the true values; S54, in order to learn the parameters of the single-point feature function and the corresponding correlation matrix, transform the above mean field inference process into a sequential neural network framework; S55, add a linear combination layer at the end of the process, so that all parameters of the model are learned in an end-to-end manner by applying the neural network to mean field inference of the CCRF.
8. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 7, characterized in that, in step S53, the mean field approximation process includes the following steps: Step 1, preliminary estimation stage: the algorithm first uses the univariate feature function (LSTM) to make a preliminary prediction for each region, obtaining an initial estimate; Step 2, interaction propagation stage: the algorithm then considers the pairwise interactions between regions and, under the constraints of the correlation matrix, subjects the preliminary estimates of all regions to multiple rounds of propagation and fusion, the final estimate of each region being a combination of its own preliminary estimate and the weighted influence of all other regions.
9. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 7, characterized in that, in step S54: an LSTM module replaces the single-point potential to learn the relationship between input and output; for the pairwise potential, since there are N regions, the model needs an N×N pairwise correlation matrix to constrain the output values of each pair of regions, and the SDAE is used to learn this matrix, which is then applied to the pairwise potential; because the correlation matrix remains unchanged throughout the iterative updates, every layer of the SDAE is identical and acts as the same single matrix, which is consistent with the logic of the mean field inference algorithm in step S53; the number of SDAE layers must equal the number of iterations of the second stage of mean field inference, the interaction propagation stage, so that changing the number of SDAE layers directly controls how deeply pairwise interactions are applied in mean field inference.
10. The intelligent prediction method for fish swarm integration degree based on NN-CCRF and LSTM according to claim 7, characterized in that step S6 includes the following sub-steps: S61, input real-time monitoring data into the trained model and output the predicted value of fish school integration in region i for the future time period, computed from the correlation matrix learned in step S54 and the estimated fish population in the i-th region inferred in step S53; S62, based on the prediction results, assess the future state of the fish population in region i to provide decision support for intelligent fish collection management.