IoT-based intelligent fishery culture monitoring and management method and system
By combining a behavioral cloning model with alignment training of a large language model, the invention addresses the insufficient use of implicit experience in intelligent fishery systems. It expresses expert intuition mathematically, supports real-time decision-making, and is capable of adaptation and continuous evolution.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: SUZHOU AGRICULTURAL SCIENCE & TECHNOLOGY DEVELOPMENT CO LTD
- Filing Date: 2026-03-17
- Publication Date: 2026-06-12
Abstract
Description
Technical Field
[0001] This invention relates to the field of Internet of Things (IoT) technology, and more specifically to a smart aquaculture monitoring and management method and system based on IoT.
Background Technology
[0002] Traditional smart fisheries rely on IoT monitoring and threshold control, but cannot utilize the implicit experience of aquaculture experts. Existing AI solutions only learn explicit rules or make simple predictions, lacking a deep understanding and adaptive capabilities for complex aquaculture scenarios. This solution addresses this gap by extracting expert intuition through behavioral cloning, injecting implicit knowledge through large-scale model alignment, achieving real-time decision-making through edge-cloud collaboration, and building a closed-loop feedback mechanism for continuous evolution.
[0003] Existing smart fishery systems are mainly divided into two categories: one is a monitoring and early warning system based on Internet of Things sensors, which sets thresholds for parameters such as dissolved oxygen and pH to trigger alarms or automatic control. This type of system can only handle explicit, preset rules and cannot cope with complex and ever-changing aquaculture environments and emergencies. The other type is a predictive model based on traditional machine learning, which learns from historical data to predict changes in water quality or fish disease risks. However, these models have poor generalization ability and rely on a large amount of labeled data, making it difficult to incorporate the implicit experience accumulated by aquaculture experts over many years, such as "observing water color and fish leaping."
[0004] In recent years, the application of large language models in agriculture has gradually emerged. However, existing solutions are mostly limited to injecting textual knowledge such as breeding manuals and expert Q&A into the model to form an agricultural knowledge question-and-answer system. These solutions fail to address two core issues: first, how to transform the "intuition" that experts cannot articulate into a form that the model can learn; and second, how to enable the large model not only to "speak" but also to directly control equipment to achieve closed-loop management. Furthermore, existing systems are mostly "one-time training, permanent use," lacking the ability to continuously evolve based on the unique environment of the farm and the preferences of the personnel.
Summary of the Invention
[0005] In view of the above-mentioned shortcomings of the existing technology, the present invention provides a smart aquaculture monitoring and management method and system based on the Internet of Things, which can effectively solve the problem that implicit experience cannot be digitally utilized in the background technology.
[0006] To solve the above-mentioned technical problems, the present invention adopts the following technical solution: The present invention provides a smart aquaculture monitoring and management method based on the Internet of Things, comprising: S1. Collect multimodal environmental data from the breeding site and operational behavior data of the breeding personnel at corresponding times to construct an original dataset containing the mapping relationship between environmental data and the actions of the breeding personnel.
[0007] The multimodal environmental data includes aquaculture water quality parameters, fish activity parameters, and meteorological parameters.
[0008] S2. Standardize the original dataset to generate training samples, train the behavior cloning model based on the training samples, extract the hidden layer feature vectors, and generate the corresponding behavior representation vectors of the farmers in real time.
[0009] S3. Pair and organize the generated expert intuition feature vectors with the environmental state vectors to construct a training sample set for alignment training, and perform alignment training on the preset large language model.
[0010] S4. Connect the preset large language model to the Internet of Things platform, generate control commands and send them to the execution terminal.
[0011] S5. Incrementally update the behavioral cloning model and the preset large language model.
[0012] Preferably, the process of constructing the original dataset containing the mapping relationship between environmental data and the actions of aquaculture workers is as follows: Multimodal environmental data of the breeding site are acquired according to the preset collection frequency, and the operational behavior data of the breeding personnel at the corresponding time are recorded simultaneously.
[0013] The multimodal environmental data and operational behavior data are preprocessed, and each operational behavior data is matched with an environmental state window within a preset time period before its occurrence, forming a state-action candidate pair that represents the mapping relationship between the environmental state and the operation of the breeding personnel.
[0014] Feature extraction is performed on the acquired visual data, and the extracted visual semantic information is used as a supplementary dimension of the environment state vector to construct a structured original dataset containing scene labels.
[0015] Preferably, the original dataset is standardized to generate training samples, and the specific process is as follows: Multi-source environmental data with different sampling frequencies are aligned on the time axis and interpolated.
[0016] Extract visual feature vectors from video data.
[0017] The aligned multi-source data is concatenated with the visual feature vector to form an environment state vector.
[0018] Convert the operation records of the breeding personnel into operation behavior tags in a preset format.
[0019] Pair the environmental state vector at each time step with the corresponding operation behavior label to form state-operation training sample pairs, and obtain a standardized training sample set.
[0020] Preferably, a behavioral cloning model is trained based on training samples and hidden layer feature vectors are extracted. The specific process is as follows: Based on a standardized training sample set, a deep neural network with time-series processing capabilities is constructed as a behavior cloning model, and it is trained with the historical environmental state sequence within a sliding time window as input and the current operation behavior as the prediction target.
[0021] After training, the model performance is evaluated on the validation set. If the prediction accuracy of discrete operations reaches a preset threshold and the prediction error of continuous operations is lower than a preset threshold, the behavioral cloning model is considered to have met the requirements of imitating the decision-making behavior of farmers.
[0022] After training, the output layer of the corresponding behavior clone model is removed, and the mapping relationship from the input layer to the last fully connected hidden layer is retained. The output of the fully connected hidden layer is used as the operational behavior representation vector representing the decision-making logic of the breeder. The space in which this vector is located is the implicit experience space.
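The sliding-window sample construction and the validation criterion in [0020] and [0021] can be sketched as follows. This is only an illustrative sketch: the function names, the choice that each window ends at the current time step, and the threshold values are assumptions, not details given in the text.

```python
def sliding_windows(states, actions, window):
    """Pair each run of `window` consecutive states with the action at its last step.

    Assumption: the window ends at the current time step.
    """
    samples = []
    for i in range(window - 1, len(states)):
        history = states[i - window + 1 : i + 1]
        samples.append((history, actions[i]))
    return samples


def meets_requirements(discrete_accuracy, continuous_error, min_accuracy, max_error):
    """Acceptance rule: accuracy reaches its threshold AND error is below its threshold."""
    return discrete_accuracy >= min_accuracy and continuous_error < max_error


# Toy data: four time steps, window of two steps.
samples = sliding_windows([1, 2, 3, 4], ["a", "b", "c", "d"], window=2)
print(samples)

# Hypothetical thresholds; the patent only calls them "preset".
print(meets_requirements(0.93, 0.04, min_accuracy=0.9, max_error=0.05))
```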
[0023] Preferably, behavioral representation vectors corresponding to the farmers are generated in real time, and the specific process is as follows: Each sample in the original dataset includes its collection time point and the dissolved oxygen concentration, water temperature, and pH value of the corresponding aquaculture water area at that time point. The environmental complexity index of each sample is calculated from these values together with the preset maximum dissolved oxygen change rate, maximum water temperature change rate, and maximum pH change rate, and the collection time intervals corresponding to dissolved oxygen concentration, water temperature, and pH value.
[0024] Obtain the expert intuition feature value corresponding to each sample from the last fully connected hidden layer of the trained behavior clone model.
[0025] Based on the environmental complexity index corresponding to each sample, the expert intuition feature values are weighted statistically to obtain the weighted mean and weighted standard deviation.
[0026] The quantization range is determined based on the weighted mean and weighted standard deviation, and the full-precision behavioral cloning model is converted into a lightweight model in INT8 integer format using an asymmetric linear quantization method and deployed to edge computing devices.
[0027] The edge computing device receives environmental state vectors from the breeding site in real time, and outputs expert intuition feature vectors in INT8 format through forward propagation calculation of the lightweight model, which serve as real-time behavioral representation vectors for the breeding personnel.
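The weighted statistics and the asymmetric linear quantization in [0025] and [0026] might look like the sketch below. Several things here are assumptions: the formula of the environmental complexity index is not given in this passage, so the weights are simply passed in; the range is taken as mean ± K standard deviations with a hypothetical K; and only a single activation value is quantized, whereas the patent converts the whole model.

```python
import math


def weighted_mean_std(values, weights):
    """Weighted mean and (population) weighted standard deviation."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)


def asymmetric_int8_params(lo, hi):
    """Scale and zero point mapping the real range [lo, hi] onto [-128, 127]."""
    scale = (hi - lo) / 255.0
    zero_point = round(-128 - lo / scale)
    return scale, max(-128, min(127, zero_point))


def quantize(x, scale, zero_point):
    """Real value -> INT8 code, clamped to the representable range."""
    return max(-128, min(127, round(x / scale) + zero_point))


def dequantize(q, scale, zero_point):
    """INT8 code -> approximate real value."""
    return (q - zero_point) * scale


activations = [0.0, 1.0, 2.0]   # toy expert-intuition feature values
complexity = [1.0, 1.0, 2.0]    # toy complexity indices used as weights
mean, std = weighted_mean_std(activations, complexity)
K = 3.0                         # hypothetical width of the quantization range
scale, zero_point = asymmetric_int8_params(mean - K * std, mean + K * std)
code = quantize(1.0, scale, zero_point)
print(code, dequantize(code, scale, zero_point))
```

Because samples with a higher complexity index carry more weight, the mean and spread, and therefore the quantization range, lean toward the activations seen in those samples.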
[0028] Preferably, the training sample set for alignment training is constructed, and the specific process is as follows: Collect parallel data streams corresponding to the aquaculture site.
[0029] The parallel data stream includes a standardized environment state vector generated every second and an INT8 format expert intuition feature vector output in real time by the edge quantization model at the corresponding moment.
[0030] Align the two data streams by timestamp to obtain paired samples.
[0031] The continuously collected paired samples are arranged in chronological order to form a training set for large model alignment training.
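Aligning the two parallel streams by timestamp, as in [0030] and [0031], can be illustrated with a short sketch. It assumes both streams carry exact integer-second timestamps and drops any state that has no matching intuition vector; the function name and the toy values are hypothetical.

```python
def pair_by_timestamp(state_stream, intuition_stream):
    """Join (timestamp, state) and (timestamp, int8_vector) streams on equal timestamps.

    Returns (timestamp, state, intuition) triples in chronological order.
    """
    intuition_by_ts = {ts: vec for ts, vec in intuition_stream}
    pairs = []
    for ts, state in sorted(state_stream, key=lambda item: item[0]):
        if ts in intuition_by_ts:
            pairs.append((ts, state, intuition_by_ts[ts]))
    return pairs


# Toy streams: states arrive out of order; second 3 has no intuition vector yet.
states = [(2, [3.1, 28.4]), (1, [3.2, 28.5]), (3, [3.0, 28.4])]
intuition = [(1, [12, -7]), (2, [15, -3])]
print(pair_by_timestamp(states, intuition))
```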
[0032] Preferably, a pre-defined large language model is trained, and the specific process is as follows: Using a pre-defined large language model as a base, a state encoder is added to the front end of the model to map the environmental state vector into input features that match the word embedding dimension of the model.
[0033] Input the environmental state vector into the modified preset large language model, and obtain the internal feature vector output by the last Transformer layer of the preset large language model.
[0034] The gap between the internal feature vector and the expert intuition feature vector of the supervised target is calculated, and training is performed with minimizing this gap as the optimization objective.
[0035] When the average difference between the internal feature vectors on the validation set and the expert intuition feature vectors of the supervised target is less than a preset threshold, the preset large language model is determined to have completed alignment training.
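The text does not say how the "gap" between the internal feature vector and the expert intuition feature vector is measured. The sketch below assumes a mean squared error purely for illustration, and shows the stopping check in [0035] on a validation set; all names and the threshold are hypothetical.

```python
def mean_squared_gap(internal, target):
    """Mean squared difference between two equal-length vectors (assumed gap measure)."""
    return sum((a - b) ** 2 for a, b in zip(internal, target)) / len(target)


def alignment_complete(validation_pairs, threshold):
    """Return (done, average_gap); done is True when the average gap is below threshold."""
    gaps = [mean_squared_gap(h, z) for h, z in validation_pairs]
    average = sum(gaps) / len(gaps)
    return average < threshold, average


# Toy validation set of (LLM internal feature, expert intuition feature) pairs.
validation = [([1.0, 2.0], [1.0, 2.0]), ([0.0, 0.0], [1.0, 1.0])]
print(alignment_complete(validation, threshold=0.6))
```

In a real training loop the same quantity would serve as the loss to minimize; other distance measures, such as cosine distance, would fit the description equally well.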
[0036] Preferably, control commands are generated based on the real-time environmental state vector and sent to the execution terminal. The specific process is as follows: The real-time generated environmental state vector is input into a pre-set large language model that has completed alignment training, thereby generating multiple candidate control schemes.
[0037] The target scheme is obtained by comprehensively evaluating each candidate control scheme.
[0038] The target solution is converted into equipment control commands and sent to the execution terminal. The execution effect is continuously monitored. If the actual effect deviates from the expected effect by more than a preset threshold, a secondary decision is triggered to regenerate the emergency solution.
[0039] At the same time, the input state, generated scheme, execution results, and human intervention records of this decision are saved as input data for the closed-loop optimization in step S5.
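The selection of a target scheme and the deviation-triggered secondary decision in [0037] and [0038] can be sketched as below. The weighted-sum scoring, the objective names, the per-objective scores, and the threshold are all assumptions made for illustration; the text only says the candidates are "comprehensively evaluated".

```python
def select_scheme(candidates, weights):
    """Pick the candidate with the highest weighted sum of per-objective scores."""
    def total(candidate):
        return sum(weights[name] * candidate["scores"][name] for name in weights)
    return max(candidates, key=total)


def needs_secondary_decision(expected, actual, threshold):
    """True when the observed effect deviates from the expected one by more than threshold."""
    return abs(actual - expected) > threshold


# Hypothetical objective weights and candidate schemes.
weights = {"safety": 0.5, "energy": 0.25, "lifespan": 0.25}
candidates = [
    {"name": "aerator_full", "scores": {"safety": 0.8, "energy": 0.4, "lifespan": 0.4}},
    {"name": "aerator_half", "scores": {"safety": 0.4, "energy": 1.0, "lifespan": 1.0}},
]
target = select_scheme(candidates, weights)
print(target["name"])

# Expected dissolved oxygen 5.0 mg/L, measured 3.5 mg/L, tolerance 1.0 mg/L.
print(needs_secondary_decision(expected=5.0, actual=3.5, threshold=1.0))
```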
[0040] Preferably, incremental updates are performed on the behavioral clone model and the predefined large model, and the specific process is as follows: Data on environmental conditions, operational instructions, and execution effects during the automated decision-making process, as well as data on environmental conditions and manual operations during manual intervention by aquaculture personnel, are collected as a feedback sample set.
[0041] When the number of feedback samples reaches a preset cache threshold, an asynchronous update of the dual models is triggered: First, the behavior clone model is incrementally trained using the feedback samples to update the expert intuition feature vector output in real time at the edge. Then, the newly collected environmental state is paired with the updated expert intuition feature vector and added to the alignment training set to fine-tune the preset large language model.
[0042] The updated model, after performance verification, replaces the old version, realizing a continuous evolutionary closed loop for the system from data collection to intelligent decision-making.
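The cache-threshold trigger and the update order in [0041] can be sketched as follows. The class and function names are hypothetical, and the two training steps are stand-ins that only record the order in which they would run.

```python
class FeedbackBuffer:
    """Collects feedback samples and fires a callback once the cache threshold is reached."""

    def __init__(self, threshold, on_full):
        self.threshold = threshold
        self.on_full = on_full
        self.samples = []

    def add(self, sample):
        """Store one sample; return True if this sample triggered an update."""
        self.samples.append(sample)
        if len(self.samples) >= self.threshold:
            batch, self.samples = self.samples, []
            self.on_full(batch)
            return True
        return False


update_log = []


def update_dual_models(batch):
    """Stand-in for the dual-model update: behavioral cloning first, then LLM alignment."""
    update_log.append(("behavior_cloning_incremental_training", len(batch)))
    update_log.append(("llm_alignment_fine_tuning", len(batch)))


buffer = FeedbackBuffer(threshold=3, on_full=update_dual_models)
for sample_id in range(4):
    buffer.add({"id": sample_id})
print(update_log, len(buffer.samples))
```

In a deployment the callback would run asynchronously and the new models would replace the old ones only after performance verification, as [0042] requires.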
[0043] In a second aspect, the present invention provides an Internet of Things-based intelligent aquaculture monitoring and management system, comprising: The aquaculture data acquisition module is used to collect multimodal environmental data from the aquaculture site and the corresponding operational behavior data of aquaculture personnel at different times, and to construct a raw dataset containing the mapping relationship between environmental data and aquaculture personnel actions.
[0044] The behavior cloning and quantification module is used to standardize the original dataset to generate training samples, train the behavior cloning model based on the training samples, extract the hidden layer feature vectors, and generate the corresponding behavior representation vectors of the farmers in real time.
[0045] The large model experience injection module is used to pair and organize the generated expert intuition feature vectors with the environmental state vectors, construct a training sample set for alignment training, and perform alignment training on the preset large language model.
[0046] The decision control module is used to connect the preset large language model to the Internet of Things platform, generate control commands, and send them to the execution terminal.
[0047] The closed-loop optimization module is used to incrementally update the behavioral cloning model and the preset large language model.
[0048] The technical solution provided by this invention has the following advantages compared with the known prior art: 1. In the process of collecting aquaculture data, this invention deploys multimodal sensing devices and motion capture sensors to simultaneously record the environmental state and every manual operation of the aquaculture personnel. Scene labels are added according to dimensions such as aquaculture stage and weather type. This helps to transform the implicit experience accumulated by experts over many years into structured "state-action" mapping data, providing a real, rich, and high-quality original dataset with scene context for subsequent behavioral cloning model training.
[0049] 2. In the process of behavioral cloning and quantification, this invention learns decision-making logic from expert operation data through a behavioral cloning model and extracts the output of the last fully connected hidden layer as a 128-dimensional expert intuition feature vector, realizing the mathematical expression of the experts' implicit knowledge. An environmental complexity index is then introduced to weight the statistics of the activation values, so that the quantization range focuses on protecting accuracy in high-risk scenarios such as a sudden drop in dissolved oxygen or heavy rain. Finally, the full-precision model is compressed into INT8 format through asymmetric linear quantization and deployed to the edge. This not only transforms expert intuition into a computable and transmittable lightweight vector, but also keeps decision-making accuracy at critical moments from being lost to quantization. This is beneficial for realizing 24/7 real-time perception of expert intuition given the limited computing power of edge devices.
[0050] 3. In the large model experience injection process of this invention, the expert intuition feature vector in INT8 format output in real time at the edge is paired with the environment state vector to construct a training set. By performing internal feature alignment training on the preset large language model, the output vector of its last Transformer layer gradually approaches the expert intuition feature vector. This is beneficial for the large model to make reasonable inferences based on the aligned intuition representation when facing complex scenes that it has never seen before, and it has the ability to generalize beyond historical data.
[0051] 4. In the intelligent decision-making and control process of this invention, the aligned large model is connected to the Internet of Things platform. Based on the real-time environmental state vector, an internal feature representation aligned with expert intuition is first generated. The language generation capability of the large model is then used to deduce multiple candidate schemes, generating several sets of control schemes containing specific operation instructions, execution parameters, and expected effects. Based on the risk level implied by the internal features, the weights of multiple objectives such as breeding safety, energy consumption cost, and equipment lifespan are dynamically adjusted for comprehensive scoring and selection. This process simulates the way human experts "think of several approaches and then weigh the pros and cons." It is also interpretable: the decision-making reasons can be output in natural language alongside the decision. This is conducive to achieving unmanned automatic management, enhancing the trust of breeding personnel in the system, and providing a clear decision-making basis for subsequent optimization.
[0052] 5. In the closed-loop optimization process, this invention continuously records the complete context of each automatic decision and each manual intervention by the aquaculture personnel. New samples are periodically fed back into the original dataset, and a dual-model asynchronous update strategy incrementally fine-tunes the behavioral cloning model and the preset large language model, respectively. This allows the system to gradually adapt to the farm's particular seasonal changes, the evolution of its water quality, and the management preferences of individual farmers. It helps break the limitation of traditional AI systems that are "trained once and used permanently," forming a continuous evolutionary closed loop from data collection and experience injection through intelligent decision-making to feedback optimization, so that the system becomes smarter with use and stays consistent with the latest aquaculture practices over the long term.
Attached Figure Description
[0053] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are merely some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without any creative effort.
[0054] Figure 1 is a schematic diagram of the implementation steps of the present invention.
[0055] Figure 2 is a schematic diagram of the system structure connection of the present invention.
Detailed Implementation
[0056] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort are within the scope of protection of the present invention.
[0057] The present invention will be further described below with reference to embodiments.
[0058] As shown in Figure 1, the IoT-based smart aquaculture monitoring and management method includes at least the following steps: S1. Aquaculture Data Collection: Collect multimodal environmental data from the aquaculture site and operational behavior data of aquaculture personnel at corresponding times to construct an original dataset containing the mapping relationship between environmental data and aquaculture personnel actions.
[0059] The multimodal environmental data includes aquaculture water quality parameters, fish activity parameters, and meteorological parameters.
[0060] In one specific embodiment, the construction of the original dataset containing the mapping relationship between environmental data and the actions of aquaculture personnel is carried out as follows: Sensing devices are deployed at the aquaculture site. The sensing devices include a water quality sensor array, a visual acquisition device, and an environmental meteorological device. Multimodal environmental data corresponding to the aquaculture site are collected according to a preset environmental data collection time interval.
[0061] Motion capture sensors are installed on operable equipment at the breeding site. Operable equipment includes, but is not limited to, breeding control cabinets, feeders, and aerators. The sensors record the time point, operation type, and operation magnitude each time the breeding personnel operate the equipment, forming the operational behavior data of the breeding personnel at the corresponding time.
[0062] The collected multimodal environmental data and the corresponding operational behavior data of the aquaculture personnel are aligned on the time axis, outliers generated by the sensors are deleted, and the operation records are filtered. For each valid operation record, the environmental state time series within the first preset time before its occurrence is matched to form a state-operation behavior candidate pair.
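Matching each valid operation record with the environmental state time series in the window before it, as described in [0062], can be sketched as follows. The function name, the data layout, and the choice to exclude the state recorded at the exact operation time are assumptions.

```python
from bisect import bisect_left


def pair_operations_with_windows(env_series, operations, window_s):
    """Build state-operation candidate pairs.

    env_series: list of (timestamp_s, state_vector), sorted by timestamp.
    operations: list of (timestamp_s, action).
    Each action is paired with the states in [op_time - window_s, op_time).
    """
    times = [t for t, _ in env_series]
    pairs = []
    for op_time, action in operations:
        start = bisect_left(times, op_time - window_s)
        end = bisect_left(times, op_time)
        window = [state for _, state in env_series[start:end]]
        if window:  # skip operations with no preceding data
            pairs.append((window, action))
    return pairs


# Toy series: one single-value state per second for t = 0..9.
env = [(t, [float(t)]) for t in range(10)]
ops = [(6, "aerator_on")]
print(pair_operations_with_windows(env, ops, window_s=3))
```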
[0063] Keyframes are extracted from the video stream acquired by the visual acquisition device, and semantic information such as fish density, activity level, and water color features are extracted using a pre-trained lightweight visual model as a supplementary dimension to the state vector.
[0064] Multimodal environmental data and operational behavior data are labeled with scene tags according to dimensions such as breeding stage and weather type to form a structured raw dataset. The raw dataset includes at least scene tags, timestamps, environmental state vectors, and operation records of breeding personnel.
[0065] It should be noted that the aquaculture water quality data includes, but is not limited to, dissolved oxygen concentration, water temperature, pH value, ammonia nitrogen content, nitrite content, turbidity, and conductivity; the fish activity data includes, but is not limited to, fish density distribution, fish swimming speed, feeding activity, frequency of abnormal behavior, and water color change characteristics obtained through visual acquisition devices; the meteorological data includes, but is not limited to, air temperature, humidity, light intensity, wind speed and direction, rainfall, and air pressure.
[0066] It should be noted that the first preset duration is determined by statistical analysis of the decision-making behavior patterns of aquaculture experts. Specifically, it is the average reaction time between an expert's observation of an abnormal phenomenon and the execution of the operation, so that it fully covers the information window required for decision-making. The operation records are filtered using an existing debounce algorithm, which sets a minimum effective operation duration threshold to filter out momentary invalid records caused by equipment shaking or accidental touches.
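A minimal sketch of the two rules in [0066]: the first preset duration as an average reaction time, and the filtering of momentary records by a minimum effective duration. The record fields, function names, and numeric values are hypothetical.

```python
from statistics import mean


def first_preset_duration(reaction_times_s):
    """Average time between observing an anomaly and acting on it."""
    return mean(reaction_times_s)


def filter_valid_operations(records, min_duration_s):
    """Drop records shorter than the minimum effective operation duration."""
    return [r for r in records if r["end_s"] - r["start_s"] >= min_duration_s]


records = [
    {"device": "aerator_1", "start_s": 100.0, "end_s": 100.2},  # accidental touch
    {"device": "aerator_1", "start_s": 130.0, "end_s": 135.0},  # deliberate operation
]
valid = filter_valid_operations(records, min_duration_s=1.0)
window_s = first_preset_duration([40, 60, 80])
print(valid, window_s)
```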
[0067] It should be noted that in actual aquaculture scenarios, the optimal data acquisition frequency varies among different sensing devices. For example, dissolved oxygen sensors, due to their rapid changes, are typically set to acquire data every 1-5 seconds; pH and temperature sensors, with relatively gradual changes, can be set to acquire data every 10-30 seconds; meteorological devices, such as wind speed and direction sensors, typically acquire data every 1-10 minutes; and vision devices extract 1-2 key images per second, depending on the analysis requirements. Therefore, the "preset environmental data acquisition time interval" does not refer to a uniform interval for all devices, but rather to setting an independent acquisition frequency for each type of sensing device that matches the dynamic characteristics of the object being monitored.
[0068] It should be noted that for the fusion of multi-source heterogeneous data, this solution adopts a time axis alignment mechanism: using the highest sampling frequency among all sensing devices as the reference time axis, the data collected at other low frequencies are interpolated (such as linear interpolation or spline interpolation) to ensure that there is a complete environmental state vector at each reference time point.
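The alignment of a low-frequency series onto the reference time axis can be illustrated with linear interpolation, one of the two options named in [0068]. The function name is hypothetical, and holding the first or last value outside the sampled range is an assumption the text does not address.

```python
from bisect import bisect_right


def interpolate_to_axis(samples, ref_times):
    """Linearly interpolate sorted (time, value) samples onto ref_times.

    Outside the sampled range the nearest value is held (assumption).
    """
    times = [t for t, _ in samples]
    values = [v for _, v in samples]
    out = []
    for t in ref_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


# pH sampled every 10 s, resampled onto a 5 s reference axis.
ph = [(0, 7.0), (10, 8.0)]
print(interpolate_to_axis(ph, [0, 5, 10]))
```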
[0069] It should be noted that adding scene labels according to dimensions such as aquaculture stage and weather type means labeling each state-action data pair with its corresponding fish aquaculture production stage (such as fry stage, growth stage, fattening stage, and overwintering stage) and the weather conditions at the time of data collection (such as sunny, cloudy / rainy, muggy / low-pressure, and cold wave). For example, if a data point is recorded during the high-temperature season of June-July each year, and the day has a southerly wind, low air pressure, and calm water, then the scene labels "fattening stage" and "muggy / low-pressure" are added to it. If another data point is recorded in mid-to-late October, with a sudden drop in temperature and a northerly wind of level 4, then the scene labels "late fattening stage - overwintering transition period" and "cold wave" are added. These labels are subsequently used in step S2 to distinguish the aquaculture risk characteristics under different meteorological conditions when calculating the environmental complexity index, and in step S3 to fine-tune the large model for specific aquaculture stages (such as intensive fattening before overwintering).
[0070] In the process of collecting aquaculture data, this invention deploys multimodal sensing devices and motion capture sensors to simultaneously record the environmental conditions and every manual operation of the aquaculture personnel. Scene labels are added according to dimensions such as aquaculture stage and weather type. This helps to transform the implicit experience accumulated by experts over many years into structured "state-action" mapping data, providing a real, rich, and high-quality original dataset with scene context for subsequent behavioral cloning model training.
[0071] S2. Behavioral Cloning and Quantitative Representation: The original dataset is standardized to generate training samples. The behavioral cloning model is trained based on the training samples, and the hidden layer feature vector is extracted. The behavioral representation vector corresponding to the farmers is generated in real time.
[0072] It should be noted that the behavioral cloning model is an imitation learning model built on deep neural networks. It is used to learn the decision-making patterns of farmers from their operational behavior data and extract implicit experience vectors representing farmers' intuitive judgments through the output of the hidden layer inside the model.
[0073] In a specific embodiment, the original dataset is standardized to generate training samples. The specific process is as follows: key frames are extracted from the video stream at a preset frame rate. For each key frame, a pre-trained lightweight convolutional neural network is used to extract visual features to obtain the corresponding visual feature vector. The visual feature vector includes, but is not limited to, fish density distribution, fish swimming activity, and water color features.
[0074] The time-aligned multi-source environmental data are stitched together to form a unified environmental state vector.
[0075] The operation records of the farmers in the original dataset are converted into operation behavior labels. For discrete operations, one-hot encoding is used to map them to fixed-dimensional binary vectors. For continuous operations, min-max normalization is used to map the original values to the ... interval, achieving a dimensionless transformation.
[0076] The time-aligned environment state vector is paired with the corresponding operation behavior label to form state-operation behavior training sample pairs. The corresponding scene label in the original dataset is retained as the meta-information of the sample pairs, and finally a standardized training sample set is obtained.
[0077] It should be noted that the preset frame rate is dynamically set according to the biological characteristics of the aquaculture object and the monitoring needs. For example, for fish that swim fast, it can be set to 1 frame / second to capture their instantaneous behavior, while for aquaculture scenarios with slow water quality changes, it can be reduced to 1 frame / 5 seconds to balance the amount of data and the consumption of computing resources, ensuring that key behavioral features can be captured while avoiding redundant data.
[0078] It should be noted that the pre-trained lightweight convolutional neural network is built on existing image classification networks (such as MobileNetV3, ShuffleNet, etc.). This network has been pre-trained on the large-scale public dataset ImageNet and has general image feature extraction capabilities. In this scheme, it is used as a fixed feature extractor and does not participate in subsequent training. It is only used to extract high-level visual features such as fish density and activity from aquaculture video frames and transform them into feature vectors that can be processed by the model.
[0079] It should be noted that the water color characteristics refer to the feature parameters extracted by analyzing visual attributes such as water color, turbidity, and transparency. These parameters are used to characterize the algal population structure, organic matter content, and water quality status in the water body. They are an important visual basis for aquaculture experts to judge the quality of water and to warn of water quality deterioration.
[0080] It should be noted that stitching together time-aligned multi-source environmental data means connecting all data at the same moment in a fixed order to form a one-dimensional vector. The specific stitching order is: water quality parameters first, meteorological parameters in the middle, and visual feature vector last. Taking the data of a grass carp farming pond at a certain time as an example: the water quality parameters include dissolved oxygen 3.2 mg/L, water temperature 28.5℃, pH 7.8, and ammonia nitrogen 0.3 mg/L, a total of 4 values; the meteorological parameters include air temperature 30.2℃, humidity 75%RH, light intensity 4500 Lux, and wind speed 2.1 m/s, a total of 4 values; the visual feature vector is 128-dimensional (for example, the first 5 dimensions are 0.32, -0.87, 1.23, 0.05, -0.44...), and the stitched environmental state vector is [3.2, 28.5, 7.8, 0.3, 30.2, 75, 4500, 2.1, 0.32, -0.87, 1.23, 0.05, -0.44, ...], with a total dimension of 136.
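The stitching order and the resulting dimension can be checked with a few lines of code. The water quality and meteorological values are the ones from the grass carp example; the visual feature vector is padded with zeros as a placeholder, because only its first five components are given.

```python
def build_state_vector(water_quality, meteorological, visual_features):
    """Concatenate in the fixed order: water quality, meteorological, visual."""
    return list(water_quality) + list(meteorological) + list(visual_features)


water_quality = [3.2, 28.5, 7.8, 0.3]        # DO mg/L, water temp, pH, ammonia nitrogen mg/L
meteorological = [30.2, 75.0, 4500.0, 2.1]   # air temp, humidity %RH, light Lux, wind m/s
# First five components from the text; the remaining 123 are placeholder zeros.
visual_features = [0.32, -0.87, 1.23, 0.05, -0.44] + [0.0] * 123

state = build_state_vector(water_quality, meteorological, visual_features)
print(len(state))  # 4 + 4 + 128 = 136
```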
[0081] It should be noted that the discrete operations are operations performed by aquaculture personnel on equipment that has a finite number of defined states with no continuous transition between states, such as turning aerators on / off, starting / stopping feeders, and switching water pump gears. The specific process of mapping these operations to a fixed-dimensional binary vector is as follows: first, count the total number N of possible states involved in this type of operation and construct an N-dimensional vector of all zeros; then, based on the actual operation state, set the dimension corresponding to that state to 1 and keep the other dimensions at 0. For example, for the operation "turn on aerator #1", if there are 3 aerators on site, the corresponding 3-dimensional vector is...
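The binary mapping can be sketched as a one-hot encoding. This is a minimal illustration, not the applicant's implementation: the function name `one_hot` is hypothetical, and the state list and its ordering are assumptions borrowed from the four-state aerator enumeration used in paragraph [0091].

```python
from typing import List

def one_hot(state_index: int, num_states: int) -> List[int]:
    """Return an N-dimensional zero vector with a single 1 at state_index."""
    if not 0 <= state_index < num_states:
        raise ValueError("state index out of range")
    vec = [0] * num_states
    vec[state_index] = 1
    return vec

# Assumption: the ordering of states is illustrative only.
AERATOR_STATES = ["all off", "on No. 1", "on No. 2", "on No. 3"]
encoded = one_hot(AERATOR_STATES.index("on No. 1"), len(AERATOR_STATES))
```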
[0082] It should be noted that the continuous operations are operations performed by aquaculture personnel on equipment whose setting varies continuously within a range, such as adjusting the feed amount with a knob, stepless speed regulation of a waterwheel aerator, and adjusting the opening of the inlet valve. Minimum-maximum normalization is used to map the original values to the interval... The specific process is as follows: the historical minimum and maximum values of the operation type are pre-calculated, and the current original operation value is normalized against them to obtain a dimensionless value, so that continuous operations with different units can be uniformly input into the model for training.
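A minimal sketch of minimum-maximum normalization follows. The target interval is not legible in the text, so the sketch assumes the common choice of 0 to 1; the function name and the knob range of 0 to 100 are likewise hypothetical.

```python
def min_max_normalize(value: float, hist_min: float, hist_max: float) -> float:
    """Map a raw operation value onto a dimensionless scale.

    Assumption: the target interval is [0, 1]; the source elides it.
    """
    if hist_max <= hist_min:
        raise ValueError("historical maximum must exceed historical minimum")
    return (value - hist_min) / (hist_max - hist_min)

# Hypothetical feed knob whose historical settings span 0 to 100.
normalized_feed = min_max_normalize(65.0, 0.0, 100.0)
```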
[0083] It should be noted that the meta-information refers to the descriptive labels attached to the training sample pairs, which are used to identify the breeding stage (such as seedling stage, growth stage), weather type (such as sunny day, rainy day), and whether abnormal events have occurred, etc. Meta-information does not directly participate in model training.
[0084] In a specific embodiment, a behavior cloning model is trained based on training samples and hidden layer feature vectors are extracted. The specific process is as follows: a deep neural network with a long short-term memory network is used as the basic architecture, and the output layer is set according to the type of operation behavior. For discrete operations, the Softmax activation function is used, and the output dimension is equal to the number of possible operation states; for continuous operations, linear activation is used, and the output dimension is 1.
[0085] The behavior cloning model is trained based on a standardized training sample set. A sliding time window of a set length is constructed from the training sample set in chronological order as input, and the corresponding current operation behavior is used as the prediction target.
[0086] The loss function is selected based on the type of operation. For discrete operations, cross-entropy loss is used, while for continuous operations, mean squared error loss is used. Training stops when the validation set loss no longer decreases after a set number of consecutive rounds.
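The loss selection and stopping rule can be sketched in plain Python. This is an illustrative sketch only: the function names and the `patience` parameter (the "set number of consecutive rounds") are assumptions, and a real training run would use a deep learning framework's built-in losses.

```python
import math
from typing import Iterable, Sequence, Tuple

def cross_entropy(probs: Sequence[float], target_index: int) -> float:
    """Loss for one discrete sample: -log of the probability assigned
    to the operation that was actually performed."""
    return -math.log(probs[target_index])

def mean_squared_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Loss for continuous operations: mean of squared differences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def early_stop_epoch(val_losses: Iterable[float], patience: int) -> Tuple[int, float]:
    """Return (epoch at which training stops, best validation loss).

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs.
    """
    best, stale, epoch = float("inf"), 0, -1
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epoch, best

stop_epoch, best_loss = early_stop_epoch([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5], patience=3)
```

In this toy run the loss stalls for three epochs after epoch 2, so training stops before the later improvement is ever seen; a larger `patience` trades training time for robustness to such plateaus.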
[0087] After training, the performance of the behavioral cloning model is evaluated on the validation set. If the accuracy of discrete operation prediction reaches the preset qualified accuracy and the mean square error of continuous operation prediction is lower than the preset error threshold, then the behavioral cloning model is considered to have met the requirements of imitating the decision-making behavior of aquaculture personnel.
[0088] After training, the output layer of the behavior cloning model is removed, while the mapping from the input layer to the last fully connected hidden layer is retained. For any input environmental state vector, this mapping is evaluated to obtain the corresponding operation behavior representation vector of the aquaculture personnel, that is, the hidden layer feature vector. The space in which this representation vector lies is the implicit experience space.
[0089] It should be noted that the deep neural network with long short-term memory is a variant of the recurrent neural network specifically designed for processing time-series data. In this scheme, it is used to learn the decision-making logic of the farmers.
[0090] It should be noted that the input layer of the behavioral cloning model receives the environmental state vector generated in the first part. The model contains two LSTM layers, each with 128 hidden units. The output of the last time step of the final LSTM layer is fed into a fully connected hidden layer that also has 128 neurons and uses ReLU as the activation function.
[0091] It should be noted that when the predicted operation type involves a finite number of mutually exclusive states (such as the on / off state of an aerator or the start / stop state of a feeder), the model output layer uses the Softmax function to convert the output value into a probability distribution. The sum of the probabilities of all possible states is 1, and the state with the highest probability is taken as the prediction result. For example, if a pond has 3 aerators, and the possible operation states are "all off," "on No. 1," "on No. 2," and "on No. 3," then the output layer dimension is set to 4, and the Softmax output is [0.05, 0.85, 0.07, 0.03], indicating that the model predicts that "aerator No. 1" has the highest probability. When the predicted operation type is a continuously changing value (such as the rotation angle of the feed amount adjustment knob or the stepless speed regulation of a waterwheel aerator), the model output layer uses a linear activation function (i.e., without any nonlinear transformation), directly outputting a real number as the predicted value. For example, if the feed amount adjustment range is 0-100%, the model output 0.65 indicates that the predicted feed amount should be 65%.
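The two output-layer behaviors can be illustrated as follows. This is a sketch under stated assumptions: the helper names `decode_discrete` and `decode_feed` are hypothetical, and the state ordering simply follows the order in which the four aerator states are listed above.

```python
import math
from typing import List, Sequence

def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw output values into a probability distribution."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

STATES = ["all off", "on No. 1", "on No. 2", "on No. 3"]

def decode_discrete(probs: Sequence[float]) -> str:
    """Pick the state with the highest probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return STATES[best]

def decode_feed(output: float, low: float = 0.0, high: float = 100.0) -> float:
    """Map a linear output on the normalized scale back to a feed percentage."""
    return low + output * (high - low)

predicted_state = decode_discrete([0.05, 0.85, 0.07, 0.03])
predicted_feed = decode_feed(0.65)
```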
[0092] It should be noted that the sliding time window length determines how the environmental state vectors at consecutive moments are sliced into fixed-length segments when training the behavioral cloning model. Each segment serves as one input sample for predicting the operation behavior at the current moment (or the next moment) after that segment. Because aquaculture experts' decisions are usually based on trends over a period of time rather than on a single instantaneous value, this historical information must be packaged into the model input. For example, if the window length is set to 60, the input sample is the state vectors at 60 consecutive moments, and the model predicts the operation behavior at the current or future moment based on this historical trend. The window length is set based on statistical analysis of the decision-making reaction time of aquaculture experts. Analysis of the operation logs of several senior aquaculture personnel in real production scenarios showed that the average reaction time from observing an abnormal phenomenon (such as dissolved oxygen starting to drop or fish surfacing) to performing an operation (such as turning on the aerator) is about 30 to 90 seconds. This solution takes the middle value and sets the window length to 60 (corresponding to 60 seconds, assuming a sampling frequency of one sample per second).
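The window slicing can be sketched as below. The function name `make_windows` is hypothetical, and the sketch assumes the "current moment" variant, in which each window is paired with the operation recorded at its last time step.

```python
from typing import List, Sequence, Tuple

def make_windows(states: Sequence[Sequence[float]], actions: Sequence[int],
                 window: int = 60) -> List[Tuple[List[Sequence[float]], int]]:
    """Pair each run of `window` consecutive state vectors with the
    operation recorded at the window's final time step."""
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    samples = []
    for end in range(window, len(states) + 1):
        samples.append((list(states[end - window:end]), actions[end - 1]))
    return samples

# Toy data: 100 seconds of 1-dimensional states, one action label per second.
toy_states = [[float(t)] for t in range(100)]
toy_actions = list(range(100))
samples = make_windows(toy_states, toy_actions, window=60)
```

With 100 time steps and a window of 60, the sketch yields 41 overlapping samples, each shifted by one second.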
[0093] It should be noted that cross-entropy loss and mean squared error loss are both standard loss functions commonly used in the field of deep learning and belong to existing technologies. In this solution, these two loss functions are directly used for model training of discrete and continuous operations, respectively, and will not be elaborated further here.
[0094] It should be noted that the preset qualified accuracy rate is determined by statistical analysis of the consistency of operations of multiple aquaculture experts in the same historical scenario. 1000 sets of historical state samples are randomly selected, and three aquaculture experts independently provide discrete operation suggestions. The average consistency rate among the experts is calculated (e.g., 92%), and this value is used as the lower limit threshold of the preset qualified accuracy rate. The preset error threshold is determined based on the actual distinguishable accuracy of aquaculture experts in continuous operations. Taking the adjustment of feeding amount as an example, the standard deviation of the operation when experts repeatedly adjust in the same scenario is statistically analyzed through field tests (e.g., ±2.5%), and twice the standard deviation (i.e., 5%) is used as the upper limit of the preset error threshold.
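The two thresholds can be computed as sketched below. The function names are hypothetical, "average consistency rate" is interpreted here as the mean pairwise agreement between experts, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is meant.

```python
import statistics
from itertools import combinations
from typing import Sequence

def mean_pairwise_agreement(labels_by_expert: Sequence[Sequence[int]]) -> float:
    """Average, over all expert pairs, of the fraction of samples on which
    the two experts suggested the same discrete operation."""
    rates = []
    for a, b in combinations(labels_by_expert, 2):
        rates.append(sum(x == y for x, y in zip(a, b)) / len(a))
    return sum(rates) / len(rates)

def continuous_error_threshold(repeated_settings: Sequence[float]) -> float:
    """Twice the standard deviation of an expert's repeated adjustments.
    Assumption: sample standard deviation."""
    return 2 * statistics.stdev(repeated_settings)

# Toy data: 3 experts labelling 4 scenarios; one expert repeating a feed setting.
accuracy_floor = mean_pairwise_agreement([[1, 0, 2, 1], [1, 0, 2, 0], [1, 1, 2, 1]])
error_ceiling = continuous_error_threshold([62.5, 65.0, 67.5])
```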
[0095] It should be noted that removing the output layer of the trained model means cutting off the last layer of the trained behavior clone model (the layer used to output specific operation instructions), retaining only the part from the input layer to the last fully connected hidden layer. This makes the model no longer output specific actions such as "which aerator to turn on", but instead output an internal feature vector that can represent the expert's decision-making logic. The fully connected hidden layer is a neural network layer located after the LSTM layer and before the output layer. Its role is to further nonlinearly abstract and compress the temporal features extracted by the LSTM to form a 128-dimensional expert behavior representation vector. This vector is a point in the implicit experience space, representing the expert's intuitive judgment in the current aquaculture scenario.
[0096] It should also be noted that the operation behavior representation vector in this scheme is calculated by inputting the environmental state vector into the truncated behavior cloning model and obtaining the output value of the last fully connected hidden layer through forward propagation. This calculation follows the standard neural network forward propagation algorithm and belongs to the existing technology, so the specific matrix operations and activation function calculations will not be elaborated here.
[0097] In a specific embodiment, behavioral representation vectors corresponding to aquaculture personnel are generated in real time. The specific process is as follows: each sample in the original dataset includes a collection time point and the dissolved oxygen concentration, water temperature, and pH value of the aquaculture water at that time point. Based on these samples, combined with the preset maximum dissolved oxygen change rate, maximum water temperature change rate, and maximum pH change rate, and with the data collection time intervals for dissolved oxygen concentration, water temperature, and pH value, respectively, the environmental complexity index corresponding to the k-th sample is calculated through the calculation formula: ... In the formula, ..., ... and ... are the dissolved oxygen concentration, water temperature, and pH value at the collection time point corresponding to the k-th sample; ..., ... and ... are the dissolved oxygen concentration, water temperature, and pH value at the collection time point corresponding to the ... sample; k is the number of each sample, k = ..., and ... is the total number of samples, whose value can be a positive integer.
[0098] The expert intuition feature value corresponding to the k-th sample is obtained from the output of the last fully connected hidden layer of the behavior cloning model. Combined with the environmental complexity index corresponding to the k-th sample and the preset sensitivity adjustment factor, the environmental-complexity-weighted average of the expert intuition feature values is calculated through the calculation formula: ... and the environmental-complexity-weighted standard deviation of the dispersion of the expert intuition feature values is calculated through the calculation formula: ...
[0099] Based on the environmental-complexity-weighted average of the expert intuition feature values and the environmental-complexity-weighted standard deviation of their dispersion, the dynamic range of quantization is determined through the ... principle. This range serves as the boundary of the interval for mapping floating-point numbers to INT8 integers.
[0100] The scaling factor and zero point are calculated using the asymmetric linear quantization method commonly used in the field of deep learning. All full-precision weights of the behavior cloning model are converted into INT8 integer format to obtain a lightweight quantized model, which is then deployed to an edge computing device. The deployed edge model receives the standardized environmental state vector of the breeding site in real time and outputs the expert intuition feature vector in INT8 format through forward propagation, which is the behavior representation vector corresponding to the breeder.
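The quantization step can be sketched as follows. The sketch rests on several assumptions that are not confirmed by the text: the weights are passed in directly (how they derive from the environmental complexity index and the sensitivity adjustment factor is not reproduced here), the dynamic range is taken as the weighted mean plus or minus `n_sigma` weighted standard deviations with `n_sigma` a hypothetical parameter, and the integer range is 0 to 255 as stated in paragraph [0104]. The scale and zero-point formulas are the standard ones for asymmetric linear quantization.

```python
import math
from typing import Sequence, Tuple

def weighted_mean_std(values: Sequence[float],
                      weights: Sequence[float]) -> Tuple[float, float]:
    """Weighted mean and weighted standard deviation of activation values."""
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, math.sqrt(var)

def quant_params(lo: float, hi: float, qmin: int = 0, qmax: int = 255) -> Tuple[float, int]:
    """Scale and zero point for asymmetric linear quantization of [lo, hi]."""
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, max(qmin, min(qmax, zero_point))

def quantize(x: float, scale: float, zero_point: int,
             qmin: int = 0, qmax: int = 255) -> int:
    """Map a float to an 8-bit integer, clamping values outside the range."""
    return max(qmin, min(qmax, int(round(x / scale)) + zero_point))

def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover an approximate float from its 8-bit code."""
    return (q - zero_point) * scale

activations = [0.32, -0.87, 1.23]
weights = [1.0, 1.0, 2.0]      # hypothetical complexity-derived weights
n_sigma = 3.0                  # hypothetical range multiplier
mean, std = weighted_mean_std(activations, weights)
scale, zero_point = quant_params(mean - n_sigma * std, mean + n_sigma * std)
codes = [quantize(a, scale, zero_point) for a in activations]
```

Weighting the statistics means that samples from more complex scenarios pull the range toward their activation values, so those values are represented with less clipping.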
[0101] It should be noted that the specific process for obtaining the expert intuition feature value corresponding to the k-th sample is as follows: the standardized environmental state vector (including water quality parameters, meteorological parameters, and visual feature vectors) corresponding to the k-th sample is input into the trained full-precision behavioral cloning model. During forward propagation, when the data reaches the last fully connected hidden layer, the activation values output by the 128 neurons in that layer are directly extracted. The vector composed of these 128 values is the expert intuition feature value corresponding to that sample. This feature value is not a specific operation instruction (such as "turn on aerator #1"), but rather an abstract and comprehensive judgment by the model of the current aquaculture scenario. For example, when the scenario of "dissolved oxygen 3.2 mg / L, water temperature 28.5℃, moderate fish activity, and continuous decrease in dissolved oxygen over the past 10 minutes" is input, some dimensions of the output 128-dimensional vector may represent "the degree of oxygen deficiency risk", some may represent "the urgency of feeding needs", and some may represent "the trend of water quality deterioration". Together, these constitute the mathematical expression of the "intuition" formed by the aquaculture personnel before performing specific operations.
[0102] It should be noted that the value of the preset sensitivity adjustment factor is determined through optimization on the validation set using a grid search method. The specific process is as follows: within the range ..., candidate values are set with a step size of 0.01; for each candidate value, the action prediction accuracy of the dequantized model on the validation set is calculated; and the candidate value that gives the highest accuracy, closest to the performance of the full-precision model, is selected as the final setting.
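The grid search can be sketched as below. The search range is not legible in the text, so `low` and `high` are hypothetical parameters, and `evaluate` stands in for "quantize with this factor, dequantize, and measure validation accuracy"; the toy accuracy function is purely illustrative.

```python
from typing import Callable, Tuple

def grid_search(evaluate: Callable[[float], float], low: float, high: float,
                step: float = 0.01) -> Tuple[float, float]:
    """Try every candidate in [low, high] at the given step and return
    (best candidate, its validation accuracy)."""
    n = int(round((high - low) / step))
    best_value, best_acc = low, float("-inf")
    for i in range(n + 1):
        candidate = round(low + i * step, 10)   # avoid float drift across steps
        acc = evaluate(candidate)
        if acc > best_acc:
            best_value, best_acc = candidate, acc
    return best_value, best_acc

# Hypothetical accuracy curve peaking at 0.37, searched over an assumed range of 0 to 1.
best_factor, best_accuracy = grid_search(lambda g: 1.0 - (g - 0.37) ** 2, 0.0, 1.0)
```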
[0103] It should be noted that all full-precision weights refer to the parameters in the original 32-bit floating-point format of all network layers (including LSTM layers, fully connected hidden layers, etc.) in the behavioral clone teacher model trained in step S2.
[0104] It should also be noted that the expert intuition feature vector in INT8 format refers to the 128-dimensional floating-point number output by the last fully connected hidden layer of the behavior cloning model (each value represents the model's judgment strength on the current aquaculture scenario in a certain abstract dimension), which is then quantized and compressed into a new vector represented by 8-bit integers (range 0-255). For example, when the environmental conditions are input at a certain moment, such as "dissolved oxygen 3.2 mg / L, water temperature 28.5℃, fish activity moderate, dissolved oxygen continuously decreasing over the past 10 minutes", the floating-point vector output by the full-precision model might be [0.32, -0.87, 1.23, ...]. After quantization and compression, it becomes [42, 17, 235, ...] in INT8 format, which uniquely and precisely corresponds to the expert intuition that "the current dissolved oxygen is low and continuously decreasing, requiring attention". The edge device only needs to upload it to the cloud, and the large cloud model can understand the current scenario and make corresponding decisions based on the pre-aligned mapping relationship.
[0105] In the process of behavior cloning and quantification, this invention learns decision-making logic from expert operation data through a behavior cloning model. The output of the last fully connected hidden layer is extracted as a 128-dimensional expert intuition feature vector, realizing a mathematical expression of the expert's implicit knowledge. Subsequently, an environmental complexity index is introduced to weight the activation values, ensuring that the quantization range prioritizes the accuracy in high-risk scenarios such as sudden drops in dissolved oxygen and heavy rain. Finally, asymmetric linear quantization is used to compress the full-precision model into INT8 format and deploy it to the edge. This not only transforms expert intuition into a computable and transmittable lightweight vector, but also ensures that the decision-making accuracy at critical moments is not lost due to quantization. This is beneficial for achieving 24 / 7 real-time perception of expert intuition under the premise of limited computing power of edge devices.
[0106] S3. Large Model Experience Injection: The generated expert intuition feature vectors are paired and organized with the environmental state vectors to construct a training sample set for alignment training. The preset large language model is then aligned and trained so that the internal feature vectors output by the large language model when processing the environmental state vectors approximate the expert intuition feature vectors.
[0107] It should be noted that the preset large language model refers to a general large language model based on the Transformer architecture. In this solution, the input layer is modified to receive environmental state vectors, and after alignment training in step S3, it has the ability to understand and make decisions in aquaculture scenarios.
[0108] In a specific embodiment, a training sample set for alignment training is constructed, and the specific process is as follows: two parallel data streams are collected from the breeding site. The two parallel data streams include an environmental state stream and an expert intuition stream. The environmental state stream is a standardized environmental state vector per second, and the expert intuition stream is an INT8 format expert intuition feature vector output in real time by the edge quantization model at the corresponding time. The two parallel data streams are aligned according to the timestamp to obtain paired samples.
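Timestamp alignment of the two streams can be sketched as a simple join. The function name `align_streams` and the tuple layout are assumptions; timestamps are taken to be whole seconds, matching the per-second state stream described above.

```python
from typing import Dict, List, Sequence, Tuple

def align_streams(env_stream: Sequence[Tuple[int, List[float]]],
                  intuition_stream: Sequence[Tuple[int, List[int]]]
                  ) -> List[Tuple[List[float], List[int]]]:
    """Pair each environmental state vector with the INT8 expert intuition
    vector that carries the same timestamp; unmatched records are dropped."""
    intuition_by_ts: Dict[int, List[int]] = dict(intuition_stream)
    return [(state, intuition_by_ts[ts])
            for ts, state in env_stream if ts in intuition_by_ts]

# Toy streams with shortened vectors.
env = [(0, [3.2]), (1, [3.1]), (2, [3.0])]
intuition = [(1, [42, 17]), (2, [40, 18]), (3, [39, 20])]
pairs = align_streams(env, intuition)
```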
[0109] The continuously collected paired samples are organized into a training set according to time windows, with the standardized environment state vector as input and the expert intuition feature vector as the supervision target.
[0110] In a specific embodiment, a pre-defined large language model is trained, and the specific process is as follows: the pre-defined large language model is used as a base, the environmental state flow is input into the pre-defined large language model, and the internal feature vector output by its last Transformer layer is obtained.
[0111] The difference between the internal feature vectors of the preset large language model and the expert intuition feature vectors is calculated using the mean squared error loss function, and training is carried out with minimizing this difference as the optimization objective.
[0112] During training, only the newly added state encoder and the final set number of Transformer layers are trained, so that the internal feature vectors of the preset large language model are optimized to approximate the expert intuition feature vectors. This process continues until the difference between the internal feature vectors of the preset large language model and the expert intuition feature vectors is less than a set threshold. At this point, the preset large language model is considered to have completed alignment training.
[0113] It should be noted that when using the preset large language model as the base, a state encoder is added to the front end of the preset large language model to map the 136-dimensional environment state vector into input features that match the word embedding dimension of the model.
[0114] It should be noted that the mean squared error loss function is a standard loss function commonly used in the field of deep learning. Its calculation process involves calculating the square of the difference between the model's internal feature vector and the expert intuition feature vector element by element, summing them, and then taking the average. This is an existing technology and will not be elaborated on further here.
[0115] It should be noted that the final set of Transformer layers refers to the last few Transformer layers near the output end in the preset large language model. The specific number of layers is determined according to the model size and validation set performance. In this embodiment, the last 3 layers are fine-tuned to maintain the original capabilities of the model while injecting the corresponding experience of the farmers. The internal features refer to the feature vector output by the last Transformer layer of the preset large language model. This vector is a high-dimensional abstract representation of the current input environment state. The phrase "approaching the expert intuition feature vector" means optimizing the model parameters to gradually reduce the distance (such as Euclidean distance) between the output feature vector and the expert intuition feature vector in the vector space, eventually achieving a high degree of similarity, so that the model's internal representation has the understanding ability consistent with expert intuition.
[0116] It should be noted that the threshold setting is determined based on the consistency level of the behavioral cloning model's own output on the validation set. The specific process is as follows: Select a batch of validation samples that did not participate in training, input them into the behavioral cloning model in step S2, calculate the average cosine similarity between the expert intuition feature vector output by the model and the expert intuition feature vector output by the edge quantization model at the corresponding time, and use the 95th percentile of this similarity as the lower limit of the threshold setting. In this embodiment, the similarity is statistically stable between 0.96 and 0.98, so the threshold setting is 0.95. That is, when the average cosine similarity between the feature vector inside the preset large language model and the expert intuition feature vector reaches 0.95 or higher, the alignment training is determined to be complete.
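The completion check based on average cosine similarity can be sketched as follows. The function names are hypothetical, and the toy vectors are three-dimensional stand-ins for the real feature vectors.

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def alignment_complete(model_vectors: Sequence[Sequence[float]],
                       expert_vectors: Sequence[Sequence[float]],
                       threshold: float = 0.95) -> bool:
    """True when the average cosine similarity reaches the threshold."""
    sims = [cosine_similarity(m, e) for m, e in zip(model_vectors, expert_vectors)]
    return sum(sims) / len(sims) >= threshold

done = alignment_complete([[1.0, 2.0, 3.0]], [[1.0, 2.0, 3.1]])
not_done = alignment_complete([[1.0, 0.0]], [[0.0, 1.0]])
```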
[0117] In the large model experience injection process, the INT8 format expert intuition feature vectors output in real time at the edge are paired with the environmental state vectors to construct a training set. By performing internal feature alignment training on the preset large language model, the output vector of its last Transformer layer gradually approaches the expert intuition feature vector. This is beneficial for the large model to make reasonable inferences based on the aligned intuition representation when facing complex scenes that it has never seen before, and it has the ability to generalize beyond historical data.
[0118] S4. Decision Control: The pre-set large language model is connected to the Internet of Things platform. Based on the real-time environmental state vector, control commands are generated and sent to the execution terminal to realize the automatic monitoring and management of the breeding process.
[0119] In a specific embodiment, control commands are generated based on real-time environmental state vectors and sent to the execution terminal. The specific process is as follows: According to the preset environmental data collection time interval, the edge computing device collects multimodal environmental data from the breeding site, generates environmental state vectors after standardization processing, and uploads them to the preset large language model service interface.
[0120] The pre-defined large language model first generates internal feature vectors that are highly aligned with expert intuition, forming an implicit understanding of the current aquaculture scenario (such as the level of oxygen deficiency risk and the urgency of feeding needs).
[0121] Based on this understanding, the pre-defined large language model uses its language generation capabilities to conduct multi-candidate scheme deduction, generating several sets of candidate control schemes containing specific operation instructions, execution parameters and expected effects, and comprehensively scoring each candidate control scheme to obtain the target scheme. After selecting the target solution, the target solution is converted into standardized equipment control instructions, which are then sent to the field execution terminals through the Internet of Things platform. These instructions include sending start / stop instructions and running time to the aerator controller, sending feeding amount adjustment parameters to the feeder controller, and sending on / off instructions and opening angles to the inlet and outlet valves.
[0122] During the execution of the command, the dissolved oxygen change curve and fish behavior response data are continuously monitored. If the deviation between the execution effect and the expected effect is greater than the set deviation threshold, a secondary decision is triggered to regenerate the emergency plan. At the same time, the input status, generated plan, execution results and any records of human intervention in this decision are completely saved as input data for the closed-loop optimization of the S5 step.
[0123] It should be noted that after the environmental state vector is input into the preset large language model that has completed the alignment training in step S3, the model first generates an internal feature vector that is highly aligned with expert intuition, thereby achieving an accurate understanding of the implicit state of the current scene (such as the level of hypoxia risk and the urgency of feeding needs).
[0124] It should be noted that the deviation between the executed effect and the expected effect refers to the degree of difference between the actual monitored trend of the aquaculture environment after the system executes a certain control command and the expected change predicted by the model when the plan was generated. For example, when the system selects to turn on the aerator for 30 minutes based on the scenario of "low dissolved oxygen", and expects the dissolved oxygen to rise from 3.2 mg / L to 5.0 mg / L within 20 minutes, if it is found that the dissolved oxygen only rises to 3.5 mg / L after 10 minutes (the expected value should reach 4.2 mg / L), or if there is an abnormal situation where it drops to 3.0 mg / L instead of rising, then the absolute error between the actual value and the expected value at each monitoring time is calculated, and the total deviation difference is obtained by accumulating them. When this difference exceeds the preset threshold, it means that the current plan has failed or the environment has undergone an unexpected drastic change, and a secondary decision-making process needs to be initiated immediately to regenerate an emergency plan.
[0125] It should be noted that the set deviation threshold is determined based on the statistical distribution of errors under historical normal execution conditions. Specifically, the process involves collecting the average absolute error between the actual dissolved oxygen value and the model's predicted value at each execution moment during the past 100 automatic control command executions, calculating the mean and standard deviation of these errors, and using "mean + 3 times the standard deviation" as the upper limit of the threshold. For example, if historical data shows that the average dissolved oxygen prediction error is 0.2 mg / L and the standard deviation is 0.1 mg / L during normal execution, then the threshold is set to 0.5 mg / L—that is, when the actual dissolved oxygen value deviates from the expected value by more than 0.5 mg / L at a certain moment, it is judged as abnormal and a secondary decision is triggered.
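The threshold rule and the trigger check can be sketched as below. The function names are hypothetical, the population standard deviation is an assumption (the text does not say which estimator is used), and the trigger follows the per-moment reading given in this paragraph rather than an accumulated total.

```python
import statistics
from typing import Sequence

def deviation_threshold(historical_errors: Sequence[float]) -> float:
    """Mean plus three standard deviations of historical absolute errors.
    Assumption: population standard deviation."""
    return (statistics.mean(historical_errors)
            + 3 * statistics.pstdev(historical_errors))

def needs_secondary_decision(actual: Sequence[float], expected: Sequence[float],
                             threshold: float) -> bool:
    """True if the actual value deviates from the expected value by more
    than the threshold at any monitoring moment."""
    return any(abs(a - e) > threshold for a, e in zip(actual, expected))

# Toy history with mean 0.2 mg/L and standard deviation 0.1 mg/L.
threshold = deviation_threshold([0.1, 0.3])
# Dissolved oxygen reached 3.5 mg/L after 10 minutes instead of the expected 4.2 mg/L.
triggered = needs_secondary_decision([3.5], [4.2], threshold)
```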
[0126] It should be noted that the process of comprehensively scoring each candidate control scheme is as follows: The system dynamically adjusts the weights of the three dimensions of aquaculture safety, energy consumption cost, and equipment lifespan based on the risk level of the current scenario implied by the internal feature vector generated within the preset large language model and aligned with the expert intuition feature vector. For example, when the internal feature vector represents "extremely high risk of hypoxia", the weight of aquaculture safety is set to 0.7, energy consumption cost to 0.2, and equipment lifespan to 0.1; when the scenario is stable, the weights are set to 0.4, 0.3, and 0.3, respectively. Each candidate solution is then scored across various dimensions: Aquaculture safety is scored based on the projected dissolved oxygen recovery rate (e.g., Solution A, projected to recover to 5.0 mg / L in 10 minutes, receives 95 points; Solution B, projected to recover in 20 minutes, receives 80 points); energy cost is scored based on estimated electricity consumption (Solution A, projected to consume 0.5 kWh, receives 90 points; Solution B, projected to consume 0.8 kWh, receives 70 points); and equipment lifespan is scored based on start-up and shutdown frequency (Solution A, projected to start and stop frequently, receives 60 points; Solution B, projected to operate smoothly, receives 85 points). The scores for each dimension are multiplied by their corresponding weights and summed to obtain the overall solution score. The solution with the highest total score is the target solution.
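The weighted scoring in this example can be reproduced with a few lines of Python. The dictionary keys and function names are hypothetical labels; the weights and per-dimension scores are the ones given in the example above.

```python
from typing import Dict

WEIGHTS = {
    "high_risk": {"safety": 0.7, "energy": 0.2, "lifespan": 0.1},
    "stable":    {"safety": 0.4, "energy": 0.3, "lifespan": 0.3},
}
CANDIDATES = {
    "A": {"safety": 95, "energy": 90, "lifespan": 60},
    "B": {"safety": 80, "energy": 70, "lifespan": 85},
}

def total_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the per-dimension scores."""
    return sum(weights[dim] * scores[dim] for dim in weights)

def pick_target(candidates: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> str:
    """Return the name of the candidate with the highest overall score."""
    return max(candidates, key=lambda name: total_score(candidates[name], weights))

target_high_risk = pick_target(CANDIDATES, WEIGHTS["high_risk"])
target_stable = pick_target(CANDIDATES, WEIGHTS["stable"])
```

With the example figures, Solution A scores 90.5 against 78.5 for Solution B under the high-risk weights, and 83.0 against 78.5 under the stable weights, so A is selected in both cases.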
[0127] In the intelligent decision-making and control process of this invention, the aligned large model is connected to the Internet of Things platform. Based on the real-time environmental state vector, an internal feature representation aligned with expert intuition is first generated. Then, the language generation capability of the large model is used to conduct multi-candidate scheme deduction, generating several sets of control schemes containing specific operation instructions, execution parameters, and expected effects. Based on the risk level implied by the internal features, the weights of multiple objectives such as breeding safety, energy consumption cost, and equipment lifespan are dynamically adjusted for comprehensive scoring and selection. This process simulates the decision-making method of human experts who "think of several approaches and then weigh the pros and cons." At the same time, it has interpretability—it can output the decision-making reasons in natural language form simultaneously. This is conducive to achieving unmanned automatic management, enhancing the trust of farmers in the system, and providing a clear decision-making basis for subsequent optimization.
[0128] S5. Closed-loop optimization: Record the execution effect of automatic decision-making and data on human intervention, and periodically feed back to the original dataset to incrementally update the behavior clone model and the pre-defined large model, forming a continuously evolving closed-loop optimization mechanism.
[0129] In a specific embodiment, the behavioral cloning model and the preset large language model are incrementally updated as follows. The complete process of each automatic decision is continuously recorded, including the environmental state vector when the decision is triggered, the multiple candidate schemes generated by the preset large language model and the scheme finally selected, the actual execution effect after the instruction is issued, and any manual intervention by the aquaculture personnel in the automatic control (such as modifying instruction parameters or rejecting system suggestions).
[0130] The data for each complete process is organized by timestamp into "state-operation behavior-effect" samples and stored in a temporary cache area together with the "state-manual operation" samples recorded during manual intervention. When the amount of cached data reaches the preset cache threshold, the new samples are merged into the original dataset of step S1.
[0131] A new standardized training sample set is generated from the new samples and used to incrementally fine-tune the behavioral cloning model, after which the updated expert intuition feature vectors are re-extracted.
[0132] The newly collected "environmental state - expert intuition" paired samples are added to the training set in step S3 to perform lightweight fine-tuning of the preset large language model.
[0133] After the updated behavioral cloning model and preset large language model have been tested on the validation set and confirmed to be stable, they replace the old versions of the models, thereby achieving a continuously evolving closed loop.
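The caching and update procedure described above can be sketched in Python as follows. This is only an illustrative outline under stated assumptions: `FeedbackCache` and `closed_loop_update` are hypothetical names, and the three callables `update_bc`, `update_llm`, and `validate` are placeholders for the incremental training, lightweight fine-tuning, and validation-set test, whose implementations are not specified here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class FeedbackCache:
    """Temporary cache that releases its samples once the threshold is reached."""
    threshold: int = 1000  # preset cache threshold
    samples: List[Any] = field(default_factory=list)

    def add(self, sample: Any) -> List[Any]:
        """Buffer one sample; return the full batch when the threshold is
        reached (and empty the cache), otherwise return an empty list."""
        self.samples.append(sample)
        if len(self.samples) < self.threshold:
            return []
        batch, self.samples = self.samples, []
        return batch


def closed_loop_update(
    batch: List[Any],
    dataset: List[Any],
    bc_model: Any,
    llm: Any,
    update_bc: Callable[[Any, List[Any]], Any],
    update_llm: Callable[[Any, Any, List[Any]], Any],
    validate: Callable[[Any, Any], bool],
) -> Tuple[Any, Any]:
    """Merge the batch, update both models in order, and swap only if stable."""
    dataset.extend(batch)                     # merge into the original dataset
    new_bc = update_bc(bc_model, batch)       # incremental training first
    new_llm = update_llm(llm, new_bc, batch)  # then fine-tune on new pairs
    if validate(new_bc, new_llm):             # validation-set check
        return new_bc, new_llm                # replace the old versions
    return bc_model, llm                      # otherwise keep the old versions
```

In this sketch the behavioral cloning model is updated before the large language model because the latter is fine-tuned on expert intuition feature vectors re-extracted from the former. Keeping the old models when validation fails is one reading of "confirmed to be stable"; the source does not say what happens to a rejected update.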
[0134] It should be noted that a "state-operation behavior-effect" sample is the complete record of one automatic decision: the environmental state vector at the moment the decision was triggered, the specific operation instructions the system finally issued, and the actual effect data monitored after the instructions were executed, such as the dissolved oxygen change curve or the fish behavior response. An example is the complete data for "aerator No. 1 was turned on for 30 minutes when dissolved oxygen was 3.2 mg/L, and dissolved oxygen rose to 5.0 mg/L after execution". A "state-manual operation" sample is the paired record of the environmental state vector and the operation actually performed by the aquaculture personnel when they manually take over the system or modify an automatic instruction. An example is the operation log "when dissolved oxygen was 3.0 mg/L, the aquaculture personnel manually turned off the automatic program and turned on aerator No. 2".
[0135] It should be noted that a newly collected "environmental state - expert intuition" paired sample consists of the expert intuition feature vector and the environmental state vector at the corresponding time.
[0136] It should be noted that the preset cache threshold balances the marginal benefit of model performance improvement against the consumption of computing resources. Specifically, it is the minimum number of new samples that improves model accuracy on the validation set by more than 0.5%. In this embodiment, it is set to 1000 new samples.
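One way to read this rule is as a search over candidate batch sizes for the smallest one whose measured accuracy gain exceeds the 0.5% margin. The sketch below assumes the gain is expressed as a fraction (0.005), which the text does not make explicit. The function name `choose_cache_threshold` and the `accuracy_gain` callback are hypothetical, and the gain figures in the usage example are invented purely for illustration; they are not measurements from the source.

```python
from typing import Callable, Iterable, Optional


def choose_cache_threshold(
    candidate_sizes: Iterable[int],
    accuracy_gain: Callable[[int], float],
    min_gain: float = 0.005,  # 0.5%, interpreted as a fraction (assumption)
) -> Optional[int]:
    """Return the smallest candidate size whose validation-accuracy gain
    exceeds min_gain, or None if no candidate qualifies."""
    for size in sorted(candidate_sizes):
        if accuracy_gain(size) > min_gain:
            return size
    return None


# Illustrative, made-up gains per batch size (not data from the source).
EXAMPLE_GAINS = {250: 0.001, 500: 0.003, 1000: 0.006, 2000: 0.009}
EXAMPLE_THRESHOLD = choose_cache_threshold(EXAMPLE_GAINS, EXAMPLE_GAINS.get)
```

With these made-up gains, `EXAMPLE_THRESHOLD` is 1000, the first size whose gain is above 0.005. Smaller batches would trigger retraining more often for less benefit, which is the resource trade-off the paragraph describes.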
[0137] As shown in Figure 2, the IoT-based smart aquaculture monitoring and management system includes the following modules. The aquaculture data acquisition module is used to collect multimodal environmental data from the aquaculture site and the operational behavior data of aquaculture personnel at the corresponding times, and to construct an original dataset containing the mapping relationship between environmental data and the actions of aquaculture personnel.
[0138] The behavior cloning and quantitative representation module is used to standardize the original dataset to generate training samples, train the behavioral cloning model based on the training samples, extract the hidden layer feature vectors, and generate the corresponding behavior representation vectors of the aquaculture personnel in real time.
[0139] The large model experience injection module is used to pair and organize the generated expert intuition feature vectors with the environmental state vectors, construct a training sample set for alignment training, and perform alignment training on the preset large language model.
[0140] The decision control module is used to connect the preset large language model to the Internet of Things platform, generate control commands, and send them to the execution terminal.
[0141] The closed-loop optimization module is used to incrementally update the behavioral cloning model and the preset large language model.
[0142] In the closed-loop optimization process, this invention continuously records the complete context of each automatic decision and the manual intervention operations of the aquaculture personnel. New samples are periodically fed back to the original dataset. A dual-model asynchronous update strategy is used to incrementally fine-tune the behavioral cloning model and the preset large language model, respectively. This allows the system to gradually adapt to the unique seasonal changes of the aquaculture farm, the evolution trend of water quality, and the management preferences of individual aquaculture farmers. This helps to break the limitations of traditional AI systems that are "trained once and used permanently," forming a continuous evolutionary closed loop from data collection, experience injection, intelligent decision-making to feedback optimization. This ensures that the system becomes smarter with use and maintains a high degree of consistency with the latest aquaculture practices in the long term.
[0143] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions will not cause the essence of the corresponding technical solutions to deviate from the protection scope of the technical solutions of the embodiments of the present invention.
Claims
1. A smart aquaculture monitoring and management method based on the Internet of Things, characterized in that it comprises: S1. collecting multimodal environmental data of the aquaculture site and the operational behavior data of aquaculture personnel at the corresponding times, to construct an original dataset containing the mapping relationship between environmental data and the actions of the aquaculture personnel, wherein the multimodal environmental data includes aquaculture water quality parameters, fish activity parameters, and meteorological parameters; S2. standardizing the original dataset to generate training samples, training the behavioral cloning model based on the training samples, extracting the hidden layer feature vectors, and generating the corresponding behavior representation vectors of the aquaculture personnel in real time; S3. pairing and organizing the generated expert intuition feature vectors with the environmental state vectors to construct a training sample set for alignment training, and performing alignment training on the preset large language model; S4. connecting the preset large language model to the Internet of Things platform, generating control commands, and sending them to the execution terminal; S5. incrementally updating the behavioral cloning model and the preset large language model.
2. The IoT-based smart aquaculture monitoring and management method according to claim 1, characterized in that the specific process for constructing the original dataset containing the mapping relationship between environmental data and the actions of the aquaculture personnel is as follows: multimodal environmental data of the aquaculture site are acquired at a preset acquisition frequency, and the operational behavior data of the aquaculture personnel at the corresponding times are recorded simultaneously; the multimodal environmental data and operational behavior data are preprocessed, and each item of operational behavior data is matched with an environmental state window covering a preset time before its occurrence, forming a state-action candidate pair that represents the mapping relationship between the environmental state and the operation of the aquaculture personnel; feature extraction is performed on the acquired visual data, and the extracted visual semantic information is used as a supplementary dimension of the environmental state vector to construct a structured original dataset containing scene labels.
3. The IoT-based smart aquaculture monitoring and management method according to claim 2, characterized in that the specific process for standardizing the original dataset to generate training samples is as follows: multi-source environmental data with different sampling frequencies are aligned and interpolated on the time axis; visual feature vectors are extracted from video data; the aligned multi-source data are concatenated with the visual feature vectors to form an environmental state vector; the operation records of the aquaculture personnel are converted into operation behavior labels in a preset format; the environmental state vector at each time step is paired with the corresponding operation behavior label to form state-operation training sample pairs, yielding a standardized training sample set.
4. The IoT-based smart aquaculture monitoring and management method according to claim 3, characterized in that the specific process for training the behavioral cloning model based on the training samples and extracting the hidden layer feature vectors is as follows: based on the standardized training sample set, a deep neural network with time-series processing capabilities is constructed as the behavioral cloning model and is trained with the historical environmental state sequence within a sliding time window as input and the current operation behavior as the prediction target; after training, the model performance is evaluated on the validation set, and if the prediction accuracy for discrete operations reaches a preset threshold and the prediction error for continuous operations is below a preset threshold, the behavioral cloning model is considered to meet the requirement of imitating the decision-making behavior of the aquaculture personnel; after training, the output layer of the behavioral cloning model is removed and the mapping from the input layer to the last fully connected hidden layer is retained; the output of that fully connected hidden layer is used as the operation behavior representation vector representing the decision-making logic of the aquaculture personnel, and the space in which this vector lies is the implicit experience space.
5. The IoT-based smart aquaculture monitoring and management method according to claim 4, characterized in that the specific process for generating the corresponding behavior representation vectors of the aquaculture personnel in real time is as follows: for each sample in the original dataset, where each sample includes the collection time points and the dissolved oxygen concentration, water temperature, and pH value of the aquaculture water at each collection time point, the environmental complexity index of the sample is calculated in combination with the preset maximum dissolved oxygen change rate, maximum water temperature change rate, and maximum pH change rate, and the collection time intervals corresponding to the dissolved oxygen concentration, water temperature, and pH value; the expert intuition feature value corresponding to each sample is obtained from the last fully connected hidden layer of the trained behavioral cloning model; the expert intuition feature values are statistically weighted by the environmental complexity index of each sample to obtain a weighted mean and a weighted standard deviation; the quantization range is determined from the weighted mean and weighted standard deviation, and the full-precision behavioral cloning model is converted into a lightweight model in INT8 integer format using an asymmetric linear quantization method and deployed to edge computing devices; the edge computing device receives environmental state vectors from the aquaculture site in real time and, through forward propagation of the lightweight model, outputs expert intuition feature vectors in INT8 format, which serve as the real-time behavior representation vectors of the aquaculture personnel.
6. The IoT-based smart aquaculture monitoring and management method according to claim 5, characterized in that the specific process for constructing the training sample set for alignment training is as follows: parallel data streams corresponding to the aquaculture site are collected, the parallel data streams including a standardized environmental state vector generated every second and the INT8-format expert intuition feature vector output in real time by the edge quantization model at the corresponding moment; the two data streams are aligned by timestamp to obtain paired samples; the continuously collected paired samples are arranged in chronological order to form the training set for large language model alignment training.
7. The IoT-based smart aquaculture monitoring and management method according to claim 6, characterized in that the preset large language model is trained as follows: using the preset large language model as a base, a state encoder is added at the front end of the model to map the environmental state vector into input features that match the word embedding dimension of the model; the environmental state vector is input into the modified preset large language model, and the internal feature vector output by the last Transformer layer of the preset large language model is obtained; the difference between the internal feature vector and the expert intuition feature vector serving as the supervision target is calculated, and training is carried out with minimizing this difference as the optimization objective; when the average difference on the validation set between the internal feature vectors and the expert intuition feature vectors serving as the supervision target is less than a preset threshold, the preset large language model is determined to have completed alignment training.
8. The IoT-based smart aquaculture monitoring and management method according to claim 7, characterized in that the specific process for generating control commands based on the real-time environmental state vector and sending them to the execution terminal is as follows: the environmental state vector generated in real time is input into the preset large language model that has completed alignment training, thereby generating multiple candidate control schemes; the target scheme is obtained by comprehensively scoring each candidate control scheme; the target scheme is converted into equipment control commands and sent to the execution terminal; the execution effect is continuously monitored, and if the actual effect deviates from the expected effect by more than a preset threshold, a secondary decision is triggered to regenerate an emergency scheme; at the same time, the input state, generated schemes, execution results, and human intervention records of this decision are saved as input data for the closed-loop optimization of step S5.
9. The IoT-based smart aquaculture monitoring and management method according to claim 8, characterized in that the specific process for incrementally updating the behavioral cloning model and the preset large language model is as follows: data on environmental states, operation instructions, and execution effects during automatic decision-making, together with data on environmental states and manual operations during manual intervention by the aquaculture personnel, are collected as a feedback sample set; when the number of feedback samples reaches the preset cache threshold, a dual-model asynchronous update is triggered: first, the behavioral cloning model is incrementally trained using the feedback samples to update the expert intuition feature vectors output in real time at the edge; then, the newly collected environmental states are paired with the updated expert intuition feature vectors and added to the alignment training set to fine-tune the preset large language model; the updated models, after performance verification, replace the old versions, realizing a continuously evolving closed loop for the system from data collection to intelligent decision-making.
10. A smart aquaculture monitoring and management system for implementing the IoT-based smart aquaculture monitoring and management method according to any one of claims 1-9, characterized in that it includes the following modules: an aquaculture data acquisition module, used to collect multimodal environmental data from the aquaculture site and the operational behavior data of aquaculture personnel at the corresponding times, and to construct an original dataset containing the mapping relationship between environmental data and the actions of aquaculture personnel; a behavior cloning and quantitative representation module, used to standardize the original dataset to generate training samples, train the behavioral cloning model based on the training samples, extract the hidden layer feature vectors, and generate the corresponding behavior representation vectors of the aquaculture personnel in real time; a large model experience injection module, used to pair and organize the generated expert intuition feature vectors with the environmental state vectors, construct a training sample set for alignment training, and perform alignment training on the preset large language model; a decision control module, used to connect the preset large language model to the Internet of Things platform, generate control commands, and send them to the execution terminal; and a closed-loop optimization module, used to incrementally update the behavioral cloning model and the preset large language model.
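For illustration only, the asymmetric linear INT8 quantization recited in claim 5 can be sketched in Python. Several details are assumptions rather than part of the claims: the quantization range is taken as the weighted mean plus or minus `k` weighted standard deviations with `k = 3`, the environmental complexity index of each sample is supplied as a ready-made weight, and the sketch quantizes individual feature values rather than converting a whole model. All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def weighted_stats(values: Sequence[float],
                   weights: Sequence[float]) -> Tuple[float, float]:
    """Weighted mean and weighted standard deviation of feature values,
    using each sample's environmental complexity index as its weight."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)


def int8_params(mean: float, std: float, k: float = 3.0) -> Tuple[float, int]:
    """Scale and zero point mapping [mean - k*std, mean + k*std] to [-128, 127].
    The choice k = 3 is an assumption, not a value from the source."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    lo, hi = mean - k * std, mean + k * std
    scale = (hi - lo) / 255.0
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Asymmetric linear quantization with clipping to the INT8 range."""
    return max(-128, min(127, round(x / scale) + zero_point))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Approximate inverse of quantize for in-range values."""
    return (q - zero_point) * scale
```

Because the range is centered on the weighted mean instead of on zero, the zero point is generally nonzero, which is what makes the scheme asymmetric. Values outside the chosen range are clipped to -128 or 127.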