A system for judging the harvesting time of Haematococcus pluvialis based on optical microscopic images
By combining optical microscopic images and deep learning models, the problem of accurately judging the harvesting time of Haematococcus pluvialis has been solved, enabling real-time, high-frequency monitoring and early warning, thereby improving astaxanthin production and aquaculture efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- MOUTAI INST
- Filing Date
- 2026-03-09
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies make it difficult to accurately determine the harvesting time of Haematococcus pluvialis, which can easily lead to missing the optimal harvesting window. Furthermore, traditional methods such as visual inspection and chemical analysis cannot achieve real-time, high-frequency monitoring.
A system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images is adopted. The system acquires microscopic images through an image acquisition module, and the data processing terminal calculates quantitative values such as encapsulation rate, proportion of red cells, uniformity of pigment distribution, cell aggregation degree, and cell wall thickness. The system is then combined with a deep learning model for real-time reasoning and decision-making.
It enables high-precision, real-time assessment of Haematococcus pluvialis harvesting, avoiding missing the optimal harvesting window, improving astaxanthin production and aquaculture economic benefits, and providing real-time monitoring and early warning functions for the aquaculture process.
Figure CN122244631A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of smart agriculture technology, specifically to a system for determining the harvesting time of Haematococcus pluvialis based on optical microscopic images.
Background Technology
[0002] Haematococcus pluvialis is one of the richest natural sources of astaxanthin. The astaxanthin it produces has very strong antioxidant activity and is widely used in food, health products, cosmetics, and aquatic feed. In industrial cultivation, Haematococcus pluvialis typically goes through a green vegetative growth stage and a red, stress-induced astaxanthin accumulation stage. The culture has economic harvesting value only when the cells have fully entered the red stage and astaxanthin has been synthesized in large quantities and stored in lipid droplets.
[0003] Traditional harvesting decisions mainly rely on human experience or destructive chemical analysis (such as HPLC), which has many drawbacks: manual visual inspection relies solely on changes in the color of the culture medium, ignoring key physiological states such as cell morphology and wall thickness, and the accuracy rate is usually less than 80%. Moreover, it cannot achieve high-frequency, automated monitoring and is prone to missing the optimal harvesting window. Although spectrophotometry / HPLC can accurately determine astaxanthin content, it requires sample destruction, complex pretreatment, and is time-consuming (≥2 hours), making it unsuitable for real-time decision-making.
[0004] Patent application number 202510414760.5 discloses a method and system for microalgae feature fusion analysis based on deep learning. That patent obtains microalgae harvesting time nodes based on the microalgae growth and harvesting threshold range, acquires real-time pollutant data at the current harvesting time node, performs pollution analysis on that data, generates control strategies from the pollution analysis results, and implements control accordingly. By fusing deep learning networks and generative adversarial networks, it integrates the cell division patterns of microalgae under different environments with image features within a preset time range to analyze microalgae growth data in a target area. This addresses the difficulty of identifying microalgae growth and reproduction with imaging technology, accurately captures the growth stage of the microalgae, and provides theoretical support for the identification, control, and removal of microalgae pollutants in watersheds. However, because that patent obtains the harvesting time node from the growth and harvesting threshold range, it cannot confirm in real time, from growth morphology, whether the cells of Haematococcus pluvialis have fully entered the red stage or whether astaxanthin has been synthesized in large quantities and stored in lipid droplets. It therefore cannot be applied to judging the harvesting time of Haematococcus pluvialis.
Summary of the Invention
[0005] The present invention aims to provide a system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images, in order to solve the problem that existing methods of harvesting Haematococcus pluvialis often miss the optimal harvesting window.
[0006] To achieve the above objective, the present invention adopts the following technical solution: a system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images, comprising a server that includes an image acquisition module and a data processing terminal. The image acquisition module is used to acquire microscopic images of sample droplets of Haematococcus pluvialis. The data processing terminal is used to preprocess the acquired microscopic images and then, based on the preprocessed images, to calculate and output quantitative values of the encapsulation rate, red cell proportion, pigment distribution uniformity, cell aggregation degree, and cell wall thickness of Haematococcus pluvialis. When the encapsulation rate is ≥85%, the red cell proportion is ≥90%, the pigment distribution uniformity is ≥85%, the cell aggregation degree is ≤10%, the cell wall thickness is ≥2 μm, and the visibility rate of the bilayer structure of the cell wall is ≥80%, harvesting is recommended. The encapsulation rate is the percentage of spherical or near-spherical dormant cysts in the total number of cells; the red cell proportion is the percentage of cells with a red phenotype in the Haematococcus pluvialis culture sample out of the total number of detected cells; the pigment distribution uniformity is the percentage, relative to the total number of red cells, of red cells for which the standard deviation of the gray levels of all pixels within the cell's ROI is ≤20 and the binarized mask contains no continuous blank area larger than 5 μm²; the cell aggregation degree is the percentage of aggregated cells in the total number of cells; the cell wall thickness is the thickness of the double-layer structure formed by the thickening of the mature cyst cell wall.
[0007] Preferably, as an improvement, the server also includes a data management platform, an intelligent analysis module, and a harvesting decision output module; The data management platform is used to store the calculation results of microscopic images, encapsulation rate, proportion of red cells, pigment distribution uniformity, cell aggregation degree, and cell wall thickness, as well as harvesting decision records, measured astaxanthin content, and CNN model files; at the same time, it provides historical images and astaxanthin content paired datasets for the intelligent analysis module to be used for CNN model training and incremental learning. The intelligent analysis module uses a trained CNN model to perform real-time inference on the acquired microscopic images, outputting astaxanthin content regression values, maturity probability, and anomaly detection confidence scores. Simultaneously, it calculates the final maturity score by using an adaptive weighted fusion algorithm to combine the output maturity probability with the data processing terminal's suggested harvesting results, providing a harvesting strategy. The maturity probability, obtained by mapping the astaxanthin content regression value using a Sigmoid function, is used for continuous assessment of the maturity of astaxanthin accumulation in the cell population. The anomaly detection confidence score is used to identify abnormal situations in the sample, such as contamination, cell death, or excessive stress interfering with the harvesting decision. The harvesting decision output module is used to visualize the decision results of the data processing terminal and the intelligent analysis module, provide audio-visual prompts, and output device linkage commands.
[0008] Preferably, as an improvement, the image acquisition module employs an optical microscope or micro-camera that supports bright field and DIC modes.
[0009] Preferably, as an improvement, the method for calculating the encapsulation rate of Haematococcus pluvialis includes the following steps: A1. Randomly select three fields of view from the microscopic images of each droplet sample, with ≥50 cells in each field of view; A2. Segment the Haematococcus pluvialis cells in the field of view and calculate the roundness of each cell. If the roundness is ≥0.85 and the equivalent diameter is between 10 and 20 μm, the cell is determined to be a cystic cell. The roundness is calculated as:
$$C = \frac{4\pi A}{P^{2}}$$
[0010] where $C$ is the roundness, $A$ is the cell area, and $P$ is the cell perimeter; A3. Calculate the encapsulation rate:
$$\mathrm{CFR} = \frac{N_{cyst}}{N_{total}} \times 100\%$$
where CFR is the encapsulation rate, $N_{cyst}$ is the number of cells determined to be cystic cells, and $N_{total}$ is the total number of cells identified in the field of view.
[0011] Preferably, as an improvement, the method for calculating the proportion of red cells in Haematococcus pluvialis includes the following steps: B1. Before each microscopic image acquisition, use a standard white reference plate with a reflectivity ≥95% under the CIE D65 standard light source as the reference for white balance calibration; B2. Convert the calibrated microscopic image to the sRGB color space, limited to 8-bit depth, with a pixel value range of 0–255; B3. Segment the microscopic image to obtain an independent region for each cell, and extract a circular sub-region with a diameter of 3 μm centered on the cell centroid. Three built-in color templates predefine the mean RGB ranges of green, orange, and red. The red template is R[150,200], G[50,100], B[50,100], and the midpoint of these ranges, (175,75,75), is taken as the center point of the red template; B4. Calculate the average RGB value of all pixels within the sub-region, $(R_{avg}, G_{avg}, B_{avg})$; B5. Use the Euclidean distance to measure the distance between this average value and the center point of each of the three color templates:
$$d_k = \sqrt{(R_{avg} - R_k)^{2} + (G_{avg} - G_k)^{2} + (B_{avg} - B_k)^{2}}$$
[0012] where $k \in \{\text{green}, \text{orange}, \text{red}\}$ and $(R_k, G_k, B_k)$ is the RGB center value of the k-th template; B6. Classify the cell into the template category with the smallest distance, i.e., $\arg\min_k d_k$; if the minimum distance corresponds to the red template, the cell is determined to be a red cell; B7. From the number of cells judged to be red, $N_{red}$, and the total number of effectively detected cells, $N_{total}$, calculate the red cell ratio (RCR):
$$\mathrm{RCR} = \frac{N_{red}}{N_{total}} \times 100\%$$
[0013] Preferably, as an improvement, the method for calculating the uniformity of pigment distribution includes the following steps: C1. For each individual cell identified as a red cell, extract its complete outline as the ROI; C2. Extract the red channel of the RGB image and convert it into an 8-bit grayscale image with pixel values ranging from 0 to 255, and normalize the pixels within the ROI; C3. Calculate the standard deviation σ of the gray levels of all pixels within the ROI. If σ ≤ 20, the cell is considered uniform, with a smooth overall gray-level distribution. In addition, binarize the ROI using the global Otsu threshold and detect whether there is any continuous low-gray-level region with an area greater than 5 μm², which represents a blank area lacking pigment; if such a region exists, the cell is considered incompletely filled and is not counted as uniform even if σ ≤ 20. The gray-level standard deviation is calculated as:
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(G_i - \bar{G}\right)^{2}}$$
[0014] where $G_i$ is the gray level of the i-th pixel within the ROI, $\bar{G}$ is the average gray level of all pixels within the ROI, and $n$ is the total number of pixels within the ROI of this red cell; C4. Calculate the pigment distribution uniformity:
$$\text{pigment distribution uniformity} = \frac{N_{uniform}}{N_{red}} \times 100\%$$
where $N_{uniform}$ is the number of red cells satisfying the uniformity condition and $N_{red}$ is the total number of red cells.
[0015] Preferably, as an improvement, the method for calculating cell aggregation includes the following steps: D1. Perform cell segmentation on the microscopic image and count cell pairs with an adjacent distance ≤3 μm; D2. Define a cell as "aggregated" when the distance between it and at least one other cell is ≤3 μm; D3. Calculate the cell aggregation degree AI:
$$\mathrm{AI} = \frac{N_{aggregated}}{N_{total}} \times 100\%$$
where $N_{aggregated}$ is the number of cells with a distance ≤3 μm from one or more neighboring cells, and $N_{total}$ is the total number of cells that can be clearly identified under the microscope.
[0016] Preferably, as an improvement, the method for quantifying cell wall thickness includes the following steps: E1. In the DIC mode of the image acquisition module, select cystic cells as the measurement objects; E2. Use edge detection to locate the inner and outer boundaries of the cystic cell wall; E3. With the centroid as the center, uniformly select 8 radial directions and measure the distance between the inner and outer wall boundaries. Measure 3 points in each direction and take their average as the wall thickness in that direction; then calculate the average wall thickness over the 8 directions:
$$\mathrm{CWT} = \frac{1}{8}\sum_{i=1}^{8} T_{di}$$
[0017] where CWT is the cell wall thickness, and $T_{d1}$ to $T_{d8}$ are the arithmetic means of the three measurement points in each of the eight directions.
[0018] Preferably, as an improvement, the training method of the CNN model includes the following steps: S1. Collect ≥50 paired samples of microscopic images and measured astaxanthin content, and divide them into training and validation sets at a ratio of 7:3. Augment the images by rotation and brightness perturbation to expand the paired samples to ≥200. Normalize the astaxanthin content over the range of 1.5%-5.0% DW. S2. Load the weights of MobileNetV3-Small pre-trained on ImageNet, freeze the first 10 layers of the backbone network, and train only the last 3 feature layers; replace the original classification head with a three-output head for astaxanthin regression, maturity classification, and anomaly detection; configure a combined MSE and BCE loss function with the Adam optimizer, and set the initial learning rate to 0.001. S3. Set the batch size to 16 and the number of training epochs to 50, and enable early stopping: stop if the validation loss has not decreased for 5 epochs, and save the model with the minimum validation loss as the initial model. S4. After each batch of harvesting, add 10-20 new samples and merge them with the historical data; use the 3σ criterion to remove abnormal samples, balance the proportion of samples at each maturity stage, and augment the new samples according to the same standard. S5. Load the best model from the previous round, freeze the first 7 layers of the backbone, and unfreeze the 3 feature layers and the entire head network; the initial incremental learning rate is 0.0005 and decays with the number of samples, being multiplied by 0.9 for every 50 new samples. S6. Set the batch size to 16 and the number of training epochs to 20, with parameters updated every 2 gradient-accumulation steps; whenever the loss decreases by ≥0.001 in an epoch, save a temporary model; after training, compare the accuracy on the test set and update the deployed model if the improvement is ≥2%. S7. Use regression MAE, classification accuracy, and anomaly detection accuracy as evaluation indicators. If the targets (MAE ≤ 0.2% DW and classification accuracy ≥ 92%) are not met, adjust the number of network neurons, the sample augmentation method, and the anomaly sample weight.
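The learning-rate decay in step S5 can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `incremental_learning_rate` is hypothetical, and "for every 50 new samples" is read here as every complete block of 50 samples added since incremental learning began.

```python
def incremental_learning_rate(new_samples: int,
                              base_lr: float = 0.0005,
                              decay: float = 0.9,
                              step: int = 50) -> float:
    """Learning rate for incremental training (step S5).

    Assumption: the rate is multiplied by `decay` once for every
    complete block of `step` new samples added so far.
    """
    if new_samples < 0:
        raise ValueError("new_samples must be non-negative")
    return base_lr * decay ** (new_samples // step)


if __name__ == "__main__":
    for n in (0, 49, 50, 120):
        print(n, incremental_learning_rate(n))
```

With this reading, the rate stays at 0.0005 until 50 new samples have accumulated and has been reduced twice by the time 120 samples have been added.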
[0019] The advantages of this solution are: 1. This method abandons the traditional method of judging solely by the color of the culture medium through visual inspection. It comprehensively reflects the maturity of astaxanthin accumulation from multiple dimensions, including cell morphology, pigment distribution, population status, and cell wall structure. This avoids the problem of inconsistent judgment standards among different operators and significantly improves the accuracy rate compared to manual visual inspection (<80%). The standardized and quantitative methods of various indicators provide a unified and objective scientific basis for harvesting judgment and can establish a standardized Haematococcus pluvialis cultivation and harvesting process.
[0020] 2. Compared with traditional HPLC / spectrophotometry (≥2 hours / sample, destructive detection), the detection efficiency is greatly improved, and high-frequency, continuous monitoring of the Haematococcus pluvialis cultivation process can be achieved, avoiding missing the optimal harvest window for astaxanthin content peak due to detection lag, thereby improving astaxanthin yield and aquaculture economic benefits.
[0021] 3. This method requires only a small amount of Haematococcus pluvialis culture medium to prepare microscopic samples, which are then observed using optical / DIC microscopy. No destructive treatment such as staining or cell disruption is necessary. All index calculations and model inferences are based on digital analysis of the microscopic images. Samples can be returned to the culture system as needed, with no cell damage throughout the process. This solves the problem of sample destruction required by traditional chemical analysis, reducing sample loss and allowing for continuous monitoring of the same culture system. It enables precise understanding of the dynamic changes in astaxanthin accumulation, providing more complete cell growth and maturation data for determining harvesting timing.
[0022] 4. This solution employs a dual-mode decision-making architecture: a basic mode and an intelligent mode. The basic mode relies solely on optical microscopic images and indicator threshold rules, requiring zero AI computing power and can be deployed in edge computing resource-constrained scenarios such as small-scale farms and grassroots testing points. The intelligent mode overlays a lightweight deep learning model on top of the basic mode, supporting cloud data interaction and model iteration, adapting to the intelligent needs of medium- to large-scale Haematococcus pluvialis cultivation bases. The α adaptive weight coefficient is dynamically adjusted based on historical sample size, achieving a smooth integration of basic rules and model inference. When the sample size is small, stability is ensured by relying on the basic mode; when the sample size is sufficient, the model's advantages are leveraged to improve accuracy. The five-dimensional core indicators are designed based on the general biological characteristics of astaxanthin accumulation in Haematococcus pluvialis. After simple fine-tuning, the model can be applied to different strains of Haematococcus pluvialis under different cultivation conditions (light, temperature, salinity). This breaks through the limitations of existing technologies, which are either only suitable for high-precision laboratory testing or only meet simple manual judgment requirements, achieving full coverage of simplified testing in small-scale farms and intelligent management in large-scale bases.
[0023] 5. This solution is not only a tool for judging the harvesting time but also a real-time monitoring and early warning system for the Haematococcus pluvialis cultivation process. Through the dynamic changes of the five-dimensional indicators, cultivation abnormalities can be detected at an early stage and addressed promptly, for example by adjusting light intensity, supplementing nutrients, or sterilizing and disinfecting, which avoids a significant drop in astaxanthin production. At the same time, cultivation process parameters can be optimized in reverse based on indicator changes, improving the overall stability and yield of Haematococcus pluvialis cultivation. The solution thus combines harvesting judgment, cultivation monitoring, and process optimization, going beyond the functional boundaries of traditional harvesting judgment technology.
Attached Figure Description
[0024] Figure 1 is a flowchart of a system for determining the harvesting time of Haematococcus pluvialis based on optical microscopic images, according to the present invention.
[0025] Figure 2 is a block diagram of the architecture of a system for determining the harvesting time of Haematococcus pluvialis based on optical microscopic images, according to the present invention.
Detailed Implementation
[0026] The following detailed description provides further details on specific implementation methods.
[0027] As shown in Figure 1 and Figure 2: Example 1. A system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images comprises a server that includes an image acquisition module and a data processing terminal. The image acquisition module uses an optical microscope or micro-camera that supports bright field and DIC modes to acquire microscopic images of sample droplets of Haematococcus pluvialis.
[0028] The data processing terminal was selected as the NVIDIA Jetson Nano 2GB embedded development board, serving as the edge computing terminal.
[0029] Key specifications: CPU is a Quad-core ARM Cortex-A57 (1.43GHz), GPU is a 128-core Maxwell (391MHz), RAM is 2GB LPDDR4, and storage is provided by a 32GB Micro SD card (expandable to 128GB); the operating system is Ubuntu 20.04 LTS for Jetson Nano, pre-installed with OpenCV 4.5.5 and ImageJ 1.54f open-source software, running basic rule engine and image preprocessing algorithms; interfaces include USB 3.0×2, USB 2.0×1, HDMI×1, and Ethernet×1, supporting camera input, display output, and data transfer.
[0030] The data processing terminal is used to preprocess the acquired microscopic images and then, based on the preprocessed images, to calculate and output quantitative values of the encapsulation rate, red cell proportion, pigment distribution uniformity, cell aggregation degree, and cell wall thickness of Haematococcus pluvialis. According to the judgment criteria shown in Table 1, when the encapsulation rate is ≥85%, the red cell proportion is ≥90%, the pigment distribution uniformity is ≥85%, the cell aggregation degree is ≤10%, the cell wall thickness is ≥2 μm, and the visibility rate of the cell wall's double-layer structure is ≥80%, the terminal displays "harvesting recommended".
[0031] Table 1. Quantitative Standards for Encapsulation Rate, Percentage of Red Cells, Evenness of Pigment Distribution, Cell Aggregation, and Cell Wall Thickness
[0032] The calculation method for the encapsulation rate of Haematococcus pluvialis includes the following steps: A1. Randomly select three fields of view from the microscopic images of each droplet sample, with ≥50 cells in each field of view; A2. Segment the Haematococcus pluvialis cells in the field of view and calculate the roundness of each cell. If the roundness is ≥0.85 and the equivalent diameter is between 10 and 20 μm, the cell is determined to be a cystic cell. The roundness is calculated as:
$$C = \frac{4\pi A}{P^{2}}$$
[0033] where $C$ is the roundness, $A$ is the cell area, and $P$ is the cell perimeter; A3. Calculate the encapsulation rate:
$$\mathrm{CFR} = \frac{N_{cyst}}{N_{total}} \times 100\%$$
where CFR is the encapsulation rate, $N_{cyst}$ is the number of cells determined to be cystic cells, and $N_{total}$ is the total number of cells identified in the field of view.
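Steps A2 and A3 can be illustrated with a short sketch. It assumes the standard circularity definition C = 4πA/P², takes the equivalent diameter as that of a circle with the same area, and treats the 10 to 20 μm range as inclusive. The function names and the per-cell `(area, perimeter)` input format are hypothetical; segmentation is assumed to have been done upstream.

```python
import math


def roundness(area_um2: float, perimeter_um: float) -> float:
    """Circularity 4*pi*A / P^2; equals 1.0 for a perfect circle."""
    return 4.0 * math.pi * area_um2 / perimeter_um ** 2


def equivalent_diameter(area_um2: float) -> float:
    """Diameter of the circle whose area equals the cell area."""
    return 2.0 * math.sqrt(area_um2 / math.pi)


def is_cyst(area_um2: float, perimeter_um: float) -> bool:
    """Step A2: roundness >= 0.85 and equivalent diameter in 10..20 um."""
    return (roundness(area_um2, perimeter_um) >= 0.85
            and 10.0 <= equivalent_diameter(area_um2) <= 20.0)


def encapsulation_rate(cells: list[tuple[float, float]]) -> float:
    """Step A3: percentage of cystic cells among all segmented cells.

    `cells` holds one (area in um^2, perimeter in um) pair per cell.
    """
    if not cells:
        raise ValueError("at least one cell is required")
    n_cyst = sum(1 for area, perim in cells if is_cyst(area, perim))
    return 100.0 * n_cyst / len(cells)


if __name__ == "__main__":
    round_15um = (math.pi * 7.5 ** 2, math.pi * 15.0)  # circular, 15 um
    elongated = (100.0, 60.0)                          # low roundness
    small_6um = (math.pi * 3.0 ** 2, math.pi * 6.0)    # round but too small
    print(encapsulation_rate([round_15um, elongated, small_6um, round_15um]))
```

In practice the per-field results from the three fields of view in step A1 would be pooled or averaged; the text does not say which, so the sketch leaves that choice to the caller.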
[0034] The method for calculating the proportion of red cells in Haematococcus pluvialis includes the following steps: B1. Before each microscopic image acquisition, use a standard white reference plate with a reflectivity ≥95% under the CIE D65 standard light source as the reference for white balance calibration; B2. Convert the calibrated microscopic image to the sRGB color space, limited to 8-bit depth, with a pixel value range of 0–255; B3. Segment the microscopic image to obtain an independent region for each cell, and extract a circular sub-region with a diameter of 3 μm centered on the cell centroid. Three built-in color templates predefine the mean RGB ranges of green, orange, and red. The red template is R[150,200], G[50,100], B[50,100], and the midpoint of these ranges, (175,75,75), is taken as the center point of the red template; B4. Calculate the average RGB value of all pixels within the sub-region, $(R_{avg}, G_{avg}, B_{avg})$; B5. Use the Euclidean distance to measure the distance between this average value and the center point of each of the three color templates:
$$d_k = \sqrt{(R_{avg} - R_k)^{2} + (G_{avg} - G_k)^{2} + (B_{avg} - B_k)^{2}}$$
[0035] where $k \in \{\text{green}, \text{orange}, \text{red}\}$ and $(R_k, G_k, B_k)$ is the RGB center value of the k-th template; the three color templates are shown in Table 2.
Table 2. Three types of color templates
[0036]
[0037] B6. Classify the cell into the template category with the smallest distance, i.e., $\arg\min_k d_k$; if the minimum distance corresponds to the red template, the cell is determined to be a red cell; B7. From the number of cells judged to be red, $N_{red}$, and the total number of effectively detected cells, $N_{total}$, calculate the red cell ratio (RCR):
$$\mathrm{RCR} = \frac{N_{red}}{N_{total}} \times 100\%$$
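Steps B4 to B7 amount to a nearest-template classification, sketched below. Only the red template center (175, 75, 75) comes from the text; the green and orange centers used here are hypothetical placeholders chosen for illustration and would need to be replaced with the actual template values. The function names are likewise illustrative.

```python
import math

# Red center is the midpoint of R[150,200], G[50,100], B[50,100] from the text.
# The green and orange centers are ASSUMED placeholder values, not source data.
TEMPLATE_CENTERS = {
    "green": (75.0, 150.0, 75.0),    # hypothetical
    "orange": (220.0, 140.0, 60.0),  # hypothetical
    "red": (175.0, 75.0, 75.0),
}


def classify_cell(mean_rgb: tuple[float, float, float]) -> str:
    """Step B6: return the template whose center is nearest (Euclidean)."""
    return min(TEMPLATE_CENTERS,
               key=lambda name: math.dist(mean_rgb, TEMPLATE_CENTERS[name]))


def red_cell_ratio(mean_rgbs: list[tuple[float, float, float]]) -> float:
    """Step B7: percentage of cells classified as red.

    `mean_rgbs` holds the mean RGB of the 3 um centroid sub-region of
    each effectively detected cell (step B4).
    """
    if not mean_rgbs:
        raise ValueError("at least one cell is required")
    n_red = sum(1 for rgb in mean_rgbs if classify_cell(rgb) == "red")
    return 100.0 * n_red / len(mean_rgbs)


if __name__ == "__main__":
    cells = [(180, 80, 70), (170, 70, 80), (80, 140, 80), (175, 75, 75)]
    print([classify_cell(c) for c in cells], red_cell_ratio(cells))
```

Nearest-center classification always assigns one of the three labels, so a cell far from every template is still classified; a maximum-distance cutoff could be added if such cells should be excluded from the count.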
[0038] The calculation method for pigment distribution uniformity includes the following steps: C1. For each individual cell identified as a red cell, extract its complete outline as the ROI; C2. Extract the red channel of the RGB image and convert it into an 8-bit grayscale image with pixel values ranging from 0 to 255, and normalize the pixels within the ROI; C3. Calculate the standard deviation σ of the gray levels of all pixels within the ROI. If σ ≤ 20, the cell is considered uniform, with a smooth overall gray-level distribution. In addition, binarize the ROI using the global Otsu threshold and detect whether there is any continuous low-gray-level region with an area greater than 5 μm², which represents a blank area lacking pigment; if such a region exists, the cell is considered incompletely filled and is not counted as uniform even if σ ≤ 20. The gray-level standard deviation is calculated as:
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(G_i - \bar{G}\right)^{2}}$$
[0039] where $G_i$ is the gray level of the i-th pixel within the ROI, $\bar{G}$ is the average gray level of all pixels within the ROI, and $n$ is the total number of pixels within the ROI of this red cell; C4. Calculate the pigment distribution uniformity:
$$\text{pigment distribution uniformity} = \frac{N_{uniform}}{N_{red}} \times 100\%$$
where $N_{uniform}$ is the number of red cells satisfying the uniformity condition and $N_{red}$ is the total number of red cells.
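The uniformity test in steps C3 and C4 can be sketched as follows. Two assumptions are made: the standard deviation is the population form (dividing by n), and the area of the largest continuous blank region in each ROI has already been measured upstream (for example from an Otsu-binarized mask), so it is passed in as a number. The function names and input format are hypothetical.

```python
import statistics


def is_uniform(gray_pixels: list[float], largest_blank_area_um2: float) -> bool:
    """Step C3: sigma <= 20 and no continuous blank area above 5 um^2.

    Assumption: sigma is the population standard deviation of the ROI
    gray levels; the blank-area measurement is supplied by the caller.
    """
    sigma = statistics.pstdev(gray_pixels)
    return sigma <= 20.0 and largest_blank_area_um2 <= 5.0


def pigment_uniformity(red_cells: list[tuple[list[float], float]]) -> float:
    """Step C4: percentage of red cells that satisfy the uniformity test.

    Each entry is (gray levels within the ROI, largest blank area in um^2).
    """
    if not red_cells:
        raise ValueError("at least one red cell is required")
    n_uniform = sum(1 for pixels, blank in red_cells if is_uniform(pixels, blank))
    return 100.0 * n_uniform / len(red_cells)


if __name__ == "__main__":
    cells = [
        ([120, 125, 130, 128], 0.0),  # smooth, fully filled
        ([50, 200, 60, 190], 0.0),    # high gray-level spread
        ([120, 122, 121, 119], 6.0),  # smooth but has a blank patch
        ([100, 110, 105, 95], 2.0),   # smooth, small blank patch
    ]
    print(pigment_uniformity(cells))
```

A cell with a low σ can still fail because of a single unpigmented patch, which is why the two conditions are combined with a logical AND rather than scored separately.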
[0040] The method for calculating cell aggregation degree includes the following steps: D1. Perform cell segmentation on the microscopic image and count cell pairs with an adjacent distance ≤3 μm; D2. Define a cell as "aggregated" when the distance between it and at least one other cell is ≤3 μm; D3. Calculate the cell aggregation degree AI:
$$\mathrm{AI} = \frac{N_{aggregated}}{N_{total}} \times 100\%$$
where $N_{aggregated}$ is the number of cells with a distance ≤3 μm from one or more neighboring cells, and $N_{total}$ is the total number of cells that can be clearly identified under the microscope.
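Steps D1 to D3 can be sketched with a simple pairwise check. The text does not say how the "adjacent distance" is measured; this sketch assumes each cell is approximated by a circle and uses the edge-to-edge gap (center distance minus both radii). The function name and the `(x, y, radius)` input format are hypothetical.

```python
import math
from itertools import combinations


def aggregation_degree(cells: list[tuple[float, float, float]],
                       max_gap_um: float = 3.0) -> float:
    """Percentage of cells within `max_gap_um` of at least one other cell.

    Each cell is (x, y, radius) in micrometres. Assumption: the adjacent
    distance is the edge-to-edge gap between circle approximations.
    """
    if not cells:
        raise ValueError("at least one cell is required")
    aggregated: set[int] = set()
    for (i, a), (j, b) in combinations(enumerate(cells), 2):
        gap = math.hypot(a[0] - b[0], a[1] - b[1]) - a[2] - b[2]
        if gap <= max_gap_um:
            aggregated.update((i, j))  # both members of the pair count
    return 100.0 * len(aggregated) / len(cells)


if __name__ == "__main__":
    cells = [(0, 0, 8), (18, 0, 8), (100, 100, 8), (200, 0, 8)]
    print(aggregation_degree(cells))
```

The pairwise loop is quadratic in the number of cells, which is acceptable for fields of view containing on the order of 50 cells; a spatial index would be a natural replacement for denser images.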
[0041] The method for quantifying cell wall thickness includes the following steps: E1. In the DIC mode of the image acquisition module, select cystic cells as the measurement objects; E2. Use edge detection to locate the inner and outer boundaries of the cystic cell wall; E3. With the centroid as the center, uniformly select 8 radial directions and measure the distance between the inner and outer wall boundaries. Measure 3 points in each direction and take their average as the wall thickness in that direction; then calculate the average wall thickness over the 8 directions:
$$\mathrm{CWT} = \frac{1}{8}\sum_{i=1}^{8} T_{di}$$
[0042] where CWT is the cell wall thickness, and $T_{d1}$ to $T_{d8}$ are the arithmetic means of the three measurement points in each of the eight directions.
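The averaging in step E3 is a two-level mean, sketched below. The function name and the nested-list input (8 directions, 3 wall-thickness readings each, in μm) are hypothetical; locating the wall boundaries is assumed to have been done by the edge-detection step.

```python
def cell_wall_thickness(measurements_um: list[list[float]]) -> float:
    """Mean wall thickness over 8 radial directions (step E3).

    `measurements_um` must contain 8 directions with 3 readings each;
    each direction is averaged first, then the 8 direction means.
    """
    if len(measurements_um) != 8 or any(len(d) != 3 for d in measurements_um):
        raise ValueError("expected 8 directions with 3 readings each")
    per_direction = [sum(direction) / 3.0 for direction in measurements_um]
    return sum(per_direction) / 8.0


if __name__ == "__main__":
    readings = [[2.0, 2.2, 2.4] for _ in range(8)]
    print(cell_wall_thickness(readings))
```

Because every direction contributes the same number of readings, the result equals the plain mean of all 24 readings; keeping the per-direction means is still useful for spotting a wall that is thin on one side.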
[0043] Example 1 is the basic model, which relies solely on ordinary optical microscope images and uses five visualization indicators and threshold rules to determine the harvest, without requiring network connectivity or AI computing power.
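The threshold rule of the basic mode can be sketched as a small rule check. The thresholds are those stated for the "harvesting recommended" judgment; the class and function names are hypothetical, and the aggregation threshold of 10 is interpreted as a percentage because the aggregation degree is defined as one. The bilayer visibility rate is included as a sixth condition because the judgment criteria list it alongside the five indicators.

```python
from dataclasses import dataclass


@dataclass
class Indicators:
    encapsulation_rate: float   # percent
    red_cell_ratio: float       # percent
    pigment_uniformity: float   # percent
    aggregation_degree: float   # percent (assumed unit of the "<= 10" rule)
    wall_thickness_um: float    # micrometres
    bilayer_visibility: float   # percent


def evaluate_harvest(ind: Indicators) -> tuple[bool, list[str]]:
    """Return (harvest recommended?, names of the failed checks)."""
    checks = {
        "encapsulation_rate": ind.encapsulation_rate >= 85.0,
        "red_cell_ratio": ind.red_cell_ratio >= 90.0,
        "pigment_uniformity": ind.pigment_uniformity >= 85.0,
        "aggregation_degree": ind.aggregation_degree <= 10.0,
        "wall_thickness_um": ind.wall_thickness_um >= 2.0,
        "bilayer_visibility": ind.bilayer_visibility >= 80.0,
    }
    failed = [name for name, passed in checks.items() if not passed]
    return (not failed, failed)


if __name__ == "__main__":
    print(evaluate_harvest(Indicators(90, 95, 88, 5, 2.3, 85)))
    print(evaluate_harvest(Indicators(90, 80, 88, 12, 2.3, 85)))
```

Returning the failed checks along with the verdict makes it easy for a display to show which indicator is holding back the harvest recommendation.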
[0044] Example 2. Example 2 differs from Example 1 in that the server also includes an intelligent analysis module, a data management platform, and a harvesting decision output module. The intelligent analysis module includes training equipment and an inference acceleration module. Training equipment: an NVIDIA RTX 3090 GPU workstation is used for CNN model training and incremental learning; configuration: Intel Core i9-12900K CPU (16 cores, 24 threads, 3.2GHz), 64GB DDR5 memory, 2TB NVMe SSD, RTX 3090 GPU (24GB GDDR6X VRAM, 35.6 TFLOPS computing power); operating system: Windows 10 Professional, equipped with the PyTorch 1.12.1 deep learning framework, CUDA 11.6, and the cuDNN 8.4 acceleration library for rapid model training.
[0045] Inference acceleration module: an NVIDIA Jetson Xavier NX can optionally be used (replacing the Jetson Nano) to improve inference speed in intelligent mode; core parameters: the CPU is a hexa-core NVIDIA Carmel ARMv8.2 (1.9GHz), the GPU is a 384-core Volta (1.1GHz), memory is 8GB LPDDR4X, computing power is 21 TOPS, and single-view image inference time is ≤0.5 seconds, meeting real-time decision-making requirements.
[0046] Data management platforms include local storage or cloud servers; Local storage: Employs a Western Digital My Cloud Home 4TB NAS network storage device to store historical images, sample data, model files, etc.; supports RAID 0 / 1 arrays, with data read / write speeds ≥100MB / s, and features data backup functionality to prevent data loss; connects to local processing terminals and training workstations via Ethernet to achieve data sharing.
[0047] Cloud server: Alibaba Cloud ECS server (instance specification ecs.g6.xlarge) is selected for large-scale data storage and federated learning model aggregation; configuration: 4 cores, 8GB memory, 500GB cloud disk, 5Mbps bandwidth; operating system is CentOS 7.9, equipped with MySQL 8.0 database to store distributed sample data; supports Docker containerized deployment to achieve unified model management and remote updates.
[0048] The harvesting decision output module includes a display device, an audible and visual alarm device, and a communication interface module; Display device: The selected model is a 7-inch HDMI touch screen (1024×600 resolution, capacitive touch), which is connected to the local processing terminal to display indicator values, decision results, and image previews in real time; it supports touch operation and can manually trigger functions such as image acquisition and parameter setting.
[0049] Audible and visual alarm device: The selected model is LTE-1101J audible and visual alarm, with an operating voltage of 12V, an alarm volume of ≥100dB, and a red light flashing frequency of 1Hz; it is connected to Jetson Nano via the GPIO interface, with the green light always on when "harvest recommended" and the red light flashing accompanied by an alarm sound when "abnormal warning" is triggered.
[0050] Communication interface module: a USB-RS485 converter (model: FTDI USB-RS485-WE-1800-BT) is selected; it supports the Modbus RTU communication protocol and transmits decision commands to the aquaculture control system (such as centrifuge harvesters and water quality monitoring equipment). An optional 4G module (Huawei ME909s-821) can be added for remote data upload and command reception at sites without a wired network connection.
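As an illustration of how a decision command could be framed for Modbus RTU, the following minimal sketch builds a "Write Single Register" (function code 0x06) request with its CRC-16 trailer. The slave address, register address, and the meaning of the value are hypothetical; the text does not specify a register map, and sending the frame over the serial port is left out.

```python
def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def build_write_single_register(slave_id: int, register: int, value: int) -> bytes:
    """Build a Modbus RTU 'Write Single Register' (0x06) request frame."""
    body = (
        bytes([slave_id, 0x06])
        + register.to_bytes(2, "big")   # register address, high byte first
        + value.to_bytes(2, "big")      # register value, high byte first
    )
    # Modbus RTU appends the CRC low byte first.
    return body + crc16_modbus(body).to_bytes(2, "little")


# Hypothetical mapping: slave 1, register 0x0010, value 1 = "harvest recommended".
frame = build_write_single_register(slave_id=1, register=0x0010, value=1)
print(frame.hex())
```

A receiver can validate such a frame by recomputing the CRC over all eight bytes; an intact frame yields a remainder of zero.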
[0051] The intelligent analysis module uses a trained CNN model to perform real-time inference on the acquired microscopic images, outputting an astaxanthin content regression value, a maturity probability, and an anomaly detection confidence score. It then calculates the final maturity score by combining the maturity probability with the rule-based judgment result from the data processing terminal through an adaptive weighted fusion algorithm, and provides a harvesting strategy. Specifically, it includes the following steps. Historical sample collection: microscopic images (multi-view stitched) and the corresponding measured astaxanthin content (% DW) are saved for each batch. Paired dataset construction: $\{(I_1, y_1), (I_2, y_2), \ldots, (I_n, y_n)\}$, where $I$ is the image and $y$ is the astaxanthin content. Data augmentation: rotation, brightness perturbation, and simulation of different focus states are applied to improve generalization.
[0052] A lightweight neural network design is used to preprocess the optical image data and segment the Haematococcus pluvialis cell region. Input: Bright-field images stitched together from a single field of view or multiple fields of view (may contain DIC channels); Backbone network: MobileNetV3 or EfficientNet-Lite (suitable for edge deployment); Output: Astaxanthin content regression value (% DW); maturity probability (0-1), replacing hard threshold judgment; anomaly detection confidence (identifying interference such as contamination and dead cells).
[0053] Intelligent decision fusion: the maturity probability output by the deep learning model is fused with the result of the basic rules by weighting: $I_{AST} = \alpha \cdot P + (1 - \alpha) \cdot I_{RulePass}$, where $I_{AST}$ is the final maturity score used to determine the harvesting decision; $\alpha$ is the adaptive weighting coefficient ($0 \le \alpha \le 0.7$), adjusted according to the amount of data (initially $\alpha = 0$, later $\alpha \to 0.7$); $P$ is the maturity probability output by the deep learning model (value range 0 to 1); and $I_{RulePass}$ is the result of the basic rules (1 if the basic harvesting rules are met, 0 if they are not). Specific adjustment rules: when the historical sample size is less than 100, $\alpha = 0$ (only the basic-mode rule judgment result is used); when the sample size is at least 100 and less than 500, $\alpha$ increases linearly with the sample size up to 0.4; when the sample size is greater than or equal to 500, $\alpha$ is stable at 0.7 (fully leveraging the advantages of the intelligent model).
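The fusion rule and the α schedule can be sketched as follows. The function names are illustrative, and the linear segment is assumed to start from α = 0 at 100 samples, since the text gives only its end value of 0.4. The 0.85 decision threshold is the one stated in the incremental learning paragraph.

```python
def adaptive_alpha(n_samples: int) -> float:
    """Weight of the CNN maturity probability as a function of sample count."""
    if n_samples < 100:
        return 0.0
    if n_samples < 500:
        # Assumption: linear ramp from 0 at 100 samples towards 0.4 at 500.
        return 0.4 * (n_samples - 100) / 400
    return 0.7


def fused_score(p: float, rule_pass: bool, n_samples: int) -> float:
    """I_AST = alpha * P + (1 - alpha) * I_RulePass."""
    alpha = adaptive_alpha(n_samples)
    i_rule = 1.0 if rule_pass else 0.0
    return alpha * p + (1.0 - alpha) * i_rule


def recommend_harvest(p: float, rule_pass: bool, n_samples: int,
                      threshold: float = 0.85) -> bool:
    """Recommend harvesting when the fused score reaches the threshold."""
    return fused_score(p, rule_pass, n_samples) >= threshold


print(fused_score(p=0.9, rule_pass=True, n_samples=600))
```

Under these formulas a failed rule check caps the score at α·P ≤ 0.7, which is below the 0.85 threshold, so in this sketch the basic rules act as a necessary condition and the CNN probability refines the decision only when they pass.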
[0054] Incremental learning strategy: a stepwise learning rate decay is adopted (the learning rate decays to 0.9 of the original value for every 50 new samples), and only the parameters of the fully connected layers and the last 3 feature extraction layers are updated. Sample selection uses "outlier removal + balanced sampling": outlier samples whose astaxanthin content standard deviation exceeds 0.5% are removed, and the sample proportion of each maturity stage is kept balanced. Harvesting is recommended when the final maturity score is ≥ 0.85.
[0055] Online learning and model updates: after each harvest, new samples are added to the training set, and an incremental learning strategy is employed to avoid catastrophic forgetting. Cloud-based model aggregation (federated learning) is supported, which protects enterprise data privacy.
[0056] The data management platform is used to store the microscopic images; the calculated values of encapsulation rate, proportion of red cells, pigment distribution uniformity, cell aggregation degree, and cell wall thickness; the harvesting decision records; the measured astaxanthin content; and the CNN model files. It also provides the paired dataset of historical images and astaxanthin content to the intelligent analysis module for CNN model training and incremental learning.
[0057] The harvesting decision output module is used to visualize the decision results of the data processing terminal and the intelligent analysis module, provide audio-visual prompts, and output device linkage commands.
[0058] The CNN model uses MobileNetV3-Small as its base model and employs transfer learning and incremental learning strategies to achieve astaxanthin content regression and maturity prediction. The specific training steps are as follows:
Step 1: Model initialization and transfer learning
(1) Loading pre-trained weights: download the weights of MobileNetV3-Small pre-trained on the ImageNet dataset (PyTorch official weights), load them into the backbone network of the model, freeze the parameters of the first 10 layers of the backbone network, and keep the last 3 feature extraction layers trainable, thereby reducing the amount of training data required and improving the convergence speed.
[0059] (2) Head network construction: remove the original classification head of MobileNetV3-Small and replace it with a multi-output head comprising the following three branches. ① Regression branch: one fully connected layer (input dimension 1024, output dimension 256), ReLU activation, a Dropout layer (dropout = 0.2), then another fully connected layer (input dimension 256, output dimension 1) with linear activation, outputting the astaxanthin content regression value y. ② Classification branch: a Sigmoid activation applied to the regression value, outputting the maturity probability P. ③ Anomaly detection branch: one fully connected layer (input dimension 1024, output dimension 128), ReLU activation, a Dropout layer (dropout = 0.2), a fully connected layer (input dimension 128, output dimension 1), and a Sigmoid activation, outputting the anomaly detection confidence C.
[0060] (3) Loss function and optimizer configuration: a combined loss function is adopted, $L = L_{reg} + 0.3 \times L_{cls} + 0.2 \times L_{anom}$, where $L_{reg}$ is the MSE loss (regression branch), $L_{cls}$ is the BCE loss (maturity classification branch), and $L_{anom}$ is the BCE loss (anomaly detection branch); the optimizer is Adam, with an initial learning rate of 0.001, β1 = 0.9, β2 = 0.999, and a weight decay coefficient λ = 1e-5 to prevent overfitting.
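To make the weighting concrete, the following framework-free sketch evaluates the combined loss on plain Python lists. It only illustrates the arithmetic (MSE plus 0.3 and 0.2 times the two BCE terms); it is not the training code, and the function names and the clamping constant `eps` are assumptions.

```python
import math


def mse(pred, target):
    """Mean squared error over paired sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def bce(prob, label, eps=1e-7):
    """Binary cross-entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(prob)


def combined_loss(y_pred, y_true, p_pred, p_true, c_pred, c_true):
    """Regression MSE + 0.3 * maturity BCE + 0.2 * anomaly BCE."""
    return (mse(y_pred, y_true)
            + 0.3 * bce(p_pred, p_true)
            + 0.2 * bce(c_pred, c_true))


loss = combined_loss(
    y_pred=[0.5, 0.8], y_true=[0.5, 0.8],   # normalized astaxanthin content
    p_pred=[0.5, 0.5], p_true=[1, 0],       # maturity probability vs. label
    c_pred=[0.5, 0.5], c_true=[0, 0],       # anomaly confidence vs. label
)
print(loss)
```

With a perfect regression and maximally uncertain probabilities of 0.5, the result reduces to (0.3 + 0.2)·ln 2, which shows how much each classification term contributes relative to the regression term.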
[0061] Step 2: Initial training (sample size < 100, α = 0)
(1) Dataset preparation: collect no fewer than 50 paired samples (images + measured astaxanthin content) and divide them into a training set and a validation set in a 7:3 ratio; augment the images (rotation, brightness perturbation, cropping) to generate an expanded dataset (≥200 samples after expansion); normalize the astaxanthin content data as $y_{norm} = (y - 1.5) / (5.0 - 1.5)$, based on the normal astaxanthin content range of Haematococcus pluvialis, 1.5%-5.0% DW.
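The label normalization and the 7:3 split can be sketched as below. The helper names and the fixed random seed are illustrative choices, not part of the described system.

```python
import random

Y_MIN, Y_MAX = 1.5, 5.0  # normal astaxanthin content range, % DW


def normalize(y: float) -> float:
    """Map astaxanthin content in % DW onto [0, 1] for the 1.5-5.0 range."""
    return (y - Y_MIN) / (Y_MAX - Y_MIN)


def denormalize(y_norm: float) -> float:
    """Inverse mapping, used to report predictions in % DW."""
    return y_norm * (Y_MAX - Y_MIN) + Y_MIN


def split_train_val(samples, train_ratio=0.7, seed=0):
    """Shuffle the paired samples and split them train:validation = 7:3."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


train_set, val_set = split_train_val(range(50))
print(len(train_set), len(val_set))
```

Values outside 1.5%-5.0% DW map outside [0, 1]; whether such samples should be clipped or treated as outliers is left open here.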
[0062] (2) Training process settings: the batch size is set to 16, the number of training epochs is set to 50, and an early stopping strategy (patience = 5) is adopted; that is, training stops when the validation loss has not decreased for 5 consecutive epochs, to avoid overfitting. During training, the training loss, validation loss, and regression accuracy are recorded in real time (an MAE ≤ 0.3% DW is regarded as accurate).
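A minimal sketch of the early stopping logic with patience = 5 follows. The class name and the validation loss values are made up for illustration; in practice `step` would be called once per epoch with the measured validation loss.

```python
class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch: int, val_loss: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch   # the checkpoint to keep
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Illustrative validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.605, 0.63, 0.64, 0.59]
stopper = EarlyStopping(patience=5)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_epoch, stopper.best_loss)
```

In this run the loss stops improving after epoch 2, so training halts at epoch 7 and the epoch-2 checkpoint is the one that would be saved as the initial model.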
[0063] (3) Model saving: after training is completed, the model with the smallest loss on the validation set is saved as the initial model, named "model_initial.pth", and stored on the NAS device under "/model/initial/". At the same time, the training parameters (learning rate, loss value, iteration number) are saved to the log file.
[0064] Step 3: Incremental training (sample size ≥ 100, α dynamically adjusted)
(1) Sample update and preprocessing: for each batch harvested, 10-20 new samples are added to the training set; abnormal samples are removed using the 3σ criterion, and the new samples are augmented and normalized to maintain a balanced distribution of the dataset; if the number of samples exceeds 500, random sampling is used to keep the proportion of samples in each maturity stage fixed (20% in the vegetative stage, 30% in the transition stage, and 50% in the maturity stage).
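The two sample selection steps can be sketched as follows: a 3σ filter on the measured astaxanthin values, and quota-based random sampling for the 20% / 30% / 50% stage shares. The stage keys, the use of the population standard deviation, and the function names are assumptions made for this sketch.

```python
import random
import statistics

# Hypothetical stage keys, in the order the text lists the three shares.
STAGE_SHARE = {"early": 0.20, "transition": 0.30, "mature": 0.50}


def remove_outliers_3sigma(values):
    """Keep values within three standard deviations of the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population std
    if std == 0:
        return list(values)
    return [v for v in values if abs(v - mean) <= 3 * std]


def balanced_sample(samples_by_stage, total, seed=0):
    """Randomly draw samples so that each stage fills its share of `total`."""
    rng = random.Random(seed)
    picked = {}
    for stage, share in STAGE_SHARE.items():
        pool = samples_by_stage[stage]
        quota = round(total * share)
        picked[stage] = rng.sample(pool, min(quota, len(pool)))
    return picked


contents = [3.0] * 19 + [10.0]          # one implausible measurement
print(remove_outliers_3sigma(contents))
```

Note that the 3σ criterion needs a reasonable number of samples to be meaningful: with very few values a single outlier inflates the standard deviation enough to stay inside the 3σ band.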
[0065] (2) Model parameter adjustment: load the best model from the previous training round, freeze the parameters of the first 7 layers of the backbone network, and unfreeze the 3 feature extraction layers and all head network parameters; set the initial incremental learning rate to 0.0005 and decay the learning rate to 0.9 of the original value for every 50 new samples; set the batch size to 16, the number of training epochs to 20, and the early stopping patience to 3.
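The learning rate schedule can be written as a one-line function. This sketch assumes the decay compounds, that is, the rate is multiplied by 0.9 once for each complete block of 50 new samples; the text could also be read as a single reduction, so treat this as one possible interpretation.

```python
def incremental_learning_rate(n_new_samples: int,
                              base_lr: float = 0.0005,
                              decay: float = 0.9,
                              step: int = 50) -> float:
    """Learning rate after `n_new_samples` samples have been added.

    Assumption: the factor 0.9 is applied once per complete block of
    50 new samples, starting from the initial incremental rate 0.0005.
    """
    return base_lr * decay ** (n_new_samples // step)


for n in (0, 50, 100, 250):
    print(n, incremental_learning_rate(n))
```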
[0066] (3) Incremental training execution: start training with a gradient accumulation strategy (parameters are updated once every 2 accumulated steps) to reduce memory usage; during training, whenever the validation loss decreases by ≥0.001 in an epoch, the model is saved as a temporary file; after training, the test set accuracy of the temporary model is compared with that of the currently deployed model. If the accuracy increases by ≥2%, the deployed model is updated; otherwise, the original model is retained.
[0067] Step 4: Model evaluation and optimization
(1) Performance evaluation: model performance is evaluated on a test set (independent samples, accounting for 10%). Evaluation indicators: regression MAE (mean absolute error); maturity classification accuracy (a prediction is counted as correct when P ≥ 0.8 coincides with a measured astaxanthin content ≥3% DW); and anomaly detection accuracy (a prediction is counted as correct when C ≥ 0.8 coincides with an actual anomaly). Target indicators: MAE ≤ 0.2% DW, classification accuracy ≥ 92%, and anomaly detection accuracy ≥ 90%.
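The three evaluation indicators and the target check can be sketched as follows. The function names and the four test samples are invented for illustration, and a prediction is treated here as correct when the thresholded output agrees with the ground truth in either direction (both above or both below the threshold), which is one reading of the criterion.

```python
def evaluate(y_true, y_pred, p_pred, c_pred, is_anomaly):
    """Return MAE (% DW), maturity accuracy, and anomaly accuracy."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    # Predicted mature (P >= 0.8) should agree with measured content >= 3% DW.
    cls_hits = sum((p >= 0.8) == (t >= 3.0) for p, t in zip(p_pred, y_true))
    # Predicted anomaly (C >= 0.8) should agree with the anomaly label.
    anom_hits = sum((c >= 0.8) == a for c, a in zip(c_pred, is_anomaly))
    return {"mae": mae, "cls_acc": cls_hits / n, "anom_acc": anom_hits / n}


def meets_targets(metrics) -> bool:
    """Targets: MAE <= 0.2 % DW, classification >= 92 %, anomaly >= 90 %."""
    return (metrics["mae"] <= 0.2
            and metrics["cls_acc"] >= 0.92
            and metrics["anom_acc"] >= 0.90)


metrics = evaluate(
    y_true=[3.5, 2.0, 4.0, 2.5],            # measured content, % DW
    y_pred=[3.4, 2.2, 4.1, 2.5],            # regression output, % DW
    p_pred=[0.90, 0.10, 0.85, 0.82],        # maturity probability
    c_pred=[0.1, 0.9, 0.2, 0.1],            # anomaly confidence
    is_anomaly=[False, True, False, False],
)
print(metrics, meets_targets(metrics))
```

In this toy example the fourth sample is predicted mature although its measured content is 2.5% DW, so the classification accuracy is 75% and the targets are not met even though the MAE is well within 0.2% DW.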
[0068] (2) Model optimization: If MAE > 0.2% DW, increase the number of neurons in the fully connected layer (e.g., increase the dimension from 256 to 512), adjust the Dropout coefficient to 0.15, and retrain; if the classification accuracy is < 92%, increase the diversity of sample augmentation (e.g., add Gaussian noise, color perturbation), and expand the dataset; if the anomaly detection accuracy is < 90%, label the abnormal samples separately (contamination, dead cells), and increase the training weight of the abnormal samples (weight coefficient 1.5).
[0069] Step 5: CNN model deployment and update
(1) Model conversion: the trained PyTorch model is converted to ONNX format (version 1.12) and optimized with TensorRT 8.4 to generate an inference engine adapted to the Jetson Nano / Xavier NX, reducing inference time (inference speed increases by 30%-50% after optimization).
[0070] (2) Edge deployment: The optimized model file is transferred to the local processing terminal (Jetson device), integrated into the intelligent analysis module, and inference parameters (input image size, confidence threshold) are configured; the model inference interface is written and connected with the basic rule engine to achieve decision fusion.
[0071] (3) Remote update: If a cloud server is provided, remote push update of the model is supported; the local processing terminal checks the cloud model version regularly (every 24 hours). If an updated version exists, it automatically downloads and replaces the old model, and backs up the old model to ensure that the update can be rolled back if it fails.
[0072] Example 2 is the intelligent mode. Based on the basic mode, it accesses a database of paired historical images and astaxanthin content to train a lightweight deep learning model, dynamically optimize the judgment boundary, and improve accuracy and robustness. The data flywheel mechanism re-injects the measured astaxanthin content (HPLC / spectrophotometry) into the system after each harvest to continuously iterate the model.
[0073] The above descriptions are merely embodiments of the present invention, and common knowledge such as specific technical solutions and / or characteristics are not described in detail here. It should be noted that those skilled in the art can make various modifications and improvements without departing from the technical solutions of the present invention, and these should also be considered within the scope of protection of the present invention. These modifications and improvements will not affect the effectiveness of the implementation of the present invention or the practicality of the patent. The scope of protection claimed in this application should be determined by the content of its claims, and the specific embodiments described in the specification can be used to interpret the content of the claims.
Claims
1. A system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images, characterized in that: the server includes an image acquisition module and a data processing terminal. The image acquisition module is used to acquire microscopic images of sample droplets of Haematococcus pluvialis. The data processing terminal is used to preprocess the acquired microscopic images and then, based on the preprocessed microscopic images, calculate and output quantitative values of the encapsulation rate, proportion of red cells, pigment distribution uniformity, cell aggregation degree, and cell wall thickness of Haematococcus pluvialis. Harvesting is recommended when the encapsulation rate is ≥85%, the proportion of red cells is ≥90%, the pigment distribution uniformity is ≥85%, the cell aggregation degree is ≤10%, the cell wall thickness is ≥2 μm, and the bilayer structure of the cell wall is visible in ≥80% of cases. The encapsulation rate is the percentage of spherical or near-spherical dormant cysts in the total number of cells; the proportion of red cells is the percentage of cells with a red phenotype in the Haematococcus pluvialis culture sample out of the total number of detected cells; the pigment distribution uniformity is the percentage, relative to the total number of red cells, of red cells for which the standard deviation of the grayscale of all pixels within the cell's ROI is ≤20 and no continuous blank area larger than 5 μm² is detected in the binarized mask; the cell aggregation degree is the percentage of aggregated cells in the total number of cells; the cell wall thickness is the thickness of the double-layer structure formed by the thickening of the mature cyst cell wall.
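For illustration only, the threshold check in claim 1 can be expressed as a single predicate. This is a sketch rather than claim language: the field names are hypothetical, and the aggregation degree is assumed to be a percentage, consistent with its definition as a share of the total cell count.

```python
from dataclasses import dataclass


@dataclass
class Indicators:
    encapsulation_rate: float     # %
    red_cell_ratio: float         # %
    pigment_uniformity: float     # %
    aggregation_degree: float     # %, assumed unit
    wall_thickness_um: float      # micrometres
    bilayer_visible_ratio: float  # % of cases with a visible bilayer


def rule_pass(ind: Indicators) -> bool:
    """Return True when every basic harvesting threshold is satisfied."""
    return (ind.encapsulation_rate >= 85.0
            and ind.red_cell_ratio >= 90.0
            and ind.pigment_uniformity >= 85.0
            and ind.aggregation_degree <= 10.0
            and ind.wall_thickness_um >= 2.0
            and ind.bilayer_visible_ratio >= 80.0)


ready = Indicators(88.0, 93.5, 86.0, 6.0, 2.3, 85.0)
early = Indicators(88.0, 72.0, 86.0, 6.0, 2.3, 85.0)  # too few red cells
print(rule_pass(ready), rule_pass(early))
```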
2. The system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 1, characterized in that: the server also includes a data management platform, an intelligent analysis module, and a harvesting decision output module. The data management platform is used to store the microscopic images; the calculated values of encapsulation rate, proportion of red cells, pigment distribution uniformity, cell aggregation degree, and cell wall thickness; the harvesting decision records; the measured astaxanthin content; and the CNN model files; it also provides the paired dataset of historical images and astaxanthin content to the intelligent analysis module for CNN model training and incremental learning. The intelligent analysis module is used to perform real-time inference on the collected microscopic images using a trained CNN model, outputting an astaxanthin content regression value, a maturity probability, and an anomaly detection confidence; it also calculates the final maturity score by combining the output maturity probability with the harvesting recommendation result from the data processing terminal through an adaptive weighted fusion algorithm, and provides a harvesting strategy. The maturity probability is obtained by mapping the astaxanthin content regression value through the Sigmoid function and is used for continuous assessment of the maturity of astaxanthin accumulation in cell populations; the anomaly detection confidence is used to identify whether the sample shows abnormal conditions, such as contamination, cell death, or excessive stress, that interfere with the harvesting judgment. The harvesting decision output module is used to visualize the decision results of the data processing terminal and the intelligent analysis module, provide audio-visual prompts, and output device linkage commands.
3. The system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 2, characterized in that: The image acquisition module uses an optical microscope or micro-camera that supports bright field and DIC modes.
4. The system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 3, characterized in that: the method for calculating the encapsulation rate of Haematococcus pluvialis includes the following steps: A1, randomly select three fields of view from the microscopic images of each droplet sample, with ≥50 cells in each field of view; A2, segment the Haematococcus pluvialis cells in the field of view and calculate the roundness of each cell; if the roundness is ≥0.85 and the equivalent diameter is between 10 and 20 μm, the cell is determined to be a cystic cell. The roundness calculation formula is: $C = \frac{4\pi A}{P^2}$, where $C$ is the roundness, $A$ is the cell area, and $P$ is the cell perimeter; A3, calculate the encapsulation rate using the formula: $CFR = \frac{N_{cyst}}{N_{total}} \times 100\%$, where $CFR$ is the encapsulation rate, $N_{cyst}$ is the number of cells determined to be cystic cells, and $N_{total}$ is the total number of cells identified in the field of view.
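By way of illustration, steps A2 and A3 can be sketched from per-cell area and perimeter measurements. The roundness uses the common circularity definition 4πA/P², and the equivalent diameter is assumed to be the diameter of a circle with the same area; cell segmentation itself is not shown, and the function names are hypothetical.

```python
import math


def roundness(area: float, perimeter: float) -> float:
    """Circularity 4*pi*A / P^2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2


def equivalent_diameter(area: float) -> float:
    """Diameter of the circle whose area equals `area` (assumed definition)."""
    return 2.0 * math.sqrt(area / math.pi)


def is_cyst(area_um2: float, perimeter_um: float) -> bool:
    """Cystic cell: roundness >= 0.85 and equivalent diameter of 10-20 um."""
    return (roundness(area_um2, perimeter_um) >= 0.85
            and 10.0 <= equivalent_diameter(area_um2) <= 20.0)


def encapsulation_rate(cells) -> float:
    """Percentage of cystic cells among (area, perimeter) measurements."""
    return 100.0 * sum(is_cyst(a, p) for a, p in cells) / len(cells)


r = 7.5                                       # radius in micrometres
circle = (math.pi * r * r, 2.0 * math.pi * r)
square = (144.0, 48.0)                        # 12 um x 12 um outline
print(encapsulation_rate([circle, circle, square, circle]))
```

The square outline has a roundness of π/4 ≈ 0.785 and is therefore rejected, while the 15 μm circular cells are counted as cysts.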
5. The system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 4, characterized in that: the method for calculating the proportion of red cells in Haematococcus pluvialis includes the following steps: B1, before each microscopic image acquisition, use a standard white reference plate with a reflectivity ≥95% under the CIE D65 standard light source as a reference for white balance calibration; B2, convert the calibrated microscopic image to the sRGB color space, limited to 8-bit depth, with a pixel value range of 0-255; B3, segment the microscopic image to obtain an independent region for each cell, and extract a circular sub-region with a diameter of 3 μm centered on the cell centroid; the RGB mean ranges of green, orange, and red are predefined by three built-in color templates; the red template is R[150,200], G[50,100], B[50,100], and the midpoint of the range, (175, 75, 75), is taken as the center point of the red template; B4, calculate the average RGB value of all pixels within the sub-region, $(R_{avg}, G_{avg}, B_{avg})$; B5, use the Euclidean distance to measure the distance between this average value and the center point of each of the three color templates: $d_k = \sqrt{(R_{avg} - R_k)^2 + (G_{avg} - G_k)^2 + (B_{avg} - B_k)^2}$, where $k \in \{\text{green}, \text{orange}, \text{red}\}$ and $(R_k, G_k, B_k)$ represents the RGB center value of the k-th template; B6, classify the cell into the template category with the smallest distance, i.e., $\arg\min_k d_k$; if the minimum distance corresponds to the red template, the cell is determined to be a red cell; B7, from the number of cells determined to be red cells, $N_{red}$, and the total number of effectively detected cells, $N_{total}$, calculate the red cell percentage (RCR) using the following formula: $RCR = \frac{N_{red}}{N_{total}} \times 100\%$.
6. The system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 5, characterized in that: the method for calculating the uniformity of pigment distribution includes the following steps: C1, for each individual cell identified as a red cell, extract its complete outline as the ROI; C2, extract the red channel of the RGB image and convert it into an 8-bit grayscale image with pixel values ranging from 0 to 255, and normalize the pixels within the ROI; C3, calculate the standard deviation σ of the gray levels of all pixels within the ROI; if σ ≤ 20, the cell is considered uniform, with a smooth overall gray level distribution; the ROI is also binarized with the global Otsu threshold to detect whether it contains a continuous low-grayscale region with an area greater than 5 μm², which represents a blank area lacking pigment; if such an area exists, the cell is considered incompletely filled and is not included in the uniform cell count even if σ ≤ 20. The formula for calculating the grayscale standard deviation σ is: $\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (G_i - \bar{G})^2}$, where $G_i$ is the grayscale value of the i-th pixel within the ROI, $\bar{G}$ is the average grayscale value of all pixels within the ROI, and $n$ is the total number of pixels within the ROI of this red cell; C4, calculate the pigment distribution uniformity using the formula: $\text{pigment distribution uniformity} = \frac{N_{uniform}}{N_{red}} \times 100\%$, where $N_{uniform}$ is the number of red cells satisfying the uniformity condition and $N_{red}$ is the total number of red cells.
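For illustration, the uniformity test of steps C3 and C4 can be sketched on per-cell pixel lists. The Otsu binarization and connected-region analysis are not shown; the area of the largest blank region is passed in as a precomputed, hypothetical input, and the population standard deviation (division by n) is assumed.

```python
import math


def gray_std(pixels) -> float:
    """Population standard deviation of the ROI gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    return math.sqrt(sum((g - mean) ** 2 for g in pixels) / n)


def is_uniform(pixels, largest_blank_area_um2: float,
               sigma_max: float = 20.0, blank_max_um2: float = 5.0) -> bool:
    """Uniform: sigma <= 20 and no blank region larger than 5 um^2."""
    return (gray_std(pixels) <= sigma_max
            and largest_blank_area_um2 <= blank_max_um2)


def pigment_uniformity(red_cells) -> float:
    """Percentage of uniform cells among (pixels, blank_area) red cells."""
    uniform = sum(is_uniform(px, blank) for px, blank in red_cells)
    return 100.0 * uniform / len(red_cells)


red_cells = [
    ([100, 110, 120, 130], 0.0),   # smooth and fully filled
    ([50, 200, 50, 200], 0.0),     # strongly varying gray levels
    ([100, 110, 120, 130], 6.0),   # smooth, but with a 6 um^2 blank area
]
print(pigment_uniformity(red_cells))
```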
7. The system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 6, characterized in that: the method for calculating cell aggregation includes the following steps: D1, perform cell segmentation on the microscopic image and count cell pairs with an adjacent distance ≤3 μm; D2, define a cell as "aggregated" when the distance between it and ≥1 other cell is ≤3 μm; D3, calculate the cell aggregation degree AI using the formula: $AI = \frac{N_{aggregated}}{N_{total}} \times 100\%$, where $N_{aggregated}$ is the number of cells with a distance ≤3 μm from one or more neighboring cells, and $N_{total}$ is the total number of cells that can be clearly identified under the microscope.
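As an illustration of steps D1 to D3, the sketch below models each cell as a circle given by centroid and radius and takes the "adjacent distance" to be the edge-to-edge gap. Both the circular model and that reading of the distance are assumptions, since the claim does not say how the distance is measured.

```python
import math


def aggregation_degree(cells, max_gap_um: float = 3.0) -> float:
    """Percentage of cells within `max_gap_um` of at least one other cell.

    `cells` is a list of (x, y, radius) tuples in micrometres.
    """
    n = len(cells)
    if n == 0:
        return 0.0
    aggregated = [False] * n
    for i in range(n):
        for j in range(i + 1, n):
            x1, y1, r1 = cells[i]
            x2, y2, r2 = cells[j]
            # Edge-to-edge gap between two circular cell outlines.
            gap = math.hypot(x2 - x1, y2 - y1) - r1 - r2
            if gap <= max_gap_um:
                aggregated[i] = aggregated[j] = True
    return 100.0 * sum(aggregated) / n


cells = [(0, 0, 10), (22, 0, 10), (100, 100, 10), (200, 0, 10)]
print(aggregation_degree(cells))
```

The pairwise loop is quadratic in the number of cells, which is acceptable for fields of view with a few hundred cells; a spatial index would be the usual replacement for larger counts.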
8. The system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 7, characterized in that: the method for quantifying cell wall thickness includes the following steps: E1, using the DIC mode of the image acquisition module, select cystic cells as the measurement objects; E2, use edge detection to locate the inner and outer boundaries of the cystic cell wall; E3, with the centroid as the center, uniformly select 8 radial directions and measure the distance between the inner and outer wall boundaries along each direction; measure 3 points in each direction and take their average as the wall thickness in that direction; calculate the average wall thickness over the 8 directions using the following formula: $CWT = \frac{1}{8} \sum_{i=1}^{8} T_{d_i}$, where $CWT$ is the cell wall thickness and $T_{d_1}$ to $T_{d_8}$ are the arithmetic means of the three measurement points in each of the eight directions.
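The averaging in step E3 can be illustrated with a short sketch that takes the 8 × 3 boundary-distance measurements as input. Edge detection and the radial sampling are not shown, and the function name and measurement values are hypothetical.

```python
def cell_wall_thickness(measurements) -> float:
    """Mean wall thickness from 8 radial directions with 3 points each.

    `measurements` is a list of 8 lists, each holding 3 inner-to-outer
    boundary distances in micrometres.
    """
    if len(measurements) != 8 or any(len(d) != 3 for d in measurements):
        raise ValueError("expected 8 directions with 3 points each")
    # T_d: arithmetic mean of the three points along each direction.
    direction_means = [sum(d) / 3.0 for d in measurements]
    # CWT: arithmetic mean of the eight direction values.
    return sum(direction_means) / 8.0


# Illustrative values: a slightly thicker wall on one side of the cyst.
measurements = [[2.0, 2.2, 2.4]] * 4 + [[1.8, 2.0, 2.2]] * 4
print(cell_wall_thickness(measurements))
```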
9. A system for determining the harvesting timing of Haematococcus pluvialis based on optical microscopic images according to claim 8, characterized in that: the training method of the CNN model includes the following steps: S1, collect ≥50 paired samples of microscopic images and measured astaxanthin content, and divide them into training and validation sets at a ratio of 7:3; augment the images by rotation and brightness perturbation to expand the paired samples to ≥200; normalize the astaxanthin content based on the range of 1.5%-5.0% DW; S2, load the pre-trained weights of MobileNetV3-Small on ImageNet, freeze the first 10 layers of the backbone network, and train only the last 3 feature layers; replace the original classification head with a three-output head for astaxanthin regression, maturity classification, and anomaly detection, and configure a combined loss function (MSE and BCE) and the Adam optimizer with an initial learning rate of 0.001; S3, set the batch size to 16 and the number of training epochs to 50, enable the early stopping strategy, stop if the validation set loss does not decrease for 5 epochs, and save the initial model with the minimum validation set loss; S4, after each batch of harvesting, add 10-20 new samples and merge them with the historical data; use the 3σ criterion to remove abnormal samples, balance the proportion of samples at each maturity stage, and augment the new samples according to the same standard; S5, load the best model from the previous round, freeze the first 7 layers of the backbone, and unfreeze the 3 feature layers and the entire head network; the learning rate decays with the number of samples, dropping to 0.9 of the original for every 50 new samples, with an initial incremental learning rate of 0.0005; S6, set the batch size to 16 and the number of training epochs to 20, and update the parameters once every 2 gradient accumulation steps; whenever the loss decreases by ≥0.001 in an epoch, save the temporary model; after training, compare the accuracy on the test set, and if the improvement is ≥2%, update the deployed model; S7, use regression MAE, classification accuracy, and anomaly detection accuracy as evaluation indicators; if the targets (MAE ≤ 0.2% DW, classification accuracy ≥ 92%) are not met, adjust the number of network neurons, the sample augmentation method, and the anomaly sample weight.