A mangrove unmanned aerial vehicle remote sensing dynamic monitoring method, system and software product based on an agent
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGDONG LABORATORY OF SOUTHERN OCEAN SCIENCE AND ENGINEERING (GUANGZHOU)
- Filing Date
- 2026-04-03
- Publication Date
- 2026-06-16
AI Technical Summary
Existing mangrove drone remote sensing monitoring technology suffers from problems such as high professional threshold, rigid processes and slow response, insufficient multi-task collaboration and comprehensive judgment capabilities, and limited reliability and reusability of results, making it difficult to meet the high-efficiency needs of mangrove resource surveys, dynamic change monitoring, and invasive species early warning.
By employing a large language model as the intelligent decision-making hub and combining it with a knowledge graph in the mangrove domain, the remote sensing processing workflow under natural language input is adaptively adjusted through multi-agent tools. This enables automatic completion of mangrove range extraction, species/invasion identification, index calculation, change detection, and report output. Furthermore, it introduces result consistency verification and iterative optimization to reduce the operational threshold and improve monitoring efficiency.
The approach significantly lowers the barrier to entry for remote sensing and GIS operations, improves monitoring efficiency and emergency response speed, strengthens comprehensive analysis across multiple tasks, improves the stability and reliability of monitoring results, and provides decision support for scenarios such as mangrove resource surveys, dynamic change monitoring, and invasive species assessment.
Figure CN121963006B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of mangrove remote sensing monitoring technology, and in particular to a method, system and software product for mangrove UAV remote sensing dynamic monitoring based on intelligent agents.
Background Technology
[0002] Mangroves are typical wetland ecosystems distributed in the intertidal zone of tropical and subtropical coasts, providing important ecological services such as wind and wave protection, bank protection and embankment reinforcement, water purification, carbon sequestration and storage, and biodiversity maintenance. Mangroves are often located in complex environments characterized by numerous tidal channels, muddy and slippery conditions, and frequent tidal fluctuations. Furthermore, the significant heterogeneity in their canopy structure and community space makes traditional ground survey methods (such as transects, squares, and manual tree height / canopy width measurements) difficult to implement in practical engineering applications. These methods face challenges such as difficulty in access, limited sampling, high labor intensity, long cycles, and small coverage areas, making it difficult to support the needs of routine mangrove resource inspections, rapid emergency verification, and high-frequency dynamic monitoring. Especially during events such as the expansion of invasive alien species, extreme weather disturbances, and human encroachment, management departments typically need comprehensive conclusions within a short period, including the location, extent, speed of change, and potential risks of the event. Traditional survey methods are insufficient to meet the requirements for response timeliness and spatial scale.
[0003] With the development of UAV platforms, sensors, and remote sensing processing software, UAV remote sensing, with its advantages of high mobility, low cost, and high resolution, has gradually become an important technical route for mangrove monitoring. UAVs can acquire multi-angle (overhead, oblique photography, etc.) and multi-type (visible light, multispectral, hyperspectral, lidar, etc.) image data in a relatively short time, and can achieve high geometric accuracy by combining ground control points and RTK / PPK positioning, providing a data foundation for mangrove extent extraction, tree species identification, canopy parameter inversion, change detection, and restoration assessment. However, UAV data is large in volume, diverse in type, temporally dense, and has a long processing chain. From image preprocessing (radiometric / atmospheric correction, geometric correction, mosaic registration) to feature index calculation (such as vegetation / water body correlation index), to classification / detection model training and inference, and finally to mapping and report generation, it often involves the cascading of multiple software, algorithms, and parameters, and still highly relies on personnel with remote sensing and GIS expertise for scheme design and implementation.
[0004] Existing patents and engineering solutions for mangrove UAV remote sensing monitoring mainly provide fixed-process algorithms for a single type of monitoring target. For example, Chinese patent document CN112861837A discloses an intelligent extraction method for mangrove ecological information based on UAVs: the UAV collects visible light remote sensing data, which is then preprocessed; a pixel-level species identification algorithm obtains mangrove species information, from which the mangrove area and species diversity are calculated; in addition, a digital surface model (DSM) is generated through three-dimensional reconstruction, and ecological indicators such as canopy diameter and tree height are calculated in combination with the species identification results, so as to realize the extraction of mangrove ecological information and health assessment. This solution systematically completes the chain of data collection, preprocessing, identification, index calculation, and 3D inversion, and it offers good feasibility and accuracy under a single business objective. However, its technical approach is still based on a predefined processing flow: users need to select input data, configure algorithm parameters, and organize the processing order and output format according to their own understanding of the task. When monitoring needs expand from ecological index extraction to more complex scenarios such as invasive alien species expansion assessment, multi-task linkage judgment, and cross-temporal change diagnosis, professional personnel must still redesign and reconfigure the process. Furthermore, the solution lacks a unified scheduling and adaptive adjustment mechanism across different tasks.
[0005] As another example, Chinese patent document CN112560623A discloses a rapid identification method for mangrove plant species based on UAVs: a training dataset is obtained by collecting visible light data using a UAV and preprocessing it; 3D point cloud reconstruction is performed to generate a digital surface model (DSM); and vegetation is pre-classified using height information. Subsequently, the classification results of pixel-level and object-oriented recognition algorithms are combined to construct a classification weight database, and the pre-classification results are refined based on the weights to improve recognition accuracy. Furthermore, weight parameters can be reused for the same area at different times or for similar community areas to improve efficiency. This scheme emphasizes multi-algorithm fusion and weight database reuse, which can improve accuracy and versatility in specific tree species identification tasks. However, its core remains an algorithmic process fixed around the single task of tree species identification: for other tasks such as range extraction, change detection, invasion monitoring, pest and disease identification, human interference interpretation, and ecological factor correlation analysis, separate independent processes still need to be constructed. Moreover, the selection of indices and models, and parameter adaptation under different data sources (multispectral / hyperspectral / LiDAR) and different task objectives, still rely mainly on human experience.
[0006] Meanwhile, intelligent interaction for geographic information applications has also been explored. For example, Chinese patent document CN120541135A discloses a conversational GIS visualization scene generation method and system based on a large language model (LLM): the dialogue processing module receives and parses natural language commands and uses the LLM to standardize and encapsulate key information parameters; the GIS capability service module then performs the corresponding GIS operations according to the operation type and saves the scene. Finally, the results are presented as map thumbnails, realizing an intelligent link from natural language to GIS analysis and visualization. This type of solution is instructive for lowering the threshold of GIS visualization operations, but its focus is on GIS scene generation / map visualization and interaction. It does not provide a domain-oriented overall mechanism for the professional processing chain commonly found in UAV remote sensing mangrove monitoring, such as multi-source image preprocessing, index and feature engineering, model inference and change detection, and consistency verification against ground verification data. When facing complex, ad hoc, and cross-task monitoring needs, a gap may remain: the natural language can be parsed, but the remote sensing processing chain still needs to be constructed by experts.
[0007] In summary, although existing technologies can achieve tree species identification, ecological indicator extraction, or efficiency improvements in specific processes in mangrove UAV remote sensing scenarios, the following shortcomings still exist in practical business applications: First, the professional threshold remains high. Whether using traditional remote sensing software or models and processes in patented solutions, users typically need to understand the characteristics of multi-source image data, the meaning of preprocessing steps and parameters, master the applicable conditions of indices / features, and be able to judge the adaptability of different algorithm models, making it difficult for ecological management departments or grassroots patrol personnel to use them directly. Second, the processes are rigid and the response is slow. Existing solutions are mostly pre-designed and rigid processing chains for a specific task. When the monitoring target changes (such as switching from tree species classification to Spartina alterniflora expansion or aquaculture pond encroachment), or when the data source and resolution change, it is often necessary to reselect tools, adjust parameters, and rebuild the process, making it difficult to achieve rapid adaptation. Third, the ability for multi-task collaboration and comprehensive analysis is insufficient. Mangrove management decisions often require integrating range extraction, change detection, invasive species identification, ecological indicator estimation, risk assessment, and countermeasure recommendations into a unified conclusion. However, existing technologies mostly focus on single-task optimization, lacking a closed-loop mechanism for unified cross-task scheduling, result fusion, and consistency verification, making it difficult to form stable, traceable, and interpretable comprehensive products. Fourth, the reliability and reusability of results are limited. 
UAV imagery is significantly affected by tide levels, illumination, turbidity, and seasonal phenology, and the model is prone to accuracy degradation when applied across regions and seasons. Existing technologies often lack systematic verification and iterative strategies for result consistency, temporal rationality, and ground verification, thus limiting the reliability and generalizability of monitoring results.
[0008] Therefore, there is an urgent need for a technical solution that can significantly reduce the barrier to entry for mangrove UAV remote sensing monitoring, and can achieve dynamic planning, collaborative execution, and reliable result verification in multi-task scenarios. This solution would support the efficient implementation of various applications such as mangrove resource surveys, dynamic change monitoring, invasive species early warning, and restoration effectiveness assessment, and provide more timely, interpretable, and reusable technical support for management and protection decisions.
Summary of the Invention
[0009] The technical objective of this invention is to provide a method and system for UAV remote sensing dynamic monitoring of mangroves that uses a large language model as the intelligent decision-making center, combines it with a mangrove domain knowledge graph, and drives multi-agent tools to execute collaboratively. This allows users to quickly generate and adaptively adjust the remote sensing processing flow through natural language input, automatically completing tasks such as mangrove range extraction, species / invasion identification, index calculation, change detection, and report output. At the same time, the credibility and interpretability of monitoring results are improved through result consistency verification and iterative optimization, thereby reducing the professional threshold, improving monitoring efficiency, and enhancing the ability to comprehensively analyze multiple tasks.
[0010] Firstly, in order to achieve the above-mentioned objectives, the present invention adopts the following technical solution:
[0011] A method for dynamic remote sensing monitoring of mangrove forests using unmanned aerial vehicles (UAVs) based on intelligent agents, comprising the following steps:
[0012] S1: receive the user's natural language monitoring instruction I and use the large language model M to parse it into a structured task T, which comprises the monitored target object O, the monitored geographic area R, the time range, and the analysis task type A;
[0013] S2: perform GraphRAG retrieval on the mangrove domain knowledge graph $KG_m$ based on T to obtain the subgraph $G_s$, and have the large language model M combine I, T and $G_s$ to generate the task execution graph;
[0014] S3: according to the task execution graph, route each node to an agent for image processing, remote sensing index calculation, target detection, or report generation, obtaining the node result $r_i$ and the call log $L_i$;
[0015] S4: fuse the node results to obtain the monitoring result and compute the consistency score; if the score is below the preset consistency threshold, or if a node fails to execute, feed $L_i$ back to M for replanning so as to adjust the parameter set and / or replace the agent, and iterate until the score reaches the threshold or the number of iterations K reaches the preset maximum number of iterations;
[0016] S5: output GeoTIFF and Shapefile data together with a structured monitoring report, and compute the early warning indicator W; when W reaches the preset warning threshold, generate early warning information in real time; and write the verified information back to $KG_m$ to update the parameter templates.
[0017] Preferably, in step S1, the structured task T further includes a sensor type and an output product set, where the sensor type characterizes the type of UAV payload and the output product set characterizes the formats of the output products;
[0018] And / or, in step S2, the mangrove domain knowledge graph $KG_m$ contains at least an entity type set E and a relation type set. E includes species entities, invasive alien species entities, remote sensing index entities, environmental factor entities, model entities, and tool entities. The relation type set includes at least one of the following: applicable / not applicable, dependency, parameter template, and synonym mapping;
[0019] And / or, in step S2, the GraphRAG retrieval includes: first performing vector recall based on the structured task description T to obtain a candidate entity set, and then performing k-hop graph expansion on the candidate entity set to obtain the subgraph $G_s$, where k is the number of expansion hops;
[0020] And / or, in step S2, the task execution graph is a directed acyclic graph, and each task node further includes a resource budget, which limits the upper bound of the computing resources the node may use during execution.
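The two-stage retrieval in [0019] can be illustrated with a minimal sketch of the k-hop expansion step. It assumes the vector recall stage has already produced the seed entities, treats the knowledge graph as a plain list of (head, relation, tail) triples, and ignores edge direction during expansion; the function name and the triple layout are hypothetical and not taken from the patent.

```python
from collections import deque


def k_hop_subgraph(edges, seeds, k):
    """Return (nodes, triples) reachable within k hops of the seed entities.

    edges: list of (head, relation, tail) triples (hypothetical layout).
    seeds: candidate entities, assumed to come from a prior vector recall step.
    """
    adjacency = {}
    for head, _relation, tail in edges:
        # Expansion is undirected here: an assumption of this sketch.
        adjacency.setdefault(head, []).append(tail)
        adjacency.setdefault(tail, []).append(head)

    depth = {seed: 0 for seed in seeds}
    queue = deque(depth)
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, []):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)

    # Keep only triples whose two endpoints are both inside the subgraph.
    kept = [(h, r, t) for h, r, t in edges if h in depth and t in depth]
    return set(depth), kept
```

A larger k gives the language model more context at the cost of a longer prompt, so k would normally be kept small.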
[0021] Preferably, in step S2, the task execution graph includes multiple task nodes connected by dependencies, and each task node includes at least: a node task type, an input data mode, an output data mode, a parameter set, and quality control rules, where i is the node index;
[0022] And / or, in step S3, according to the task execution graph, for each task node the tool agent matching the node task type is called to perform the corresponding processing, obtaining the node output result $r_i$ and generating the call log $L_i$. The tool agent includes at least one of the following: an image processing agent, a remote sensing index calculation agent, a target detection agent, and a report generation agent; the call log $L_i$ records at least: the identifier of the tool invoked, the version of the input data, the parameter set, and the output result $r_i$;
[0023] And / or, in step S4, the consistency score is computed as a weighted combination of three consistency indicators;
[0024] where the three indicators are: a consistency indicator between the monitoring result and the ground verification data, a consistency indicator between the outputs of at least two different algorithms, and a temporal consistency indicator for multi-phase results, each with its own weighting coefficient, the coefficients satisfying a preset constraint; when the consistency score is below the consistency threshold or any task node fails to execute, $L_i$ and a description of the inconsistency are fed back to the large language model M, which replans the task execution graph to modify at least one parameter set and / or replace at least one tool agent, and S3 and S4 are repeated until the consistency score reaches the threshold or the number of iterations K reaches the upper limit.
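The fusion of the three consistency indicators can be sketched as follows. This is only an illustration under stated assumptions: it takes the score to be a linear weighted sum, takes each indicator to lie in [0, 1], and requires non-negative weights that sum to 1. The function name and the default weights are hypothetical.

```python
def consistency_score(ground, multi_model, temporal, weights=(0.5, 0.3, 0.2)):
    """Fuse three consistency indicators into one score.

    Assumed form (not confirmed by the source): a linear weighted sum with
    non-negative weights summing to 1 and indicators in [0, 1].
    """
    indicators = (ground, multi_model, temporal)
    if any(not 0.0 <= value <= 1.0 for value in indicators):
        raise ValueError("each indicator must lie in [0, 1]")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * value for w, value in zip(weights, indicators))
```

Under these assumptions the score itself stays in [0, 1], so it can be compared directly against a preset consistency threshold.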
[0025] Preferably, in step S3, the image processing agent performs at least two of the following: radiometric calibration, atmospheric correction, geometric correction and image registration, and outputs a corrected image in a unified coordinate reference system.
[0026] And / or, in step S3, the set of indices calculated by the remote sensing index calculation agent includes at least two of NDVI, NDWI, NDMI, MVI, and TIMI, and the indices are used as input features of the target detection agent;
[0027] And / or, in step S3, the target detection agent includes at least a detection model for identifying invasive alien species, and outputs an invasion extent mask and an invasion area, where the invasion area is obtained by summing the ground area of the pixels covered by the mask.
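As a small worked example of the index calculation and area accumulation described above, the sketch below computes NDVI and NDWI for single pixels and sums an invasion area from a binary mask. It uses the common definitions NDVI = (NIR - Red) / (NIR + Red) and the McFeeters form NDWI = (Green - NIR) / (Green + NIR); the source does not state which NDWI variant is used, and the assumption of square pixels of uniform ground size is also ours. All function names are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return 0.0 if total == 0 else (a - b) / total


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """Normalized Difference Water Index (McFeeters form, an assumption)."""
    return normalized_difference(green, nir)


def invasion_area(mask, pixel_size_m):
    """Sum the ground area (m^2) of all pixels flagged 1 in a binary mask.

    Assumes square pixels with the same ground sampling distance everywhere.
    """
    pixel_area = pixel_size_m ** 2
    flagged = sum(1 for row in mask for cell in row if cell)
    return flagged * pixel_area
```

For orthomosaics with a known ground sampling distance, `pixel_size_m` would be read from the raster metadata rather than passed by hand.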
[0028] Preferably, in step S4, the consistency indicator between the outputs of different algorithms is computed as a spatial overlap index, which includes at least one of the intersection over union (IoU) and the Dice coefficient.
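The two spatial overlap indices named here have standard definitions: IoU = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|). The sketch below computes both for two binary masks of equal shape. The function name is hypothetical, and returning 1.0 when both masks are empty is a convention chosen for this example.

```python
def overlap_metrics(mask_a, mask_b):
    """Return (IoU, Dice) for two binary masks given as nested lists."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    size_a, size_b = sum(flat_a), sum(flat_b)
    union = size_a + size_b - intersection

    # Two empty masks agree perfectly (convention chosen for this sketch).
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / (size_a + size_b) if (size_a + size_b) else 1.0
    return iou, dice
```

Dice is never smaller than IoU for the same pair of masks, so a threshold tuned for one index should not be reused unchanged for the other.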
[0029] Preferably, the text report generated by the report generation agent includes four types of paragraphs: descriptive explanation, diagnostic explanation, predictive explanation, and strategy explanation, and each type of paragraph is associated with at least one of the aforementioned call logs $L_i$ to make the analysis process traceable.
[0030] Secondly, the present invention also provides a multi-agent collaborative mangrove UAV remote sensing monitoring system based on a large language model intelligent hub, comprising:
[0031] The natural language interaction module is used to receive monitoring instructions I and generate structured task descriptions T;
[0032] The decision-making central module, containing a large language model M, is used to combine the structured task description T with the retrieval results to generate the task execution graph, and to replan under trigger conditions;
[0033] The domain knowledge base module includes the mangrove domain knowledge graph $KG_m$ and is used to output the subgraph $G_s$ and receive parameter write-back;
[0034] The task planning and routing module is used to route the nodes of the task execution graph to the corresponding tool agents;
[0035] The tool execution layer is used to execute the tool agents and generate the call logs $L_i$;
[0036] The results integration and validation module is used to calculate the consistency score and determine whether an iteration is triggered;
[0037] The results generation and interpretation module is used to output spatial data products and text reports, and to output early warning information in real time when the early warning indicator W reaches the preset warning threshold.
[0038] Preferably, the tool execution layer includes a tool registry, a tool adapter, a resource manager, and an execution engine, wherein the execution engine is used to provide timeout control and error handling;
[0039] And / or, the tool adapter is used to encapsulate heterogeneous remote sensing processing libraries into a unified callable interface and write tool capability descriptions into the mangrove domain knowledge graph $KG_m$;
[0040] And / or, the domain knowledge base module also connects to a spatiotemporal database that stores UAV image metadata and ground verification data, and supports queries by geographic range R and time range;
[0041] And / or, the spatial data products output by the result generation and interpretation module include at least GeoTIFF and Shapefile, and the text report output is PDF.
[0042] Thirdly, the present invention also provides a computer-readable storage medium having a computer program or instructions stored thereon, which, when executed by a processor, implement the steps of the method.
[0043] Fourthly, the present invention also provides a computer program product, including a computer program or instructions that, when executed by a processor, implement the steps of the method.
[0044] This invention utilizes a large language model as the central hub for dynamic task planning and scheduling, combined with prior constraints from a knowledge graph in the mangrove domain and GraphRAG retrieval enhancement. Even when users provide only natural language monitoring requirements, it automatically generates remote sensing processing execution graphs tailored to data types, task objectives, and spatiotemporal ranges, and invokes specialized intelligent agents for image preprocessing, index calculation, target detection / classification, change detection, and report generation to complete end-to-end closed-loop monitoring. Compared to existing solutions that rely on expert-built workflows or fixed scripts, this significantly lowers the operational threshold for remote sensing and GIS, reduces manual intervention and parameter trial-and-error costs, and improves the processing efficiency and emergency response speed for massive amounts of UAV imagery. Furthermore, this invention introduces a quantitative consistency score based on ground verification data, multi-model consistency, and temporal consistency as the quality control and iteration trigger criterion. Upon detection of misclassification, boundary drift, or temporal anomalies, the toolchain and parameters are automatically replanned and executed again until the threshold requirements are met, which significantly improves the stability, reliability, and cross-regional transferability of mangrove range extraction, tree species / invasive species identification, and change quantification results. In addition, the invention writes the verified tool call logs and parameter templates back to the knowledge graph, forming accumulative and reusable prior knowledge of tasks, tools, and parameters, so that the system gains self-learning process optimization capabilities during continuous operation.
Finally, in addition to outputting spatial products such as GeoTIFF and Shapefile, the system can automatically generate structured monitoring reports containing descriptions, diagnoses, predictions, and countermeasure suggestions, and trigger early warnings when key indicators reach thresholds. This achieves an integrated value-added link from data acquisition, analysis and processing, and reliable verification to decision output, meeting the comprehensive decision support needs of multiple scenarios such as mangrove resource surveys, dynamic change monitoring, invasive species expansion assessment, and restoration effectiveness evaluation.
Attached Figure Description
[0045] Figure 1 is a schematic diagram of the structure of the multi-agent collaborative mangrove UAV remote sensing monitoring system based on a large language model intelligent hub according to the present invention.
[0046] Figure 2 is a schematic flow diagram of the mangrove UAV remote sensing dynamic monitoring method based on intelligent agents according to the present invention.
[0047] Figure 3 is a schematic diagram of task execution graph construction and node routing.
[0048] Figure 4 is a schematic diagram of the closed loop for result integration and consistency verification.
[0049] Figure 5 is a schematic diagram of result generation and interpretation.
[0050] Figure 6 is a visual comparison of three-class semantic segmentation.
[0051] Figure 7 is a visual comparison of multi-class semantic segmentation of coastal wetlands.
[0052] Figure 8 is a comparison chart of the mean intersection over union (mIoU) for the three-class task in Application Example 1.
[0053] Figure 9 is a comparison chart of the boundary F1 metric for the three-class task in Application Example 1.
[0054] Figure 10 is a graph of the consistency score of the present invention versus the number of iterations K.
[0055] Figure 11 is a comparison chart of end-to-end task execution times.
[0056] Figure 12 is a comparison chart of the average number of iterations before and after writing back the parameter template.
Detailed Implementation
[0057] The preferred embodiments of the present invention will be further described below with reference to the accompanying drawings. Those skilled in the art should understand that the present invention can be implemented through a combination of software and hardware. The software can run on a server, workstation, or edge computing device, and the hardware may include a CPU, GPU, memory, network interface, and UAV data acquisition terminal, etc. Without departing from the concept of the present invention, module division, algorithm replacement, and equivalent transformations of parameter values should all fall within the protection scope of the present invention.
[0058] I. Terminology Explanation
[0059] 1. Large Language Model (LLM): Refers to a pre-trained generative model based on the Transformer architecture, used to understand, reason about and generate task plans from natural language instructions.
[0060] 2. GraphRAG (Graph-based Retrieval-Augmented Generation): This refers to graph retrieval-augmented generation techniques that enhance the accuracy and controllability of LLM generation by retrieving relevant subgraphs from the domain knowledge graph and using them as context.
[0061] 3. Knowledge Graph for the Mangrove Domain ($KG_m$): A domain knowledge base constructed from entities and relationships such as species, indices, models, tools, parameter templates, environmental factors, and business tasks.
[0062] 4. Task Execution Graph: A directed acyclic graph that organizes the subtask steps of the monitoring task into directed dependencies, with each node corresponding to an executable processing step.
[0063] 5. Tool agents: Callable components that encapsulate remote sensing processing algorithms / models / software capabilities and support scheduled execution by the LLM.
[0064] 6. Consistency score: A quantitative indicator that fuses the ground verification consistency, multi-model consistency and time series consistency of monitoring results, used as the criterion for quality control and iteration triggering.
[0065] 7. Parameter Templates: Reusable combinations of parameters and constraint rules accumulated for a certain type of task, a certain type of data, and a certain type of tool / model, used to guide subsequent planning and execution.
[0066] 8. Spatial data products: These refer to output data such as GeoTIFF and Shapefile that contain spatial reference information, used for mapping, statistics, and management decision-making.
[0067] 9. Early warning threshold: The threshold for triggering automatic early warning; a warning is issued when the early warning indicator reaches or exceeds this threshold.
[0068] 10. Call Log (L): A traceable execution log that records tool calls, input data versions, parameters, output and error messages.
[0069] II. System Structure of the Invention
[0070] As shown in Figure 1, the multi-agent collaborative mangrove UAV remote sensing monitoring system based on a large language model intelligent hub of the present invention preferably includes a natural language interaction and input module 101, a natural language understanding module 102, an LLM decision hub module 103, a mangrove domain knowledge base module 104, a task planning and routing module 105, a multi-agent tool layer 106, a tool execution layer 107, an external tool and data integration module 108, a result integration and verification module 109, and a result generation and interpretation module 110. Each module is connected to a message queue / service interface through a data bus.
[0071] 1. Natural Language Interaction and Input Module 101
[0072] This module receives the monitoring instruction I from the user and supports optional supplementary information, including: monitoring area name / administrative division / vector boundary file, monitoring time range, data source type (RGB, multispectral, hyperspectral, LiDAR), output format preference (map / statistics / report), etc. It can be implemented as a web, mobile, or desktop interface.
[0073] 2. Natural Language Understanding Module 102
[0074] This module invokes the LLM to perform intent parsing, feature extraction, and structured encapsulation of the monitoring instruction I, generating the structured task description T. The module may include: an input normalizer, a JSON schema validator, and a place name resolver (for example, mapping "XX protected area" to standard boundaries or coordinate ranges).
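The structured encapsulation and validation performed by this module can be sketched as below. The sketch assumes the LLM returns a JSON object and that the time range is a pair of ISO 8601 date strings; the field names (`target`, `region`, `time_range`, `analysis`, `sensor`, `products`) and the defaults are hypothetical stand-ins for O, R, the time range, A, the sensor type, and the output product set, and a real system might use a full JSON Schema validator instead of these hand-written checks.

```python
import json
from dataclasses import dataclass

REQUIRED_KEYS = ("target", "region", "time_range", "analysis")


@dataclass(frozen=True)
class StructuredTask:
    target: str        # O: monitored target object
    region: str        # R: monitored geographic area
    time_range: tuple  # (start, end) as ISO 8601 date strings
    analysis: str      # A: analysis task type
    sensor: str = "RGB"                         # UAV payload type (assumed default)
    products: tuple = ("GeoTIFF", "Shapefile")  # output product formats


def parse_task(llm_output: str) -> StructuredTask:
    """Validate the LLM's JSON output and wrap it as a StructuredTask."""
    data = json.loads(llm_output)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    start, end = data["time_range"]
    if start > end:  # ISO dates compare correctly as strings
        raise ValueError("time_range start must not be after end")
    return StructuredTask(
        target=data["target"],
        region=data["region"],
        time_range=(start, end),
        analysis=data["analysis"],
        sensor=data.get("sensor", "RGB"),
        products=tuple(data.get("products", ("GeoTIFF", "Shapefile"))),
    )
```

Rejecting malformed output at this stage keeps invalid tasks from reaching the planner, where errors are more expensive to diagnose.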
[0075] 3. LLM Decision Center Module 103
[0076] As the core of intelligent decision-making, this module receives T and the retrieval-augmented context and generates the task execution graph. It also performs quality assessment of the execution results and replanning. Preferably, this module includes: a plan generator, a plan validator, a replanner, and a prompt template manager.
[0077] 4. Mangrove Domain Knowledge Base Module 104
[0078] This module includes at least the mangrove domain knowledge graph $KG_m$ and an index retrieval service. It stores entities (species, indices, models, tools, parameter templates, environmental factors, scenario examples, etc.) and relationships (applicability, dependencies, parameter templates, synonym mappings, etc.), and supports GraphRAG retrieval that returns the subgraph $G_s$.
[0079] 5. Task Planning and Routing Module 105
[0080] This module routes the nodes of the task execution graph, by node task type, to the corresponding tool agents, and then performs topological sorting and dependency scheduling to form an executable queue.
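Routing by task type followed by topological sorting can be sketched with the standard-library `graphlib` module (Python 3.9+). The node identifiers, task type names, and the routing table below are hypothetical; the sketch also assumes every dependency refers to a node that is present in the graph.

```python
from graphlib import CycleError, TopologicalSorter


def build_queue(nodes, dependencies, routing_table):
    """Return an executable queue of (node_id, agent_name) pairs.

    nodes: {node_id: task_type}
    dependencies: {node_id: set of node_ids that must finish first}
    routing_table: {task_type: agent_name}
    """
    graph = {node: dependencies.get(node, set()) for node in nodes}
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # The task execution graph is required to be a DAG.
        raise ValueError("task execution graph must be acyclic") from exc
    return [(node, routing_table[nodes[node]]) for node in order]
```

For parallel execution, `TopologicalSorter` also offers `prepare()`, `get_ready()` and `done()`, which release nodes as soon as their prerequisites have completed.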
[0081] 6. Multi-agent tool layer 106
[0082] At least including:
[0083] Image processing intelligent agents: image preprocessing, registration, fusion, range extraction, change detection, etc.;
[0084] Index calculation agents: calculation of NDVI / NDWI / NDMI / MVI / TIMI and other indices, and feature engineering;
[0085] Target detection / segmentation agents: species identification, invasive species monitoring, pest and disease and human interference detection;
[0086] Report generation agent: results integration, chart preparation, natural language interpretation, and PDF report generation.
[0087] 7. Tool Execution Layer 107
[0088] Establish a unified tool call interface, preferably including: a tool registry center, a tool adapter, a resource manager, and an execution engine. The execution engine supports synchronous / asynchronous execution, timeout control, error capture, retry strategies, and result caching.
[0089] 8. External Tools and Data Integration Module 108
[0090] This includes UAV imagery and ground survey databases (such as PostgreSQL / PostGIS + distributed file system), model libraries (traditional machine learning models, deep learning models and ecological models), and GIS platforms (spatial analysis and visualization services).
[0091] 9. Results Integration and Verification Module 109
[0092] It is used to perform spatiotemporal alignment, fusion, logical consistency checks and consistency score calculations on multi-source results, and output quality assessment and iterative feedback.
[0093] 10. Result Generation and Interpretation Module 110
[0094] It is used to generate spatial data products (GeoTIFF / Shapefile), statistical charts and structured monitoring reports (PDF), and output early warning information and recommended measures when early warning indicators reach the threshold.
[0095] III. Specific Technical Route for Implementing the Method of the Invention
[0096] As shown in Figure 2, the method of the present invention forms a closed loop through steps S1 to S5:
[0097] S1: parse the natural language input and generate the structured task T;
[0098] S2: retrieve the domain subgraph through GraphRAG and generate the task execution graph;
[0099] S3: route to and invoke the tool agents for execution, obtaining the node results and logs;
[0100] S4: perform result fusion and consistency verification; if the threshold is not reached, trigger a replanning iteration;
[0101] S5: output spatial products / reports and alerts, and write the verified parameter templates and logs back to the mangrove domain knowledge graph.
[0102] The parts that contribute most to the inventiveness of this invention are concentrated in: the domain-knowledge-enhanced planning and executable task graph generation of S2; the quantitative consistency scoring closed loop and interpretable iterative replanning of S4; and the evolution of S5 through write-back of verified parameter templates. These parts are disclosed in more detail below.
[0103] IV. Specific Implementation of Method Steps
[0104] (I) S1 User Input Parsing: From Natural Language I to Structured Task T
[0105] In a preferred embodiment of the present invention, the natural language interaction module receives user input monitoring instruction I, for example:
[0106] Monitor the expansion of Spartina alterniflora in the mangrove forests of XX Nature Reserve over the past year, and output the expansion area change curve and management recommendations.
[0107] Then, LLM is invoked to extract features from I and encapsulate them into a structured form, generating T. Preferably, T is represented as a JSON object, with the following fields and their meanings (field names can be replaced with equivalents):
[0108] O: Monitoring target objects, such as Spartina alterniflora or mangrove range;
[0109] R: Geographic extent, which can be the name of an administrative region / protected area, a vector boundary (GeoJSON / Shapefile), or a rectangular bounding box;
[0110] τ: Time range, preferably expressed as start and end times, for example, the past year;
[0111] A: The set of analysis task types can include range extraction, change detection, species identification, invasion and expansion assessment, report generation, etc.
[0112] Therefore, T can be expressed as:
[0113] $T = \langle O, R, \tau, A \rangle$;
[0114] Where: O represents the target object being monitored; R represents the geographical area being monitored; τ represents the time range; A represents the analysis task type (which can be a single task or a set of multiple tasks).
[0115] To ensure that T can be stably consumed by subsequent modules, this invention preferably sets structured verification rules:
[0116] R must be resolvable as a geometric object in a unified spatial reference system;
[0117] τ must be resolvable into a definite start and end time;
[0118] A must map to the system's enumeration of supported task types;
[0119] When necessary fields are missing, LLM generates clarification questions or adopts a default strategy (e.g., default output GeoTIFF+PDF).
[0120] Furthermore, to reduce place name ambiguity, this invention can use a place name resolver to standardize R: if the user inputs "XX protected area," its standard boundary vector is retrieved from the built-in place name database or PostGIS table; if the retrieval fails, the user is prompted to upload a boundary file or provide a latitude and longitude range. This step is an engineering enhancement and does not affect the core closed loop of this invention.
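The verification rules above can be illustrated with a minimal sketch. The field layout (R as a longitude/latitude bounding box, τ as a pair of dates) and the task type names in `SUPPORTED_TASKS` are assumptions made for illustration; the specification only requires that R, τ, and A be resolvable.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical enumeration of supported task types (names are assumptions).
SUPPORTED_TASKS = {
    "range_extraction", "change_detection", "species_identification",
    "invasion_expansion_assessment", "report_generation",
}

@dataclass
class Task:
    O: str                                  # monitoring target object
    R: tuple[float, float, float, float]    # assumed bbox: (min_lon, min_lat, max_lon, max_lat)
    tau: tuple[date, date]                  # (start, end)
    A: list[str]                            # analysis task types

def validate_task(t: Task) -> list[str]:
    """Return a list of validation errors; an empty list means T is usable."""
    errors = []
    min_lon, min_lat, max_lon, max_lat = t.R
    if not (-180 <= min_lon < max_lon <= 180 and -90 <= min_lat < max_lat <= 90):
        errors.append("R is not a valid longitude/latitude bounding box")
    if t.tau[0] >= t.tau[1]:
        errors.append("tau must have start earlier than end")
    unknown = [a for a in t.A if a not in SUPPORTED_TASKS]
    if not t.A:
        errors.append("A is empty")
    elif unknown:
        errors.append(f"A contains unsupported task types: {unknown}")
    return errors
```

A non-empty error list is the point at which the system would either ask a clarification question or fall back to a default strategy.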
[0121] (II) S2 Domain Knowledge Enhancement Planning: GraphRAG + Task Execution Graph Generation (as shown in Figure 3)
[0122] 1. Structured disclosure of the mangrove domain knowledge graph
[0123] The knowledge graph of the present invention is used to constrain and enhance the task planning of the LLM, so that the LLM can not only describe a task but also execute it, execute it correctly, and produce verifiable results. It is preferably implemented using a graph database (such as Neo4j) or an RDF triple store. Its core entity type set E and relation type set are as follows:
[0124] 1) Entity type E
[0125] Species entities: Avicennia marina, Rhizophora stylosa, Kandelia candel, Spartina alterniflora, etc.;
[0126] Index entities: NDVI, NDWI, NDMI, MVI, TIMI, etc.;
[0127] Tool entities: radiometric calibration tools, geometric correction tools, registration tools, segmentation model inference tools, change detection tools, area statistics tools, etc.;
[0128] Model entities: Random Forest (RF), Support Vector Machine (SVM), U-Net segmentation model, YOLO detection model, etc.
[0129] Parameter template entity: A set of parameters associated with task type, data type, tide condition, and resolution;
[0130] Entities of environmental factors: tide level, turbidity, water temperature, seasonal phenology, light conditions, etc.
[0131] Business task entities: range extraction, invasion expansion assessment, remediation effectiveness assessment, pest and disease identification, etc.;
[0132] Data source entities: RGB orthophotos, multispectral reflectance images, hyperspectral images, LiDAR point clouds, etc.
[0133] 2) Relationship type
[0134] Applicable / Inapplicable: For example, TIMI is applicable to the identification of intertidal mangroves;
[0135] Dependencies: For example, change detection depends on orthophotos registered in two phases;
[0136] Parameter templates: The association between tools / models and parameter templates;
[0137] Synonym mapping: mapping natural language keywords to standard tasks / entities;
[0138] Constraints: For example, when the tide level is above the threshold, the tide level correction strategy / water body mask should be used preferentially.
[0139] Through the above structure, the knowledge graph stores not only knowledge but also executable experience, providing a searchable basis for the subsequent GraphRAG operations.
[0140] 2. Implementation of GraphRAG subgraph retrieval
[0141] In S2, the decision-making central module receives T and executes GraphRAG to obtain a knowledge subgraph related to the task. Preferably, GraphRAG includes the following steps:
[0142] Candidate entity recall: convert the O, A and geographic / time elements in T into search queries (vector queries + keyword queries), and recall the candidate entity set $E_c$;
[0143] k-hop expansion: perform a k-hop graph traversal starting from $E_c$, expanding to task-related indices, tools, models, and parameter templates to obtain a subgraph;
[0144] Context organization: organize the subgraph into context fragments that can be consumed by the LLM, including recommended toolchains, parameter templates, dependency constraints, and risk warnings.
[0145] Where k is the number of expansion hops, preferably k=1 to k=3, to balance information sufficiency and noise control. If the task is to evaluate the expansion of Spartina alterniflora, the subgraph preferably includes: Spartina alterniflora entities, invasion monitoring task entities, recommended models (such as segmentation models / detection models), tidal / seasonal influencing factors, change detection tools and area statistics tools, and the corresponding parameter templates.
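The k-hop expansion step can be sketched as a breadth-first traversal. This is a simplified illustration over an in-memory adjacency dictionary, not a graph database query; the function name `k_hop_subgraph` and the toy entities are hypothetical.

```python
from collections import deque

def k_hop_subgraph(adjacency, seeds, k):
    """Breadth-first expansion: return all nodes within k hops of the seed entities."""
    visited = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:          # do not expand beyond k hops
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return visited

# Toy graph; entity names and edges are illustrative only.
toy_graph = {
    "Spartina alterniflora": ["invasion monitoring"],
    "invasion monitoring": ["U-Net segmentation", "change detection tool"],
    "U-Net segmentation": ["parameter template A"],
    "change detection tool": ["area statistics tool"],
}
```

With k=1 only the directly linked task entity is recalled; with k=3 the parameter template and the area statistics tool are also reached, which shows how a larger k trades noise for coverage.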
[0146] 3. Structured representation and generation mechanism of the task execution graph
[0147] In this invention, the LLM does not directly output a sequence of textual steps, but instead outputs an executable task execution graph. Preferably, the task execution graph is a directed acyclic graph (DAG) defined by a node set and an edge set:
[0148] the node set is the set of task nodes;
[0149] the edge set is the set of dependency edges, where each edge expresses that the execution of one node depends on the output of another node.
[0150] Each task node must contain at least the following fields:
[0151] Node task type (e.g., geometric correction, index calculation, species segmentation inference, change detection, area statistics, report generation, etc.);
[0152] Input data mode (e.g., GeoTIFF, multispectral band stack, point cloud LAS, vector boundary, etc.);
[0153] Output data mode (e.g., corrected image, index raster, classification mask, change map, statistical table, etc.);
[0154] Parameter set (including tool / model parameters, threshold, window size, resolution, resampling method, etc.);
[0155] Quality control rules (e.g., registration error threshold, cloud shadow ratio threshold, classification confidence threshold, connected component area threshold, etc.);
[0156] Optional field: resource budget (CPU cores, GPU memory, timeout limit, etc.).
[0157] To ensure that the task execution graph is executable, this invention preferably employs a two-layer generation + verification mechanism:
[0158] First layer: LLM draft generation
[0159] The LLM takes the structured task T and the retrieved subgraph context as input and outputs a serialized representation of the task execution graph, preferably in YAML / JSON. For example, each node should provide: node_id, task_type, inputs, outputs, params, qc_rules, and depends_on.
[0160] Second layer: Rule validation and repair
[0161] The plan checker performs a consistency check on the LLM output:
[0162] 1) Does the dependency relationship form a cycle?
[0163] 2) Whether the node input and output can be matched (e.g., change detection requires input of two images or two classification masks in the same coordinate system).
[0164] 3) Are all parameters complete and within a valid range (e.g., threshold is in [0,1], and resampling method is in the enumeration set)?
[0165] 4) Are any necessary steps missing (e.g., cross-temporal analysis must include registration / alignment)? If defects exist, an LLM plan repair prompt can be triggered to fill in missing nodes or correct dependencies; or the system's built-in completion strategy can be used (e.g., automatically inserting registration and reprojection nodes).
[0166] Through the above mechanism, the present invention can achieve a stable closed loop of natural language → executable flowchart → automatic execution, rather than just staying at the level of text suggestions.
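The cycle check and the dependency ordering in the second layer can be sketched with the standard library. The sketch covers only check 1) and the detection of references to missing nodes; the input/output matching and parameter range checks are omitted. The node dictionaries reuse the `node_id` and `depends_on` field names given above, while the function name `execution_order` is hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

def execution_order(nodes):
    """Check that depends_on references exist and are acyclic; return a valid run order."""
    ids = {n["node_id"] for n in nodes}
    for n in nodes:
        missing = set(n["depends_on"]) - ids
        if missing:
            raise ValueError(f"{n['node_id']} depends on unknown nodes: {sorted(missing)}")
    # TopologicalSorter expects a mapping: node -> iterable of predecessors.
    sorter = TopologicalSorter({n["node_id"]: n["depends_on"] for n in nodes})
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc

# Illustrative plan for a two-phase change analysis.
plan = [
    {"node_id": "register", "task_type": "registration", "depends_on": []},
    {"node_id": "segment", "task_type": "species_segmentation", "depends_on": ["register"]},
    {"node_id": "change", "task_type": "change_detection", "depends_on": ["segment"]},
    {"node_id": "report", "task_type": "report_generation", "depends_on": ["change"]},
]
```

A `ValueError` raised here is the kind of defect that would trigger the plan repair prompt or the built-in completion strategy.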
[0167] 4. Node routing rules and multi-task collaboration strategies
[0168] The routing module of this invention dispatches nodes to different agents based on each node's task type. Preferably, the routing rules include two categories: keyword mapping and type mapping.
[0169] If the task type is image preprocessing / range extraction / registration / change detection, the node is routed to the image processing agent;
[0170] If the task type is index calculation / feature engineering, the node is routed to the index calculation agent;
[0171] If the task type is species identification / invasion monitoring / pest and disease detection / human interference detection, the node is routed to the target detection agent;
[0172] If the task type is results summary / chart generation / explanation output / suggestion generation, the node is routed to the report generation agent.
[0173] To achieve multi-task collaboration, this invention preferably adopts a shared intermediate product strategy:
[0174] The unified coordinate system corrected image output by the image preprocessing node serves as the common input for multiple downstream nodes;
[0175] Index rasters and texture features can be used as additional channel inputs for object detection / segmentation models;
[0176] Classification masks can be used for area statistics, change detection (comparison of masks from two periods), and visualization for report generation.
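The type-mapping category of routing rules can be sketched as a lookup table. The agent names and task type strings below are hypothetical identifiers chosen to mirror the four rules above; keyword mapping is not shown.

```python
# Hypothetical routing table: agent name -> task types it handles.
ROUTES = {
    "image_processing_agent": {
        "image_preprocessing", "range_extraction", "registration", "change_detection",
    },
    "index_calculation_agent": {"index_calculation", "feature_engineering"},
    "target_detection_agent": {
        "species_identification", "invasion_monitoring",
        "pest_disease_detection", "human_interference_detection",
    },
    "report_generation_agent": {
        "results_summary", "chart_generation", "explanation_output", "suggestion_generation",
    },
}

# Invert once so that each lookup is a single dictionary access.
TASK_TO_AGENT = {task: agent for agent, tasks in ROUTES.items() for task in tasks}

def route(task_type: str) -> str:
    """Return the agent responsible for a node's task type."""
    try:
        return TASK_TO_AGENT[task_type]
    except KeyError:
        raise ValueError(f"no agent registered for task type {task_type!r}") from None
```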
[0177] 5. Generation and binding of parameter sets and quality control rules
[0178] One of the key aspects of this invention is that the LLM does not generate the parameter set by guessing. Instead, it selects parameters adaptively based on the parameter templates and constraint rules in the retrieved subgraph, combined with the current data source metadata (resolution, band, tide level, shooting time, cloud shadow ratio, etc.). The parameter set may include, but is not limited to:
[0179] Image processing parameters: reprojection coordinate system, number of registration feature points, RANSAC threshold, resampling method (nearest neighbor / bilinear), stitching and fusion strategy, etc.
[0180] Index calculation parameters: band mapping used (NIR / RED / GREEN, etc.), index output scale, segmentation threshold, etc.;
[0181] Model inference parameters: model version number, input tile size, sliding window stride, confidence threshold, post-processing connected component area threshold, etc.;
[0182] Change detection parameters: difference threshold, change category definition, minimum change patch area, etc.;
[0183] Statistical parameters: area calculation resolution, vectorization smoothing coefficient, etc.
[0184] Quality control rules are used for automatic node-level acceptance checks, including:
[0185] Registration nodes: the root mean square error must not exceed the registration error threshold;
[0186] Segmentation nodes: the average output confidence must not fall below the confidence threshold, and the proportion of isolated noise patches must not exceed the noise ratio threshold;
[0187] Change detection: changed patches must reach the minimum patch area, to avoid fragmented patches;
[0188] The default values of these four thresholds can be provided by the parameter template and can be adjusted during iteration.
[0189] (III) S3 Multi-Agent Tool Execution: Tool Interface, Model Inference and Training Disclosure
[0190] 1. Unified tool call interface and execution engine
[0191] The tool execution layer provides a unified tool manifest for each tool, which includes at least: input / output data types, parameter schemas, resource requirements, timeout policies, and error codes. After the routing module dispatches a task, the execution engine calls the tools in dependency order and generates logs.
[0192] Each call log preferably includes:
[0193] Tool identifier;
[0194] Input data version (file hash / timestamp / path);
[0195] Parameter set;
[0196] Output data version;
[0197] Running status (status) and error message (error);
[0198] Resource usage summary (CPU time / GPU memory peak).
[0199] 2. Image Processing Intelligent Agent
[0200] Image processing agents can wrap libraries such as GDAL, Rasterio, and OpenCV to perform tasks such as radiometric calibration, atmospheric correction, geometric correction, image registration, range extraction, and change detection.
[0201] Geometric correction / registration: It is preferable to use feature point matching (SIFT / ORB) + RANSAC to estimate homography, or to use orthorectification based on DEM / control points.
[0202] Range extraction: Threshold segmentation (such as based on NDVI / NDWI combined threshold), region growing, random forest classification, or vectorization of the output mask of a deep segmentation model can be used.
[0203] 3. Index Calculation Agent
[0204] The index calculation agent receives a multispectral band stack or reflectance image and calculates a set of indices (at least two). Typical indices:
[0205] NDVI: NDVI=(NIR-RED) / (NIR+RED);
[0206] NDWI: NDWI=(GREEN-NIR) / (GREEN+NIR);
[0207] NDMI: NDMI=(NIR-SWIR) / (NIR+SWIR);
[0208] It also outputs index rasters for downstream model inference / threshold segmentation.
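The three index formulas above share one normalized-difference form, which the following minimal sketch makes explicit. It works on flat lists of per-pixel reflectance values to stay dependency-free; a production agent would more likely operate on raster arrays. Returning 0.0 where the denominator is zero is an assumption, not part of the specification.

```python
def normalized_difference(a, b):
    """Per-pixel (a - b) / (a + b); returns 0.0 where a + b == 0 (assumed convention)."""
    return [(x - y) / (x + y) if (x + y) != 0 else 0.0 for x, y in zip(a, b)]

def compute_indices(bands):
    """bands: dict with 'NIR', 'RED', 'GREEN', 'SWIR' lists of equal length."""
    return {
        "NDVI": normalized_difference(bands["NIR"], bands["RED"]),
        "NDWI": normalized_difference(bands["GREEN"], bands["NIR"]),
        "NDMI": normalized_difference(bands["NIR"], bands["SWIR"]),
    }

# Two illustrative pixels: dense vegetation, then open water.
sample = {
    "NIR": [0.5, 0.02],
    "RED": [0.1, 0.04],
    "GREEN": [0.1, 0.06],
    "SWIR": [0.2, 0.01],
}
indices = compute_indices(sample)
```

For the vegetation pixel NDVI is positive and NDWI negative, while the water pixel shows the opposite signs, which is the contrast that combined NDVI / NDWI thresholds rely on.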
[0209] 4. Object Detection / Segmentation Agent: Model Structure, Training Methods, Parameter Selection, and Dataset Usage
[0210] To meet the practical requirements of those skilled in the art, the model and training process are disclosed in a structured manner below. This invention allows the use of any model structure that satisfies the input-output constraints; preferred feasible solutions are given below.
[0211] 4.1 Dataset Construction and Labeling Specifications
[0212] (1) Data source
[0213] UAV RGB orthophotos: Ground resolution preferably 0.02–0.20 m / pixel;
[0214] UAV multispectral imagery: should include at least RED, GREEN, and NIR bands, preferably with reflectance correction;
[0215] Ground verification data: includes quadrat locations, species records, invasive species boundaries or locations, sampling times, and photographic evidence.
[0216] (2) Sample preparation
[0217] The orthophoto is sliced into fixed tile sizes, preferably 512×512 or 1024×1024 pixels.
[0218] The target categories (mangrove species / Spartina alterniflora / water bodies / bare land / aquaculture ponds, etc.) are labeled using vector polygons and then rasterized into pixel-level masks.
[0219] For detection tasks, target bounding box annotations can be generated simultaneously, with the bounding box being obtained from the bounding rectangle of a polygon.
[0220] (3) Category system
[0221] Mangrove species: Avicennia marina, Rhizophora stylosa, Kandelia candel (can be adjusted according to the actual situation of the region);
[0222] Invasive species: Spartina alterniflora;
[0223] Background types: water bodies, bare beaches, construction land / roads, aquaculture ponds, etc.
[0224] (4) Data partitioning
[0225] The dataset is stratified by region and time and divided into training, validation, and test sets according to a preset ratio; adjacent tiles from the same flight should, as far as possible, not be split across sets, to reduce spatial leakage.
[0226] 4.2 Model Structure
[0227] (A) Segmentation model (preferred): U-Net or its improved structure
[0228] Encoder: Consists of multiple convolutional blocks and downsampling, and can use ResNet backbone;
[0229] Decoder: Upsampling + skip connections, outputting a class probability map of the same size as the input;
[0230] Output: per-pixel probabilities over C classes, where C is the number of classes.
[0231] (B) Detection model (optional): YOLO series single-stage detector
[0232] The backbone network extracts multi-scale features;
[0233] The neck structure integrates different scales;
[0234] The detection head outputs bounding boxes and class confidence scores.
[0235] The present invention preferably uses a segmentation model for invasion detection (Spartina alterniflora patches). For the detection of human interference (roads / fish ponds), the target bounding boxes can be output by the detection model and converted into vectors.
[0236] 4.3 Training Methods
[0237] Taking the segmentation model as an example, the training process is as follows:
[0238] Input feature construction:
[0239] RGB channels (3 channels) or multispectral channels (B channels, B≥4);
[0240] Optional additional channels: index rasters such as NDVI and NDWI can be concatenated as additional channels.
[0241] Loss function:
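[0242] A weighted sum of the cross-entropy loss $L_{CE}$ and the Dice loss $L_{Dice}$ is used:
[0243] $L = \lambda_1 L_{CE} + \lambda_2 L_{Dice}$;
[0244] Where: $\lambda_1$ and $\lambda_2$ are weighting coefficients, which take preferred default values or are adjusted according to class imbalance.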
[0245] Optimizer and Learning Rate:
[0246] AdamW or SGD are preferred;
[0247] The initial learning rate is preferably selected within a preset range;
[0248] Learning rate scheduling employs cosine annealing or step descent.
[0249] Number of training epochs and batch size:
[0250] The number of training epochs (E) is preferably 50–200.
[0251] The batch size (bs) is determined by the available GPU memory, preferably 4 to 32.
[0252] Data augmentation:
[0253] Random rotation, horizontal / vertical flipping, color jitter, random cropping, and scale adjustment; for intertidal scenes, brightness variation / haze simulation augmentations can be added to improve robustness to varying illumination.
[0254] Evaluation indicators:
[0255] Segmentation: mean intersection over union (mIoU), Dice coefficient, and overall pixel accuracy (OA);
[0256] Detection: , .
[0257] The model library records the metrics and training configurations of each version of the model, forming a traceable version.
[0258] 4.4 Parameter Selection and Inference Deployment
[0259] The inference-phase parameter set preferably includes:
[0260] Input tile size tile (e.g., 512);
[0261] Sliding window stride (e.g., tile / 2);
[0262] Confidence threshold (e.g., 0.5);
[0263] Minimum connected component area in post-processing (converted to pixels based on resolution);
[0264] Vectorization smoothing coefficient.
[0265] The inference output includes: classification mask, confidence map, and vector boundary. For the Spartina alterniflora expansion task, the output is the invasion patch mask and its area, where the area is obtained by summing the pixel areas of the mask.
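The tiling and area conversions used at inference time can be sketched as follows. The edge handling (adding a final window flush with the image border) and all function names are assumptions made for illustration; the specification only gives the tile size, the stride, and the pixel-based area accumulation.

```python
import math

def tile_origins(width, height, tile=512, stride=256):
    """Top-left corners of sliding windows; the last window is flush with the image edge."""
    def axis(size):
        if size <= tile:
            return [0]
        starts = list(range(0, size - tile + 1, stride))
        if starts[-1] != size - tile:      # cover the remaining border strip
            starts.append(size - tile)
        return starts
    return [(x, y) for y in axis(height) for x in axis(width)]

def min_area_in_pixels(min_area_m2, resolution_m):
    """Convert a minimum patch area in square metres to a pixel count."""
    return math.ceil(min_area_m2 / (resolution_m ** 2))

def mask_area_m2(pixel_count, resolution_m):
    """Area of a mask obtained by summing the area of its pixels."""
    return pixel_count * resolution_m ** 2
```

For example, at 0.5 m / pixel a minimum patch area of 2 m² corresponds to 8 pixels, and a 400-pixel mask covers 100 m².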
[0266] (IV) S4 Result Integration and Verification: Consistency Scoring Closed Loop and Replanning (as shown in Figure 4)
[0267] 1. Results fusion and spatiotemporal alignment
[0268] The multi-source results may come from different algorithms, different resolutions, or different times. The results integration and verification module preferably applies the following unified strategy:
[0269] Spatial alignment: reproject all raster results to a unified coordinate system and resample them to a unified resolution; apply uniform topology repair to vector results;
[0270] Time alignment: divide the time-series results into multiple time phases according to the time range τ;
[0271] Logical consistency checks: For example, water areas should not be classified as mangrove species; the same pixel should not exhibit unreasonable rapid jumps between adjacent periods (unless a disaster event occurs).
[0272] 2. Calculation and parameter definition of the consistency score
[0273] This invention uses the consistency score as the criterion for quality control and iteration triggering, preferably defined as:
[0274] $S = w_1 C_{gt} + w_2 C_{mm} + w_3 C_{ts}$;
[0275] Where:
[0276] $C_{gt}$ is the consistency index between the monitoring results and the ground verification data;
[0277] $C_{mm}$ is the consistency index between the outputs of at least two different algorithms / models;
[0278] $C_{ts}$ is the temporal consistency index of the multi-phase results;
[0279] $w_1$, $w_2$, $w_3$ are weighting coefficients, satisfying $w_1 + w_2 + w_3 = 1$;
[0280] The value range of $S$ is preferably normalized to $[0, 1]$.
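[0281] (1) Implementation of the ground verification consistency index $C_{gt}$
[0282] When labeled sample points / sample plots are available, extract the predicted category at each corresponding location, construct a confusion matrix, and calculate the macro-average:
[0283] $C_{gt}$ is the mean of the per-category scores derived from the confusion matrix;
[0284] Where the average is taken over all categories, and the result is normalized to $[0, 1]$.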
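[0285] (2) Implementation of the multi-model consistency index $C_{mm}$
[0286] When there are two sets of segmentation result masks $M_1$ and $M_2$, calculate the intersection over union:
[0287] $C_{mm} = \frac{|M_1 \cap M_2|}{|M_1 \cup M_2|}$;
[0288] Where $|\cdot|$ denotes the cardinality of a set of pixels.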
[0289] (3) Implementation of the temporal consistency index $C_{ts}$
[0290] Taking the area sequence of Spartina alterniflora as an example, a smoothness constraint on the rate of change is defined:
[0291] $C_{ts}$ decreases as the average relative change in area between adjacent phases increases;
[0292] Where: the average is taken over the phases; a very small constant prevents division by zero; and a normalization coefficient maps the average relative change magnitude to the $[0, 1]$ range. If a disaster / man-made cleanup event is known to exist, it can be registered as an allowed abrupt change, so that an exemption rule is applied when calculating $C_{ts}$.
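The scoring components can be sketched as follows. The mask IoU and the weighted sum follow the definitions above. The exact form of the temporal index is not recoverable from the text, so `temporal_consistency` uses an assumed form (one minus the clipped, normalized mean relative change), and the default weights and the `eta` normalization coefficient are illustrative values only.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as flat sequences."""
    a = {i for i, v in enumerate(mask_a) if v}
    b = {i for i, v in enumerate(mask_b) if v}
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def temporal_consistency(areas, eta=1.0, eps=1e-9):
    """ASSUMED form: 1 - min(1, mean relative change between adjacent phases / eta)."""
    changes = [abs(nxt - cur) / (cur + eps) for cur, nxt in zip(areas, areas[1:])]
    if not changes:
        return 1.0
    return 1.0 - min(1.0, (sum(changes) / len(changes)) / eta)

def consistency_score(c_ground, c_models, c_temporal, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three consistency indices; weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    w1, w2, w3 = weights
    return w1 * c_ground + w2 * c_models + w3 * c_temporal
```

Because each index lies in [0, 1] and the weights sum to 1, the resulting score also lies in [0, 1] and can be compared directly against a consistency threshold.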
[0293] 3. Iterative Triggering and Replanning Mechanism
[0294] Let the consistency threshold be $\theta$, the iteration count be $K$, and the maximum number of iterations be $K_{\max}$. Replanning is triggered when any of the following conditions is met:
[0295] The consistency score is below the threshold, i.e., $S < \theta$;
[0296] Any critical node fails to execute (such as registration failure, model inference error, empty output, etc.);
[0297] A quality control rule is not passed (e.g., a node-level threshold check fails).
[0298] After replanning is triggered, the system organizes the inconsistency description, the related logs, and the location of the failed node into a feedback packet F and inputs it into the LLM. The LLM outputs a new correction strategy, which includes at least:
[0299] Modify at least one set of parameters (e.g., adjusting threshold, step size, post-processing area threshold, registration RANSAC threshold, etc.); and / or
[0300] Replace at least one tool agent (e.g., changing from threshold segmentation to a segmentation model, or switching from model A to model B); and / or
[0301] Add necessary nodes (such as inserting tide level correction, water body mask node, and cloud shadow detection node).
[0302] The iteration stopping conditions are:
[0303] When the consistency score reaches the threshold ($S \geq \theta$), stop and enter S5; or
[0304] When the iteration count reaches the maximum number of iterations ($K \geq K_{\max}$), stop, output a low-confidence result with a prompt that manual review is required, and indicate the source of uncertainty in the report.
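The stopping logic can be sketched as a small control loop. Only the score-based trigger is modelled; node failures and quality control failures would enter the same replanning branch. The callables `execute`, `score`, and `replan` are hypothetical stand-ins for the tool execution layer, the verification module, and the LLM replanner.

```python
def closed_loop(plan, execute, score, replan, threshold, max_iterations):
    """Run execute -> score -> replan until the score passes or iterations run out."""
    iteration = 0
    while True:
        result = execute(plan)
        s = score(result)
        if s >= threshold:
            # Threshold reached: proceed to result output (S5).
            return {"result": result, "score": s,
                    "iterations": iteration, "needs_review": False}
        if iteration >= max_iterations:
            # Budget exhausted: emit a low-confidence result for manual review.
            return {"result": result, "score": s,
                    "iterations": iteration, "needs_review": True}
        feedback = {"score": s, "result": result}   # simplified feedback packet F
        plan = replan(plan, feedback)
        iteration += 1

# Toy stand-ins: the "plan" is a single parameter that replanning doubles.
def toy_execute(plan):
    return plan

def toy_score(result):
    return min(1.0, result["min_area"] / 40)

def toy_replan(plan, feedback):
    return {"min_area": plan["min_area"] * 2}
```

In the toy setup, a plan starting at `min_area` 10 passes a threshold of 0.8 after two replanning rounds; with a budget of one iteration it is returned flagged for manual review.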
[0305] 4. Interpretable output
[0306] The present invention preferably generates an explanatory segment in each iteration, explaining why the results were judged inconsistent, what adjustments were made, and what improvement is expected. Each explanatory segment is linked by citation to the corresponding logs, creating a traceable chain in the final report. This mechanism makes the results not only usable, but also auditable and verifiable.
[0307] (V) S5 Result Generation, Early Warning and Knowledge Write-Back
[0308] 1. Spatial data products and report output
[0309] When the consistency score reaches the threshold, the system outputs the monitoring results, which include at least:
[0310] GeoTIFF: classification mask, index raster, change detection raster, etc.;
[0311] Shapefile: mangrove extent boundaries, invasive patch boundaries, changed patch vectors, etc.;
[0312] PDF report: Includes summary, data sources and methods, results charts, diagnostic / predictive / countermeasure recommendations, quality assessment and uncertainty explanation.
[0313] As shown in Figure 5, the report generation agent generates text and images based on a predefined template:
[0314] Statistical graphs (e.g., curves showing the change in invasion area over time);
[0315] Heatmap (changing patch density or expansion direction);
[0316] Key performance indicators (area, expansion rate, risk classification).
[0317] 2. Calculation of early warning indicator W
[0318] The early warning indicator W can be defined according to specific business needs. Taking the expansion of Spartina alterniflora as an example, it can be defined as a weighted fusion of the expansion rate and the proportion of invaded sensitive area:
[0319] W is the weighted sum of two terms: the relative growth of the invasion area between the start and end times, and the share of the sensitive area occupied by invasion patches at the end time;
[0320] Where:
[0321] The two invasion areas are the areas at the start time and the end time, respectively;
[0322] The invaded sensitive area is the area of invasion patches falling within sensitive areas (such as a core protected area / remediation area) at the end time;
[0323] The sensitive area total is the total area of the sensitive zone;
[0324] The two weighting coefficients may be chosen to sum to 1;
[0325] A very small constant prevents division by zero.
[0326] When W reaches the early warning threshold, an alert is triggered, and the alert level, hotspot location, and handling suggestions are output.
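One possible reading of the early warning indicator is sketched below. The placement of the small constant in both denominators, the default weights of 0.5, and the threshold value used in the example are assumptions; the text only states that W fuses the expansion rate with the sensitive-area proportion.

```python
def warning_index(area_start, area_end, invaded_sensitive, sensitive_total,
                  alpha=0.5, beta=0.5, eps=1e-6):
    """ASSUMED form: W = alpha * expansion_rate + beta * sensitive_ratio."""
    expansion_rate = (area_end - area_start) / (area_start + eps)
    sensitive_ratio = invaded_sensitive / (sensitive_total + eps)
    return alpha * expansion_rate + beta * sensitive_ratio

def should_alert(w, threshold):
    """An alert is raised once W reaches the early warning threshold."""
    return w >= threshold

# Illustrative values in hectares: invasion grew from 100 to 150,
# with 10 of 100 hectares of sensitive area invaded at the end time.
w = warning_index(100.0, 150.0, 10.0, 100.0)
```

With these inputs the expansion rate is about 0.5 and the sensitive ratio about 0.1, giving W of roughly 0.3, which would trigger an alert under a hypothetical threshold of 0.25.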
[0327] 3. Parameter Templates and Log Writeback
[0328] One of the key technical advantages of this invention lies in its accumulative and evolvable nature. Therefore, in S5, the system writes the verified task execution graph and parameter set back to the knowledge graph. This process generates or updates parameter template entities, which are associated with: task type A, data source type, regional characteristics (intertidal zone / estuary / bay), seasonal and tidal conditions, model version number, etc. Subsequent similar tasks can preferentially retrieve better-matching templates in stage S2, thereby reducing the number of iterations and increasing the success rate.
[0329] The preferred content to be written back includes:
[0330] Task summary (normalized representation of T);
[0331] Execution graph version (node list and dependencies);
[0332] Key parameter templates (threshold, post-processing, step size, model version);
[0333] Quality results (the consistency score and its sub-items);
[0334] Applicable and prohibited conditions (e.g., disabling the NDWI threshold segmentation scheme when the turbidity is high).
[0335] V. Feasible Examples
[0336] Below are specific application examples (including experimental design and data presentation paradigms) that can be directly incorporated into the specification.
[0337] (a) Experimental Platform and Data Sources
[0338] 1. Hardware and software environment
[0339] Server: 2 x Intel Xeon CPUs (or equivalent), 256GB RAM
[0340] GPU: NVIDIA RTX 4090 (24GB) ×1 (training), RTX A4000 / 3080, etc. can be selected for inference stage.
[0341] Operating System: Ubuntu 22.04LTS
[0342] Deep learning frameworks: Python 3.10, PyTorch 2.1, CUDA 12.x
[0343] GIS / remote sensing libraries: GDAL 3.x, Rasterio, OpenCV, PostGIS 15 (spatial database)
[0344] 2. Data Acquisition and Depth Data Generation
[0345] Unmanned aerial vehicle platform: multi-rotor, flight altitude 80–120m; RGB camera (no less than 20MP), optional multispectral payload (including RED / GREEN / NIR).
[0346] Imaging products: GeoTIFF orthophotos, spatial resolution 0.05–0.20 m / pixel.
[0347] Depth (DSM) acquisition methods (choose one or a combination):
[0348] The DSM is obtained by aerial triangulation reconstruction of oblique photogrammetry / multi-view images and then resampled to the same resolution as orthophotos.
[0349] LiDAR point cloud (LAS / LAZ) is interpolated into raster elevation;
[0350] A monocular depth estimation network generates a relative depth map and performs scale alignment using ground control points / DSM.
[0351] Depth normalization: The depth map D is clipped according to the quantiles within the region and normalized to [0,1], and then concatenated with RGB as an additional channel to obtain the RGBD input.
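The quantile clipping and normalization step can be sketched in a few lines. The 2% and 98% quantiles are assumed values, since the text does not state which quantiles are used, and the nearest-rank quantile lookup is a simplification.

```python
def normalize_depth(values, low_q=0.02, high_q=0.98):
    """Clip depth values to [low_q, high_q] quantiles, then scale to [0, 1]."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[round(low_q * (n - 1))]     # nearest-rank lower quantile
    hi = ordered[round(high_q * (n - 1))]    # nearest-rank upper quantile
    if hi == lo:                             # flat surface: no usable range
        return [0.0] * n
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]

# Illustrative elevations 0..100; outliers at both ends are clipped.
depth = [float(v) for v in range(101)]
normalized = normalize_depth(depth)
```

Clipping before scaling keeps a few extreme elevations (for example, a pole or a deep ditch) from compressing the useful range of the additional channel.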
[0352] (II) Evaluation Indicators and Calculation Methods
[0353] 1) Intersection over union (IoU) and mean intersection over union (mIoU)
[0354] For the c-th class, let the numbers of true positive, false positive, and false negative pixels be $TP_c$, $FP_c$, and $FN_c$; then:
[0355] $IoU_c = \frac{TP_c}{TP_c + FP_c + FN_c}$;
[0356] Where: $IoU_c$ is the intersection over union of the c-th class.
[0357] When the number of classes is $C$:
[0358] $mIoU = \frac{1}{C} \sum_{c=1}^{C} IoU_c$;
[0359] Where: $mIoU$ is the mean intersection over union.
[0360] 2) Overall accuracy
[0361] Let the total number of pixels be N, then:
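[0362] $OA = \frac{1}{N} \sum_{c=1}^{C} TP_c$, where $TP_c$ is the number of correctly classified pixels of the c-th class and $C$ is the number of classes;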
[0363] Where: OA represents the overall accuracy.
[0364] 3) Boundary F1 (used to evaluate boundary fit)
[0365] Compute the precision $P_b$ and recall $R_b$ on the set of boundary pixels (allowing a tolerance of d pixels); then:
[0366] $F1_b = \frac{2 P_b R_b}{P_b + R_b}$;
[0367] Where: $F1_b$ is the boundary F1; d is the boundary tolerance (in this embodiment, d = 2 pixels).
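The segmentation metrics above can be computed from a confusion matrix as in the following minimal sketch. The convention that rows are true classes and columns are predicted classes is an assumption, and the boundary F1 helper takes precision and recall as inputs rather than extracting boundary pixels.

```python
def segmentation_metrics(confusion):
    """confusion[i][j] = number of pixels of true class i predicted as class j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    ious = []
    for c in range(n_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                                   # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp      # pixels wrongly assigned to c
        denom = tp + fp + fn
        ious.append(tp / denom if denom else 0.0)
    correct = sum(confusion[c][c] for c in range(n_classes))
    return {"iou": ious, "miou": sum(ious) / n_classes, "oa": correct / total}

def boundary_f1(precision, recall):
    """Harmonic mean of boundary precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Illustrative two-class confusion matrix.
metrics = segmentation_metrics([[8, 2], [1, 9]])
```

For this matrix the per-class IoU values are 8/11 and 9/12, and the overall accuracy is 17/20.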
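[0368] 4) The closed-loop consistency score of this invention (used for iteration triggering)
[0369] $S = w_1 C_{gt} + w_2 C_{mm} + w_3 C_{ts}$;
[0370] Where: $C_{gt}$ is the consistency index with the ground verification data; $C_{mm}$ is the multi-model consistency; $C_{ts}$ is the temporal consistency; $w_1$, $w_2$, $w_3$ are weights with $w_1 + w_2 + w_3 = 1$. This embodiment uses fixed values for the weights, the consistency threshold, and the maximum number of iterations.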
[0371] (III) Application Example 1: Automatic Three-Class Extraction and Statistics for Mangroves and Aquaculture Ponds
[0372] 1. Tasks and Datasets
[0373] Task objective: Perform three-class segmentation of background / mangrove communities such as Kandelia candel or Sonneratia apetala (Sonneratia apetala in this example) / aquaculture ponds in the intertidal zone, and output the aquaculture pond area and mangrove patch boundary vectors.
[0374] Dataset: 1,200 orthophotos (each cropped to 1024×1024 tiles) were collected from two protected areas and surrounding aquaculture areas, and divided into training / validation / test sets in a 7:2:1 ratio; depth maps were obtained from DSM resampling and normalization. Ground truth polygons were drawn by remote sensing experts in conjunction with field verification and then rasterized.
[0375] 2. Actual execution of the system of this invention
[0376] S1 (Natural Language Input): The user enters an instruction I asking for the expansion of aquaculture ponds in a region over the past year, with pond boundaries and area statistics as output. The system parses it into T = ⟨O, R, τ, A⟩, where O = aquaculture ponds and A includes range extraction / change detection / report generation.
[0377] S2 (GraphRAG + Planning): A subgraph linking aquaculture ponds, water body confusion, tidal level influence, and the recommended RGBD segmentation with boundary post-processing is retrieved from $KG_m$, and the task execution graph $G_{task}$ is generated: Preprocessing → DSM alignment → RGBD segmentation → Vectorization → Statistics.
[0378] S3 (Tool Execution): Calls the target segmentation agent to execute the comparison models DeepLabV3, PSPNet, Segformer and the preferred model of this invention, DepthSAM (with RGBD as input).
[0379] S4 (Consistency Verification and Iteration): If there are too many boundary fragments or the result conflicts with ground verification, the system automatically increases the minimum connected-component area threshold and reruns inference in parallel, iterating until $C_{cons}$ reaches the preset consistency threshold.
[0380] S5 (Output and Writeback): Output GeoTIFF classification map, Shapefile boundaries, and PDF report, and write back the Task-Data-Parameter template.
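Steps S2 and S3 amount to ordering a dependency graph and dispatching each node to a tool. The sketch below uses the five-stage chain from this example and Python's standard `graphlib.TopologicalSorter`. The node names, the stub agents, and the log fields are illustrative assumptions, not the patent's data model.

```python
from graphlib import TopologicalSorter

# Task execution graph: each node maps to the set of nodes it depends on.
G_TASK = {
    "preprocessing": set(),
    "dsm_alignment": {"preprocessing"},
    "rgbd_segmentation": {"dsm_alignment"},
    "vectorization": {"rgbd_segmentation"},
    "statistics": {"vectorization"},
}


def run_graph(deps, agents, params):
    """Execute nodes in dependency order and record one call-log entry per node."""
    results, logs = {}, []
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    for node in TopologicalSorter(deps).static_order():
        inputs = {d: results[d] for d in deps[node]}
        node_params = params.get(node, {})
        results[node] = agents[node](inputs, node_params)
        logs.append({"tool": node, "params": node_params, "inputs": sorted(inputs)})
    return results, logs


# Stub agents that only report completion; real agents would wrap processing tools.
agents = {name: (lambda inputs, p, name=name: f"{name}: ok") for name in G_TASK}
params = {"rgbd_segmentation": {"min_area": 50}}  # hypothetical parameter set

results, logs = run_graph(G_TASK, agents, params)
order = [entry["tool"] for entry in logs]
```

Keeping the parameters in the log entry is what makes a later write-back of the task, data, and parameter template possible.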
[0381] 3. Result Comparison
[0382] (1) Qualitative results
[0383] Figure 6 presents the original image, depth map, ground truth annotations, and a comparison of segmentation results from different models (DeepLabV3, PSPNet, Segformer, and the preferred solution of this invention, DepthSAM). Categories include background, mangrove species (Sonneratia apetala), and aquaculture ponds. In scenes involving aquaculture ponds and nearshore water / dark shadows, and at the boundary between mangroves and the background, the DepthSAM results show boundaries that better match the ground truth, fewer fragments, and better segmentation continuity for narrow embankments / ditches.
[0384] (2) Quantitative results
[0385] Table 1 Comparison of segmentation accuracy for three-class classification tasks (test set, n=120 tiles)
[0386]
[0387] As Figures 8 and 9 show, the accuracy metrics of the preferred embodiment of the present invention, including the boundary F1, are significantly higher than those of the comparison models, with the boundary F1 improving more markedly. This directly supports the technical effect of closer-fitting boundaries and fewer fragments.
[0388] (IV) Application Example 2: Fine-grained multi-class segmentation of coastal wetlands
[0389] 1. Task and Category System
[0390] Task Objective: To perform multi-class semantic segmentation of a coastal wetland scene, with classes including at least: sea, river, land, spartina, suaeda, tamarix, reed, vegetation, building, road, boat, etc. (consistent with the legend of Figure 7).
[0391] 2. Test Setup
[0392] Data: A total of 2,400 cropped aerial images (1024×1024) from multiple regions, with depth maps from DSM or monocular depth;
[0393] Training: All models use the same number of training epochs, initial learning rate, and batch size;
[0394] Post-inference processing: A minimum connected-component area threshold and morphological opening and closing operations are applied to reduce noise; the parameters are adaptively adjusted by the system in S4.
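The minimum connected-component area filter used in post-processing can be sketched as below. This is a pure-Python illustration on a binary mask with 4-connectivity. The connectivity, the function name, and the toy mask are assumptions, and the morphological opening and closing steps are not shown.

```python
from collections import deque


def remove_small_components(mask, min_area):
    """Zero out 4-connected foreground components smaller than min_area pixels."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) < min_area:
                for cy, cx in component:
                    out[cy][cx] = 0
    return out


# Toy mask: a 2x2 patch plus one isolated pixel that acts as a fragment.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
cleaned = remove_small_components(mask, min_area=2)
```

Raising `min_area` removes more fragments but can also delete genuinely small patches, which is why the threshold is a natural parameter for the S4 adjustment.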
[0395] 3. Results and Analysis
[0396] Qualitative comparison: Figure 7 presents the original image, depth map, ground truth annotations, and a comparison of segmentation results from different models (DeepLabV3, PSPNet, Segformer, and the preferred scheme of this invention, DepthSAM). Categories include sea, river, land, Spartina alterniflora, Suaeda salsa, Tamarix chinensis, reeds, vegetation, buildings, roads, and vessels. In distinguishing between linear Spartina alterniflora patches along river boundaries and surrounding mudflats, the preferred scheme of this invention more stably maintains river continuity, reduces road breaks and misclassification of buildings, and more closely approximates the ground truth morphology of invasive targets such as Spartina alterniflora, demonstrating improved identification capabilities for complex terrain / feature boundaries after multi-source information (including depth) fusion.
[0397] Quantitative results:
[0398] DeepLabV3: mIoU=0.51, OA=0.78
[0399] PSPNet: mIoU=0.49, OA=0.76
[0400] Segformer: mIoU=0.56, OA=0.82
[0401] DepthSAM (preferred in this invention): mIoU=0.62, OA=0.86;
[0402] At the class level, for example, the per-class IoU of Spartina alterniflora improves from 0.58 (Segformer) to 0.67 (DepthSAM), and that of the river class improves from 0.63 to 0.72, which directly supports the technical effect of more sensitive monitoring of invasive species expansion and more accurate monitoring of changes in land and water boundaries.
[0403] (V) Application Example 3: Validation of LLM dynamic planning and closed-loop quality control in an end-to-end invasion expansion monitoring scenario
[0404] 1. Comparison Objects
[0405] Comparative Example 1 (Manual GIS Workflow): Remote sensing personnel manually complete preprocessing, threshold segmentation / classification, vectorization, statistics, and mapping using ArcGIS / QGIS.
[0406] Comparative Example 2 (Fixed Script Process): Engineers pre-write a fixed Python / GDAL pipeline, which runs according to fixed parameters after inputting data.
[0407] Example (System of the Invention): User natural language input → LLM generates $G_{task}$ → Multi-agent execution → Consistency score triggers iteration → Output and template write-back.
[0408] 2. End-to-end efficiency and quality comparison
[0409] Table 2 Comparison of End-to-End Time and Operational Costs (Same Data, Same Task) (average of sub-tasks)
[0410]
[0411] As shown in Figure 11, this invention significantly reduces end-to-end time; at the same time, owing to the closed loop driven by the consistency score, its final average $C_{cons}$ is higher, reflecting improved efficiency without sacrificing reliability.
[0412] 3. Evidence of quality improvement brought about by closed-loop iteration
[0413] As shown in Figure 10, in the three task categories of invasion expansion, tree species classification, and range change, the consistency score $C_{cons}$ generally reaches or exceeds the threshold within K ≤ 2 iterations as the number of iterations K increases. This indicates that the system has automatic correction capabilities (such as automatic parameter adjustment, switching model versions, and inserting tide level correction or cloud shadow mask steps).
[0414] 4. The "better with use" effect of parameter template write-back
[0415] Table 3 Comparison of average iteration count before and after parameter template write-back (similar tasks in the same region)
[0416]
[0417] As shown in Figure 12, template write-back significantly reduces the average number of iterations for subsequent tasks, further demonstrating that the present invention has the technical effect of experience accumulation, reuse, and accelerated convergence.
[0418] The foregoing description of the embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novelty disclosed herein.
[0419] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0420] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and / or block of the flowchart illustrations and / or block diagrams, and combinations of flows and / or blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0421] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0422] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0423] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.
[0424] Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and / or non-volatile memory such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.
[0425] Computer-readable media include both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
Claims
1. A method for dynamic UAV remote sensing monitoring of mangrove forests based on intelligent agents, characterized in that the method includes the following steps:
S1: receiving a user's natural language monitoring instruction I, and using a large language model M to parse it and generate a structured task T = ⟨O, R, τ, A⟩, where O represents the monitored target object, R represents the monitored geographic range, τ represents the time range, and A represents the analysis task type;
S2: performing GraphRAG retrieval on the mangrove domain knowledge graph $KG_m$ based on T to obtain a subgraph $G_s$, and generating, by the large language model M in combination with I, T, and $G_s$, a task execution graph $G_{task}$;
S3: routing each node of $G_{task}$ to agents for image processing, remote sensing index calculation, target detection, and report generation, to obtain node results $r_i$ and call logs $L_i$;
S4: fusing the node results to obtain a monitoring result $R_{out}$ and calculating a consistency score $C_{cons}$; if $C_{cons}$ is below the preset consistency threshold or a node fails to execute, feeding $L_i$ back to M for replanning so as to adjust the parameter set $P_i$ and / or replace the agent, and iterating until $C_{cons}$ reaches the preset consistency threshold or the number of iterations K reaches $K_{max}$, where $K_{max}$ is the maximum number of iterations;
S5: outputting GeoTIFF and Shapefile data together with a structured monitoring report, calculating an early warning indicator W, generating early warning information in real time when W meets the condition set by a preset warning threshold, and writing the verified $P_i$ and $L_i$ back to $KG_m$ to update the parameter templates;
wherein, in step S2, the task execution graph $G_{task}$ includes multiple task nodes $v_i$ connected by dependencies, and each task node $v_i$ includes at least: a node task type $A_i$, an input data schema, an output data schema, a parameter set $P_i$, and a quality control rule $Q_i$, where i is the node number;
in step S3, for each task node $v_i$ of the task execution graph $G_{task}$, the tool $Agent_i$ matching the node task type $A_i$ is called to perform the corresponding processing, to obtain the node output result $r_i$ and generate the call log $L_i$; the tool $Agent_i$ includes at least one of: an image processing agent, a remote sensing index calculation agent, a target detection agent, and a report generation agent; the call log $L_i$ records at least: the identifier of the tool invoked, the version of the input data, the parameter set $P_i$, and the output result $r_i$;
in step S4, $C_{cons} = w_1 c_{gt} + w_2 c_{mm} + w_3 c_{ts}$, where $c_{gt}$ is the consistency index between the monitoring result $R_{out}$ and the ground verification data $D_{gt}$, $c_{mm}$ is the consistency index between the outputs of at least two different algorithms, $c_{ts}$ is the temporal consistency index of multi-phase results, and $w_1$, $w_2$, $w_3$ are weighting coefficients subject to a stated constraint [not specified]; when $C_{cons}$ is below the preset consistency threshold or any task node fails to execute, $L_i$ and a description of the inconsistency are fed back to the large language model M, which replans the task execution graph $G_{task}$ to modify at least one parameter set $P_i$ and / or replace at least one tool $Agent_i$, and S3 and S4 are repeated until $C_{cons}$ reaches the preset consistency threshold or the number of iterations K reaches the upper limit $K_{max}$.
2. The method according to claim 1, characterized in that: in step S1, the structured task T further includes a sensor type $S_{sensor}$ and an output product set $O_{out}$, where $S_{sensor}$ characterizes the UAV payload type and $O_{out}$ characterizes the set of output product formats;
and / or, in step S2, the mangrove domain knowledge graph $KG_m$ includes at least an entity type set E and a relation type set Rel, where E includes species entities, invasive alien species entities, remote sensing index entities, environmental factor entities, model entities, and tool entities, and Rel includes at least one of "applicable / not applicable", "depends on", "parameter template", and "synonymous instruction mapping";
and / or, in step S2, the GraphRAG retrieval includes: first performing vector recall based on the structured task T to obtain a candidate entity set, and then performing k-hop graph expansion on the candidate entity set to obtain the subgraph $G_s$, where k is the number of expansion hops;
and / or, in step S2, the task execution graph $G_{task}$ is a directed acyclic graph, and each task node $v_i$ further includes a resource budget $B_i$, which limits the upper bound of computing resources that the node may use.
3. The method according to claim 1, characterized in that: in step S3, the image processing agent performs at least two of radiometric calibration, atmospheric correction, geometric correction, and image registration, and outputs a corrected image in a unified coordinate reference system;
and / or, in step S3, the index set calculated by the remote sensing index calculation agent includes at least two of NDVI, NDWI, NDMI, MVI, and TIMI, and the indices are used as one of the input features of the target detection agent;
and / or, in step S3, the target detection agent includes at least a detection model $M_{inv}$ for identifying invasive alien species, and outputs an invasion range mask $Mask_{inv}$ and an invasion area $A_{inv}$, where $A_{inv}$ is obtained by summing the areas of the pixels covered by $Mask_{inv}$.
4. The method according to claim 1, characterized in that, in step S4, the consistency index $c_{mm}$ is calculated as a spatial overlap index between the outputs of different algorithms, and the spatial overlap index includes at least one of the intersection over union (IoU) and the Dice coefficient.
5. The method according to claim 1, characterized in that the text report generated by the report generation agent includes four types of paragraphs: descriptive explanation, diagnostic explanation, predictive explanation, and strategic explanation, and each type of paragraph is associated with at least one of the call logs $L_i$ to achieve traceability of the analysis process.
6. A multi-agent collaborative UAV remote sensing monitoring system for mangrove forests based on a large language model intelligent hub, characterized in that the system is used to implement the method according to any one of claims 1-5 and comprises:
a natural language interaction module, used to receive the monitoring instruction I and generate the structured task T;
a decision-making hub module, containing the large language model M and used to generate the task execution graph $G_{task}$ in combination with the GraphRAG retrieval results and to replan under trigger conditions;
a domain knowledge base module, including the mangrove domain knowledge graph $KG_m$ and used to output the subgraph $G_s$ and receive parameter write-back;
a task planning and routing module, used to route the nodes of the task execution graph $G_{task}$ to the corresponding tool $Agent_i$;
a tool execution layer, used to execute the tool agents and generate the call logs $L_i$;
a result fusion and verification module, used to calculate the consistency score $C_{cons}$ and determine whether an iteration is triggered;
a result generation and interpretation module, used to output spatial data products and text reports, and to output early warning information in real time when the early warning condition is met.
7. The system according to claim 6, characterized in that: the tool execution layer includes a tool registry, a tool adapter, a resource manager, and an execution engine, wherein the execution engine provides timeout control and error handling;
and / or, the tool adapter encapsulates heterogeneous remote sensing processing libraries into a unified callable interface and writes tool capability descriptions into the mangrove domain knowledge graph $KG_m$;
and / or, the domain knowledge base module also connects to a spatiotemporal database that stores UAV image metadata and ground verification data $D_{gt}$ and supports queries by monitored geographic range R and time range τ;
and / or, the spatial data products output by the result generation and interpretation module include at least GeoTIFF and Shapefile, and the text report is output as PDF.
8. A computer-readable storage medium having a computer program or instructions stored thereon, characterized in that, when the computer program or instructions are executed by a processor, they implement the steps of the method described in any one of claims 1-5.
9. A computer program product, comprising a computer program or instructions, characterized in that, when the computer program or instructions are executed by a processor, they implement the steps of the method described in any one of claims 1-5.