A natural resource investigation and monitoring data aggregation and fusion method and system based on unmanned aerial vehicle remote sensing

By using UAV remote sensing to cross-check recognition results from satellite and video images, and by encrypting and signing the UAV position and trajectory data, the method addresses identification bias in natural resource surveys and supports data aggregation, fusion, and correction, improving the accuracy and reliability of monitoring results.

CN120976697B · Active · Publication Date: 2026-06-26 · NANJING DILI SURVEYING TECHNOLOGY CO LTD +1


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: NANJING DILI SURVEYING TECHNOLOGY CO LTD
Filing Date: 2025-07-28
Publication Date: 2026-06-26

IPC: G06V10/80; G06V20/13; G06V20/17; G06V20/40; G06V10/774; G06V10/776; G06V10/82; G06N3/0455; G06N3/0464; G06N3/096; G06N3/0985

CPC: G06V10/803; G06V20/13; G06V20/17; G06V20/40; G06V10/774; G06V10/776; G06V10/82; G06N3/0455

AI Tagging

Technology Topics

Sensing data; Data aggregator

AI Technical Summary

Technical Problem

Existing natural resource survey methods lack differential monitoring, making it difficult to detect and correct biases in a timely manner. This makes it difficult to guarantee the accuracy and completeness of the results, and increases the risk of misjudgment or omission.

Method used

By acquiring multi-source remote sensing data, using UAV remote sensing technology for data aggregation and fusion, employing a natural resource identification model to independently identify satellite and video images, calculating overlap, boundary matching degree, and area deviation ratio parameters, automatically marking abnormal areas, and performing secondary data acquisition and fusion verification, combined with encryption algorithms and asymmetric signature algorithms to improve data credibility.

Benefits of technology

This will effectively improve the accuracy and completeness of natural resource monitoring results, reduce the risk of misjudgment, and ensure the credibility and accuracy of data identification.


Abstract

The application discloses a natural resource investigation and monitoring data gathering and fusing method and system based on unmanned aerial vehicle remote sensing, and relates to the technical field of resource investigation. The application comprises the following steps: inputting satellite image data SAT and video image data VID into a natural resource recognition model NRM, performing independent recognition on the SAT and VID, and outputting a main recognition layer PRD and an auxiliary recognition layer AUP. The application performs recognition on the satellite image data SAT and the video image data VID and respectively outputs the main recognition layer PRD and the auxiliary recognition layer AUP, performs difference analysis on the two layers through an overlap parameter OVL, a boundary matching degree parameter EDG and an area deviation ratio parameter DIF, labels an abnormal area when any parameter exceeds a preset difference tolerance threshold THR, automatically reacquires multi-source remote sensing data and performs secondary fusion verification, can monitor and correct recognition deviation, effectively improves the accuracy and integrity of the natural resource monitoring result, and reduces the risk of misjudgment.

Description

Technical Field

[0001] This invention relates to the field of resource survey technology, specifically to a method and system for data aggregation and fusion of natural resource survey and monitoring based on unmanned aerial vehicle (UAV) remote sensing.

Background Technology

[0002] Natural resource surveys are a comprehensive undertaking that integrates remote sensing acquisition, ground verification, and spatiotemporal analysis. They combine large-scale observations from satellite remote sensing platforms with high-resolution low-altitude imaging from UAV platforms to acquire multi-source remote sensing data and simultaneously collect ground sample points. The data is then processed in a geographic information system (GIS) for coordinate registration, feature extraction, and attribute correlation analysis. Furthermore, based on temporal change detection algorithms and multi-source data fusion models, quantitative assessments and dynamic monitoring of elements such as forest cover, grassland distribution, water area, land use types, and mineral resources are conducted. Simultaneously, accuracy verification and error correction are performed using field measurement results to ensure the spatial accuracy and statistical reliability of the resource survey results.

[0003] Existing methods lack difference monitoring, making it impossible to detect and correct biases in a timely manner. This makes it difficult to guarantee the accuracy and completeness of the results, and increases the risk of misjudgment or omission.

[0004] To address the aforementioned technical shortcomings, a solution is proposed.

Summary of the Invention

[0005] To address the shortcomings of existing technologies, this invention provides a method and system for data aggregation and fusion of natural resource survey and monitoring based on unmanned aerial vehicle (UAV) remote sensing.

[0006] To achieve the above objectives, the present invention provides the following technical solution: a method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing, comprising:

[0007] S1. Acquire multi-source remote sensing data, which includes satellite image data SAT acquired by a satellite remote sensing platform and video image data VID acquired by a low-altitude UAV platform. During the UAV's aerial photography mission, GPS location information and the flight trajectory TRJ are collected. The GPS and TRJ data are encrypted and then embedded in the metadata structure of the VID file.

[0008] S2. Input satellite image data (SAT) and video image data (VID) into the natural resource identification model (NRM), perform independent identification on SAT and VID, and output the main identification layer PRD and the auxiliary identification layer AUP.

[0009] S3. Calculate the overlap parameter OVL, boundary matching parameter EDG, and area deviation ratio parameter DIF for the main identification layer PRD and the auxiliary identification layer AUP. When any parameter exceeds the preset difference tolerance threshold THR, trigger the anomaly warning flag ALM, mark the abnormal area, and reacquire multi-source remote sensing data for the abnormal area.

[0010] S4. Using SAT data as the primary confidence source MSE and VID data as the secondary confidence source SSE, perform confidence-weighted fusion and output the fused recognition layer.

[0011] Satellite image data (SAT) and video image data (VID) are input into the Natural Resource Identification Model (NRM). The NRM generates pixel-level or object-level high-dimensional semantic codes (ESP) based on the ground feature parameter (FET). The high-dimensional semantic codes (ESP) include category probability distribution (CPD), depth feature vector (DFV), and spatial context relation (SCR).

[0012] The high-dimensional semantic code ESP is passed to the SAT branch and the VID branch respectively. The SAT branch, while preserving high coverage and wide-area information, performs the main recognition task based on CPD and the preset main classification confidence threshold MCT to generate the main recognition layer PRD and calculate the corresponding main boundary confidence PRI. The VID branch uses DFV and SCR to supplement low-altitude high-resolution details and performs detail recognition based on DFV and the preset auxiliary classification confidence threshold ACT to generate the auxiliary recognition layer AUP and calculate the corresponding auxiliary boundary confidence AUX.

[0013] In the SAT branch, each pixel of the input SAT image is mapped to a class probability based on CPD. Pixels in CPD that exceed the preset class confidence threshold THR_CPD are initially classified and labeled by threshold screening. Then, morphological dilation and erosion operations are used to optimize boundary connectivity and a region growing algorithm is applied to expand adjacent pixels of the same type to output the main recognition layer PRD with smooth boundaries and coherent regions.

[0014] In the VID branch, mesoscale feature enhancement processing is performed based on DFV, and spatial relationship maps are constructed in the neighborhood in combination with SCR to distinguish fine textures and small ground features. Then, the boundary deviation is corrected across frames and time through the feature fusion algorithm based on graph convolutional network GCN, generating an auxiliary recognition layer AUP containing high-resolution ground feature outlines and fine structures.

[0015] After synchronously inputting satellite image data (SAT) and video image data (VID) into the Natural Resources Identification Model (NRM) and obtaining the main identification layer (PRD) and auxiliary identification layer (AUP), the system regards SAT data as the main confidence source (MSE) and VID data as the auxiliary confidence source (SSE). Based on the encryption integrity score (ENC), identification consistency score (ACQ), main boundary confidence (PRI), and auxiliary boundary confidence (AUX), a fusion factor is constructed using preset weight coefficients W1 to W6.

[0016] WGT=W1×MSR+W2×SSR+W3×ENC+W4×ACQ+W5×PRI+W6×AUX;

[0017] Among them, MSR represents the global confidence score for satellite identification, SSR represents the detailed confidence score for UAV identification, ENC represents the encryption integrity verification score for GPS and TRJ data, ACQ represents the consistency analysis score for PRD and AUP, and PRI and AUX represent the boundary confidence scores for PRD and AUP, respectively.
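The weighted sum in [0016] can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the `FusionScores` container, the `fusion_factor` function, and the example weights are hypothetical, and all six scores are assumed to be normalized to the range [0, 1].

```python
from dataclasses import dataclass


@dataclass
class FusionScores:
    msr: float  # global confidence score for satellite identification
    ssr: float  # detail confidence score for UAV identification
    enc: float  # encryption integrity verification score (GPS and TRJ)
    acq: float  # consistency analysis score for PRD and AUP
    pri: float  # boundary confidence score of PRD
    aux: float  # boundary confidence score of AUP


def fusion_factor(scores, weights):
    """Return WGT = W1*MSR + W2*SSR + W3*ENC + W4*ACQ + W5*PRI + W6*AUX."""
    if len(weights) != 6:
        raise ValueError("expected six weights W1..W6")
    terms = (scores.msr, scores.ssr, scores.enc,
             scores.acq, scores.pri, scores.aux)
    return sum(w * t for w, t in zip(weights, terms))


# Hypothetical weights that sum to 1; the source only says they are preset.
EXAMPLE_WEIGHTS = (0.25, 0.15, 0.10, 0.20, 0.15, 0.15)
```

With weights that sum to 1 and scores in [0, 1], WGT also stays in [0, 1], which makes it convenient to compare against a single cut-off; the source does not state whether the weights are normalized in this way.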

[0018] Furthermore, during the drone's aerial photography, location information (GPS) and flight trajectory (TRJ) are collected. The GPS and TRJ are initially encrypted using a pre-set symmetric key (KEY) to generate intermediate ciphertext. After generating an initialization vector (IVC) using a true random number generator (TRG), the IVC and the intermediate ciphertext are used as input to call the AES symmetric encryption algorithm to generate ciphertext (CTE). The asymmetric signature algorithm (RSA) uses the private key (SK) to sign the CTE to generate a digital digest (DSG). The DSG and timestamp are then encapsulated into a structured metadata field and inserted into the video image data (VID).

[0019] Furthermore, the configuration and training process of the Natural Resource Recognition (NRM) model includes:

[0020] S11. Preprocess the satellite image data SAT and video image data VID, and divide them into training set TRN, validation set VAL and test set TST according to the ratio.

[0021] S12. Construct a convolutional neural network that integrates the U-Net decoder structure and the ResNet-50 encoder. The pre-trained weights of the ResNet-50 encoder are learned from ImageNet and the first NFR residual blocks are frozen. The U-Net decoder uses stepwise upsampling and skip connections to fuse semantic features from the encoder.

[0022] S13. In the hyperparameter setting stage, the learning rate LRN is set to 1e-4, the batch size BSZ is set to 16, the optimizer OPT adopts the Adam algorithm, and the loss function LOS is a weighted fusion of cross-entropy and Dice coefficients.

[0023] S14. During the training iteration phase, the model is iterated according to the training cycle EPO=30. Every 30 cycles, the average intersection-union ratio (AIO) and loss LOS_VAL are calculated on the VAL set. The learning rate LRN is adjusted using the LRN decay strategy and the early stopping mechanism is enabled.

[0024] Furthermore, after the training period ends, the mean intersection-union ratio (AIO), precision (ACC), and recall (REC) on the validation set (VAL) are evaluated, and a comprehensive score is calculated based on preset weighting coefficients a1, a2, and a3.

[0025]

[0026] After all cycles are completed, the model weight with the highest PER is selected as the optimal weight parameter and deployed to the Natural Resource Identification (NRM) model.

[0027] Furthermore, for the main recognition layer PRD and the auxiliary recognition layer AUP, the number of intersection pixels and the number of union pixels of the two layers are calculated at the pixel level, and the spatial consistency between the layers is quantified by the overlap parameter OVL = number of intersection pixels / number of union pixels.
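As an illustration of the overlap parameter, the sketch below computes OVL on two binary masks. It assumes PRD and AUP have already been co-registered and rasterized to the same grid, represented here as equal-sized nested lists of 0/1; the function name `overlap_ovl` and the convention of returning 1.0 when both masks are empty are assumptions, not part of the source.

```python
def overlap_ovl(prd, aup):
    """OVL = intersection pixel count / union pixel count of two binary masks."""
    intersection = 0
    union = 0
    for row_prd, row_aup in zip(prd, aup):
        for p, a in zip(row_prd, row_aup):
            if p and a:
                intersection += 1
            if p or a:
                union += 1
    # Assumption: two empty layers are treated as fully consistent.
    return intersection / union if union else 1.0


prd_mask = [[1, 1, 0],
            [0, 1, 0]]
aup_mask = [[1, 0, 0],
            [0, 1, 1]]
ovl = overlap_ovl(prd_mask, aup_mask)  # 2 shared pixels out of 4 covered pixels
```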

[0028] Obtain boundary point sets B_PRD and B_AUP for the boundaries of PRD and AUP. Use the average of the bidirectional minimum distances from each boundary point in B_PRD to the nearest boundary point in B_AUP and from each boundary point in B_AUP to the nearest boundary point in B_PRD as the boundary matching degree parameter EDG to evaluate the degree of boundary alignment.
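The boundary matching computation can be sketched as follows. The source does not say whether the two directions are pooled or averaged separately; this sketch assumes EDG is the mean of the two directional averages, uses Euclidean distance in pixel units, and takes the boundary point sets as lists of (row, column) tuples. The function names are hypothetical.

```python
import math


def mean_nearest_distance(source_points, target_points):
    """Average distance from each source point to its nearest target point."""
    total = 0.0
    for p in source_points:
        total += min(math.dist(p, q) for q in target_points)
    return total / len(source_points)


def boundary_match_edg(b_prd, b_aup):
    """Bidirectional average of nearest-boundary distances (smaller is better)."""
    if not b_prd or not b_aup:
        raise ValueError("both boundary point sets must be non-empty")
    forward = mean_nearest_distance(b_prd, b_aup)
    backward = mean_nearest_distance(b_aup, b_prd)
    return 0.5 * (forward + backward)


edg = boundary_match_edg([(0, 0), (0, 2)], [(0, 1)])
```

The brute-force nearest-neighbour search is quadratic in the number of boundary points; for large layers a spatial index would be the usual substitute, which is outside what the source describes.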

[0029] Furthermore, the total number of pixels in the areas covered by PRD and AUP are counted to represent the area S_PRD and S_AUP, respectively, and the difference in area between the two layers is measured in the form of area deviation ratio parameter DIF=|S_PRD-S_AUP| / max(S_PRD,S_AUP).
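A short sketch of the area deviation ratio, under the same assumption of equal-grid binary masks stored as nested lists of 0/1. The function name and the choice to return 0.0 when both layers are empty are illustrative assumptions.

```python
def area_deviation_dif(prd, aup):
    """DIF = |S_PRD - S_AUP| / max(S_PRD, S_AUP), with areas as pixel counts."""
    s_prd = sum(1 for row in prd for value in row if value)
    s_aup = sum(1 for row in aup for value in row if value)
    largest = max(s_prd, s_aup)
    # Assumption: two empty layers have no area deviation.
    return abs(s_prd - s_aup) / largest if largest else 0.0


dif = area_deviation_dif([[1, 1], [1, 1]], [[1, 1], [1, 0]])  # areas 4 and 3
```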

[0030] When any parameter of OVL, EDG, or DIF exceeds the preset difference tolerance threshold THR, the anomaly warning flag ALM is triggered and the anomaly area is automatically marked in the corresponding spatial location layer by highlighting or marking, and multi-source remote sensing data is reacquired for the anomaly area.
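The alarm rule can be sketched as a simple tolerance check. Two points here are assumptions rather than statements from the source: each parameter is given its own tolerance value, and OVL is treated as out of tolerance when it falls below its value (higher overlap means better agreement), whereas EDG and DIF are out of tolerance when they rise above theirs. The threshold numbers and the function name are hypothetical.

```python
def check_anomaly(ovl, edg, dif, thr):
    """Return (ALM flag, list of parameters that are out of tolerance)."""
    violations = []
    if ovl < thr["OVL"]:   # assumed direction: low overlap is anomalous
        violations.append("OVL")
    if edg > thr["EDG"]:   # large boundary distance is anomalous
        violations.append("EDG")
    if dif > thr["DIF"]:   # large area deviation is anomalous
        violations.append("DIF")
    return bool(violations), violations


# Hypothetical tolerances; the source only says THR is preset.
THR = {"OVL": 0.7, "EDG": 3.0, "DIF": 0.2}

alm, reasons = check_anomaly(ovl=0.5, edg=1.0, dif=0.25, thr=THR)
```

When the flag is set, the listed parameters indicate why the region should be marked and queued for reacquisition.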

[0031] A data aggregation and fusion system for natural resource survey and monitoring based on UAV remote sensing, comprising:

[0032] The data acquisition module uses satellite image data (SAT) acquired by the satellite remote sensing platform and video image data (VID) acquired by the low-altitude unmanned aerial vehicle platform.

[0033] The data recognition module has a built-in Natural Resource Recognition Model (NRM) that identifies the collected satellite image data (SAT) and video image data (VID) respectively, and outputs the main recognition layer (PRD) and the auxiliary recognition layer (AUP).

[0034] The anomaly detection module calculates the spatial overlap parameter OVL, boundary matching parameter EDG, and area deviation ratio parameter DIF for the main recognition layer PRD and the auxiliary recognition layer AUP to identify abnormal areas.

[0035] The confidence fusion output module performs confidence-weighted fusion of the main recognition layer PRD and the auxiliary recognition layer AUP, and outputs the fused recognition layer.

[0036] This invention provides a method and system for data aggregation and fusion in natural resource survey and monitoring based on unmanned aerial vehicle (UAV) remote sensing. Compared with existing technologies, it has the following advantages:

[0037] This invention performs recognition on satellite image data (SAT) and video image data (VID), and outputs a main recognition layer (PRD) and an auxiliary recognition layer (AUP) respectively. The two layers are analyzed for differences using the overlap parameter (OVL), boundary matching parameter (EDG), and area deviation ratio parameter (DIF). When any parameter exceeds the preset difference tolerance threshold (THR), the abnormal area is marked. The multi-source remote sensing data is automatically reacquired and a secondary fusion verification is performed. This invention can monitor and correct recognition deviations, effectively improve the accuracy and completeness of natural resource monitoring results, and reduce the risk of misjudgment.

[0038] This invention embeds GPS and TRJ location information processed by AES symmetric encryption algorithm and RSA asymmetric signature algorithm into the metadata structure of UAV video image data VID, and uses the encryption integrity score ENC as a weight factor in the calculation of the fusion factor WGT. When the ENC score is lower than a preset threshold, the data is marked, which effectively improves the data identification reliability of the monitoring system.

Attached Figure Description

[0039] Figure 1 is a schematic diagram of the principle framework of the present invention.

Detailed Implementation

[0040] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0041] Referring to Figure 1, this application provides a method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing, including:

[0042] S1. Acquire multi-source remote sensing data, which includes satellite image data SAT acquired by a satellite remote sensing platform and video image data VID acquired by a low-altitude UAV platform. During the UAV's aerial photography mission, GPS location information and the flight trajectory TRJ are collected. The GPS and TRJ data are encrypted and then embedded in the metadata structure of the VID file.

[0043] S2. Input satellite image data (SAT) and video image data (VID) into the natural resource identification model (NRM), perform independent identification on SAT and VID, and output the main identification layer PRD and the auxiliary identification layer AUP.

[0044] S3. Calculate the overlap parameter OVL, boundary matching parameter EDG, and area deviation ratio parameter DIF for the main recognition layer PRD and the auxiliary recognition layer AUP. When any parameter exceeds the preset difference tolerance threshold THR, trigger the anomaly warning flag ALM and mark the abnormal area.

[0045] S4. Using SAT data as the primary confidence source MSE and VID data as the secondary confidence source SSE, perform confidence-weighted fusion and output the fused recognition layer.

[0046] During the drone's aerial photography, location information (GPS) and flight trajectory (TRJ) are collected. The GPS and TRJ data are encrypted with a preset symmetric key (KEY) and an initialization vector (IVC) generated by a true random number generator (TRG): the AES symmetric encryption algorithm is called to generate the ciphertext (CTE). The asymmetric signature algorithm (RSA) then uses the private key (SK) to sign the CTE and generate a digital digest (DSG). The DSG and a timestamp are encapsulated into a structured metadata field and inserted into the video image data (VID).

[0047] During the drone's aerial photography mission, this embodiment first collects location information (GPS) and flight trajectory (TRJ) via the built-in GNSS and sends them to the encryption processing unit (ENC). The ENC loads a preset symmetric key (KEY, e.g., a 128-bit AES key) and performs preliminary encryption on the GPS and TRJ data, generating intermediate ciphertext. The true random number generator (TRG) then generates a 128-bit initialization vector (IVC) based on environmental noise and the system clock. The IVC and the intermediate ciphertext are used as input to the AES-256-CBC symmetric encryption algorithm for a second encryption, which outputs the final ciphertext (CTE). Next, the asymmetric signature unit loads the persistently stored RSA private key (SK, e.g., 2048 bits) and performs a PKCS#1 v1.5 signature operation on the CTE to generate a digital digest (DSG). The DSG and a timestamp (TS, format YYYYMMDDhhmmssSSS) are encapsulated into a structured metadata field 'EncryptionSignature' of the form {"DSG":DSG, "TS":TS}. Finally, the system controller calls the video metadata writing interface to embed this structured field into the EXIF metadata segment of the video image data VID. After the write completes, a verification loop checks the CRC value of the EXIF segment to confirm that the write is correct and untampered. On failure, the abort-or-retry strategy makes at least three write attempts or retransmits through the ground station, thereby achieving end-to-end encryption protection, signature verification, and traceability management of the UAV GPS and TRJ data.

[0048] Beneficial effects: By embedding GPS and TRJ location information processed by AES symmetric encryption algorithm and RSA asymmetric signature algorithm into the metadata structure of UAV video image data VID, and using the encryption integrity score ENC as a weighting factor in the calculation of the fusion factor WGT, the data is marked when the ENC score is lower than a preset threshold, which effectively improves the data recognition credibility of the monitoring system and ensures the accuracy of the fusion recognition layer.

[0049] The configuration and training process of the Natural Resource Recognition (NRM) model includes:

[0050] S11. Preprocess the satellite image data SAT and video image data VID, and divide them into training set TRN, validation set VAL and test set TST according to the ratio.

[0051] S12. Construct a convolutional neural network that integrates the U-Net decoder structure and the ResNet-50 encoder. The pre-trained weights of the ResNet-50 encoder are learned from ImageNet and the first NFR residual blocks are frozen. The U-Net decoder uses stepwise upsampling and skip connections to fuse semantic features from the encoder.

[0052] S13. In the hyperparameter setting stage, the learning rate LRN is set to 1e-4, the batch size BSZ is set to 16, the optimizer OPT adopts the Adam algorithm, and the loss function LOS is a weighted fusion of cross-entropy and Dice coefficients.

[0053] S14. During the training iteration phase, the model is iterated according to the training cycle EPO=30. Every 30 cycles, the average intersection-union ratio (AIO) and loss LOS_VAL are calculated on the VAL set. The learning rate LRN is adjusted using the LRN decay strategy and the early stopping mechanism is enabled.

[0054] After the training period ends, the mean intersection-union ratio (AIO), precision (ACC), and recall (REC) on the validation set (VAL) are evaluated, and a comprehensive score is calculated based on preset weighting coefficients a1, a2, and a3.

[0055]

[0056] After all cycles are completed, the model weight with the highest PER is selected as the optimal weight parameter and deployed to the Natural Resource Identification (NRM) model.
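The formula for the comprehensive score is not reproduced in the text above. The sketch below assumes, as one plausible reading of "preset weighting coefficients a1, a2, and a3", that PER is the linear weighted sum a1×AIO + a2×ACC + a3×REC; that form, the example weights, and the checkpoint records are all hypothetical.

```python
def comprehensive_score(aio, acc, rec, weights):
    """Assumed form: PER = a1*AIO + a2*ACC + a3*REC."""
    a1, a2, a3 = weights
    return a1 * aio + a2 * acc + a3 * rec


def select_best_checkpoint(checkpoints, weights):
    """Return the checkpoint record with the highest PER."""
    return max(
        checkpoints,
        key=lambda c: comprehensive_score(c["AIO"], c["ACC"], c["REC"], weights),
    )


# Hypothetical validation results for two saved checkpoints.
candidates = [
    {"name": "epoch_10", "AIO": 0.70, "ACC": 0.90, "REC": 0.80},
    {"name": "epoch_20", "AIO": 0.75, "ACC": 0.88, "REC": 0.85},
]
best = select_best_checkpoint(candidates, weights=(0.5, 0.25, 0.25))
```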

[0057] It is worth noting that, in order to achieve efficient configuration and training of the Natural Resource Identification Model (NRM), the collected satellite image data (SAT) and video image data (VID) are first preprocessed in step S11, including denoising, geometric correction, film effect restoration and color enhancement. Histogram equalization and Gaussian filtering are then used to enhance the image details. Subsequently, the data is divided into a training set (TRN), a validation set (VAL) and a test set (TST) in a ratio of 7:2:1.
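The 7:2:1 division can be sketched as below. Shuffling with a fixed seed before splitting is an assumption (the source gives only the ratio), and `split_dataset` is a hypothetical helper name.

```python
import random


def split_dataset(samples, ratios=(7, 2, 1), seed=0):
    """Shuffle samples and split them into TRN, VAL, TST by the given ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # assumed: random split, reproducible seed
    total = sum(ratios)
    n_trn = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    trn = items[:n_trn]
    val = items[n_trn:n_trn + n_val]
    tst = items[n_trn + n_val:]  # remainder goes to the test set
    return trn, val, tst


trn, val, tst = split_dataset(range(100))
```

For imagery, a purely random split can leak information when neighbouring tiles or adjacent video frames land in different sets; splitting by scene or flight is a common alternative, though the source does not discuss it.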

[0058] Next, in step S12, a convolutional neural network combining a U-Net decoder with a ResNet-50 encoder is constructed on the PyTorch deep learning framework. The ResNet-50 encoder loads ImageNet pre-trained weights and freezes the first NFR=3 residual blocks to retain low-level general features; only residual block 4 and subsequent layers are fine-tuned. Batch normalization (BatchNorm) and Dropout (p=0.3) are added after each residual group to prevent overfitting. The U-Net decoder adopts four-level progressive upsampling and fuses encoder semantic features at different scales through skip connections to restore spatial resolution for object recognition.

[0059] Then, in step S13, during the hyperparameter setting phase, the initial learning rate LRN is set to 1×10⁻⁴ and the cosine annealing learning rate scheduler is enabled. The batch size BSZ is set to 16, the optimizer OPT adopts the Adam algorithm (β1=0.9, β2=0.999) with weighted L2 regularization (weight decay of 0.0001), and the loss function LOS is defined as a weighted fusion of cross-entropy and Dice coefficients (weight ratio 1:1). In addition, spatial attention (SAM) is added to each layer of the decoder to enhance sensitivity to boundaries and small features. Then, in the training iteration phase of step S14, EPO=30 training cycles are executed. After every 10 cycles, the average intersection-union ratio AIO and the validation loss LOS_VAL are calculated on the VAL set. The early stopping condition is triggered when AIO does not improve for 5 consecutive cycles, and LRN is decayed by a factor of 0.5 after each VAL evaluation. During training, GPU utilization and memory usage are monitored, and BSZ is adjusted as necessary to avoid out-of-memory (OOM) errors.
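To make the 1:1 loss fusion concrete, the sketch below evaluates a combined cross-entropy and Dice loss on flat lists of values. It is a simplified, framework-free illustration, not the training code: it assumes a binary (single-class) case, a plain weighted sum of the two terms, and a Dice smoothing constant of 1.0, none of which are specified in the source.

```python
import math


def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels; probs are foreground probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """1 - Dice coefficient, computed on soft predictions."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, w_ce=1.0, w_dice=1.0):
    """LOS as a weighted sum of cross-entropy and Dice loss (1:1 by default)."""
    return w_ce * cross_entropy(probs, targets) + w_dice * dice_loss(probs, targets)


perfect = combined_loss([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
uncertain = combined_loss([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

Cross-entropy penalizes per-pixel errors while the Dice term measures region overlap, which is the usual motivation for combining them when foreground classes are small relative to the background.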

[0060] Satellite image data (SAT) and video image data (VID) are input into the Natural Resource Identification Model (NRM). The NRM generates pixel-level or object-level high-dimensional semantic codes (ESP) based on the ground feature parameter (FET). The high-dimensional semantic codes (ESP) include category probability distribution (CPD), depth feature vector (DFV), and spatial context relation (SCR).

[0061] The high-dimensional semantic code ESP is passed to the SAT branch and the VID branch respectively. The SAT branch performs the main recognition task based on CPD and the preset main classification confidence threshold MCT to generate the main recognition layer PRD and calculate the corresponding main boundary confidence PRI while retaining the high coverage and wide-area information. The VID branch uses DFV and SCR to supplement low-altitude high-resolution details and performs detail recognition based on DFV and the preset auxiliary classification confidence threshold ACT to generate the auxiliary recognition layer AUP and calculate the corresponding auxiliary boundary confidence AUX.

[0062] In the SAT branch, each pixel of the input SAT image is mapped to a class probability based on CPD. Pixels in CPD that exceed the preset class confidence threshold THR_CPD are initially classified and labeled by threshold screening. Then, morphological dilation and erosion operations are used to optimize boundary connectivity and a region growing algorithm is applied to expand adjacent pixels of the same type to output the main recognition layer PRD with smooth boundaries and coherent regions.

[0063] In the VID branch, mesoscale feature enhancement processing is performed based on DFV, and spatial relationship maps are constructed in the neighborhood in combination with SCR to distinguish between fine textures and small ground features. Then, the boundary deviation is corrected across frames and time using a feature fusion algorithm based on graph convolutional network GCN, generating an auxiliary recognition layer (AUP) containing high-resolution ground feature outlines and fine structures.

[0064] It is worth noting that NRM generates pixel-level or object-level high-dimensional semantic encoding ESP based on ground feature parameters FET (such as spectral response, texture statistics, and elevation information) through multi-layer convolution and fully connected operations. This ESP includes class probability distribution CPD (confidence probability vector for each class per pixel), depth feature vector DFV (256-dimensional feature representation for each pixel or object), and spatial context relation SCR (pixel / object adjacency matrix constructed based on graph convolution). Then, the ESP is passed to the SAT branch and VID branch for differential processing. In the SAT branch, the model first performs threshold screening on the CPD, selecting pixels in the CPD that are greater than the preset main classification confidence threshold MCT (e.g., 0.8) for preliminary classification and labeling. Then, the boundary connectivity is optimized through 5×5 morphological dilation and 3×3 erosion operations. Then, the connected region expansion method based on the region growing algorithm is called to merge adjacent pixels of the same type. The main recognition layer PRD with smooth boundaries and coherent spatial distribution is output in the fusion layer. At the same time, the main boundary confidence PRI_CONF is calculated in the boundary extraction unit.
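The threshold screening and morphological steps of the SAT branch can be sketched without any imaging library. This is an illustrative simplification: it handles a single class, takes the CPD as a 2D grid of per-pixel probabilities, uses square structuring elements, and clips windows at the image border; the region growing step is omitted. The function names are hypothetical.

```python
def threshold_mask(cpd, mct=0.8):
    """Keep pixels whose class probability exceeds the confidence threshold MCT."""
    return [[1 if p > mct else 0 for p in row] for row in cpd]


def _morph(mask, size, op):
    """Apply op (max for dilation, min for erosion) over a size x size window."""
    height, width = len(mask), len(mask[0])
    k = size // 2
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [
                mask[rr][cc]
                for rr in range(max(0, r - k), min(height, r + k + 1))
                for cc in range(max(0, c - k), min(width, c + k + 1))
            ]
            out[r][c] = op(window)
    return out


def sat_branch_mask(cpd, mct=0.8):
    """Threshold, then 5x5 dilation followed by 3x3 erosion, as in the text."""
    mask = threshold_mask(cpd, mct)
    mask = _morph(mask, 5, max)  # dilation closes small gaps between regions
    mask = _morph(mask, 3, min)  # erosion trims part of the dilated margin
    return mask


# A 7x7 probability grid with one confident pixel in the centre.
cpd_grid = [[0.1] * 7 for _ in range(7)]
cpd_grid[3][3] = 0.9
result = sat_branch_mask(cpd_grid)
```

Because the dilation window is larger than the erosion window, an isolated pixel ends up as a 3×3 block rather than returning to its original size, so the asymmetric kernel sizes in the text bias the result toward connected, slightly enlarged regions.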

[0065] In the VID branch, the model first uses DFV as a basis to perform multi-scale convolution and attention mechanisms through mesoscale feature enhancement units to highlight texture and small ground features. It then combines SCR to construct a spatial relationship map in the neighborhood to capture the topological association between targets. Subsequently, it performs temporal cross-frame boundary correction through a three-layer graph convolutional network (GCN) and outputs an auxiliary recognition layer AUP containing high-resolution ground feature outlines and calculates the auxiliary boundary confidence AUX_CONF.

[0066] For the main recognition layer PRD and the auxiliary recognition layer AUP, the number of intersection pixels and the number of union pixels of the two layers are calculated at the pixel level, and the spatial consistency between the layers is quantified by the overlap parameter OVL = number of intersection pixels / number of union pixels.

[0067] Obtain boundary point sets B_PRD and B_AUP for the boundaries of PRD and AUP. Use the average of the bidirectional minimum distances from each boundary point in B_PRD to the nearest boundary point in B_AUP and from each boundary point in B_AUP to the nearest boundary point in B_PRD as the boundary matching degree parameter EDG to evaluate the degree of boundary alignment.

[0068] The area S_PRD and S_AUP are represented by the total number of pixels in the regions covered by PRD and AUP, respectively. The difference in area between the two layers is measured by the area deviation ratio parameter DIF = |S_PRD-S_AUP| / max(S_PRD,S_AUP).

[0069] When any parameter of OVL, EDG, or DIF exceeds the preset difference tolerance threshold THR, the anomaly warning flag ALM is triggered and the anomaly area is automatically marked in the corresponding spatial location layer by highlighting or marking, and multi-source remote sensing data is reacquired for the anomaly area.

[0070] After synchronously inputting satellite image data (SAT) and video image data (VID) into the Natural Resources Identification Model (NRM) and obtaining the main identification layer (PRD) and auxiliary identification layer (AUP), the system regards SAT data as the main confidence source (MSE) and VID data as the auxiliary confidence source (SSE). Based on the encryption integrity score (ENC), identification consistency score (ACQ), main boundary confidence (PRI), and auxiliary boundary confidence (AUX), a fusion factor is constructed using preset weight coefficients W1 to W6.

[0071] WGT = W1×MSR + W2×SSR + W3×ENC + W4×ACQ + W5×PRI + W6×AUX. Here PRD and AUP are mapped to the two confidence sources. The global confidence score (MSR), which corresponds to PRI, is obtained by averaging the CPD of each pixel in the main recognition layer. The detail confidence score (SSR), which corresponds to AUX, is calculated by applying the DFV consistency scoring function to each object in the auxiliary recognition layer. The confidence fusion unit then obtains the encryption integrity score (ENC) of the GPS and TRJ data by checking the consistency of the signature verification results for the DSG embedded in the VID metadata. Finally, the consistency analysis unit calculates the consistency analysis score (ACQ) of PRD and AUP from a weighted scoring function of the three parameters OVL, EDG, and DIF.
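
The fusion factor is a plain weighted sum, which the following sketch evaluates. The weight and score values are hypothetical, and the choice of weights that sum to 1 is an assumption; the source only states that W1 to W6 are preset.

```python
# Order of terms follows the formula: W1..W6 pair with these six scores.
SCORE_NAMES = ("MSR", "SSR", "ENC", "ACQ", "PRI", "AUX")


def fusion_factor(scores, weights):
    """WGT = W1*MSR + W2*SSR + W3*ENC + W4*ACQ + W5*PRI + W6*AUX."""
    if len(weights) != len(SCORE_NAMES):
        raise ValueError("expected six weights W1..W6")
    return sum(w * scores[name] for w, name in zip(weights, SCORE_NAMES))


# Hypothetical preset weights W1..W6 (assumed to sum to 1) and example scores.
weights = (0.25, 0.15, 0.10, 0.20, 0.15, 0.15)
scores = {"MSR": 0.90, "SSR": 0.80, "ENC": 1.00,
          "ACQ": 0.70, "PRI": 0.85, "AUX": 0.75}
wgt = fusion_factor(scores, weights)  # about 0.825 for these values
```

With all scores in the range 0 to 1 and weights summing to 1, WGT also stays in that range, which keeps it comparable across survey areas.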

[0072] Beneficial effects: This invention performs recognition on satellite image data (SAT) and video image data (VID) and outputs a main recognition layer (PRD) and an auxiliary recognition layer (AUP) respectively. The two layers are analyzed for differences using the overlap parameter (OVL), boundary matching parameter (EDG), and area deviation ratio parameter (DIF). When any parameter exceeds the preset difference tolerance threshold (THR), the abnormal area is marked. Multi-source remote sensing data is automatically reacquired and secondary fusion verification is performed. This invention can monitor and correct recognition deviations, effectively improve the accuracy and completeness of natural resource monitoring results, and reduce the risk of misjudgment.

[0073] A data aggregation and fusion system for natural resource survey and monitoring based on UAV remote sensing, comprising:

[0074] The data acquisition module acquires satellite image data (SAT) from the satellite remote sensing platform and video image data (VID) from the low-altitude unmanned aerial vehicle platform.

[0075] The data recognition module has a built-in Natural Resource Recognition Model (NRM) that identifies the collected satellite image data (SAT) and video image data (VID) respectively, and outputs the main recognition layer (PRD) and the auxiliary recognition layer (AUP).

[0076] The anomaly detection module calculates the spatial overlap parameter OVL, boundary matching parameter EDG, and area deviation ratio parameter DIF for the main recognition layer PRD and the auxiliary recognition layer AUP to identify abnormal areas.

[0077] The confidence fusion output module performs confidence-weighted fusion of the main recognition layer PRD and the auxiliary recognition layer AUP, and outputs the fused recognition layer.

[0078] The above embodiments can be implemented, in whole or in part, by software, hardware, firmware, or any combination thereof. When implemented in software, the above embodiments can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are produced. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via a wired or wireless network. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that includes one or more sets of available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state drive).

[0079] Some of the quantities in the above formulas are computed as dimensionless numerical values. Content not described in detail in this specification is prior art known to those skilled in the art.

[0080] The above embodiments are only used to illustrate the technical methods of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical methods of the present invention without departing from the spirit and scope of the technical methods of the present invention.

Claims

1. A method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing, characterized in that it comprises:
S1. Acquire multi-source remote sensing data, which includes satellite image data SAT acquired by a satellite remote sensing platform and video image data VID acquired by a low-altitude UAV platform. During the UAV's aerial photography mission, GPS location information and the TRJ flight trajectory are collected. After an encryption algorithm is applied to the GPS and TRJ data, the data is embedded in the metadata structure of the VID file.
S2. Input the satellite image data (SAT) and video image data (VID) into the natural resource identification model (NRM), perform independent identification on SAT and VID, and output the main identification layer PRD and the auxiliary identification layer AUP.
S3. Calculate the overlap parameter OVL, boundary matching parameter EDG, and area deviation ratio parameter DIF for the main identification layer PRD and the auxiliary identification layer AUP. When any parameter exceeds the preset difference tolerance threshold THR, trigger the abnormal indication flag ALM, mark the abnormal area, and reacquire multi-source remote sensing data for the abnormal area.
S4. Using SAT data as the primary confidence source MSE and VID data as the secondary confidence source SSE, perform confidence-weighted fusion and output the fused recognition layer.
Satellite image data (SAT) and video image data (VID) are input into the Natural Resource Identification Model (NRM). The NRM generates pixel-level or object-level high-dimensional semantic codes (ESP) based on the ground feature parameter (FET). The high-dimensional semantic codes (ESP) include the category probability distribution (CPD), depth feature vector (DFV), and spatial context relation (SCR). The high-dimensional semantic code ESP is passed to the SAT branch and the VID branch respectively.
The SAT branch, while preserving high coverage and wide-area information, performs the main recognition task based on CPD and the preset main classification confidence threshold MCT to generate the main recognition layer PRD and calculate the corresponding main boundary confidence PRI. The VID branch uses DFV and SCR to supplement low-altitude high-resolution details and performs detail recognition based on DFV and the preset auxiliary classification confidence threshold ACT to generate the auxiliary recognition layer AUP and calculate the corresponding auxiliary boundary confidence AUX.
In the SAT branch, each pixel of the input SAT image is mapped to a class probability based on CPD. Pixels whose CPD exceeds the preset class confidence threshold THR_CPD are initially classified and labeled by threshold screening. Morphological dilation and erosion operations are then used to optimize boundary connectivity, and a region growing algorithm is applied to expand adjacent pixels of the same type, so as to output the main recognition layer PRD with smooth boundaries and coherent regions.
In the VID branch, mesoscale feature enhancement processing is performed based on DFV, and spatial relationship graphs are constructed in the neighborhood in combination with SCR to distinguish fine textures and small ground features. The boundary deviation is then corrected across frames and over time through the feature fusion algorithm based on the graph convolutional network GCN, generating an auxiliary recognition layer AUP containing high-resolution ground feature outlines and fine structures.
After the satellite image data (SAT) and video image data (VID) have been synchronously input into the Natural Resources Identification Model (NRM) and the main identification layer (PRD) and auxiliary identification layer (AUP) have been obtained, the system treats SAT data as the main confidence source (MSE) and VID data as the auxiliary confidence source (SSE). Based on the encryption integrity score (ENC), identification consistency score (ACQ), main boundary confidence (PRI), and auxiliary boundary confidence (AUX), a fusion factor is constructed using preset weight coefficients W1 to W6:
WGT = W1×MSR + W2×SSR + W3×ENC + W4×ACQ + W5×PRI + W6×AUX;
where MSR represents the global confidence score for satellite identification, SSR represents the detail confidence score for UAV identification, ENC represents the encryption integrity verification score for the GPS and TRJ data, ACQ represents the consistency analysis score for PRD and AUP, and PRI and AUX represent the boundary confidence scores for PRD and AUP, respectively.

2. The method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing according to claim 1, characterized in that: during UAV aerial photography, location information (GPS) and flight trajectory (TRJ) are collected. The GPS and TRJ data are initially encrypted using a preset symmetric key (KEY) to generate an intermediate ciphertext. After an initialization vector (IVC) is generated using a true random number generator (TRG), the IVC and the intermediate ciphertext are used as input to the AES symmetric encryption algorithm to generate the ciphertext (CTE). The asymmetric signature algorithm (RSA) uses the private key (SK) to sign the CTE and generate a digital digest (DSG). The DSG and a timestamp are then encapsulated into a structured metadata field and inserted into the video image data (VID).

3. The method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing according to claim 1, characterized in that the configuration and training process of the Natural Resource Recognition (NRM) model includes:
S11. Preprocess the satellite image data SAT and video image data VID, and divide them into training set TRN, validation set VAL and test set TST according to the ratio.
S12. Construct a convolutional neural network that integrates the U-Net decoder structure and the ResNet-50 encoder. The pre-trained weights of the ResNet-50 encoder are learned from ImageNet and the first NFR residual blocks are frozen. The U-Net decoder uses stepwise upsampling and skip connections to fuse semantic features from the encoder.
S13. In the hyperparameter setting stage, the learning rate LRN is set to 1e-4, the batch size BSZ is set to 16, the optimizer OPT adopts the Adam algorithm, and the loss function LOS is a weighted fusion of cross-entropy and Dice coefficients.
S14. During the training iteration phase, the model is iterated according to the training cycle EPO=30. Every 30 cycles, the average intersection-union ratio (AIO) and loss LOS_VAL are calculated on the VAL set. The learning rate LRN is adjusted using the LRN decay strategy and the early stopping mechanism is enabled.

4. The method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing according to claim 1, characterized in that: after the training period ends, the mean intersection-union ratio (AIO), precision (ACC), and recall (REC) on the validation set (VAL) are evaluated, and a comprehensive score PER is calculated based on preset weighting coefficients a1, a2, and a3. After all cycles are completed, the model weights with the highest PER are selected as the optimal weight parameters and deployed to the Natural Resource Identification (NRM) model.

5. The method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing according to claim 1, characterized in that: for the main recognition layer PRD and the auxiliary recognition layer AUP, the number of intersection pixels and the number of union pixels of the two layers are calculated at the pixel level, and the spatial consistency between the layers is quantified by the overlap parameter OVL = number of intersection pixels / number of union pixels. Boundary point sets B_PRD and B_AUP are obtained for the boundaries of PRD and AUP. The average of the bidirectional minimum distances, from each boundary point in B_PRD to the nearest boundary point in B_AUP and from each boundary point in B_AUP to the nearest boundary point in B_PRD, is used as the boundary matching degree parameter EDG to evaluate the degree of boundary alignment.

6. The method for data aggregation and fusion of natural resource survey and monitoring based on UAV remote sensing according to claim 1, characterized in that: the areas S_PRD and S_AUP are represented by the total number of pixels in the regions covered by PRD and AUP, respectively. The difference in area between the two layers is measured by the area deviation ratio parameter DIF = |S_PRD-S_AUP| / max(S_PRD,S_AUP). When any of the parameters OVL, EDG, or DIF exceeds the preset difference tolerance threshold THR, the anomaly warning flag ALM is triggered, the anomalous area is automatically marked by highlighting or annotation in the layer at the corresponding spatial location, and multi-source remote sensing data is reacquired for that area.

7. A data aggregation and fusion system for natural resource survey and monitoring based on unmanned aerial vehicle (UAV) remote sensing, for implementing the method according to any one of claims 1-6, characterized in that it comprises: a data acquisition module that acquires satellite image data (SAT) from the satellite remote sensing platform and video image data (VID) from the low-altitude unmanned aerial vehicle platform; a data recognition module with a built-in Natural Resource Recognition Model (NRM) that identifies the collected satellite image data (SAT) and video image data (VID) respectively, and outputs the main recognition layer (PRD) and the auxiliary recognition layer (AUP); an anomaly detection module that calculates the spatial overlap parameter OVL, boundary matching parameter EDG, and area deviation ratio parameter DIF for the main recognition layer PRD and the auxiliary recognition layer AUP to identify abnormal areas; and a confidence fusion output module that performs confidence-weighted fusion of the main recognition layer PRD and the auxiliary recognition layer AUP, and outputs the fused recognition layer.