Remote sensing image processing method, device, equipment, medium and product
By combining feature pyramid structure and spatial refinement module in remote sensing image processing, the accuracy problem of small target feature extraction and fusion in remote sensing agricultural images under complex backgrounds is solved, improving the accuracy and clarity of farmland feature maps.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INDUSTRIAL AND COMMERCIAL BANK OF CHINA
- Filing Date
- 2023-03-27
- Publication Date
- 2026-06-19
Figure CN116310832B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of financial technology or other related fields, and in particular to a remote sensing image processing method, apparatus, device, medium and product.
Background Technology
[0002] With the rapid development and application of computer vision and satellite technology, the combination of satellite remote sensing image data and target detection technology can significantly reduce costs for monitoring farmland information, providing high efficiency and real-time monitoring. In terms of farmland yield forecasting, the rapid and accurate acquisition of crop and spatial distribution information in a given area, enabling accurate crop yield predictions, is crucial for agricultural production safety early warning, industrial structure optimization, and the sales and distribution of agricultural products. In the financial sector, such as banking, real-time monitoring of agricultural yields is also necessary to assist in providing customer profiles, verifying the authenticity of application information, and understanding changes in farmland information and the growth status of agricultural products.
[0003] Current methods for agricultural remote sensing image recognition generally employ traditional or deep learning approaches, neither of which considers the feature extraction and fusion issues of small-scale farmland in complex environments. In complex backgrounds, remote sensing agricultural images contain numerous small targets with limited pixel information, which are easily lost during multiple feature sampling processes, resulting in low accuracy of the farmland feature maps extracted from agricultural remote sensing images.
Summary of the Invention
[0004] This application provides a remote sensing image processing method, apparatus, device, medium, and product to solve the problem that current remote sensing agricultural images in complex backgrounds have many small targets and little pixel information, which are easily lost during multiple feature sampling processes, thus affecting the accuracy of farmland feature maps extracted from agricultural remote sensing images.
[0005] The first aspect of this application provides a remote sensing image processing method, including:
[0006] Based on a preset feature processing model, feature extraction and feature fusion are performed on the agricultural remote sensing image to be processed to generate corresponding deep feature maps and shallow feature maps; the preset feature processing model includes a feature pyramid structure with spatial refinement modules embedded in the structure of adjacent layers from top to bottom.
[0007] Texture features are extracted from the deep feature map and the shallow feature map to generate deep texture features and shallow texture features;
[0008] The image segmentation scale is determined based on the deep texture features and the shallow texture features;
[0009] The deep feature map and the shallow feature map are segmented according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions;
[0010] The deep farmland feature regions and the shallow farmland feature regions are merged to generate a farmland feature map.
[0011] Furthermore, in the method described above, the preset feature processing model includes: a preset feature extraction network and a feature pyramid structure;
[0012] The process involves feature extraction and feature fusion of the agricultural remote sensing image to be processed based on a preset feature processing model, generating corresponding deep and shallow feature maps, including:
[0013] The agricultural remote sensing image to be processed is input into a preset feature extraction network to generate multiple feature images from bottom to top.
[0014] The feature images are fused based on the feature pyramid structure to generate corresponding deep feature maps and shallow feature maps.
[0015] Furthermore, in the method described above, the fusion of the feature images based on the feature pyramid structure to generate corresponding deep feature maps and shallow feature maps includes:
[0016] Determine deep feature maps based on the feature pyramid structure and the feature image of the top layer;
[0017] The deep feature map is selected as the current feature map, and the following steps are repeated until a shallow feature map is generated:
[0018] The current feature map is upsampled using a spatial refinement module to obtain an upsampled feature map and the corresponding upsampling offset.
[0019] A differentiable bilinear sampling mechanism is used to perform global information acquisition and processing on the current feature map to generate global information corresponding to the current feature map;
[0020] A spatial refinement module is used to fuse the upsampled feature map with the feature image at the same layer based on the upsampling offset and the global information to obtain the next layer feature map corresponding to the current feature map in the top-down structure; the feature image at the same layer is a feature image with the same dimension as the upsampled feature map;
[0021] The next layer feature map is selected as the current feature map.
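The control flow of the loop in paragraphs [0016] to [0021] can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the model of this application: `upsample2x` (nearest-neighbour upsampling) and `fuse` (element-wise addition) are hypothetical stand-ins for the spatial refinement module, the upsampling offset and global information are omitted, and all layers are assumed to share the same channel count.

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array (stand-in only)."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def fuse(upsampled, same_layer):
    """Hypothetical fusion: element-wise sum of two maps with equal shape."""
    assert upsampled.shape == same_layer.shape
    return upsampled + same_layer

def top_down(feature_images):
    """feature_images: bottom-up list, highest resolution first, top layer last.

    Returns (deep_feature_map, shallow_feature_map).
    """
    deep = feature_images[-1]           # the deep feature map comes from the top layer
    current = deep
    # Walk from the layer just below the top down to the bottom layer.
    for same_layer in reversed(feature_images[:-1]):
        current = fuse(upsample2x(current), same_layer)
    return deep, current                # the last fused map is the shallow feature map

# Toy pyramid with three layers of constant value 1.
pyramid = [np.ones((4, 32, 32)), np.ones((4, 16, 16)), np.ones((4, 8, 8))]
deep_map, shallow_map = top_down(pyramid)
```

In this toy run the deep feature map keeps the top layer's 8 x 8 resolution, while the shallow feature map has the bottom layer's 32 x 32 resolution and accumulates one contribution per layer.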
[0022] Further, in the method described above, determining the image segmentation scale based on the deep texture features and the shallow texture features includes:
[0023] The K-means clustering algorithm is used to cluster the shallow feature map based on the deep texture features and the shallow texture features to generate shallow homogeneous regions; the shallow homogeneous regions are the regions with small pixel color differences on the shallow feature map;
[0024] The image segmentation scale is determined based on the shallow homogeneous region and the deep feature map.
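The clustering step in paragraph [0023] can be illustrated with a plain K-means pass over per-pixel texture features. This is a sketch under assumptions: the function name `kmeans_labels`, the deterministic initialisation, and the choice of stacking one shallow and one deep texture value per pixel are illustrative and are not specified by this application; the deep texture map is assumed to have been resized to the shallow map's grid beforehand.

```python
import numpy as np

def kmeans_labels(features, k, n_iter=20):
    """Plain Lloyd's k-means on an (N, D) float array; returns (N,) labels."""
    # Deterministic initialisation (an assumption): take samples evenly
    # spaced along the ordering of the summed feature values.
    order = np.argsort(features.sum(axis=1))
    centers = features[order[np.linspace(0, len(features) - 1, k).astype(int)]]
    for _ in range(n_iter):
        # Assign every sample to its nearest center (squared Euclidean distance).
        dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

# Toy texture maps: the left half and right half differ in texture value.
h, w = 8, 8
shallow_tex = np.zeros((h, w))
shallow_tex[:, w // 2:] = 1.0
deep_tex = shallow_tex * 0.5            # assumed already resized to the shallow grid
pixel_features = np.stack([shallow_tex.ravel(), deep_tex.ravel()], axis=1)
region_map = kmeans_labels(pixel_features, k=2).reshape(h, w)
```

Each connected group of pixels sharing a label in `region_map` then plays the role of a shallow homogeneous region.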
[0025] Furthermore, in the method described above, the image segmentation scale includes a spatial segmentation scale, a spectral segmentation scale, and a texture segmentation scale;
[0026] Determining the image segmentation scale based on the shallow homogeneous region and the deep feature map includes:
[0027] The size of the target selection window in the shallow homogeneous region and the deep feature map is iteratively increased to generate the current iteration size corresponding to the current iteration number;
[0028] Determine the mean and local variance of all pixels within the window region; the window region is a local region of the shallow homogeneous region and a local region of the deep feature map selected in the target selection window corresponding to the current iteration size;
[0029] Determine the first-order and second-order rates of change of the mean between the current iteration number and the previous iteration number;
[0030] If the first-order rate of change is less than the preset first-order rate of change threshold and the second-order rate of change is less than the preset second-order rate of change threshold, then the iteration ends and the current iteration size at the end of the iteration is input into the preset size scale relationship algorithm to generate the corresponding spatial segmentation scale.
[0031] The local variance of each pixel in the target selection window at the end of the iteration is input into the preset scale determination algorithm to generate the corresponding spectral segmentation scale and texture segmentation scale.
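The stopping rule in paragraphs [0027] to [0030] can be sketched as a window that grows until its mean stabilises. The exact definitions of the first-order and second-order rates of change are not given in this passage, so the sketch assumes the relative change of the mean and the change of that rate between iterations; the function name, step size, and thresholds are likewise hypothetical. It operates on a single 2-D array and does not implement the preset size scale relationship algorithm or the preset scale determination algorithm, which would consume the returned size and variance.

```python
import numpy as np

def grow_window(image, center, start=3, step=2, max_size=31,
                first_thresh=0.01, second_thresh=0.01):
    """Grow a square window around `center` until the window mean stabilises.

    Returns (final_size, final_mean, final_variance).
    """
    cy, cx = center
    prev_mean, prev_rate = None, None
    size = start
    while True:
        half = size // 2
        win = image[max(cy - half, 0):cy + half + 1,
                    max(cx - half, 0):cx + half + 1]
        mean, var = float(win.mean()), float(win.var())
        if prev_mean is not None:
            # Assumed definition: relative change of the mean between iterations.
            rate = abs(mean - prev_mean) / (abs(prev_mean) + 1e-12)
            if prev_rate is not None:
                # Assumed definition: change of the first-order rate.
                second = abs(rate - prev_rate)
                if rate < first_thresh and second < second_thresh:
                    return size, mean, var
            prev_rate = rate
        prev_mean = mean
        if size + step > max_size:
            return size, mean, var      # fallback: largest allowed window reached
        size += step

# On a perfectly uniform patch the mean is stable as soon as both rates exist.
flat = np.full((41, 41), 5.0)
final_size, final_mean, final_var = grow_window(flat, center=(20, 20))
```

Two previous iterations are needed before a second-order rate exists, so the earliest possible stop is at the third window size.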
[0032] Further, in the method described above, the step of segmenting the deep feature map and the shallow feature map according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions includes:
[0033] The mean shift algorithm is used to segment the deep feature map and the shallow feature map based on the spatial segmentation scale, the spectral segmentation scale and the texture segmentation scale to generate deep farmland feature regions and shallow farmland feature regions.
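One way to read the three segmentation scales is as bandwidths of a joint mean shift over position, value, and texture. The sketch below assumes that reading; the flat kernel, the names `hs`, `hr`, and `ht`, and the mode-grouping tolerance are illustrative choices rather than details taken from this application, and the quadratic-time loop is only suitable for toy inputs.

```python
import numpy as np

def mean_shift_modes(points, n_iter=30):
    """Flat-kernel mean shift with unit bandwidth on an (N, D) array.

    Each point moves to the mean of the original points within distance 1
    of its current position; returns the shifted positions.
    """
    shifted = points.copy()
    for _ in range(n_iter):
        for i in range(len(shifted)):
            near = np.linalg.norm(points - shifted[i], axis=1) <= 1.0
            shifted[i] = points[near].mean(axis=0)
    return shifted

def label_modes(modes, tol=0.5):
    """Give the same label to points whose modes lie within `tol` of each other."""
    labels = -np.ones(len(modes), dtype=int)
    reps = []
    for i, m in enumerate(modes):
        for j, r in enumerate(reps):
            if np.linalg.norm(m - r) <= tol:
                labels[i] = j
                break
        else:
            reps.append(m)
            labels[i] = len(reps) - 1
    return labels

def segment(value_map, texture_map, hs, hr, ht):
    """Scale (y, x), value and texture by their bandwidths, then mean shift."""
    h, w = value_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    joint = np.stack([ys.ravel() / hs, xs.ravel() / hs,
                      value_map.ravel() / hr, texture_map.ravel() / ht], axis=1)
    return label_modes(mean_shift_modes(joint)).reshape(h, w)

# Toy map: two halves with very different values and no texture difference.
values = np.zeros((6, 6))
values[:, 3:] = 10.0
texture = np.zeros((6, 6))
labels = segment(values, texture, hs=6.0, hr=1.0, ht=1.0)
```

Dividing each feature group by its bandwidth lets a single unit-radius kernel stand in for three separate scales: a larger `hs` merges pixels over a wider area, while smaller `hr` or `ht` keeps regions with different values or textures apart.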
[0034] Furthermore, in the method described above, the step of fusing the deep farmland feature region and the shallow farmland feature region to generate a farmland feature map includes:
[0035] Construct a deep region adjacency graph corresponding to the deep farmland feature region and a shallow region adjacency graph corresponding to the shallow farmland feature region;
[0036] Based on the merging cost function and the preset merging cost threshold, the regions in the shallow region adjacency graph are merged to generate a merged shallow region adjacency graph.
[0037] Based on the merging cost function and the preset merging cost threshold, the regions in the deep region adjacency graph are merged to generate a merged deep region adjacency graph.
[0038] The merged deep region adjacency map and the merged shallow region adjacency map are fused together to generate a farmland feature map.
[0039] Further, in the method described above, the step of merging regions in the shallow region adjacency graph based on the merging cost function and a preset merging cost threshold to generate a merged shallow region adjacency graph includes:
[0040] The merging cost of each adjacent region in the shallow region adjacency graph is determined based on the merging cost function.
[0041] Select adjacent regions whose merging cost is less than the preset merging cost threshold and merge them to generate a shallow region adjacency graph after merging.
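The threshold-based merging in paragraphs [0040] and [0041] can be sketched on a small region adjacency graph. The merging cost function is not defined in this passage, so the sketch assumes a common choice, the size-weighted squared difference of region means; the greedy cheapest-first order and the data layout are also assumptions.

```python
def merge_cost(a, b):
    """Assumed cost: size-weighted squared difference of region means."""
    (na, ma), (nb, mb) = a, b
    return na * nb / (na + nb) * (ma - mb) ** 2

def merge_regions(regions, edges, threshold):
    """regions: {id: (pixel_count, mean_value)}; edges: set of frozenset({id1, id2}).

    Repeatedly merges the cheapest adjacent pair while its cost is below
    `threshold`. Returns the merged (regions, edges) without modifying the inputs.
    """
    regions = dict(regions)
    edges = set(edges)
    while edges:
        pair = min(edges, key=lambda e: merge_cost(*(regions[r] for r in sorted(e))))
        a, b = sorted(pair)
        if merge_cost(regions[a], regions[b]) >= threshold:
            break
        (na, ma), (nb, mb) = regions[a], regions[b]
        # Region b is absorbed into region a.
        regions[a] = (na + nb, (na * ma + nb * mb) / (na + nb))
        del regions[b]
        # Re-point b's edges to a and drop the self-loop left by the merged pair.
        edges = {frozenset(a if r == b else r for r in e) for e in edges}
        edges = {e for e in edges if len(e) == 2}
    return regions, edges

# Regions 1 and 2 are similar; region 3 is clearly different.
regions = {1: (10, 0.50), 2: (10, 0.52), 3: (20, 0.90)}
edges = {frozenset({1, 2}), frozenset({2, 3})}
merged_regions, merged_edges = merge_regions(regions, edges, threshold=0.1)
```

In this toy graph regions 1 and 2 merge because their cost is far below the threshold, while the merged region and region 3 remain separate but adjacent.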
[0042] Furthermore, in the method described above, after fusing the deep farmland features and the shallow farmland features to generate a farmland feature map, the method further includes:
[0043] Farmland areas are extracted from the agricultural remote sensing image based on the farmland feature map to generate a farmland image.
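The extraction step in paragraph [0043] can be illustrated with a simple mask. The sketch assumes the farmland feature map has already been converted to a boolean mask with the same height and width as the image; that conversion, the function name, and the fill value are hypothetical and not described in this passage.

```python
import numpy as np

def extract_farmland(image, farmland_mask, fill=0):
    """Keep pixels of an (H, W, C) image where the (H, W) boolean mask is True."""
    return np.where(farmland_mask[..., None], image, fill)

image = np.arange(2 * 2 * 3).reshape(2, 2, 3)
mask = np.array([[True, False], [False, True]])
farmland_image = extract_farmland(image, mask)
```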
[0044] A second aspect of this application provides a remote sensing image processing apparatus, comprising:
[0045] The first generation module is used to perform feature extraction and feature fusion processing on the agricultural remote sensing image to be processed based on a preset feature processing model, and generate corresponding deep feature maps and shallow feature maps; the preset feature processing model includes a feature pyramid structure with a spatial refinement module embedded in the structure of adjacent layers from top to bottom.
[0046] The second generation module is used to extract texture features from the deep feature map and the shallow feature map to generate deep texture features and shallow texture features;
[0047] The determination module is used to determine the image segmentation scale based on the deep texture features and the shallow texture features;
[0048] The third generation module is used to segment the deep feature map and the shallow feature map according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions.
[0049] The fourth generation module is used to merge the deep farmland feature regions and the shallow farmland feature regions to generate a farmland feature map.
[0050] Furthermore, in the apparatus described above, the preset feature processing model includes: a preset feature extraction network and a feature pyramid structure;
[0051] The first generation module is specifically used for:
[0052] The agricultural remote sensing image to be processed is input into a preset feature extraction network to generate multiple feature images from bottom to top; the feature images are fused based on the feature pyramid structure to generate corresponding deep feature maps and shallow feature maps.
[0053] Furthermore, in the apparatus described above, when the first generation module fuses the feature images based on the feature pyramid structure to generate corresponding deep feature maps and shallow feature maps, it is specifically used for:
[0054] Based on the feature pyramid structure and the top-level feature image, a deep feature map is determined; the deep feature map is then used as the current feature map, and the following steps are repeated until a shallow feature map is generated: the current feature map is upsampled using a spatial refinement module to obtain an upsampled feature map and its corresponding upsampling offset; a differentiable bilinear sampling mechanism is used to perform global information acquisition and processing on the current feature map to generate global information corresponding to the current feature map; the upsampled feature map is fused with the same-layer feature image using a spatial refinement module based on the upsampling offset and global information to obtain the next-layer feature map corresponding to the current feature map in the top-down structure; the same-layer feature image is a feature image with the same dimension as the upsampled feature map; the next-layer feature map is then used as the current feature map.
[0055] Furthermore, in the apparatus described above, the determining module is specifically used for:
[0056] The K-means clustering algorithm is used to cluster the shallow feature map based on the deep texture features and the shallow texture features to generate shallow homogeneous regions; the shallow homogeneous regions are the regions with small pixel color differences on the shallow feature map; the image segmentation scale is determined based on the shallow homogeneous regions and the deep feature map.
[0057] Furthermore, in the apparatus described above, the image segmentation scale includes a spatial segmentation scale, a spectral segmentation scale, and a texture segmentation scale;
[0058] When determining the image segmentation scale based on the shallow homogeneous region and the deep feature map, the determining module is specifically used for:
[0059] The size of the target selection window in the shallow homogeneous region and the deep feature map is iteratively increased to generate the current iteration size corresponding to the current iteration number; the mean and local variance of all pixels within the window region are determined; the window region is a local region of the shallow homogeneous region and a local region of the deep feature map selected in the target selection window corresponding to the current iteration size; the first-order rate of change and the second-order rate of change of the mean between the current iteration number and the previous iteration number are determined; if the first-order rate of change is less than a preset first-order rate of change threshold and the second-order rate of change is less than a preset second-order rate of change threshold, the iteration ends, and the current iteration size at the end of the iteration is input into a preset size scale relationship algorithm to generate the corresponding spatial segmentation scale; the local variance of each pixel in the target selection window at the end of the iteration is input into a preset scale determination algorithm to generate the corresponding spectral segmentation scale and texture segmentation scale.
[0060] Furthermore, in the apparatus described above, when the third generation module segments the deep feature map and the shallow feature map according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions, it is specifically used for:
[0061] The mean shift algorithm is used to segment the deep feature map and the shallow feature map based on the spatial segmentation scale, the spectral segmentation scale and the texture segmentation scale to generate deep farmland feature regions and shallow farmland feature regions.
[0062] Furthermore, in the apparatus described above, the fourth generation module is specifically used for:
[0063] Construct a deep region adjacency graph corresponding to the deep farmland feature region and a shallow region adjacency graph corresponding to the shallow farmland feature region; merge the regions in the shallow region adjacency graph based on the merging cost function and a preset merging cost threshold to generate a merged shallow region adjacency graph; merge the regions in the deep region adjacency graph based on the merging cost function and a preset merging cost threshold to generate a merged deep region adjacency graph; fuse the merged deep region adjacency graph and the merged shallow region adjacency graph to generate a farmland feature map.
[0064] Furthermore, in the apparatus described above, when the fourth generation module merges the regions in the shallow region adjacency graph based on the merging cost function and a preset merging cost threshold to generate a merged shallow region adjacency graph, it is specifically used for:
[0065] The merging cost of each adjacent region in the shallow region adjacency graph is determined based on the merging cost function; adjacent regions with merging costs less than the preset merging cost threshold are selected for merging to generate the merged shallow region adjacency graph.
[0066] Furthermore, the apparatus as described above further includes:
[0067] The fifth generation module is used to extract farmland areas from the agricultural remote sensing image based on the farmland feature map and generate a farmland image.
[0068] A third aspect of this application provides an electronic device, including: a memory and a processor;
[0069] The memory stores computer-executable instructions;
[0070] The processor executes the computer-executable instructions stored in the memory to implement the remote sensing image processing method as described in any of the first aspects.
[0071] A fourth aspect of this application provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, are used to implement the remote sensing image processing method according to any one of the first aspects.
[0072] The fifth aspect of this application provides a computer program product, including a computer program that, when executed by a processor, implements the remote sensing image processing method described in any of the first aspects.
[0073] This application provides a remote sensing image processing method, apparatus, device, medium, and product. The method includes: performing feature extraction and feature fusion processing on an agricultural remote sensing image to be processed based on a preset feature processing model to generate corresponding deep feature maps and shallow feature maps; the preset feature processing model includes a feature pyramid structure with spatial refinement modules embedded in the top-to-bottom adjacent layers; extracting texture features from the deep feature maps and the shallow feature maps to generate deep texture features and shallow texture features; determining an image segmentation scale based on the deep texture features and the shallow texture features; segmenting the deep feature maps and the shallow feature maps according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions; and fusing the deep farmland feature regions and the shallow farmland feature regions to generate a farmland feature map. The remote sensing image processing method of this application, by setting a feature pyramid structure with spatial refinement modules embedded in the top-to-bottom adjacent layers in the preset feature processing model, can retain small target features and more semantic information during feature extraction and feature fusion. Meanwhile, to enhance feature edge information and reduce redundant blurring during fusion, a spatial refinement module is added during the top-down process, making the feature information more explicit. After obtaining the deep feature map and the shallow feature map, the farmland feature map obtained by segmenting and fusing according to the image segmentation scale has higher accuracy.
Attached Figure Description
[0074] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.
[0075] Figure 1 is a scene diagram in which the remote sensing image processing method of the embodiments of this application can be implemented;
[0076] Figure 2 is a first flowchart of the remote sensing image processing method provided in this application;
[0077] Figure 3 is a second flowchart of the remote sensing image processing method provided in this application;
[0078] Figure 4 is a first schematic diagram of the feature extraction and fusion process of the remote sensing image processing method provided in this application;
[0079] Figure 5 is a second schematic diagram of the feature extraction and fusion process of the remote sensing image processing method provided in this application;
[0080] Figure 6 is a schematic diagram of the sampling offset and global information of the remote sensing image processing method provided in this application;
[0081] Figure 7a is a first region adjacency graph provided in this application;
[0082] Figure 7b is a second region adjacency graph provided in this application;
[0083] Figure 7c is a third region adjacency graph provided in this application;
[0084] Figure 7d is a fourth region adjacency graph provided in this application;
[0085] Figure 8 is a schematic structural diagram of the remote sensing image processing apparatus provided in this application;
[0086] Figure 9 is a schematic structural diagram of the electronic device provided in this application.
[0087] The accompanying drawings illustrate specific embodiments of this application, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concept of this application to those skilled in the art through reference to particular embodiments.
Detailed Implementation
[0088] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
[0089] The collection, storage, use, processing, transmission, provision, and disclosure of user personal information involved in the technical solutions of this application comply with the provisions of relevant laws and regulations and do not violate public order and good morals.
[0090] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, use and processing of the relevant data must comply with the relevant laws, regulations and standards of the relevant countries and regions, and corresponding operation portals are provided for users to choose to authorize or refuse.
[0091] It should be noted that the remote sensing image processing methods, apparatus, devices, media, and products disclosed herein can be used in the fintech field or other related fields. They can also be used in any field other than fintech or other related fields. The application fields of the remote sensing image processing methods, apparatus, devices, media, and products disclosed herein are not limited.
[0092] The following is an explanation of the relevant terms:
[0093] Feature Pyramid Network (FPN): A basic component of multi-scale object detection and recognition models that fuses features extracted at different scales.
[0094] Shallow features: Features in the last layer of top-down feature fusion.
[0095] Deep features: the features of the first layer in the top-down feature fusion process.
[0096] Homogeneous region: Areas with small pixel color differences in agricultural remote sensing images.
[0097] Heterogeneous regions: Areas in agricultural remote sensing images with large differences in pixel color.
[0098] Texture features: Texture is an image property that reflects the spatial distribution of pixels, containing important information about the surface structure and arrangement of objects and their relationship with the surrounding environment.
[0099] Gaussian process (GP): A collection of random variables defined over an index set, any finite subset of which follows a joint normal distribution.
[0100] The technical solutions of this application will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will be described below with reference to the accompanying drawings.
[0101] To make the technical solution of this application easier to understand, existing solutions are first described in detail. Existing farmland extraction technologies fall into two main categories: traditional methods and deep learning methods. Traditional methods mainly use multi-resolution segmentation to obtain targets from very-high-resolution images and extract spectral, shape, and texture features as input to the GP, which is then used to automatically generate the optimal classifier for extracting farmland. Deep learning methods mainly include multi-task encoder-decoder networks, in which decoding (detection, continuous, and offline) uses deep learning based on spatial, spectral, and temporal cues to automatically extract accurate boundaries from remote sensing images. An end-to-end, single-stage panoptic segmentation method for remote sensing image time series has also been proposed; it relies on an image sequence encoding network with temporal self-attention to extract rich, adaptive multi-scale spatiotemporal features, introduces spatial and channel attention mechanisms between feature units and max pooling, and adds multi-scale fusion.
[0102] Current traditional and deep learning methods do not consider the feature extraction and fusion problems of small farmland in complex environments. Because ridges and roads between some farmlands are often narrow, features from adjacent farmlands can be confused, leading to insufficient subdivision and inaccurate and incomplete identification of farmland plots (small in size and irregular in shape) at farmland boundaries. In complex backgrounds, remote sensing agricultural images contain many small targets with limited pixel information, which are easily lost during multiple feature sampling processes, resulting in low accuracy of farmland feature maps extracted from agricultural remote sensing images.
[0103] Therefore, to address the problem in existing technologies that remote sensing agricultural images in complex backgrounds contain numerous small targets with limited pixel information, which are easily lost during multiple feature sampling processes and lead to low accuracy in the extracted farmland feature maps, the inventors found through research that a feature pyramid structure with a spatial refinement module embedded between adjacent layers of its top-down pathway can be provided in the preset feature processing model. The feature pyramid structure preserves small target features and more semantic information during feature extraction and fusion. The spatial refinement module added in the top-down process enhances feature edge information and reduces redundancy and blurring during fusion, making the feature information more explicit. Together these improve the accuracy of the farmland feature map.
[0104] Specifically, based on a pre-defined feature processing model, feature extraction and feature fusion are performed on the agricultural remote sensing image to be processed, generating corresponding deep and shallow feature maps. The pre-defined feature processing model includes a feature pyramid structure with spatial refinement modules embedded in adjacent layers from top to bottom. Texture features are extracted from the deep and shallow feature maps to generate deep and shallow texture features. The image segmentation scale is determined based on the deep and shallow texture features. The deep and shallow feature maps are then segmented according to the image segmentation scale to generate deep and shallow farmland feature regions. Finally, the deep and shallow farmland feature regions are fused to generate a farmland feature map.
[0105] In the remote sensing image processing method of this embodiment, the preset feature processing model is provided with a feature pyramid structure in which spatial refinement modules are embedded between adjacent layers from top to bottom. The feature pyramid structure preserves small target features and more semantic information during feature extraction and fusion. The spatial refinement module added in the top-down process enhances feature edge information and reduces redundant blurring during fusion, making the feature information more explicit. After the deep and shallow feature maps are obtained, the farmland feature map produced by segmentation and fusion according to the image segmentation scale therefore has higher accuracy.
[0106] Based on the above-mentioned inventive discovery, the inventor has proposed the technical solution of this application.
[0107] The application scenario of the remote sensing image processing method provided in the embodiments of this application is described below. As shown in Figure 1, 1 represents the first electronic device, 2 represents the second electronic device, and 3 represents the third electronic device. The network architecture of the application scenario corresponding to the remote sensing image processing method provided in this embodiment includes: a first electronic device 1, a second electronic device 2, and a third electronic device 3. The second electronic device 2 stores agricultural remote sensing images.
[0108] For example, when it is necessary to extract farmland feature maps from agricultural remote sensing images, the first electronic device 1 acquires the agricultural remote sensing image to be processed from the second electronic device 2. The first electronic device 1 performs feature extraction and feature fusion processing on the agricultural remote sensing image to be processed based on a preset feature processing model, generating corresponding deep feature maps and shallow feature maps. The preset feature processing model includes a feature pyramid structure with spatial refinement modules embedded in the top-to-bottom adjacent layers. The first electronic device 1 extracts texture features from the deep and shallow feature maps, generating deep texture features and shallow texture features. Simultaneously, it determines the image segmentation scale based on the deep and shallow texture features. The deep and shallow feature maps are segmented according to the image segmentation scale, generating deep farmland feature regions and shallow farmland feature regions. The first electronic device 1 fuses the deep and shallow farmland feature regions to generate a farmland feature map. After generating the farmland feature map, the first electronic device 1 can output the farmland feature map to a third electronic device 3 for farmland image extraction or farmland recognition processing, etc.
[0109] The embodiments of this application are described below with reference to the accompanying drawings.
[0110] Figure 2 is the first flowchart of the remote sensing image processing method provided in this application. As shown in Figure 2, in this embodiment, the executing entity is a remote sensing image processing device, which can be integrated into an electronic device such as a computer or mobile terminal. The remote sensing image processing method provided in this embodiment includes the following steps:
[0111] Step S101: Based on a preset feature processing model, feature extraction and feature fusion are performed on the agricultural remote sensing image to be processed to generate corresponding deep feature maps and shallow feature maps. The preset feature processing model includes a feature pyramid structure with spatial refinement modules embedded in the structure of adjacent layers from top to bottom.
[0112] In this embodiment, the preset feature processing model may include a feature pyramid structure. The feature pyramid structure is a basic component of a multi-scale object detection and recognition model that fuses features extracted at different scales. The feature pyramid structure can retain small target features and more semantic information during feature extraction and feature fusion.
[0113] The deep feature map is the first layer of feature maps in the top-down feature fusion process. In the application scenario of this embodiment, it can refer to the image feature parts in complex terrain that are difficult to distinguish and for which farmland outlines are hard to extract. The shallow feature map is the last layer of features in the top-down feature fusion process. In the application scenario of this embodiment, the shallow feature map can refer to the image feature parts that are relatively open and for which farmland outlines can be easily obtained.
[0114] The spatial refinement module is embedded between adjacent layers of the feature pyramid from top to bottom; its specific network structure is shown in Figure 4. The overall working process of the spatial refinement module includes two sub-tasks: sampling point offset and global information refinement. It addresses the offset problem caused by upsampling farmland features, enhances feature edge information, reduces redundant blurring during fusion, and makes feature information clearer.
[0115] Step S102: Extract texture features from the deep feature map and the shallow feature map to generate deep texture features and shallow texture features.
[0116] In this embodiment, texture features can be extracted and deep texture features and shallow texture features can be generated by performing texture feature calculation and separability analysis on deep feature maps and shallow feature maps.
[0117] Step S103: Determine the image segmentation scale based on deep texture features and shallow texture features.
[0118] Since shallow feature maps contain a significant amount of semantic information and other environmental interference, they can be further filtered based on deep and shallow texture features to remove interfering information. By determining the image segmentation scale corresponding to the shallow feature maps (after removing interfering information) and the deep feature maps, the accuracy of image segmentation can be improved.
[0119] Step S104: Segment the deep feature map and the shallow feature map according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions.
[0120] Based on the image segmentation scale, the mean shift algorithm can be used to segment deep feature maps and shallow feature maps, thereby improving segmentation accuracy.
[0121] Step S105: Merge the deep farmland feature regions and the shallow farmland feature regions to generate a farmland feature map.
[0122] Since features in shallow feature maps can be used to distinguish simple targets, while features in deep feature maps can be used to distinguish complex targets, fusing deep and shallow farmland feature regions can improve the extraction of both small and large target farmland, thereby increasing the accuracy of farmland feature maps.
[0123] This application provides a remote sensing image processing method, apparatus, device, medium, and product. The method includes: extracting and fusing features from an agricultural remote sensing image to be processed based on a preset feature processing model to generate corresponding deep feature maps and shallow feature maps. The preset feature processing model includes a feature pyramid structure with spatial refinement modules embedded in adjacent layers from top to bottom. Texture features are extracted from the deep and shallow feature maps to generate deep and shallow texture features. The image segmentation scale is determined based on the deep and shallow texture features. The deep and shallow feature maps are segmented according to the image segmentation scale to generate deep and shallow farmland feature regions. The deep and shallow farmland feature regions are fused to generate a farmland feature map.
[0124] The remote sensing image processing method of this application embeds a spatial refinement module into a feature pyramid structure within the top-to-bottom adjacent layers of a preset feature processing model. This allows for the preservation of small target features and more semantic information during feature extraction and fusion. Simultaneously, to enhance feature edge information and reduce redundant blurring during fusion, a spatial refinement module is added in the top-to-bottom process, making the feature information more explicit. After obtaining deep and shallow feature maps, the resulting farmland feature map, obtained through segmentation and fusion according to the image segmentation scale, exhibits higher accuracy.
[0125] Figure 3 is the second flowchart of the remote sensing image processing method provided in this application. As shown in Figure 3, the remote sensing image processing method provided in this embodiment is a further refinement of the remote sensing image processing method provided in the previous embodiment of this application. The remote sensing image processing method provided in this embodiment includes the following steps.
[0126] Step S201: Input the agricultural remote sensing image to be processed into a preset feature extraction network to generate multiple feature images from bottom to top.
[0127] In this embodiment, the preset feature extraction network can be a ResNet (residual network), and the bottom-up network is constructed using a pre-trained ResNet. For example, as shown in Figure 4, feature extraction is performed using the ResNet to generate multiple bottom-up feature images C2, C3, C4, and C5.
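As a rough illustration of the bottom-up outputs, the sketch below computes the approximate spatial size of each feature image for a given input size. The strides (4, 8, 16, 32 for C2 to C5) are the values commonly used with ResNet backbones and are an assumption here, since this application does not state them; the name `bottom_up_shapes` is likewise hypothetical.

```python
import math

# Assumed strides of the ResNet stages relative to the input image.
# The application does not specify them; these are commonly used values.
ASSUMED_STRIDES = {"C2": 4, "C3": 8, "C4": 16, "C5": 32}


def bottom_up_shapes(height, width, strides=ASSUMED_STRIDES):
    """Return the approximate (height, width) of each bottom-up feature image."""
    return {
        name: (math.ceil(height / s), math.ceil(width / s))
        for name, s in strides.items()
    }
```

Under these assumptions, a 512 x 512 input would give a 128 x 128 map at C2 and a 16 x 16 map at C5, which shows why small farmland targets occupy very few pixels at the deepest level.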
[0128] Step S202: Based on the feature pyramid structure, the feature images are fused to generate corresponding deep feature maps and shallow feature maps.
[0129] For example, as shown in Figure 4, multiple fused feature maps P5, P4, P3 and P2 are generated after fusing the feature images, where P5 is the deep feature map and P2 is the shallow feature map.
[0130] Optionally, in this embodiment, step S202 specifically includes:
[0131] Deep feature maps are determined based on the feature pyramid structure and the feature image of the top layer.
[0132] The deep feature map is selected as the current feature map, and the following steps are repeated until a shallow feature map is generated:
[0133] The spatial refinement module is used to upsample the current feature map to obtain the upsampled feature map and the corresponding upsampling offset.
[0134] A differentiable bilinear sampling mechanism is used to collect and process global information of the current feature map in order to generate global information corresponding to the current feature map.
[0135] The spatial refinement module is used to fuse the upsampled feature map and the feature image at the same layer based on the upsampling offset and global information, resulting in the next layer feature map corresponding to the current feature map in the top-down structure. The feature image at the same layer is a feature image with the same dimensions as the upsampled feature map.
[0136] The next layer feature map is selected as the current feature map.
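The repeated steps above can be sketched as a simple loop. This is a minimal illustration, not the model itself: feature maps are plain 2-D lists, nearest-neighbour upsampling stands in for the learned upsampling, the deep feature map is taken directly from the top-layer feature image, and the default `fuse` (element-wise addition) is only a placeholder for the spatial refinement module. All function names are hypothetical.

```python
def upsample_nearest(fmap, factor=2):
    """Nearest-neighbour upsampling of a 2-D list (placeholder for learned upsampling)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized 2-D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fusion(feature_images, fuse=None):
    """Run the top-down loop over the bottom-up feature images.

    feature_images: list ordered from shallow to deep, e.g. [C2, C3, C4, C5],
        each level half the size of the previous one.
    fuse: callable(upsampled, same_layer) -> fused map. Defaults to plain
        addition, a placeholder for the spatial refinement module.
    Returns the fused maps ordered from deep to shallow, e.g. [P5, P4, P3, P2].
    """
    fuse = fuse or add_maps
    current = feature_images[-1]            # deep feature map from the top layer
    fused = [current]
    for same_layer in reversed(feature_images[:-1]):
        upsampled = upsample_nearest(current)
        current = fuse(upsampled, same_layer)   # next-layer feature map
        fused.append(current)                   # becomes the new current map
    return fused
```

The last element of the returned list corresponds to the shallow feature map, and the first to the deep feature map.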
[0137] In this embodiment, the specific details of the spatial refinement module are as follows:
[0138] The overall workflow of the spatial refinement module consists of two sub-tasks: sampling point offset and global information refinement. Sampling point offset is primarily performed between adjacent layers, such as $C_i$ and $P_i$ in Figure 4. Since traditional overlay does not take into account the changes in features during upsampling, it is necessary to evaluate the offset of sampling points before feature fusion. In the image, the target is farmland. During feature processing, regional features shift along multiple directions, and the new offset causes mapping problems relative to the original image. For example, feature sampling points may change along the vertical or horizontal direction, forming a feature sampling displacement. This displacement causes features to intersect and overlap during feature fusion, which leads to deformation, redundancy and blurring of image edge features.
[0139] To better record offset changes, this embodiment uses coordinate changes to represent the positional changes of sampling points. The matrix feature map is treated as a two-dimensional plane, with (x, y) representing the coordinate position of the sampling point. The coordinate changes before and after sampling are recorded, and the final change Δ(x, y) is calculated for feature correction during fusion. Subsequently, global information is added to refine the sampling point offset, further optimizing the position and semantic information of the sampling points using surrounding feature information.
[0140] The specific calculation process of the spatial refinement module is shown in Figure 5. Given two adjacent feature maps $C_n$ and $P_n$, for example C4 and P5 in Figure 4, the two maps are first compressed using a 1x1 convolutional layer to reduce computational cost. At the same time, P5 is upsampled to the size of C4 using a 3x3 deconvolutional layer. C4 and the upsampled P5 are then concatenated along the channel dimension and used as input to two subnets, each containing two 3x3 convolutional layers. To improve model convergence, coordinate offsets are used to represent the sampling point positions, and the weights of the sampling points are used to incorporate contextual information. The subnets produce two outputs: one is a sampling point offset map in the vertical and horizontal directions, and the other is the rearrangement weight of each pixel. The mathematical expressions are shown below.
[0141] $s = \mathrm{conv1}(\mathrm{cat}(\mathrm{deconv}(C_n), C_{n-1}))$
[0142] $w = \mathrm{conv2}(\mathrm{cat}(\mathrm{deconv}(C_n), C_{n-1}))$
[0143] where cat(·) represents a concatenation operation, conv1 represents a dual-channel 3x3 convolutional layer, conv2 represents a single-channel 3x3 convolutional layer, deconv represents a 3x3 deconvolutional layer, $H_{n-1}$ represents the average length, and $W_{n-1}$ represents the average width. For sampling point offset and global information refinement, refer to Figure 6. First, sampling point offset processing is performed, followed by global information processing. Global information refinement mainly refers to refining the environmental information surrounding the current pixel. From the position $p_{n-1}$ in the coarse-resolution image, the position $p_n$ in the upsampled feature map is defined, and the offset $s(p_{n-1})$ is then added to $p_{n-1}$. Meanwhile, to improve training stability, the offset s is divided by the average length and width of the feature layers $C_{n-1}$ and $P_{n-1}$. Their mapping relationship is shown below.
[0144]
[0145] The average length and width are denoted by $H_{n-1}$ and $W_{n-1}$, respectively.
[0146] Meanwhile, to address the quantization problem caused by floating-point offsets, this embodiment also employs the differentiable bilinear sampling mechanism proposed in spatial transformer networks. This mechanism approximates the output at each pixel using the four nearest neighbors of $p_n$ in the feature map. After all pixels have been mapped, they form a feature map of the same size as the underlying feature map, denoted $G_{n-1}$. The formula for calculating each position of $G_{n-1}$ is shown below.
[0147]
[0148] where $N(p_n)$ denotes the neighboring pixels of $p_n$ in $G_n$, and $w_p$ denotes the bilinear kernel weights estimated from distance.
[0149] The next step is to further refine each pixel of the generated feature map $G_{n-1}$ using global information. Specifically, $G_{n-1}$ is directly multiplied by the weight w. Finally, the result is added to the shallow feature map, and a 3x3 convolutional layer is used to output $P_{n-1}$. The mathematical formula is as follows:
[0150] $P_{n-1} = \mathrm{conv3}\big((w \odot G_{n-1} + \mathrm{up}(C_n)) + C_{n-1}\big)$
[0151] conv3 represents a 3x3 convolution, and up represents upsampling.
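To make the resampling and fusion steps more concrete, the following is a minimal sketch on plain 2-D lists. It assumes that the offsets are already expressed in pixel units of the map being sampled (the normalization by average length and width is not reproduced here), that coordinates are clamped at the borders, and that the final 3x3 convolution is omitted. The names `bilinear_sample`, `warp` and `refine_and_fuse` are hypothetical.

```python
import math


def bilinear_sample(fmap, y, x):
    """Sample a 2-D list at a fractional (y, x) from its four nearest pixels."""
    h, w = len(fmap), len(fmap[0])
    # Clamp so that the four neighbours stay inside the map (an assumption).
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - fx) + fmap[y0][x1] * fx
    bottom = fmap[y1][x0] * (1 - fx) + fmap[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(upsampled, offsets):
    """Build G by resampling `upsampled` at each pixel shifted by its offset.

    offsets[y][x] is a (dy, dx) pair in pixel units (assumed convention).
    """
    return [
        [bilinear_sample(upsampled, y + dy, x + dx)
         for x, (dy, dx) in enumerate(row)]
        for y, row in enumerate(offsets)
    ]


def refine_and_fuse(upsampled, same_layer, offsets, weights):
    """Return w * G + upsampled + same_layer element-wise (conv3 omitted)."""
    g = warp(upsampled, offsets)
    rows, cols = len(same_layer), len(same_layer[0])
    return [
        [weights[y][x] * g[y][x] + upsampled[y][x] + same_layer[y][x]
         for x in range(cols)]
        for y in range(rows)
    ]
```

With all offsets set to zero, `warp` returns the upsampled map unchanged, so the fusion reduces to a weighted copy of the upsampled map plus the two added terms.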
[0152] Step S203: Extract texture features from the deep feature map and the shallow feature map to generate deep texture features and shallow texture features.
[0153] In this embodiment, the implementation of S203 is similar to that of S102 in the above embodiment, and will not be described in detail here.
[0154] Step S204: The K-means clustering algorithm is used to cluster the shallow feature map based on deep and shallow texture features to generate shallow homogeneous regions. Shallow homogeneous regions are areas with small pixel color differences on the shallow feature map.
[0155] In this embodiment, the purpose of clustering is to improve the scale prediction for fine segmentation and exclude complex objects whose colors are not easily distinguishable from farmland. Due to the complexity of small-scale agricultural landforms, the image segmentation scale h estimated based on the entire image information may not be suitable. This is mainly because the estimated h value obtained for all types of objects may ignore the specific features of the target object. For areas with a high degree of farmland fragmentation, a smaller h value is more suitable for segmenting small objects, while a larger h value is suitable for larger objects. To obtain the optimal h for different farmlands, in this embodiment, the entire shallow feature map is first divided into shallow homogeneous regions and shallow heterogeneous regions using a clustering method based on optimal texture features. In this embodiment, K-means clustering has high computational efficiency and good segmentation performance; therefore, K-means clustering is selected, and its number of clusters is set to 2.
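A minimal sketch of the two-cluster split is given below. It assumes each pixel is described by a small tuple of texture values (for example a local-variance-like measure), uses a deterministic initialization, and treats the cluster that starts from the smallest feature values as the homogeneous one. These choices and the name `kmeans_two_clusters` are assumptions for illustration only.

```python
def kmeans_two_clusters(features, iterations=20):
    """Split non-empty feature vectors into two clusters with plain K-means (k = 2).

    features: list of equal-length tuples, one per pixel.
    Returns (labels, centers). Cluster 0 is initialised from the vector with
    the smallest sum, cluster 1 from the vector with the largest sum.
    """
    ordered = sorted(features, key=sum)
    centers = [list(ordered[0]), list(ordered[-1])]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min((0, 1),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(f, centers[k])))
            for f in features
        ]
        if new_labels == labels:
            break                       # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for k in (0, 1):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def homogeneous_mask(labels):
    """True where a pixel falls in cluster 0 (assumed low-texture cluster)."""
    return [lab == 0 for lab in labels]
```

Pixels where the mask is true would be retained as the shallow homogeneous region, and the rest excluded as heterogeneous.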
[0156] Shallow heterogeneous regions in agricultural remote sensing images are larger objects with significant pixel color differences. Due to their rapid changes at small spatial scales, they manifest as small farmlands or sparse woodlands in agricultural applications. These regions contain overly cluttered information, leading to significant interference. Conversely, shallow homogeneous regions are areas with small pixel color differences in the shallow feature map, such as large farmlands, rivers, and large forests. To estimate the optimal h-value for segmenting plots, based on the principle that small and large farmlands have the same h-value, shallow heterogeneous regions are excluded, while shallow homogeneous regions are retained. This removes interfering factors from the shallow feature map, reducing the uncertainty in estimating the image segmentation scale and improving the accuracy of determining the segmentation scale.
[0157] Step S205: Determine the image segmentation scale based on the shallow homogeneous regions and deep feature maps. The image segmentation scale includes spatial segmentation scale, spectral segmentation scale, and texture segmentation scale.
[0158] In this embodiment, the implementation of S205 is similar to that of S103 in the above embodiment, and will not be described in detail here.
[0159] Optionally, in this embodiment, S205 specifically includes:
[0160] The size of the target selection window in the shallow homogeneous region and the deep feature map is iteratively increased to generate the current iteration size corresponding to the current iteration number.
[0161] Determine the mean and local variance of all pixels within the window region. The window region is a local region of the shallow homogeneous region and a local region of the deep feature map selected within the target selection window corresponding to the current iteration size.
[0162] Determine the first and second rates of change of the mean between the current iteration number and the previous iteration number.
[0163] If the first-order rate of change is less than the preset first-order rate of change threshold and the second-order rate of change is less than the preset second-order rate of change threshold, the iteration ends, and the current iteration size at the end of the iteration is input into the preset size scale relationship algorithm to generate the corresponding spatial segmentation scale.
[0164] The local variance of each pixel in the target selection window at the end of the iteration is input into the preset scale determination algorithm to generate the corresponding spectral segmentation scale and texture segmentation scale.
[0165] In this embodiment, the multi-scale selection method is extended to multi-layer images to predict the optimal segmentation scale h in the spatial, spectral and texture layers, namely the spatial segmentation scale $h_s$, the spectral segmentation scale $h_r$, and the texture segmentation scale $h_t$. The relationship between $h_s$ and the size W of the target selection window, i.e., the preset size scale relationship algorithm, can be expressed as the following equation:
[0166] $W = 2 \times h_s + 1$
[0167] By iteratively increasing the value of W, the local variance within the window region is calculated, and its convergence is used to predict $h_s$. First, the local variance (LV) of each pixel is calculated using all pixels within a W×W window. For LV calculation at boundaries, symmetric padding is used to fill in missing pixels outside the region. The mean of LV over all pixels within the window, ALV, is then calculated, followed by the first-order and second-order rates of change of ALV (FOALV and SOALV) for each iteration, as follows:
[0168]
[0169] $\mathrm{SOALV}_i = \mathrm{FOALV}_{i-1} - \mathrm{FOALV}_i$
[0170] where i and i-1 represent the current and previous iteration numbers, respectively. $\mathrm{FOALV}_i$ represents the first-order rate of change of ALV in the i-th iteration, and $\mathrm{SOALV}_i$ represents the change of $\mathrm{FOALV}_i$, referred to as the second-order rate of change. Both FOALV and SOALV are used to evaluate the dynamics of ALV as the number of iterations increases; their values range from 0 to 1. If FOALV is less than a preset first-order rate of change threshold a and SOALV is less than a preset second-order rate of change threshold b, the value $h_{si}$ corresponding to the current iteration number is taken as the optimal $h_s$ value. In this embodiment, a and b are set to 0.1 and 0.01, respectively.
[0171] Based on the obtained optimal $h_s$, the values of $h_r$ and $h_t$ can be further calculated as the average local standard deviation of the spectral layer and the texture layer within the window, respectively. This calculation is performed by the preset scale determination algorithm, as detailed below:
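The window-growing procedure can be sketched as follows on a single-layer image stored as a 2-D list. The first-order rate of change is assumed here to be the relative difference between consecutive ALV values, since its formula is not reproduced in this text; the second-order rate follows the difference of consecutive FOALV values given above. The starting scale, the cap `max_h`, and all function names are hypothetical.

```python
def reflect(i, n):
    """Map an out-of-range index into [0, n) by symmetric (mirror) padding."""
    period = 2 * n
    i %= period
    return i if i < n else period - 1 - i


def average_local_variance(image, h):
    """Mean over all pixels of the variance inside a (2h+1) x (2h+1) window."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for y in range(rows):
        for x in range(cols):
            window = [
                image[reflect(y + dy, rows)][reflect(x + dx, cols)]
                for dy in range(-h, h + 1)
                for dx in range(-h, h + 1)
            ]
            mean = sum(window) / len(window)
            total += sum((v - mean) ** 2 for v in window) / len(window)
    return total / (rows * cols)


def select_spatial_scale(image, a=0.1, b=0.01, max_h=10):
    """Grow the window until ALV stabilises; return (h_s, W) with W = 2 * h_s + 1."""
    prev_alv = average_local_variance(image, 1)
    prev_foalv = None
    for h in range(2, max_h + 1):
        alv = average_local_variance(image, h)
        # Assumed form of FOALV: relative change of ALV between iterations.
        foalv = (alv - prev_alv) / prev_alv if prev_alv else 0.0
        if prev_foalv is not None:
            soalv = prev_foalv - foalv          # SOALV_i = FOALV_{i-1} - FOALV_i
            if foalv < a and soalv < b:
                return h, 2 * h + 1
        prev_alv, prev_foalv = alv, foalv
    return max_h, 2 * max_h + 1                 # fallback if no convergence
```

The defaults `a=0.1` and `b=0.01` mirror the thresholds stated in this embodiment; a perfectly uniform image converges at the first iteration where both rates can be evaluated.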
[0172]
[0173] where $LV_j$ denotes the local variance of the j-th pixel within the window region derived from the optimal $h_s$ value on the spectral or texture layer, and n denotes the number of pixels included in all homogeneous regions.
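As a small illustration, the sketch below reads "average local standard deviation" literally, as the mean of the square roots of the local variances over the n pixels. This reading is an assumption, because the formula itself is not reproduced in this text, and the name `average_local_std` is hypothetical.

```python
import math


def average_local_std(local_variances):
    """Mean of sqrt(LV_j) over the n pixels (assumed form of h_r or h_t)."""
    n = len(local_variances)
    return sum(math.sqrt(lv) for lv in local_variances) / n
```

The same function would be applied once to the spectral layer to obtain a value for h_r and once to the texture layer to obtain a value for h_t.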
[0174] Step S206: The mean shift algorithm is used to segment the deep feature map and the shallow feature map based on the spatial segmentation scale, the spectral segmentation scale and the texture segmentation scale to generate deep farmland feature regions and shallow farmland feature regions.
[0175] In this embodiment, the mean shift algorithm is used for pixel-level image segmentation. Based on spatial segmentation scale, spectral segmentation scale, and texture segmentation scale, it can directly segment deep feature maps and shallow feature maps, thereby generating deep farmland feature regions and shallow farmland feature regions after segmentation.
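A simplified sketch of mean shift segmentation on a single-layer map is shown below. It uses a flat kernel over the joint spatial and value domain with only two bandwidths, `h_s` (spatial) and `h_r` (value); the texture scale is left out for brevity, and regions are formed afterwards by grouping 4-connected pixels whose filtered values are close. These simplifications and all names are assumptions for illustration.

```python
import math


def mean_shift_filter(image, h_s, h_r, max_iter=20, tol=1e-3):
    """Replace each pixel value by the value of the mode it converges to."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            cy, cx, cv = float(y), float(x), float(image[y][x])
            for _ in range(max_iter):
                sy = sx = sv = 0.0
                count = 0
                # Flat kernel: average every pixel inside the joint window.
                for ny in range(max(0, math.ceil(cy - h_s)),
                                min(rows, math.floor(cy + h_s) + 1)):
                    for nx in range(max(0, math.ceil(cx - h_s)),
                                    min(cols, math.floor(cx + h_s) + 1)):
                        if abs(image[ny][nx] - cv) <= h_r:
                            sy += ny
                            sx += nx
                            sv += image[ny][nx]
                            count += 1
                if count == 0:
                    break
                my, mx, mv = sy / count, sx / count, sv / count
                shift = abs(my - cy) + abs(mx - cx) + abs(mv - cv)
                cy, cx, cv = my, mx, mv
                if shift < tol:
                    break
            out[y][x] = cv
    return out


def label_regions(filtered, tol):
    """Label 4-connected pixels whose neighbouring filtered values differ by <= tol."""
    rows, cols = len(filtered), len(filtered[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for y in range(rows):
        for x in range(cols):
            if labels[y][x] != -1:
                continue
            labels[y][x] = next_label
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and labels[ny][nx] == -1
                            and abs(filtered[ny][nx] - filtered[cy][cx]) <= tol):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
            next_label += 1
    return labels
```

Smaller bandwidths tend to produce more and smaller regions, which is why the choice of segmentation scale matters for fragmented farmland.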
[0176] Step S207: Construct the deep region adjacency graph corresponding to the deep farmland feature region and the shallow region adjacency graph corresponding to the shallow farmland feature region.
[0177] In this embodiment, the mean-shift method is used for pixel-level image segmentation, which inevitably generates many small fragments. In this embodiment, a region merging process is employed to process these small fragments, improving the accuracy of deriving farmland plots. Region merging is a bottom-up process that combines small but similar adjacent regions to obtain a larger region with specific processing rules. Adjacency determination and merging criteria are two prerequisites that need careful handling during the region merging process. A Region Adjacency Graph (RAG) is a widely used data structure for describing the adjacency relationships between large-area objects within an image.
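One possible way to build a region adjacency graph from a segmentation result is sketched below. It assumes the segmentation is available as a 2-D label map, uses 4-connectivity, and measures the shared boundary between two regions as the number of adjacent pixel pairs; the name `build_rag` and the dictionary layout are hypothetical.

```python
from collections import defaultdict


def build_rag(labels):
    """Build a region adjacency graph from a 2-D label map.

    Returns (areas, boundary):
      areas[r] is the pixel count of region r (the vertices),
      boundary[(i, j)] with i < j is the number of 4-connected pixel pairs
      shared by regions i and j (the edges, with boundary length).
    """
    areas = defaultdict(int)
    boundary = defaultdict(int)
    rows, cols = len(labels), len(labels[0])
    for y in range(rows):
        for x in range(cols):
            a = labels[y][x]
            areas[a] += 1
            # Looking only down and right counts each pixel pair once.
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < rows and nx < cols:
                    b = labels[ny][nx]
                    if a != b:
                        boundary[(min(a, b), max(a, b))] += 1
    return dict(areas), dict(boundary)
```

The areas and boundary lengths collected here are the quantities a merging criterion would need alongside per-region feature vectors.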
[0178] Step S208: Based on the merging cost function and the preset merging cost threshold, merge the regions in the shallow region adjacency graph to generate the merged shallow region adjacency graph.
[0179] Optionally, in this embodiment, S208 specifically includes:
[0180] The merging cost of each adjacent region in the shallow region adjacency graph is determined based on the merging cost function. Adjacent regions with merging costs less than a preset merging cost threshold are selected for merging to generate a merged shallow region adjacency graph.
[0181] The preset merging cost threshold can be set according to actual needs, and this embodiment does not limit it.
[0182] Step S209: Merge the regions in the deep region adjacency graph based on the merge cost function and the preset merge cost threshold to generate a merged deep region adjacency graph.
[0183] Step S210: The merged deep region adjacency map and the merged shallow region adjacency map are fused to generate a farmland feature map.
[0184] In this embodiment, the expression for RAG can be defined as G = (V, E), where V represents the vertex set of the image segmentation region, and E is the set of adjacent edges used to determine the segmentation. Regarding the merging criterion, the proposed merging cost function is adopted, which can be written as the following equation:
[0185]
[0186] where i and j denote two adjacent regions, $M_i$ and $M_j$ are the areas of these two regions, $u_i$ and $u_j$ are the feature vectors of regions i and j, respectively, and $l(v_i, v_j)$ is the boundary length between the two regions. The merging cost function merges fragmented segments with their neighboring regions, resulting in larger areas, longer common boundary lengths, and smaller feature differences. Figures 7a, 7b, 7c and 7d present examples of the initial segmentation process, the constructed RAG, the RAG for region merging, and the region merging and segmentation process, respectively. After the initial RAG is constructed, a region merging process is performed based on the RAG and the merging cost function. Specifically, the merging cost between the target region and each of its neighboring regions is calculated, and the target region is then merged with the neighboring region that has the minimum merging cost.
[0187] For example, suppose region 2 among regions 1-5 in Figure 7a needs to be merged. The merging cost between region 2 and each of its neighboring regions is calculated; if the minimum merging cost over $(v_i, v_j)$ is obtained for region 1, region 1 and region 2 are merged into a new region labeled 1. The RAG is then rebuilt for the next region merging pass, and the process continues until all objects meet the region merging criterion.
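The merging loop can be sketched as below. Because the merging cost formula is not reproduced in this text, the sketch assumes a commonly used form built from the same quantities: the area term M_i * M_j / (M_i + M_j), multiplied by the squared feature distance and divided by the shared boundary length. The greedy loop, the area-weighted feature update, and the names `merging_cost` and `merge_regions` are likewise assumptions for illustration.

```python
def merging_cost(area_i, area_j, feat_i, feat_j, shared_len):
    """Assumed cost: area term times squared feature distance over boundary length."""
    diff = sum((a - b) ** 2 for a, b in zip(feat_i, feat_j))
    return (area_i * area_j) / (area_i + area_j) * diff / shared_len


def merge_regions(areas, feats, boundary, threshold):
    """Greedily merge the cheapest adjacent pair until no cost is below threshold.

    areas: {region: pixel count}; feats: {region: feature tuple};
    boundary: {(i, j): shared boundary length} with i < j.
    Returns updated copies of the three dictionaries.
    """
    areas, feats, boundary = dict(areas), dict(feats), dict(boundary)
    while boundary:
        (i, j), cost = min(
            ((edge, merging_cost(areas[edge[0]], areas[edge[1]],
                                 feats[edge[0]], feats[edge[1]], length))
             for edge, length in boundary.items()),
            key=lambda item: item[1],
        )
        if cost >= threshold:
            break
        # Merge j into i: summed area and area-weighted mean feature.
        total = areas[i] + areas[j]
        feats[i] = tuple((areas[i] * a + areas[j] * b) / total
                         for a, b in zip(feats[i], feats[j]))
        areas[i] = total
        del areas[j], feats[j]
        # Rebuild the graph: redirect j's edges to i and drop the merged edge.
        rebuilt = {}
        for (p, q), length in boundary.items():
            p, q = (i if p == j else p), (i if q == j else q)
            if p != q:
                key = (min(p, q), max(p, q))
                rebuilt[key] = rebuilt.get(key, 0) + length
        boundary = rebuilt
    return areas, feats, boundary
```

In the spirit of the example above, a small fragment (region 2) that closely resembles a neighbor (region 1) would be absorbed into it and keep label 1, while a dissimilar neighbor stays separate.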
[0188] Optionally, in this embodiment, after S210, the following steps are also included:
[0189] Farmland areas are extracted from agricultural remote sensing images based on farmland feature maps to generate farmland images.
[0190] After determining the farmland feature map, farmland areas in agricultural remote sensing images can be extracted based on the farmland feature map to generate farmland images, providing a basis for subsequent farmland analysis.
[0191] The method in this embodiment improves feature extraction and segmentation. In general feature extraction, many networks only use a single high-level feature, making it difficult to retain small target feature information. In complex backgrounds, remote sensing agricultural images contain many small targets with limited pixel information, which are easily lost during multiple feature sampling. Using a feature pyramid structure can retain small target features and more semantic information. Simultaneously, to enhance feature edge information and reduce redundant blurring during fusion, a spatial refinement module is added in the top-down process, making feature information more explicit and facilitating subsequent processing. Then, the image is segmented and merged through optimal segmentation spatial scale calculation and mean shift. Through testing in different agricultural image regions, this method improves the accuracy of complex farmland segmentation and exhibits good transferability across different agricultural images. Furthermore, this method outperforms other widely used methods in extracting the integrity of farmland plots. These findings demonstrate that using the method of this embodiment for image segmentation combined with feature fusion at different scales can effectively extract plots from complex agricultural remote sensing images.
[0192] In feature processing, besides adding a spatial refinement module to the top-down structure, a spatial attention mechanism can also be used to make the model or method pay more attention to edge features. Of course, spatial refinement and attention mechanisms can also be combined to complete feature fusion processing. When segmenting features, in addition to the traditional segmentation calculation method used in this embodiment, deep learning models can also be used.
[0193] Figure 8 is a schematic structural diagram of the remote sensing image processing device provided in this application. As shown in Figure 8, in this embodiment, the remote sensing image processing device 300 can be installed in an electronic device, and the remote sensing image processing device 300 includes:
[0194] The first generation module 301 is used to extract and fuse features from the agricultural remote sensing image to be processed based on a preset feature processing model, generating corresponding deep feature maps and shallow feature maps. The preset feature processing model includes a feature pyramid structure with spatial refinement modules embedded in the structure of adjacent layers from top to bottom.
[0195] The second generation module 302 is used to extract texture features from the deep feature map and the shallow feature map to generate deep texture features and shallow texture features.
[0196] The determination module 303 is used to determine the image segmentation scale based on deep texture features and shallow texture features.
[0197] The third generation module 304 is used to segment the deep feature map and the shallow feature map according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions.
[0198] The fourth generation module 305 is used to merge the deep farmland feature regions and the shallow farmland feature regions to generate a farmland feature map.
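The way modules 301 to 305 chain together can be sketched as a small class in which each module is supplied as a callable. This is only a structural illustration under the assumption that every module can be modelled as a function; the class name, field names and signatures are hypothetical and say nothing about how each module is implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class RemoteSensingImageProcessingDevice:
    """Chains five modules; each one is passed in as a callable."""
    first_generation: Callable[[Any], Tuple[Any, Any]]            # 301: image -> (deep map, shallow map)
    second_generation: Callable[[Any, Any], Tuple[Any, Any]]      # 302: maps -> (deep texture, shallow texture)
    determination: Callable[[Any, Any], Any]                      # 303: textures -> segmentation scale
    third_generation: Callable[[Any, Any, Any], Tuple[Any, Any]]  # 304: maps + scale -> farmland regions
    fourth_generation: Callable[[Any, Any], Any]                  # 305: regions -> farmland feature map

    def process(self, image):
        """Run the modules in order and return the farmland feature map."""
        deep_map, shallow_map = self.first_generation(image)
        deep_tex, shallow_tex = self.second_generation(deep_map, shallow_map)
        scale = self.determination(deep_tex, shallow_tex)
        deep_regions, shallow_regions = self.third_generation(deep_map, shallow_map, scale)
        return self.fourth_generation(deep_regions, shallow_regions)
```

Keeping the modules as interchangeable callables makes it straightforward to swap one stage, for example a different segmentation routine, without touching the others.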
[0199] The remote sensing image processing device provided in this embodiment can execute the technical solution of the method embodiment shown in Figure 2. Its implementation principle and technical effect are similar to those of the method embodiment shown in Figure 2 and will not be described in detail here.
[0200] The remote sensing image processing apparatus provided in this application is a further refinement of the remote sensing image processing apparatus provided in the previous embodiment. The remote sensing image processing apparatus 300 includes:
[0201] Optionally, in this embodiment, the preset feature processing model includes: a preset feature extraction network and a feature pyramid structure.
[0202] The first generation module 301 is specifically used for:
[0203] The agricultural remote sensing image to be processed is input into a preset feature extraction network to generate multiple feature images from bottom to top. The feature images are then fused based on a feature pyramid structure to generate corresponding deep and shallow feature maps.
[0204] Optionally, in this embodiment, when the first generation module 301 fuses the feature images based on the feature pyramid structure to generate corresponding deep feature maps and shallow feature maps, it is specifically used for:
[0205] The deep feature map is determined based on the feature pyramid structure and the top-level feature image. This deep feature map is then used as the current feature map, and the following steps are repeated until a shallow feature map is generated: The current feature map is upsampled using the spatial refinement module to obtain an upsampled feature map and its corresponding upsampling offset. A differentiable bilinear sampling mechanism is used to collect and process global information from the current feature map to generate global information corresponding to it. The upsampled feature map is then fused with the same-layer feature image using the spatial refinement module based on the upsampling offset and global information to obtain the next-layer feature map corresponding to the current feature map in the top-down structure. The same-layer feature image is a feature image with the same dimensions as the upsampled feature map. This next-layer feature map is then used as the current feature map.
[0206] Optionally, in this embodiment, the determining module 303 is specifically used for:
[0207] The K-means clustering algorithm is used to cluster the shallow feature map based on deep and shallow texture features, generating shallow homogeneous regions. These shallow homogeneous regions are areas on the shallow feature map with small pixel color differences. The image segmentation scale is determined based on the shallow homogeneous regions and the deep feature map.
[0208] Optionally, in this embodiment, the image segmentation scale includes a spatial segmentation scale, a spectral segmentation scale, and a texture segmentation scale.
[0209] When determining the image segmentation scale based on shallow homogeneous regions and deep feature maps, module 303 is specifically used for:
[0210] The size of the target selection window in the shallow homogeneous region and deep feature map is iteratively increased to generate the current iteration size corresponding to the current iteration number. The mean and local variance of all pixels within the window region are determined. The window region is a local region of the shallow homogeneous region and a local region of the deep feature map selected in the target selection window corresponding to the current iteration size. The first-order and second-order rates of change of the mean between the current iteration number and the previous iteration number are determined. If the first-order rate of change is less than a preset first-order rate of change threshold and the second-order rate of change is less than a preset second-order rate of change threshold, the iteration ends, and the current iteration size at the end of the iteration is input into a preset size-scale relationship algorithm to generate the corresponding spatial segmentation scale. The local variance of each pixel in the target selection window at the end of the iteration is input into a preset scale determination algorithm to generate the corresponding spectral segmentation scale and texture segmentation scale.
[0211] Optionally, in this embodiment, when the third generation module 304 segments the deep feature map and the shallow feature map according to the image segmentation scale to generate the deep farmland feature region and the shallow farmland feature region, it is specifically used for:
[0212] The mean shift algorithm is used to segment the deep feature map and the shallow feature map based on the spatial segmentation scale, the spectral segmentation scale, and the texture segmentation scale, generating deep farmland feature regions and shallow farmland feature regions.
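A minimal sketch of this kind of segmentation is given below, assuming the three segmentation scales act as bandwidths of a flat-kernel mean shift in a joint spatial, spectral, and texture domain. The function name `mean_shift_segment`, the single-channel inputs, and the mode-grouping tolerance are hypothetical choices; the quadratic cost makes it suitable only for small inputs.

```python
import numpy as np


def mean_shift_segment(values, texture, h_spatial, h_spectral, h_texture,
                       iters=30, tol=1e-3):
    """Segment a small 2-D map with flat-kernel mean shift in a joint domain.

    values, texture: (H, W) arrays (one spectral and one texture channel).
    h_*: bandwidths, standing in for the three segmentation scales.
    Returns an (H, W) label map. O(N^2) per iteration; for small inputs only.
    """
    h, w = values.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Scale each coordinate by its bandwidth so a unit radius covers all of them.
    pts = np.stack([yy / h_spatial, xx / h_spatial,
                    values / h_spectral, texture / h_texture],
                   axis=-1).reshape(-1, 4).astype(np.float64)

    modes = pts.copy()
    for _ in range(iters):
        d2 = ((modes[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
        inside = (d2 <= 1.0).astype(np.float64)  # flat kernel
        shifted = inside @ pts / inside.sum(axis=1, keepdims=True)
        done = np.abs(shifted - modes).max() < tol
        modes = shifted
        if done:
            break

    # Group pixels whose modes converged to (nearly) the same point.
    labels = -np.ones(len(pts), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for j, c in enumerate(centers):
            if ((m - c) ** 2).sum() <= 0.25:
                labels[i] = j
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels.reshape(h, w)
```

Larger bandwidths produce coarser regions, which is why the scales determined in the previous step directly control the granularity of the farmland feature regions in this sketch.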
[0213] Optionally, in this embodiment, the fourth generation module 305 is specifically used for:
[0214] Construct deep region adjacency graphs corresponding to deep farmland feature regions and shallow region adjacency graphs corresponding to shallow farmland feature regions. Merge regions in the shallow region adjacency graphs based on a merging cost function and a preset merging cost threshold to generate a merged shallow region adjacency graph. Merge regions in the deep region adjacency graphs based on the same merging cost function and a preset merging cost threshold to generate a merged deep region adjacency graph. Then, fuse the merged deep region adjacency graphs and the merged shallow region adjacency graphs to generate a farmland feature map.
[0215] Optionally, in this embodiment, when the fourth generation module 305 merges the regions in the shallow region adjacency graph based on the merging cost function and the preset merging cost threshold to generate the merged shallow region adjacency graph, it is specifically used for:
[0216] The merging cost of each adjacent region in the shallow region adjacency graph is determined based on the merging cost function. Adjacent regions with merging costs less than a preset merging cost threshold are selected for merging to generate a merged shallow region adjacency graph.
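The threshold-based merging on a region adjacency graph can be sketched as follows. The merging cost function is not specified in this passage, so the size-weighted squared difference of region means used here is an assumed stand-in, and `merge_regions` is a hypothetical name.

```python
import numpy as np


def merge_regions(labels, feat, threshold):
    """Greedily merge adjacent regions whose merging cost is below threshold.

    labels: (H, W) int label map; feat: (H, W) per-pixel feature values.
    The cost (size-weighted squared difference of region means) is an
    assumed stand-in for the unspecified merging cost function.
    """
    labels = labels.copy()

    def build_graph(lab):
        # Edges connect labels that touch horizontally or vertically.
        edges = set()
        for a, b in ((lab[:, :-1], lab[:, 1:]), (lab[:-1, :], lab[1:, :])):
            diff = a != b
            for u, v in zip(a[diff], b[diff]):
                edges.add((int(min(u, v)), int(max(u, v))))
        stats = {int(r): (float(feat[lab == r].mean()), int((lab == r).sum()))
                 for r in np.unique(lab)}
        return edges, stats

    def cost(sa, sb):
        (ma, na), (mb, nb) = sa, sb
        return na * nb / (na + nb) * (ma - mb) ** 2

    while True:
        edges, stats = build_graph(labels)
        scored = [(cost(stats[u], stats[v]), u, v) for u, v in edges]
        scored = [s for s in scored if s[0] < threshold]
        if not scored:
            return labels
        # Merge the cheapest adjacent pair, then rebuild the graph.
        _, u, v = min(scored)
        labels[labels == v] = u
```

The same routine could be run separately on the shallow and the deep farmland feature regions before the two merged graphs are fused.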
[0217] Optionally, in this embodiment, the remote sensing image processing device further includes:
[0218] The fifth generation module is used to extract farmland areas from agricultural remote sensing images based on farmland feature maps and generate farmland images.
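As a simple illustration of the extraction step, the sketch below masks the remote sensing image with the farmland feature map. It assumes the feature map is a per-pixel score at the same resolution as the image; the function name `extract_farmland` and the threshold value are hypothetical.

```python
import numpy as np


def extract_farmland(image, farmland_map, threshold=0.5):
    """Keep only pixels where the farmland feature map exceeds a threshold.

    image: (H, W, C) remote sensing image; farmland_map: (H, W) scores.
    Assumes both are already at the same resolution.
    """
    mask = farmland_map > threshold
    out = np.zeros_like(image)
    out[mask] = image[mask]  # copy farmland pixels, leave the rest at zero
    return out
```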
[0219] The remote sensing image processing device provided in this embodiment can carry out the technical solutions of the method embodiments shown in Figures 2-7d. Its implementation principle and technical effect are similar to those of the method embodiments shown in Figures 2-7d and will not be described in detail here.
[0220] According to embodiments of this application, this application also provides an electronic device, a computer-readable storage medium, and a computer program product.
[0221] Figure 9 is a schematic diagram of the electronic device provided in this application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, blade servers, mainframe computers, and other suitable computers. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the present application described and / or claimed herein.
[0222] As shown in Figure 9, the electronic device includes a processor 401 and a memory 402. The various components are interconnected via different buses and can be mounted on a common motherboard or otherwise installed as needed. The processor can process instructions executed within the electronic device.
[0223] The memory 402 is the non-transitory computer-readable storage medium provided in this application. The memory stores instructions executable by at least one processor to cause at least one processor to perform the remote sensing image processing method provided in this application. The non-transitory computer-readable storage medium of this application stores computer instructions for causing a computer to perform the remote sensing image processing method provided in this application.
[0224] Memory 402, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions / modules corresponding to the remote sensing image processing method in the embodiments of this application (e.g., the first generation module 301, the second generation module 302, the determination module 303, the third generation module 304, and the fourth generation module 305 shown in Figure 8). The processor 401 executes various functional applications and data processing of the electronic device by running the non-transitory software programs, instructions, and modules stored in the memory 402, thereby implementing the remote sensing image processing method in the above method embodiments.
[0225] In addition, this embodiment also provides a computer program product. When the instructions in the computer program product are executed by the processor of an electronic device, the electronic device is enabled to perform the remote sensing image processing method of the above embodiments.
[0226] Other embodiments of the present application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the embodiments of this application that follow the general principles of the embodiments of this application and include common knowledge or customary techniques in the art not disclosed in the embodiments of this application.
[0227] It should be understood that the embodiments of this application are not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from their scope. The scope of the embodiments of this application is limited only by the appended claims.
Claims
1. A remote sensing image processing method, characterized in that it comprises: based on a preset feature processing model, feature extraction and feature fusion are performed on the agricultural remote sensing image to be processed to generate corresponding deep feature maps and shallow feature maps; the preset feature processing model includes a feature pyramid structure with spatial refinement modules embedded in the structure of adjacent layers from top to bottom; texture features are extracted from the deep feature map and the shallow feature map to generate deep texture features and shallow texture features; the K-means clustering algorithm is used to cluster the shallow feature map based on the deep texture features and the shallow texture features to generate shallow homogeneous regions; the shallow homogeneous regions are the regions with small pixel color differences on the shallow feature map; the image segmentation scale is determined based on the shallow homogeneous regions and the deep feature map; the deep feature map and the shallow feature map are segmented according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions; and the deep farmland feature regions and the shallow farmland feature regions are fused to generate a farmland feature map.
2. The method of claim 1, wherein the preset feature processing model includes a preset feature extraction network and the feature pyramid structure; and performing feature extraction and feature fusion on the agricultural remote sensing image to be processed based on the preset feature processing model to generate the corresponding deep feature maps and shallow feature maps includes: inputting the agricultural remote sensing image to be processed into the preset feature extraction network to generate multiple feature images from bottom to top; and fusing the feature images based on the feature pyramid structure to generate the corresponding deep feature maps and shallow feature maps.
3. The method of claim 2, wherein fusing the feature images based on the feature pyramid structure to generate the corresponding deep feature maps and shallow feature maps includes: determining the deep feature map based on the feature pyramid structure and the feature image of the top layer; selecting the deep feature map as the current feature map and repeating the following steps until the shallow feature map is generated: upsampling the current feature map using the spatial refinement module to obtain an upsampled feature map and the corresponding upsampled offset; using a differentiable bilinear sampling mechanism to perform global information acquisition and processing on the current feature map to generate global information corresponding to the current feature map; using the spatial refinement module to fuse the upsampled feature map with the feature image at the same layer based on the upsampled offset and the global information to obtain the next-layer feature map corresponding to the current feature map in the top-down structure, the feature image at the same layer being a feature image with the same dimension as the upsampled feature map; and selecting the next-layer feature map as the current feature map.
4. The method of claim 3, wherein the image segmentation scale includes a spatial segmentation scale, a spectral segmentation scale, and a texture segmentation scale; and determining the image segmentation scale based on the shallow homogeneous regions and the deep feature map includes: iteratively increasing the size of the target selection window in the shallow homogeneous region and the deep feature map to generate the current iteration size corresponding to the current iteration number; determining the mean and local variance of all pixels within the window region, the window region being a local region of the shallow homogeneous region and a local region of the deep feature map selected in the target selection window corresponding to the current iteration size; determining the first-order and second-order rates of change of the mean between the current iteration number and the previous iteration number; if the first-order rate of change is less than a preset first-order rate of change threshold and the second-order rate of change is less than a preset second-order rate of change threshold, ending the iteration and inputting the current iteration size at the end of the iteration into a preset size-scale relationship algorithm to generate the corresponding spatial segmentation scale; and inputting the local variance of each pixel in the target selection window at the end of the iteration into a preset scale determination algorithm to generate the corresponding spectral segmentation scale and texture segmentation scale.
5. The method of claim 4, wherein segmenting the deep feature map and the shallow feature map according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions includes: using the mean shift algorithm to segment the deep feature map and the shallow feature map based on the spatial segmentation scale, the spectral segmentation scale, and the texture segmentation scale to generate the deep farmland feature regions and the shallow farmland feature regions.
6. The method of claim 5, wherein fusing the deep farmland feature regions and the shallow farmland feature regions to generate a farmland feature map includes: constructing a deep region adjacency graph corresponding to the deep farmland feature regions and a shallow region adjacency graph corresponding to the shallow farmland feature regions; merging the regions in the shallow region adjacency graph based on a merging cost function and a preset merging cost threshold to generate a merged shallow region adjacency graph; merging the regions in the deep region adjacency graph based on the merging cost function and the preset merging cost threshold to generate a merged deep region adjacency graph; and fusing the merged deep region adjacency graph and the merged shallow region adjacency graph to generate the farmland feature map.
7. The method according to claim 6, characterized in that merging the regions in the shallow region adjacency graph based on the merging cost function and the preset merging cost threshold to generate the merged shallow region adjacency graph includes: determining the merging cost of each pair of adjacent regions in the shallow region adjacency graph based on the merging cost function; and selecting adjacent regions whose merging cost is less than the preset merging cost threshold and merging them to generate the merged shallow region adjacency graph.
8. The method according to any one of claims 1 to 7, characterized in that, after fusing the deep farmland feature regions and the shallow farmland feature regions to generate the farmland feature map, the method further includes: extracting farmland areas from the agricultural remote sensing image based on the farmland feature map to generate a farmland image.
9. A remote sensing image processing apparatus, characterized by comprising: a first generation module, used to perform feature extraction and feature fusion on the agricultural remote sensing image to be processed based on a preset feature processing model and generate corresponding deep feature maps and shallow feature maps, the preset feature processing model including a feature pyramid structure with spatial refinement modules embedded in the structure of adjacent layers from top to bottom; a second generation module, used to extract texture features from the deep feature map and the shallow feature map to generate deep texture features and shallow texture features; a determination module, used to cluster the shallow feature map based on the deep texture features and the shallow texture features using the K-means clustering algorithm to generate shallow homogeneous regions, the shallow homogeneous regions being regions with small pixel color differences on the shallow feature map, and to determine the image segmentation scale based on the shallow homogeneous regions and the deep feature map; a third generation module, used to segment the deep feature map and the shallow feature map according to the image segmentation scale to generate deep farmland feature regions and shallow farmland feature regions; and a fourth generation module, used to fuse the deep farmland feature regions and the shallow farmland feature regions to generate a farmland feature map.
10. An electronic device, comprising: a memory and a processor; the memory stores computer-executable instructions; and the processor executes the computer-executable instructions stored in the memory to implement the remote sensing image processing method as described in any one of claims 1 to 8.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the remote sensing image processing method as described in any one of claims 1 to 8.
12. A computer program product, comprising a computer program, characterized in that, when executed by a processor, the computer program implements the remote sensing image processing method as described in any one of claims 1 to 8.