An adaptive interpolation method for reconstruction of national land space investigation data

By using an adaptive interpolation method that combines multi-dimensional data features and spatial correlations to dynamically calculate the number of neighbors and their weights, the problems of low accuracy due to data gaps and boundary offsets in traditional interpolation algorithms are solved, thus achieving high-precision reconstruction of national spatial data.

CN122285652A · Pending · Publication Date: 2026-06-26 · GEOPHYSICAL SURVEY TEAM OF SHANDONG COALFIELD GEOLOGY BUREAU +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
GEOPHYSICAL SURVEY TEAM OF SHANDONG COALFIELD GEOLOGY BUREAU
Filing Date
2026-05-27
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Traditional interpolation algorithms in land spatial surveys suffer from low accuracy in filling data gaps and boundary shifts in heterogeneous data due to fixed parameters and a lack of multi-dimensional collaborative constraints, which reduces the reliability of data reconstruction and decision-making.

Method used

An adaptive interpolation method is adopted. By obtaining the feature vectors and positions of each dimension, the number of adaptive neighbors is calculated. Combining the spatial correlation and stability of multi-dimensional data, the interpolation weight is calculated to achieve adaptive interpolation and alignment reconstruction of multi-dimensional data.

Benefits of technology

It effectively overcomes the problems of excessive smoothing in dense data areas and distortion in sparse areas, improves the reliability and accuracy of multidimensional spatial data reconstruction, suppresses noise interference, and achieves high-precision land spatial data reconstruction.

✦ Generated by Eureka AI based on patent content.

Figure CN122285652A_ABST

Abstract

This invention relates to the field of data processing technology, and in particular to an adaptive interpolation method for reconstructing land spatial survey data. The method includes: acquiring the locations of sampling points, feature vectors, and interpolation location sets for each dimension; calculating the adaptive neighbor number for any interpolation location in the current dimension, combining the spatial distance within a preset detection range and globally optimized constraint parameters; obtaining the comprehensive reliability of the interpolation location by analyzing the spatial correlation between the current dimension and other dimensions, and the data stability of other dimensions of the interpolation location; calculating the reference degree of each nearest neighbor point determined by the adaptive neighbor number to the interpolation location; and normalizing to obtain the adaptive interpolation weight of each nearest neighbor point, thereby obtaining the feature value of the interpolation location and achieving multi-dimensional data spatial alignment reconstruction. This invention improves the fidelity and analytical reliability of heterogeneous geospatial boundary reconstruction through dynamic evaluation and multi-source collaborative constraints.

Description

Technical Field

[0001] This invention relates to the field of data processing technology, and in particular to an adaptive interpolation method for reconstructing land spatial survey data.

Background Technology

[0002] With the advancement of refined land and space management, spatial surveys are shifting from single-data statistics to comprehensive analysis through multi-source collaboration. To understand land use and resource distribution, it is usually necessary to integrate multi-source data such as spatial raster data, Geographic Information System (GIS) data, and survey statistics to reconstruct and analyze multi-dimensional spatial data. However, data acquired by different sensing devices exhibit significant inconsistencies in spatial sampling resolution. High-resolution fine data layers and low-resolution coarse data layers cannot be directly matched and stitched together. Data fusion technology is needed to fill in the missing values caused by resolution differences, thereby constructing a unified, high-precision spatial analysis foundation.

[0003] To address the spatial gaps caused by resolution differences, spatial interpolation algorithms are typically used to align data. For example, the traditional K-nearest neighbor interpolation algorithm sets a fixed number of neighbors around the blank positions where data needs to be filled, assigns weights based on physical spatial distance, and calculates the missing features at that position through a weighted average method, thereby expanding and aligning the low-resolution data layer.
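For reference, the traditional fixed-neighbor scheme described above can be sketched as follows. This is a minimal illustration, not code from the patent: the function name `knn_idw`, the inverse-distance weighting, and the use of scalar values instead of feature vectors are all assumptions made to keep the example short.

```python
import math

def knn_idw(query, points, values, k=3, eps=1e-12):
    """Fixed-k nearest-neighbor interpolation with inverse-distance weights."""
    # Rank all samples by Euclidean distance to the query location.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    nearest = order[:k]
    # Closer samples get larger weights; eps guards against division by zero.
    raw = [1.0 / (math.dist(query, points[i]) + eps) for i in nearest]
    total = sum(raw)
    # Weighted average of the k neighbor values.
    return sum(w / total * values[i] for w, i in zip(raw, nearest))

points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
values = [1.0, 3.0, 3.0, 100.0]
estimate_k3 = knn_idw((1.0, 1.0), points, values, k=3)
estimate_k4 = knn_idw((1.0, 1.0), points, values, k=4)
```

With `k=3` only the three nearby samples contribute; with `k=4` the distant sample with value 100 is mixed in, which is the kind of sensitivity to a fixed `k` that the following paragraph criticizes.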

[0004] However, the actual distribution of geographic elements in real land space exhibits strong spatial heterogeneity. If the fixed number of neighbors is set too large, then in data-dense areas such as towns the algorithm crosses micro-physical boundaries and mixes in irrelevant features, excessively smoothing high-frequency details. If the fixed number of neighbors is set too small, then in data-sparse areas such as mountainous regions the interpolation result is dominated by a very small number of isolated distant points, which cannot provide statistically stable support and therefore produces severe local distortion. Furthermore, traditional interpolation relies only on the spatial distance within a single data layer and ignores the collaborative guidance provided by other multi-source data layers. For example, when interpolating at a boundary where data features change abruptly, without the structural constraints of multi-source data layers it is easy to fill features from one side of the boundary into the region on the other side. This fixed-parameter processing, lacking multi-dimensional collaborative constraints, leads to severe offsets and logical conflicts at heterogeneous data boundaries after spatial alignment of multi-source data, reducing the reliability of land spatial data reconstruction and subsequent decision-making.

Summary of the Invention

[0005] In view of this, embodiments of the present invention provide an adaptive interpolation method for reconstructing territorial spatial survey data, in order to solve the problems of low accuracy of spatial missing value interpolation and heterogeneous data boundary offset caused by the use of fixed parameters and lack of multi-dimensional collaborative constraints in the interpolation algorithm during the fusion of multi-source heterogeneous data.

[0006] This invention provides an adaptive interpolation method for reconstructing land spatial survey data, comprising the following steps:
obtaining the feature vectors and positions of the sampling points in each dimension, and extracting the interpolation position set for each dimension;
for any interpolation position in the current dimension, calculating the adaptive neighbor number of the interpolation position from the spatial distances between the interpolation position and all probe nearest neighbors within its preset baseline detection range, together with the globally optimized fixed minimum neighbor number and mapping relationship coefficient;
obtaining, for any sampling point in the current dimension, its corresponding paired point in any other dimension; after sorting, calculating the similarity of the feature vectors between adjacent sampling points of the current dimension and between adjacent paired points, and constructing a first sequence and a second sequence respectively;
calculating the spatial correlation between the current dimension and any other dimension based on the correlation between the two sequences and the spatial distance between each sampling point and its corresponding paired point;
extracting, based on the adaptive neighbor number of the interpolation position, all nearest neighbors in the current dimension and all neighbor points in any other dimension;
calculating the data stability of the interpolation position in any other dimension based on the similarity of the feature vectors of each pair of neighbor points;
weighting the data stability of each other dimension by the corresponding spatial correlation to obtain the comprehensive reliability of the interpolation position;
combining the spatial distance between the interpolation position and any nearest neighbor point, the similarity of the feature vectors between the nearest neighbors, and the globally optimized weight allocation parameter to calculate the reference degree of any nearest neighbor point to the interpolation position, and performing linear normalization to obtain the adaptive interpolation weight of any nearest neighbor point;
inputting all nearest neighbors and their adaptive interpolation weights into the K-nearest neighbor interpolation algorithm to obtain the feature value of the interpolation position;
thereby realizing adaptive interpolation of missing data and spatial alignment and reconstruction of multidimensional data.

[0007] Preferably, the step of extracting the interpolation position set for each dimension includes: marking all sampling point coordinates in all dimensions as a global position set, marking all sampling point coordinates in each dimension as a position set for each dimension, performing a spatial difference operation between the global position set and the position set for each dimension, extracting position coordinates that exist in the global position set but not in the position set for each dimension, and obtaining the interpolation position set for each dimension.

[0008] Preferably, the calculation of the adaptive neighbor number for the interpolation position includes: obtaining all probe nearest neighbors within the preset baseline detection range using the preset baseline number of probe neighbors and the K-nearest neighbor algorithm; and calculating the adaptive neighbor number $K_q$ of the interpolation position $q$ in the current dimension as

$$K_q = \left\lceil K_{\min} + \alpha \cdot \frac{1}{M} \sum_{j=1}^{M} d_{q,j} \right\rceil$$

In the formula, $K_{\min}$ represents the globally optimized fixed minimum number of neighbors in the current dimension; $\alpha$ represents the globally optimized mapping relationship coefficient in the current dimension; $j$ and $M$ denote the index and total number of the baseline probe neighbors; $d_{q,j}$ denotes the normalized Euclidean distance between the interpolation position $q$ in the current dimension and its $j$-th probe nearest neighbor; $\lceil \cdot \rceil$ represents the function for rounding up. The expression is reconstructed from these definitions, as the formula itself is not preserved in this text.

[0009] Preferably, obtaining the corresponding pairing point between any sampling point in the current dimension and any other dimension includes: traversing to obtain the sampling point corresponding to the minimum Euclidean distance between the sampling point and all sampling points in any other dimension, and using it as the corresponding pairing point between the sampling point and any other dimension.

[0010] Preferably, the step of calculating the similarity of feature vectors between adjacent points after sorting the sampling points of the current dimension and their corresponding paired points includes: for the current dimension, sorting all sampling points in ascending order of their initially stored grid numbers and calculating the cosine similarity of the feature vectors between adjacent sampling points; and sorting all paired points corresponding to any other dimension in ascending order of their initially stored grid numbers and calculating the cosine similarity of the feature vectors between adjacent paired points.

[0011] Preferably, the spatial correlation between the current dimension and any other dimension satisfies the expression:

$$R_b = |\rho_b| \cdot \left(1 - \frac{1}{N} \sum_{i=1}^{N} e_{i,b}\right)$$

In the formula, $R_b$ denotes the spatial correlation between the current dimension and the $b$-th other dimension; $|\rho_b|$ represents the absolute value of the Spearman rank correlation coefficient between the first and second sequences; $e_{i,b}$ denotes the normalized Euclidean distance between sampling point $i$ of the current dimension and its corresponding paired point in the $b$-th other dimension; $i$ and $N$ denote the index and total number of sampling points in the current dimension. The formula itself is not preserved in this text; the expression shown is one form consistent with these definitions.

[0012] Preferably, the method for obtaining the comprehensive reliability of the interpolation position includes: calculating the cosine similarity of the feature vectors of all pairwise neighboring points, taking the absolute value of all cosine similarities and averaging them to obtain the data stability of any other dimension; calculating the product of the spatial correlation and the data stability, summing all the product results to obtain a summation result, calculating the ratio of the summation result to the sum of all spatial correlations and normalizing it to obtain the comprehensive reliability.
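The weighted combination in [0012] can be sketched as below. This is an illustrative reading, not the patent's implementation: the function name is hypothetical, and the final "normalizing" step is assumed to mean clipping the ratio into [0, 1], since the text does not specify it.

```python
def comprehensive_reliability(correlations, stabilities):
    """Correlation-weighted mean of per-dimension data stability.

    correlations[b]: spatial correlation with the b-th other dimension.
    stabilities[b]:  data stability of the b-th other dimension at the
                     interpolation position.
    """
    total = sum(correlations)
    if total == 0:
        return 0.0  # no usable reference dimension
    ratio = sum(r * s for r, s in zip(correlations, stabilities)) / total
    # ASSUMPTION: normalization is taken to be clipping into [0, 1].
    return min(max(ratio, 0.0), 1.0)

# Two reference dimensions: a strongly correlated, stable one and a
# weakly correlated, less stable one.
reliability = comprehensive_reliability([0.9, 0.3], [0.8, 0.4])
```

Because each stability is weighted by its dimension's spatial correlation, a weakly correlated dimension has little influence on the result even when its local data are unstable.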

[0013] Preferably, the reference degree of any nearest neighbor point to the interpolation position satisfies an expression (not preserved in this text) in the following quantities: $G_{q,j}$ denotes the reference degree of the nearest neighbor point $j$ for the interpolation position $q$; $d_{q,j}$ denotes the normalized Euclidean distance between the nearest neighbor point $j$ and the interpolation position $q$; $\beta$ is the globally optimized weight allocation parameter in the current dimension; $C_q$ denotes the comprehensive reliability at the interpolation position $q$ in the current dimension; $S_{q,j}$ denotes the local feature consistency of the nearest neighbor point $j$ of the interpolation position $q$, equal to the mean of the absolute values of the cosine similarities between the feature vector of that nearest neighbor and the feature vectors of all other nearest neighbors.

[0014] Preferably, the step of performing linear normalization to obtain the adaptive interpolation weight of any nearest neighbor point includes: dividing the reference degree of any nearest neighbor point to the interpolation position by the sum of the reference degrees of all nearest neighbors at the interpolation position to obtain the adaptive interpolation weight of any nearest neighbor point.
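A minimal sketch of the linear normalization in [0014], followed by the weighted average that a K-nearest neighbor interpolation would apply to the neighbor feature vectors. The reference degrees are taken as given inputs here, and the function names are hypothetical.

```python
def normalize_weights(reference_degrees):
    """Divide each reference degree by the sum over all nearest neighbors."""
    total = sum(reference_degrees)
    return [g / total for g in reference_degrees]

def weighted_feature(weights, neighbor_features):
    """Weighted average of neighbor feature vectors, component by component."""
    dim = len(neighbor_features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, neighbor_features))
        for d in range(dim)
    ]

weights = normalize_weights([2.0, 1.0, 1.0])
feature = weighted_feature(weights, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The weights sum to 1, so the interpolated feature stays within the range spanned by the neighbor features in every component.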

[0015] Preferably, the fixed minimum neighbor number, mapping coefficient, and weight allocation parameters for global optimization are obtained as follows: Test interpolation positions are extracted from all sampling points in the current dimension using spatial hierarchical random sampling; a constrained search space containing the fixed minimum neighbor number, mapping coefficient, and weight allocation parameters is constructed; in the particle swarm optimization iteration, the particle parameter combination is substituted into the calculation of the adaptive interpolation weights of each nearest neighbor point at the test interpolation position, and the predicted feature value is obtained through the K-nearest neighbor interpolation algorithm; the parameter combination that minimizes the root mean square error between the predicted feature value and the true value is selected as the fixed minimum neighbor number, mapping coefficient, and weight allocation parameters for global optimization.
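The particle swarm search in [0015] might look roughly like the sketch below. Several things are assumptions rather than details from the patent: the inertia and acceleration constants, the treatment of the minimum neighbor number as a continuous variable that is rounded after the search, and above all `toy_rmse`, a hypothetical stand-in for the real objective, which would run the interpolation at the held-out test positions and return the root mean square error against the true feature values.

```python
import random

def pso(objective, bounds, n_particles=30, n_iter=200, seed=0,
        w=0.7, c1=1.5, c2=1.5):
    """Minimize objective over a box-constrained search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best scores
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the particle to the constrained search space.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def toy_rmse(params):
    # HYPOTHETICAL objective with a known minimum, used only to exercise
    # the search; it does not perform any interpolation.
    k_min, alpha, beta = params
    return ((k_min - 5.0) ** 2 + (alpha - 12.0) ** 2 + (beta - 0.4) ** 2) ** 0.5

# Search space: minimum neighbor number, mapping coefficient, weight parameter.
bounds = [(1.0, 20.0), (0.0, 50.0), (0.0, 1.0)]
best, best_score = pso(toy_rmse, bounds)
k_min_opt = round(best[0])  # the neighbor number must be an integer
```

Swapping `toy_rmse` for a function that evaluates the interpolation on test positions drawn by spatially stratified sampling would give the selection procedure the paragraph describes.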

[0016] The beneficial effects of the embodiments of the present invention compared with the prior art are as follows:

[0017] This invention abandons the fixed neighbor number setting and dynamically calculates the adaptive neighbor number that best matches local characteristics by combining the physical spatial distance within the benchmark detection range with the globally optimized mapping relationship coefficient, effectively overcoming the problems of excessive smoothing in dense data areas and distortion in sparse areas. Secondly, it introduces the spatial correlation of multi-source data and the stability of local data to evaluate the overall reliability, and combines local feature consistency to calculate the adaptive interpolation weight under multi-source collaborative constraints through multi-dimensional feature product modulation and weighted fusion. This mechanism, combined with global parameter optimization, effectively suppresses the interference of invalid noise points that cross abrupt surface boundaries, achieving high-precision reconstruction of real geographical boundaries in complex and heterogeneous regions, and improving the reliability of multi-dimensional spatial data reconstruction and intelligent analysis decision-making.

Attached Figure Description

[0018] To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0019] Figure 1 is a flowchart of an adaptive interpolation method for reconstructing land spatial survey data, provided in Embodiment 1 of the present invention.

Detailed Implementation

[0020] Embodiments of this disclosure are described in detail below, with examples of these embodiments illustrated in the accompanying drawings. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain this disclosure, and should not be construed as limiting it.

[0021] To illustrate the technical solution of the present invention, specific embodiments are described below.

[0022] This invention provides an adaptive interpolation method for reconstructing land spatial survey data. As shown in Figure 1, the method includes the following steps:

[0023] Step S101: Obtain the set of position points required for alignment interpolation in each dimension.

[0024] It should be noted that in order to accurately interpolate and align low-resolution data to a high-resolution unified grid, they must be converted into a unified feature representation format with the same dimensions. In addition, since the sampling points of data in different dimensions have spatial differences, before blind interpolation, it is necessary to compare each data layer to determine which spatial coordinate points of the current dimension are missing data, thereby locating the specific location where interpolation is needed, and providing an accurate basic location input source for the subsequent adaptive interpolation algorithm.

[0025] Specifically, multidimensional raw data of national land space is acquired through various sensing and detection devices and business databases. The first dimension is spatial grid data: physical signal detection of the target area is performed using preset multispectral sensors, LiDAR, or thermal infrared detection devices. The data is essentially a two-dimensional numerical matrix array generated by discretizing and mapping the physical energy values captured by the sensors according to the spatial grid, and the physical energy value corresponding to each grid slice is obtained. The second dimension is geographic information system (GIS) data: for each grid slice, the GIS data of the area is obtained from the geographic information database, and its vectorized business attribute information is extracted. The third dimension is survey statistics data: discretized statistical indicators associated with the spatial unit of each grid slice are extracted from the administrative management or special survey system.

[0026] After obtaining the multidimensional raw data of the national land space, a unified coordinate system transformation is performed, and feature vectors for each sampling point of each dimension of the data are constructed as follows:

[0027] For spatial raster data, based on the feature encoding network in the geographic information system, the gridded discrete physical observations are input into the hidden layer of the network, and the output is a numerical vector of fixed dimensions for each grid tile, which serves as the feature vector for each sampling point of the spatial raster data. The feature vector is specifically represented as a 128-dimensional floating-point array, such as [0.12, 0.45, ..., 0.88]. This vector quantitatively characterizes the physical distribution attributes of the grid tiles in mathematical space.

[0028] For GIS data, the vectorized business attribute fields corresponding to each grid tile are extracted and standardized, such as land use type codes, building density percentages, and average floor area ratios. Because this type of business data mixes discrete category information (non-continuous values such as land use type codes) with continuous attribute values (such as building density percentages and average floor area ratios), the system inputs it into a pre-built multilayer perceptron network with an embedding layer. The embedding layer maps high-dimensional, sparse, discrete business category codes to low-dimensional, dense, continuous features through pre-trained weights. These features are then fused and compressed by the fully connected layers of the multilayer perceptron, so that the output is a numerical feature vector with the same dimensionality as the feature vectors of the first dimension. This vector is then used as the feature vector for each sampling point in the GIS data.

[0029] For the survey statistics, the indicators associated with each grid slice spatial unit are extracted and standardized, such as per capita green coverage area, output value per unit area, and population density distribution value. The same dimensionality compression process is then performed to obtain the feature vector of each sampling point of the survey statistics.

[0030] Furthermore, the location coordinates of each sampling point in each dimension of data are obtained. For spatial raster data, the geographic coordinates of the center of each grid tile are calculated and extracted using the starting coordinates of the data array in that dimension and the grid sampling step size, and used as the location coordinates of each sampling point in the spatial raster data. For GIS data, the centroid of the isometric geometry or the node coordinates of the point data of each grid tile are extracted using a geometry engine, and used as the location coordinates of each sampling point in the GIS data. For survey statistics, the center coordinates of the corresponding administrative region boundary are extracted based on the administrative division code associated with each grid tile, and used as the location coordinates of each sampling point in the survey statistics.
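For the raster case, the tile-center calculation from a starting coordinate and a sampling step can be sketched as below. The function name, the assumption that the starting coordinate is the corner of the first tile, and the assumption that both axes increase with the tile index are illustrative choices, not details from the patent.

```python
def tile_centers(x0, y0, step, n_rows, n_cols):
    """Center coordinates of each grid tile, in row-major order.

    ASSUMPTION: (x0, y0) is the outer corner of tile (row 0, col 0) and
    coordinates grow by `step` per tile along both axes.
    """
    return [
        (x0 + (col + 0.5) * step, y0 + (row + 0.5) * step)
        for row in range(n_rows)
        for col in range(n_cols)
    ]

centers = tile_centers(x0=100.0, y0=200.0, step=10.0, n_rows=2, n_cols=3)
```

Rasters whose row index runs from north to south would need the sign of the y term flipped.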

[0031] This embodiment takes the current dimension, i.e., the spatial raster data dimension, as an example for detailed analysis:

[0032] For the current dimension, extract the position coordinates of all sampling points in the current dimension and denote them as the position set of the current dimension. Likewise, denote the position coordinates of all sampling points across all dimensions in this survey as the global position set. Perform a spatial difference operation between the global position set and the position set of the current dimension, that is, extract the position coordinates that exist in the global position set but not in the position set of the current dimension. The result is the set of positions required for alignment interpolation in the current dimension, which provides an accurate input source for the subsequent adaptive-weight imputation of missing data.
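The spatial difference operation amounts to a set difference over coordinates. The sketch below assumes that coordinates from all dimensions have already been transformed to the unified coordinate system and snapped to comparable values, so that exact equality is meaningful; the dimension names and the function name are hypothetical.

```python
def interpolation_positions(position_sets, current):
    """Positions present in some dimension but missing from `current`."""
    # Global position set: union of the positions of every dimension.
    global_set = set().union(*position_sets.values())
    # Keep what the current dimension lacks; sort for a stable order.
    return sorted(global_set - set(position_sets[current]))

position_sets = {
    "raster": [(0, 0), (0, 2), (2, 0), (2, 2)],
    "gis": [(0, 0), (1, 1), (2, 2)],
    "survey": [(1, 1), (3, 3)],
}
missing_in_raster = interpolation_positions(position_sets, "raster")
```

With real floating-point coordinates, a tolerance-based match or rounding to the grid resolution would be needed before taking the difference.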

[0033] At this point, the set of position points required for the current dimension alignment interpolation has been obtained.

[0034] Step S102: Obtain the adaptive number of neighbors for each interpolation position in the current dimension.

[0035] It should be noted that after determining the specific interpolation location, considering that in actual land spatial surveys, forcibly using the same number of neighbors in dense and sparse areas would lead to excessive smoothing in dense areas due to interference from distant heterogeneous points, while sparse areas would suffer distortion due to insufficient reference information, a local density assessment mechanism is established. This mechanism defines a detection benchmark range, assesses the density state around the interpolation point, and combines optimized global hyperparameters to dynamically calculate the adaptive number of neighbors that best matches the local characteristics of each specific interpolation coordinate point.

[0036] Specifically, a baseline number of probe neighbors is preset, and the K-nearest neighbor algorithm is used to obtain all probe nearest neighbors at any interpolation position in the current dimension.

[0037] It should be added that the baseline number of probe neighbors defines the baseline range of density detection. According to the central limit theorem in statistics, when the number of neighbors is greater than or equal to 30, the calculated mean spatial distance has sufficient statistical stability and significance, which effectively avoids density misjudgment caused by a very small number of isolated outliers. On the other hand, keeping the upper limit within 100 ensures that the detection range stays close to the microscopic physical neighborhood of the point to be interpolated, preserving local detail as far as possible. In this embodiment, the baseline number of probe neighbors is set to 50. Implementers can fine-tune it within this threshold range according to the physical resolution of the actual grid and the computing power requirements of the system.

[0038] Based on the spatial distance between any interpolation position in the current dimension and any of its detected nearest neighbors, and the hyperparameters for global optimization, calculate the adaptive neighbor count for any interpolation position in the current dimension; the specific calculation formula is as follows:

[0039] $$K_q = \left\lceil K_{\min} + \alpha \cdot \frac{1}{M} \sum_{j=1}^{M} d_{q,j} \right\rceil$$ (expression reconstructed from the definitions in [0040] and the behavior described in [0041]; the formula itself is not preserved in this text)

[0040] In the formula, $K_q$ denotes the adaptive number of neighbors of the interpolation position $q$ in the current dimension; $K_{\min}$ represents the globally optimized fixed minimum number of neighbors in the current dimension; $\alpha$ represents the globally optimized mapping relationship coefficient in the current dimension; $j$ and $M$ denote the index and total number of the baseline probe neighbors; $d_{q,j}$ denotes the normalized Euclidean distance between the interpolation position $q$ in the current dimension and its $j$-th probe nearest neighbor; $\lceil \cdot \rceil$ denotes the rounding function, taken as rounding up in agreement with paragraph [0008]. The parameters $K_{\min}$ and $\alpha$ limit, respectively, the minimum amount of baseline information required for interpolation and the upper limit of the adaptive increase in the number of neighbors. In this step, these two parameters serve only as model inputs; their values are obtained by the subsequent parameter optimization algorithm based on spatial cross-validation.

[0041] Here, the mean normalized distance $\frac{1}{M}\sum_{j=1}^{M} d_{q,j}$ between the interpolation position $q$ and its $M$ probe nearest neighbors reflects the sparsity of the local detection space around $q$ in the current dimension. The smaller this value, the closer the probe nearest neighbors of $q$ lie in physical space for the same number of probes; as the mean decreases, the calculated adaptive neighbor number $K_q$ approaches the minimum benchmark value $K_{\min}$, which avoids introducing distant heterogeneous points in dense areas and preserves detail. Conversely, a larger mean indicates that the probe nearest neighbors are sparsely distributed in physical space, and $K_q$ is increased significantly to expand the search range, absorb sufficient reference information, and stabilize interpolation in sparse regions. The rounding function ensures that the final result is an integer, as the KNN algorithm requires a discrete number of neighbors.
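The density-dependent behavior described in Step S102 can be illustrated with the sketch below. It rests on assumptions that the text does not confirm: the additive form `ceil(k_min + alpha * mean)`, rounding up as worded in paragraph [0008], and normalization of distances by a global reference distance `d_ref` (for example the extent of the study area) with clipping at 1. All names are hypothetical.

```python
import math

def adaptive_neighbor_count(query, points, k_min, alpha, m, d_ref):
    """Adaptive neighbor number for one interpolation position.

    ASSUMPTIONS (not from the source): additive form and normalization
    by a global reference distance d_ref, clipped to 1.
    """
    # Distances to the m baseline probe nearest neighbors.
    probe = sorted(math.dist(query, p) for p in points)[:m]
    # Mean normalized distance: small in dense areas, large in sparse ones.
    mean_norm = sum(min(d / d_ref, 1.0) for d in probe) / len(probe)
    return math.ceil(k_min + alpha * mean_norm)

# Same layout at two scales: tightly packed versus widely spaced samples.
dense = [(0.1 * x, 0.1 * y) for x in range(5) for y in range(5)]
sparse = [(10.0 * x, 10.0 * y) for x in range(5) for y in range(5)]
k_dense = adaptive_neighbor_count((0.25, 0.25), dense,
                                  k_min=3, alpha=20, m=8, d_ref=50.0)
k_sparse = adaptive_neighbor_count((25.0, 25.0), sparse,
                                   k_min=3, alpha=20, m=8, d_ref=50.0)
```

The dense layout yields a count close to `k_min`, while the sparse layout yields a noticeably larger one. Normalizing by a global reference rather than by the largest probe distance matters here: a purely local normalization would make both layouts look identical.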

[0042] At this point, the adaptive number of neighbors for each interpolation position in the current dimension has been obtained.

[0043] Step S103: Obtain the comprehensive reliability of each interpolation position in the current dimension.

[0044] It should be noted that when interpolating data dimensions, since different dimensions of data reflect the same spatial object or regional characteristics, the data of each dimension often exhibit certain coordinated change patterns. In order to overcome the limitations of missing information in single-dimensional data, the spatial distribution correlation between the current dimension and other dimensions of data can be analyzed so that the data of other dimensions can be used together as a reference during the interpolation process to improve the interpolation accuracy. Since the spatial resolution and coordinates of different data are not completely aligned, it is necessary to first find the closest position in space to establish a mapping pair, and analyze the synchronous change correlation of the paired points on the spatial feature sequence, so as to provide data basis for the subsequent accurate interpolation process.

[0045] Specifically, for each sampling point in the current dimension, iterate through and obtain the sampling point corresponding to the minimum Euclidean distance between the sampling point and all sampling points in any other dimension, and use it as the pairing point between the sampling point and any other dimension; obtain all pairing points between all sampling points in the current dimension and any other dimension.
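The pairing rule is a nearest-neighbor lookup across dimensions. The brute-force sketch below follows the traversal described in the paragraph; the function name is hypothetical, and a spatial index such as a k-d tree would be the usual substitute for large point sets.

```python
import math

def pair_points(current_points, other_points):
    """Index of the nearest other-dimension point for each current point."""
    return [
        min(range(len(other_points)),
            key=lambda j: math.dist(p, other_points[j]))
        for p in current_points
    ]

current_points = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
other_points = [(1.0, 0.0), (4.0, 6.0), (10.0, 0.0)]
pairs = pair_points(current_points, other_points)
# Distance from each current point to its paired point.
pair_distances = [math.dist(p, other_points[j])
                  for p, j in zip(current_points, pairs)]
```

Several current-dimension points may share the same paired point when the other dimension is sampled more coarsely.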

[0046] Sort all sampling points in the current dimension in ascending order of their initially stored grid numbers, calculate the cosine similarity between the feature vectors of all adjacent sampling points after sorting, and construct the first sequence. In the same way, sort all paired points in ascending order of their initially stored grid numbers, calculate the cosine similarity between the feature vectors of all adjacent paired points after sorting, and construct the second sequence.
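Building one of these sequences can be sketched as follows. The same helper would be applied once to the current dimension's sampling points and once to the paired points; the function names and the convention of returning 0 for a zero-length vector are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def adjacent_similarity(grid_ids, features):
    """Cosine similarity between neighbors after sorting by grid number."""
    order = sorted(range(len(grid_ids)), key=lambda i: grid_ids[i])
    return [cosine(features[a], features[b])
            for a, b in zip(order, order[1:])]

grid_ids = [2, 0, 1]
features = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
first_sequence = adjacent_similarity(grid_ids, features)
```

For N points the sequence has N - 1 entries, so the first and second sequences have equal length whenever every sampling point has exactly one paired point.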

[0047] Based on the correlation between the first and second sequences, and the spatial distance between each sampling point of the current dimension and its corresponding paired point in any other dimension, the spatial correlation between the current dimension and any other dimension is calculated; the specific calculation formula is as follows:

[0048] $$R_b = |\rho_b| \cdot \left(1 - \frac{1}{N} \sum_{i=1}^{N} e_{i,b}\right)$$ (the formula itself is not preserved in this text; the expression shown is one form consistent with the definitions in [0049] and the behavior described in [0050])

[0049] In the formula, $R_b$ denotes the spatial correlation between the current dimension and the $b$-th other dimension; $|\rho_b|$ represents the absolute value of the Spearman rank correlation coefficient between the first and second sequences; $e_{i,b}$ denotes the normalized Euclidean distance between sampling point $i$ of the current dimension and its corresponding paired point in the $b$-th other dimension; $i$ and $N$ denote the index and total number of sampling points in the current dimension.

[0050] Here, the larger $|\rho_b|$ (the absolute Spearman rank correlation between the first and second sequences) is, the more synchronously the data features of the current dimension and the $b$-th other dimension vary, and the stronger the correlation between the two dimensions is considered to be. The mean $\frac{1}{N}\sum_{i=1}^{N} e_{i,b}$ is the average normalized spatial distance between all sampling points of the current dimension and their paired points in the $b$-th other dimension; the smaller this value, the closer the spatial locations of the data in the two dimensions. In summary, the larger $|\rho_b|$ and the smaller the mean distance, the stronger the spatial correlation between the two dimensions, and the more fully the other dimension can be referenced in subsequent interpolation to improve accuracy.
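The two ingredients of the spatial correlation, rank correlation of the similarity sequences and mean paired distance, can be combined as in the sketch below. The product form `|rho| * (1 - mean distance)` is an assumption chosen only because it grows with the correlation and shrinks with the distance, as the paragraph requires; the exact expression is not preserved in the text. The Spearman coefficient is computed from average ranks so that no external library is needed.

```python
import math

def ranks(xs):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def spatial_correlation(seq1, seq2, pair_dists_norm):
    rho = abs(spearman(seq1, seq2))
    mean_dist = sum(pair_dists_norm) / len(pair_dists_norm)
    return rho * (1.0 - mean_dist)  # ASSUMED combining form

first_seq = [0.9, 0.5, 0.7, 0.2]
second_seq = [0.8, 0.4, 0.6, 0.1]       # same ordering as first_seq
pair_dists = [0.1, 0.2, 0.1, 0.0, 0.1]  # normalized, one per sampling point
r_b = spatial_correlation(first_seq, second_seq, pair_dists)
```

In this toy input the two sequences rise and fall together, so the result is governed entirely by the mean paired distance.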

[0051] It should be noted that, in a complex geospatial environment, the other dimensions may themselves contain local data mutations or abnormal noise. For example, spatial raster data may be affected by local sensor acquisition anomalies or signal distortion, or may lie on a sharp data boundary between regions with different business attributes. If a region of another dimension that contains such a local mutation is used directly to guide interpolation in the current dimension, abnormal errors may be introduced into the final result. Therefore, after the global spatial correlation has been assessed, the internal data stability of the other dimensions in the neighborhood of the interpolation point must also be analyzed.

[0052] Specifically, the data stability of any other dimension at any interpolation position in the current dimension is calculated as follows: the adaptive neighbor number of the interpolation position in the current dimension is obtained; a nearest neighbor search centered on the interpolation position is performed in the other (reference) dimension, extracting a number of nearest sampling points equal to the adaptive neighbor number as neighbor data points; for every pair of neighbor data points, the cosine similarity of their feature vectors is calculated; and the absolute values of all these cosine similarities are averaged to obtain the data stability of that other dimension at the interpolation position.
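The procedure in paragraph [0052] can be sketched as follows. This is a minimal illustration using a brute-force nearest neighbor search; the function names are hypothetical, and returning 0.0 when fewer than two neighbors are available is an assumption, since the text does not address that case.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 0.0 if nu == 0.0 or nv == 0.0 else dot / (nu * nv)


def k_nearest(center, positions, k):
    """Indices of the k positions closest to center (Euclidean distance)."""
    order = sorted(range(len(positions)),
                   key=lambda i: math.dist(center, positions[i]))
    return order[:k]


def data_stability(center, ref_positions, ref_features, adaptive_k):
    """Mean absolute cosine similarity over all pairs of neighbor data points."""
    idx = k_nearest(center, ref_positions, adaptive_k)
    pairs = list(combinations(idx, 2))
    if not pairs:
        return 0.0  # ASSUMPTION: no pair to compare, so no evidence of stability
    total = sum(abs(cosine_similarity(ref_features[i], ref_features[j]))
                for i, j in pairs)
    return total / len(pairs)
```

For large datasets a spatial index would normally replace the brute-force sort, but the stability value itself is computed the same way.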

[0053] The higher the data stability, the more consistent the features of the neighbor data points of the other dimension within the detection neighborhood selected for the interpolation position of the current dimension. This indicates that the data distribution of that other dimension is smooth in this local area, without drastic terrain changes, boundary breaks, or abnormal interference, and its local spatial structure is therefore considered more stable.

[0054] Because a national spatial survey contains multiple reference datasets, and the actual guiding value of each dimension for the current dimension at the interpolation point differs, relying on a single reference dimension during interpolation can easily introduce systematic bias. Therefore, the collaborative contribution of all available dimensions must be measured in a comprehensive, weighted manner in order to calculate the comprehensive reliability at the current interpolation position under the collaborative constraints of multi-source data.

[0055] Furthermore, the spatial correlation is used to perform a weighted average of the data stability to obtain the comprehensive reliability of any interpolation position in the current dimension. The comprehensive reliability is calculated as follows: the spatial correlation between the current dimension and the data of each other dimension is multiplied by the data stability of any other dimension corresponding to any interpolation position in the current dimension. All the calculated product results are summed to obtain a summation result. The ratio of the summation result to the sum of all the spatial correlations is calculated. The ratio is normalized to obtain the comprehensive reliability of any interpolation position in the current dimension.
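The weighted average in paragraph [0055] can be sketched as below. The text says the ratio is then normalized but does not say how; clamping to the interval [0, 1] is used here purely as an assumption, and the function name is hypothetical.

```python
def comprehensive_reliability(spatial_correlations, stabilities):
    """Stability of each other dimension, weighted by its spatial correlation.

    spatial_correlations[k] and stabilities[k] both refer to the k-th other
    dimension, evaluated for one interpolation position of the current dimension.
    """
    total = sum(spatial_correlations)
    if total == 0.0:
        return 0.0  # ASSUMPTION: no correlated reference dimension available
    ratio = sum(r * s for r, s in zip(spatial_correlations, stabilities)) / total
    # ASSUMPTION: the normalization step is unspecified; clamp to [0, 1].
    return min(1.0, max(0.0, ratio))
```

With non-negative correlations and stabilities in [0, 1], the ratio already lies in [0, 1], so the clamp only guards against rounding.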

[0056] The higher the comprehensive reliability, the more strongly the other reference dimensions at that interpolation position in the current dimension combine a strong global correlation with the current dimension and a stable local data structure. This means that, with the joint support of multi-dimensional data, the prior data environment for interpolation at this position is favorable, and the interpolation results calculated on this basis are highly reliable.

[0057] At this point, the comprehensive reliability of each interpolation position in the current dimension has been obtained.

[0058] Step S104: Obtain the adaptive interpolation weights of the nearest neighbor points at each interpolation position in the current dimension.

[0059] It should be noted that traditional interpolation relies solely on spatial physical distance to assign weights. In complex land and water areas, such as the water-land boundary zone, it is easy to assign high weights to invalid points that are close in distance but cross topographic boundaries or are themselves noise, leading to boundary ambiguity and error propagation. Therefore, this step constructs a distance attenuation weight based on physical distance and analyzes local feature consistency as a penalty term for local outliers, aiming to suppress the interference of data noise or distorted points on the weights. Combined with the comprehensive reliability obtained above, it provides multi-source prior constraints and uses reference dimension data to verify whether neighboring points cross abrupt boundaries of the surface structure. Through the nonlinear coupling of the above three indicators, invalid sample points that are spatially adjacent but cross topological boundaries of land features are effectively suppressed, thereby achieving high-precision preservation and reconstruction of real geographical boundaries in complex and heterogeneous regions.

[0060] Specifically, for each interpolation position in the current dimension, a nearest neighbor search centered on the interpolation position is performed in the current dimension data, extracting a number of nearest neighbor points equal to the adaptive neighbor number of that interpolation position. For any nearest neighbor point at the interpolation position, the absolute value of the cosine similarity between its feature vector and that of each other nearest neighbor point is calculated, and the mean of these absolute values is taken as the local feature consistency of that nearest neighbor point.
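The local feature consistency of paragraph [0060] can be sketched as follows, assuming the nearest neighbor points of one interpolation position have already been extracted. The function names are hypothetical, and the value 0.0 for a lone neighbor is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 0.0 if nu == 0.0 or nv == 0.0 else dot / (nu * nv)


def local_feature_consistency(neighbor_features):
    """One value per neighbor: mean |cosine similarity| to every other neighbor."""
    n = len(neighbor_features)
    result = []
    for i in range(n):
        others = [abs(cosine_similarity(neighbor_features[i], neighbor_features[j]))
                  for j in range(n) if j != i]
        # ASSUMPTION: a single neighbor has nothing to be consistent with.
        result.append(sum(others) / len(others) if others else 0.0)
    return result
```

A neighbor whose features disagree with the rest of the neighborhood receives a low value, which is what later suppresses its weight.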

[0061] Based on the comprehensive reliability of each interpolation position, the local feature consistency of any nearest neighbor point, and the spatial distance between each interpolation position and that nearest neighbor point, the reference level of any nearest neighbor point to each interpolation position is calculated; the specific calculation formula is as follows:

[0062]

[0063] In the formula, the result is the reference level of a nearest neighbor point to the interpolation position; the distance term is the normalized Euclidean distance between the nearest neighbor point and the interpolation position; the weight allocation parameter is the one globally optimized for the current dimension; the reliability term is the comprehensive reliability at the interpolation position in the current dimension; and the consistency term is the local feature consistency of the nearest neighbor point at the interpolation position.

[0064] Here, the distance-dependent part reflects the attenuation with spatial distance: the smaller the normalized spatial distance, the higher the basic reference level. The product of the comprehensive reliability and the local feature consistency embodies the collaborative constraint of multi-source features, with the comprehensive reliability further amplifying the feature consistency of the nearest neighbor point itself. The larger this product, the more stable the local features of the nearest neighbor point and the better it conforms to the co-variation pattern of the multi-dimensional data. This means that even if the nearest neighbor point is somewhat farther away in physical space, the information it carries still has strong guiding significance for the interpolation point, which corrects the interpolation deviation caused by relying on distance alone. It should be added that the weight allocation parameter adjusts the relative proportion of the spatial distance constraint and the feature collaboration constraint, and is likewise obtained through global hyperparameter optimization.
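To make the roles of the three quantities concrete, the following sketch uses one possible functional form. It is an assumption, not the formula of the present text, which is not reproduced here: an exponential distance decay is blended with the product of comprehensive reliability and local feature consistency, with the weight allocation parameter `alpha` (hypothetical name) setting the proportion between the two.

```python
import math


def reference_level(norm_distance, reliability, consistency, alpha):
    """Illustrative reference level of one nearest neighbor point.

    norm_distance: normalized Euclidean distance to the interpolation position
    reliability:   comprehensive reliability at the interpolation position
    consistency:   local feature consistency of the nearest neighbor point
    alpha:         weight allocation parameter, assumed to lie in [0, 1]
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    # ASSUMPTION: exponential decay stands in for the distance attenuation term.
    distance_term = math.exp(-norm_distance)
    # From the text: reliability multiplies (amplifies) the neighbor's consistency.
    feature_term = reliability * consistency
    # ASSUMPTION: a convex blend stands in for the unspecified coupling.
    return alpha * distance_term + (1.0 - alpha) * feature_term
```

Under this form, a distant neighbor with a high feature term can still outrank a closer neighbor with inconsistent features, which is the qualitative behaviour the paragraph describes.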

[0065] Furthermore, the reference level of any nearest neighbor point to each interpolation position calculated above is transformed into the final interpolation weight. Through the weight allocation mechanism, the optimal selection of high-quality reference points and the suppression of noise points are achieved, thereby ensuring the best interpolation quality.

[0066] To ensure that the sum of the weight contributions of each nearest neighbor point is 1 during interpolation calculation, the reference levels obtained from the above calculation are linearly normalized. Specifically, the reference level of any nearest neighbor point to each interpolation position is divided by the sum of the reference levels of all nearest neighbors at that interpolation position to obtain the adaptive interpolation weight of any nearest neighbor point.
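The linear normalization in paragraph [0066], followed by the weighted combination of neighbor values, can be sketched as follows. Scalar feature values are used for brevity; the function names are hypothetical, and falling back to equal weights when all reference levels are zero is an assumption.

```python
def adaptive_weights(reference_levels):
    """Divide each reference level by their sum so the weights add up to 1."""
    total = sum(reference_levels)
    if total == 0.0:
        # ASSUMPTION: with no usable reference information, weight equally.
        return [1.0 / len(reference_levels)] * len(reference_levels)
    return [r / total for r in reference_levels]


def weighted_interpolate(neighbor_values, weights):
    """Weighted sum of the neighbor values at one interpolation position."""
    return sum(w * v for w, v in zip(weights, neighbor_values))


weights = adaptive_weights([1.0, 1.0, 2.0])
value = weighted_interpolate([10.0, 20.0, 30.0], weights)
```

For feature vectors rather than scalars, the same weighted sum would be applied to each component separately.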

[0067] At this point, the adaptive interpolation weights of the nearest neighbors at each interpolation position in the current dimension are obtained.

[0068] Step S105: Obtain the globally optimized parameters for the current dimension and use them to obtain the feature values at each interpolation position, thereby achieving data alignment and reconstruction.

[0069] It should be noted that the adaptive interpolation weights obtained above reflect local spatial heterogeneity and cross-dimensional collaborative characteristics. These weights are used directly to perform weighted interpolation at the positions where data are missing in the target dimension, achieving spatial resolution alignment of multi-source heterogeneous data. However, the parameters subject to global optimization in the calculation of the adaptive interpolation weights are highly sensitive to the data distribution, so setting them solely from experience makes it difficult to guarantee that the model generalizes across different geographical environments. Therefore, simulation verification is performed beforehand on experimental data to find the optimal parameter combination, so that the model fits the data patterns of the current region and achieves globally optimal constraints.

[0070] Specifically, a certain proportion of all known sampling points in the current dimension is selected as test interpolation positions using spatial stratified random sampling. In this embodiment the proportion is 10%, which the implementer may adjust according to the actual total number of sampling points. A constrained search space is constructed over the fixed minimum neighbor number, the mapping relationship coefficient, and the weight allocation parameter, and a swarm of particles is initialized in this space. In each iteration of the particle swarm optimization algorithm, the parameter combination represented by each particle is substituted into all of the above calculation formulas to obtain the adaptive interpolation weight of each nearest neighbor point at each test interpolation position. These weights are input into the K-nearest neighbor interpolation algorithm to obtain the predicted feature value at each test interpolation position, and the root mean square error between the predicted feature values and the true values over all test interpolation positions is calculated. The parameter combination of the particle that minimizes the root mean square error is taken as the globally optimized fixed minimum neighbor number, mapping relationship coefficient, and weight allocation parameter for the current dimension.
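The search loop of paragraph [0070] can be sketched with a minimal particle swarm optimizer. This is a generic textbook variant, not necessarily the one used in the embodiment: the inertia and acceleration constants, the swarm size, and the clamping of positions to the bounds are all assumptions. `toy_objective` is a hypothetical stand-in for the real objective, which would run the full interpolation at the test positions and return the root mean square error.

```python
import math
import random


def rmse(predicted, actual):
    """Root mean square error between predicted and true feature values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def pso_minimize(objective, bounds, n_particles=20, n_iters=80, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimize objective(params) inside box bounds [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep each parameter inside its constraint boundary.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_objective(params):
    """Hypothetical stand-in for the hold-out RMSE of one parameter combination."""
    k_min = round(params[0])           # the neighbor number must be an integer
    beta, alpha = params[1], params[2]  # mapping coefficient, weight allocation
    return (k_min - 4) ** 2 + (beta - 2.0) ** 2 + (alpha - 0.4) ** 2


best_params, best_error = pso_minimize(
    toy_objective, bounds=[(1.0, 10.0), (0.0, 5.0), (0.0, 1.0)])
```

In practice the bounds would come from the constraint boundaries of the parameters, and the objective would call `rmse` on the predictions at the held-out test interpolation positions.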

[0071] It should be added that the constraint boundaries of the parameters are set based on prior geographical knowledge, such as the "Technical Regulations for Land Survey" and the Spatial Mapping Technical Manual.

[0072] Furthermore, the fixed minimum neighbor number, mapping relationship coefficient, and weight allocation parameter of the current dimension are substituted into all of the above calculation formulas to obtain the nearest neighbor points and their adaptive interpolation weights at each interpolation position in the current dimension. These are then input into the K-nearest neighbor interpolation algorithm to obtain the feature values at each interpolation position in the current dimension.

[0073] The same process is applied to the data of the other dimensions to obtain the feature values at all interpolation positions, thereby achieving high-precision spatial alignment and reconstruction of the multidimensional data. This process provides adaptive numerical interpolation of data missing because of differences in sampling resolution or abnormal equipment acquisition, keeping the feature vectors of each dimension consistent in both mathematical and business logic.

[0074] The reconstructed data can then be used for multidimensional structural restoration of land and space survey data and as a foundation for further analysis. For example, at the level of spatial multidimensional data analysis, the uniformly aligned, high-precision multidimensional features allow the system to extract structural features of land and space elements, characterize spatial distribution patterns, and identify regional differences at fine granularity.

[0075] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.

Claims

1. An adaptive interpolation method for reconstructing territorial spatial survey data, characterized in that it comprises: obtaining the feature vectors and positions of the sampling points in each dimension, and extracting the interpolation position set of each dimension; for any interpolation position in the current dimension: calculating the adaptive neighbor number of the interpolation position using the spatial distances between the interpolation position and all nearby points within its preset baseline detection range, together with the globally optimized fixed minimum neighbor number and mapping relationship coefficient; obtaining, for any sampling point in the current dimension, its corresponding paired point in any other dimension; for the sampling points in the current dimension and their corresponding paired points, calculating the similarity of the feature vectors of adjacent points after sorting, and constructing a first sequence and a second sequence; calculating the spatial correlation between the current dimension and any other dimension based on the correlation between the two sequences and the spatial distance between each sampling point and its corresponding paired point; extracting, based on the adaptive neighbor number of the interpolation position, all nearest neighbor points in the current dimension and all neighbor points in any other dimension; calculating the data stability of any other dimension at the interpolation position based on the similarity of the feature vectors of each pair of neighbor points; weighting the data stability of each other dimension by the spatial correlation to obtain the comprehensive reliability of the interpolation position; combining the spatial distance between the interpolation position and any nearest neighbor point, the similarity of the feature vectors between the nearest neighbor points, and the globally optimized weight allocation parameter to calculate the reference level of any nearest neighbor point to the interpolation position, and performing linear normalization to obtain the adaptive interpolation weight of any nearest neighbor point; and inputting all nearest neighbor points and their adaptive interpolation weights into the K-nearest neighbor interpolation algorithm to obtain the feature value of the interpolation position, thereby achieving adaptive interpolation of missing data and spatial alignment and reconstruction of multidimensional data.

2. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that extracting the interpolation position set of each dimension comprises: taking the coordinates of all sampling points in all dimensions as the global position set, and the coordinates of all sampling points in each dimension as the position set of that dimension; and performing a spatial difference operation between the global position set and the position set of each dimension to extract the position coordinates that exist in the global position set but not in the position set of that dimension, thereby obtaining the interpolation position set of each dimension.

3. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that calculating the adaptive neighbor number of the interpolation position comprises: obtaining all nearest neighbor points within the preset baseline detection range using a preset baseline detection neighbor number and the K-nearest neighbor algorithm; and calculating the adaptive neighbor number of the interpolation position in the current dimension by an expression [formula not reproduced] whose terms are: the fixed minimum neighbor number globally optimized for the current dimension; the mapping relationship coefficient globally optimized for the current dimension; the index and total number of the baseline detection neighbors; the normalized Euclidean distance between the interpolation position in the current dimension and each of its nearest neighbor points; and the round-up (ceiling) function.

4. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that obtaining the corresponding paired point in any other dimension for any sampling point in the current dimension comprises: traversing all sampling points in the other dimension to find the one with the minimum Euclidean distance to the given sampling point, and taking that sampling point as the paired point of the given sampling point in the other dimension.

5. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that calculating the similarity of the feature vectors of adjacent points after sorting, for the sampling points in the current dimension and their corresponding paired points, comprises: sorting all sampling points of the current dimension in ascending order of their initially stored grid numbers and calculating the cosine similarity of the feature vectors of adjacent sampling points; and sorting all paired points in any other dimension that correspond to the sampling points in ascending order of their initially stored grid numbers and calculating the cosine similarity of the feature vectors of adjacent paired points.

6. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that the spatial correlation between the current dimension and any other dimension satisfies an expression [formula not reproduced] whose terms are: the spatial correlation between the current dimension and the other dimension under consideration; the absolute value of the Spearman rank correlation coefficient between the first and second sequences; the normalized Euclidean distance between a sampling point of the current dimension and its corresponding paired point in that other dimension; and the index and total number of sampling points in the current dimension.

7. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that obtaining the comprehensive reliability of the interpolation position comprises: calculating the cosine similarity of the feature vectors of every pair of neighbor points, and averaging the absolute values of all cosine similarities to obtain the data stability of any other dimension; and calculating the product of the spatial correlation and the data stability, summing all the products to obtain a summation result, calculating the ratio of the summation result to the sum of all spatial correlations, and normalizing the ratio to obtain the comprehensive reliability.

8. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that the reference level of any nearest neighbor point to the interpolation position satisfies an expression [formula not reproduced] whose terms are: the reference level of the nearest neighbor point to the interpolation position; the normalized Euclidean distance between the nearest neighbor point and the interpolation position; the weight allocation parameter globally optimized for the current dimension; the comprehensive reliability at the interpolation position in the current dimension; and the local feature consistency of the nearest neighbor point at the interpolation position, which is equal to the mean of the absolute values of the cosine similarities between the feature vector of that nearest neighbor point and the feature vectors of all other nearest neighbor points.

9. The adaptive interpolation method for reconstructing territorial spatial survey data according to claim 1, characterized in that performing linear normalization to obtain the adaptive interpolation weight of any nearest neighbor point comprises: dividing the reference level of the nearest neighbor point to the interpolation position by the sum of the reference levels of all nearest neighbor points at the interpolation position to obtain the adaptive interpolation weight of that nearest neighbor point.

10. The adaptive interpolation method for reconstructing territorial spatial survey data according to any one of claims 1 to 9, characterized in that the globally optimized fixed minimum neighbor number, mapping relationship coefficient, and weight allocation parameter are obtained as follows: extracting test interpolation positions from all sampling points in the current dimension using spatial stratified random sampling; constructing a constrained search space over the fixed minimum neighbor number, the mapping relationship coefficient, and the weight allocation parameter; in each particle swarm optimization iteration, substituting the parameter combination of each particle into the calculation of the adaptive interpolation weights of the nearest neighbor points of the test interpolation positions, and obtaining the predicted feature values by the K-nearest neighbor interpolation algorithm; and selecting the parameter combination that minimizes the root mean square error between the predicted feature values and the true values as the globally optimized fixed minimum neighbor number, mapping relationship coefficient, and weight allocation parameter.