A soil profile hyperspectral image preprocessing method and system
By generating hyperspectral data cubes, pseudo-color images, and TIF data, key points are automatically marked and perspective transformation matrices are generated, solving the problem of low cropping efficiency of hyperspectral image data of soil profiles and achieving efficient and high-precision automatic cropping.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- INST OF AGRI RESOURCES & REGIONAL PLANNING CHINESE ACADEMY OF AGRI SCI
- Filing Date
- 2026-03-20
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies for cropping hyperspectral image data of soil profiles are inefficient, require the installation of multiple software programs on multiple computers, are cumbersome to operate, and make it difficult to guarantee cropping accuracy.
A preprocessing method and system for hyperspectral images of soil profiles is proposed. The method generates a hyperspectral data cube, extracts the center wavelengths, generates a pseudo-color image and TIF data, automatically marks and sorts key points, and generates a perspective transformation matrix, thereby achieving high-precision automatic cropping.
It simplifies the cropping process, improves the efficiency of cropping hyperspectral data of soil profiles, and achieves high-precision automatic cropping without the need to switch between multiple platforms.
[Figure: abstract drawing, CN122265613A_ABST]
Abstract
Description
Technical Field
[0001] This invention relates to the field of image processing technology, and specifically to a preprocessing method and system for hyperspectral images of soil profiles.
Background Technology
[0002] In soil science, hyperspectral image data of soil typically contains rich spectral information and can be used for data-level analysis of soil.
[0003] Currently, the preprocessing of hyperspectral image data of soil profiles mainly relies on various professional software such as SR Analysis (Spectral Resolution Analysis), ENVI (Environment for Visualizing Images), QGIS (Quantum Geographic Information System), and ArcGIS (Arc Geographic Information System). It can only be completed through multiple operation steps, including data calibration, radiometric transformation, data format conversion, grid creation, geometric correction, and cropping of the entire soil profile area.
[0004] The above preprocessing workflow requires the simultaneous installation of multiple specialized software programs on a single computer, or separate installations on multiple computers. The entire process involves repeated saving and switching, is cumbersome, and requires a large amount of manual operation. In particular, the cropping of entire soil profile regions in hyperspectral image data still requires manual configuration of key points, resulting in low efficiency and difficulty in guaranteeing cropping accuracy.
Summary of the Invention
[0005] The purpose of this invention is to address the problem of low cropping efficiency of hyperspectral data of soil profiles in the prior art, and to provide a preprocessing method and system for hyperspectral images of soil profiles; this can improve the cropping efficiency of hyperspectral data of soil profiles.
[0006] To solve the above-mentioned technical problems, the present invention adopts the following technical solution:
[0007] In a first aspect, embodiments of the present invention provide a preprocessing method for a hyperspectral image of a soil profile. The method includes: a preprocessing system generating a first hyperspectral data cube based on acquired hyperspectral data of a first soil profile, and extracting the center wavelengths of each band corresponding to the first soil profile to obtain a first wavelength set; the preprocessing system generating a target object to be processed based on the first hyperspectral data cube and the center wavelengths in the first wavelength set; wherein the target object includes at least one of the following: a pseudo-color image to be cropped, and TIF (Tagged Image File Format) data to be processed; the pseudo-color image to be cropped is a color image obtained by stitching together three single-channel color images (red, green, and blue) extracted by the preprocessing system based on the first hyperspectral data cube; the preprocessing system acquiring key points of the marked soil profile region in the target object and extracting the coordinates of the key points; the preprocessing system sorting the marked key points and generating a target perspective transformation matrix based on the sorted key points; the preprocessing system cropping the soil profile region in the target object based on the target perspective transformation matrix and the target size of the target object, and outputting the cropped target object if the cropping effect meets preset conditions.
[0008] In a second aspect, embodiments of the present invention provide a preprocessing system for hyperspectral images of soil profiles. The preprocessing system includes: a data reading module, a preprocessing object generation module, a key point extraction module, a perspective transformation matrix generation module, and a hyperspectral data cropping module. The data reading module is used to generate a first hyperspectral data cube based on the collected hyperspectral data of a first soil profile, and extract the center wavelengths of each band corresponding to the first soil profile to obtain a first wavelength set. The preprocessing object generation module is used to generate a target object to be processed based on the first hyperspectral data cube and the center wavelengths in the first wavelength set. The target object includes at least one of the following: a pseudo-color image to be cropped, and TIF (Tagged Image File Format) data to be processed; the pseudo-color image to be cropped is a color image obtained by stitching together the red, green and blue single-channel color images extracted by the preprocessing system based on the first hyperspectral data cube; the key point extraction module is used to obtain the key points of the marked soil profile area in the target object and extract the coordinates of the key points; the perspective transformation matrix generation module is used to sort the marked key points and generate a target perspective transformation matrix based on the sorted key points; the hyperspectral data cropping module is used to crop the soil profile area in the target object based on the target perspective transformation matrix and the target size of the target object, and output the cropped target object when the cropping effect meets the preset conditions.
[0009] In a third aspect, embodiments of the present invention provide a computer device, including: a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement any step in the preprocessing method for soil profile hyperspectral images as described in the first aspect above.
[0010] In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, can implement any step in the preprocessing method for the hyperspectral image of the soil profile as described in the first aspect above.
[0011] The preprocessing system of this invention first generates a first hyperspectral data cube based on the collected hyperspectral data of a first soil profile, and extracts the center wavelengths of each band corresponding to the first soil profile to obtain a first wavelength set. Then, based on the first hyperspectral data cube and the center wavelengths in the first wavelength set, a target object to be processed is generated. That is, a pseudo-color image and TIF data are automatically generated for subsequent processing. Since the pseudo-color image is a true-color approximation image obtained by stitching together the red, green and blue single-channel color images extracted from the first hyperspectral data cube, the key points of the marked soil profile area in the target object can be obtained based on the texture in the pseudo-color image, and the coordinates of the key points are extracted. Then, the preprocessing system sorts the marked key points and generates a target perspective transformation matrix based on the sorted key points. Finally, the preprocessing system crops the soil profile area in the target object based on the target perspective transformation matrix and the target size of the target object. If the cropping effect meets the preset conditions, the cropped target object is output. Because the system can automatically mark and sort key points in the generated pseudo-color image, it can quickly generate a perspective transformation matrix that uniformly corrects the tilt and perspective distortion caused by the shooting angle, simplifying the traditional cropping steps. It can achieve high-precision automatic cropping of soil profile areas without switching between multiple platforms, thus improving the cropping efficiency of hyperspectral data of soil profiles.
Attached Figure Description
[0012] Figure 1 is an architecture diagram of a preprocessing system for a soil profile hyperspectral image provided in an embodiment of the present invention.
[0013] Figure 2 is a schematic flowchart of a preprocessing method for a hyperspectral image of a soil profile provided in an embodiment of the present invention.
[0014] Figure 3 is a schematic diagram illustrating the generation logic of a pseudo-color image according to an embodiment of the present invention.
[0015] Figure 4 is a schematic diagram illustrating the generation logic of a TIF file according to an embodiment of the present invention.
[0016] Figure 5 is a schematic diagram of an interactive visualization interface for a preprocessing system provided in an embodiment of the present invention.
[0017] Figure 6 is a schematic diagram of an overall cropping logic provided in an embodiment of the present invention.
[0018] Figure 7 is a hardware structure diagram of a computer device used for a preprocessing method of a soil profile hyperspectral image provided in an embodiment of the present invention.
Detailed Implementation
[0019] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the invention as detailed in the appended claims.
[0020] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The singular forms “a,” “an,” and “the” used in this invention and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and / or” as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
[0021] It should be understood that although the terms first, second, third, etc., may be used in this invention to describe various information, this information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, first information may also be referred to as second information without departing from the scope of this invention, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when," "upon," or "in response to a determination."
[0022] The embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
[0023] Figure 1 is an architecture diagram of a preprocessing system for hyperspectral images of soil profiles provided in an embodiment of the present invention. As shown in Figure 1, the preprocessing system 100 for soil profile hyperspectral images includes: a data reading unit 101, a preprocessing object generation module 102, a soil profile key point extraction unit 103, a perspective transformation matrix generation module 104, and a hyperspectral data cropping unit 105. The data reading unit 101 reads the hyperspectral data of the soil profile to generate a hyperspectral data cube and extracts the center wavelength of each band of the soil profile. The preprocessing object generation module 102 generates a pseudo-color image and TIF data of the soil profile for subsequent analysis. The soil profile key point extraction unit 103 extracts key points of the soil profile region from the acquired hyperspectral data; the perspective transformation matrix generation module 104 generates a perspective transformation matrix for correcting the acquired image; and the hyperspectral data cropping unit 105 crops the object to be cropped according to its target size.
[0024] Figure 2 is a schematic flowchart of a preprocessing method for a hyperspectral image of a soil profile provided in an embodiment of the present invention. As shown in Figure 2, the method includes the following steps S201 to S205:
[0025] S201. The preprocessing system generates a first hyperspectral data cube based on the collected hyperspectral data of the first soil profile, and extracts the center wavelength of each band corresponding to the first soil profile to obtain a first wavelength set.
[0026] A hyperspectral data cube typically includes two spatial dimensions and one spectral dimension. The spatial dimensions give the spatial position of each pixel, namely the row and column where the pixel is located; the spectral dimension (bands) carries the spectral response of each pixel, which reflects its physicochemical properties. The length of the spectral dimension is the number of bands, so each pixel corresponds to one spectral curve.
[0027] Furthermore, the first hyperspectral data cube can be represented using a three-dimensional array with dimensions (rows, cols, bands).
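As a minimal sketch of the (rows, cols, bands) layout described above, the following NumPy snippet builds a small synthetic cube. The sizes and values are hypothetical and chosen only for illustration; they do not come from the patent.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
rows, cols, bands = 4, 5, 6
cube = np.arange(rows * cols * bands, dtype=np.float32).reshape(rows, cols, bands)

# Fixing a pixel position (row, col) yields its spectral curve: one value per band.
spectrum = cube[2, 3, :]

# Fixing a band index yields a single-band spatial image.
band_image = cube[:, :, 1]

print(cube.shape, spectrum.shape, band_image.shape)
```

Indexing along the last axis is what later steps rely on when they extract single-channel images for chosen bands.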
[0028] Specifically, during batch preprocessing of hyperspectral data from soil profiles, the folder containing the hyperspectral data for each soil profile is traversed, and files with the first and second extensions are read and loaded to generate a hyperspectral data cube for each soil profile. The file with the first extension stores the actual collected hyperspectral data from the soil profile; the file with the second extension stores the metadata of the hyperspectral data, including image size, number of bands, and center wavelength of each band.
[0029] The folder containing the hyperspectral data for each soil profile is traversed, and the center wavelength of each band is extracted from the file with the second extension for each soil profile. The resulting set of center wavelengths for each soil profile is $\Lambda = \{\lambda_1, \lambda_2, \dots, \lambda_N\}$, where $\lambda_i$ denotes the center wavelength of the $i$-th band and $N$ represents the total number of bands in the hyperspectral data of the soil profile, where $N$ is a positive integer.
[0030] For example, the first extension is .cube (Cube Data File), and the second extension is .hdr (Header File). The system iterates through the folders containing the raw data for each pre-collected soil profile, searching for files with the .hdr and .cube extensions. It loads these files to generate a hyperspectral data cube for each soil profile and extracts the center wavelengths of each band from the hyperspectral data of each soil profile from the .hdr file.
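The traversal and metadata extraction described above might be sketched as follows. The patent does not specify the layout of the .hdr file, so this sketch assumes an ENVI-style text header with a `wavelength = {...}` field; the helper names `find_hdr_cube_pairs` and `parse_center_wavelengths` are hypothetical, and reading the binary .cube payload is left out.

```python
import re
from pathlib import Path


def find_hdr_cube_pairs(folder):
    """Return (hdr_path, cube_path) pairs in `folder` that share a file stem."""
    pairs = []
    for hdr_path in sorted(Path(folder).glob("*.hdr")):
        cube_path = hdr_path.with_suffix(".cube")
        if cube_path.exists():
            pairs.append((hdr_path, cube_path))
    return pairs


def parse_center_wavelengths(hdr_text):
    """Extract center wavelengths from a 'wavelength = {...}' field.

    Assumption: the header is ENVI-style text with comma-separated values
    inside braces; a real .hdr file may use a different layout.
    """
    match = re.search(r"wavelength\s*=\s*\{([^}]*)\}", hdr_text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no 'wavelength' field found in header")
    return [float(token) for token in match.group(1).split(",") if token.strip()]


# Hypothetical header content for illustration.
example_hdr = """ENVI
samples = 5
lines = 4
bands = 3
wavelength = {449.8, 551.2,
 650.4}
"""
wavelengths = parse_center_wavelengths(example_hdr)
print(wavelengths)  # [449.8, 551.2, 650.4]
```

Pairing files by stem keeps each profile's data and metadata together when a folder holds many acquisitions.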
[0031] S202. The preprocessing system generates the target object to be processed based on the first hyperspectral data cube and the center wavelengths in the first wavelength set.
[0032] The target object to be processed includes at least one of the following: a pseudo-color image to be cropped, or TIF data to be processed; the pseudo-color image to be cropped is a color image obtained by stitching together the red, green and blue single-channel color images extracted by the preprocessing system based on the first hyperspectral data cube.
[0033] Optionally, the pseudo-color image to be cropped can be used by the preprocessing system to crop the hyperspectral data of the soil profile. If the preprocessing system includes a visualization interface, it can also be used to display and process the data in the visualization interface.
[0034] The TIF data to be processed is TIF format hyperspectral data of the first hyperspectral data cube after radiometric correction and georegistration, and configured with geographic reference information. It can be used for data analysis of the first soil profile in GIS (Geographic Information System).
[0035] Specifically, if the object to be processed is only a pseudo-color image to be cropped, the TIF data to be processed may not be required; if the object to be processed includes TIF data to be processed, cropping the TIF data relies on the key points marked in the pseudo-color image to be cropped.
[0036] S203. Obtain the key points of the marked soil profile area in the target object and extract the coordinates of the key points.
[0037] The key points are the four corner points of the quadrilateral region corresponding to the soil profile.
[0038] Optionally, the key points in the soil profile area can be annotated automatically, by calling a key point annotation model pre-trained from a deep learning model on the target object, or manually, by the user annotating the pseudo-color image displayed in the visualization interface. This embodiment of the invention does not specifically limit the annotation method.
[0039] Optionally, when the system automatically annotates, the preprocessing system can simultaneously display the automatically annotated key points in the pseudo-color image in the visualization interface.
[0040] For example, the key point annotation model is obtained by pre-training the YOLOv8-pose (You Only Look Once version 8, Pose estimation) model.
[0041] When the object to be processed includes a pseudo-color image to be cropped, the preprocessing system uses a pre-trained YOLOv8-pose model to annotate key points of the soil profile region in the pseudo-color image to be cropped during automatic annotation. These automatically annotated key points are then simultaneously displayed on the pseudo-color image to be cropped in the visualization interface. When the user performs manual annotation, the preprocessing system displays the pseudo-color image to be cropped in the visualization interface and extracts the key points manually annotated by the user in the pseudo-color image to be cropped.
[0042] When the object to be processed includes TIF data, the preprocessing system, during automatic annotation, calls the YOLOv8-pose model to automatically annotate key points in the soil profile areas of both the pseudo-color image and the TIF data, and extracts the coordinates of these key points. When the user performs manual annotation, the preprocessing system obtains the coordinates of the key points based on the positions manually annotated in the pseudo-color image to be cropped, as displayed in the visualization interface. The preprocessing system then reuses these manually annotated key point coordinates from the pseudo-color image to be cropped during subsequent cropping to crop the TIF data.
[0043] S204. The preprocessing system sorts the marked key points and generates the target perspective transformation matrix based on the sorted key points.
[0044] Optionally, when automatically annotating key points in the soil profile area, if the number of key points automatically identified by the key point annotation model is equal to four, sorting begins; if the number of key points automatically identified by the key point annotation model is less than four, manual annotation is prompted, and sorting is performed based on the manually annotated key points.
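The count check and sorting step can be sketched as below. The patent does not state the sorting order, so this sketch assumes image coordinates (x to the right, y downward) and an output order of top-left, top-right, bottom-right, bottom-left; the function names are hypothetical. The sum/difference heuristic used here works for moderately tilted quadrilaterals and is only one possible choice.

```python
import numpy as np


def sort_corner_points(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(points, dtype=np.float64)
    if pts.shape != (4, 2):
        raise ValueError("exactly four (x, y) key points are required")
    sums = pts[:, 0] + pts[:, 1]    # smallest at top-left, largest at bottom-right
    diffs = pts[:, 1] - pts[:, 0]   # smallest at top-right, largest at bottom-left
    order = [np.argmin(sums), np.argmin(diffs), np.argmax(sums), np.argmax(diffs)]
    return pts[order]


def needs_manual_annotation(detected_points):
    """Fewer than four detected key points means the user must annotate manually."""
    return len(detected_points) < 4


# Hypothetical detections in arbitrary order.
detected = [(90, 12), (8, 95), (10, 10), (92, 98)]
if needs_manual_annotation(detected):
    print("fewer than four key points: prompt for manual annotation")
else:
    sorted_points = sort_corner_points(detected)
    print(sorted_points.tolist())
```

A fixed corner order matters because the perspective transformation pairs each source corner with a specific corner of the output rectangle.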
[0045] S205. The preprocessing system, based on the target perspective transformation matrix and the target size of the target object, crops the soil profile area in the target object and outputs the cropped target object if the cropping effect meets the preset conditions.
[0046] Wherein, if the target object is the original pseudo-color image, the cropped target object includes: the target pseudo-color image; the target size is the output size of the target pseudo-color image;
[0047] If the target object is the original pseudo-color image and the TIF data to be processed, then the cropped target object includes the target pseudo-color image and the corresponding target TIF data, and the target size is the output size of the target pseudo-color image and the output size of the target TIF data.
[0048] It should be noted that when the target objects include pseudo-color images and TIF data of the same soil profile, the output size of the pseudo-color image and the output size of the TIF data can be the same or different. When the output sizes are the same, the perspective transformation matrix corresponding to the pseudo-color image can be directly reused when performing perspective transformation on the TIF data. When the output sizes are different, the perspective transformation matrix of the pseudo-color image and the perspective transformation matrix of the TIF data are different, and each uses its corresponding perspective transformation matrix.
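To illustrate why the matrix can be reused only when the output sizes match, the sketch below solves the 3×3 perspective matrix from four point pairs with plain NumPy. This is a generic four-point homography solve, not necessarily the solver used in the patent; the key points, output sizes, and function names are hypothetical.

```python
import numpy as np


def perspective_matrix(src, dst):
    """Solve the 3x3 matrix H that maps four src (x, y) points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64),
                        np.array(rhs, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)  # fix the bottom-right entry to 1


def target_corners(width, height):
    """Corners of the output rectangle in TL, TR, BR, BL order."""
    return [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]


def apply_matrix(H, point):
    """Map one (x, y) point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return x / w, y / w


# Hypothetical key points, already sorted TL, TR, BR, BL.
key_points = [(10, 10), (90, 12), (92, 98), (8, 95)]
H_image = perspective_matrix(key_points, target_corners(64, 64))    # e.g. pseudo-color output
H_tif = perspective_matrix(key_points, target_corners(128, 256))    # e.g. TIF output

print(apply_matrix(H_image, key_points[2]))
print(apply_matrix(H_tif, key_points[2]))
```

The same four key points produce different matrices for different target sizes, which matches the statement that each output uses its own matrix when the sizes differ.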
[0049] This invention provides a preprocessing method for hyperspectral images of soil profiles. The preprocessing system first generates a first hyperspectral data cube based on the acquired hyperspectral data of a first soil profile, and extracts the center wavelengths of each band corresponding to the first soil profile to obtain a first wavelength set. Then, based on the first hyperspectral data cube and the center wavelengths in the first wavelength set, a target object to be processed is generated. Specifically, a pseudo-color image and TIF data are automatically generated for subsequent processing. Since the pseudo-color image is a true-color approximation image obtained by stitching together the red, green, and blue single-channel color images extracted from the first hyperspectral data cube, key points of the marked soil profile area in the target object can be obtained based on the texture in the pseudo-color image. The coordinates of the key points are extracted. Then, the preprocessing system sorts the marked key points and generates a target perspective transformation matrix based on the sorted key points. Finally, based on the target perspective transformation matrix and the target size of the target object, the preprocessing system crops the soil profile area in the target object. If the cropping effect meets preset conditions, the cropped target object is output. Because the system can automatically mark and sort key points in the generated pseudo-color image, it can quickly generate a perspective transformation matrix that uniformly corrects the tilt and perspective distortion caused by the shooting angle, simplifying the traditional cropping steps. It can achieve high-precision automatic cropping of soil profile areas without switching between multiple platforms, thus improving the cropping efficiency of hyperspectral data of soil profiles.
[0050] Optionally, in the preprocessing method for hyperspectral images of soil profiles provided in the embodiments of the present invention, S202 may include S202a1 to S202a3 as described below.
[0051] S202a1. The preprocessing system determines the index of the target color band based on the first wavelength set and the preset color wavelengths.
[0052] The preset color wavelengths include a preset red light wavelength, a preset green light wavelength, and a preset blue light wavelength.
[0053] For example, the preset blue light wavelength is 450 nm, the preset green light wavelength is 550 nm, and the preset red light wavelength is 650 nm.
[0054] In this embodiment of the invention, the target color band is the band containing the center wavelength that is closest to the preset color wavelength in the first wavelength set.
[0055] For example, the preprocessing system can determine the index of the band in the first wavelength set that is closest to the preset color wavelength based on formula (1).
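[0056] $$i^{*} = \underset{i \in \{1, \dots, N\}}{\arg\min} \, \left| \lambda_i - \lambda_{\text{target}} \right| \qquad \text{Formula (1)}$$
[0057] where $i^{*}$ denotes the index of the target color band, $\lambda_i$ denotes the center wavelength of band $i$ in the first wavelength set, and $\lambda_{\text{target}}$ denotes the preset color wavelength, which can be any one of the red, green, and blue wavelengths. argmin (Argument of the Minimum) returns the index at which the minimum value is attained.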
[0058] It is understandable that, through the above method, the preprocessing system can automatically extract the indices of the blue, green, and red bands and select a true-color approximate band combination without human interaction.
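A minimal sketch of the nearest-band selection in formula (1), using the preset wavelengths of 450 nm, 550 nm, and 650 nm from the example above. The list of center wavelengths is hypothetical, and the returned indices are zero-based as is usual in NumPy.

```python
import numpy as np


def nearest_band_index(center_wavelengths, target_nm):
    """Index of the band whose center wavelength is closest to target_nm."""
    wavelengths = np.asarray(center_wavelengths, dtype=np.float64)
    return int(np.argmin(np.abs(wavelengths - target_nm)))


# Hypothetical center wavelengths (nm) read from a header file.
center_wavelengths = [400.0, 452.5, 500.0, 548.0, 600.0, 651.0, 700.0]
preset = {"red": 650.0, "green": 550.0, "blue": 450.0}

band_index = {name: nearest_band_index(center_wavelengths, nm)
              for name, nm in preset.items()}
print(band_index)  # {'red': 5, 'green': 3, 'blue': 1}
```

If two bands are equally close, `np.argmin` returns the first one; the patent does not say how ties are resolved.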
[0059] S202a2. The preprocessing system extracts three single-channel color images of red, green and blue light corresponding to the index of the target color band from the first hyperspectral data cube.
[0060] Based on the extracted red, green, and blue light band indices, the indices of the dimensions of the bands are fixed in the first hyperspectral data cube, thereby extracting the corresponding red, green, and blue single-channel images from the first hyperspectral data cube.
[0061] S202a3. The preprocessing system performs contrast stretching and normalization on the three extracted single-channel color images, and stitches them together in the order of R, G, B to obtain the original three-channel pseudo-color image.
[0062] For example, based on formula (2), the contrast of each single-channel color image is stretched and the pixel values are normalized to the range of 0 to 255.
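[0063] $$P_{\text{norm}} = \frac{P - P_{0.5}}{P_{99.5} - P_{0.5}} \times 255 \qquad \text{Formula (2)}$$
[0064] where $P_{\text{norm}}$ represents the normalized pixel value, $P$ represents the original pixel value, $P_{0.5}$ represents the 0.5th percentile of the original pixel values, and $P_{99.5}$ represents the 99.5th percentile of the original pixel values.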
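The percentile stretch of formula (2) might be sketched as follows. Two details are assumptions not stated in the text: values outside the 0.5 to 99.5 percentile range are clipped so the result stays within 0 to 255, and a constant image is mapped to zeros to avoid division by zero. The function name and the synthetic band are hypothetical.

```python
import numpy as np


def stretch_to_uint8(band, low_pct=0.5, high_pct=99.5):
    """Percentile contrast stretch of one single-channel image to the 0-255 range."""
    band = np.asarray(band, dtype=np.float64)
    low, high = np.percentile(band, [low_pct, high_pct])
    if high <= low:
        # Assumption: a constant image carries no contrast, so return zeros.
        return np.zeros(band.shape, dtype=np.uint8)
    scaled = (band - low) / (high - low) * 255.0
    # Assumption: clip the tails that fall outside the percentile range.
    return np.clip(scaled, 0, 255).astype(np.uint8)


# Synthetic band with a narrow histogram, standing in for raw DN values.
rng = np.random.default_rng(0)
raw_band = rng.normal(loc=1000.0, scale=20.0, size=(50, 60))
stretched = stretch_to_uint8(raw_band)
print(stretched.dtype, stretched.min(), stretched.max())
```

Using percentiles instead of the absolute minimum and maximum keeps a few extreme pixels from compressing the rest of the histogram.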
[0065] Typically, due to factors such as illumination, exposure, and sensor gain, the histogram of hyperspectral data is concentrated in a very narrow range. Without stretching, the stitched color image appears grayish or blackish and hazy overall, with very low contrast, indistinct brightness differences, and more pronounced overexposure and underexposure. The contrast stretching described above prevents an insufficient or overly concentrated distribution of DN values in the acquired hyperspectral data from degrading the display quality of the image stitched from the three extracted single-channel color images. Without it, for example, contrast at edges and corners is reduced, which makes subsequent keypoint detection more difficult.
[0066] It should be noted that, in this embodiment of the invention, the pseudo-color image is not only used for display but also for keypoint detection in the YOLOv8-pose model. Through stretching, the problem of low contrast at edges and corners, and difficulty in keypoint detection, can be avoided, improving the cross-scene generalization ability of training data and preventing the model from mistaking brightness changes for shape changes.
[0067] After stretching and normalizing the three extracted single-channel images using the above formula (2), the three normalized single-channel images are stitched together in RGB order to obtain a pseudo-color image with the shape (rows, cols, 3).
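The stitching step can be sketched with NumPy as below. The three arrays are constant-valued placeholders standing in for the stretched and normalized red, green, and blue single-channel images.

```python
import numpy as np

rows, cols = 4, 5
# Placeholder channels standing in for the normalized single-channel images.
red = np.full((rows, cols), 200, dtype=np.uint8)
green = np.full((rows, cols), 120, dtype=np.uint8)
blue = np.full((rows, cols), 40, dtype=np.uint8)

# Stack along a new last axis in R, G, B order.
pseudo_color = np.stack([red, green, blue], axis=-1)
print(pseudo_color.shape)  # (4, 5, 3)
```

Note that some image libraries expect B, G, R channel order, so the order may need to be reversed before handing the array to such a library.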
[0068] Based on this scheme, by reading hyperspectral data and selecting bands, three single-channel images (R, G, and B) can be extracted from the original hyperspectral data cube. Then, the three single-channel images are subjected to contrast stretching and normalization. Finally, the three single-channel images are stitched together in RGB order, thereby converting hyperspectral data that cannot be displayed on the display interface into a pseudo-color image with a visualization effect that approximates true color. This makes it convenient for users to process the pseudo-color image that can be displayed on the display interface, such as for manual observation and annotation.
[0069] Optionally, in the preprocessing method for hyperspectral images of soil profiles provided in the embodiments of the present invention, the above-mentioned S202 may further include the following S202a4.
[0070] S202a4. The preprocessing system corrects the display orientation of the original pseudo-color image to the orientation expected by the key point recognition model and saves it in a file format that the key point recognition model can read, obtaining the pseudo-color image to be cropped.
[0071] The key point recognition model is obtained by pre-training a deep learning model.
[0072] Specifically, the original pseudo-color image is flipped and rotated to correct the image display orientation, and then saved in a file format readable by the key point recognition model to obtain the pseudo-color image to be cropped.
[0073] For example, if a keypoint recognition model trained on YOLOv8-pose is used, the original pseudo-color image is vertically mirrored and rotated 90° counterclockwise, and saved as a JPEG (Joint Photographic Experts Group) format file. This ensures that the pseudo-color image to be cropped is consistent with the image orientation and file format required by the keypoint recognition model.
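The orientation correction in this example might look like the sketch below. It assumes that "vertically mirrored" means flipping the image top to bottom (`np.flipud`); if the intended mirror is left to right, `np.fliplr` would be used instead. Writing the JPEG file is omitted, and the function name is hypothetical.

```python
import numpy as np


def correct_orientation(image):
    """Flip top-to-bottom, then rotate 90 degrees counterclockwise."""
    mirrored = np.flipud(image)       # assumed meaning of "vertically mirrored"
    return np.rot90(mirrored, k=1)    # k=1 is a 90-degree counterclockwise turn


# Small test image with one marked pixel so the mapping is visible.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[0, 0] = (255, 0, 0)

corrected = correct_orientation(image)
print(corrected.shape)  # (6, 4, 3)
```

The rotation swaps the row and column counts, so H and W of the stored image refer to the corrected orientation rather than the original cube layout.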
[0074] The pseudo-color image to be cropped is stored in H rows × W columns, where H and W are both positive integers.
[0075] Based on this scheme, the preprocessing system can also correct the orientation of the pseudo-color image to the image orientation expected by the key point recognition model, which provides standardized data input for subsequent automatic cropping by the preprocessing system.
[0076] For example, Figure 3 is a schematic diagram of the generation logic of a pseudo-color image provided in an embodiment of the present invention, including the following steps S301 to S305.
[0077] S301. Specify the folder for the hyperspectral data.
[0078] S302. Traverse all files in the specified folder, find files with the extensions .hdr and .cube, load and generate a hyperspectral data cube, and extract the center wavelength of each band based on the .hdr file.
[0079] S303. Obtain the set target wavelengths of red, green and blue light, and select the band index closest to the target wavelength based on the target wavelength.
[0080] S304. Fix the band index in the hyperspectral data cube, extract the three single-channel color images of red, green and blue, and perform contrast stretching and normalization on the three single-channel color images.
[0081] S305. The three single-channel color images are stitched together in RGB order to obtain a three-channel pseudo-color image. The pseudo-color image is then vertically mirrored and rotated 90 degrees counterclockwise, and stored as a pseudo-color JPEG file.
[0082] Understandably, the preprocessing system can generate pseudo-color images of each soil profile for cropping based on the large amount of hyperspectral data collected from the soil profiles, according to requirements.
[0083] Optionally, if the target object to be processed includes TIF data of soil profiles for analysis in GIS, then in the preprocessing method for hyperspectral images of soil profiles provided in this embodiment of the invention, the above-mentioned S202 may further include the radiometric correction process of S202b1 to S202b4 and the georeferenced information setting process of S202b5. Specifically, S202b2 is executed only the first time the method is executed. After the target interpolation function has been constructed, S202b2 is not executed again when processing hyperspectral data of other soil profiles; the target interpolation function is directly called instead.
[0084] S202b1. The preprocessing system obtains a reference DN value array based on the DN (Digital Number) file of the reference board, and obtains a first reference reflectance value array based on the reflectance file of the reference board.
[0085] Specifically, the preprocessing system reads the reference board DN value file, removes the first two lines of descriptive information, splits each remaining line on the space character, and parses the second value of each line as the reference DN value of one band, obtaining the reference DN values corresponding to N bands. Each reference DN value is greater than 0. The obtained reference DN values are stored in the reference DN value array `reference_dn` $= [DN_{ref}(1), \dots, DN_{ref}(N)]$, where $DN_{ref}(b)$ denotes the reference DN value corresponding to band $b$, $b = 1, \dots, N$, and $N$ denotes the number of bands in the hyperspectral data cube.
[0086] It should be noted that N comes from the .hdr metadata and is the number of bands in the hyperspectral data cube; each band has a center wavelength. The reference board DN value file is derived from the reference board area of the same acquisition by band, so it naturally has one DN for each band and should have N DN values.
[0087] Specifically, the preprocessing system reads the reference reflectance file, skips the first two lines of the reflectance file, and then reads each of the remaining lines in sequence. It parses the first column of each line into a wavelength value and stores the parsed wavelength value in the first wavelength array; it parses the second column of each line into a reflectance value and stores the parsed reflectance value in the first reflectance array.
[0088] It should be noted that the number of sampling points in the reflectance file generally differs from the number of center wavelengths in the hyperspectral data cube. Therefore, the wavelengths of the reference reflectance parsed from the reference board file usually differ in number from the center wavelengths in the hyperspectral data cube and cannot be aligned directly. Reflectance reference data generally comes from the reference board manufacturer or from laboratory measurements, usually as a set of discrete points of the form reflectance = f(wavelength), such as one point per 1 nm. The number of wavelength points may be much greater than N (for example, one point per nm in the 350-2500 nm range), or it may be less than N (for example, when only a few key wavelength points are given or the measurement resolution is low). Even with dense sampling, the points rarely coincide exactly with the center wavelength of each band in the hyperspectral data cube.
[0089] Therefore, after S202b1 above, the following interpolation processes S202b2 to S202b5 can be performed for alignment to obtain a reference reflectance with the same number of center wavelengths as the first hyperspectral data cube.
[0090] S202b2. Based on the first wavelength value array, the first reflectivity array, and the first wavelength set, the preprocessing system uses linear interpolation to construct the target interpolation function according to formula (3).
[0091] Formula (3): $f(\lambda) = \mathrm{LinearInterp}(\lambda;\ \text{ref\_wl},\ \text{ref\_refl})$
[0092] where $f$ represents the target interpolation function, ref_wl represents the first wavelength array, ref_refl represents the first reflectivity array, and $\lambda$ represents a center wavelength in the first wavelength set.
[0093] It should be noted that the target interpolation function constructed in this invention depends only on the wavelength and reflectance values parsed from the reference board. Therefore, once the target interpolation function is constructed, it can be reused for any subsequent hyperspectral data cube. The preprocessing system can then use the target interpolation function to interpolate the wavelength and reflectance values parsed from the reference board at the center wavelength of each hyperspectral band of each soil profile, obtaining the reference reflectance corresponding to the center wavelength of each band in each hyperspectral data cube.
[0094] S202b3. The preprocessing system calls the pre-built target interpolation function and interpolates the first wavelength value array and the first reflectivity value array to obtain the second reference reflectivity array corresponding to the first wavelength set.
[0095] It is understandable that the length of the second reference reflectance array is the same as the number of bands in the first hyperspectral data cube.
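The construction and reuse of the target interpolation function (S202b2 and S202b3) can be sketched as follows. This is a minimal illustration built on `numpy.interp`; the function name `build_target_interpolator` is hypothetical, and the source does not state which interpolation routine is actually used. Note that `numpy.interp` holds the end values constant for wavelengths outside the range of the reference data.

```python
import numpy as np

def build_target_interpolator(ref_wl, ref_refl):
    """Build a reusable linear interpolator from reference board data.

    ref_wl:   wavelengths parsed from the reflectance file (first wavelength array)
    ref_refl: reflectance values parsed from the same file (first reflectivity array)
    """
    ref_wl = np.asarray(ref_wl, dtype=np.float64)
    ref_refl = np.asarray(ref_refl, dtype=np.float64)
    order = np.argsort(ref_wl)          # np.interp requires increasing x values
    ref_wl, ref_refl = ref_wl[order], ref_refl[order]

    def target_interpolation(center_wavelengths):
        # Returns one reference reflectance per center wavelength (length N).
        wl = np.asarray(center_wavelengths, dtype=np.float64)
        return np.interp(wl, ref_wl, ref_refl)

    return target_interpolation
```

The returned closure depends only on the reference board data, so it can be built once and then called for the center wavelengths of every subsequent hyperspectral data cube.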
[0096] S202b4. The preprocessing system converts the DN value of each band of each spatial pixel in the first hyperspectral data cube into a reflectance value based on the DN value of each band, the reference DN value array, and the second reference reflectance array of each band in the first hyperspectral data cube, thus obtaining the first reflectance cube.
[0097] For example, the DN value of each band of each spatial pixel in the first hyperspectral data cube can be converted into a reflectance value based on formula (4).
[0098] Formula (4): $R(i, j, b) = \dfrac{DN(i, j, b)}{DN_{ref}(b)} \cdot R_{ref}(\lambda_b)$
[0099] where $R(i, j, b)$ represents the reflectivity value of the first reflectivity cube at spatial pixel $(i, j)$ and band $b$; $i$ represents the first dimension of a spatial pixel and $j$ represents the second dimension of a spatial pixel; $DN(i, j, b)$ represents the DN value of the first hyperspectral data cube at spatial pixel $(i, j)$ and band $b$; $DN_{ref}(b)$ indicates the DN value of band $b$ in the reference DN value array; $R_{ref}(\lambda_b)$ indicates the reference reflectivity of band $b$ in the second reference reflectivity array, $b = 1, \dots, N$; and $\lambda_b$ indicates the center wavelength of band $b$.
[0100] The first reflectance cube is the spectral data cube after radiometric correction of the first hyperspectral data cube. The hyperspectral information of each spatial pixel of the first reflectance cube is the reflectance value. The spatial shape of the first reflectance cube is the same as that of the first hyperspectral data cube.
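The per-pixel, per-band conversion of S202b4 maps naturally onto array broadcasting. The sketch below assumes the cube is stored as (rows, columns, bands) and that both reference arrays have length N; the function name is hypothetical.

```python
import numpy as np

def dn_to_reflectance(cube_dn, reference_dn, ref_reflectance):
    """Convert a DN cube (rows, columns, bands) to reflectance.

    Each band b is scaled by ref_reflectance[b] / reference_dn[b];
    the spatial shape of the cube is unchanged.
    """
    cube_dn = np.asarray(cube_dn, dtype=np.float64)
    reference_dn = np.asarray(reference_dn, dtype=np.float64)
    ref_reflectance = np.asarray(ref_reflectance, dtype=np.float64)
    if np.any(reference_dn <= 0):
        # The text requires every reference DN value to be greater than 0.
        raise ValueError("reference DN values must be greater than 0")
    # Broadcasting applies the per-band factors along the last axis.
    return (cube_dn / reference_dn * ref_reflectance).astype(np.float32)
```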
[0101] Based on this scheme, the preprocessing system can construct a target interpolation function based on the DN file and reflectance file of the reference board, and perform radiometric correction on the DN values of the hyperspectral data cube of the soil profile to obtain the reflectance cube of the soil profile.
[0102] It is understandable that after S202b4, if it is determined that the orientation of the first reflectivity cube is inconsistent with the orientation of the geographic coordinate system, the preprocessing system performs geometric correction on the first reflectivity cube. Specifically, the preprocessing system transforms the first reflectivity cube from (row, column, band) to (column, row, band); then the preprocessing system performs vertical and horizontal flips to adjust the orientation and obtain a reflectivity cube with the shape of (column', row', band).
[0103] S202b5. After the preprocessing system performs geometric correction on the first reflectivity cube according to the geographic coordinate system direction, it performs BSQ (Band Sequential) format conversion based on the writing requirements of TIF data, sets geographic reference information according to the preset projection model, and generates TIF data to be processed.
[0104] Typically, when writing GeoTIFF (Geographic Reference TIFF format) multi-band data, band-first is commonly used. If the format of the geometrically corrected first reflectance cube does not match the TIF data writing format requirements, the preprocessing system will convert the geometrically corrected first reflectance cube from (column', row', band) to (band, row', column'), i.e., BSQ format.
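The axis manipulations described above (transpose, vertical and horizontal flips, then conversion to band-first BSQ layout) can be sketched with NumPy. Treating axis 0 as the vertical direction and axis 1 as the horizontal direction of the transposed cube is an assumption of this sketch, and the function name is hypothetical.

```python
import numpy as np

def correct_and_to_bsq(reflectance):
    """reflectance: (row, column, band) -> BSQ array (band, row', column')."""
    # (row, column, band) -> (column, row, band)
    corrected = np.transpose(reflectance, (1, 0, 2))
    # Vertical flip, then horizontal flip, to match the geographic orientation.
    corrected = np.flipud(corrected)
    corrected = np.fliplr(corrected)
    # (column', row', band) -> (band, row', column'), i.e. band-first BSQ layout.
    bsq = np.transpose(corrected, (2, 1, 0))
    # Make the memory layout contiguous so each band is one contiguous 2-D raster.
    return np.ascontiguousarray(bsq)
```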
[0105] In this embodiment of the invention, after converting the format to BSQ format, the preprocessing system generates a first CRS (Coordinate Reference System) as a spatial reference based on preset projection information. That is, the preprocessing system sets geographic reference information for the first reflectance cube of the soil profile, and finally generates the TIF data of the soil profile to be processed.
[0106] The first CRS matches ESRI (Environmental Systems Research Institute):102025. ESRI:102025 corresponds to the Asia North Albers Equal Area Conic projection. The WKT (Well-Known Text) description of ESRI:102025 is shown in Table 1. The geographic coordinate system of ESRI:102025 is based on WGS84 (World Geodetic System 1984).
[0107] Table 1 - WKT Description
[0108] Next, the TIF data to be processed from the hyperspectral data is created and output. The parameters of the TIF data to be processed are set as shown in Table 2.
[0109] Table 2 - Parameter Settings for TIF Data to be Processed
[0110] It should be noted that when generating the TIF data to be processed, the affine transformation matrix of the TIF data parameter is initially set to the identity matrix. The affine identity matrix is generated by calling the Affine.identity() function. After subsequent cropping, the affine transformation matrix can be further adjusted according to the pixel size and the actual geographical range. The photometric interpretation MINISBLACK in Table 2 (Minimum-is-Black) is applicable to grayscale images.
[0111] It should be noted that the generated TIF data to be cropped is multi-band TIF data, with each band corresponding to a two-dimensional raster matrix that stores the reflectance value of each pixel in that band. Therefore, in this embodiment of the invention, the band data of a band denotes the two-dimensional reflectance raster matrix corresponding to that band in the geometrically corrected reflectance cube. The two-dimensional reflectance array of each band is written into the corresponding band in GeoTIFF, i.e., into the band data of the hyperspectral data of the soil profile, and the statistical parameters of the effective spatial pixels are calculated. The statistical parameters include the minimum, maximum, mean, and standard deviation. When calculating the statistical parameters, the statistical sampling factor in the X direction (STATISTICS_SKIPFACTORX) and the statistical sampling factor in the Y direction (STATISTICS_SKIPFACTORY) are both set to 1, so that every pixel is sampled, which facilitates subsequent quality control and data analysis of the spectral data of the soil profile.
[0112] The calculated statistical parameters are written into the metadata corresponding to each band of the TIF data to be processed. Table 3 shows the statistical parameters of the effective spatial pixels for each band of the hyperspectral data.
[0113] Table 3 - Statistical parameters of effective spatial pixels
[0114] It should be noted that writing metadata typically includes writing information describing how the above two-dimensional raster matrix is interpreted, located, and statistically analyzed into tags, including CRS, transform, statistics (min / max / mean / std), STATISTICS_SKIPFACTORX / Y, and NoData.
[0115] Based on this scheme, the preprocessing system automatically converts the DN values of hyperspectral data into reflectance values to complete radiometric correction. By combining the target interpolation function with the reference board data, a reference reflectance that corresponds accurately to each center wavelength is obtained. The orientation of the reflectance cube is adjusted through transpose and flip operations, and the data is then converted to BSQ format. A geographic reference (CRS) is set automatically, ultimately generating high-precision TIF data suitable for GIS analysis. Compared with the traditional manual process of generating TIF data, which requires step-by-step processing and switching between software such as SRAnalysis, ENVI, QGIS, and ArcGIS, this significantly reduces the preprocessing steps and time for hyperspectral data and greatly improves the efficiency of hyperspectral data preprocessing.
[0116] For example, Figure 4 is a schematic diagram illustrating the generation logic of a TIF file according to an embodiment of the present invention. As shown in Figure 4, the logic includes S401 to S407 described below.
[0117] S401. Specify the folder for the hyperspectral data.
[0118] S402. Traverse all files in the specified folder, find files with the extension .hdr and .cube, load and generate a hyperspectral data cube, and extract the center wavelength of each band based on the .hdr file.
[0119] S405. For each center wavelength, call the target interpolation function to calculate the reference reflectance of the corresponding band, then convert the DN values to reflectance values based on the reference DN value array to obtain the reflectance data cube.
[0120] S406. Convert the reflectance data cube from (rows, columns, bands) to (columns, rows, bands), first perform a vertical flip, then perform a horizontal flip.
[0121] S407. Set projection information, write band data and calculate statistical information, and write the statistical information into the metadata of the corresponding band.
[0122] It is understandable that S403 and S404 may also be included before S405.
[0123] S403. Read the reflectivity file of the reference board, parse it to obtain the wavelength array and reflectivity array, and construct the target interpolation function.
[0124] S404. Read the DN value file of the reference board and parse it to obtain the reference DN value array.
[0125] It is understandable that when this method is executed for the first time, S403 and S404 can be executed to store the obtained target interpolation function and reference DN value array in the preprocessing system, which can be directly called later.
[0126] It should be noted that the preprocessing system can generate TIF data for each soil profile for cropping in batches from the large amount of hyperspectral data collected from soil profiles, according to requirements.
[0127] Optionally, in the preprocessing method for hyperspectral images of soil profiles provided in the embodiments of the present invention, the above-mentioned S204 may specifically include S204a1 to S204a3 as described below.
[0128] S204a1, The preprocessing system extracts four key points based on the marked positions in the pseudo-color image to be cropped, and obtains all permutations of the four key points located at the four corner positions of the quadrilateral.
[0129] The order of the four key points is arbitrary.
[0130] For example, the explanation will be based on the following example: the four corners of a quadrilateral are arranged in the following order: top left corner (denoted as TL), top right corner (denoted as TR), bottom right corner (denoted as BR), and bottom left corner (denoted as BL).
[0131] S204a2. The preprocessing system calculates the geometric consistency index for each permutation, and determines the comprehensive score for each permutation based on the geometric consistency index and the preset comprehensive scoring function.
[0132] Among them, the geometric consistency indexes include: side length consistency error, rectangle orthogonality error, and direction prior error; the side length consistency error includes: width consistency error and height consistency error.
[0133] S204a3. The preprocessing system uses the order of the key points with the smallest comprehensive score as the order of the four corners of the quadrilateral.
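[0134] Suppose the four extracted keypoints are P1, P2, P3, and P4. Each possible arrangement of these four keypoints at the four corner positions is denoted as a permutation $\pi_k = (\pi_{k1}, \pi_{k2}, \pi_{k3}, \pi_{k4})$. For example, permutation 1 is denoted as $\pi_1 = (\pi_{11}, \pi_{12}, \pi_{13}, \pi_{14})$. Then define TL = $\pi_{11}$, TR = $\pi_{12}$, BR = $\pi_{13}$, BL = $\pi_{14}$.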
[0135] Based on this scheme, the preprocessing system can extract key points of the soil profile area based on the marked positions in the pseudo-color image, and accurately select the arrangement order of the four corners of the quadrilateral from the marked positions through geometric consistency index and comprehensive scoring function.
[0136] Optionally, in the preprocessing method for hyperspectral images of soil profiles provided in the embodiments of the present invention, S204a2 may specifically include A21 to A25 below.
[0137] A21. The preprocessing system calculates the top and bottom side lengths of the quadrilateral formed by the order of the four key points in the arrangement, and calculates the width consistency error based on the top and bottom side lengths.
[0138] The top side length is the absolute value of the difference between the x-coordinates of the top right and top left corners; the bottom side length is the absolute value of the difference between the x-coordinates of the bottom right and bottom left corners.
[0139] For example, the width consistency is calculated based on the top side length, the bottom side length and formula (5).
[0140] Formula (5)
[0141] where $w_{top}$ indicates the length of the top side, $w_{top} = |x_{TR} - x_{TL}|$, and $w_{bottom}$ indicates the length of the bottom side, $w_{bottom} = |x_{BR} - x_{BL}|$.
[0142] A22. The preprocessing system calculates the left and right heights of the quadrilateral formed by the order of the four key points in the arrangement, and calculates the height consistency error based on the left and right heights.
[0143] Wherein, the left height is the absolute value of the difference between the ordinates of the bottom left corner and the top left corner, and the right height is the absolute value of the difference between the ordinates of the bottom right corner and the top right corner.
[0144] For example, the height consistency is calculated based on the left height, the right height and formula (6).
[0145] Formula (6)
[0146] where $h_{left}$ indicates the left height, $h_{left} = |y_{BL} - y_{TL}|$, and $h_{right}$ indicates the right height, $h_{right} = |y_{BR} - y_{TR}|$.
[0147] Specifically, the included angle of the quadrilateral can be calculated, and based on the orthogonality of a rectangle, the error between the included angle and a right angle can be determined. Therefore, the above S204a2 can specifically include the following A23.
[0148] A23. The preprocessing system calculates the top and left side vectors of the quadrilateral formed by the order of the four key points in the arrangement, and calculates the included angle deviation value based on the top and left side vectors of the quadrilateral.
[0149] The top side vector is the difference between the coordinates of the top right corner and the top left corner, while the left side vector is the difference between the coordinates of the bottom left corner and the top left corner.
[0150] For example, the included angle deviation is calculated based on the top side vector, the left side vector of the quadrilateral and formula (7).
[0151] Formula (7)
[0152] where $\vec{v}_{top}$ represents the top side vector, $\vec{v}_{top} = TR - TL$, and $\vec{v}_{left}$ represents the left side vector, $\vec{v}_{left} = BL - TL$.
[0153] It should be noted that the smaller the deviation of the included angle, the closer the included angle is to a right angle.
[0154] With the image x-axis horizontal and the y-axis vertical, the direction prior is that the top and bottom sides of the target quadrilateral are parallel to the horizontal direction and the left and right sides are parallel to the vertical direction. This prior knowledge is used to determine the direction prior error. Furthermore, the aforementioned S204a2 can include the following A24.
[0155] A24. The preprocessing system calculates the horizontal and vertical deviations of the quadrilateral formed by the arrangement of four key points based on the order of the four key points. Based on the horizontal and vertical deviations, the comprehensive directional error is determined.
[0156] Among them, the horizontal deviation represents the deviation of the angle between the top side vector and the x-axis, and the vertical deviation represents the deviation of the angle between the left side vector and the x-axis.
[0157] For example, the comprehensive direction error is calculated based on the horizontal direction deviation, the vertical direction deviation and formula (8).
[0158] Formula (8)
[0159] where $E_{dir}$ indicates the comprehensive direction error, $\theta_h$ indicates the deviation in the horizontal direction, and $\theta_v$ indicates the deviation in the vertical direction.
[0160] A25. The preprocessing system calculates the comprehensive score for each permutation based on the width consistency error, height consistency error, included angle deviation, and comprehensive direction error.
[0161] For example, the overall score for each permutation is calculated based on formula (9).
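[0162] Formula (9): $S(\pi) = \lambda_1 E_w + \lambda_2 E_h + \lambda_3 E_\theta + \lambda_4 E_{dir}$
[0163] where $S(\pi)$ is the comprehensive score of permutation $\pi$; $E_w$ is the width consistency error; $E_h$ is the height consistency error; $E_\theta$ is the included angle deviation; $E_{dir}$ is the comprehensive direction error; and $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$ are the weighting coefficients.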
[0164] By way of example, the present invention sets the four weighting coefficients to preset values, which are used to balance the influence of side length consistency, rectangular orthogonality, and directional prior on the rationality of sorting.
[0165] Specifically, among the 24 candidate permutations, the permutation with the smallest comprehensive score $S(\pi)$ is taken as the corner order of the final target quadrilateral, denoted as $\pi^* = \arg\min_{\pi} S(\pi)$. The sorted keypoint array is $[TL^*, TR^*, BR^*, BL^*]$.
[0166] Based on the sorting method provided by this scheme, even when the keypoint input order is arbitrary and the prediction contains noise, the scheme integrates the influence of side length consistency, rectangular orthogonality, and direction prior on the rationality of the sorting. It selects the arrangement with the smallest comprehensive score as the order of the four corner points of the quadrilateral, avoiding problems such as perspective transformation flipping, misalignment, or cropping area drift caused by incorrect corner point order. This improves the accuracy of combining keypoints into quadrilaterals close to rectangles in sequence. In other words, by using a four-point robust sorting method and perspective transformation algorithm, the tilt and perspective distortion caused by the shooting angle are uniformly corrected into standard rectangular areas. This enables high-precision automatic cropping of soil profile areas and ensures a one-to-one correspondence between the cropped areas in the pseudo-color image and TIF data.
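The sorting procedure of A21 to A25 can be sketched in plain Python. The concrete error expressions below (normalized differences for the side lengths, the absolute cosine for the included angle deviation, and one minus the cosine to the +x and +y image axes for the direction prior) and the equal weights are assumptions made for illustration only, since the source formulas (5) to (9) and the weight values are not reproduced in the text. Image coordinates with y increasing downwards are also assumed.

```python
from itertools import permutations
import math

def permutation_score(tl, tr, br, bl, weights=(1.0, 1.0, 1.0, 1.0)):
    """Comprehensive score of one candidate corner assignment (lower is better)."""
    eps = 1e-9
    # A21 / A22: side lengths from coordinate differences, as defined in the text.
    w_top, w_bottom = abs(tr[0] - tl[0]), abs(br[0] - bl[0])
    h_left, h_right = abs(bl[1] - tl[1]), abs(br[1] - tr[1])
    e_w = abs(w_top - w_bottom) / (max(w_top, w_bottom) + eps)   # assumed form
    e_h = abs(h_left - h_right) / (max(h_left, h_right) + eps)   # assumed form
    # A23: top and left side vectors and their deviation from a right angle.
    v_top = (tr[0] - tl[0], tr[1] - tl[1])
    v_left = (bl[0] - tl[0], bl[1] - tl[1])
    n_top = math.hypot(*v_top) + eps
    n_left = math.hypot(*v_left) + eps
    dot = v_top[0] * v_left[0] + v_top[1] * v_left[1]
    e_angle = abs(dot) / (n_top * n_left)                        # assumed form
    # A24: direction prior, top side along +x and left side along +y (assumed form).
    e_dir = (1.0 - v_top[0] / n_top) + (1.0 - v_left[1] / n_left)
    # A25: weighted sum of the four errors (weights are hypothetical).
    a, b, c, d = weights
    return a * e_w + b * e_h + c * e_angle + d * e_dir

def sort_corners(points, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the four points ordered as [TL, TR, BR, BL]."""
    best = min(permutations(points, 4),
               key=lambda p: permutation_score(*p, weights=weights))
    return list(best)
```

With only 24 candidates, exhaustive evaluation is cheap, and the signed direction term is what distinguishes the correct assignment from its rotated or mirrored variants.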
[0167] Optionally, in the preprocessing method for hyperspectral images of soil profiles provided in the embodiments of the present invention, the above-mentioned S204 may further include S204b1 to S204b2 as described below.
[0168] S204b1. The preprocessing system determines the target width based on the horizontal distance between the upper and lower edges of the sorted key points; and determines the target height based on the vertical distance between the left and right edges of the sorted key points.
[0169] The dimensions of the target pseudo-color image are: target width × target height.
[0170] For example, the target width is calculated based on formula (10), and the target height is calculated based on formula (11).
[0171] Formula (10): $W = \max(w_1, w_2)$
[0172] where the first width $w_1$ represents the horizontal distance of the bottom edge, the second width $w_2$ represents the horizontal distance of the top edge, and $\max(w_1, w_2)$ indicates that the larger of the first width and the second width is selected as the target width $W$.
[0173] Formula (11): $H = \max(h_1, h_2)$
[0174] where the first height $h_1$ represents the vertical distance of the right edge, the second height $h_2$ represents the vertical distance of the left edge, and $\max(h_1, h_2)$ indicates that the larger of the first height and the second height is selected as the target height $H$.
[0175] S204b2. The preprocessing system generates a first target matrix based on the target width and target height, and calculates a first perspective transformation matrix based on the sorted keypoint coordinates and the first target matrix.
[0176] The first target matrix indicates the coordinates of the four corners of the soil profile in the target pseudo-color image and the size of the target pseudo-color image. The first target matrix is a 4×2 matrix. Each row of the first target matrix represents the coordinates of a target corner point in the target pseudo-color image. The first row represents the upper left corner of the target pseudo-color image, the second row represents the upper right corner, the third row represents the lower right corner, and the fourth row represents the lower left corner.
[0177] Specifically, after obtaining the first target matrix, that is, after obtaining the key point coordinates of the target pseudo-color image, the first perspective transformation matrix is calculated based on the key point coordinates of the pseudo-color image to be cropped, the key point coordinates of the target pseudo-color image, and formula (13).
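[0178] Formula (13): $s \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = M_1 \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
[0179] where $(x, y)$ are the coordinates of the sorted key points in the pseudo-color image to be cropped, $(x', y')$ are the coordinates of the target key points in the target pseudo-color image, $M_1$ represents the first perspective transformation matrix (a 3×3 matrix), and $s$ is the homogeneous scale factor.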
[0180] Based on this scheme, the target width and target height are determined according to the sorted key points. The tilt and perspective distortion caused by the shooting angle can be uniformly corrected into a standard rectangular area. Based on the corrected standard rectangular area, the first perspective transformation matrix is generated, which provides support for the subsequent high-precision automatic cropping of the soil profile area.
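The computation of the target size, the target matrix, and the perspective transformation matrix can be sketched with NumPy by solving the eight linear equations given by the four point correspondences. The corner coordinates of the target matrix (0 to width-1 and 0 to height-1) and the rounding to whole pixels are assumptions of this sketch. In practice, OpenCV's `cv2.getPerspectiveTransform` computes the same kind of matrix from four point pairs.

```python
import numpy as np

def target_size(tl, tr, br, bl):
    """Target width/height as the larger of the two opposite edge extents."""
    width = max(abs(br[0] - bl[0]), abs(tr[0] - tl[0]))
    height = max(abs(br[1] - tr[1]), abs(bl[1] - tl[1]))
    return int(round(width)), int(round(height))   # rounding is an assumption

def target_matrix(width, height):
    """4x2 matrix of target corners in the order TL, TR, BR, BL (assumed coordinates)."""
    return np.array([[0, 0], [width - 1, 0],
                     [width - 1, height - 1], [0, height - 1]], dtype=np.float64)

def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix M with M[2, 2] = 1 that maps src points to dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (m11 x + m12 y + m13) / (m31 x + m32 y + 1), and likewise for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=np.float64),
                        np.array(rhs, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)
```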
[0181] Optionally, in the preprocessing method for hyperspectral images of soil profiles provided in the embodiments of the present invention, the above-mentioned S205 can be specifically performed by the following S205a1 and S205a2.
[0182] S205a1. The preprocessing system calls the perspective transformation algorithm. Based on the first perspective transformation matrix, the irregular region determined by the sorted key points in the pseudo-color image to be cropped is transformed into a rectangular region through perspective transformation, and the output size of the pseudo-color image is obtained.
[0183] It should be noted that due to errors in the shooting angle of the soil profile, the resulting quadrilateral is usually a non-standard rectangle, i.e., perspective distortion exists. Therefore, the perspective transformation algorithm described above can correct the soil profile area into a standard rectangle.
[0184] S205a2, The preprocessing system crops the pseudo-color image to be cropped based on the transformed rectangular region and the output size of the pseudo-color image, thus obtaining the target pseudo-color image.
[0185] Specifically, the perspective transformation function is called, and based on the first perspective transformation matrix calculated above, the pseudo-color image to be cropped is resampled and remapped, thereby transforming the irregular region determined by the marked key points into a rectangle. Based on the first perspective transformation matrix and the output size, cropping is performed to obtain the target pseudo-color image.
[0186] The cropped target pseudo-color image contains only the soil profile area and can be used for display and subsequent manual review.
[0187] Based on this scheme, the tilt and perspective distortion caused by the shooting angle are uniformly corrected into a standard rectangular area by using the first perspective transformation matrix and the output size, thereby achieving high-precision automatic cropping of the soil profile area.
[0188] Optionally, if the target object includes TIF data to be processed, then in the preprocessing method for soil profile hyperspectral images provided in this embodiment of the invention, the above-mentioned S205 may further include S205b1 to S205b5.
[0189] S205b1. When the target object includes TIF data to be processed, the preprocessing system obtains the set pixel size and geographic range of the TIF data.
[0190] The geographical range can be represented as: (minimum X coordinate of the target area, minimum Y coordinate of the target area, maximum X coordinate of the target area, maximum Y coordinate of the target area).
[0191] For example, the user sets the pixel size (pixel_size) = 1.5955 meters, and the geographic range of the output target TIF data is $(x_{min}, y_{min}, x_{max}, y_{max}) = (0, 0, 1000, 220)$, where (0, 0) represents the minimum X coordinate (easting) and the minimum Y coordinate (northing) of the target area, and (1000, 220) represents the maximum X coordinate and the maximum Y coordinate of the target area. The unit of the coordinates is meters.
[0192] S205b2. The preprocessing system generates a geographic affine transformation matrix based on the set pixel size and geographic range, calculates the output size of the TIF data, and calculates the second perspective transformation matrix based on the height of the output size of the TIF data.
[0193] The affine transformation matrix is used to align TIF data with the geographic coordinate system. In other words, it tells the GIS the actual coordinates of the (row, column)th pixel in the output image, without changing the pixel content; it only handles georeferencing.
[0194] It should be noted that if the output size of the TIF data is the same as the output size of the pseudo-color image, then there is no need to calculate the second perspective transformation matrix, and the first perspective transformation matrix can be reused as the second perspective transformation matrix.
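[0195] For example, the output width of the TIF data is $W_{tif} = (x_{max} - x_{min}) / \text{pixel\_size}$ and the output height of the TIF data is $H_{tif} = (y_{max} - y_{min}) / \text{pixel\_size}$, each taken as a whole number of pixels. The geographic affine transformation matrix is then $A = \begin{bmatrix} \text{pixel\_size} & 0 & x_0 \\ 0 & -\text{pixel\_size} & y_0 \end{bmatrix}$, where $x_0$ is the X coordinate of the top-left corner of the target area, $x_0 = x_{min}$, and $y_0$ is the Y coordinate of the top-left corner of the target area, $y_0 = y_{max}$.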
[0196] Using the output width and height of the TIF data, a second target matrix is generated. Like the first target matrix, the second target matrix lists the target corner coordinates in the order top-left, top-right, bottom-right, bottom-left. Then, based on the sorted key points and these target points, the second perspective transformation matrix is calculated using the above formula (13).
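The output size and the geographic affine transformation of S205b2 can be sketched in plain Python. The rounding rule and the north-up convention (negative pixel height, origin at the top-left corner) are assumptions of this sketch; the six coefficients follow the (a, b, c, d, e, f) ordering used by the `affine` package. The function names are hypothetical.

```python
def geo_grid(pixel_size, bounds):
    """Return (width, height, transform) for a target area.

    bounds = (x_min, y_min, x_max, y_max) in map units (meters).
    transform = (a, b, c, d, e, f) with
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    x_min, y_min, x_max, y_max = bounds
    width = int(round((x_max - x_min) / pixel_size))    # rounding is an assumption
    height = int(round((y_max - y_min) / pixel_size))
    # North-up raster: the origin is the top-left corner and rows run southwards.
    transform = (pixel_size, 0.0, x_min, 0.0, -pixel_size, y_max)
    return width, height, transform

def pixel_to_map(transform, col, row):
    """Map coordinates of the top-left corner of pixel (col, row)."""
    a, b, c, d, e, f = transform
    return a * col + b * row + c, d * col + e * row + f
```

For the example values in the text, `geo_grid(1.5955, (0, 0, 1000, 220))` gives a 627 × 138 pixel grid under this rounding rule. The transform only assigns coordinates to pixels; it does not alter pixel values.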
[0197] S205b3, The preprocessing system calls the perspective transformation function to perform perspective transformation on the TIF data to be processed based on the second perspective transformation matrix.
[0198] S205b4. Based on the transformed rectangular region and the output size of the TIF data, the TIF data to be processed is cropped, and the cropped TIF data is geo-registered based on the geographic affine transformation matrix to obtain the target TIF data.
[0199] Each band of data is converted to a 32-bit floating-point number.
[0200] After cropping the TIF data based on the second perspective transformation matrix, a linear interpolation method is used to align the spatial pixel positions of the cropped TIF data with the spatial pixel positions in the TIF data to be processed, and the spatial pixel values outside the transformation area are set to invalid values NaN (Not a Number) to facilitate subsequent processing.
[0201] For example, the perspective transformation is performed by applying the perspective transformation matrix with the cv2.warpPerspective function of OpenCV (Open Source Computer Vision Library), using the linear interpolation flag cv2.INTER_LINEAR.
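To make the resampling step concrete, the following dependency-free NumPy sketch applies a perspective transformation matrix to one band using inverse mapping with bilinear interpolation, and marks output pixels whose source position falls outside the input as NaN. It illustrates the idea behind `cv2.warpPerspective` with `cv2.INTER_LINEAR`; it is not OpenCV's implementation, and the function name `warp_band` is hypothetical.

```python
import numpy as np

def warp_band(band, M, out_w, out_h):
    """Warp a 2-D band (at least 2x2) with the 3x3 matrix M into an (out_h, out_w) raster."""
    h, w = band.shape
    band = band.astype(np.float64)
    # Homogeneous coordinates of every output pixel.
    xs, ys = np.meshgrid(np.arange(out_w), np.arange(out_h))
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)]).astype(np.float64)
    # Inverse mapping: find the source position of each output pixel.
    src = np.linalg.inv(M) @ pts
    with np.errstate(divide="ignore", invalid="ignore"):
        sx, sy = src[0] / src[2], src[1] / src[2]
    tol = 1e-9
    valid = (sx >= -tol) & (sx <= w - 1 + tol) & (sy >= -tol) & (sy <= h - 1 + tol)
    # Neutralize invalid positions so the indexing below stays in range.
    sx = np.clip(np.where(valid, sx, 0.0), 0.0, w - 1)
    sy = np.clip(np.where(valid, sy, 0.0), 0.0, h - 1)
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    fx, fy = sx - x0, sy - y0
    # Bilinear blend of the four neighbouring source pixels.
    out = ((1 - fx) * (1 - fy) * band[y0, x0]
           + fx * (1 - fy) * band[y0, x0 + 1]
           + (1 - fx) * fy * band[y0 + 1, x0]
           + fx * fy * band[y0 + 1, x0 + 1])
    out[~valid] = np.nan   # pixels outside the transformation area
    return out.reshape(out_h, out_w).astype(np.float32)
```

For multi-band TIF data the same matrix would be applied to every band, which keeps all bands aligned with one another.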
[0202] S205b5: The preprocessing system updates the metadata of the target TIF data, writes data to each band, and calculates and updates the statistical information of the target TIF data.
[0203] The metadata of the target TIF data includes the affine transformation matrix.
[0204] Specifically, based on the TIF data parameter settings, the metadata of the target TIF data that needs to be updated includes: target TIF data size, second affine transformation matrix, and second CRS.
[0205] For example, a second CRS can be generated based on the Albers projection of ESRI:102025 and added to the metadata of the target TIF data so that the target pseudocolor image can be accurately located in GIS.
[0206] For each cropped band of data, select the non-NaN effective spatial pixels and calculate the minimum, maximum, mean, and standard deviation of the effective pixels for each band. Write the calculated statistical information into the metadata tags of the TIF data for each band to facilitate subsequent quality control and data analysis.
[0207] Minimum value: ;
[0208] Maximum value: ;
[0209] Mean: ;
[0210] Standard deviation: .
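A minimal sketch of the per-band statistics step is given below. It assumes that a band is held as a list of rows of floats and that the standard deviation is the population form (division by the number of valid pixels); neither detail is confirmed by the text above. The function names and the GDAL-style tag names (`STATISTICS_MINIMUM` and so on) are likewise assumptions for illustration.

```python
import math


def band_statistics(band):
    """Return min, max, mean and standard deviation of the valid pixels.

    NaN marks pixels outside the transformed area and is skipped.
    The population standard deviation is an assumption of this sketch.
    """
    valid = [v for row in band for v in row if not math.isnan(v)]
    if not valid:
        return None  # the band contains no valid pixels
    n = len(valid)
    mean = sum(valid) / n
    variance = sum((v - mean) ** 2 for v in valid) / n
    return {
        "min": min(valid),
        "max": max(valid),
        "mean": mean,
        "std": math.sqrt(variance),
    }


def statistics_tags(stats):
    """Format the statistics as string metadata tags (names assumed)."""
    return {
        "STATISTICS_MINIMUM": str(stats["min"]),
        "STATISTICS_MAXIMUM": str(stats["max"]),
        "STATISTICS_MEAN": str(stats["mean"]),
        "STATISTICS_STDDEV": str(stats["std"]),
    }
```

Skipping NaN before aggregating keeps the invalid border pixels from contaminating the statistics, which is the reason the text restricts the calculation to effective pixels.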
[0211] Based on this scheme, when the object to be processed is TIF data, the preprocessing system can call the previously generated TIF data and correct the tilt and perspective distortion caused by the shooting angle into a standard rectangular area according to the above processing method. This achieves high-precision automatic cropping and georegistration of TIF data in soil profile areas, simplifying the steps of manual operation.
[0212] Optionally, the preprocessing method for soil profile hyperspectral images provided in the embodiments of the present invention also provides an interactive interface. Figure 5 is a schematic diagram of an interactive visual interface of the preprocessing system provided in an embodiment of the present invention. As shown in Figure 5, the interactive visual interface includes multiple navigation buttons and two display areas. The top navigation buttons include automatic cropping (501), manual cropping (502), and manual annotation (503). The first display area (504) displays the pseudo-color image before preprocessing, and the second display area (505) displays the pseudo-color image after preprocessing. The "Previous" (506) and "Next" (507) buttons switch between the pseudo-color images to be cropped in the first display area (504), and the "Previous" (508) and "Next" (509) buttons switch between the cropped pseudo-color images in the second display area (505). The bottom navigation buttons include buttons for opening the soil profile folder (510), correction and cropping (511), and exporting the cropped TIF data (512).
[0213] Furthermore, the preprocessing method for hyperspectral images of soil profiles provided in this embodiment of the invention may also include the following step S206.
[0214] S206. In response to the user's target input in the interactive interface, the preprocessing system displays the pseudo-color image after the target input in the second display area.
[0215] The target input can be any of the following: browsing the original pseudo-color image, scaling the original pseudo-color image, key point annotation, cropping the original pseudo-color image, cropping the TIF data to be processed, previewing the cropped target pseudo-color image, or outputting the cropping result.
[0216] For example, the interactive visual interface described above is an interactive interface based on PyQt5.
[0217] Optionally, users can choose to automatically annotate key points in the interactive interface, or they can choose to manually annotate key points in the interactive interface.
[0218] The cropping output includes at least one of the following: target TIF data and target pseudo-color image.
[0219] When a user chooses to manually annotate key points, they can manually select four key points of the soil profile in the first display area. The order in which the user selects the key points is unrestricted. After acquiring the four key points annotated by the user with an external input device on the displayed pseudo-color image to be cropped, the preprocessing system sorts the user-annotated key points in the order top-left, top-right, bottom-right, bottom-left.
[0220] It is understandable that cropping using the four sorted key points can ensure that the cropping results of the manually annotated pseudo-color image to be cropped and the TIF data to be processed remain completely consistent in spatial range.
[0221] It should be noted that, for both automatic and manual cropping methods, the preprocessing system can store the first soil profile data set and the second soil profile data set respectively. Each soil profile data set includes: file path and cropping result.
[0222] Based on this scheme, the preprocessing system can implement both automatic and manual cropping. Through user operations on the display interface, users can browse, zoom, manually or automatically annotate key points, view automatic annotation results, and preview the cropped target pseudo-color image effect (including automatic and manual). In other words, the original pseudo-color image, key point detection results, and cropped soil profile area can be quickly previewed and navigated on the same interface. After cropping, the corresponding TIF data cropping results and metadata are associated and saved. This reduces the operational costs caused by repeated switching between multiple software programs and repeated saving of intermediate results in traditional processes. It achieves end-to-end synchronous management from image display to hyperspectral analysis data, improving the operability and traceability of batch task processing.
[0223] Optionally, in the preprocessing method for a hyperspectral image of a soil profile provided in this embodiment of the invention, after S204 above, S207 and S208 may also be included.
[0224] S207. If the clipping effect does not meet the preset conditions when the key point annotation model performs automatic annotation, the preprocessing system prompts the user to switch the annotation method to manual annotation.
[0225] Specifically, the user clicks with the mouse on four key points of the first soil profile on the pseudo-color image, in any order.
[0226] The manual annotation of key points can be done by the user manually annotating four key points in the pseudo-color image to be cropped in the first display area, or by adjusting the position of at least one of the four key points based on the result of automatic annotation of the pseudo-color image to be cropped in the first display area.
[0227] S208. The preprocessing system acquires and sorts four key points of the soil profile region marked by the user in the pseudo-color image to be cropped displayed on the interactive interface, and regenerates the third perspective transformation matrix based on the sorted four key points.
[0228] Next, the irregular quadrilateral formed by the re-annotated key points is transformed into a rectangle through the third perspective transformation, and the soil profile area is cropped based on the transformed rectangle.
[0229] In other words, after the user manually re-annotates, the preprocessing system automatically sorts the key points according to their coordinate sums and differences and constructs the corresponding perspective transformation matrix. The pseudo-color image to be cropped and the TIF data to be processed are then simultaneously subjected to perspective correction and cropping.
[0230] Based on this scheme, when automatic cropping is ineffective, users can freely click on four key points of the soil profile on the pseudo-color image to annotate them in a free order. The system automatically sorts these points based on their coordinate sums and differences and constructs the corresponding perspective transformation matrix. Perspective correction and cropping are then performed simultaneously on the pseudo-color image and TIF data. If automatic annotation results in missed or false detections, or if the cropping effect fails to meet preset conditions due to blurred boundaries, the system can prompt a switch to manual annotation and automatically sort and reconstruct the perspective matrix for the manually annotated key points. This allows for human-machine collaborative correction of abnormal samples without compromising geometric consistency, preventing a small number of failed samples from causing the entire batch of tasks to be interrupted and improving the system's robustness and usability in complex scenarios.
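The sorting by coordinate sums and differences can be sketched as follows. The text does not spell out the exact rule, so this example uses the common convention for image coordinates with y increasing downward: the top-left corner has the smallest x + y, the bottom-right the largest x + y, the top-right the smallest y - x, and the bottom-left the largest y - x. The function name `order_corners` is hypothetical.

```python
def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Assumes image coordinates (y grows downward) and a quadrilateral that is
    not rotated so far that two corners compete for the same role.
    """
    pts = [tuple(p) for p in points]
    if len(pts) != 4:
        raise ValueError("exactly four key points are required")
    top_left = min(pts, key=lambda p: p[0] + p[1])
    bottom_right = max(pts, key=lambda p: p[0] + p[1])
    top_right = min(pts, key=lambda p: p[1] - p[0])
    bottom_left = max(pts, key=lambda p: p[1] - p[0])
    ordered = [top_left, top_right, bottom_right, bottom_left]
    if len(set(ordered)) != 4:
        raise ValueError("key points are ambiguous; re-annotation is needed")
    return ordered
```

Because the rule depends only on the coordinates, the user can click the corners in any order and the result is the same, which is what keeps the pseudo-color image and the TIF data consistent in spatial range. The ambiguity check gives the system a point at which to ask for re-annotation instead of producing a distorted crop.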
[0231] Optionally, in the preprocessing method for hyperspectral images of soil profiles provided in the embodiments of the present invention, after S208 above, S209 and S210 may also be included.
[0232] S209. The preprocessing system uses the pseudo-color image to be cropped and the user-annotated key points as online self-learning training samples.
[0233] S210. When the online self-learning conditions are met, the preprocessing system performs online self-learning training of the key point annotation model based on the samples in the online self-learning training sample set.
[0234] The online self-learning conditions include at least one of the following: the number of times the preprocessing system has switched from automatic annotation to manual annotation is greater than a first preset number; the total number of annotations in the preprocessing system is greater than a second preset number; or the number of samples in the online self-learning training sample set reaches a preset number.
[0235] Based on this scheme, the YOLOv8-pose model is incrementally fine-tuned online through manual annotation results. Key points confirmed by the user during the manual cropping process are automatically added to the training set as new samples. When preset trigger conditions are met, the key point annotation model is periodically fine-tuned and trained, so that the model can gradually adapt to different soil types, imaging illumination and noise levels, reduce the proportion of manual intervention in subsequent samples, and thus achieve continuous improvement in model accuracy and optimization of system efficiency in the long term.
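The trigger check for online self-learning could be expressed roughly as below. The class `SelfLearningState`, the function `should_fine_tune`, and all parameter names are hypothetical, and reading "meets the preset number" as "greater than or equal to" is an assumption of this sketch. The fine-tuning itself is outside the scope of the example.

```python
from dataclasses import dataclass


@dataclass
class SelfLearningState:
    """Counters the system would maintain while processing a batch."""
    manual_switch_count: int = 0      # switches from automatic to manual annotation
    total_annotation_count: int = 0   # all annotations performed so far
    sample_count: int = 0             # samples in the online self-learning set


def should_fine_tune(state, first_preset, second_preset, sample_preset):
    """Return True when at least one online self-learning condition holds."""
    return (
        state.manual_switch_count > first_preset
        or state.total_annotation_count > second_preset
        or state.sample_count >= sample_preset
    )
```

Keeping the three conditions in a single predicate makes it easy to evaluate the trigger after every confirmed manual annotation without touching the training code.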
[0236] For example, Figure 6 is a schematic diagram of the overall cropping logic provided by an embodiment of the present invention. As shown in Figure 6, when preparing for cropping, the type of the cropping object is first determined. If the cropping object is a pseudo-color image, the first pseudo-color image is read; if the cropping object is TIF data, the first TIF data and the first pseudo-color image are read. Then, it is determined whether the annotation method is automatic or manual. If it is manual annotation, the coordinates of the positions manually annotated by the user in the first pseudo-color image displayed on the interface are obtained. If it is automatic annotation, the key point recognition model is called to automatically annotate the first pseudo-color image and obtain the key point coordinates of the soil profile, and it is determined whether the number of key points is equal to 4. If the number of key points is not equal to 4, the system switches to manual annotation; if the number of key points is equal to 4, the key point coordinates are sorted. For pseudo-color image cropping, the actual size of the output image is calculated based on the sorted key points, the perspective transformation matrix is calculated based on the actual size of the output image, and the perspective transformation function is called for cropping. For TIF data cropping, the set pixel size and geographic range of the output TIF data are obtained, and the width and height of the cropped spectral data are calculated from them. It is then determined whether the output size of the pseudo-color image and the output size of the TIF data are the same. If they are the same, the perspective transformation matrix of the first pseudo-color image is reused; otherwise, a perspective transformation matrix for the first TIF data is generated. Then, the perspective transformation function is called to crop the first TIF data. Based on the width and height of the cropped spectral data, an affine transformation matrix is generated, metadata is updated, data for each band is written, and statistical information is calculated and updated.
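The size calculations and the matrix reuse decision in this flow can be sketched as follows. All function names are hypothetical. Taking the longer of two opposite edges as the output width or height, rounding to whole pixels, and using a north-up six-element geotransform in GDAL ordering are assumptions of this example, not details confirmed by the text.

```python
import math


def output_size_from_corners(top_left, top_right, bottom_right, bottom_left):
    """Output size of the pseudo-color crop from the sorted key points."""
    width = max(math.dist(top_left, top_right), math.dist(bottom_left, bottom_right))
    height = max(math.dist(top_left, bottom_left), math.dist(top_right, bottom_right))
    return round(width), round(height)


def output_size_from_extent(x_min, y_min, x_max, y_max, pixel_size):
    """Output size of the TIF crop from the set geographic range and pixel size."""
    return round((x_max - x_min) / pixel_size), round((y_max - y_min) / pixel_size)


def north_up_geotransform(x_min, y_max, pixel_size):
    """Affine geotransform: origin at the upper-left corner, no rotation."""
    return (x_min, pixel_size, 0.0, y_max, 0.0, -pixel_size)


def can_reuse_matrix(image_size, tif_size):
    """The pseudo-color matrix is reused only when both output sizes match."""
    return image_size == tif_size


image_size = output_size_from_corners((0, 0), (100, 0), (100, 50), (0, 50))
tif_size = output_size_from_extent(0.0, 0.0, 1.0, 0.5, 0.01)
reuse = can_reuse_matrix(image_size, tif_size)
```

When the two sizes differ, the destination rectangle changes, so a separate perspective transformation matrix has to be computed for the TIF data from the same sorted key points.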
[0237] It is understood that the preprocessing system provided in this embodiment of the invention integrates steps such as pseudo-color image generation, radiometric correction, geometric correction, georegistration, and automatic / manual cropping into one processing system. It manages data import, parameter configuration, processing progress, and result preview through a unified graphical user interface. The preprocessing system performs batch task processing in a queue and result export, which can improve the preprocessing efficiency of hyperspectral data in soil profile areas.
[0238] Corresponding to the embodiments of the aforementioned methods, the present invention also provides embodiments of the apparatus and the terminal to which it is applied.
[0239] The embodiments of the preprocessing method for hyperspectral images of soil profiles of the present invention can be applied to computer devices, such as servers or terminal devices. The method embodiments can be implemented in software, hardware, or a combination of both. Taking software implementation as an example, the apparatus, as a logical device, is formed by the processor of the soil profile hyperspectral image preprocessing device loading the corresponding computer program instructions from non-volatile memory into memory for execution. From a hardware perspective, Figure 7 is a hardware structure diagram of a computer device on which the preprocessing method for a soil profile hyperspectral image provided in an embodiment of the present invention runs. In addition to the processor 710, memory 730, network interface 720, and non-volatile memory 740 shown in Figure 7, the server or electronic device on which the preprocessing method of the embodiment runs may also include other hardware depending on the actual function of the computer device, which will not be described in detail here.
[0240] In one embodiment, a computer device is provided, comprising: a memory and a processor, the memory storing a computer program, the processor executing the computer program to implement any step in the preprocessing method for the above-described hyperspectral image of a soil profile.
[0241] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, can perform any step in the preprocessing method for the above-described hyperspectral image of a soil profile.
[0242] The specific implementation process of the functions and roles of each module in the above device can be found in the implementation process of the corresponding steps in the above method, and will not be repeated here.
[0243] For the device embodiments, since they basically correspond to the method embodiments, the relevant parts can be referred to in the description of the method embodiments. The device embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules can be selected to achieve the purpose of the present invention according to actual needs. Those skilled in the art can understand and implement this without creative effort.
[0244] The foregoing has described specific embodiments of the invention. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps described in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired results. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
[0245] Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention claimed herein. This invention is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary techniques in the art not claimed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of the invention are indicated by the following claims.
[0246] It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.
[0247] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A preprocessing method for hyperspectral images of soil profiles, characterized in that, The method includes: The preprocessing system generates a first hyperspectral data cube based on the hyperspectral data of the first soil profile, and extracts the center wavelength of each band corresponding to the first soil profile to obtain a first wavelength set. The preprocessing system generates a target object to be processed based on the first hyperspectral data cube and the center wavelength in the first wavelength set; wherein, the target object includes at least one of the following: a pseudo-color image to be cropped, and TIF (Tagged Image File Format) data to be processed; the pseudo-color image to be cropped is a color image obtained by stitching together the red, green and blue single-channel color images extracted by the preprocessing system based on the first hyperspectral data cube; The preprocessing system acquires key points of the marked soil profile area in the target object and extracts the coordinates of the key points; The preprocessing system sorts the marked key points, and generates a target perspective transformation matrix based on the sorted key points; The preprocessing system, based on the target perspective transformation matrix and the target size of the target object, clips the soil profile area in the target object, and outputs the clipped target object if the clipping effect meets preset conditions.
2. The method according to claim 1, characterized in that, The preprocessing system generates a target object to be processed based on the first hyperspectral data cube and the center wavelength in the first wavelength set, including: The preprocessing system determines the index of the target color band based on the first wavelength set and the preset color wavelengths; The preprocessing system extracts three single-channel color images of red, green and blue light corresponding to the index of the target color band from the first hyperspectral data cube; The preprocessing system performs contrast stretching and normalization on the three extracted single-channel color images, and then stitches them together in the order of red, green, and blue to obtain the three-channel original pseudo-color image.
3. The method according to claim 2, characterized in that, The preprocessing system determines the index of the target color band based on the first wavelength set and a preset color wavelength, including: Based on the first preset formula, determine the index of the band in the first wavelength set that is closest to the preset color wavelength; The first preset formula includes: $k = \arg\min_{i} \left| \lambda_i - \lambda_c \right|$; where $k$ indicates the index of the band, $\lambda_i$ indicates the center wavelength of the $i$-th band in the first wavelength set, $\lambda_c$ represents the preset color wavelength, which can be any one of the red, green, and blue wavelengths, and argmin represents the index of the minimum value.
4. The method according to claim 3, characterized in that, The method further includes: The preprocessing system corrects the display orientation of the original pseudo-color image to the orientation of the key point recognition model's display image, saves it as a format file that the key point recognition model can read, and obtains the pseudo-color image to be cropped. The key point recognition model is obtained by pre-training a deep learning model.
5. The method according to claim 1, characterized in that, The preprocessing system generates a target object to be processed based on the first hyperspectral data cube and the center wavelength in the first wavelength set, including: The preprocessing system obtains a reference DN value array based on the digital quantization value (DN) file of the reference plate, and obtains a first reference reflectance value array based on the reflectance file of the reference plate; The preprocessing system calls a pre-built target interpolation function to interpolate the first wavelength value array and the first reflectance value array to obtain the second reference reflectance array corresponding to the first wavelength set. The preprocessing system converts the DN value of each band of each spatial pixel in the first hyperspectral data cube into a reflectance value based on the DN value of each band of the first hyperspectral data cube, the reference DN value array, and the second reference reflectance array, to obtain a first reflectance cube; After performing geometric correction on the first reflectance cube according to the geographic coordinate system direction, the preprocessing system converts it to band-sequential (BSQ) order as required for writing TIF data, sets geographic reference information according to the preset projection model, and generates the TIF data to be processed.
6. The method according to claim 5, characterized in that, Before the preprocessing system calls the pre-built target interpolation function, the method further includes: The preprocessing system, based on the first wavelength value array, the first reflectance value array, and the first wavelength set, uses linear interpolation to construct the target interpolation function based on a second preset formula. The second preset formula includes: ; where the symbols of the formula denote the target interpolation function, the first wavelength value array (ref_wl), the first reflectance value array, and the center wavelength in the first wavelength set.
7. The method according to claim 1, characterized in that, The preprocessing system sorts the marked key points, including: The preprocessing system extracts four key points based on the marked positions in the pseudo-color image to be cropped, and obtains all permutations of the four key points located at the four corner positions of the quadrilateral. The preprocessing system calculates the geometric consistency index for each arrangement, and determines the comprehensive score for each arrangement based on the geometric consistency index and a preset comprehensive scoring function. The geometric consistency index includes: side length consistency error, rectangle orthogonality error, and direction prior error. The side length consistency error includes: width consistency error and height consistency error. The preprocessing system uses the order of the key points in the arrangement with the lowest comprehensive score as the order of the four corners of the quadrilateral.
8. The method according to claim 7, characterized in that, The preprocessing system calculates the geometric consistency index for each permutation, including: The preprocessing system calculates the top and bottom side lengths of the quadrilateral formed by the arrangement based on the order of the four key points in the arrangement, and calculates the width consistency error based on the top and bottom side lengths. The preprocessing system calculates the left and right heights of the quadrilateral formed by the arrangement based on the order of the four key points in the arrangement, and calculates the height consistency error based on the left and right heights. The preprocessing system calculates the top and left side vectors of the quadrilateral formed by the order of the four key points in the arrangement, and calculates the included angle deviation value based on the top and left side vectors of the quadrilateral. The preprocessing system calculates the horizontal and vertical deviations of the quadrilateral formed by the arrangement of four key points, and determines the comprehensive directional error based on the horizontal and vertical deviations. The preprocessing system calculates a comprehensive score for each arrangement based on width consistency error, height consistency error, included angle deviation, and comprehensive direction error.
9. The method according to claim 1, characterized in that, The preprocessing system sorts and marks key points, and generates a target perspective transformation matrix based on the sorted key points, including: The preprocessing system determines the target width based on the horizontal distance between the upper and lower edges of the sorted key points; and determines the target height based on the vertical distance between the left and right edges of the sorted key points. The preprocessing system generates a first target matrix based on the target width and the target height, and calculates a first perspective transformation matrix based on the sorted keypoint coordinates and the first target matrix.
10. The method according to claim 1, characterized in that, The preprocessing system, based on the target perspective transformation matrix and the target size of the target object, trims the soil profile region in the target object, including: The preprocessing system calls the perspective transformation algorithm. Based on the first perspective transformation matrix, it transforms the irregular regions determined by the sorted key points in the pseudo-color image to be cropped into rectangular regions through perspective transformation, and obtains the output size of the pseudo-color image. The preprocessing system crops the pseudo-color image to be cropped based on the transformed rectangular region and the output size of the pseudo-color image, thus obtaining the target pseudo-color image.
11. The method according to claim 10, characterized in that, The preprocessing system, based on the target perspective transformation matrix and the target size of the target object, trims the soil profile region in the target object, including: When the target object includes the TIF data to be processed, the preprocessing system obtains the set pixel size and geographic range of the TIF data; The preprocessing system generates a geographic affine transformation matrix based on the set pixel size and geographic range, calculates the output size of the TIF data, and calculates a second perspective transformation matrix based on the output size of the TIF data. The preprocessing system calls the perspective transformation function to perform perspective transformation on the TIF data to be processed according to the second perspective transformation matrix. Based on the transformed rectangular region and the output size of the TIF data, the TIF data to be processed is cropped, and georegistration is performed based on the geographic affine transformation matrix to obtain the target TIF data; The preprocessing system updates the metadata of the target TIF data, writes data for each band, calculates and updates the statistical information of the target TIF data, and the metadata of the target TIF data includes the affine transformation matrix.
12. The method according to claim 1, characterized in that, The preprocessing system includes an interactive visual interface, which includes a first display area and a second display area; the method further includes: In response to a user’s target input in the interactive visual interface, the preprocessing system displays the pseudo-color image after the target input in the second display area; The target input can be any one of the following: browsing the pseudo-color image to be cropped, scaling the pseudo-color image to be cropped, key point annotation, cropping the pseudo-color image to be cropped, cropping the TIF data to be processed, previewing the cropped target pseudo-color image, or outputting the cropping result.
13. The method according to claim 12, characterized in that, The method further includes: If the clipping effect does not meet the preset conditions when the key point annotation model performs automatic annotation, the preprocessing system prompts the user to switch the annotation method to manual annotation. The preprocessing system acquires and sorts the key points of the soil profile area marked by the user in the pseudo-color image to be cropped displayed on the interactive interface, and regenerates the target perspective transformation matrix based on the sorted key points.
14. A preprocessing system for hyperspectral images of soil profiles, characterized in that, The preprocessing system includes: a data reading module, a preprocessing object generation module, a key point extraction module, a perspective transformation matrix generation module, and a hyperspectral data cropping module; The data reading module is used to generate a first hyperspectral data cube based on the hyperspectral data of the first soil profile, and extract the center wavelength of each band corresponding to the first soil profile to obtain a first wavelength set. The preprocessing object generation module is used to generate a target object to be processed based on the first hyperspectral data cube and the center wavelength in the first wavelength set; wherein, the target object includes at least one of the following: a pseudo-color image to be cropped, and TIF (Tagged Image File Format) data to be processed; the pseudo-color image to be cropped is a color image obtained by stitching together the red, green and blue single-channel color images extracted by the preprocessing system based on the first hyperspectral data cube; The key point extraction module is used to obtain key points in the marked soil profile area of the target object and extract the coordinates of the key points. The perspective transformation matrix generation module is used to sort the marked key points and generate a target perspective transformation matrix based on the sorted key points. The hyperspectral data cropping module is used to crop the soil profile region in the target object based on the target perspective transformation matrix and the target size of the target object, and output the cropped target object when the cropping effect meets the preset conditions.