Insect recognition method based on hyperspectral image and multi-scale fusion network
By constructing an insect identification method based on hyperspectral images and multi-scale fusion networks, the problems of low insect identification accuracy and high computational resource consumption in existing technologies are solved, realizing a lightweight and efficient insect identification model that is suitable for real-time monitoring in agricultural fields.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN AGRI UNIV
- Filing Date
- 2026-05-28
- Publication Date
- 2026-06-26
AI Technical Summary
Existing deep learning-based insect identification methods fail to fully exploit the multi-scale, three-dimensional coupling features of hyperspectral data, resulting in low identification accuracy, poor generalization ability, complex models, high computational resource consumption, and difficulty in real-time monitoring in agricultural fields.
An insect recognition method based on hyperspectral images and multi-scale fusion networks is adopted. A lightweight and efficient insect recognition model is constructed by using a three-layer feature extraction backbone network, a multi-scale fusion module and a width learning system classifier. The 3D convolutional composite attention module is used to process the three-dimensional data cube of insects, fuse multi-scale features, and calculate the output weight matrix through a pseudo-inverse solving module.
The method improves the accuracy and robustness of insect identification, enhances the model's generalization ability to closely related species and complex backgrounds, and achieves efficient real-time identification on resource-constrained devices.
Figure CN122290179A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image recognition technology, and more specifically to an insect recognition method based on hyperspectral images and multi-scale fusion networks.
Background Technology
[0002] Accurate identification and dynamic monitoring of insect species are crucial for agricultural pest and disease control, ecosystem assessment, and early warning of invasive species. Traditional methods relying on manual morphological identification suffer from low efficiency and high subjectivity, making them unsuitable for large-scale real-time monitoring. Deep learning recognition based on visible-light RGB (red, green, blue) images relies solely on two-dimensional features such as color and texture, so its accuracy is limited by the small size of insects and the morphological similarity of closely related species.
[0003] Hyperspectral imaging technology can simultaneously acquire the spatial morphology and fine spectral information of insects, providing a new technical approach for distinguishing morphologically similar species. However, existing deep learning-based hyperspectral insect identification methods still have significant shortcomings: First, most methods use 1D or 2D convolutional neural networks, failing to fully exploit the inherent "spatial-spectral" three-dimensional coupling characteristics of hyperspectral data, resulting in weak joint feature extraction capabilities; Second, the high cost of collecting and labeling insect hyperspectral samples leads to limited training samples, making traditional deep models prone to overfitting and exhibiting poor generalization ability; Third, insect physical characteristics have multi-scale features, and existing models often rely on single-scale deep features for classification, losing crucial fine-grained texture details and resulting in incomplete feature representation; Fourth, existing high-performance networks are typically complex in structure and have a large number of parameters, resulting in long training and inference times, making them difficult to deploy on agricultural field equipment with limited computing resources for real-time monitoring.
[0004] Therefore, how to fully explore the multi-scale, three-dimensional coupled features of insect hyperspectral images with limited samples, and to construct a lightweight and efficient recognition model, has become a technical problem that urgently needs to be solved in this field.
Summary of the Invention
[0005] To address the aforementioned shortcomings in existing technologies, this invention provides an insect identification method based on hyperspectral images and multi-scale fusion networks.
[0006] To achieve the above objective, the technical solution adopted by this invention is as follows. The insect identification method based on hyperspectral images and multi-scale fusion networks includes the following steps: the insect hyperspectral image training data is preprocessed to obtain insect three-dimensional data cube training data; an insect recognition model is constructed based on a multi-scale fusion network and trained using the insect three-dimensional data cube training data to obtain the trained insect recognition model; the insect hyperspectral image data to be identified is preprocessed to obtain the insect three-dimensional data cube to be identified; and the insect identification result is obtained from the insect three-dimensional data cube to be identified and the trained insect recognition model.
[0007] Furthermore, the insect hyperspectral image training data is preprocessed to obtain the insect three-dimensional data cube training data. The specific process is as follows: black-and-white board radiometric correction is performed on the insect hyperspectral image training data to obtain the corrected insect hyperspectral image training data; principal component analysis is used to reduce the spectral dimension of the corrected insect hyperspectral image training data to obtain insect hyperspectral image training data with redundant bands removed; and the insect three-dimensional data cube training data is constructed from the insect hyperspectral image training data with redundant bands removed, with each insect hyperspectral image serving as a training sample.
[0008] Furthermore, the insect identification model comprises a three-layer feature extraction backbone network, a multi-scale fusion module, and a width learning system classifier connected in sequence. The three-layer feature extraction backbone network is used to extract highly discriminative spatial-spectral joint features from the insect's three-dimensional data cube layer by layer according to the cascaded 3D convolutional composite attention module, obtaining shallow features, mid-level features, and deep features of the insect hyperspectral image. The multi-scale fusion module is used to fuse the shallow, mid-level, and deep features of the insect hyperspectral image, obtaining the multi-scale fused features of the insect hyperspectral image. The width learning system classifier is used to identify the multi-scale fused features of the insect hyperspectral image through random mapping and pseudo-inverse analysis, obtaining the insect species identification result.
[0009] Furthermore, the three-layer feature extraction backbone network includes a first 3D convolutional composite attention module, a first reshaping layer, a first global average pooling layer, a second 3D convolutional composite attention module, a second reshaping layer, a second global average pooling layer, a third 3D convolutional composite attention module, a third reshaping layer, a third global average pooling layer, and a 2D convolutional layer. The input of the first 3D convolutional composite attention module receives the input insect 3D data cube, the output of the first 3D convolutional composite attention module is connected to the input of the first reshaping layer, the output of the first reshaping layer is connected to the input of the first global average pooling layer, and the output of the first global average pooling layer is simultaneously connected to the input of the second 3D convolutional composite attention module and to the multi-scale fusion module. The output of the second 3D convolutional composite attention module is connected to the input of the second reshaping layer. The output of the second reshaping layer is connected to the input of the second global average pooling layer. The output of the second global average pooling layer is simultaneously connected to the input of the third 3D convolutional composite attention module and to the multi-scale fusion module. The output of the third 3D convolutional composite attention module is connected to the input of the third reshaping layer. The output of the third reshaping layer is connected to the input of the third global average pooling layer. The output of the third global average pooling layer is connected to the input of the 2D convolutional layer. The output of the 2D convolutional layer is connected to the multi-scale fusion module.
[0010] Furthermore, the first 3D convolutional composite attention module, the second 3D convolutional composite attention module, and the third 3D convolutional composite attention module all include a 3D convolutional layer, a compressed activation attention branch, a Rega attention branch, and a spectral attention branch connected in parallel with the output of the 3D convolutional layer, and a feature fusion layer that simultaneously connects the output of the compressed activation attention branch, the output of the Rega attention branch, and the output of the spectral attention branch.
[0011] Furthermore, the data processing procedure of the compressed activation attention branch is as follows: global average pooling is performed on the feature map output by the 3D convolutional layer to obtain the global description vector of each channel; then, the dependency relationship between different channels is learned through a bottleneck mapping structure containing two fully connected layers to generate the weight coefficient of each channel; finally, the weight coefficient of each channel is applied to the feature map output by the 3D convolutional layer by channel-wise multiplication to obtain the channel-enhanced feature map.
[0012] Furthermore, the data processing procedure of the Rega attention branch is as follows: the feature map output by the 3D convolutional layer is convolved using a non-uniform sampling convolutional kernel with high weights in the central region and low weights in the peripheral region to expand the effective receptive field and enhance the response of the central key region; then, a Rega attention weight map is generated based on the output of the convolution operation, and the Rega attention weight map is applied to the feature map output by the 3D convolutional layer by element-wise multiplication to obtain a spatially enhanced feature map.
[0013] Furthermore, the data processing procedure for the spectral attention branch is as follows: the feature map output by the 3D convolutional layer is compressed in the spatial dimension while retaining the spectral response information to construct a band description vector; then, based on the band description vector, the correlation and importance between different bands are learned through the mapping unit to generate the weight coefficients of each band; finally, the weight coefficients of each band are applied to the spectral dimension of the feature map output by the 3D convolutional layer by element-wise multiplication to obtain the spectral enhanced feature map.
[0014] Furthermore, the multi-scale fusion module includes a shallow feature extraction branch, a mid-level feature extraction branch, a deep feature extraction branch, and a concatenation layer. The input of the shallow feature extraction branch is connected to the output of the first global average pooling layer in the three-layer feature extraction backbone network, and the output of the shallow feature extraction branch is connected to the input of the concatenation layer. The input of the mid-level feature extraction branch is connected to the output of the second global average pooling layer in the three-layer feature extraction backbone network, and the output of the mid-level feature extraction branch is connected to the input of the concatenation layer. The input of the deep feature extraction branch is connected to the output of the 2D convolutional layer in the three-layer feature extraction backbone network, and the output of the deep feature extraction branch is connected to the input of the concatenation layer. The output of the concatenation layer serves as the output of the multi-scale fusion module. The shallow feature extraction branch is used to extract shallow features, including local texture, edge contours, and fine-grained spatial information in the insect hyperspectral image. The mid-level feature extraction branch is used to extract mid-level features that combine spatial details and semantic expression capabilities. The deep feature extraction branch is used to extract deep features, including high-level abstract semantic features and global discriminative information. The concatenation layer is used to concatenate and fuse the shallow, mid-level, and deep features to obtain the multi-scale fusion features of the insect hyperspectral image, and the multi-scale fusion features are used as the output of the multi-scale fusion module.
[0015] Furthermore, the width learning system classifier includes a feature mapping module, an enhancement node generation module, a feature concatenation layer, and a pseudo-inverse solving module. The input of the feature mapping module is connected to the output of the multi-scale fusion module. The output of the feature mapping module is simultaneously connected to the inputs of the enhancement node generation module and the feature concatenation layer. The output of the enhancement node generation module is connected to the input of the feature concatenation layer. The output of the feature concatenation layer is connected to the input of the pseudo-inverse solving module, and the output of the pseudo-inverse solving module serves as the output of the width learning system classifier. The feature mapping module is used to randomly and linearly map the multi-scale fusion features of the insect hyperspectral image into a set of feature nodes. The enhancement node generation module performs a nonlinear transformation on the feature nodes based on randomly initialized weights and biases to generate enhancement nodes, thereby enhancing the classifier's ability to express complex nonlinear features. The feature concatenation layer concatenates the feature nodes and enhancement nodes laterally to form a system state matrix for classification. The pseudo-inverse solving module analytically solves for the output weight matrix based on the system state matrix and the insect species labels corresponding to the insect 3D data cube training data. During insect recognition, it uses the system state matrix corresponding to the hyperspectral image of the insect to be identified and the output weight matrix to perform classification calculations and obtain the insect recognition results.
[0016] The beneficial effects of this invention are as follows:
(1) This invention constructs an insect recognition model based on a multi-scale fusion network and introduces a first, a second, and a third 3D convolutional composite attention module into the insect recognition model. The model directly processes the insect three-dimensional data cube and adaptively focuses on key spectral channels, which addresses the insufficient mining of spatial-spectral feature correlations in traditional methods and thereby improves the recognition accuracy and feature robustness of the model when distinguishing closely related species and in high-noise scenarios.
(2) By designing a multi-scale fusion module, this invention simultaneously integrates the shallow, mid-level, and deep features of insect hyperspectral images, thereby overcoming the limitations of single-scale features and giving the insect recognition model stronger generalization ability and more complete feature expression under insect posture changes and complex backgrounds.
(3) This invention replaces the traditional deep classifier with a width learning system classifier and uses a pseudo-inverse solving module to calculate the pseudo-inverse of the system state matrix to obtain the final output weight matrix. This reduces the number of parameters and the training time of the insect recognition model, addresses the high computational load and difficult deployment of traditional deep models, and enables efficient, low-latency insect recognition on resource-constrained field edge devices.
Attached Figure Description
[0017] Figure 1 is a schematic diagram of the insect identification method based on hyperspectral images and multi-scale fusion networks; Figure 2 is a schematic diagram of the insect recognition model structure; Figure 3 is a schematic diagram of the 3D convolutional composite attention module structure.
Detailed Implementation
[0018] The specific embodiments of the present invention are described below to enable those skilled in the art to understand the present invention. However, it should be understood that the present invention is not limited to the scope of the specific embodiments. For those skilled in the art, various changes are obvious as long as they are within the spirit and scope of the present invention as defined and determined by the appended claims. All inventions utilizing the concept of the present invention are protected.
[0019] As shown in Figure 1, the insect identification method based on hyperspectral images and multi-scale fusion networks includes steps S1-S3, as detailed below: S1. Preprocess the insect hyperspectral image training data to obtain the insect three-dimensional data cube training data.
[0020] In an optional embodiment of the present invention, the present invention preprocesses the insect hyperspectral image training data to obtain training data for the insect three-dimensional data cube. The specific process is as follows: performing black-and-white radiometric correction on the insect hyperspectral image training data to obtain corrected insect hyperspectral image training data; performing principal component analysis to reduce the spectral dimension of the corrected insect hyperspectral image training data to obtain insect hyperspectral image training data with redundant bands removed; and constructing the training data for the insect three-dimensional data cube based on the insect hyperspectral image training data with redundant bands removed, using each insect hyperspectral image as a training sample.
[0021] Specifically, this invention is based on the training data of insect hyperspectral images after removing redundant bands. Each insect hyperspectral image is used as a training sample. The spatial dimension and spectral dimension data corresponding to a single insect hyperspectral image are organized into an insect three-dimensional data cube to obtain the training data of the insect three-dimensional data cube.
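The spectral reduction and cube construction described above can be sketched in NumPy as follows. This is a minimal illustration, not the patented implementation: the function name `pca_reduce_bands`, the cube layout `(height, width, bands)`, and the number of retained components are assumptions, and the projection is fitted on a single cube here for brevity, whereas in practice it would presumably be fitted on the training set and reused for the images to be identified.

```python
import numpy as np

def pca_reduce_bands(cube, n_components):
    """Project the spectral axis of an (H, W, B) cube onto its leading principal components."""
    height, width, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(np.float64)  # one spectrum per row
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    # Restore the spatial layout: the result is the reduced 3D data cube.
    return reduced.reshape(height, width, n_components)

# Hypothetical toy cube: 8 x 8 pixels, 20 spectral bands.
rng = np.random.default_rng(0)
cube = rng.random((8, 8, 20))
reduced_cube = pca_reduce_bands(cube, n_components=5)
```

Each reduced cube keeps its two spatial axes and a shortened spectral axis, which is the shape a 3D convolutional backbone expects as input.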
[0022] This invention performs black-and-white board radiometric correction on insect hyperspectral image training data to obtain corrected insect hyperspectral image training data, including steps A1-A3, as follows: A1. Collect the digital quantization values of the standard whiteboard image and the dark background image from the insect hyperspectral image training data.
[0023] A2. Based on the digital quantization values of the standard whiteboard image and the dark background image in the insect hyperspectral image training data, calculate the relative reflectance of the insect hyperspectral image training data. The expression is as follows:

$$R = \frac{I_{\mathrm{raw}} - I_{\mathrm{dark}}}{I_{\mathrm{white}} - I_{\mathrm{dark}}}$$

where $R$ is the relative reflectance of the insect hyperspectral image training data, $I_{\mathrm{raw}}$ is the original grayscale value of the insect hyperspectral image training data, $I_{\mathrm{dark}}$ is the digital quantization value of the dark background image, and $I_{\mathrm{white}}$ is the digital quantization value of the standard whiteboard image.
[0024] A3. Convert the original grayscale values of the insect hyperspectral image training data into the corresponding relative reflectance data to obtain the corrected insect hyperspectral image training data.
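Steps A1-A3 amount to a per-pixel, per-band normalization against the white and dark references. The sketch below shows one way to write it in NumPy; the function name `radiometric_correction` and the small `eps` guard against division by zero are assumptions added for illustration and do not come from the source.

```python
import numpy as np

def radiometric_correction(raw, white, dark, eps=1e-8):
    """Convert raw digital values to relative reflectance: (raw - dark) / (white - dark)."""
    raw = np.asarray(raw, dtype=np.float64)
    white = np.asarray(white, dtype=np.float64)
    dark = np.asarray(dark, dtype=np.float64)
    # eps avoids dividing by zero where the white and dark references coincide (assumption).
    denom = np.maximum(white - dark, eps)
    return (raw - dark) / denom

# Hypothetical toy values: a pixel halfway between the dark and white references.
raw = np.full((2, 2, 3), 600.0)
white = np.full((2, 2, 3), 1000.0)
dark = np.full((2, 2, 3), 200.0)
reflectance = radiometric_correction(raw, white, dark)
```

With these toy values every entry of `reflectance` equals (600 - 200) / (1000 - 200) = 0.5.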
[0025] S2. Construct an insect recognition model based on a multi-scale fusion network, and train the insect recognition model using training data from insect 3D data cubes to obtain the trained insect recognition model.
[0026] In an optional embodiment of the present invention, as shown in Figure 2, the insect recognition model includes a three-layer feature extraction backbone network, a multi-scale fusion module, and a width learning system classifier connected in sequence. The three-layer feature extraction backbone network is used to extract highly discriminative spatial-spectral joint features from the insect three-dimensional data cube layer by layer through cascaded 3D convolutional composite attention modules, yielding the shallow, mid-level, and deep features of the insect hyperspectral image. The multi-scale fusion module is used to fuse these features to obtain the multi-scale fused features of the insect hyperspectral image. The width learning system classifier is used to classify the multi-scale fused features of the insect hyperspectral image through random mapping and pseudo-inverse analysis, yielding the insect species identification result.
[0027] The three-layer feature extraction backbone network includes a first 3D convolutional composite attention module, a first reshaping layer, a first global average pooling layer, a second 3D convolutional composite attention module, a second reshaping layer, a second global average pooling layer, a third 3D convolutional composite attention module, a third reshaping layer, a third global average pooling layer, and a 2D convolutional layer. The input of the first 3D convolutional composite attention module receives the input insect 3D data cube, the output of the first 3D convolutional composite attention module is connected to the input of the first reshaping layer, the output of the first reshaping layer is connected to the input of the first global average pooling layer, and the output of the first global average pooling layer is simultaneously connected to the input of the second 3D convolutional composite attention module and to the multi-scale fusion module. The output of the second 3D convolutional composite attention module is connected to the input of the second reshaping layer. The output of the second reshaping layer is connected to the input of the second global average pooling layer. The output of the second global average pooling layer is simultaneously connected to the input of the third 3D convolutional composite attention module and to the multi-scale fusion module. The output of the third 3D convolutional composite attention module is connected to the input of the third reshaping layer. The output of the third reshaping layer is connected to the input of the third global average pooling layer. The output of the third global average pooling layer is connected to the input of the 2D convolutional layer. The output of the 2D convolutional layer is connected to the multi-scale fusion module.
[0028] As shown in Figure 3, the first, second, and third 3D convolutional composite attention modules each include a 3D convolutional layer; a compressed activation attention branch, a Rega attention branch, and a spectral attention branch connected in parallel to the output of the 3D convolutional layer; and a feature fusion layer connected to the outputs of all three attention branches. The feature fusion layer is used to integrate the features output by the three attention branches to obtain the output feature map of the corresponding 3D convolutional composite attention module.
[0029] The data processing procedure of the compressed activation attention branch is as follows: global average pooling is performed on the feature map output by the 3D convolutional layer to obtain the global description vector of each channel; then, the dependency relationship between different channels is learned through a bottleneck mapping structure containing two fully connected layers to generate the weight coefficient of each channel; finally, the weight coefficient of each channel is applied to the feature map output by the 3D convolutional layer by channel-wise multiplication to obtain the channel-enhanced feature map.
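The channel branch described above follows the familiar squeeze-and-excitation pattern, and a forward pass can be sketched in NumPy as below. The feature-map layout `(channels, bands, height, width)`, the ReLU in the bottleneck, the reduction ratio, and the random weights are all assumptions made for illustration; the source only specifies global average pooling, two fully connected layers, and channel-wise multiplication.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, b1, w2, b2):
    """Reweight the channels of a (C, D, H, W) feature map."""
    z = feat.mean(axis=(1, 2, 3))        # global description vector, shape (C,)
    h = np.maximum(w1 @ z + b1, 0.0)     # bottleneck layer, ReLU assumed
    s = sigmoid(w2 @ h + b2)             # channel weight coefficients in (0, 1)
    return feat * s[:, None, None, None], s

# Hypothetical sizes: 8 channels, reduction ratio 4.
rng = np.random.default_rng(0)
channels, ratio = 8, 4
feat = rng.standard_normal((channels, 5, 6, 6))
w1 = rng.standard_normal((channels // ratio, channels)) * 0.1
b1 = np.zeros(channels // ratio)
w2 = rng.standard_normal((channels, channels // ratio)) * 0.1
b2 = np.zeros(channels)
enhanced, weights = channel_attention(feat, w1, b1, w2, b2)
```

The output keeps the shape of the input; each channel is scaled by a single learned coefficient.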
[0030] The data processing procedure for the Rega attention branch is as follows: a non-uniform sampling convolution kernel with high weights in the central region and low weights in the peripheral region is used to perform convolution operations on the feature map output by the 3D convolutional layer in order to expand the effective receptive field and enhance the response of the central key region; then, a Rega attention weight map is generated based on the output of the convolution operation, and the Rega attention weight map is applied to the feature map output by the 3D convolutional layer by element-wise multiplication to obtain a spatially enhanced feature map.
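The source does not give the exact form of the Rega kernel, so the following NumPy sketch only illustrates the general idea of a center-weighted spatial attention map. Everything specific here is an assumption: the Gaussian-shaped kernel stands in for the "high weights in the central region, low weights in the peripheral region" kernel, the response is averaged over channels and bands to obtain a single spatial map, and a sigmoid is applied to keep the weights in (0, 1).

```python
import numpy as np

def center_weighted_kernel(size=5, sigma=1.0):
    """Kernel whose weights peak at the center and decay toward the border (Gaussian, assumed)."""
    ax = np.arange(size) - size // 2
    yy, xx = np.meshgrid(ax, ax, indexing="ij")
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def conv2d_same(img, kernel):
    """Zero-padded 2D correlation with output size equal to input size (odd kernel sizes)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="constant")
    out = np.zeros_like(img, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def center_weighted_attention(feat, kernel):
    """Apply a spatial weight map to a (C, D, H, W) feature map."""
    response = np.empty_like(feat, dtype=np.float64)
    for c in range(feat.shape[0]):
        for d in range(feat.shape[1]):
            response[c, d] = conv2d_same(feat[c, d], kernel)
    # Average over channels and bands, then squash to (0, 1); both steps are assumptions.
    weight_map = 1.0 / (1.0 + np.exp(-response.mean(axis=(0, 1))))
    return feat * weight_map[None, None], weight_map

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 3, 6, 6))
kernel = center_weighted_kernel()
attended, weight_map = center_weighted_attention(feat, kernel)
```

The weight map has one value per spatial position and is broadcast across all channels and bands when it is multiplied into the feature map.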
[0031] The data processing procedure for the spectral attention branch is as follows: the feature map output by the 3D convolutional layer is compressed in the spatial dimension while retaining the spectral response information to construct a band description vector; then, based on the band description vector, the correlation and importance between different bands are learned through the mapping unit to generate the weight coefficients of each band; finally, the weight coefficients of each band are applied to the spectral dimension of the feature map output by the 3D convolutional layer by element-wise multiplication to obtain the spectral enhanced feature map.
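A minimal NumPy sketch of the spectral branch is given below. The source does not specify the mapping unit, so a single fully connected layer followed by a sigmoid is assumed; averaging over the channel axis as well as the two spatial axes to build the band description vector is also an assumption, as is the `(channels, bands, height, width)` layout.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spectral_attention(feat, w, b):
    """Reweight the spectral axis of a (C, D, H, W) feature map."""
    band_descriptor = feat.mean(axis=(0, 2, 3))        # one value per band, shape (D,)
    band_weights = sigmoid(w @ band_descriptor + b)    # mapping unit (assumed linear + sigmoid)
    return feat * band_weights[None, :, None, None], band_weights

# Hypothetical sizes: 8 channels, 5 bands, 6 x 6 pixels.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 5, 6, 6))
w = rng.standard_normal((5, 5)) * 0.1
b = np.zeros(5)
enhanced, band_weights = spectral_attention(feat, w, b)
```

Each band receives one coefficient, so informative bands can be emphasized and redundant ones suppressed without changing the shape of the feature map.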
[0032] Specifically, in this invention, the input insect three-dimensional data cube is denoted $X$. The feature extraction process of the three-layer feature extraction backbone network is as follows. For the $l$-th 3D convolutional composite attention module, where $l = 1, 2, 3$, the feature extraction process is uniformly represented as follows: first, the input feature $F_{l-1}$ (with $F_0 = X$) undergoes a 3D convolution operation to yield the feature map output by the $l$-th 3D convolutional layer; this feature map is then input into the $l$-th compressed activation attention branch, the $l$-th Rega attention branch, and the $l$-th spectral attention branch to yield a channel-enhanced feature map, a spatially enhanced feature map, and a spectrally enhanced feature map, respectively. Subsequently, the outputs of the three attention branches are concatenated, and a convolution in the fusion layer performs channel integration to obtain the feature map output by the $l$-th 3D convolutional composite attention module; finally, the $l$-th reshaping layer adjusts the feature dimensions and the $l$-th global average pooling layer performs global average pooling to obtain the feature map output by the $l$-th global average pooling layer. The unified expression is:

$$
\begin{aligned}
C_l &= \mathrm{Conv3D}_l(F_{l-1}) \\
F_l^{\mathrm{SE}} &= \mathrm{SE}_l(C_l) \otimes C_l \\
F_l^{\mathrm{Rega}} &= \mathrm{Rega}_l(C_l) \otimes C_l \\
F_l^{\mathrm{Spe}} &= \mathrm{Spe}_l(C_l) \otimes C_l \\
F_l^{\mathrm{cat}} &= \mathrm{Concat}\left(F_l^{\mathrm{SE}}, F_l^{\mathrm{Rega}}, F_l^{\mathrm{Spe}}\right) \\
F_l^{\mathrm{att}} &= \mathrm{Conv}_l^{\mathrm{fuse}}\left(F_l^{\mathrm{cat}}\right) \\
F_l^{\mathrm{r}} &= \mathrm{Reshape}_l\left(F_l^{\mathrm{att}}\right) \\
F_l &= \mathrm{GAP}_l\left(F_l^{\mathrm{r}}\right)
\end{aligned}
$$

where: $C_l$ is the feature map output by the $l$-th 3D convolutional layer; $F_l^{\mathrm{SE}}$ is the channel-enhanced feature map output by the $l$-th compressed activation attention branch; $F_l^{\mathrm{Rega}}$ is the spatially enhanced feature map output by the $l$-th Rega attention branch; $F_l^{\mathrm{Spe}}$ is the spectrally enhanced feature map output by the $l$-th spectral attention branch; $F_l^{\mathrm{cat}}$ is the feature map obtained by concatenating the outputs of the three attention branches; $F_l^{\mathrm{att}}$ is the feature map output by the $l$-th 3D convolutional composite attention module; $F_l^{\mathrm{r}}$ is the feature map output by the $l$-th reshaping layer; and $F_l$ is the feature map output by the $l$-th global average pooling layer. $\mathrm{Conv3D}_l(\cdot)$ denotes the $l$-th 3D convolution operation; $\mathrm{SE}_l(\cdot)$ denotes the $l$-th compressed activation attention weighting operation; $\mathrm{Rega}_l(\cdot)$ denotes the $l$-th Rega attention weighting operation; $\mathrm{Spe}_l(\cdot)$ denotes the $l$-th spectral attention weighting operation; $\otimes$ is the element-wise multiplication operation; $\mathrm{Concat}(\cdot)$ is the concatenation operation; $\mathrm{Conv}_l^{\mathrm{fuse}}(\cdot)$ denotes the convolution operation in the $l$-th fusion layer; $\mathrm{Reshape}_l(\cdot)$ denotes the $l$-th reshaping operation; and $\mathrm{GAP}_l(\cdot)$ denotes the $l$-th global average pooling operation.
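[0033] The expression for the compressed activation attention weighting operation is:

$$
\begin{aligned}
z_l &= \mathrm{GAP}(C_l) \\
h_l &= \delta\left(W_{l,1} z_l + b_{l,1}\right) \\
s_l &= \sigma\left(W_{l,2} h_l + b_{l,2}\right)
\end{aligned}
$$

where: $C_l$ is the feature map output by the $l$-th 3D convolutional layer; $z_l$ is the channel description vector obtained by global average pooling of the feature map output by the $l$-th 3D convolutional layer; $\mathrm{GAP}(\cdot)$ is the global average pooling operation; $h_l$ is the output of the $l$-th bottleneck hidden layer; $s_l$ is the channel weight vector generated by the $l$-th compressed activation attention branch; $W_{l,1}$ and $W_{l,2}$ are the weight matrices of the two fully connected layers of the $l$-th branch; $b_{l,1}$ and $b_{l,2}$ are the corresponding bias terms; $\delta(\cdot)$ represents a non-linear activation function; and $\sigma(\cdot)$ represents the Sigmoid activation function.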
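[0034] The expression for the Rega attention weighting operation is:

$$
\begin{aligned}
R_l &= \mathrm{RegaConv}(C_l; K_l) \\
M_l &= \mathrm{AvgPool}(R_l)
\end{aligned}
$$

where: $C_l$ is the feature map output by the $l$-th 3D convolutional layer; $R_l$ is the feature map obtained by applying Rega convolution to the feature map output by the $l$-th 3D convolutional layer; $K_l$ is the non-uniform sampling convolution kernel used by the $l$-th Rega attention branch; $M_l$ is the spatial attention weight map generated by the $l$-th Rega attention branch; $\mathrm{RegaConv}(\cdot)$ represents the Rega convolution operation; and $\mathrm{AvgPool}(\cdot)$ represents the average pooling operation.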
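[0035] The expression for the spectral attention weighting operation is:

$$
\begin{aligned}
v_l &= \mathrm{Squeeze}_{\mathrm{spatial}}(C_l) \\
w_l &= \varphi_l(v_l)
\end{aligned}
$$

where: $C_l$ is the feature map output by the $l$-th 3D convolutional layer; $v_l$ is the band description vector obtained by spatially compressing the feature map output by the $l$-th 3D convolutional layer; $\varphi_l(\cdot)$ is the $l$-th layer spectral mapping function, used to learn the correlation and importance between different bands; $w_l$ is the band weight vector generated by the $l$-th layer spectral attention branch; and $\mathrm{Squeeze}_{\mathrm{spatial}}(\cdot)$ represents the spatial compression operation.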
[0036] Through the data processing of the aforementioned 3D convolutional composite attention module, this invention can jointly model the feature maps output by the 3D convolutional layer from the channel dimension, spatial dimension, and spectral dimension, thereby achieving adaptive enhancement of key discriminative feature channels, key spatial regions, and key spectral bands, and effectively suppressing redundant information, thus improving the expressive power of the spatial-spectral joint features of insect hyperspectral images.
[0037] The multi-scale fusion module includes a shallow feature extraction branch, a mid-level feature extraction branch, a deep feature extraction branch, and a concatenation layer. The input of the shallow feature extraction branch is connected to the output of the first global average pooling layer in the three-layer feature extraction backbone network, and its output is connected to the input of the concatenation layer. The input of the mid-level feature extraction branch is connected to the output of the second global average pooling layer in the three-layer feature extraction backbone network, and its output is connected to the input of the concatenation layer. The input of the deep feature extraction branch is connected to the output of the 2D convolutional layer in the three-layer feature extraction backbone network, and its output is connected to the input of the concatenation layer. The output of the concatenation layer serves as the output of the multi-scale fusion module. The shallow feature extraction branch is used to extract shallow features, including local texture, edge contours, and fine-grained spatial information in the insect hyperspectral image. The mid-level feature extraction branch is used to extract mid-level features that have both spatial detail and semantic expression capabilities. The deep feature extraction branch is used to extract deep features, including high-level abstract semantic features and global discriminative information. The concatenation layer is used to concatenate and fuse the shallow, mid-level, and deep features to obtain the multi-scale fusion features of the insect hyperspectral image, and the multi-scale fusion features are used as the output of the multi-scale fusion module.
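The concatenation step can be illustrated with a short NumPy sketch. The internal processing of the three branches is not specified in the source, so each branch is treated as an identity mapping here; the function name `fuse_multiscale`, the batch-first layout, and the feature widths are assumptions for illustration only.

```python
import numpy as np

def fuse_multiscale(shallow, mid, deep):
    """Flatten each scale to (N, features) and concatenate along the feature axis."""
    feats = [np.asarray(f).reshape(len(f), -1) for f in (shallow, mid, deep)]
    return np.concatenate(feats, axis=1)

# Hypothetical batch of 4 samples with 16, 32, and 64 features per scale.
rng = np.random.default_rng(0)
shallow = rng.random((4, 16))
mid = rng.random((4, 32))
deep = rng.random((4, 64))
fused = fuse_multiscale(shallow, mid, deep)
```

Concatenation preserves every scale unchanged, so the classifier sees fine-grained and semantic features side by side in one vector of width 16 + 32 + 64 = 112.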
[0038] The width learning system classifier includes a feature mapping module, an enhancement node generation module, a feature concatenation layer, and a pseudo-inverse solving module. The input of the feature mapping module is connected to the output of the multi-scale fusion module. The output of the feature mapping module is simultaneously connected to the inputs of both the enhancement node generation module and the feature concatenation layer. The output of the enhancement node generation module is connected to the input of the feature concatenation layer. The output of the feature concatenation layer is connected to the input of the pseudo-inverse solving module, and the output of the pseudo-inverse solving module serves as the output of the width learning system classifier. The feature mapping module is used to randomly and linearly map the multi-scale fusion features of insect hyperspectral images into a set of feature nodes. The enhancement node generation module performs nonlinear transformations on the feature nodes based on randomly initialized weights and biases to generate enhancement nodes, thereby enhancing the classifier's ability to express complex nonlinear features. The feature concatenation layer concatenates the feature nodes and enhancement nodes laterally to form a system state matrix for classification. The pseudo-inverse solving module analytically solves for the output weight matrix based on the system state matrix and the insect species labels corresponding to the insect 3D data cube training data. During insect recognition, the classifier uses the system state matrix corresponding to the hyperspectral image of the insect to be identified and the output weight matrix to perform classification calculations and obtain the insect recognition result.
[0039] The operational expressions for the width learning system classifier in this invention are:
$$Z = \phi(F W_z + \beta_z)$$
$$H = \xi(Z W_h + \beta_h)$$
$$A = [Z \mid H]$$
$$W = A^{+} Y$$
where $F$ is the multi-scale fusion features output by the multi-scale fusion module; $Z$ is the feature node matrix; $H$ is the enhancement node matrix; $A$ is the system state matrix; $W_z$ and $\beta_z$ are the random weight matrix and bias vector of the feature mapping module, respectively; $W_h$ and $\beta_h$ are the random weight matrix and bias vector of the enhancement node generation module, respectively; $\phi(\cdot)$ represents the feature mapping function; $\xi(\cdot)$ represents a nonlinear activation function; $A^{+}$ is the pseudo-inverse of the system state matrix; $Y$ is the label matrix of the training samples; and $W$ is the final output weight matrix.
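The training computation of the classifier can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The names (`bls_state`, `bls_fit`), the node counts, the Gaussian initialization, the identity feature mapping, and the `tanh` activation are all chosen for the example. `F` stands for the fused features, `Y` for the one-hot label matrix, `A` for the system state matrix, and `W` for the output weight matrix.

```python
import numpy as np

def bls_state(F, Wz, bz, Wh, bh):
    """Build the system state matrix from fused features F (N, d)."""
    Z = F @ Wz + bz                 # feature nodes: random linear mapping
    H = np.tanh(Z @ Wh + bh)        # enhancement nodes: nonlinear transform
    return np.hstack([Z, H])        # lateral concatenation [Z | H]

def bls_fit(F, Y, n_feature=10, n_enhance=30, seed=0):
    """Draw random mappings once, then solve the output weights analytically."""
    rng = np.random.default_rng(seed)
    d = F.shape[1]
    Wz = rng.normal(scale=1.0 / np.sqrt(d), size=(d, n_feature))
    bz = rng.normal(size=n_feature)
    Wh = rng.normal(scale=1.0 / np.sqrt(n_feature), size=(n_feature, n_enhance))
    bh = rng.normal(size=n_enhance)
    A = bls_state(F, Wz, bz, Wh, bh)
    W = np.linalg.pinv(A) @ Y       # least-squares solution via pseudo-inverse
    return (Wz, bz, Wh, bh), W, A

rng = np.random.default_rng(1)
F = rng.normal(size=(20, 8))               # 20 samples, 8 fused features
labels = rng.integers(0, 3, size=20)       # 3 hypothetical insect classes
Y = np.eye(3)[labels]                      # one-hot label matrix
params, W, A = bls_fit(F, Y)
print(A.shape, W.shape)  # (20, 40) (40, 3)
```

The random mappings are kept in `params` so that the same transformation can be applied to new samples at identification time. Only `W` is learned, and it is obtained in one step rather than by iterative optimization.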
[0040] In this invention, the output weight matrix $W$ is the final classification parameter learned by the width learning system classifier during the training phase from the system state matrix $A$ and the training sample label matrix $Y$. After training, the output weight matrix is retained and used to identify the insect species of subsequent samples to be identified.
[0041] In the insect identification stage, the hyperspectral images of the insects to be identified are first preprocessed in the same way as in the training stage to obtain a three-dimensional data cube of the insects to be identified. This cube is then input into the trained three-layer feature extraction backbone network and multi-scale fusion module to obtain the fused features $F_t$ of the sample to be identified. Next, $F_t$ is input into the width learning system classifier to generate the feature node matrix $Z_t$, enhancement node matrix $H_t$, and system state matrix $A_t$ corresponding to the sample to be identified. Finally, $A_t$ is multiplied by the output weight matrix $W$ obtained during the training phase to obtain the category response matrix $\hat{Y}$, and the category with the largest response value is selected as the insect identification result for the sample to be identified. The corresponding expressions are:
$$\hat{Y} = A_t W$$
$$\hat{c} = \arg\max_{j} \hat{Y}_{j}$$
where $\hat{c}$ is the final insect identification category result, and each column $j$ of the category response matrix $\hat{Y}$ corresponds to one insect category; that is, the category with the largest response value is selected as the final classification result for the sample to be identified.
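The identification step reduces to one matrix product followed by an arg-max over categories. The toy example below uses made-up numbers and a hypothetical `predict` helper to show this. Each row of `A_new` is the system state of one sample, and each column of `W` corresponds to one category.

```python
import numpy as np

def predict(A_new, W):
    """Return the index of the category with the largest response per sample."""
    responses = A_new @ W            # category response matrix, one row per sample
    return responses.argmax(axis=1)

A_new = np.array([[1.0, 0.0],
                  [0.0, 1.0]])       # two samples, two state nodes (toy values)
W = np.array([[0.2, 0.8],
              [0.9, 0.1]])           # toy output weights for two categories
print(predict(A_new, W))  # [1 0]
```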
[0042] This invention utilizes training data from insect 3D data cubes to train an insect recognition model, resulting in a trained insect recognition model. The specific process is as follows: First, using training data from insect 3D data cubes labeled with insect species, the insect recognition model undergoes end-to-end supervised training using cross-entropy loss and backpropagation algorithms to optimize its spatial-spectral feature extraction and fusion capabilities. Then, fixing all parameters of the three-layer feature extraction backbone network, the multi-scale fusion features of the insect hyperspectral images obtained from the forward propagation of the insect 3D data cube training data, along with their insect species labels, are input into a width learning system classifier. In the width learning system classifier, the multi-scale fusion features of the insect hyperspectral images are randomly mapped to feature nodes and enhancement nodes, and concatenated to form a system state matrix. Finally, the output weight matrix is obtained directly through a one-time pseudo-inverse operation, thus completing the training of the entire model.
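For the first training stage, the cross-entropy loss and the gradient that backpropagation would start from can be sketched as below. This is a generic softmax cross-entropy in NumPy, offered as an assumption about the loss form; the network that produces `logits` is not modeled, and the function name is hypothetical. The second stage needs no such gradient, since the output weights are solved in one pseudo-inverse step.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy loss and its gradient with respect to the logits.

    logits: (N, K) class scores; labels: (N,) integer class indices.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    n = logits.shape[0]
    loss = -log_probs[np.arange(n), labels].mean()
    grad = np.exp(log_probs)                # softmax probabilities
    grad[np.arange(n), labels] -= 1.0       # subtract the one-hot targets
    return loss, grad / n

# With all-zero scores over 4 classes the loss equals log(4).
loss, grad = softmax_cross_entropy(np.zeros((2, 4)), np.array([0, 3]))
```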
[0043] S3. Preprocess the insect hyperspectral image data to be identified to obtain the insect three-dimensional data cube data to be identified, and obtain the insect identification result based on the insect three-dimensional data cube data to be identified and the trained insect identification model.
[0044] In an optional embodiment, the present invention preprocesses the insect hyperspectral image data to be identified using the same preprocessing method as in step S1 to obtain the insect three-dimensional data cube data to be identified; this data is then input into the trained insect identification model to obtain the insect identification result.
[0045] Those skilled in the art will recognize that the embodiments described herein are intended to help the reader understand the principles of the invention, and it should be understood that the scope of protection of the invention is not limited to such specific statements and embodiments. Those skilled in the art can make various other specific modifications and combinations based on the technical teachings disclosed in this invention without departing from the spirit of the invention, and these modifications and combinations are still within the scope of protection of this invention.
Claims
1. An insect identification method based on hyperspectral images and multi-scale fusion networks, characterized in that, The method includes the following steps: the insect hyperspectral image training data is preprocessed to obtain the insect three-dimensional data cube training data; an insect recognition model is constructed based on a multi-scale fusion network, and the insect recognition model is trained using the insect three-dimensional data cube training data to obtain the trained insect recognition model; the insect hyperspectral image data to be identified is preprocessed to obtain the insect three-dimensional data cube data to be identified, and the insect identification result is obtained based on the insect three-dimensional data cube data to be identified and the trained insect recognition model.
2. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 1, characterized in that, The insect hyperspectral image training data is preprocessed to obtain the insect three-dimensional data cube training data, and the specific process is as follows: black and white board radiometric correction is performed on the insect hyperspectral image training data to obtain the corrected insect hyperspectral image training data; principal component analysis is used to reduce the spectral dimension of the corrected insect hyperspectral image training data to obtain insect hyperspectral image training data with redundant bands removed; the insect three-dimensional data cube training data is constructed from the insect hyperspectral image training data with redundant bands removed, with each insect hyperspectral image serving as a training sample.
3. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 1, characterized in that, The insect identification model consists of a three-layer feature extraction backbone network, a multi-scale fusion module, and a width learning system classifier connected in sequence. The three-layer feature extraction backbone network is used to extract highly discriminative spatial-spectral joint features from the insect 3D data cube layer by layer based on the cascaded 3D convolutional composite attention module, resulting in shallow, mid-level, and deep features of the insect hyperspectral image; the multi-scale fusion module is used to fuse the shallow, mid-level, and deep features of the insect hyperspectral image, resulting in multi-scale fused features of the insect hyperspectral image; The width learning system classifier is used to identify the multi-scale fusion features of insect hyperspectral images through random mapping and pseudo-inverse analysis, and obtain the insect species identification results.
4. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 3, characterized in that, The three-layer feature extraction backbone network includes a first 3D convolutional composite attention module, a first reshaping layer, a first global average pooling layer, a second 3D convolutional composite attention module, a second reshaping layer, a second global average pooling layer, a third 3D convolutional composite attention module, a third reshaping layer, a third global average pooling layer, and a 2D convolutional layer. The input of the first 3D convolutional composite attention module receives the insect 3D data cube, the output of the first 3D convolutional composite attention module is connected to the input of the first reshaping layer, the output of the first reshaping layer is connected to the input of the first global average pooling layer, and the output of the first global average pooling layer is simultaneously connected to the input of the second 3D convolutional composite attention module and to the multi-scale fusion module. The output of the second 3D convolutional composite attention module is connected to the input of the second reshaping layer. The output of the second reshaping layer is connected to the input of the second global average pooling layer. The output of the second global average pooling layer is simultaneously connected to the input of the third 3D convolutional composite attention module and to the multi-scale fusion module. The output of the third 3D convolutional composite attention module is connected to the input of the third reshaping layer. The output of the third reshaping layer is connected to the input of the third global average pooling layer. The output of the third global average pooling layer is connected to the input of the 2D convolutional layer. The output of the 2D convolutional layer is connected to the multi-scale fusion module.
5. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 4, characterized in that, The first, second, and third 3D convolutional composite attention modules each include a 3D convolutional layer; a compressed activation attention branch, a Rega attention branch, and a spectral attention branch, each connected in parallel to the output of the 3D convolutional layer; and a feature fusion layer connected simultaneously to the outputs of the compressed activation attention branch, the Rega attention branch, and the spectral attention branch.
6. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 5, characterized in that, The data processing procedure for the compressed activation attention branch is as follows: global average pooling is performed on the feature map output by the 3D convolutional layer to obtain the global description vector of each channel; then, the dependency relationship between different channels is learned through a bottleneck mapping structure containing two fully connected layers to generate the weight coefficient of each channel; finally, the weight coefficient of each channel is applied to the feature map output by the 3D convolutional layer by channel-wise multiplication to obtain the channel-enhanced feature map.
7. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 5, characterized in that, The data processing procedure for the Rega attention branch is as follows: a non-uniform sampling convolution kernel with high weights in the central region and low weights in the peripheral region is used to perform convolution operations on the feature map output by the 3D convolutional layer in order to expand the effective receptive field and enhance the response of the central key region; then, a Rega attention weight map is generated based on the output of the convolution operation, and the Rega attention weight map is applied to the feature map output by the 3D convolutional layer by element-wise multiplication to obtain a spatially enhanced feature map.
8. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 5, characterized in that, The data processing procedure for the spectral attention branch is as follows: the feature map output by the 3D convolutional layer is compressed in the spatial dimension while retaining the spectral response information to construct a band description vector; then, based on the band description vector, the correlation and importance between different bands are learned through the mapping unit to generate the weight coefficients of each band; finally, the weight coefficients of each band are applied to the spectral dimension of the feature map output by the 3D convolutional layer by element-wise multiplication to obtain the spectral enhanced feature map.
9. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 3, characterized in that, The multi-scale fusion module includes a shallow feature extraction branch, a mid-level feature extraction branch, a deep feature extraction branch, and a concatenation layer. The input of the shallow feature extraction branch is connected to the output of the first global average pooling layer in the three-layer feature extraction backbone network, and the output of the shallow feature extraction branch is connected to the input of the concatenation layer. The input of the mid-level feature extraction branch is connected to the output of the second global average pooling layer in the three-layer feature extraction backbone network, and the output of the mid-level feature extraction branch is connected to the input of the concatenation layer. The input of the deep feature extraction branch is connected to the output of the 2D convolutional layer in the three-layer feature extraction backbone network, and the output of the deep feature extraction branch is connected to the input of the concatenation layer. The output of the concatenation layer serves as the output of the multi-scale fusion module. The shallow feature extraction branch is used to extract shallow features, including local texture, edge contours, and fine-grained spatial information in insect hyperspectral images; the mid-level feature extraction branch is used to extract mid-level features that combine spatial details and semantic expression capabilities; the deep feature extraction branch is used to extract deep features, including high-level abstract semantic features and global discriminative information; the concatenation layer is used to concatenate and fuse the shallow, mid-level, and deep features to obtain multi-scale fusion features of insect hyperspectral images, and the multi-scale fusion features are used as the output of the multi-scale fusion module.
10. The insect identification method based on hyperspectral images and multi-scale fusion networks according to claim 3, characterized in that, The width learning system classifier includes a feature mapping module, an enhancement node generation module, a feature concatenation layer, and a pseudo-inverse solving module. The input of the feature mapping module is connected to the output of the multi-scale fusion module. The output of the feature mapping module is simultaneously connected to the inputs of both the enhancement node generation module and the feature concatenation layer. The output of the enhancement node generation module is connected to the input of the feature concatenation layer. The output of the feature concatenation layer is connected to the input of the pseudo-inverse solving module, and the output of the pseudo-inverse solving module serves as the output of the width learning system classifier. The feature mapping module is used to randomly and linearly map the multi-scale fusion features of insect hyperspectral images into a set of feature nodes. The enhancement node generation module performs nonlinear transformations on the feature nodes based on randomly initialized weights and biases to generate enhancement nodes, thereby enhancing the classifier's ability to express complex nonlinear features. The feature concatenation layer concatenates the feature nodes and enhancement nodes laterally to form a system state matrix for classification. The pseudo-inverse solving module analytically solves for the output weight matrix based on the system state matrix and the insect species labels corresponding to the insect 3D data cube training data. During insect recognition, the classifier uses the system state matrix corresponding to the hyperspectral image of the insect to be identified and the output weight matrix to perform classification calculations and obtain the insect recognition result.