A biological individual species identification method based on a deep neural network
By constructing a multi-level neural network with a data fusion method and using multi-angle image features, the method addresses the difficulty of high-precision species identification of biological individuals and achieves efficient, low-cost identification.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGZHOU UNIVERSITY
- Filing Date
- 2023-03-06
- Publication Date
- 2026-06-23
Abstract
Description
Technical Field
[0001] This invention relates to the field of zoological image recognition technology, specifically to a method for identifying biological species based on deep neural networks.
Background Technology
[0002] With the development of computer science and technology and the rapid growth in chip computing speed under Moore's Law, deep neural network technology has proven highly effective in many fields, especially image recognition. Early image recognition tasks mainly distinguished objects with large differences in features, such as classifying cats and dogs, and already achieved high classification accuracy. However, such methods fall short in fine-grained image classification. Distinguishing the species of a certain type of mole is one example: the differences between species are subtle, and even manual classification by experts is quite difficult, requiring detailed morphological measurement and statistical analysis of samples and the construction of different decision trees, which consumes significant human and material resources.
Summary of the Invention
[0003] The purpose of this invention is to propose a biological individual species identification method based on deep neural networks. By constructing a multi-level neural network and using a data fusion method, the data flow in the network model is more refined, and the multi-angle features of a single biological individual sample are comprehensively utilized to achieve higher identification accuracy than traditional image recognition.
[0004] To achieve the above objectives, this invention provides a method for biological individual species identification based on deep neural networks, the method comprising:
[0005] Step 1: Obtaining the dataset and preprocessing the data:
[0006] Images of the dorsal, ventral, and lateral sides of the maxilla and of the lateral side of the mandible of the biological individual to be identified are obtained as one set of individual dataset samples, and the image format is standardized.
[0007] Step 2: Dataset partitioning:
[0008] Based on the principle of not segmenting individual images, all dataset samples are divided into a test set and a training set in an 8:2 ratio. For each complete biological specimen, the data folder of the specimen contains image data of the dorsal, ventral, and lateral sides of the maxilla and of the lateral side of the mandible of the current biological individual. At the same time, data augmentation is performed on the images of the training set.
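The individual-level partitioning described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_by_individual`, the dictionary layout (individual identifier mapped to its image files), and the fixed random seed are all assumptions, and the sketch assumes that the larger 80% share goes to the training set, which the text does not state explicitly.

```python
import random


def split_by_individual(samples, train_fraction=0.8, seed=0):
    """Split a dataset so that every individual lands in exactly one subset.

    samples: dict mapping an individual identifier to the list of its images
             (hypothetical layout, one entry per specimen folder).
    train_fraction: share of individuals assigned to training (assumed 0.8).
    """
    individuals = sorted(samples)              # deterministic starting order
    random.Random(seed).shuffle(individuals)   # shuffle individuals, not images

    n_train = round(len(individuals) * train_fraction)
    train = {ind: samples[ind] for ind in individuals[:n_train]}
    test = {ind: samples[ind] for ind in individuals[n_train:]}
    return train, test


# Usage: ten hypothetical specimens with four views each.
views = ["sd", "sv", "sl", "ml"]
dataset = {f"specimen_{i:02d}": [f"specimen_{i:02d}_{v}.jpg" for v in views]
           for i in range(10)}
train_set, test_set = split_by_individual(dataset)
```

Because the shuffle operates on individual identifiers rather than on image files, the four images of one specimen can never be spread across both subsets.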
[0009] Step 3: Construction of the species identification network model:
[0010] A deep neural network model is used as the main body of the entire network model. The network model is pre-trained using the ImageNet image dataset. After the multi-angle skull sample image data of biological individuals is input into the neural network to extract features, the probability distribution data of the biological samples in each labeled category is output by the classifier.
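The text does not say how the classifier turns its outputs into a probability distribution. A common choice, assumed here purely for illustration, is the softmax function applied to the raw scores of the final fully connected layer. The function name `softmax` and the example scores are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution.

    Subtracting the maximum score first keeps exp() numerically stable.
    """
    peak = max(logits)
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Usage: three hypothetical species scores for one image.
probabilities = softmax([2.0, 0.5, -1.0])
```

The result has one entry per labeled species and sums to one, which is the form needed by the fusion step later in the method.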
[0011] Step 4: Classifier Training
[0012] The number of output channels of the fully connected layer in the model is changed to the number of labeled species of the biological samples to be identified. After the adjustment, the divided training set data is input into the pre-trained neural network model to adjust the network parameters so that it can be adapted to the identification task of this type of organism. The weights in the network model are adjusted by using the backpropagation method.
[0013] Step 5: Obtain the basic weights of image data in each direction.
[0014] After the classifier training is completed, the image data in the test set is input into the network model separately, and the weight of the image data in each direction of the biological sample data in the species identification task is obtained by the algorithm.
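The per-direction weights are derived from how well each image direction classifies on its own. A minimal sketch of that measurement is shown below, assuming the test-set predictions are available as `(view, true_species, predicted_species)` records. The record layout, the function name `per_view_accuracy`, and the species names are assumptions made for illustration.

```python
from collections import defaultdict


def per_view_accuracy(records):
    """Compute classification accuracy separately for each image direction.

    records: iterable of (view, true_species, predicted_species) tuples,
             where view is a code such as "sd", "sl", "sv" or "ml".
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for view, truth, predicted in records:
        total[view] += 1
        correct[view] += int(truth == predicted)
    return {view: correct[view] / total[view] for view in total}


# Usage with hypothetical test-set predictions.
records = [
    ("sd", "species_a", "species_a"),
    ("sd", "species_b", "species_b"),
    ("ml", "species_a", "species_b"),
    ("ml", "species_a", "species_a"),
]
accuracy = per_view_accuracy(records)
```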
[0015] Step 6: Species Identification Process
[0016] After the classifier is trained, data samples of each biological individual are input into the network model. Multiple image data of each individual sample are sequentially fed into the recognition network to obtain the species identification probability distribution data of each image. Finally, the identification result of the biological sample is obtained through a data fusion algorithm.
[0017] Furthermore, the image format is .JPG.
[0018] Furthermore, the labels of the individual dataset samples use s for maxilla, m for mandible, d for dorsal surface, v for ventral surface, and l for lateral surface, and the genus and species are marked at the beginning of the label. That is, a complete image label is: genus#species#image number#maxilla / mandible#image orientation.
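A label in this format can be parsed with a few lines of code. The sketch below is illustrative only: the names `ImageLabel` and `parse_label` and the example label string are hypothetical, and the `view` property simply concatenates the two codes to give keys such as `sd` or `ml`.

```python
from typing import NamedTuple

BONE_CODES = {"s": "maxilla", "m": "mandible"}
ORIENTATION_CODES = {"d": "dorsal", "v": "ventral", "l": "lateral"}


class ImageLabel(NamedTuple):
    genus: str
    species: str
    image_number: str
    bone: str          # "s" or "m"
    orientation: str   # "d", "v" or "l"

    @property
    def view(self):
        """Two-letter view code, e.g. "sd" for the dorsal side of the maxilla."""
        return self.bone + self.orientation


def parse_label(label):
    """Parse 'genus#species#image number#bone#orientation' into its fields."""
    parts = label.split("#")
    if len(parts) != 5:
        raise ValueError(f"expected 5 '#'-separated fields, got {len(parts)}")
    genus, species, number, bone, orientation = parts
    if bone not in BONE_CODES:
        raise ValueError(f"unknown bone code: {bone!r}")
    if orientation not in ORIENTATION_CODES:
        raise ValueError(f"unknown orientation code: {orientation!r}")
    return ImageLabel(genus, species, number, bone, orientation)


# Usage with a made-up label.
label = parse_label("Mogera#insularis#012#s#d")
```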
[0019] Furthermore, the algorithm is as follows:
[0020] First, the multi-angle skull image data of the biological individual is input into the network model separately for each angle. The identification accuracy for each image direction is extracted, and the corresponding weight of each type of image data in the identification of this type of organism is obtained. The specific formula is as follows:
[0021]
[0022]
[0023]
[0024]
[0025] where $sd$, $sl$, $sv$, and $ml$ represent the classification accuracy obtained in this biological identification task for the four image types, namely the dorsal, lateral, and ventral sides of the maxilla and the lateral side of the mandible, respectively; and $W_{sd}$, $W_{sl}$, $W_{sv}$, and $W_{ml}$ represent the weight values of the image data from these four directions in the identification of this type of organism.
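The weight formula itself is not reproduced in this text. One plausible reading, assumed here only for illustration, is that each weight is the accuracy of its direction divided by the sum of the four accuracies, so that the weights add up to one. The function name `view_weights` and the accuracy values are hypothetical, and the normalization is an assumption rather than the formula of the invention.

```python
def view_weights(accuracies):
    """Turn per-direction accuracies into weights that sum to one.

    ASSUMPTION: weight = accuracy / sum of all accuracies. The source text
    defines the symbols but does not show the formula itself.
    """
    total = sum(accuracies.values())
    if total <= 0:
        raise ValueError("at least one accuracy must be positive")
    return {view: acc / total for view, acc in accuracies.items()}


# Usage with made-up accuracies for the four directions.
weights = view_weights({"sd": 0.9, "sl": 0.8, "sv": 0.7, "ml": 0.6})
```

Under this assumption, a direction that classifies more reliably on its own contributes more to the fused result.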
[0026] Furthermore, the fusion algorithm in step six specifically includes:
[0027] $P_{ind} = W_{sd} \times P_{sd} + W_{sl} \times P_{sl} + W_{sv} \times P_{sv} + W_{ml} \times P_{ml}$
[0028] where $W_{sd}$, $W_{sl}$, $W_{sv}$, and $W_{ml}$ represent the weight values of the image data from the four directions (the dorsal, lateral, and ventral sides of the maxilla and the lateral side of the mandible); $P_{sd}$, $P_{sl}$, $P_{sv}$, and $P_{ml}$ represent the species identification probability distributions of the image data from the same four directions; and $P_{ind}$ represents the species identification probability distribution output by the neural network model for a single biological sample.
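The fusion formula is a weighted sum of the four per-image probability distributions, computed class by class. A minimal sketch follows; the function names `fuse_predictions` and `predict_species`, the equal weights, and the example distributions are hypothetical, and choosing the class with the highest fused probability as the final result is an assumption about how the identification result is read off.

```python
def fuse_predictions(weights, probabilities):
    """Compute P_ind = sum over views of W_view * P_view, class by class.

    weights: dict mapping a view code ("sd", "sl", "sv", "ml") to its weight.
    probabilities: dict mapping the same codes to a list of class probabilities.
    """
    views = list(weights)
    n_classes = len(probabilities[views[0]])
    fused = [0.0] * n_classes
    for view in views:
        distribution = probabilities[view]
        if len(distribution) != n_classes:
            raise ValueError(f"view {view!r} has a different number of classes")
        for k, p in enumerate(distribution):
            fused[k] += weights[view] * p
    return fused


def predict_species(weights, probabilities, species_names):
    """Return the species with the highest fused probability (assumed rule)."""
    fused = fuse_predictions(weights, probabilities)
    best = max(range(len(fused)), key=fused.__getitem__)
    return species_names[best], fused


# Usage: two hypothetical species, equal weights for the four views.
weights = {"sd": 0.25, "sl": 0.25, "sv": 0.25, "ml": 0.25}
probabilities = {
    "sd": [0.6, 0.4],
    "sl": [0.7, 0.3],
    "sv": [0.2, 0.8],
    "ml": [0.9, 0.1],
}
species, fused = predict_species(weights, probabilities, ["species_a", "species_b"])
```

In this example the ventral image alone favors the second species, but the fused distribution still favors the first, which illustrates how combining directions can override a single misleading view.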
[0029] Furthermore, the classifier is a classifier pre-trained on the ImageNet dataset.
[0030] Furthermore, the specific operations for data augmentation of the images in the training set include, but are not limited to, image rotation, image translation, image cropping, and noise addition.
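The four named augmentation operations can be illustrated on a tiny grayscale image stored as a list of pixel rows. This is a teaching sketch in plain Python; all function names and parameters are assumptions, the rotation is limited to 90 degrees for simplicity, and a real pipeline would normally rely on an image processing library instead.

```python
import random


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def translate(image, dx, dy, fill=0):
    """Shift the image by (dx, dy) pixels, filling uncovered pixels."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def crop(image, top, left, height, width):
    """Keep only the given rectangular region."""
    return [row[left:left + width] for row in image[top:top + height]]


def add_noise(image, sigma, rng):
    """Add Gaussian noise and clamp the result to the 0..255 range."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]


# Usage on a 2x2 example image.
image = [[1, 2],
         [3, 4]]
rotated = rotate_90(image)
shifted = translate(image, dx=1, dy=0)
cropped = crop(image, top=0, left=1, height=2, width=1)
noisy = add_noise(image, sigma=5.0, rng=random.Random(0))
```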
[0031] Furthermore, "not segmenting individual images" means that the biological individual is taken as the unit of partitioning, so that all images of the same biological individual exist only in the same dataset.
[0032] The beneficial technical effects of the present invention are at least as follows:
[0033] (1) This method constructs a standard data structure for an individual sample in the task of identifying biological species in biological samples, so as to better realize its application in different biological identification tasks.
[0034] (2) This method utilizes multi-angle image data from a single biological individual sample through data fusion and a special weighting calculation method for images from different angles, thereby achieving highly accurate identification of biological species.
[0035] (3) This method constructs a biological individual species identification method based on deep neural networks. By using deep neural networks, it achieves higher efficiency, higher accuracy, and lower cost than traditional manual identification.
Attached Figure Description
[0036] The present invention will be further described with reference to the accompanying drawings, but the embodiments in the drawings do not constitute any limitation on the present invention. For those skilled in the art, other drawings can be obtained based on the following drawings without creative effort.
[0037] Figure 1 is a flowchart of a biological individual species identification method based on deep neural networks according to the present invention.
[0038] Figure 2 is a schematic diagram of multi-angle images in the dataset partitioning of an embodiment of the present invention.
[0039] Figure 3 is a schematic diagram illustrating the method used in data enhancement according to an embodiment of the present invention.
[0040] Figures 4, 5, 6, and 7 are schematic diagrams of the confusion matrices in the mole species classification task according to an embodiment of the present invention.
Detailed Implementation
[0041] Embodiments of the present invention are described in detail below. Examples of these embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.
[0042] The endpoints and any values of the ranges disclosed herein are not limited to the precise ranges or values, and these ranges or values should be understood to include values close to these ranges or values. For numerical ranges, the endpoint values of the various ranges, the endpoint values of the various ranges and individual point values, and individual point values can be combined with each other to obtain one or more new numerical ranges, which should be considered as specifically disclosed herein.
[0043] Example 1: As shown in Figure 1, this invention provides a method for species identification of biological individuals based on deep neural networks, comprising the following steps:
[0044] S1. Obtaining the dataset and data preprocessing:
[0045] Images of the dorsal, ventral, and lateral sides of the maxilla and of the lateral side of the mandible of the organism to be identified are obtained as a dataset sample. To facilitate subsequent model training, the images need to be converted to .JPG format.
[0046] S2, Dataset Partitioning:
[0047] Taking the individual as the unit, and without segmenting the images of any individual (images of the same individual exist only in the same dataset), the dataset is divided into a "test set" and a "training set" in an 8:2 ratio. For each complete individual specimen, its data folder contains image data of the skull from four different orientations (dorsal, ventral, and lateral views of the maxilla, and lateral view of the mandible), as shown in Figures 2-3. At the same time, methods including but not limited to rotation, mirroring, brightness adjustment, blurring, and random deletion are used to augment the training set images.
[0048] S3. Construction of Species Identification Network Model:
[0049] This method extracts features from biological samples for identification using deep neural networks. A deep neural network model is used as the core of the entire network model. This model is pre-trained on the ImageNet image dataset to better suit this method. After the sample is input into the neural network to extract features, a classifier outputs the probability distribution data of the biological sample across each labeled category.
[0050] S4. Classifier Training:
[0051] The classifiers used in this method are all trained separately, starting from pre-training on the ImageNet dataset. Before training begins, the number of output channels of the fully connected layer of the model needs to be changed to the number of labeled species of the biological samples to be identified. After this adjustment, the pre-divided training set data is input into the pre-trained neural network model to adjust the network parameters and adapt the model to the identification task for this type of organism, and the weights in the network model are adjusted using backpropagation.
[0052] S5. Obtain the basic weights of image data in each direction:
[0053] The standard biological sample dataset constructed in this method contains four images per individual: the dorsal, ventral, and lateral views of the maxilla and the lateral view of the mandible. After training the classifier, the image data from the test set needs to be input separately into the network model, and the weights of the image data in each direction for this type of biological sample dataset in the species identification task are obtained through the corresponding algorithm.
[0054] S6. Species identification process:
[0055] As shown in Figure 4, after training the classifier, data samples of each biological individual are input into the network model. Multiple image data of each individual sample are sequentially fed into the recognition network to obtain the species identification probability distribution data for each image. Finally, the identification result of the biological sample is obtained through a data fusion algorithm. For the mole skull sample dataset used, a species identification accuracy of 92% can be achieved.
[0056] Although embodiments of the invention have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Claims
1. A method for species identification of biological individuals based on deep neural networks, characterized in that the method includes:
Step 1: Obtaining the dataset and preprocessing the data: Images of the dorsal, ventral, and lateral sides of the maxilla and of the lateral side of the mandible of the biological individual to be identified are obtained as one set of individual dataset samples, and the image format is standardized.
Step 2: Dataset partitioning: Based on the principle of not segmenting individual images, all dataset samples are divided into a test set and a training set in an 8:2 ratio. For each complete biological specimen, the data folder of the specimen contains image data of the dorsal, ventral, and lateral sides of the maxilla and of the lateral side of the mandible of the current biological individual. At the same time, data augmentation is performed on the images of the training set.
Step 3: Construction of the species identification network model: A deep neural network model is used as the main body of the entire network model. The network model is pre-trained using the ImageNet image dataset. After the multi-angle skull sample image data of biological individuals is input into the neural network to extract features, the classifier outputs the probability distribution data of the input sample data over each labeled category.
Step 4: Classifier training: The number of output channels of the fully connected layer in the model is changed to the number of labeled species of the biological samples to be identified. After the adjustment, the previously divided training set data is input into the pre-trained neural network model to adjust the network parameters so that the model is adapted to the identification task for this type of organism. The weights in the network model are adjusted using the backpropagation method.
Step 5: Obtaining the basic weights of image data in each direction: After the classifier training is completed, the image data in the test set is input into the network model separately, and the weight of the image data in each direction of the biological sample data in the species identification task is obtained by the algorithm.
Step 6: Species identification process: After the classifier is trained, data samples of each biological individual are input into the network model. Multiple image data of each individual sample are sequentially fed into the recognition network to obtain the species identification probability distribution data of each image. Finally, the identification result of the biological sample is obtained through a data fusion algorithm.
2. The method for biological individual species identification based on deep neural networks according to claim 1, characterized in that, The image format is .JPG.
3. The method for biological individual species identification based on deep neural networks according to claim 1, characterized in that the labels of the individual dataset samples use s for maxilla, m for mandible, d for dorsal surface, v for ventral surface, and l for lateral surface. The genus and species are indicated at the beginning of the label. That is, a complete image label is: genus#species#image number#maxilla / mandible#image orientation.
4. The method for biological individual species identification based on deep neural networks according to claim 1, characterized in that the algorithm is as follows: First, the skull images of the biological individual from multiple angles are input separately into the network model according to their angles. The identification accuracy for each image direction is extracted, and the corresponding weight of each type of image data in the identification of this type of organism is obtained. The specific formula is as follows: Wherein, $sd$, $sl$, $sv$, and $ml$ represent the classification accuracy obtained in this biological identification task for the four image types, namely the dorsal, lateral, and ventral sides of the maxilla and the lateral side of the mandible, respectively; and $W_{sd}$, $W_{sl}$, $W_{sv}$, and $W_{ml}$ represent the weight values of the image data from these four directions in the identification of this type of organism.
5. The method for biological individual species identification based on deep neural networks according to claim 4, characterized in that the fusion algorithm in the sixth step specifically includes: $P_{ind} = W_{sd} \times P_{sd} + W_{sl} \times P_{sl} + W_{sv} \times P_{sv} + W_{ml} \times P_{ml}$, where $W_{sd}$, $W_{sl}$, $W_{sv}$, and $W_{ml}$ represent the weight values of the image data from the four directions (the dorsal, lateral, and ventral sides of the maxilla and the lateral side of the mandible), respectively; $P_{sd}$, $P_{sl}$, $P_{sv}$, and $P_{ml}$ represent the species identification probability distributions of the image data from the same four directions, respectively; and $P_{ind}$ represents the species identification probability distribution output by the neural network model for a single biological sample.
6. The method for biological individual species identification based on deep neural networks according to claim 1, characterized in that the classifier is a classifier pre-trained on the ImageNet dataset.
7. The method for biological individual species identification based on deep neural networks according to claim 1, characterized in that, The specific operations for data augmentation of the images in the training set include image rotation, image translation, image cropping, and noise addition.
8. The method for biological individual species identification based on deep neural networks according to claim 1, characterized in that "not segmenting individual images" means that the biological individual is taken as the unit of partitioning, so that images of the same biological individual exist only in the same dataset.