A deep learning-based multi-view skull defect multi-attribute identification method

By constructing a multi-view skull defect multi-attribute recognition network model that combines deep learning with a channel-spatial attention mechanism, the method identifies the gender and ethnicity attributes of a defective skull simultaneously. This addresses the low recognition efficiency and insufficient multi-attribute modeling of existing technologies, and improves the accuracy and automation of defective-skull identification in forensic medicine.

CN122223218A · Pending · Publication Date: 2026-06-16 · NORTHWEST UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NORTHWEST UNIV
Filing Date
2026-03-04
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies are difficult to apply to the incomplete, fragmented, or partially missing skull samples that are common in actual cases. Moreover, most studies can identify only a single attribute and lack the ability to jointly model multiple attributes, resulting in low identification efficiency and poor accuracy in forensic medicine.

Method used

A deep learning-based method for identifying multiple attributes of skull defects in multiple views is adopted. By constructing a multi-attribute recognition network model for skull defects in multiple views, and utilizing feature extraction, feature fusion and multi-attribute classification modules, combined with channel spatial attention mechanism and multi-task learning, the method can simultaneously identify gender and ethnic attributes.

Benefits of technology

It significantly improves the accuracy and automation of gender and ethnicity identification of skull defects, simplifies the processing flow, and enhances the practicality and reliability of identity deduction in forensic medicine. It is particularly suitable for common but difficult-to-process skull defect samples.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a deep learning-based multi-view, multi-attribute identification method for defective skulls, comprising the following steps: obtaining computed tomography image data of volunteers' skulls, reconstructing a three-dimensional skull model, and converting it into two-dimensional images from multiple perspectives to form an image dataset, which is divided into a training set, a test set, and a validation set; constructing a deep learning-based multi-view defective-skull multi-attribute recognition network model; training the model on the training set and updating its parameters; evaluating the model on the validation set and selecting the optimal model or adjusting the hyperparameters based on the evaluation results; and inputting the images of the skull to be identified into the trained model and outputting the gender and ethnicity classification results. The application addresses the problems that the prior art applies poorly to the defective skulls common in actual cases, can identify only a single attribute, and lacks unified modeling of the overall morphology; it obtains the gender and ethnicity attributes of a damaged skull simultaneously and effectively improves classification accuracy.

Description

Technical Field

[0001] This invention belongs to the field of medical image analysis, specifically relating to a multi-view skull defect multi-attribute identification method based on deep learning.

Background Technology

[0002] In the fields of forensic medicine and criminal investigation, inferring biological characteristics from unidentified human remains is a crucial step in determining individual identity. Age, sex, height, and ethnicity are generally recognized as the four key biological attributes, especially when direct identification evidence such as DNA or dental records is lacking; skeletal feature analysis becomes the primary method. The skull, due to its structural stability, high preservation rate, and significant morphological differences between groups and sexes, is widely used in forensic anthropology for identification.

[0003] In existing technologies, the determination of sex and ethnicity of the skull mainly relies on three types of methods: first, morphological observation based on expert experience, which is highly subjective and has poor repeatability; second, statistical discrimination methods based on linear or geometric morphological measurements, which, while possessing a certain degree of objectivity, require a large amount of manual measurement, resulting in low efficiency and susceptibility to operational errors; and third, computer-aided measurement methods, which extract anatomical features and construct classification models through image processing technology, thus improving the level of automation to some extent. In recent years, with the development of artificial intelligence, some studies have attempted to introduce machine learning, especially deep learning technology, into skull attribute recognition, utilizing convolutional neural networks to automatically learn discriminative features and reduce reliance on manual feature engineering.

[0004] However, existing technologies still have significant shortcomings. First, most studies focus only on intact skulls, making them difficult to apply to the defective, fragmented, or partially missing skull samples commonly seen in real-world cases. Second, existing methods mostly identify a single attribute (gender only or ethnicity only) and lack the ability to jointly model multiple attributes, so practical applications require multiple independent models, which increases computational cost and ignores potential correlations between attributes. Third, studies on defective skulls are mostly limited to specific anatomical regions (such as the mandible and maxillary sinus) and lack a unified modeling mechanism for the overall representation of the skull under arbitrary defect patterns.

Summary of the Invention

[0005] The purpose of this invention is to provide a multi-view skull defect multi-attribute identification method based on deep learning, which solves the problems that existing technologies apply poorly to the skull defects common in actual cases, can identify only a single attribute, and lack unified modeling of the overall morphology. The invention obtains the gender and ethnicity information of a damaged skull simultaneously, effectively improving classification accuracy.

[0006] To achieve the above objectives, the present invention provides the following technical solution: a deep learning-based multi-view skull defect multi-attribute identification method, including the following steps:

Step 1: Obtain computed tomography (CT) images of the volunteer's skull, reconstruct a three-dimensional skull model, correct and normalize the three-dimensional skull model, and convert it into two-dimensional images from multiple perspectives to form an image dataset, which is then divided into training, testing, and validation sets.

Step 2: Construct a deep learning-based multi-view skull defect multi-attribute recognition network model. The network model includes a feature extraction module, a feature fusion module, and a multi-attribute classification module. The feature extraction module consists of a VGG network with the fully connected classification layer removed. The feature fusion module consists of a batch normalization layer with a kernel size of 5, a ReLU activation function, a 1×1 convolutional kernel, a 3×3 convolutional kernel, and channel attention and spatial attention. The multi-attribute classification module consists of two attribute classification blocks, each of which consists of a max pooling layer, two fully connected layers, and a ReLU activation function.

The network model is expressed by the formula $\hat{y} = M(I)$, where $I$ is the input image, $M$ represents the entire multi-attribute recognition model, and $\hat{y}$ is the final identified attributes. The model is represented as $M(I) = f_c(f_w(f_n(f_e(I))))$, where $f_e$ is a feature extractor that extracts information from each image, $f_n$ represents feature normalization, which unifies the scale of the feature maps, $f_w$ represents the weights of different features, by which the features are weighted, and $f_c$ represents a classifier that classifies the final features generated.

A set of multiple 2D skull images in RGB format is input into the feature extraction module. The feature extractor processes all input images to generate a feature map set X containing key information about the defective skull. The feature fusion module then integrates the information in the feature map set X to generate a feature map Y containing complete skull information. Finally, the multi-attribute classification module performs attribute classification on the feature map Y to obtain the identified gender and ethnicity information.

Step 3: Train the deep learning-based multi-view skull defect multi-attribute recognition network model using the training set, obtain the weights of the network model, and update the parameters.

Step 4: Use the validation set to evaluate the performance of the deep learning-based multi-view skull defect multi-attribute recognition network model, and select the optimal model or adjust the hyperparameters based on the evaluation results.

Step 5: Input the image of the skull defect to be identified into the trained deep learning-based multi-view skull defect multi-attribute recognition network model, and output the corresponding gender and ethnicity classification results.

[0007] Furthermore, the feature extraction module in step 2 includes the following steps: The multi-view skull images generated in step 1 are input into a pre-trained VGG network. The original fully connected classification layer of the VGG network is removed, while its deep convolutional layers are retained to extract high-dimensional features with spatial structural information. For each input image, a corresponding feature map is output, thereby obtaining a set of feature maps $X = \{x_1, x_2, \dots, x_N\}$, where $x_i \in \mathbb{R}^{B \times C \times W \times H}$ indicates the feature map corresponding to the $i$-th viewpoint, B represents the batch, C represents the number of channels, W represents the width, H represents the height, and N represents the number of views.

[0008] Furthermore, the feature fusion module described in step 2 includes the following steps:

(1) Normalize the extracted single-view feature group X: input the single-view features into the batch normalization layer and apply the ReLU activation function to obtain the normalized single-view features;

(2) Concatenate all view features along the channel dimension to form a feature map containing all skull information from the single-view features;

(3) Use a 1×1 convolution kernel to perform channel fusion on the input features, exchanging information across channels, and then use a 3×3 convolution kernel to compress the channels, obtaining the compressed features $F$;

(4) Apply the channel-spatial attention mechanism to further enhance important channel features and capture important regions in the spatial dimension. Channel attention performs max pooling and average pooling on each channel of the input feature map, sums the outputs, and uses a sigmoid activation function to obtain the attention weight for each channel; it then performs weighted enhancement channel by channel. The calculation formula is $Y = \sigma(\mathrm{avg}(F) + \mathrm{max}(F)) \otimes F$, where avg is the average pooling operation, max is the max pooling operation, and $\sigma$ is the sigmoid function. Spatial attention enhances the features at a spatial location by calculating the average and maximum responses of all channels at that location, summing them, and using the sigmoid activation function to calculate the attention weight for each location. These weights are then used to enhance information at important locations. The calculation formula is $Y' = \sigma(\mathrm{avg}_C(Y_{h,w}) + \mathrm{max}_C(Y_{h,w})) \otimes Y$, where Y represents the features after channel attention enhancement, $\mathrm{avg}_C(Y_{h,w})$ denotes averaging over all channels C at a spatial location, with h the height and w the width, and $\mathrm{max}_C(Y_{h,w})$ denotes selecting the maximum value over all channels at that spatial location;

(5) The final output is $Y'$.

[0009] Furthermore, the multi-attribute classification module in step 2 includes the following steps: The output of the feature fusion module is taken as input, and max pooling is performed to obtain features with a final channel count of 512. The number of channels of Y is then reduced to half, i.e., 256, through the first fully connected layer. The ReLU activation function retains the features that contribute positively to the classification by setting all negative values to 0, filtering out irrelevant features. The second fully connected layer then maps the data to the classification dimension of 2, i.e., binary classification, and the final classification result is output. The two classifiers of the multi-attribute classification module have the same structure; only the specific attributes they output differ.

[0010] Furthermore, the training process in step 3 employs a multi-task learning mechanism, where the multi-task loss function is expressed as a weighted sum of the losses for each task, as shown in the formula $L = \sum_{t} w_t L_t$, where $L_t$ indicates the task-specific loss and $w_t$ indicates the task weight. During training, a binary cross-entropy loss function is selected for the gender and ethnicity classification tasks. A dynamic weighted average method is used to learn the task weights, and the Adam optimizer is used for optimization to improve the accuracy of model predictions.

[0011] Compared with the prior art, the present invention has the following beneficial effects: This invention transforms three-dimensional skull data into multi-angle two-dimensional views containing information about the damaged skull. To obtain feature information from the views, a convolutional neural network is used to extract features from a single skull view. Then, a channel spatial attention mechanism is used to enhance important features in the views and fuse multi-view features. Finally, a multi-task learning approach is introduced to simultaneously obtain the sex and ethnicity identification results of the damaged skull. This method allows for the simultaneous acquisition of sex and ethnicity information from damaged skulls, improving classification accuracy.

[0012] This invention provides a multi-attribute recognition framework for fractured skulls based on channel spatial attention. It extracts feature maps of the fractured skull from multi-angle two-dimensional images, fuses these features to generate complete skull morphological information, and inputs this information into a classifier to jointly determine multiple attributes such as gender and ethnicity. Compared to traditional gender and ethnicity classification methods that rely on manual measurement and expert experience, this invention significantly improves the automation of the identification process, avoiding subjective bias and sample damage caused by human operation. Furthermore, it eliminates the need for separate modeling for each attribute, simplifying the process and improving processing efficiency. In practical applications, this method offers advantages such as ease of operation, real-time response, and high classification accuracy. It is particularly suitable for fractured skull samples that are common but difficult to process in forensic practice, effectively enhancing the practicality and reliability of identity inference.

[0013] By introducing a channel spatial attention mechanism to fuse multi-view features, this invention can adaptively enhance discriminative features in key channels and spatial locations, suppress redundant or interfering information, and achieve efficient integration of skull features from different perspectives. This mechanism enables the model to focus more on important anatomical structures related to gender and ethnicity during the fusion process, thereby significantly improving the accuracy and robustness of multi-attribute recognition.

[0014] This invention employs a multi-task learning classification method, which can simultaneously output classification results for multiple biological attributes such as gender and ethnicity during a single inference process. Compared with the traditional single-task independent modeling method, it not only avoids redundant calculations and model redundancy, but also effectively utilizes the inherent correlation information between different attributes, thereby improving recognition efficiency while enhancing the consistency of classification results and overall discriminative performance.

Attached Figure Description

[0015] Figure 1 is a flowchart of the multi-attribute recognition process of the present invention; Figure 2 shows images of the fractured skull from multiple angles; Figure 3 is the network configuration diagram of the feature fusion module and multi-attribute classification module of the invention.

Detailed Implementation

[0016] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

[0017] As shown in Figure 1, in this embodiment a multi-view skull defect multi-attribute identification method based on deep learning includes the following steps:

Step 1: Obtain computed tomography (CT) images of the volunteer's skull, reconstruct a three-dimensional skull model, correct and normalize the three-dimensional skull model, and convert it into two-dimensional images from multiple perspectives to form an image dataset, which is then divided into training, testing, and validation sets.

Cranial medical imaging data of volunteers were obtained by computed tomography (CT) scanning. A 3D model was then generated using a CT reconstruction algorithm, and the skull data were corrected and normalized using the Frankfurt coordinate system. Because 3D skull data are complex and demand substantial computational and storage resources, a reflection model was used to generate 2D images of the skull from multiple angles. Each skull yields 7 images from different angles, each 224×224 pixels in size, which serve as input data for the model training process, as shown in Figure 2.
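The dataset division in Step 1 can be sketched as follows. This is a minimal illustration, not the applicant's procedure: the split ratios, the random seed, and the function name `split_by_skull` are assumptions, since the source states no proportions. The sketch splits by skull rather than by image, so that all 7 views of one skull land in the same subset.

```python
import random

def split_by_skull(skull_ids, views_per_skull=7, ratios=(0.7, 0.15, 0.15), seed=0):
    """Assign whole skulls to train/val/test, then expand each skull into its views.

    ratios and seed are illustrative assumptions, not values from the source.
    """
    ids = list(skull_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * ratios[0])
    n_val = int(len(ids) * ratios[1])
    parts = {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }
    # Each sample is (skull_id, view_index); views of one skull share a subset.
    return {name: [(s, v) for s in group for v in range(views_per_skull)]
            for name, group in parts.items()}

splits = split_by_skull(range(20))
```

Keeping the views of a skull together avoids the same individual appearing in both training and evaluation data, which would otherwise inflate the measured accuracy.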

[0018] Step 2: Construct a deep learning-based multi-view skull defect multi-attribute recognition network model. The network model includes a feature extraction module, a feature fusion module, and a multi-attribute classification module. The feature extraction module consists of a VGG network with the fully connected classification layer removed. The feature fusion module consists of a batch normalization layer with a kernel size of 5, a ReLU activation function, 1×1 convolutional kernels, 3×3 convolutional kernels, and channel attention and spatial attention. The multi-attribute classification module consists of two attribute classification blocks, each consisting of a max-pooling layer, two fully connected layers, and a ReLU activation function. After passing through the classification module, the classification result is output. The input to the feature extraction module is the multi-view images; features are extracted from each view using a deep network architecture to generate a multi-view feature set of the defective skull. The input to the feature fusion module is the multi-view feature set, from which a feature map containing complete skull information is generated. The input to the multi-attribute classification module is this feature map, which is classified by gender and ethnicity to output the final classification results.

[0019] The network model is expressed by the formula $\hat{y} = M(I)$, where $I$ is the input image, $M$ represents the entire multi-attribute recognition model, and $\hat{y}$ is the final identified attributes. The model is represented as $M(I) = f_c(f_w(f_n(f_e(I))))$, where $f_e$ is a feature extractor that extracts information from each image, $f_n$ represents feature normalization, which unifies the scale of the feature maps, $f_w$ represents the weights of different features, by which the features are weighted, and $f_c$ represents a classifier that classifies the final features generated.

[0020] A set of multiple 2D skull images in RGB format are input into the feature extraction module. The feature extractor processes all input images to generate a feature map set X containing key information about the missing skull. Then, the feature fusion module integrates the information in the feature map set X to generate a feature map Y containing complete skull information. Finally, the multi-attribute classification module performs attribute classification on the feature map Y to obtain the identified gender and race information.

[0021] In the feature extraction module, features are extracted from the obtained multi-view skull data by a VGG network (Visual Geometry Group network, a deep convolutional neural network) with fixed parameters, yielding the feature representation corresponding to each view.

[0022] The multi-view skull images generated in step 1 are input into a pre-trained VGG network. The original fully connected classification layer of the VGG network is removed, and the intermediate layer that retains spatial location information is selected as the feature output layer to extract high-dimensional features with spatial structure information. For each input image, a corresponding feature map is output, thereby obtaining a set of feature maps $X = \{x_1, x_2, \dots, x_N\}$, where $x_i \in \mathbb{R}^{B \times C \times W \times H}$ indicates the feature map corresponding to the $i$-th viewpoint, B represents the batch, C represents the number of channels, W represents the width, H represents the height, and N represents the number of views.

[0023] In the feature fusion module, in order to further explore the relationships between views and skull information, the extracted single-view feature group X is normalized, as shown in Figure 3. The single-view features are input into the batch normalization layer, which standardizes them to accelerate model convergence, and the ReLU activation function is then applied to obtain the normalized single-view features. All view features are concatenated along the channel dimension to form a feature map containing all skull information from the single-view features. At this point, the excessive number of channels increases the network size and computational cost, and overlapping views introduce redundant information, affecting subsequent feature mining and model performance. A 1×1 convolutional kernel is therefore used to perform channel fusion on the input features, exchanging information across channels, and a 3×3 convolutional kernel is then used for channel compression to obtain the compressed features. This operation reduces the number of channels without changing the spatial dimension; it preserves important features while removing redundant information through weighted summation. A channel-spatial attention mechanism is then used to further enhance important channel features, capture important regions in the spatial dimension, suppress unimportant features, and reduce interference from redundant information.
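The concatenation and 1×1 channel-fusion steps can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions: the function name `fuse_views`, the tensor sizes, the random weights, and the (B, C, H, W) axis order are all hypothetical, and the batch normalization, ReLU, and 3×3 compression steps are omitted for brevity. It relies on the fact that a 1×1 convolution is a per-pixel linear map over channels.

```python
import numpy as np

def fuse_views(view_features, weight):
    """Concatenate per-view feature maps along channels, then mix them with a 1x1 conv.

    view_features: list of N arrays shaped (B, C, H, W)
    weight: array shaped (C_out, N * C), a 1x1 kernel with hypothetical values
    """
    stacked = np.concatenate(view_features, axis=1)      # (B, N*C, H, W)
    # A 1x1 convolution applies the same channel-mixing matrix at every pixel.
    return np.einsum("oc,bchw->bohw", weight, stacked)   # (B, C_out, H, W)

rng = np.random.default_rng(0)
views = [rng.standard_normal((2, 8, 7, 7)) for _ in range(7)]  # 7 views
fused = fuse_views(views, rng.standard_normal((16, 7 * 8)))
print(fused.shape)  # (2, 16, 7, 7)
```

The spatial size is unchanged while the channel count drops from 56 to 16, which mirrors the stated goal of reducing channels without altering the spatial dimension.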

[0024] Channel attention performs max pooling and average pooling on each channel of the input feature map, adds the outputs of the two and uses the sigmoid activation function to obtain the attention weights for each channel, and then performs weighted enhancement channel by channel.

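[0025] The calculation formula for this method is $Y = \sigma(\mathrm{avg}(F) + \mathrm{max}(F)) \otimes F$, where $F$ is the input feature map, avg represents average pooling, max represents max pooling, and $\sigma$ is the sigmoid function.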
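A minimal NumPy sketch of the channel attention described above follows. It implements only what the text states (per-channel average and max pooling, summation, sigmoid, channel-wise reweighting); the function names and the (B, C, H, W) axis order are assumptions, and no learned layers are included because the text does not mention any.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(f):
    """Reweight each channel of f, shaped (B, C, H, W), by sigmoid(avg + max)."""
    avg = f.mean(axis=(2, 3), keepdims=True)   # per-channel average, (B, C, 1, 1)
    mx = f.max(axis=(2, 3), keepdims=True)     # per-channel maximum, (B, C, 1, 1)
    weights = sigmoid(avg + mx)                # one attention weight per channel
    return weights * f                         # broadcast over H and W

x = np.ones((1, 3, 2, 2))
out = channel_attention(x)
```

Every spatial position in a channel is scaled by the same weight, so the operation changes the relative importance of channels without altering the layout inside each one.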

[0026] Spatial attention enhances the features of channel attention at spatial locations. It calculates the average and maximum responses of all channels at a given location, sums them, and uses the Sigmoid activation function to calculate the attention weight for each location. The weights are then used to enhance important location information.

[0027] The calculation formula for this method is $Y' = \sigma(\mathrm{avg}_C(Y_{h,w}) + \mathrm{max}_C(Y_{h,w})) \otimes Y$, where Y represents the feature after channel attention enhancement, $\mathrm{avg}_C(Y_{h,w})$ denotes averaging over all channels C at a spatial location, with h the height and w the width, and $\mathrm{max}_C(Y_{h,w})$ denotes selecting the maximum value over all channels at that spatial location. The final output is $Y'$.
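The spatial attention step can be sketched in the same way. As before, this follows only the operations named in the text (channel-wise average and maximum at each location, summation, sigmoid, location-wise reweighting); the function names and the (B, C, H, W) axis order are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(y):
    """Reweight each spatial location of y, shaped (B, C, H, W), by sigmoid(avg + max)."""
    avg = y.mean(axis=1, keepdims=True)   # average over channels, (B, 1, H, W)
    mx = y.max(axis=1, keepdims=True)     # maximum over channels, (B, 1, H, W)
    weights = sigmoid(avg + mx)           # one attention weight per location
    return weights * y                    # broadcast over channels

y = np.zeros((1, 2, 1, 2))
y[0, :, 0, 1] = [2.0, 4.0]                # only the second location is active
out = spatial_attention(y)
```

Here the weight is shared by all channels at a location, which is the counterpart of channel attention sharing one weight across all locations of a channel.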

[0028] In the multi-attribute classification module, two separate classifiers are used: one for gender prediction and one for ethnicity prediction. A shared network reduces the resource consumption associated with single-task learning and simultaneously outputs results from multiple tasks based on potential inter-task correlations. The output of the feature fusion module is taken as input, and max pooling is performed to obtain features with a final channel count of 512. The first fully connected layer reduces the number of channels in Y to half, 256. The ReLU activation function is then used to retain features that positively contribute to classification, setting all negative values to 0 and filtering out irrelevant features. Next, a second fully connected layer maps the data to a classification dimension of 2, i.e., binary classification, and the final classification result is output. The two classifiers have the same structure, differing only in the specific attributes they output.
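The two classification blocks can be sketched as a pair of identically shaped heads. This is a hedged illustration: the class name `AttributeHead`, the random initialization, and the 7×7 spatial size are hypothetical, and treating the max pooling as global pooling over the spatial dimensions is an assumption, since the text does not give the pooling window.

```python
import numpy as np

class AttributeHead:
    """Max pool -> FC(512 to 256) -> ReLU -> FC(256 to 2), per the description."""

    def __init__(self, rng, in_ch=512, hidden=256, n_classes=2):
        # Randomly initialized weights stand in for trained parameters.
        self.w1 = rng.standard_normal((in_ch, hidden)) * 0.01
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, n_classes)) * 0.01
        self.b2 = np.zeros(n_classes)

    def __call__(self, feat):
        pooled = feat.max(axis=(2, 3))                         # (B, 512), assumed global
        hidden = np.maximum(pooled @ self.w1 + self.b1, 0.0)   # ReLU zeroes negatives
        return hidden @ self.w2 + self.b2                      # (B, 2) class scores

rng = np.random.default_rng(0)
gender_head, ethnicity_head = AttributeHead(rng), AttributeHead(rng)
feat = rng.standard_normal((4, 512, 7, 7))   # fused feature map for a batch of 4
gender_logits, ethnicity_logits = gender_head(feat), ethnicity_head(feat)
print(gender_logits.shape, ethnicity_logits.shape)  # (4, 2) (4, 2)
```

Both heads read the same fused feature map but hold separate parameters, which matches the statement that the classifiers share a structure and differ only in the attribute they output.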

[0029] Step 3: Train the deep learning-based multi-view skull defect multi-attribute recognition network model using the training set, obtain the weights of the network model, and update the parameters. Training employs a multi-task learning mechanism, where the multi-task loss function is represented as a weighted sum of the losses for each task. The formula is $L = \sum_{t} w_t L_t$, where $L_t$ indicates the task-specific loss and $w_t$ indicates the task weight.

[0030] During training, a binary cross-entropy loss function is selected for the gender and race classification tasks, and a dynamic weighted average method is used to learn the task weights. The Adam optimizer is then used for optimization to improve the model's prediction accuracy.
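The weighted multi-task loss can be sketched in plain Python as below. The text names a "dynamic weighted average method" without giving its formula; this sketch assumes it corresponds to the Dynamic Weight Average scheme from the multi-task learning literature, in which each task's weight is a softmax over the ratio of its two most recent losses. The temperature of 2.0, the function names, and the sample loss values are illustrative assumptions.

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Assumed Dynamic Weight Average: slower-improving tasks get larger weights."""
    ratios = [a / b for a, b in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    k = len(ratios)
    return [k * e / sum(exps) for e in exps]   # weights sum to the task count

def total_loss(task_losses, weights):
    """Weighted sum of per-task losses."""
    return sum(w * l for w, l in zip(weights, task_losses))

# Gender loss fell from 1.0 to 0.5, ethnicity loss only from 1.0 to 0.9.
weights = dwa_weights([0.5, 0.9], [1.0, 1.0])
loss = total_loss([bce(0.8, 1), bce(0.3, 0)], weights)
```

In this example the ethnicity task receives the larger weight because its loss decreased more slowly, which is the balancing behavior the scheme is designed to provide.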

[0031] Step 4: Use the validation set to evaluate the performance of the deep learning-based multi-view skull defect multi-attribute recognition network model, and select the optimal model or adjust the hyperparameters based on the evaluation results.

Step 5: Input the skull defect image to be identified into the trained deep learning-based multi-view skull defect multi-attribute recognition network model, and output the corresponding gender and ethnicity classification results.

Experimental section: The method of this invention is compared with other methods for identifying the attributes of skull defects, as shown in Table 1. Existing technologies are generally limited to the independent identification of a single attribute (sex or ethnicity), and their accuracy still has room for improvement. For example, a method based on computer-aided measurement of 10 skull indicators and the establishment of a discriminant equation achieves a sex identification accuracy of 83.08%; a method using improved Canny edge detection to extract skull contours, combined with a CNN for sex identification, achieves an accuracy of 95.00% on incomplete skulls; a method using multi-region, multi-angle skull image input and an improved LeNet5 network for sex classification achieves an accuracy of 95.03%; another method based on local depth projection images, combined with dilated convolution and channel weight learning mechanisms for ethnicity identification, achieves an accuracy of 98.04%. However, none of the above methods achieves joint discrimination of sex and ethnicity attributes, while this invention not only supports simultaneous output of multiple attributes but also achieves sex and ethnicity identification accuracies of 97.22% and 98.61% respectively, demonstrating overall superior performance compared to existing technologies.

[0032] Table 1 Comparison with different methods

To verify the contribution of each module to the recognition performance of this invention, ablation experiments were conducted, and the results are shown in Table 2. The experiments show that when only the front view is used for gender and ethnicity identification, the limited skull morphology information contained in a single-view image makes it difficult to comprehensively represent the overall structure of the defective skull, resulting in low recognition accuracy. Introducing a multi-attribute classification module improves performance, but it is still limited by the incompleteness of the input information. In contrast, using multi-view images as input effectively enhances the coverage of skull features and significantly improves recognition accuracy. Further introduction of either the feature fusion module or the multi-attribute classification module further improves model performance, with the multi-view fusion module showing particularly significant effects on feature integration. When the multi-view input, feature fusion module, and multi-attribute classification module work collaboratively, the model achieves optimal recognition performance, fully demonstrating the effectiveness and necessity of the overall architecture of this invention.

[0033] Table 2 Module Performance Comparison

Claims

1. A method for identifying multiple attributes of skull defects based on deep learning, characterized in that it includes the following steps:

Step 1: Obtain computed tomography (CT) images of the volunteer's skull, reconstruct a three-dimensional skull model, correct and normalize the three-dimensional skull model, and convert it into two-dimensional images from multiple perspectives to form an image dataset, which is then divided into training, testing, and validation sets.

Step 2: Construct a deep learning-based multi-view skull defect multi-attribute recognition network model, which includes a feature extraction module, a feature fusion module, and a multi-attribute classification module. The feature extraction module consists of a VGG network with the fully connected classification layer removed; the feature fusion module consists of a batch normalization layer with a kernel size of 5, a ReLU activation function, a 1×1 convolutional kernel, a 3×3 convolutional kernel, and channel attention and spatial attention; the multi-attribute classification module consists of two attribute classification blocks, each of which consists of a max pooling layer, two fully connected layers, and a ReLU activation function.

The network model is expressed by the formula $\hat{y} = M(I)$, where $I$ is the input image, $M$ represents the entire multi-attribute recognition model, and $\hat{y}$ is the final identified attributes. The model is represented as $M(I) = f_c(f_w(f_n(f_e(I))))$, where $f_e$ is a feature extractor that extracts information from each image, $f_n$ represents feature normalization, which unifies the scale of the feature maps, $f_w$ represents the weights of different features, by which the features are weighted, and $f_c$ represents a classifier that classifies the final features generated.

A set of multiple two-dimensional skull images in RGB format is input into the feature extraction module. The feature extractor processes all input images to generate a feature map set X containing key information about the defective skull. Subsequently, the information in feature map set X is integrated through the feature fusion module to generate feature map Y containing complete skull information; finally, the multi-attribute classification module classifies the attributes of feature map Y to obtain the identified gender and ethnicity information.

Step 3: Train the deep learning-based multi-view skull defect multi-attribute recognition network model using the training set, obtain the weights of the network model, and update the parameters.

Step 4: Use the validation set to evaluate the performance of the deep learning-based multi-view skull defect multi-attribute recognition network model, and select the optimal model or adjust the hyperparameters based on the evaluation results.

Step 5: Input the image of the skull defect to be identified into the trained deep learning-based multi-view skull defect multi-attribute recognition network model, and output the corresponding gender and ethnicity classification results.

2. The method for identifying multiple attributes of skull defects based on deep learning according to claim 1, characterized in that the feature extraction module in Step 2 operates as follows: The multi-view skull images generated in Step 1 are input into a pre-trained VGG network. The original fully connected classification layer of the VGG network is removed, while its deep convolutional layers are retained to extract high-dimensional features with spatial structural information. For each input image, a corresponding feature map is output, thereby obtaining a set of feature maps X, in which the i-th element is the feature map corresponding to the i-th viewpoint, and where B represents the batch size, C the number of channels, W the width, H the height, and N the number of views.
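
As a reading aid, the feature map set can be written in set notation. This is a sketch under assumptions: the lowercase element name $x_i$ and the axis order $B \times C \times H \times W$ are chosen here for illustration, while $X$, $B$, $C$, $W$, $H$ and $N$ are the quantities named in the claim.

```latex
% One feature map per viewpoint, N viewpoints in total.
X = \{\, x_1, x_2, \dots, x_N \,\}, \qquad
x_i \in \mathbb{R}^{B \times C \times H \times W}, \quad i = 1, \dots, N
```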

3. The method for identifying multiple attributes of skull defects based on deep learning according to claim 2, characterized in that the feature fusion module in Step 2 performs the following steps:
(1) Normalize the extracted single-view feature group X: input each single-view feature into the batch normalization layer and apply the ReLU activation function to obtain the normalized single-view features;
(2) Concatenate all view features along the channel dimension, fusing the single-view features into one feature map that contains all the skull information;
(3) Use a 1×1 convolution kernel to perform channel fusion on the input features, exchanging information across channels, and then use a 3×3 convolution kernel to compress the channels to obtain the fused features;
(4) Apply the channel-spatial attention mechanism to further enhance important channel features and capture important regions in the spatial dimension. Channel attention performs max pooling and average pooling on each channel of the input feature map, sums the outputs, and applies a Sigmoid activation function to obtain the attention weight of each channel; it then performs weighted enhancement channel by channel. In its calculation formula, avg is the average pooling operation and max is the max pooling operation. Spatial attention enhances the features at a spatial location by calculating the average and maximum responses over all channels at that location, summing them, and applying the Sigmoid activation function to obtain the attention weight of each location; these weights are then used to enhance the information at important locations. In its calculation formula, Y represents the features after channel attention enhancement; one term averages over all C channels at a spatial location, where h is the height and w is the width, and the other term selects the maximum over all channels at that spatial location;
(5) The result is the final output feature map of the feature fusion module.
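
The two attention steps in (4) can be illustrated with a small standard-library sketch. It follows the wording of the claim literally (sum of average and max responses, then Sigmoid, then multiplication) and is not the applicant's implementation; the function names and the nested-list layout `[C][H][W]` are assumptions made for this example, and a practical system would use a tensor library instead.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(feat):
    """feat is a nested list shaped [C][H][W]; returns the channel-reweighted map."""
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        avg = sum(values) / len(values)   # average pooling over H x W
        mx = max(values)                  # max pooling over H x W
        weight = sigmoid(avg + mx)        # one attention weight per channel
        out.append([[weight * v for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """feat is [C][H][W]; reweights every spatial location (h, w)."""
    n_ch, height, width = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(n_ch)]
    for h in range(height):
        for w in range(width):
            column = [feat[c][h][w] for c in range(n_ch)]
            # mean and max over all channels at this location, summed
            weight = sigmoid(sum(column) / n_ch + max(column))
            for c in range(n_ch):
                out[c][h][w] = weight * feat[c][h][w]
    return out


def channel_spatial_attention(feat):
    """Channel attention first, then spatial attention on its output."""
    return spatial_attention(channel_attention(feat))
```

Calling `channel_spatial_attention(fused)` on a fused feature map returns a map of the same shape in which every value has been scaled by a channel weight and a location weight, both strictly between 0 and 1.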

4. The method for identifying multiple attributes of skull defects based on deep learning according to claim 3, characterized in that the multi-attribute classification module in Step 2 operates as follows: The fused feature map is taken as input, and max pooling is performed on it to obtain a feature whose final number of channels is 512. The first fully connected layer reduces the number of channels of Y to half of the previous number, i.e., 256. The ReLU activation function is then applied, retaining the features that contribute positively to the classification and setting all negative values to 0, thereby filtering out irrelevant features. The second fully connected layer then maps the data to a classification dimension of 2, i.e., binary classification, and the final classification result is output. The two classifiers of the multi-attribute classification module have the same structure and differ only in the attribute they output.
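
The layer sequence of one attribute classification block (max pooling, 512 to 256, ReLU, 256 to 2) can be sketched as follows. This is an illustration only: it assumes the max pooling is global over the spatial dimensions, uses random weights in place of trained ones, and the names `AttributeHead`, `global_max_pool`, `linear`, `relu` and `classify` are hypothetical helpers written for this example.

```python
import random


def global_max_pool(feat):
    """Collapse a [C][H][W] feature map into a length-C vector (assumed global pooling)."""
    return [max(v for row in channel for v in row) for channel in feat]


def linear(x, weights, bias):
    """Fully connected layer: weights is [out][in], bias is [out]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Set negative values to 0 and keep positive values unchanged."""
    return [max(0.0, v) for v in x]


class AttributeHead:
    """One attribute block: max pool -> FC(512->256) -> ReLU -> FC(256->2)."""

    def __init__(self, in_ch=512, hidden=256, n_classes=2, seed=0):
        rng = random.Random(seed)  # random weights stand in for trained ones

        def init(n_out, n_in):
            return [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]

        self.w1, self.b1 = init(hidden, in_ch), [0.0] * hidden
        self.w2, self.b2 = init(n_classes, hidden), [0.0] * n_classes

    def __call__(self, feat):
        pooled = global_max_pool(feat)                   # 512 values
        hidden = relu(linear(pooled, self.w1, self.b1))  # 256 values
        return linear(hidden, self.w2, self.b2)          # 2 class scores


def classify(feat, heads):
    """Run every attribute head on the same fused feature map."""
    return {name: head(feat) for name, head in heads.items()}
```

Two heads with identical structure, for example `{"gender": AttributeHead(seed=1), "ethnicity": AttributeHead(seed=2)}`, can be passed to `classify` so that both attributes are scored from the same fused feature map.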

5. The method for identifying multiple attributes of skull defects based on deep learning according to claim 4, characterized in that the training process in Step 3 employs a multi-task learning mechanism. The multi-task loss function is expressed as a weighted sum of the losses of the individual tasks, in which each task-specific loss is multiplied by its task weight. During training, a binary cross-entropy loss function is used for the gender and ethnicity classification tasks, a dynamic weighted average method is used to learn the task weights, and the Adam optimizer is used for optimization to improve the accuracy of the model's predictions.
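
The weighted multi-task loss can be sketched in a few lines. The example below assumes that the "dynamic weighted average method" refers to the Dynamic Weight Average (DWA) scheme known from the multi-task learning literature, in which each task weight depends on how fast that task's loss decreased over the two preceding epochs; this reading, the temperature value of 2.0, and all function names are assumptions made for illustration. The Adam optimization step is not shown.

```python
import math


def binary_cross_entropy(prob, target, eps=1e-12):
    """BCE for one sample; prob is the predicted probability of class 1."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic Weight Average (assumed scheme).

    Both arguments hold the per-task average losses of the two preceding
    epochs. A task whose loss falls more slowly receives a larger weight,
    and the weights always sum to the number of tasks.
    """
    ratios = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    return [len(ratios) * e / total for e in exps]


def multitask_loss(task_losses, weights):
    """Weighted sum of the task-specific losses: L = sum_k w_k * L_k."""
    return sum(w * l for w, l in zip(weights, task_losses))
```

With two tasks (gender and ethnicity), `dwa_weights` would be called once per epoch with the recorded average losses, and the resulting weights passed to `multitask_loss` for every batch of that epoch; in the first two epochs, where no loss history exists, equal weights of 1.0 are the usual starting point.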