A visual nerve-based diabetic foot ulcer risk assessment method and system
By employing a multimodal fusion neural network assessment method based on visual nerves and utilizing data acquired through nano-dressings and cameras, the challenge of assessing diabetic foot ulcer conditions has been solved. This approach enables accurate risk assessment and clinical application, reduces the risk of overfitting, and enhances the interpretability and device compatibility of the assessment.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- THE AFFILIATED HOSPITAL OF XUZHOU MEDICAL UNIV
- Filing Date
- 2026-03-31
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies are insufficient to effectively assess the condition of diabetic foot ulcers, leading to complex treatment processes that are difficult to adjust in a timely manner, increasing the medical burden on patients and the risk of amputation.
A risk assessment method for diabetic foot ulcers based on visual nerves was adopted. Phosphorescent images, ulcer wound images and sensor feature data were collected through nano-dressing. A multimodal fusion neural network model was used to assess the condition, including feature extraction and fusion of phosphorescent subnetwork, wound subnetwork and sensor subnetwork, to generate risk assessment results.
It enables accurate assessment of diabetic foot ulcers, and by combining nanomaterials and algorithm models, it improves the accuracy and interpretability of the assessment, supports edge device deployment, reduces the risk of overfitting, and enhances clinical trust.
Description
Technical Field
[0001] This application relates to the field of disease risk assessment technology based on image vision, and in particular to a method and system for risk assessment of diabetic foot ulcers based on visual nerves.
Background Technology
[0002] Diabetes mellitus is a disease with complications affecting hundreds of millions of people worldwide. Diabetic foot ulcer (DFU) is a common complication of diabetes, affecting approximately 15% of people with type 1 and type 2 diabetes. As a typical chronic wound, DFU is extremely difficult to heal. Therefore, DFU not only imposes a significant medical burden on patients but also increases the risk of amputation. Generally, wound healing treatment can be divided into several overlapping stages, including hemostasis, inflammation reduction, regeneration, and remodeling.
[0003] The serious challenge is that DFU healing is a far more complex process, often stalling at a certain stage and accompanied by systemic pathological changes. Diabetes can impair wound healing through one or more biological mechanisms. For example, diabetes can lead to microcirculatory disturbances, triggering peripheral neuropathy or peripheral artery disease. Furthermore, diabetes alters the immune response, reducing the body's resistance to bacterial infections. Moreover, high blood sugar causes cell membrane hardening and vasoconstriction, which reduces blood flow and deprives the wound of oxygen and nutrients, thus hindering the healing process. Therefore, in actual treatment, effective assessment of the foot ulcer wound in diabetic patients is crucial for adjusting treatment in time to accelerate recovery. How to assess the condition effectively remains a significant challenge.
Summary of the Invention
[0004] This application provides a method and system for assessing the risk of diabetic foot ulcers based on the visual nerve, which enables effective and accurate assessment of the condition of foot ulcer wounds in diabetic patients.
[0005] To achieve the above objectives, the embodiments of this application adopt the following technical solutions: Firstly, a method for risk assessment of diabetic foot ulcers based on visual nerves is provided. This method is used to assess the risk of foot ulcers in patients. The patient's foot ulcer is covered with a nano-dressing, which includes a hydrogel dressing and a nanosensing dressing. The hydrogel dressing is covalently linked with nanoprobe structures, which are excited to different degrees of phosphorescence under different oxygen concentrations. The nanosensing dressing is used to collect sensory feature data of the foot ulcer. The method includes: acquiring phosphorescent images of the patient's foot ulcer, images of the ulcer wound, and sensor feature data; processing the phosphorescent images based on a phosphorescent sub-network model to obtain an oxygen-phosphorescence response feature vector; processing the ulcer wound images based on a wound sub-network model to obtain an ulcer pathological feature vector; extracting physiological parameters from the sensor feature data based on a sensor sub-network model to obtain a physiological feature vector; and weightedly concatenating and fusing the oxygen-phosphorescence response feature vector, ulcer pathological feature vector, and physiological feature vector based on a fusion sub-network model to obtain a fused feature vector. The fused feature vector is then used for identification calculations to obtain a diabetic foot ulcer risk assessment result. The value of the diabetic foot ulcer risk assessment result is used to indicate the patient's foot ulcer risk assessment level.
[0006] In one possible implementation, the phosphorescent sub-network model includes a first input preprocessing layer, a lightweight ResNet18 network, and a first feature extraction layer connected in sequence. The first feature extraction layer includes a first extraction layer and a second extraction layer connected in sequence. The first extraction layer includes a first linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence, and the second extraction layer includes a second linear layer and a first regularized dropout layer connected in sequence. The number of input channels of each of the first and second linear layers is twice its number of output channels. The second linear layer outputs 128-dimensional oxygen-phosphorescence response feature information. During model training, the first regularized dropout layer sets the oxygen-phosphorescence response feature information output by some channels of the second linear layer to 0 based on a preset random deactivation probability, thus obtaining the oxygen-phosphorescence response feature vector.
[0007] In one possible implementation, the wound sub-network model is an improved version of the DenseNet121 network. Specifically, the improvement involves removing the classifier from the output stage of the DenseNet121 network, adding a second input preprocessing layer before the input of the DenseNet121 network, and adding a second feature extraction layer after the output of the DenseNet121 network. The second input preprocessing layer adapts the image size of the input ulcer wound image to 224×224. The second feature extraction layer includes a third extraction layer and a fourth extraction layer connected in sequence. Specifically, the third extraction layer includes a third linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence; the fourth extraction layer includes a fourth linear layer and a second regularized dropout layer connected in sequence; the third linear layer reduces the input 992-dimensional ulcer pathological feature information to 256-dimensional ulcer pathological feature information; and the fourth linear layer reduces the input 256-dimensional ulcer pathological feature information to 128-dimensional ulcer pathological feature information. During the model training phase, the second regularized dropout layer sets the ulcer pathological feature information output by some channels of the fourth linear layer to 0 based on a preset random deactivation probability, thus obtaining the ulcer pathological feature vector.
[0008] In one possible implementation, the sensor sub-network model includes a third input preprocessing layer, a first hidden layer, a third regularized dropout layer, a second hidden layer, and an output layer connected in sequence. The third input preprocessing layer performs statistical standardization on the input sensor feature data.
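The text does not say which form of statistical standardization is used. A minimal sketch, assuming a per-feature z-score (subtract the mean, divide by the standard deviation), is shown below. The function name `standardize`, the `eps` guard, and the sample readings are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev

def standardize(rows, eps=1e-8):
    """Z-score each column of `rows`, where a row is [pH, temperature, humidity].

    Assumption: per-feature z-score; the source only says "statistical standardization".
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    # eps avoids division by zero when a feature is constant across the batch
    return [[(v - m) / (s + eps) for v, m, s in zip(row, mus, sigmas)]
            for row in rows]

# Made-up sensor readings, not taken from the source
readings = [[7.2, 33.1, 60.0], [7.8, 35.4, 72.0], [6.9, 32.5, 55.0]]
scaled = standardize(readings)
```

After this step each of the three features has roughly zero mean and unit variance, so pH (single digits) and humidity (tens) contribute on a comparable scale to the first hidden layer.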
[0009] In one possible implementation, the sensor feature data includes three dimensions of physiological feature information: pH value, temperature value, and humidity value. The first hidden layer comprises a fifth linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence. The fifth linear layer expands the 3-dimensional physiological feature information to 16 dimensions. During model training, the third regularized dropout layer sets the physiological feature information output from some channels of the fifth linear layer to 0 based on a preset random deactivation probability. The second hidden layer comprises a sixth linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence. The sixth linear layer expands the 16-dimensional physiological feature information to 32 dimensions. The output layer is a seventh linear layer without an activation layer. The seventh linear layer processes the 32-dimensional physiological feature information and outputs a 32-dimensional physiological feature vector.
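To make the 3 → 16 → 32 → 32 layout concrete, the following sketch checks that consecutive layer widths chain correctly and counts learnable parameters. It assumes each linear layer has a bias term and each batch normalization layer has a learnable scale and shift; neither detail is stated in the text, and the names `LAYERS`, `count_parameters`, and `check_chain` are hypothetical.

```python
# (in_features, out_features, has_batchnorm) for each layer described above
LAYERS = [
    (3, 16, True),    # fifth linear layer + batch normalization + ReLU
    (16, 32, True),   # sixth linear layer + batch normalization + ReLU
    (32, 32, False),  # seventh linear layer, no activation
]

def check_chain(layers):
    """True if every layer's output width equals the next layer's input width."""
    return all(a[1] == b[0] for a, b in zip(layers, layers[1:]))

def count_parameters(layers, bias=True):
    """Count weights, optional biases, and batch-norm scale/shift parameters."""
    total = 0
    for n_in, n_out, has_bn in layers:
        total += n_in * n_out + (n_out if bias else 0)
        if has_bn:
            total += 2 * n_out  # assumed learnable scale and shift
    return total

total_params = count_parameters(LAYERS)
print(total_params)  # 1760 under the assumptions above
```

Under these assumptions the sensor sub-network has fewer than two thousand parameters, which is consistent with the lightweight design the application describes.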
[0010] In one possible implementation, the momentum parameter of the batch normalization layer in the first hidden layer is 0.9, and the small constant (epsilon) is set to 0.00001.
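The two hyperparameters can be illustrated with a small sketch of batch normalization over one feature. The text does not state which momentum convention applies, so this sketch assumes that 0.9 is the weight kept on the previous running statistic; under PyTorch's convention, where `momentum` weights the new batch statistic, the equivalent setting would be 0.1. The function names are hypothetical.

```python
import math

EPS = 1e-5       # small constant from the text, added to the variance
MOMENTUM = 0.9   # from the text; ASSUMED to weight the old running value

def batch_norm(values, eps=EPS):
    """Normalize one feature across a batch using the batch mean and variance."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    normed = [(v - mu) / math.sqrt(var + eps) for v in values]
    return normed, mu, var

def update_running(running, batch_stat, momentum=MOMENTUM):
    """Exponential moving average used at inference time (assumed convention)."""
    return momentum * running + (1.0 - momentum) * batch_stat

normed, mu, var = batch_norm([1.0, 2.0, 3.0, 4.0])
running_mean = update_running(0.0, mu)  # 0.9 * 0.0 + 0.1 * 2.5 = 0.25
```

The epsilon term only matters when a feature has near-zero variance in a batch, which can happen with small medical datasets.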
[0011] In one possible implementation, the fusion sub-network model includes a fourth input preprocessing layer, a multi-level cross-fusion layer, a dynamic attention weighting layer, and a decision output layer connected in sequence. The multi-level cross-fusion layer includes three levels of fusion layers connected in sequence. The first-level fusion layer includes a phosphorescence-pathology fusion layer and an image-sensing fusion layer. The phosphorescence-pathology fusion layer includes a matrix multiplication attention layer and a first weighted fusion layer. The matrix multiplication attention layer is used to cross-multiply the oxygen-phosphorescence response feature vector and the ulcer pathology feature vector to obtain an attention vector matrix. The first weighted fusion layer is used to cross-multiply the physiological feature vector with the attention vector matrix and then add the ulcer pathology feature vector to obtain cross-modal fusion features. The second-level fusion layer includes a feature splitting layer, a sub-feature interaction layer, and a recombination and aggregation layer connected in sequence. The feature splitting layer is used to split the cross-modal fusion features into multiple sub-fusion features. The sub-feature interaction layer is used to perform element-wise multiplication operations on each sub-fusion feature with other sub-fusion features to obtain multiple multiplicative fusion features. The recombination and aggregation layer is used to merge and connect multiple multiplicative fusion features to obtain recombined fusion features. The third-level fusion layer consists of a multi-head self-attention layer, a residual connection layer, and a layer normalization layer connected in sequence. The multidimensional features of the recombined fusion features are evenly distributed to each head in the multi-head self-attention layer for processing. The residual connection layer is used to add the output features of the multi-head self-attention layer to the recombined fusion features to obtain the global fusion features.
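The second-level fusion (split, element-wise interaction, recombination) can be sketched as follows. The text does not specify the number of sub-features or how they are paired, so this sketch assumes equal-sized chunks and one product per unordered pair of chunks. All function names and the toy 8-dimensional input are hypothetical.

```python
from itertools import combinations

def split_features(vec, groups):
    """Feature splitting layer: cut the vector into `groups` equal chunks."""
    if len(vec) % groups:
        raise ValueError("vector length must be divisible by the number of groups")
    size = len(vec) // groups
    return [vec[i * size:(i + 1) * size] for i in range(groups)]

def pairwise_products(chunks):
    """Sub-feature interaction layer: element-wise product of each chunk pair.

    Assumption: unordered pairs (i < j); the source only says "with other
    sub-fusion features".
    """
    return [[a * b for a, b in zip(x, y)] for x, y in combinations(chunks, 2)]

def recombine(products):
    """Recombination and aggregation layer: concatenate all products."""
    return [v for chunk in products for v in chunk]

fused = [float(i) for i in range(1, 9)]      # toy cross-modal fusion feature
chunks = split_features(fused, groups=4)     # four chunks of length 2
recombined = recombine(pairwise_products(chunks))
```

With four chunks there are six pairs, so the toy 8-dimensional input yields a 12-dimensional recombined feature. In a real model the chunk count would be chosen so that the recombined width divides evenly across the attention heads of the third level.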
[0012] In one possible implementation, the dynamic attention weighting layer is used to generate channel weights based on channel attention, generate spatial weights based on spatial attention, and perform weighted operations on the global fusion features based on the channel weights and spatial weights to obtain attention features.
[0013] In one possible implementation, the decision output layer includes a multi-branch feature network and an ensemble decision network. The multi-branch feature network includes a fine-grained feature branch, a global feature branch, and a key feature branch. The outputs of the fine-grained feature branch, the global feature branch, and the key feature branch are respectively connected to the input of the ensemble decision network. Specifically, the fine-grained feature branch includes, in sequence, an eighth linear layer, a batch normalization layer, a LeakyReLU activation layer, a fourth regularized dropout layer, a ninth linear layer, a batch normalization layer, and a LeakyReLU activation layer; the input feature dimensions of the eighth and ninth linear layers are twice the corresponding output feature dimensions. The global feature branch includes, in sequence, a global average pooling layer, a tenth linear layer, and a ReLU activation layer; the input feature dimension of the tenth linear layer is eight times the corresponding output feature dimension. The key feature branch includes, in sequence, an attention feature selection layer and an eleventh linear layer; the attention feature selection layer is used to select the top k dimensions of features from the attention features based on feature-level attention; the input feature dimension of the eleventh linear layer is twice the corresponding output feature dimension. The ensemble decision network consists of a feature concatenation layer, a classification head layer, and a result output layer connected in sequence. The feature concatenation layer concatenates the outputs of the fine-grained feature branch, the global feature branch, and the key feature branch to obtain multidimensional integrated features. The classification head layer performs classification on the multidimensional integrated features to obtain the disease severity distribution value. The result output layer outputs the diabetic foot ulcer risk assessment result based on the disease severity distribution value and the contribution weights, which are preset weight values for the phosphorescent images, ulcer wound images, and sensor feature data.
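The attention feature selection step in the key feature branch can be sketched as a top-k selection over per-dimension attention scores. How the scores are produced is not described in the text, so they are passed in as an input here. The function name `top_k_features` and the toy values are hypothetical.

```python
def top_k_features(features, scores, k):
    """Keep the k feature dimensions with the highest attention scores.

    Returns the selected values and their indices, preserving the original
    dimension order (an assumption; the source does not specify ordering).
    """
    if not 0 < k <= len(features) or len(features) != len(scores):
        raise ValueError("k must be in 1..len(features) and lengths must match")
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    return [features[i] for i in keep], keep

# Toy attention features and made-up attention scores
selected, indices = top_k_features([0.1, 0.9, 0.4, 0.7], [0.2, 0.8, 0.1, 0.5], k=2)
```

Here dimensions 1 and 3 have the highest scores, so the branch passes `[0.9, 0.7]` on to the eleventh linear layer.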
[0014] Secondly, embodiments of this application also provide a visual nerve-based risk assessment system for diabetic foot ulcers, including a data acquisition component and a data processing component. The data acquisition component includes a nano-dressing, a first camera, and a second camera. The nano-dressing is applied to the foot ulcer site of the patient. The nano-dressing includes a hydrogel dressing and a nanosensing dressing. The hydrogel dressing has a nanoprobe structure covalently linked to it. The nanoprobe structure is excited to different degrees of phosphorescence under different oxygen concentrations. The nanosensing dressing is used to collect sensor feature data of the foot ulcer site. The first camera is used to acquire images of the phosphorescence features of the foot ulcer site to obtain a single-wavelength phosphorescence image. The second camera is used to acquire images of the wound features of the foot ulcer site to obtain a three-primary-color ulcer wound image. The data processing component is used to: acquire phosphorescent images, ulcer wound images, and sensor feature data of the patient's foot ulcer; process the phosphorescent images based on the phosphorescent sub-network model to obtain the oxygen-phosphorescence response feature vector; process the ulcer wound images based on the wound sub-network model to obtain the ulcer pathological feature vector; extract physiological parameters from the sensor feature data based on the sensor sub-network model to obtain the physiological feature vector; and perform weighted concatenation and fusion of the oxygen-phosphorescence response feature vector, ulcer pathological feature vector, and physiological feature vector based on the fusion sub-network model to obtain the fused feature vector. The fused feature vector is then used for recognition operations to obtain the recognition result. The value of the recognition result is used to indicate the patient's foot ulcer risk assessment level.
[0015] The embodiments of this application have the following advantages, i.e., beneficial effects: 1. By combining nanomaterials, an integrated approach of treatment, detection, and re-treatment has been achieved, addressing clinical pain points.
[0016] 2. At the algorithm model level, multimodal fusion improves the accuracy and richness of the evaluation. For example, the sub-module design is tailored to the data characteristics, extracting key features more accurately, and the complex fusion evaluation module enhances cross-modal information interaction.
[0017] 3. The design of each algorithm sub-module at the training and optimization level is adapted to the characteristics of medical data to reduce the risk of overfitting.
[0018] 4. Interpretable design enhances clinical trust: The fusion module outputs modality contribution weights. For example, in one assessment, sensor features account for 50% and skin features account for 35%. Visualizations of phosphorescence and dermatopathology features (such as attention heatmaps) can intuitively show which type of data dominates the assessment results, avoiding the problem that black-box models are difficult for doctors to accept.
[0019] 5. Lightweight architecture supports edge device deployment: Sub-modules are all improved and adapted using lightweight networks (ResNet18, DenseNet121, and small parameter multi-head self-attention mechanism MLP). The total number of parameters of the fusion module is controllable, which can be adapted to monitoring devices connected to nanofilms to achieve automatic evaluation without relying on large computing devices.
Attached Figure Description
[0020] Figure 1: A schematic diagram of the structure of a visual nerve-based risk assessment system for diabetic foot ulcers provided in an embodiment of this application;
Figure 2: A schematic flowchart of a method for assessing the risk of diabetic foot ulcers based on visual nerves provided in an embodiment of this application;
Figure 3: A schematic diagram of the algorithm network model for a diabetic foot ulcer risk assessment method provided in an embodiment of this application;
Figure 4: A schematic diagram of the algorithm network model of a phosphorescent sub-network model provided in an embodiment of this application;
Figure 5: A schematic diagram of the algorithm network model of another phosphorescent sub-network model provided in an embodiment of this application;
Figure 6: A schematic diagram of the algorithm network model of a wound sub-network model provided in an embodiment of this application;
Figure 7: A schematic diagram of the algorithm network model of a sensor sub-network model provided in an embodiment of this application;
Figure 8: A schematic diagram of the algorithm network model of a fusion sub-network model provided in an embodiment of this application;
Figure 9: A schematic diagram of the algorithm network model of another fusion sub-network model provided in an embodiment of this application.
Detailed Implementation
[0021] It should be noted that the terms "first" and "second" used in the embodiments of this application are only used to distinguish features of the same type and should not be construed as indicating relative importance, quantity, order, etc.
[0022] The terms "exemplary" or "for example" used in the embodiments of this application are used to indicate examples, illustrations, or descriptions. Any embodiment or design described as "exemplary" or "for example" in this application should not be construed as being more preferred or advantageous than other embodiments or designs. Specifically, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.
[0023] The terms "coupling" and "connection" used in the embodiments of this application should be interpreted broadly. For example, they can refer to a physical direct connection or an indirect connection achieved through electronic devices, such as a connection achieved through resistors, inductors, capacitors or other electronic devices.
[0024] Diabetes mellitus is a disease with complications affecting hundreds of millions of people worldwide. Diabetic foot ulcer (DFU) is a common complication of diabetes, affecting approximately 15% of people with type 1 and type 2 diabetes. As a typical chronic wound, DFU is extremely difficult to heal. Therefore, DFU not only imposes a significant medical burden on patients but also increases the risk of amputation. Generally, wound healing treatment can be divided into several overlapping stages, including hemostasis, inflammation reduction, regeneration, and remodeling.
[0025] The serious challenge is that DFU healing is a far more complex process, often stalling at a certain stage and accompanied by systemic pathological changes. Diabetes can impair wound healing through one or more biological mechanisms. For example, diabetes can lead to microcirculatory disturbances, triggering peripheral neuropathy or peripheral artery disease. Furthermore, diabetes alters the immune response, reducing the body's resistance to bacterial infections. Moreover, high blood sugar causes cell membrane hardening and vasoconstriction, which reduces blood flow and deprives the wound of oxygen and nutrients, thus hindering the healing process. Therefore, in actual treatment, effective assessment of the foot ulcer wound in diabetic patients is crucial for adjusting treatment in time to accelerate recovery. How to assess the condition effectively remains a significant challenge.
[0026] The above analysis shows that oxygen content at the site of a foot ulcer is a major factor affecting wound healing. In addition, parameters such as temperature, humidity, and pH at the wound site are also crucial. Furthermore, the actual size and depth of the foot ulcer are closely related to the recovery process. Achieving an accurate assessment of the risk of the condition by comprehensively considering all these factors remains a significant challenge.
[0027] To address the aforementioned problems, this application provides a visual nerve-based risk assessment system 1000 for diabetic foot ulcers. As shown in Figure 1, the visual nerve-based diabetic foot ulcer risk assessment system 1000 includes a data acquisition component 100 and a data processing component 200. The data acquisition component 100 includes a nano-dressing 10, a first camera 20, and a second camera 30. The nano-dressing 10 is applied to the patient's foot ulcer and includes a hydrogel dressing and a nanosensing dressing. The hydrogel dressing has covalently linked nanoprobe structures that are excited to different degrees of phosphorescence under different oxygen concentrations. The nanosensing dressing is used to acquire sensor feature data of the foot ulcer. The first camera 20 is used to acquire images of the phosphorescence features of the foot ulcer, obtaining a single-wavelength phosphorescence image. The second camera 30 is used to acquire images of the wound features of the foot ulcer, obtaining a three-primary-color ulcer wound image. The data processing component 200 is used to: acquire phosphorescent images, ulcer wound images, and sensor feature data of the patient's foot ulcer; process the phosphorescent images based on the phosphorescent sub-network model to obtain the oxygen-phosphorescence response feature vector; process the ulcer wound images based on the wound sub-network model to obtain the ulcer pathological feature vector; extract physiological parameters from the sensor feature data based on the sensor sub-network model to obtain the physiological feature vector; and perform weighted concatenation and fusion of the oxygen-phosphorescence response feature vector, ulcer pathological feature vector, and physiological feature vector based on the fusion sub-network model to obtain the fused feature vector. The fused feature vector is then used for identification calculations to obtain the diabetic foot ulcer risk assessment result. The value of the diabetic foot ulcer risk assessment result is used to indicate the patient's foot ulcer risk assessment level.
[0028] In the embodiment illustrated in Figure 1, multi-source data acquired by a composite nanofilm material and two cameras is used to quantitatively assess the severity of diabetic foot ulcers, aiding clinical judgment. This embodiment collects three types of data: 1. Phosphorescence images: phosphorescence images generated by the hydrogel based on the single-channel emission state of the affected area (oxygen concentration is directly related to ulcer healing status; hypoxia is often accompanied by ulcer deterioration); 2. Ulcer wound images: RGB images of the affected skin in the foot ulcer area (containing pathological information such as ulcer area, edge morphology, tissue state (granulation / necrosis), and exudate); 3. Sensor feature data: three physiological parameters measured by the nanostructure sensor (pH value: abnormal acidity indicates infection; humidity: reflects exudate volume and is associated with the degree of inflammation; temperature: local heat is a core signal of infection). A neural network algorithm based on multi-network structure fusion performs multimodal fusion of the multi-source data, thereby comprehensively assessing the patient's disease risk.
[0029] For example, the nanoprobe structure can be a nanostructure based on small molecule probes and / or macromolecule probes.
[0030] For example, the nanosensing dressing of the nano-dressing 10 can be connected to an electrical processing device, which may include signal processing circuitry (e.g., for filtering and amplifying the acquired sensing signals) and signal transmission circuitry, such as wireless signal transmission. Alternatively, it may include a processor and a memory to process and store the acquired sensing data. The acquired sensing signals can be transmitted to the data processing component 200 in real time, or stored in the memory and retrieved by the data processing component 200.
[0031] The visual nerve-based diabetic foot ulcer risk assessment system 1000 shown in Figure 1 can perform the method for assessing the risk of diabetic foot ulcers based on visual nerves illustrated in Figure 2, which includes steps S100-S300 as follows: S100: Acquire phosphorescent images, ulcer wound images, and sensor feature data of the patient's foot ulcer.
[0032] In some possible implementations, the data processing component 200 runs a visual neural network model. As shown in Figure 3, the neural network model includes a phosphorescent sub-network model, a wound sub-network model, a sensor sub-network model, and a fusion sub-network model. The phosphorescent sub-network model takes phosphorescent images as input for feature extraction. The wound sub-network model takes ulcer wound images as input for feature extraction. The sensor sub-network model takes sensor feature data as input for feature extraction. Specific operations for feature processing are described in steps S200A, S200B, and S200C below. The fusion sub-network model then takes the feature data output by the first three sub-network models as input, performs feature fusion and analysis, and obtains the final risk assessment result. Specific operations for feature fusion and analysis are described in step S300 below.
[0033] S200A: Process the phosphorescent images based on the phosphorescent sub-network model to obtain the oxygen-phosphorescence response feature vector.
[0034] In some possible implementations, as shown in Figure 4, the phosphorescent sub-network model includes a first input preprocessing layer, a lightweight ResNet18 network, and a first feature extraction layer connected in sequence. The first feature extraction layer includes a first extraction layer and a second extraction layer connected in sequence. The first extraction layer includes a first linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence. The second extraction layer includes a second linear layer and a first regularized dropout layer connected in sequence. The number of input channels of each of the first and second linear layers is twice its number of output channels. The second linear layer outputs 128-dimensional oxygen-phosphorescence response feature information. During model training, the first regularized dropout layer sets the oxygen-phosphorescence response feature information output by some channels of the second linear layer to 0 based on a preset random deactivation probability, thus obtaining the oxygen-phosphorescence response feature vector.
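The behavior of the regularized dropout layer can be sketched as below. The text only says that some channel outputs are set to 0 during training; rescaling the surviving values by 1/(1-p), as in the common "inverted dropout" formulation, is an assumption added here, as are the function name `dropout` and the probability 0.5.

```python
import random

def dropout(vec, p, training, rng):
    """Zero each element with probability p during training.

    Assumption: survivors are scaled by 1 / (1 - p) (inverted dropout) so the
    expected activation is unchanged; at inference the input passes through.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(vec)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in vec]

rng = random.Random(0)                       # fixed seed for repeatability
features = [1.0] * 128                       # stand-in for the 128-dim output
train_out = dropout(features, p=0.5, training=True, rng=rng)
eval_out = dropout(features, p=0.5, training=False, rng=rng)
```

In training mode every element of `train_out` is either 0.0 or 2.0; in evaluation mode the vector is returned unchanged, which is why the layer only regularizes during training.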
[0035] In the embodiment illustrated in Figure 4, the core value of phosphorescent images lies in reflecting the oxygen concentration at the affected area (luminescence intensity is positively correlated with oxygen content). It is necessary to extract oxygen-related features such as luminescence distribution, brightness gradient, and local brightness differences, while avoiding irrelevant noise (such as ambient light interference). Therefore, a pre-trained lightweight ResNet18 network was selected. This network model is suited to medical edge devices and offers fast inference and fewer parameters. Compared with more complex network models, such as ResNet50, it can avoid the problem of overfitting during training. However, in practical applications, the lightweight ResNet18 network needs to be improved. Specifically, since phosphorescent images are single-channel images with a single wavelength, a first input preprocessing layer needs to be added, or the first convolutional layer of the lightweight ResNet18 network needs to be modified to convert the single-channel input into a three-channel input to adapt to the input of the lightweight ResNet18 network. The size of the convolutional kernel and the stride can remain unchanged to ensure the capture of luminescence details. In addition, a structure of batch normalization → ReLU activation → linear layer → dropout (i.e., the first regularized dropout layer) is appended after the lightweight ResNet18 network. The added batch normalization layer can eliminate the feature shift caused by phosphorescence brightness fluctuations, and the first regularized dropout layer can prevent overfitting caused by the small number of phosphorescence image samples. Furthermore, the second linear layer can adjust the dimension of the output oxygen-phosphorescence response feature vector to match the dimension of the feature vectors output by other sub-models.
[0036] For example, as shown in Figure 5, the complete phosphorescent sub-network model can be composed, in sequence, of a first input preprocessing layer, an adaptation convolutional layer, a max pooling layer, a first residual block group (ResBlock), a second residual block group, a third residual block group, a fourth residual block group, a global average pooling layer, a first extraction layer, and a second extraction layer. The following explanation uses an input phosphorescent image with dimensions of 224×224×1 as an example: The first input preprocessing layer takes a 224×224×1 phosphorescent image as input. This layer normalizes the image, mapping pixel values from [0,255] to [0,1], and applies Gaussian filtering with a 3×3 kernel and a filtering parameter σ=0.5. In this embodiment, because phosphorescent images are easily affected by ambient light (such as indoor lighting) and noise from the device's photosensitive element, small-kernel Gaussian filtering can smooth high-frequency noise (such as pixel jumps) while preserving the "local brightness gradient" (the boundary between hypoxic and normal areas). Conversely, a large kernel (such as 5×5) would blur the boundaries and lose crucial oxygen difference information. Furthermore, the absolute luminescence values of phosphorescent images from different patients may vary greatly because of different device parameters such as exposure time (e.g., patient A's image has a maximum brightness of 200, while patient B's is 150). Normalization unifies the brightness scale and prevents the model from mistaking "device differences" for "oxygen concentration differences."
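The preprocessing described above (scaling to [0,1], then a 3×3 Gaussian filter with σ=0.5) can be sketched in plain Python. The border handling is not specified in the text; this sketch assumes edge replication. The function names and the tiny test image are hypothetical.

```python
import math

def gaussian_kernel(size=3, sigma=0.5):
    """Build a normalized size x size Gaussian kernel (weights sum to 1)."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def normalize(image):
    """Map 8-bit pixel values from [0, 255] to [0, 1]."""
    return [[p / 255.0 for p in row] for row in image]

def smooth(image, kernel):
    """Filter the image, replicating edge pixels at the border (assumed)."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    rr = min(max(r + dy, 0), h - 1)
                    cc = min(max(c + dx, 0), w - 1)
                    acc += kernel[dy + half][dx + half] * image[rr][cc]
            out[r][c] = acc
    return out

kernel = gaussian_kernel()                       # center weight is about 0.62
image = [[0, 0, 255, 255]] * 4                   # toy image with a sharp edge
filtered = smooth(normalize(image), kernel)
```

With σ=0.5 most of the weight stays on the center pixel, so the sharp edge in the toy image is only slightly softened, which matches the stated goal of suppressing pixel noise while keeping brightness boundaries.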
[0037] The adapted convolutional layer includes one convolutional layer, one batch normalization layer, and a ReLU activation layer. The convolutional layer has 1 input channel, 64 output channels, a 7×7 kernel, a stride of 2, padding of 3, and no bias (bias=False). The batch normalization layer has a momentum of 0.9 and an epsilon of 0.00001. In this embodiment, although the phosphorescent image is single-channel, multi-channel convolution is required to extract features of different kinds, such as "horizontal brightness gradient," "vertical emission distribution," and "local brightness peaks." Therefore, the single input channel is mapped to 64 output channels, so that the 64 convolutional kernels can cover the core feature types related to oxygen. A 7×7 convolutional kernel is used together with the stride and padding settings because the 7×7 kernel captures brightness correlations over a wider range. For example, a 3×3 kernel sees only local pixels, whereas a 7×7 kernel can cover the overall luminescence of a small hypoxic region. The stride of 2 reduces the spatial size from 224 to 112, which reduces the amount of subsequent computation. The padding is set to 3 so that edge pixels are not lost, because in practical applications ulcers often occur at the edges of the foot and edge features must be preserved. The bias term is omitted (bias=False) because the batch normalization layer already includes mean adjustment; removing the bias reduces parameter redundancy and lowers the risk of overfitting. Finally, the adapted convolutional layer outputs a feature of size 112×112×64 to the max pooling layer: the spatial size is halved and the number of channels is expanded to 64.
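As a rough check of the parameter savings discussed above, the weight count of a 2D convolution can be computed directly. The helper name below is hypothetical, and the three-channel figure is included only as a comparison with a standard RGB input layer.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias):
    """Count the learnable parameters of a square-kernel 2D convolution."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

adapted = conv2d_param_count(1, 64, 7, bias=False)    # single-channel input, no bias
with_bias = conv2d_param_count(1, 64, 7, bias=True)   # same layer with a bias term
rgb_stem = conv2d_param_count(3, 64, 7, bias=False)   # three-channel input, for comparison

print(adapted, with_bias, rgb_stem)
```

Omitting the bias removes one parameter per output channel, and accepting a single input channel needs one third of the weights of a three-channel 7×7 layer.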
[0038] The max pooling layer has a 3×3 pooling kernel, a stride of 2, and a padding of 1. It can output features of size 56×56×64, further compressing the size while retaining key features.
[0039] The first residual block group comprises two residual blocks, each consisting of the following sequentially connected structures: ① A convolutional layer, a batch normalization layer, and a ReLU activation layer. The convolutional layer maps 64 input channels to 64 output channels, with a 3×3 kernel, a stride of 1, and a padding of 1. ② A convolutional layer combined with a batch normalization layer, also mapping 64 input channels to 64 output channels, with a 3×3 kernel, a stride of 1, and a padding of 1. ③ An identity mapping layer combined with a ReLU activation layer, where the input of the identity mapping layer is added directly to the output of ②. The final output is a feature of size 56×56×64: the channel dimension remains unchanged, but feature reuse is enhanced.
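The shortcut addition in step ③ can be illustrated with a toy sketch. Here short vectors stand in for feature maps and a small matrix multiplication stands in for a convolution; these stand-ins and all names are assumptions made for illustration only.

```python
def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]

def linear(values, weight):
    """Multiply a vector by a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, values)) for row in weight]

def residual_block(x, branch, shortcut=None):
    """Return ReLU(branch(x) + shortcut(x)); the shortcut defaults to identity mapping."""
    skip = x if shortcut is None else shortcut(x)
    return relu([a + b for a, b in zip(branch(x), skip)])

# Identity shortcut: even if the branch contributes nothing, the input passes through.
x = [0.2, 0.0, 0.7]
passthrough = residual_block(x, branch=lambda v: [0.0] * len(v))

# Projection shortcut: a matrix changes the dimension so that the addition is valid.
projection = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
projected = residual_block(
    [1.0, -1.0],
    branch=lambda v: [0.5, 0.5, 0.5],
    shortcut=lambda v: linear(v, projection),
)
```

The first call shows why shallow features are not lost: when the convolutional branch outputs zeros, the block simply returns its (non-negative) input.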
[0040] The second residual block group comprises two residual blocks, each consisting of the following sequentially connected structures: ① A convolutional layer, a batch normalization layer, and a ReLU activation layer. The convolutional layer maps 64 input channels to 128 output channels, with a 3×3 kernel, a stride of 2, and a padding of 1. ② A convolutional layer combined with a batch normalization layer, mapping 128 input channels to 128 output channels, with a 3×3 kernel, a stride of 1, and a padding of 1. ③ A shortcut connection combined with a ReLU activation layer, in which the block input, passed through identity mapping or a 1×1 convolution, is added directly to the output of ②. The final output is a feature of size 28×28×128: the channel dimension is doubled and the spatial size is halved, which raises the level of feature abstraction.
[0041] The third residual block group comprises two residual blocks, each consisting of the following sequentially connected structures: ① A convolutional layer, a batch normalization layer, and a ReLU activation layer. The convolutional layer maps 128 input channels to 256 output channels, with a 3×3 kernel, a stride of 2, and a padding of 1. ② A convolutional layer combined with a batch normalization layer, mapping 256 input channels to 256 output channels, with a 3×3 kernel, a stride of 1, and a padding of 1. ③ A shortcut connection combined with a ReLU activation layer, in which the block input, passed through identity mapping or a 1×1 convolution, is added directly to the output of ②. The final output is a feature of size 14×14×256: the channel dimension is doubled again, the spatial size is halved, and the focus shifts to the global luminescence distribution.
[0042] The fourth residual block group comprises two residual blocks, each consisting of the following sequentially connected structures: ① A convolutional layer, a batch normalization layer, and a ReLU activation layer. The convolutional layer maps 256 input channels to 512 output channels, with a 3×3 kernel, a stride of 2, and a padding of 1. ② A convolutional layer combined with a batch normalization layer, mapping 512 input channels to 512 output channels, with a 3×3 kernel, a stride of 1, and a padding of 1. ③ A shortcut connection combined with a ReLU activation layer, in which the block input, passed through identity mapping or a 1×1 convolution, is added directly to the output of ②. The final output is a 7×7×512 feature map: the channel dimension is doubled again and the spatial size is halved, yielding a final feature map that condenses the key oxygen-related information. In this embodiment, the four residual block groups are designed to address the vanishing gradient problem in deep networks and to enhance the reuse of oxygen features. Phosphorescent image feature extraction uses 4 groups of residual blocks (8 blocks in total). If ordinary convolutional layers were used, vanishing gradients in the deep network would prevent shallow brightness features from being transmitted to higher layers. Identity mapping or 1×1 convolution, however, transmits shallow features directly to higher layers, ensuring that local luminescence details (such as tiny hypoxic points) are not obscured. In addition, the last three residual block groups follow a channel-doubling scheme (64→128→256→512). As the network deepens, the level of feature abstraction gradually increases.
The shallow layers (64 channels) capture pixel-level brightness, the middle layers (128→256 channels) capture the luminescence distribution of local areas, and the deep layers (512 channels) capture the global oxygen concentration pattern (such as the proportion of hypoxic area in the entire ulcer region). Doubling the channels accommodates more complex combinations of oxygen features. Finally, a stride of 2 is used only in the first convolutional layer of the second to fourth residual block groups, so that the spatial size is halved at each downsampling step (224→112→56→28→14→7). This both reduces the computational cost and ensures that the features do not overlap after each compression step (e.g., 56×56→28×28, which just covers the typical size of foot ulcers (1-5 cm), avoiding feature redundancy).
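The size sequence 224→112→56→28→14→7 follows from the usual output-size formula for convolution and pooling, floor((n + 2p - k) / s) + 1. The sketch below traces it using the kernel, stride, and padding values stated in this description; the table layout and names are hypothetical, and only the strided first layer of each group is listed, because the remaining 3×3, stride 1, padding 1 layers preserve the size.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution or pooling window along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding, output channels)
LAYERS = [
    ("adapted conv 7x7", 7, 2, 3, 64),
    ("max pool 3x3", 3, 2, 1, 64),
    ("residual group 1", 3, 1, 1, 64),
    ("residual group 2", 3, 2, 1, 128),
    ("residual group 3", 3, 2, 1, 256),
    ("residual group 4", 3, 2, 1, 512),
]

def trace(input_size=224):
    """Return (name, height, width, channels) after each listed layer."""
    size, shapes = input_size, []
    for name, kernel, stride, padding, channels in LAYERS:
        size = conv_out(size, kernel, stride, padding)
        shapes.append((name, size, size, channels))
    return shapes

for row in trace():
    print(row)
```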
[0043] The global average pooling layer, with a 7×7 pooling kernel and a stride of 1, outputs a feature of size 1×1×512, compressing each 2D feature map into a single value and the whole input into a 512-dimensional vector without adding parameters. In this embodiment, the feature map entering this layer has size 7×7×512, and global average pooling averages the 7×7 values of each channel to obtain a 1×1×512 vector. By comparison, flattening the 7×7×512 feature map and feeding it to a fully connected layer with 4096 outputs would require 7×7×512×4096 ≈ 103 million weights. The global average pooling layer in the improved model has no parameters, which significantly reduces the risk of overfitting and avoids interference from local pixel anomalies (such as single noise points) on the features.
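A minimal sketch of global average pooling, together with the weight count of the fully connected alternative mentioned above, is given here. The function name and the toy two-channel input are illustrative assumptions.

```python
def global_average_pool(feature_map):
    """feature_map is a list of channels, each a 2D list; return one mean per channel."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in feature_map]

# Toy input with 2 channels of size 2x2 (the real feature map would be 512 channels of 7x7).
pooled = global_average_pool([[[1, 2], [3, 4]], [[0, 0], [0, 8]]])

# Weights needed if the flattened 7x7x512 map fed a 4096-unit fully connected layer.
fc_weights = 7 * 7 * 512 * 4096

print(pooled, fc_weights)
```

Because each output is an average over all spatial positions, a single outlying pixel (the value 8 in the second toy channel) is diluted instead of dominating the feature.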
[0044] The first extraction layer comprises a first linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence. The first linear layer takes 512-dimensional information as input and outputs 256-dimensional information. The batch normalization layer has a momentum of 0.9 and a small constant (epsilon) of 0.00001. The layer outputs 256-dimensional feature information, initially compressing the features and enhancing effective information. In this embodiment, the first linear layer reduces the features from 512 to 256 dimensions rather than directly to the final 128 dimensions. Keeping 256 dimensions at this stage provides more feature selection space for the ReLU activation layer: the ReLU activation layer sets negative values to 0, so a larger intermediate dimension retains more effective features.
[0045] The second extraction layer comprises a second linear layer and a first regularized dropout layer connected in sequence. The second linear layer takes 256-dimensional information as input and outputs 128-dimensional information. The probability of the first regularized dropout layer is 0.3, randomly deactivating some neuronal features of the channels output by the second linear layer based on a probability of 0.3, which can avoid overfitting during the model training phase. In subsequent practical applications, the probability can be set to 0. The final output is a 128-dimensional oxygen-phosphorescence response feature vector. In practical applications, this embodiment may face the problem of few phosphorescence image samples and a high risk of overfitting. The 0.3 deactivation probability can randomly shut down 30% of neurons, forcing the model to not rely on a single brightness feature, such as not treating a fixed brightness value as a sign of hypoxia, thus improving generalization. In addition, the final output feature dimension is 128-dimensional, which is consistent with the dimension of the ulcer pathology feature vector described later, reducing the dimension adaptation cost for multimodal fusion. For example, if the phosphorescence feature is 128-dimensional and the skin feature is 256-dimensional, an additional linear layer is needed for adjustment during fusion, increasing parameter redundancy.
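The behavior of the first regularized dropout layer (probability 0.3 during training, disabled in practical use) can be sketched as follows. Rescaling the surviving features by 1/(1-p), as common deep learning frameworks do, is an assumption here; the description only states that some features are set to 0.

```python
import random

def dropout(features, p, training, rng):
    """Zero each feature with probability p during training and rescale the survivors."""
    if not training or p == 0.0:
        return list(features)  # inference: features pass through unchanged
    keep = 1.0 - p
    return [x / keep if rng.random() >= p else 0.0 for x in features]

rng = random.Random(0)          # fixed seed so the sketch is reproducible
feats = [1.0] * 10000
out = dropout(feats, 0.3, True, rng)
zero_fraction = out.count(0.0) / len(out)   # close to 0.3 for a large vector
unchanged = dropout(feats[:5], 0.3, False, rng)
```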
[0046] S200B processes ulcer images based on a wound sub-network model to obtain ulcer pathological feature vectors.
[0047] In some possible implementations, as shown in Figure 6, the wound sub-network model is an improved version of the DenseNet121 network. Specifically, the improvements are as follows: the classifier at the output stage of the DenseNet121 network is removed, a second input preprocessing layer is added before the input of the DenseNet121 network, and a second feature extraction layer is added after the output of the DenseNet121 network. The second input preprocessing layer is used to adapt the image size of the input ulcer wound image to 224×224. The second feature extraction layer includes a third extraction layer and a fourth extraction layer connected in sequence. The third extraction layer comprises a third linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence; the fourth extraction layer comprises a fourth linear layer and a second regularized dropout layer connected in sequence. The third linear layer is used to reduce the input 992-dimensional ulcer pathological feature information to 256-dimensional ulcer pathological feature information; the fourth linear layer is used to reduce the input 256-dimensional ulcer pathological feature information to 128-dimensional ulcer pathological feature information. During the model training phase, the second regularized dropout layer sets the ulcer pathological feature information output from some channels of the fourth linear layer to 0 based on a preset random inactivation probability, thus obtaining the ulcer pathological feature vector.
[0048] In the embodiment illustrated in Figure 6 of this application, the ulcer wound image is an RGB three-channel image whose core information is the ulcer morphology (area, edges) and tissue state (granulation tissue / necrotic tissue / exudate). To extract these features effectively, three major problems need to be addressed: ① distinguishing pathological features (such as blurred ulcer edges) from irrelevant features (such as skin texture and pigmentation); ② capturing both local details (such as tiny necrotic spots) and global morphology (such as the overall ulcer area); ③ adapting to skin color differences among patients (such as black, white, or yellow skin tones). Therefore, the model needs to prioritize medically recognized pathological features over general image features (such as texture and color details unrelated to pathology). This embodiment uses a pre-trained DenseNet121. Compared with ResNet, DenseNet is better at capturing local details through "feature reuse," such as the blurriness of ulcer edges and the boundary between exudate and normal skin. The pre-trained weights are based on ImageNet, which ensures basic feature extraction capability, and are then fine-tuned using clinical data. The original DenseNet121 classifier (1000 ImageNet classes) is replaced with the second feature extraction layer, consisting of the third linear layer → batch normalization → ReLU → fourth linear layer → second regularized dropout layer. The third and fourth linear layers compress the features output by the base network to 256 and then 128 dimensions, so that the output aligns with the oxygen-phosphorescence response feature vector. Batch normalization and the second regularized dropout layer reduce interference from differences in skin color and shooting lighting among patients, improving generalization. The final output is a 128-dimensional ulcer pathological feature vector containing pathological morphological features (such as ulcer edge irregularity and necrotic tissue proportion).
[0049] For example, the dropout probability of the second regularized dropout layer is 0.3.
[0050] S200C extracts physiological parameters from sensor feature data based on the sensor sub-network model to obtain physiological feature vectors.
[0051] In some possible implementations, as shown in Figure 7, the sensor sub-network model includes a third input preprocessing layer, a first hidden layer, a third regularized dropout layer, a second hidden layer, and an output layer connected in sequence. The third input preprocessing layer is used to perform statistical standardization on the input sensor feature data. It should be noted that the "standard" in statistical standardization does not refer to an artificially defined standard, but to the standardization process in mathematical probability and statistics. For example, standardization under a normal distribution is (x-μ) / σ, where μ is the mean of the parameter in the normal population, σ is the standard deviation of the normal population, and x is the sample value. The "standard" in this description is therefore clear and explicit.
[0052] For example, the sensor feature data includes three dimensions of physiological feature information: pH value, temperature value, and humidity value. The first hidden layer includes a fifth linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence. The fifth linear layer is used to expand the 3-dimensional physiological feature information to 16-dimensional physiological feature information. During the model training phase, the third regularized dropout layer sets the physiological feature information output by some channels of the fifth linear layer to 0 based on a preset random inactivation probability. The second hidden layer includes a sixth linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence. The sixth linear layer is used to expand the 16-dimensional physiological feature information to 32-dimensional physiological feature information. The output layer is a seventh linear layer without an activation layer. The seventh linear layer is used to process the 32-dimensional physiological feature information and output a 32-dimensional physiological feature vector.
[0053] For example, the third input preprocessing layer can also handle outliers, such as pH < 4.0 or > 8.0, temperature < 30°C or > 42°C, and humidity < 20% or > 100% being considered outliers and replaced with the average value of the same batch.
[0054] For example, the following illustrates how 3-dimensional sensor data is input and a 32-dimensional physiological feature vector is output. First, the third input preprocessing layer takes in the 3-dimensional sensor data and performs outlier handling and statistical standardization. In this embodiment, sensor data may jump because of poor contact (e.g., the pH electrode not adhering to the skin), resulting in values such as pH=2.0 and temperature=45℃. Such outliers can severely interfere with the model. Replacing them with the average of the same batch (e.g., if the normal pH average of a batch is 6.5, replacing the outlier 2.0 with 6.5) prevents the model from learning incorrect features. Standardization addresses differences in scale. For example, pH ranges from 0 to 14 (a span of 14), temperature from 30℃ to 40℃ (a span of 10), and humidity from 0% to 100% (a span of 100). After standardization, all parameters are mapped to a distribution centered at 0 with a standard deviation of 1, so the model will not prioritize humidity and ignore pH (e.g., 5.0) simply because the humidity value is numerically large (e.g., 80%). The mean parameters (μ) of the normal population are pH = 6.2, temperature = 37.0℃, and humidity = 50% (normal parameters of the feet of diabetic patients based on clinical literature statistics). The standard deviations (σ) of the parameters in the normal population are pH = 0.3, temperature = 0.5℃, and humidity = 8%. After standardization, abnormal parameters show obvious positive or negative deviations. For example, during infection, a pH of 5.6 against the mean of 6.2 gives (5.6-6.2) / 0.3 = -2.0, and a temperature of 38.0℃ against the mean of 37.0℃ gives (38.0-37.0) / 0.5 = 2.0, which facilitates model identification.
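The outlier handling and statistical standardization described above can be sketched as follows, using the valid ranges, means, and standard deviations quoted in this description. The dictionary layout and function names are hypothetical, and computing the batch average from the valid readings only is an assumption, since the text does not say whether outliers are included in that average.

```python
# Valid ranges; values outside them are treated as outliers.
VALID_RANGE = {"ph": (4.0, 8.0), "temperature": (30.0, 42.0), "humidity": (20.0, 100.0)}
# (mean, standard deviation) of the normal population for each parameter.
POPULATION = {"ph": (6.2, 0.3), "temperature": (37.0, 0.5), "humidity": (50.0, 8.0)}

def replace_outliers(batch):
    """Replace out-of-range readings with the mean of the valid readings in the batch."""
    cleaned = [dict(sample) for sample in batch]
    for key, (low, high) in VALID_RANGE.items():
        valid = [s[key] for s in batch if low <= s[key] <= high]
        if not valid:
            continue  # assumption: leave the column untouched if no reading is valid
        mean = sum(valid) / len(valid)
        for sample in cleaned:
            if not low <= sample[key] <= high:
                sample[key] = mean
    return cleaned

def standardize(sample):
    """Apply (x - mu) / sigma to each parameter."""
    return {key: (sample[key] - mu) / sigma for key, (mu, sigma) in POPULATION.items()}

batch = [
    {"ph": 5.6, "temperature": 38.0, "humidity": 66.0},   # infected wound
    {"ph": 2.0, "temperature": 45.0, "humidity": 50.0},   # poor electrode contact
    {"ph": 6.2, "temperature": 37.0, "humidity": 50.0},   # normal readings
]
cleaned = replace_outliers(batch)
z = standardize(cleaned[0])
print(z)
```

For the first sample the standardized pH is about -2.0 and the standardized temperature is 2.0, matching the worked values in the paragraph above.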
[0055] Then, the 3-dimensional sensor feature information is processed by the linear layer within the first hidden layer and transformed into 16-dimensional sensor feature information. This is then processed by the batch normalization layer and the ReLU activation layer within the first hidden layer before being output to the third regularized dropout layer. In this embodiment, the sensor data has a low dimension, with only 3 items. Converting directly from 3 to 32 dimensions would be too large a jump and would lose information. Expanding to 16 dimensions first gives the model sufficient space to learn correlated deviations between parameters. For example, the model can learn that a negative pH deviation (infection) + a positive temperature deviation (inflammation) + a positive humidity deviation (excessive exudation) is a typical combination for severe ulcers, and 16 dimensions can accommodate such combined features. Furthermore, there may be differences between batches of sensor data. For example, some batches may consist mostly of patients with mild ulcers and small parameter deviations, while others may consist mostly of patients with severe ulcers and large deviations. The batch normalization layer adjusts the mean within the batch, allowing the model to focus on relative rather than absolute deviations. ReLU activation retains only the positive, valid responses (such as those encoding a positive temperature deviation = inflammation or a negative pH deviation = infection) and sets meaningless ones (such as a small negative temperature deviation = normal fluctuation) to 0, thus reducing interference.
[0056] Next, during the training phase, the third regularized dropout layer randomly deactivates the input 16-dimensional sensor feature information with a deactivation probability of 0.2, and then outputs the processed sensor feature information to the second hidden layer. In this embodiment, although the sensor data is structured, it contains slight noise (such as temperature fluctuations between 37.0℃ and 37.2℃). The deactivation probability of 0.2 randomly shuts down 20% of neurons, preventing the model from treating noise fluctuations as pathological deviations. For example, the model will not conclude that the foot is inflamed because of a normal temperature fluctuation to 37.2℃, but will focus on cases where the temperature is consistently >37.5℃.
[0057] Then, the second hidden layer maps the 16-dimensional input to a 32-dimensional output, applies the batch normalization layer and the ReLU activation layer in sequence, and outputs the resulting 32-dimensional sensor feature information to the output layer. In this embodiment, the step from 16 to 32 dimensions further expands the representation so that more complex multi-parameter combined deviations can be learned. For example, a standardized pH of -1.5 (moderate infection) + a standardized temperature of 1.0 (mild inflammation) + a standardized humidity of 1.5 (moderate exudation) corresponds to a "moderate ulcer," and 32 dimensions can encode this type of combination more accurately. At the same time, the batch normalization layer combined with the ReLU activation layer further filters effective features, so that the output 32-dimensional features are deviation combinations related to the disease condition, without redundant information.
[0058] Finally, the output layer outputs a 32-dimensional physiological feature vector. In this embodiment, the output layer has no activation function. The intensity of sensor features (such as the absolute value of the deviation) is directly related to the severity of the condition (e.g., a pH deviation of -2.0 is more severe than -1.0). Using a linear layer without activation preserves this intensity information. Conversely, if ReLU is used, negative deviations (such as negative pH deviation) will be set to 0, resulting in the loss of infection-related features. Furthermore, the 32-dimensional output has a lower dimension than the oxygen-phosphorescence response feature vector and the ulcer pathology feature vector, which aligns with the objective fact that sensor data contains less information than image data. This avoids an excessively high proportion of low-information features during fusion. For example, 32 dimensions allow sensor features to account for 11% of the total fused features (128+128+32=288), ensuring they are neither ignored nor overshadowed.
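The two arguments above, that ReLU at the output would erase negative deviations and that the sensor features make up roughly 11% of the fused vector, can be checked with a short sketch. The dictionary keys and variable names are hypothetical.

```python
def relu(x):
    """Scalar ReLU."""
    return max(0.0, x)

# A standardized pH deviation of -2.0 (infection) would be erased by ReLU,
# whereas an output layer without activation can pass it through unchanged.
ph_deviation = -2.0
after_relu = relu(ph_deviation)

# Share of each modality in the concatenated feature vector.
FEATURE_DIMS = {"phosphorescence": 128, "wound": 128, "sensor": 32}
total_dims = sum(FEATURE_DIMS.values())
shares = {name: dims / total_dims for name, dims in FEATURE_DIMS.items()}

print(after_relu, total_dims, round(shares["sensor"] * 100, 1))
```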
[0059] S300: Based on the fusion sub-network model, the oxygen-phosphorescence response feature vector, the ulcer pathological feature vector, and the physiological feature vector are weighted and concatenated to obtain the fused feature vector. The fused feature vector is then used for identification calculation to obtain the risk assessment result of diabetic foot ulcer.
[0060] For example, the value of the diabetic foot ulcer risk assessment result is used to indicate the patient's foot ulcer risk assessment level.
[0061] In some possible implementations, as shown in Figure 8, the fusion sub-network model includes a fourth input preprocessing layer, a multi-level cross-fusion layer, a dynamic attention weighting layer, and a decision output layer connected in sequence. The multi-level cross-fusion layer includes three levels of fusion layers connected in sequence. The first-level fusion layer includes a phosphorescence-pathology fusion layer and an image-sensing fusion layer. The phosphorescence-pathology fusion layer includes a matrix multiplication attention layer and a first weighted fusion layer. The matrix multiplication attention layer is used to cross-multiply the oxygen-phosphorescence response feature vector and the ulcer pathology feature vector to obtain an attention vector matrix. The first weighted fusion layer is used to cross-multiply the physiological feature vector with the attention vector matrix and then add the ulcer pathology feature vector to obtain cross-modal fusion features. The second-level fusion layer includes a feature splitting layer, a sub-feature interaction layer, and a recombination and aggregation layer connected in sequence. The feature splitting layer is used to split the cross-modal fusion features into multiple sub-fusion features. The sub-feature interaction layer is used to perform element-wise multiplication of each sub-fusion feature with the other sub-fusion features to obtain multiple multiplicative fusion features. The recombination and aggregation layer is used to merge and concatenate the multiple multiplicative fusion features to obtain recombined fusion features. The third-level fusion layer consists of a multi-head self-attention layer, a residual connection layer, and a layer normalization layer connected in sequence. The multidimensional features of the recombined fusion features are evenly distributed to the heads of the multi-head self-attention layer for processing.
The residual connection layer is used to add the output features of the multi-head self-attention layer to the recombined fusion features to obtain the global fusion features.
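The second-level fusion layer described above (split, element-wise interaction, recombination) can be sketched on a toy vector. The number of groups, the use of unordered pairs of sub-features, and all function names are assumptions made for illustration; the description does not fix these details.

```python
from itertools import combinations

def split_features(vector, num_groups):
    """Split a feature vector into equally sized sub-fusion features."""
    size = len(vector) // num_groups
    if size * num_groups != len(vector):
        raise ValueError("vector length must be divisible by num_groups")
    return [vector[i * size:(i + 1) * size] for i in range(num_groups)]

def pairwise_products(groups):
    """Element-wise product of every unordered pair of sub-fusion features."""
    return [[a * b for a, b in zip(g1, g2)] for g1, g2 in combinations(groups, 2)]

def recombine(products):
    """Concatenate the multiplicative fusion features into one vector."""
    return [value for product in products for value in product]

groups = split_features([1, 2, 3, 4, 5, 6], num_groups=3)   # [[1, 2], [3, 4], [5, 6]]
recombined = recombine(pairwise_products(groups))
print(recombined)
```

With three groups there are three unordered pairs, so the recombined vector in this toy case has the same length as the input; other group counts would change the output length.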
[0062] In the embodiment illustrated in Figure 8 of this application, the weights of different modalities / features are dynamically adjusted through multi-level interaction and attention mechanisms. A three-level progressive fusion strategy is adopted to gradually deepen the information interaction between modalities, addressing the problem that the importance of each modality varies under different conditions. The fusion assessment module can effectively integrate multi-source heterogeneous data, preserving the unique information of each modality while mining the correlation patterns between them, and thus provides accurate and interpretable decision support for the assessment of diabetic foot ulcer conditions.
[0063] For example, the dynamic attention weighting layer is used to generate channel weights based on channel attention, generate spatial weights based on spatial attention, and perform weighted operations on the global fusion features based on the channel weights and spatial weights to obtain attention features.
[0064] For example, as shown in Figure 9, the decision output layer comprises a multi-branch feature network and an ensemble decision network. The multi-branch feature network includes a fine-grained feature branch, a global feature branch, and a key feature branch. The outputs of the fine-grained feature branch, the global feature branch, and the key feature branch are connected to the input of the ensemble decision network. Specifically, the fine-grained feature branch consists of a sequentially connected eighth linear layer, a batch normalization layer, a LeakyReLU activation layer, a fourth regularized dropout layer, a ninth linear layer, a batch normalization layer, and a LeakyReLU activation layer; the input feature dimensions of the eighth and ninth linear layers are twice the corresponding output feature dimensions. The global feature branch consists of a sequentially connected global average pooling layer, a tenth linear layer, and a ReLU activation layer; the input feature dimension of the tenth linear layer is eight times the corresponding output feature dimension. The key feature branch consists of a sequentially connected attention feature selection layer and an eleventh linear layer; the attention feature selection layer is used to select the top k dimensions of features from the attention features based on feature-level attention; the input feature dimension of the eleventh linear layer is twice the corresponding output feature dimension. The ensemble decision network consists of a feature stitching layer, a classification head layer, and a result output layer connected in sequence. The feature stitching layer is used to concatenate the outputs of the fine-grained feature branch, the global feature branch, and the key feature branch to obtain multidimensional integrated features.
The classification head layer is used to perform classification operations on the multidimensional integrated features to obtain the disease severity distribution value. The result output layer is used to output the risk assessment result of diabetic foot ulcer based on the value of the disease severity distribution value and the contribution weight, which is a preset weight value for phosphorescent images, ulcer wound images, and sensor feature data.
[0065] In the embodiment illustrated in Figure 9 of this application, a multi-branch network is used to capture decision information at different granularities. Specifically, the fine-grained feature branch includes, connected in sequence, a linear layer (512-dimensional input to 256-dimensional output), batch normalization (BN), LeakyReLU, dropout (with a dropout probability of 0.4), a linear layer (256-dimensional input to 128-dimensional output), BN, and a LeakyReLU activation layer. Its function is to capture fine-grained feature differences, which suits distinguishing similar conditions. The global feature branch includes, connected in sequence, a global average pooling layer (512-dimensional input to 512-dimensional output), a linear layer (512-dimensional input to 64-dimensional output), and a ReLU activation layer. Its function is to focus on the global feature distribution, which suits judging the overall trend of the condition. The key feature branch includes, connected in sequence, a top-k feature selection network (k=64) based on feature-level attention and a linear layer (64-dimensional input to 32-dimensional output). Its function is to focus on the most critical feature subset, improving decision robustness. Finally, in the ensemble decision network, the branch outputs are concatenated into a 128+64+32=224-dimensional integrated feature, and the prediction result is output by the classification head.
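The top-k selection in the key feature branch and the size of the concatenated feature can be sketched as follows. How the attention scores are produced is not specified here, so they are simply given as inputs; keeping the selected features in their original order is likewise an assumption, and the names are hypothetical.

```python
def select_top_k(features, scores, k):
    """Keep the k features with the highest attention scores, in their original order."""
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)[:k]
    return [features[i] for i in sorted(ranked)]

# Toy example with 4 features and k=2 (the embodiment uses k=64).
selected = select_top_k([10, 20, 30, 40], [0.1, 0.9, 0.3, 0.8], k=2)

# Output dimensions of the three branches before concatenation.
BRANCH_OUTPUT_DIMS = {"fine_grained": 128, "global": 64, "key": 32}
integrated_dims = sum(BRANCH_OUTPUT_DIMS.values())

print(selected, integrated_dims)
```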
[0066] The processor involved in the embodiments of this application can be a chip. For example, it can be a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a system-on-chip (SoC), a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or another integrated chip.
[0067] The memory involved in the embodiments of this application can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous linked dynamic random access memory (SLDRAM), and direct rambus RAM (DRRAM). It should be noted that the memory used in the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
[0068] It should be understood that in the various embodiments of this application, the order of the above-mentioned processes does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
[0069] Those skilled in the art will recognize that the modules and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0070] Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the systems, devices, and modules described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
[0071] In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division of modules is only a logical functional division, and other divisions are possible in actual implementation. For instance, multiple modules or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
[0072] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical modules; that is, they may be located on one device or distributed across multiple devices. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.
[0073] In addition, the functional modules in the various embodiments of this application can be integrated into one device, or each module can exist physically separately, or two or more modules can be integrated into one device.
[0074] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented using software programs, implementation can be, in whole or in part, in the form of a computer program product. This computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available media can be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid-state drives (SSDs)).
[0075] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method for assessing the risk of diabetic foot ulcers based on visual nerves, characterized in that the method is used for risk assessment of a patient's foot ulcer, wherein the affected area of the foot ulcer is covered with a nano-dressing, the nano-dressing comprising a hydrogel dressing and a nano-sensing dressing, wherein a nano-probe structure is covalently linked to the hydrogel dressing, the nano-probe structure being excited to emit different degrees of phosphorescence under different oxygen concentrations, and the nano-sensing dressing being used to collect sensor feature data of the affected area of the foot ulcer. The method includes: acquiring a phosphorescence image, an ulcer wound image, and sensor feature data of the patient's foot ulcer; processing the phosphorescence image based on a phosphorescence network model to obtain an oxygen-phosphorescence response feature vector; processing the ulcer wound image based on a wound sub-network model to obtain an ulcer pathological feature vector; extracting physiological parameters from the sensor feature data based on a sensor sub-network model to obtain a physiological feature vector; performing weighted fusion of the oxygen-phosphorescence response feature vector, the ulcer pathological feature vector, and the physiological feature vector based on a fusion sub-network model to obtain a fused feature vector; and performing a recognition operation on the fused feature vector to obtain a recognition result, wherein the value of the recognition result indicates the patient's foot ulcer risk assessment level.
2. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to claim 1, characterized in that the phosphorescence network model comprises a first input preprocessing layer, a lightweight ResNet18 network, and a first feature extraction layer connected in sequence; the first feature extraction layer comprises a first extraction layer and a second extraction layer connected in sequence; the first extraction layer comprises a first linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence, and the second extraction layer comprises a second linear layer and a first regularized dropout layer connected in sequence; wherein the number of input channels of each of the first linear layer and the second linear layer is twice the number of its corresponding output channels. The second linear layer is used to output 128-dimensional oxygen-phosphorescence response feature information. During the model training phase, the first regularized dropout layer sets the oxygen-phosphorescence response feature information output by some channels of the second linear layer to 0 based on a preset random inactivation probability, thereby obtaining the oxygen-phosphorescence response feature vector.
3. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to claim 1, characterized in that the wound sub-network model is an improved version of the DenseNet121 network. Specifically, the improvements include: removing the classifier from the output stage of the DenseNet121 network; adding a second input preprocessing layer before the input of the DenseNet121 network; and adding a second feature extraction layer after the output of the DenseNet121 network. The second input preprocessing layer is used to adapt the image size of the input ulcer wound image to 224×224. The second feature extraction layer includes a third extraction layer and a fourth extraction layer connected in sequence. The third extraction layer comprises a third linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence; the fourth extraction layer comprises a fourth linear layer and a second regularized dropout layer connected in sequence; the third linear layer is used to reduce the input 992-dimensional ulcer pathological feature information to 256-dimensional ulcer pathological feature information; the fourth linear layer is used to reduce the input 256-dimensional ulcer pathological feature information to 128-dimensional ulcer pathological feature information. During the model training phase, the second regularized dropout layer sets the ulcer pathological feature information output by some channels of the fourth linear layer to 0 based on a preset random inactivation probability, thereby obtaining the ulcer pathological feature vector.
4. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to claim 1, characterized in that the sensor sub-network model includes a third input preprocessing layer, a first hidden layer, a third regularized dropout layer, a second hidden layer, and an output layer connected in sequence. The third input preprocessing layer is used to perform statistical standardization on the input sensor feature data.
5. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to claim 4, characterized in that the sensor feature data includes physiological feature information in three dimensions: pH value, temperature value, and humidity value. The first hidden layer includes a fifth linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence. The fifth linear layer is used to expand the 3-dimensional physiological feature information to 16-dimensional physiological feature information. During the model training phase, the third regularized dropout layer sets the physiological feature information output by some channels of the fifth linear layer to 0 based on a preset random inactivation probability. The second hidden layer includes a sixth linear layer, a batch normalization layer, and a ReLU activation layer connected in sequence; the sixth linear layer is used to expand the 16-dimensional physiological feature information to 32-dimensional physiological feature information. The output layer is a seventh linear layer that does not include an activation layer. The seventh linear layer is used to process the 32-dimensional physiological feature information and output the 32-dimensional physiological feature vector.
6. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to claim 5, characterized in that the momentum parameter of the batch normalization layer in the first hidden layer is 0.9, and the small constant (epsilon) added to the variance for numerical stability is set to 0.00001.
7. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to any one of claims 1-6, characterized in that the fusion sub-network model includes a fourth input preprocessing layer, a multi-level cross-fusion layer, a dynamic attention weighting layer, and a decision output layer connected in sequence. The multi-level cross-fusion layer includes a first-level fusion layer, a second-level fusion layer, and a third-level fusion layer connected in sequence. The first-level fusion layer includes a phosphorescence-pathology fusion layer and an image-sensing fusion layer. The phosphorescence-pathology fusion layer includes, in sequence, a matrix multiplication attention layer and a first weighted fusion layer. The matrix multiplication attention layer is used to cross-multiply the oxygen-phosphorescence response feature vector and the ulcer pathological feature vector to obtain an attention vector matrix. The first weighted fusion layer is used to cross-multiply the physiological feature vector with the attention vector matrix and then add the ulcer pathological feature vector to obtain a cross-modal fusion feature. The second-level fusion layer comprises a feature splitting layer, a sub-feature interaction layer, and a recombination and aggregation layer connected in sequence. The feature splitting layer is used to split the cross-modal fusion feature into multiple sub-fusion features. The sub-feature interaction layer is used to perform element-wise multiplication of each sub-fusion feature with the other sub-fusion features to obtain multiple multiplicative fusion features. The recombination and aggregation layer is used to concatenate the multiple multiplicative fusion features to obtain a recombined fusion feature.
The third-level fusion layer includes a multi-head self-attention layer, a residual connection layer, and a layer normalization layer connected in sequence; the multi-dimensional features of the recombined fusion feature are evenly distributed to each head in the multi-head self-attention layer for processing; the residual connection layer is used to add the output features of the multi-head self-attention layer to the recombined fusion feature to obtain the global fusion feature.
8. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to claim 7, characterized in that the dynamic attention weighting layer is used to generate channel weights based on channel attention, generate spatial weights based on spatial attention, and weight the global fusion feature by the channel weights and the spatial weights to obtain attention features.
9. The method for assessing the risk of diabetic foot ulcers based on visual nerves according to claim 8, characterized in that the decision output layer includes a multi-branch feature network and an ensemble decision network. The multi-branch feature network includes a fine-grained feature branch, a global feature branch, and a key feature branch. The outputs of the fine-grained feature branch, the global feature branch, and the key feature branch are respectively connected to the input of the ensemble decision network. The fine-grained feature branch includes an eighth linear layer, a batch normalization layer, a LeakyReLU activation layer, a fourth regularized dropout layer, a ninth linear layer, a batch normalization layer, and a LeakyReLU activation layer connected in sequence; the input feature dimension of each of the eighth linear layer and the ninth linear layer is twice its corresponding output feature dimension. The global feature branch includes a global average pooling layer, a tenth linear layer, and a ReLU activation layer connected in sequence; the input feature dimension of the tenth linear layer is eight times its corresponding output feature dimension. The key feature branch includes an attention feature selection layer and an eleventh linear layer connected in sequence; the attention feature selection layer is used to select the top k dimensions of features from the attention features based on feature-level attention; the input feature dimension of the eleventh linear layer is twice its corresponding output feature dimension. The ensemble decision network comprises a feature concatenation layer, a classification head layer, and a result output layer connected in sequence. The feature concatenation layer is used to concatenate the outputs of the fine-grained feature branch, the global feature branch, and the key feature branch to obtain multidimensional integrated features.
The classification head layer is used to perform classification operations on the multidimensional integrated features to obtain disease severity distribution values. The result output layer is used to output diabetic foot ulcer risk assessment results based on the disease severity distribution values and contribution weights, wherein the contribution weights are preset weight values related to the phosphorescence image, the ulcer wound image, and the sensor feature data.
10. A visual nerve-based risk assessment system for diabetic foot ulcers, characterized in that the system includes a data acquisition component and a data processing component. The data acquisition component includes a nano-dressing, a first camera, and a second camera. The nano-dressing is applied to the foot ulcer of a patient and comprises a hydrogel dressing and a nano-sensing dressing. The hydrogel dressing has nano-probe structures covalently linked to it, which are excited to emit varying degrees of phosphorescence under different oxygen concentrations. The nano-sensing dressing is used to acquire sensor feature data of the foot ulcer. The first camera is used to capture the phosphorescence features of the foot ulcer, obtaining a single-wavelength phosphorescence image. The second camera is used to capture the wound features of the foot ulcer, obtaining a three-primary-color (RGB) ulcer wound image. The data processing component is used for: acquiring the phosphorescence image, the ulcer wound image, and the sensor feature data of the patient's foot ulcer; processing the phosphorescence image based on a phosphorescence network model to obtain an oxygen-phosphorescence response feature vector; processing the ulcer wound image based on a wound sub-network model to obtain an ulcer pathological feature vector; extracting physiological parameters from the sensor feature data based on a sensor sub-network model to obtain a physiological feature vector; performing weighted fusion of the oxygen-phosphorescence response feature vector, the ulcer pathological feature vector, and the physiological feature vector based on a fusion sub-network model to obtain a fused feature vector; and performing a recognition operation on the fused feature vector to obtain a recognition result, wherein the value of the recognition result indicates the patient's foot ulcer risk assessment level.
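By way of non-limiting illustration, the regularized dropout layers recited in claims 2, 3, and 5 can be sketched in plain Python as follows. The claims state only that some channel outputs are set to 0 with a preset probability during the model training phase. The rescaling of the surviving channels by 1/(1 - p) (so-called inverted dropout) and the function name `dropout` are assumptions made for this sketch, not features recited in the claims.

```python
import random


def dropout(features, p, training, rng=None):
    """Zero each channel with probability p while training.

    Assumption: surviving channels are rescaled by 1 / (1 - p) so that the
    expected value is unchanged; the claims do not state any rescaling.
    Outside the training phase the features are returned unchanged.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(features)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else x * scale for x in features]


# Hypothetical 128-dimensional feature vector, as output by the second
# or fourth linear layer.
vector = [1.0] * 128
train_out = dropout(vector, p=0.5, training=True, rng=random.Random(0))
infer_out = dropout(vector, p=0.5, training=False)
```

Outside the training phase the layer acts as an identity mapping, which is why the claims tie the zeroing behaviour to the model training phase.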
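The batch normalization parameters recited in claim 6 (momentum 0.9, small constant 0.00001) can be illustrated with the following minimal sketch for a single feature channel. Frameworks differ in whether the momentum weights the previous running statistic or the new batch statistic, and the claim does not say which convention applies. The sketch therefore exposes both through a hypothetical `keep_old` flag, and the scale and shift parameters `gamma` and `beta` are likewise illustrative names.

```python
import math

EPS = 1e-5       # the small constant recited in claim 6
MOMENTUM = 0.9   # the momentum recited in claim 6


def batch_norm_train(batch, gamma=1.0, beta=0.0, eps=EPS):
    """Normalize one feature over a batch using the batch mean and the
    biased batch variance, then apply scale (gamma) and shift (beta)."""
    mu = sum(batch) / len(batch)
    var = sum((x - mu) ** 2 for x in batch) / len(batch)
    out = [gamma * (x - mu) / math.sqrt(var + eps) + beta for x in batch]
    return out, mu, var


def update_running(running, batch_stat, momentum=MOMENTUM, keep_old=True):
    """Exponential moving average of a batch statistic.

    keep_old=True  (assumed here): momentum weights the old running value.
    keep_old=False: momentum weights the new batch value.
    """
    if keep_old:
        return momentum * running + (1.0 - momentum) * batch_stat
    return (1.0 - momentum) * running + momentum * batch_stat


normalized, mean_value, variance = batch_norm_train([1.0, 2.0, 3.0])
```

The small constant prevents division by zero when a channel has zero variance within a batch.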
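The second-level fusion layer of claim 7 (feature splitting, sub-feature interaction, recombination and aggregation) can be sketched as below. The claim does not fix the number of sub-fusion features or how "each with the others" is enumerated. This sketch assumes equal-sized chunks and one element-wise product per unordered pair of chunks, and all function names are hypothetical.

```python
from itertools import combinations


def split_feature(vec, parts):
    """Feature splitting layer: cut the vector into equal-sized chunks."""
    if parts <= 0 or len(vec) % parts != 0:
        raise ValueError("vector length must be divisible by parts")
    size = len(vec) // parts
    return [vec[i * size:(i + 1) * size] for i in range(parts)]


def pairwise_products(chunks):
    """Sub-feature interaction layer: element-wise product of every
    unordered pair of chunks (assumed reading of the claim)."""
    return [[a * b for a, b in zip(x, y)] for x, y in combinations(chunks, 2)]


def recombine(products):
    """Recombination and aggregation layer: concatenate the products."""
    return [value for product in products for value in product]


def second_level_fusion(vec, parts=4):
    return recombine(pairwise_products(split_feature(vec, parts)))


small = second_level_fusion([1, 2, 3, 4, 5, 6], parts=3)
large = second_level_fusion([1.0] * 128, parts=4)
```

Under these assumptions, n chunks yield n(n-1)/2 products, so a 128-dimensional input split into 4 chunks of 32 produces a 192-dimensional recombined fusion feature.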
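Two steps of the decision output layer of claim 9 lend themselves to a short sketch: the top-k selection performed by the attention feature selection layer, and the conversion of classification head outputs into a disease severity distribution. The sketch assumes that the classification head emits unnormalized scores that are normalized with a softmax and that the risk level is the index of the largest probability. Neither detail is stated in the claim, and the contribution weights are omitted because the claim does not say how they are combined with the distribution. All names are hypothetical.

```python
import math


def top_k_select(features, attention, k):
    """Keep the k features with the highest attention scores,
    preserving their original order."""
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of features")
    ranked = sorted(range(len(features)),
                    key=lambda i: attention[i], reverse=True)[:k]
    return [features[i] for i in sorted(ranked)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def risk_level(scores):
    """Return (index of the most probable level, severity distribution)."""
    probs = softmax(scores)
    level = max(range(len(probs)), key=probs.__getitem__)
    return level, probs


selected = top_k_select([10, 20, 30, 40], [0.1, 0.9, 0.3, 0.8], k=2)
level, distribution = risk_level([0.0, 2.0, 1.0])
```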