A medical feature selection method, apparatus, device, and medium

By introducing an archive-driven population evolution strategy and a directional selection mechanism for reference feature subsets into the Aurora Optimization Algorithm, the problem of insufficient stability of the Aurora Optimization Algorithm in medical feature selection is solved, achieving efficient and stable feature subset selection and improving the training effect of medical data analysis models.

CN122050878B · Active · Publication Date: 2026-06-30 · BIG DATA & INFORMATION TECH RES INST OF WENZHOU UNIV +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BIG DATA & INFORMATION TECH RES INST OF WENZHOU UNIV
Filing Date: 2026-04-17
Publication Date: 2026-06-30

IPC: G16H50/70; G06N3/126; G06N3/006

AI Tagging

Technology Topics

Medicine Data profiling

AI Technical Summary

Technical Problem

Existing Aurora optimization algorithms are prone to destroying high-quality feature subsets due to the randomness of mutation operations in medical feature selection, resulting in insufficient stability and an inability to efficiently adapt to the complex characteristics of high-dimensional medical data.

Method used

We introduce an archive-driven population evolution strategy and a directional selection mechanism for reference feature subsets. By using the centroids of failed and successful archives as references in the early and late stages of iteration, respectively, we guide the population away from inferior regions and towards superior regions. We also update the feature subsets by combining directional update of the direction vector and random perturbation.

Benefits of technology

It significantly improves the stability and adaptability of feature selection, increases convergence efficiency, and can screen out high-quality feature subsets from high-dimensional medical data to support the training of subsequent medical data analysis models.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention provides a method, apparatus, device, and medium for medical feature selection, relating to the field of medical data processing technology. It addresses the technical problem of insufficient accuracy in selecting optimal feature subsets. The method comprises: constructing an improved aurora optimization algorithm that includes an archive-driven population evolution strategy and a directional selection mechanism based on a reference feature subset. The former uses a search mode with the centroid of failed archives as the reference feature subset in the early stages of iteration, guiding the population away from the centroid of failed archives to avoid inferior feature regions; in the later stages of iteration, it uses a search mode with the centroid of successful archives as the reference feature subset, guiding the population to converge towards the centroid of successful archives. The latter constructs a directional update direction vector based on the difference between the reference feature subset selected by the archive-driven population evolution strategy and the current feature subset, guiding the directional update of the feature subset. This can select the optimal feature subset that better meets the data analysis needs of the target medical dataset.

Description

Technical Field

[0001] This invention relates to the field of medical data processing technology, and in particular to a method, apparatus, device, and medium for selecting medical features.

Background Technology

[0002] Medical feature selection, as a core step in medical data preprocessing, directly impacts the time and resource consumption of model training. How to accurately select feature subsets with strong discriminative power and high stability, while eliminating redundancy and noise interference, is a current research focus in the field of high-dimensional medical data mining.

[0003] The Aurora Optimization Algorithm is an optimization method that works by simulating the motion of aurora particles in a magnetic field. Through mechanisms such as particle position updating, energy iteration, and adaptive balancing of global and local searches, it achieves efficient search for the optimal solution and can be applied to the selection of medical features in high-dimensional medical data.

[0004] However, the mutation operation of this algorithm relies on random adjustment, which can easily destroy the well-performing feature subsets that have already been found, leading to a decrease in the stability of medical feature selection and making it unable to efficiently adapt to the complex characteristics of medical data.

Summary of the Invention

[0005] Therefore, it is necessary to provide a method, apparatus, device, and medium for selecting medical features to address the aforementioned technical problems and achieve better selection of medical features.

[0006] This technology addresses the shortcomings of existing Aurora Optimization algorithms in medical feature selection, such as the tendency of mutation operations to destroy high-quality feature subsets and insufficient stability of feature selection. It enables accurate screening of features in high-dimensional medical data, improves the stability and adaptability of feature selection, and provides a high-quality feature foundation for the training of subsequent medical data analysis models.

[0007] The following technical solution is adopted in this specification:

[0008] This specification provides a method for selecting medical features, including:

[0009] An improved aurora optimization algorithm is constructed by adding an archive-driven population evolution strategy and a directional selection mechanism based on reference feature subsets to the original aurora optimization algorithm, wherein:

[0010] The archive-driven population evolution strategy is used to switch the search mode as the optimization iteration progresses: in the early stage of iteration, a search mode with the centroid of failed archives as the reference feature subset is adopted to guide the population away from the centroid of failed archives to avoid inferior feature regions; in the later stage of iteration, a search mode with the centroid of successful archives as the reference feature subset is adopted to guide the population to converge toward the centroid of successful archives; wherein, failed archives are used to store the worst feature subset in history during the iteration process, and successful archives are used to store the best feature subset in history during the iteration process;

[0011] The directional selection mechanism based on reference feature subsets constructs a directional update direction vector from the difference between the reference feature subset selected by the archive-driven population evolution strategy and the current feature subset, and uses this vector to guide the directional update of the feature subset.

[0012] Obtain the target medical dataset, and generate a population containing multiple feature subsets based on the target medical dataset; optimize the population using the improved aurora optimization algorithm to obtain the optimal feature subset.

[0013] Furthermore, the archive-driven population evolution strategy specifically includes:

[0014] Construct a dynamic equilibrium parameter that increases non-linearly with the number of iterations;

[0015] The iteration process is divided into an early stage and a late stage based on the numerical range of the dynamic equilibrium parameter. In the early stage, the centroid of the failed archive is used as the reference feature subset; in the late stage, the centroid of the successful archive is used as the reference feature subset.

[0016] Based on a reference feature subset, a directional selection mechanism based on the reference feature subset is performed on the current feature subset in the population.

[0017] Furthermore, the implementation of a directional selection mechanism based on a reference feature subset for the current feature subset in the population specifically includes:

[0018] A first adaptive weight is constructed to control the local precession motion, and a second adaptive weight is constructed to control the global walk motion. Both weights change nonlinearly with the ratio of the current iteration number to the maximum iteration number.

[0019] For each feature subset in the population, the precession velocity and the aurora oval walk step length are obtained. They are weighted and summed using the first adaptive weight and the second adaptive weight to obtain the motion driving coefficient.

[0020] The difference between the value of the reference feature subset in each dimension and the corresponding value of the current feature subset is used as the directional update direction vector.

[0021] Construct a binary mask vector with the same length as the feature dimension of the feature subset. Apply the binary mask vector to the product of the motion driving coefficient and the directional update direction vector, retaining only the dimensions marked by the mask, to obtain the directional dimension update amount.

[0022] The random perturbation direction vector is obtained by using the overall difference between the current feature subset and the randomly selected feature subset within the population as the basis for random perturbation; the dimension mask and amplitude adjustment operations are performed on the random perturbation direction vector based on the binary mask vector and random numbers, retaining only the perturbation result of the mask-marked dimension and adjusting the perturbation intensity to obtain the random dimension perturbation increment;

[0023] Summing the current feature subset, the directional dimension update amount, and the random dimension perturbation increment yields the feature subset after the directional dimension update is completed.
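The masked update described in the steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `directional_update`, the `mask_prob` parameter, the use of a single scalar random number to scale the perturbation, and the sign convention "current minus random member" are all hypothetical choices. The motion driving coefficient is passed in as `drive` because its formula is not reproduced in this text; how the early-stage "move away" behaviour enters the sign is likewise not specified, so `drive` is left signed and caller-supplied.

```python
import random


def directional_update(current, reference, peer, drive, rng=random, mask_prob=0.5):
    """Return (updated_subset, mask) for one feature subset.

    current, reference, peer: equal-length lists of floats.
    drive: motion driving coefficient, supplied by the caller.
    mask_prob: ASSUMED probability that a dimension is marked for update.
    """
    dims = len(current)
    # Binary mask with one entry per feature dimension.
    mask = [1 if rng.random() < mask_prob else 0 for _ in range(dims)]
    # ASSUMPTION: one scalar random number scales the whole perturbation.
    scale = rng.random()
    updated = []
    for j in range(dims):
        # Directional update direction: reference minus current, per dimension.
        directed = mask[j] * drive * (reference[j] - current[j])
        # Random perturbation: current minus a randomly chosen population member.
        perturb = mask[j] * scale * (current[j] - peer[j])
        # Sum of current value, directional amount, and perturbation increment.
        updated.append(current[j] + directed + perturb)
    return updated, mask
```

Dimensions whose mask entry is 0 are returned unchanged, which is the property the text relies on to protect parts of a solution that are already good.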

[0024] Furthermore, the generation of a population comprising multiple feature subsets based on the target medical dataset is achieved through Latin hypercube sampling, specifically including:

[0025] Set the population size N and feature dimension D to be appropriate for the feature size of the target medical dataset;

[0026] The value range of each dimension in the feature space of the target medical dataset is divided into N equally probable intervals. A sample point is randomly selected from each interval. The sample points selected in each dimension are then randomly permuted and combined to generate an initial population adapted to the target medical dataset. In the initial population, the value $x_{i,j}$ of the $i$-th feature subset in the $j$-th dimension is obtained by the following formula:

[0027] [formula not preserved in the extracted text];

[0028] where the formula involves the lower and upper bounds of the $j$-th dimension and a random number drawn within a given interval.
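The Latin hypercube initialization described above can be sketched as follows. This is a generic textbook construction written with the standard library only; the function name `latin_hypercube_population` and the `seed` parameter are hypothetical, and the exact per-dimension formula used in the patent is not reproduced in this text, so the expression below is an assumption.

```python
import random


def latin_hypercube_population(n, lower, upper, seed=None):
    """Return n points; in every dimension each of the n equal-width
    intervals contains exactly one point."""
    rng = random.Random(seed)
    dims = len(lower)
    population = [[0.0] * dims for _ in range(n)]
    for j in range(dims):
        width = (upper[j] - lower[j]) / n
        # One random sample inside each of the n equally probable intervals.
        samples = [lower[j] + (k + rng.random()) * width for k in range(n)]
        # A random permutation decides which individual gets which interval.
        rng.shuffle(samples)
        for i in range(n):
            population[i][j] = samples[i]
    return population
```

For example, `latin_hypercube_population(30, [0.0] * 10, [1.0] * 10)` would give 30 candidate position vectors over 10 features, with each feature's range covered evenly.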

[0029] Furthermore, the method also includes mapping the value $x_{i,j}$ of the $i$-th feature subset in the $j$-th dimension to a binary value:

[0030] The mapping value of the $i$-th feature subset in the $j$-th dimension is obtained based on the Sigmoid function:

[0031] $S(x_{i,j}) = \dfrac{1}{1 + e^{-x_{i,j}}}$;

[0032] Generate a random number $r$ in [0,1] and compare $r$ with the mapping value $S(x_{i,j})$:

[0033] If the random number $r$ is greater than or equal to the mapping value $S(x_{i,j})$, the value of the $i$-th feature subset in the $j$-th dimension is set to 1; if $r$ is less than $S(x_{i,j})$, the value is set to 0.
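A minimal sketch of this binary mapping, assuming the standard logistic form of the Sigmoid function and following the comparison rule exactly as stated in the text (random number greater than or equal to the mapping value gives 1). The names `sigmoid` and `binarize` are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # avoids overflow for large negative x
    return e / (1.0 + e)


def binarize(position, rng=random):
    """Map a continuous position vector to a 0/1 feature-selection vector."""
    bits = []
    for x in position:
        r = rng.random()  # uniform in [0, 1)
        # Rule as written in the text: r >= S(x) selects the feature.
        bits.append(1 if r >= sigmoid(x) else 0)
    return bits
```

Under this rule a large positive position value makes selection unlikely; if the opposite convention is intended, only the comparison in `binarize` would need to change.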

[0034] Furthermore, optimizing the population using the improved aurora optimization algorithm specifically includes:

[0035] I. Initialization Phase: Create a success archive and a failure archive. The success archive is used to store the historical best feature subset during the iteration process, and the failure archive is used to store the historical worst feature subset during the iteration process. At the same time, initialize the population after Latin hypercube sampling and binary mapping, and set the maximum number of iterations for the algorithm.

[0036] II. The stage of archive-driven population evolution strategy;

[0037] III. Collision Detection and Perturbation Stage: Determine whether the current population meets the preset collision conditions. If the collision conditions are met, the feature subset is randomly perturbed through a chaotic perturbation mechanism. If the preset collision conditions are not met, the directional selection mechanism based on the reference feature subset is directly entered.

[0038] IV. Directed selection mechanism stage based on reference feature subset: Using the reference feature subset as a benchmark, the population is updated through a directed selection mechanism based on the reference feature subset to obtain a new population after the update;

[0039] V. Archive Update Phase: Based on the classification accuracy of the feature subsets by the fuzzy K-nearest neighbor classifier and the size of the feature subsets, a fitness function is constructed. The fitness value of all feature subsets in the updated new population is obtained through the fitness function. Based on the fitness value, all feature subsets are sorted by quality. The best feature subset in the updated new population is stored in the successful archive and the worst feature subset is stored in the failed archive. The successful archive and the failed archive are iteratively updated according to the first-in-first-out update strategy to keep the storage capacity of the two archives constant.

[0040] VI. Iteration Control Phase: Determine whether the current iteration count has reached the preset maximum iteration count. If not, increment the current iteration count by 1 and return to step II. If the maximum iteration count has been reached, terminate the algorithm iteration and output the optimal feature subset.

[0041] This specification provides a medical feature selection device, comprising:

[0042] An improved aurora optimization algorithm building module is provided to construct an improved aurora optimization algorithm by adding an archive-driven population evolution strategy and a directional selection mechanism based on reference feature subsets to the original aurora optimization algorithm; wherein,

[0043] The archive-driven population evolution strategy is used to switch the search mode as the optimization iteration progresses: in the early stage of iteration, a search mode with the centroid of failed archives as the reference feature subset is adopted to guide the population away from the centroid of failed archives to avoid inferior feature regions; in the later stage of iteration, a search mode with the centroid of successful archives as the reference feature subset is adopted to guide the population to converge toward the centroid of successful archives; wherein, failed archives are used to store the worst feature subset in history during the iteration process, and successful archives are used to store the best feature subset in history during the iteration process;

[0044] The directional selection mechanism based on reference feature subsets constructs a directional update direction vector from the difference between the reference feature subset selected by the archive-driven population evolution strategy and the current feature subset, and uses this vector to guide the directional update of the feature subset.

[0045] The medical feature selection module is used to acquire the target medical dataset, generate a population including multiple feature subsets based on the target medical dataset, and optimize the population using the improved aurora optimization algorithm to obtain the optimal feature subset.

[0046] This specification provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described medical feature selection method.

[0047] This specification provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the above-described medical feature selection method.

[0048] The above-mentioned technical solutions adopted in this specification can achieve the following beneficial effects:

[0049] This invention significantly enhances the algorithm's directional search capability in high-dimensional space by introducing an archive-driven population evolution strategy and a reference feature subset-guided selection mechanism, reducing invalid perturbations and improving convergence efficiency and stability. In multidimensional experiments involving public datasets and real clinical data, this method maintains high classification accuracy while significantly compressing feature dimensions, demonstrating good generalization performance and clinical applicability, and providing an effective tool for predicting the efficacy of immunotherapy for allergic rhinitis in children.

[0050] This invention employs the archive-driven population evolution strategy as the top-level global strategy, adaptively switching search modes as the optimization iteration progresses. In the early stages of iteration, a search mode using the centroid of failed archives as the reference feature subset is adopted, guiding the population away from that centroid to avoid inferior feature regions and enhancing the algorithm's global exploration capability. In the later stages of iteration, a search mode using the centroid of successful archives as the reference feature subset is adopted, guiding the population to converge toward that centroid and improving the algorithm's local development capability and convergence speed. The directional selection mechanism based on reference feature subsets serves as the underlying execution mechanism: it transforms the reference direction determined by the archive-driven population evolution strategy into specific feature subset update actions. A directional update direction vector is constructed from the per-dimension difference between the selected reference feature subset and the current feature subset, anchoring the update direction. Guided by this vector, the feature subset undergoes directional and selective dimensional updates, which avoids destroying well-performing structures during feature updates, so that the selected optimal feature subset better meets the data analysis needs of the target medical dataset.

Attached Figure Description

[0051] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:

[0052] Figure 1 is a flowchart illustrating a medical feature selection method provided in this specification;

[0053] Figure 2 is a flowchart illustrating a feature optimization process based on a fuzzy K-nearest neighbor classifier, provided in this specification;

[0054] Figure 3 is the first schematic diagram comparing the convergence curves of different algorithms on publicly available medical datasets, provided in this specification, where the panels show: (a) the brain tumor dataset; (b) the breast cancer dataset; (c) the central nervous system tumor dataset; (d) the skin lesion dataset; (e) the diffuse large B-cell lymphoma dataset; (f) the prostate cancer dataset;

[0055] Figure 4 is the second schematic diagram comparing the convergence curves of different algorithms on publicly available medical datasets, provided in this specification, where the panels show: (a) leukemia dataset 1; (b) leukemia dataset 2; (c) leukemia dataset 3; (d) the lung cancer dataset; (e) the lymphatic disease dataset; (f) the Wisconsin breast cancer dataset;

[0056] Figure 5 is a diagram showing the importance ranking of SCIT efficacy prediction features provided in this specification;

[0057] Figure 6 is a schematic diagram of a medical feature selection device provided in this specification;

[0058] Figure 7 is a schematic diagram of a device for obtaining [something] provided in this specification.

Detailed Implementation

[0059] To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this application will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. All other embodiments obtained by those skilled in the art based on the embodiments in this specification without creative effort are within the scope of protection of this application.

[0060] To address the technical problems of existing standard medical feature selection algorithms when processing high-dimensional, high-noise medical data, such as blind search mechanisms, lack of historical experience guidance leading to local optima, low search efficiency, difficulty in balancing global exploration and local development capabilities, and ultimately high redundancy of selected feature subsets and poor classification performance, this invention provides a medical feature selection method based on an enhanced aurora optimization algorithm.

[0061] This invention introduces an enhanced aurora optimization algorithm, which integrates an archive-driven population evolution strategy with a reference feature subset-guided selection mechanism to achieve efficient and accurate screening of feature subsets in high-dimensional medical data. The specific scheme is as follows:

[0062] 1. Construction of Enhanced Aurora Optimization Algorithm: The core of the enhanced aurora optimization algorithm lies in integrating two innovative strategies to compensate for the shortcomings of the original aurora optimization algorithm. The first is an archive-driven population evolution strategy: the algorithm maintains two external archives, a success archive and a failure archive, which dynamically store the best and worst feature subsets from historical iterations, respectively. In addition, a dynamic switching parameter γ based on an exponential function is designed to adaptively control the algorithm's search direction. In the early iterations, the centroid of the failure archive is used as a reference to guide the population away from known ineffective feature combination regions, strengthening global exploration. In the later iterations, the centroid of the success archive is used as a reference to guide the population towards known high-performance feature combination regions, accelerating local development. This strategy accumulates and reuses search experience and balances the algorithm's global exploration and local development capabilities. The second is a selection mechanism guided by a reference feature subset: in the original aurora optimization algorithm, the mutation direction is random and updates are applied across all dimensions, which easily destroys effective feature structure. To address this, the reference feature subset provided by the archive-driven population evolution strategy is introduced into the mutation operation, and a clearly directional difference vector is constructed to anchor the search direction. At the same time, a random binary mask vector is introduced and combined with the mutation step size to achieve selective dimensional updates of the solution vector. The update operation is performed only on the dimensions whose mask value is "1", which protects the dimensions of the current solution that are already close to optimal, greatly reduces invalid update operations, and improves the search efficiency and accuracy of the algorithm in high-dimensional feature space.

[0063] 2. Binary Variant Transformation and Feature Subset Evaluation: To adapt to the discrete combinatorial optimization problem of medical feature selection, the sigmoid transformation function is used to map the continuous position vector output by the enhanced aurora optimization (EPLO) algorithm into a binary decision vector, where a value of 1 means the corresponding feature is selected and a value of 0 means it is removed. At the same time, a comprehensive fitness function is constructed that takes into account both the classification accuracy of the fuzzy K-nearest neighbor classifier and the size of the selected feature subset. This formalizes feature selection as a weighted minimization problem, guiding the algorithm toward the feature subset that best balances classification accuracy and feature simplicity.
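A weighted fitness of the kind described above is commonly written as a convex combination of the classification error rate and the fraction of features kept. The sketch below assumes that form; the weight `alpha=0.99`, the function name `fitness`, and the handling of an empty subset are hypothetical, and the error rate is taken as an input rather than computed by a fuzzy K-nearest neighbor classifier.

```python
def fitness(error_rate, selected, alpha=0.99):
    """Weighted minimization objective for one binary feature subset.

    error_rate: classifier error on this subset, in [0, 1] (supplied by caller).
    selected:   list of 0/1 flags, one per feature.
    alpha:      ASSUMED trade-off weight between error and subset size.
    """
    total = len(selected)
    chosen = sum(selected)
    if chosen == 0:
        # An empty subset cannot be classified; treat it as the worst case.
        return float("inf")
    # Lower is better: small error and few selected features.
    return alpha * error_rate + (1.0 - alpha) * chosen / total
```

With equal error rates, the subset that keeps fewer features receives the lower (better) value, which is what steers the search toward compact subsets.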

[0064] 3. Model Construction and Validation: The improved binary EPLO (bEPLO) algorithm is combined with the fuzzy K-nearest neighbor classifier to construct a complete wrapper-based medical feature selection and classification model. The model is tested on public medical datasets and real clinical datasets. By comparing key indicators such as classification error rate, number of selected features, and algorithm convergence speed, the effectiveness and superiority of the method of this invention in improving the classification performance of medical data and reducing feature dimensionality are verified.

[0065] The technical solutions provided by this invention can be widely applied to high-dimensional medical data processing scenarios such as medical data mining, clinical data classification and diagnosis, and medical image feature extraction.

[0066] The method for selecting medical features according to the present invention is described below with reference to the accompanying drawings.

[0067] Figure 1 is a flowchart illustrating a medical feature selection method provided in this specification. As shown in Figure 1, the method includes:

[0068] S1. An improved Aurora optimization algorithm is constructed by adding an archive-driven population evolution strategy and a directional selection mechanism based on reference feature subsets to the original Aurora optimization algorithm. The archive-driven population evolution strategy switches the search mode as the optimization iteration progresses: in the early stages of iteration, a search mode using the centroids of failed archives as reference feature subsets is adopted to guide the population away from the centroids of failed archives to avoid inferior feature regions; in the later stages of iteration, a search mode using the centroids of successful archives as reference feature subsets is adopted to guide the population to converge towards the centroids of successful archives. Failed archives are used to store the worst-ever feature subsets during the iteration process, and successful archives are used to store the best-ever feature subsets during the iteration process. The directional selection mechanism based on reference feature subsets constructs a directional update direction vector based on the difference between the reference feature subset selected by the archive-driven population evolution strategy and the current feature subset, and guides the directional update of the feature subset based on this directional update direction vector.

[0069] In this application embodiment, the specific implementation of the archive-driven population evolution strategy is as follows:

[0070] Construct the dynamic equilibrium parameter γ. This parameter increases non-linearly with the number of algorithm iterations and is calculated from the current iteration number and the maximum number of iterations: [formula not preserved in the extracted text].

[0071] During each iteration, a random number between 0 and 1 is generated. The random number is compared with the dynamic equilibrium parameter γ to adaptively determine the reference feature subset for the feature subset update: if the random number is greater than or equal to γ, the centroid of the failed archive is selected as the reference feature subset to guide the population away from low-quality feature regions; if the random number is less than γ, the centroid of the successful archive is selected as the reference feature subset to guide the population to converge toward high-quality feature regions.

[0072] In the early iterations, γ is small, so the algorithm tends to use the centroid of the failed archive as the reference feature subset, driving the population away from known inferior feature regions and strengthening global exploration. As the number of iterations increases, γ increases non-linearly and the probability that the random number is less than γ rises, so the algorithm becomes more inclined to use the centroid of the successful archive as the reference feature subset. This achieves an adaptive transition from global exploration to local development, guiding the population to gather toward historically optimal feature regions and accelerating local development and convergence. In this way, the numerical range of the dynamic equilibrium parameter divides the algorithm into an early and a late iteration stage: in the early stage the centroid of the failed archive is used as the reference, and in the late stage the centroid of the successful archive is used. Both archives are updated on a first-in, first-out (FIFO) basis. When an archive reaches its preset storage capacity, the oldest stored feature subset is removed and the best or worst feature subset obtained in the latest iteration is added, keeping the archived feature subsets current and providing an accurate basis for selecting the reference feature subset. Based on the reference feature subset, the directional selection mechanism based on the reference feature subset described in this application is executed on the current feature subset in the population to complete the directional update of the feature subset.
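The archive handling and reference selection described above can be sketched as follows. The exact formula for the dynamic equilibrium parameter is not reproduced in this text, so `gamma_schedule` below is a hypothetical exponential schedule chosen only because it rises non-linearly from 0; the names `make_archive`, `centroid`, `pick_reference`, and the steepness parameter `k` are likewise assumptions.

```python
import math
import random
from collections import deque


def make_archive(capacity):
    """FIFO archive: appending beyond capacity drops the oldest entry."""
    return deque(maxlen=capacity)


def gamma_schedule(t, t_max, k=5.0):
    """ASSUMED schedule: 0 at t=0, rising non-linearly toward 1."""
    return 1.0 - math.exp(-k * t / t_max)


def centroid(archive):
    """Per-dimension mean of the feature subsets stored in an archive."""
    dims = len(archive[0])
    return [sum(m[j] for m in archive) / len(archive) for j in range(dims)]


def pick_reference(success, failure, t, t_max, rng=random, k=5.0):
    """Choose the reference feature subset for the current iteration."""
    r = rng.random()
    if r >= gamma_schedule(t, t_max, k):
        # Early iterations (small gamma): steer away from the failure centroid.
        return "failure", centroid(failure)
    # Later iterations (large gamma): converge toward the success centroid.
    return "success", centroid(success)
```

Using `deque(maxlen=...)` keeps each archive at a constant size without explicit eviction code, which matches the first-in, first-out behaviour described in the text.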

[0073] For example, in this embodiment of the application, the directional selection mechanism based on the reference feature subset realizes the directional and selective updating of the feature subset through the following formula, thereby completing the population iterative optimization:

[0074] ;

[0075] The specific implementation steps of this mechanism, as well as the definitions and calculation methods of the parameters in the formula, are as follows:

[0076] Construct a first weight for controlling the local precession motion and a second weight for controlling the global walk motion. Both weights change nonlinearly with the ratio of the current iteration number to the maximum iteration number, and the calculation formulas are as follows: ; ; where the two quantities in the formulas denote the current iteration number and the total number of iterations, respectively.

[0077] For each feature subset in the population, the precession velocity and the aurora oval walk stride are obtained separately. The precession velocity and the aurora oval walk stride are then weighted by the first weight and the second weight, respectively, and summed to obtain the motion driving coefficient. The precession velocity and the aurora oval walk stride are calculated as follows: ; ; where the parameters are, in order: a constant coefficient, which can be set to 1; the damping factor, which can take random values in [1, 1.5]; the flight distribution function, used to implement global random search; the average position of all feature subsets in the population; the value of a given dimension of a given feature subset; the lower bound of each dimension of the feature space; a random number in the interval [0,1]; and the upper bound of each dimension of the feature space.

[0078] Based on the reference feature subset selected by the archive-driven population evolution strategy, the value of the reference feature subset in each dimension is extracted, and its difference from the value of the same dimension of the current feature subset is used as the directional update vector. The reference centroid is selected dynamically as the algorithm iterates: in the early stage of iteration, it is the value of the failure archive centroid in the given dimension, guiding the population away from inferior feature regions; in the later stage of iteration, it is the value of the success archive centroid in the given dimension, guiding the population to converge toward high-quality feature regions.

[0079] Construct a binary mask vector with the same length as the feature dimension of the feature subset. This vector is used to update feature dimensions selectively, avoiding damage to high-quality feature structure that has already been found. The motion driving coefficient is multiplied by the directional update vector, and a dimension masking operation is applied to the product using the binary mask vector, retaining only the dimensions that the mask marks for update, to obtain the directional dimension update amount.

[0080] Randomly select a feature subset within the population, and use the overall difference between the current feature subset and this randomly selected feature subset as the random perturbation direction vector. Then, based on the binary mask vector and a random number in the interval [0,1], dimension masking and amplitude adjustment operations are performed on the random perturbation direction vector: only the perturbation of the masked dimensions is retained, and the random number adjusts the perturbation strength, finally yielding the random dimension perturbation increment.

[0081] The original position of the current feature subset is summed with the calculated directional dimension update amount and the random dimension perturbation increment to obtain the feature subset after the directional, selective dimension update. This position update is applied to each individual in turn, thereby updating all feature subsets in the population.
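The masked update described in the preceding steps can be sketched as follows. This is an illustration under stated assumptions, not the formula of this application: the motion driving coefficient `drive` is taken as a precomputed input, the same binary mask is applied to both terms, the perturbation direction is computed as the current subset minus the random subset, and the function name is hypothetical. How the update moves "away" from a failure centroid is not specified in the text, so the sketch simply uses the difference between the reference and the current position.

```python
import random

def directional_update(x, reference, x_rand, drive, rng=random):
    """One selective update of a single feature subset.

    x, reference, x_rand: equal-length lists of continuous positions.
    drive: motion driving coefficient (ASSUMED precomputed as the weighted
           sum of precession velocity and walk stride).
    Returns the new position and the binary mask that was used.
    """
    dim = len(x)
    mask = [rng.randint(0, 1) for _ in range(dim)]  # 1 = update this dimension
    r = rng.random()                                # perturbation strength in [0, 1)
    new_x = []
    for j in range(dim):
        # Directional dimension update: masked drive * (reference - current).
        directed = mask[j] * drive * (reference[j] - x[j])
        # Random dimension perturbation (ASSUMED order: current - random).
        perturb = mask[j] * r * (x[j] - x_rand[j])
        new_x.append(x[j] + directed + perturb)
    return new_x, mask
```

Dimensions whose mask value is 0 keep their original value exactly, which is the property the text relies on to protect feature combinations that are already good.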

[0082] S2. Obtain the target medical dataset and generate a population containing multiple feature subsets based on the target medical dataset; optimize the population using the improved aurora optimization algorithm to obtain the optimal feature subset.

[0083] In this embodiment, the target medical dataset is obtained through multi-source clinical data and preprocessing steps, as follows:

[0084] A high-dimensional medical dataset containing the features to be analyzed was collected. In this embodiment, a clinical dataset of desensitization treatment for children with allergic rhinitis, provided by a cooperating hospital, was selected. This dataset covers multiple clinical feature dimensions such as immune indicators and cell counts. The original clinical data were systematically cleaned and standardized. Specifically, multiple imputation was used to fill in missing values of immune indicators such as specific IgE values; abnormal values, such as cell counts significantly outside the physiological range, were identified and corrected; and Z-score standardization was performed on all continuous variables such as IgE concentration and cell percentage. Finally, a standardized and structured clinical feature database was constructed.
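As a small illustration of the Z-score standardization step mentioned above, the following Python sketch standardizes one continuous variable. The function name is hypothetical, and the use of the population standard deviation (dividing by the number of samples) is an assumption; the text does not state which variant was used.

```python
import math

def z_score(values):
    """Standardize a list of numbers to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    # ASSUMPTION: population variance (divide by n, not n - 1).
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0] * n
    return [(v - mean) / std for v in values]
```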

[0085] To ensure the objectivity and robustness of subsequent model evaluation, a ten-fold cross-validation strategy was adopted to divide the standardized clinical feature database: the database was randomly divided into ten mutually exclusive subsets. In each round of validation, one subset was selected as an independent test set, and the remaining nine subsets were combined as the model training set, providing a data foundation for subsequent feature selection and model construction.
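The ten-fold division can be sketched in Python as below. This is a minimal illustration with a hypothetical function name and a fixed seed chosen only for reproducibility; it does not stratify by class label, which the text does not mention.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k mutually exclusive folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random division of the samples
    folds = [indices[i::k] for i in range(k)]   # k disjoint subsets
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test
```

Each sample appears in exactly one test fold, and the training set of each round is the union of the other nine folds.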

[0086] For example, in this embodiment of the application, an initial population containing multiple feature subsets is generated based on the target medical dataset. This is achieved by combining Latin hypercube sampling with binary mapping of the Sigmoid function, which effectively solves the problems of uneven distribution and easy occurrence of search blind spots in traditional random initialization populations, and improves the diversity of the initial population and the global exploration capability of the algorithm. The specific implementation steps are as follows:

[0087] This embodiment employs the Latin hypercube sampling method for population initialization, ensuring that individuals in the initial population are uniformly distributed within the feature space. This enhances the diversity of the initial population and the algorithm's early global exploration capability. First, the population size is set to N and the feature dimension is set to D, matching the feature size of the target medical dataset. Then, the value range of each dimension in the feature space of the target medical dataset is divided into N equally probable intervals. One sample point is randomly selected from each interval, and the sample points selected from each dimension are randomly arranged and combined to generate a continuous initial population adapted to the target medical dataset. In the initial population, the value of a given dimension of a given feature subset is obtained by the following formula: ; where the bounds in the formula are the lower and upper bounds of that dimension in the feature space of the target medical dataset, and the random term is a random number within the sampled interval. The feature subset index runs over the N individuals of the population, and the dimension index runs over the D feature dimensions.
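A minimal Python sketch of Latin hypercube sampling, as described above, is given below. The function name and the exact form of the sampling expression (lower bound plus a random offset inside one of N equal-width intervals) are assumptions; the formula of this application is not reproduced in the text.

```python
import random

def latin_hypercube(n, dim, lower, upper, rng=random):
    """Return n points in dim dimensions with one point per interval per dimension.

    lower, upper: per-dimension bounds (lists of length dim).
    """
    columns = []
    for j in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)                      # random arrangement across dimensions
        width = (upper[j] - lower[j]) / n        # N equally sized intervals
        columns.append(
            [lower[j] + (s + rng.random()) * width for s in strata]
        )
    # Transpose: one row per individual, one column per dimension.
    return [[columns[j][i] for j in range(dim)] for i in range(n)]
```

Because every interval of every dimension receives exactly one sample, the initial population avoids the clustering and blind spots that plain uniform random initialization can produce.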

[0088] To adapt to the discrete combination optimization requirements of medical feature selection, the continuous value feature subset obtained by Latin hypercube sampling is mapped to a binary vector, where a value of 1 represents the selection of the corresponding feature and 0 represents the removal of the corresponding feature. The specific mapping process is as follows:

[0089] ;

[0090] ;

[0091] where the random term is a random number in [0,1]. If the random number is greater than or equal to the mapped value, the value of the corresponding dimension of the feature subset is set to 1; if the random number is less than the mapped value, the value of that dimension is set to 0. Following this rule, binary mapping is performed on the continuous values of each dimension of all feature subsets in the initial population, ultimately obtaining an initial population composed of binary feature subsets and providing a data foundation for the subsequent iterative optimization of the enhanced aurora optimization algorithm.
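The binary mapping can be sketched in Python as follows. The standard logistic function is assumed for the Sigmoid mapping, since the formula is not reproduced in this text, and the function names are hypothetical. The comparison direction follows the rule as stated above (1 when the random number is greater than or equal to the mapped value).

```python
import math
import random

def sigmoid(v):
    # ASSUMPTION: standard logistic function 1 / (1 + e^(-v)).
    return 1.0 / (1.0 + math.exp(-v))

def binarize(position, rng=random):
    """Map a continuous position vector to a 0/1 feature selection vector."""
    selection = []
    for v in position:
        mapped = sigmoid(v)
        # Rule as stated in the text: 1 if rand >= mapped value, else 0.
        selection.append(1 if rng.random() >= mapped else 0)
    return selection
```

A resulting vector such as `[1, 0, 1]` would mean that the first and third features are selected and the second is removed.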

[0092] For example, in the embodiments of this application, the population is iteratively optimized using an improved enhanced aurora optimization algorithm to select the optimal feature subset of the target medical dataset. Specifically, the following stages are executed sequentially, with each stage connected to form a complete algorithm optimization loop.

[0093] I. Initialization phase.

[0094] First, the core parameters of the enhanced aurora optimization algorithm are initialized, including the feature dimension D (total number of original clinical features) of the target medical dataset, the maximum number of iterations, and the population size N. Simultaneously, two external archives are created: a success archive and a failure archive. The success archive stores the historical best feature subsets during iteration, while the failure archive stores the historical worst feature subsets. Dedicated variables are set to record the globally optimal feature subsets and their corresponding fitness values in real time. Finally, the initial population, after Latin hypercube sampling and binary mapping using the Sigmoid function, is loaded, completing all preparations before algorithm iteration.

[0095] II. The stage of archive-driven population evolution strategy.

[0096] Based on the result of comparing the random number with the dynamic threshold, the reference feature subset for this iteration is determined, providing core guidance for the subsequent directional update mechanism.

[0097] III. Collision Detection and Disturbance Phase.

[0098] This stage enhances the algorithm's ability to escape local optima through collision detection and a chaotic perturbation mechanism, introducing reasonable random perturbations into the feature subset update. Specifically, two random numbers in the interval [0,1] are generated and a collision threshold is calculated. The threshold increases monotonically with the number of iterations, so that the collision probability is positively correlated with the iteration progress. When the two random numbers satisfy the collision condition, the population is determined to meet the preset collision conditions; otherwise, it is determined not to meet them. If the collision conditions are met, a chaotic perturbation mechanism is introduced to randomly perturb the feature subsets of the current population and break the constraint of the local optimum. If the collision conditions are not met, the process directly enters the next stage and performs a directional update operation based on the current population state.

[0099] IV. Targeted selection mechanism stage based on reference feature subset.

[0100] Using a reference feature subset as a benchmark, a targeted selection mechanism is executed to update the population, resulting in a new population after the update. This addresses the problem that traditional full-dimensional updates easily destroy the structure of excellent feature combinations.

[0101] Generate a random binary mask vector with the same length as the feature dimension. A mask value of 1 indicates that the corresponding dimension feature will be updated, while a value of 0 indicates that the corresponding dimension feature will remain unchanged. This achieves selective dimensional updates of the feature subset, minimizing the disruption to existing high-quality feature structures. This mechanism, combined with the aforementioned reference feature subset direction, ensures that the algorithm can focus on promising search directions while minimizing the disruption to existing high-quality solutions.

[0102] V. Archive Update Phase.

[0103] To evaluate the quality of each candidate feature subset, a prediction model is constructed by combining it with a fuzzy K-nearest neighbor classifier. First, the continuous solution vector generated by the algorithm is converted into a binary vector representing whether a feature is selected using the sigmoid transformation function. Then, the FKNN model is trained based on the selected feature subsets from this binary vector, and fitness values are obtained on the validation set. The fitness function is designed as follows to balance classification accuracy and feature subset size:

[0104] ;

[0105] where the terms denote the classification accuracy of the FKNN model, the size of the selected feature subset, and the total number of original features. This function calculates the fitness values of all feature subsets in the updated population, serving as the core criterion for determining the quality of feature subsets.
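For illustration, a fitness function that balances classification accuracy against feature subset size might look like the Python sketch below. The weighted-sum form, the weight `alpha = 0.95`, and the convention that a lower value is better are all assumptions commonly used in wrapper feature selection; they are not taken from the formula of this application, which is not reproduced in the text.

```python
def fitness(accuracy, n_selected, n_total, alpha=0.95):
    """Weighted sum of error rate and selected-feature ratio (lower is better).

    accuracy:   classification accuracy in [0, 1] (e.g. from an FKNN model).
    n_selected: number of features marked 1 in the binary vector.
    n_total:    total number of original features.
    alpha:      ASSUMED trade-off weight between error and subset size.
    """
    if n_selected == 0:
        # An empty subset cannot train a classifier; treat it as worst case.
        return float("inf")
    error_rate = 1.0 - accuracy
    return alpha * error_rate + (1.0 - alpha) * (n_selected / n_total)
```

Under this assumed form, two subsets with equal accuracy are ranked by size, so the smaller subset receives the better (lower) fitness.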

[0106] Based on fitness values, all feature subsets of the new population are ranked in order of merit. The best feature subsets are included in the successful archive and the worst feature subsets are included in the failed archive. At the same time, the two archives are maintained according to the first-in-first-out update strategy. When the archives reach the preset storage capacity, the earliest stored feature subsets are removed to keep the archive storage capacity constant and ensure the timeliness and effectiveness of feature subsets in the archives.
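The first-in, first-out archive maintenance can be sketched with Python's `collections.deque`, whose `maxlen` argument discards the oldest entry automatically once capacity is reached. The function names are hypothetical, only one best and one worst subset are archived per iteration in this sketch, and lower fitness is assumed to be better.

```python
from collections import deque

def make_archives(capacity):
    """Create an empty (success, failure) archive pair with fixed capacity."""
    return deque(maxlen=capacity), deque(maxlen=capacity)

def update_archives(population, fitness_values, success, failure):
    """Append the best subset to `success` and the worst to `failure`."""
    # ASSUMPTION: lower fitness value means a better feature subset.
    order = sorted(range(len(population)), key=lambda i: fitness_values[i])
    success.append(list(population[order[0]]))    # best of this iteration
    failure.append(list(population[order[-1]]))   # worst of this iteration
```

Once an archive holds `capacity` entries, each new `append` silently removes the earliest stored subset, keeping the archive size constant.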

[0107] VI. Iterative Control Phase.

[0108] Determine if the current iteration count has reached the preset maximum iteration count. If not, increment the current iteration count by 1 and return to step II to start a new round of population update and feature selection. If the maximum iteration count has been reached, terminate the algorithm iteration and output the optimal feature subset.

[0109] When the maximum number of iterations is reached, the algorithm terminates and outputs the globally optimal feature subset found during the entire optimization process. This subset is regarded as the feature subset most relevant to the classification task (e.g., SCIT efficacy prediction), such as key immune biomarkers. Based on this optimal feature subset, the final FKNN prediction model is retrained using all training data. This model can then be used to predict the efficacy for new patient samples.

[0110] Figure 2 is a flowchart, provided in this specification, of the feature optimization process based on a fuzzy K-nearest neighbor classifier. As shown in Figure 2, the algorithm first collects data, imports the dataset, and initializes it. It then uses 10-fold cross-validation to divide the dataset into multiple test and training sets. Next, it initializes the population using Latin hypercube sampling, constructs the initial archives, and updates the dynamic parameters. It then applies a binary mask and updates the population positions based on the reference point. The algorithm checks whether the stopping condition is met; if not, it continues updating the population positions. If the condition is met, it selects the best individual as the feature subset output by the algorithm. FKNN is then used to classify with the feature subset, and the fitness value, classification error rate, and number of features are evaluated. Finally, the algorithm outputs the best feature subset and terminates.

[0111] Figures 3 and 4 compare the convergence curves of the method provided in this specification (bEPLO) with those of eight other algorithms on a medical feature selection task over 12 publicly available medical datasets. Figure 3 compares the convergence curves on the following six datasets: {Brain Tumor Dataset, Breast Cancer Dataset, Central Nervous System Tumor Dataset, Skin Lesion Dataset, Diffuse Large B-Cell Lymphoma Dataset, Prostate Cancer Dataset}. Figure 4 compares the convergence curves on the following six datasets: {Leukemia Dataset 1, Leukemia Dataset 2, Leukemia Dataset 3, Lung Cancer Dataset, Lymphoma Dataset, Wisconsin Breast Cancer Dataset}. The English and Chinese names of the method of this invention (bEPLO) and the eight other algorithms are shown in Table 1.

[0112] Table 1

[0113]

[0114] This embodiment systematically verifies the performance of the proposed method through multiple sets of comparative experiments, specifically divided into two parts: verification using publicly available medical datasets and verification using real clinical datasets. The experimental setup and results are as follows:

[0115] In the validation experiments on publicly available medical datasets, all comparison algorithms were run independently 10 times, employing a 10-fold cross-validation strategy. This involved dividing each dataset evenly into 10 subsets, randomly selecting one subset as the test set in each experiment, and using the remaining 9 subsets as the training set. The algorithm hyperparameters were set uniformly: population size N = 30 and maximum number of iterations = 50. Average fitness, classification error rate, and number of selected features were used as performance evaluation criteria. Experimental results show that the Enhanced Aurora Optimization (EPLO) algorithm performed exceptionally well on 11 publicly available medical datasets. It achieved the best average fitness, with an overall average ranking of 1.4, ranking first. Regarding classification error rate, it achieved zero errors on 9 datasets, with an average ranking of 1.53, ranking first. In terms of the number of selected features, it selected the most compact feature subsets, with an average ranking of 1.2, ranking first. Statistical analysis using the Wilcoxon signed-rank test and the Friedman test showed that EPLO exhibited the best overall performance among all comparison algorithms.

[0116] In a validation experiment using a real clinical dataset, this invention applies the bEPLO algorithm to a specific clinical study on the efficacy prediction of subcutaneous immunotherapy (SCIT) for allergic rhinitis in children. Based on real clinical data from two hospitals in southern Zhejiang Province, 272 cases of children aged 4-15 years who had completed at least three years of mite-specific subcutaneous immunotherapy were collected. A medical dataset covering immune indicators, clinical characteristics, and environmental exposure information was constructed. A bEPLO-FKNN model combining binary EPLO and fuzzy K-nearest neighbors was designed for key feature selection and efficacy prediction modeling on this dataset. This experiment also adopted a 10-fold cross-validation framework, and each algorithm was run independently 30 times to ensure the reproducibility of the experimental results. Experimental results show that, compared with eight other mainstream metaheuristic medical feature selection algorithms, the bEPLO-FKNN model performs best on all six clinical evaluation indicators, with an average classification accuracy of 92.30%, a precision of 94.82%, and a Matthews correlation coefficient and F-measure of 80.73% and 83.87%, respectively, significantly outperforming all the comparison models. The algorithm's convergence curve further verifies that bEPLO can still converge quickly when processing high-dimensional clinical data and can effectively avoid getting trapped in local optima.

[0117] This experiment also systematically analyzed how frequently each medical feature was selected across multiple runs of the bEPLO algorithm, successfully identifying a set of core efficacy predictors with clinical interpretability. Figure 5 is a diagram, provided in this specification, showing the importance ranking of SCIT efficacy prediction features. The Chinese names corresponding to the English names of the features in Figure 5 are shown in Table 2. As shown in Figure 5, ranked by selection frequency from highest to lowest, the top five are: house dust mite IgE / total IgE ratio, house dust mite-specific IgE level, eosinophil percentage, dust mite IgE / total IgE ratio, and lymphocyte percentage. This result not only confirms the central role of specific IgE indicators in predicting the efficacy of SCIT, but also suggests that the patient's immune background and inflammatory status may influence treatment response, providing reliable data support and a clinical reference for clinicians conducting individualized efficacy assessments of SCIT in children with allergic rhinitis.

[0118] Table 2

[0119]

[0120] This invention proposes an enhanced aurora optimization algorithm. By introducing an archive-driven population evolution strategy and a reference point-guided selection mechanism, the algorithm's directional search capability in high-dimensional space is significantly enhanced, invalid perturbations are reduced, and convergence efficiency and stability are improved. In multidimensional experiments involving public datasets and real clinical data, this method maintains high classification accuracy while significantly compressing feature dimensions, demonstrating good generalization performance and clinical applicability, and providing an effective computational tool for predicting the efficacy of immunotherapy for allergic rhinitis in children.

[0121] The medical feature selection device provided by the present invention is described below. The medical feature selection device described below and the medical feature selection method described above can be referred to in correspondence.

[0122] Figure 6 is a structural schematic diagram of a medical feature selection device provided by the present invention. As shown in Figure 6, the medical feature selection device may include:

[0123] An improved aurora optimization algorithm construction module, used to build the improved aurora optimization algorithm by adding an archive-driven population evolution strategy and a directional selection mechanism based on reference feature subsets to the original aurora optimization algorithm. The archive-driven population evolution strategy switches the search mode as the optimization iteration progresses: in the early stage of iteration, a search mode using the centroid of the failure archive as the reference feature subset is adopted to guide the population away from the centroid of the failure archive and avoid inferior feature regions; in the later stage of iteration, a search mode using the centroid of the success archive as the reference feature subset is adopted to guide the population to converge toward the centroid of the success archive. The failure archive is used to store the historical worst feature subsets during the iteration process, and the success archive is used to store the historical best feature subsets during the iteration process. The directional selection mechanism based on reference feature subsets constructs a directional update vector from the difference between the reference feature subset selected by the archive-driven population evolution strategy and the current feature subset, and guides the directional update of feature subsets based on this vector.

[0124] The medical feature selection module is used to acquire the target medical dataset, generate a population including multiple feature subsets based on the target medical dataset, and optimize the population using the improved aurora optimization algorithm to obtain the optimal feature subset.

[0125] Specific limitations regarding the medical feature selection device can be found in the above limitations on the medical feature selection method and will not be repeated here. Each module in the aforementioned medical feature selection device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in hardware form in, or be independent of, the processor in the computer device, or can be stored in software form in the memory of the computer device, so that the processor can call and execute the operations corresponding to each module.

[0126] This specification also provides a computer-readable storage medium storing a computer program that can be used to execute the medical feature selection method provided in Figure 1 above.

[0127] This specification also provides a schematic diagram of the computer device shown in Figure 7. As shown in Figure 7, at the hardware level, the computer device includes a processor, an internal bus, a network interface, memory, and non-volatile memory, and may also include other hardware required for its services. The processor reads the corresponding computer program from the non-volatile memory into memory and then runs it to implement the medical feature selection method provided in Figure 1 above.

[0128] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments described above. Any references to memory, storage, database, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical storage, etc. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), etc.

[0129] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

Claims

1. A medical feature selection method, characterized by comprising:
constructing an improved aurora optimization algorithm by adding an archive-driven population evolution strategy and a directional selection mechanism based on reference feature subsets to the original aurora optimization algorithm; wherein
the archive-driven population evolution strategy is used to switch the search mode as the optimization iteration progresses: in the early stage of iteration, a search mode with the centroid of the failure archive as the reference feature subset is adopted to guide the population away from the centroid of the failure archive and avoid inferior feature regions; in the later stage of iteration, a search mode with the centroid of the success archive as the reference feature subset is adopted to guide the population to converge toward the centroid of the success archive; wherein the failure archive is used to store the historical worst feature subsets during the iteration process, and the success archive is used to store the historical best feature subsets during the iteration process;
the directional selection mechanism based on reference feature subsets constructs a directional update vector from the difference between the reference feature subset selected by the archive-driven population evolution strategy and the current feature subset, and guides the directional update of the feature subset based on this vector, specifically including:
constructing a first weight for controlling the local precession motion and a second weight for controlling the global walk motion, both the first weight and the second weight changing nonlinearly with the ratio of the current iteration number to the maximum iteration number;
for each feature subset in the population, obtaining the precession velocity and the aurora oval walk stride respectively;
performing a weighted summation of the precession velocity and the aurora oval walk stride based on the first weight and the second weight to obtain the motion driving coefficient;
using the difference between the value of each dimension of the reference feature subset and the corresponding value of the current feature subset as the directional update vector;
constructing a binary mask vector with the same length as the feature dimension of the feature subset;
performing a dimension masking operation on the product of the motion driving coefficient and the directional update vector using the binary mask vector, and retaining only the product in the dimensions marked by the mask, to obtain the directional dimension update amount;
using the overall difference between the current feature subset and a feature subset randomly selected within the population as the random perturbation direction vector; performing dimension masking and amplitude adjustment operations on the random perturbation direction vector based on the binary mask vector and a random number, retaining only the perturbation result of the dimensions marked by the mask and adjusting the perturbation intensity, to obtain the random dimension perturbation increment;
summing the current feature subset, the directional dimension update amount, and the random dimension perturbation increment to obtain the feature subset after the directional dimension update is completed;
obtaining the target medical dataset, and generating a population containing multiple feature subsets based on the target medical dataset; and optimizing the population using the improved aurora optimization algorithm to obtain the optimal feature subset.

2. The medical feature selection method of claim 1, wherein the archive-driven population evolution strategy specifically includes: constructing a dynamic equilibrium parameter that increases nonlinearly with the number of iterations; dividing the algorithm into an early iteration stage and a late iteration stage based on the numerical range of the dynamic equilibrium parameter, wherein in the early iteration stage the centroid of the failure archive is used as the reference feature subset, and in the late iteration stage the centroid of the success archive is used as the reference feature subset; and, based on the reference feature subset, executing the directional selection mechanism based on the reference feature subset on the current feature subset in the population.

3. The medical feature selection method of claim 1, wherein generating, based on the target medical dataset, the population comprising multiple feature subsets is achieved through Latin hypercube sampling, specifically including: setting the population size N and the feature dimension D to match the feature size of the target medical dataset; dividing the value range of each dimension in the feature space of the target medical dataset into N equally probable intervals, randomly extracting one sample point in each interval, and randomly arranging and combining the sample points extracted in each dimension to generate an initial population adapted to the target medical dataset, wherein the value of a given dimension of a given feature subset is obtained according to the following formula: ; wherein the bounds in the formula are the lower and upper bounds of the dimension, and the random term is a random number within the interval.

4. The medical feature selection method of claim 3, further including: mapping each feature subset to a binary vector of dimension D, in which 1 indicates that the feature is selected and 0 indicates that it is not; obtaining the mapping value of a given feature subset in a given dimension based on a Sigmoid function: [formula not reproduced]; generating a random number in [0, 1] and comparing the random number with the mapping value; if the random number is greater than or equal to the mapping value, setting the value of that dimension of that feature subset to 1; if the random number is less than the mapping value, setting the value of that dimension of that feature subset to 0.
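The binary mapping of claim 4 might look like the sketch below. The Sigmoid formula itself is not reproduced in the claim text, so the standard logistic function is assumed; the comparison direction (1 when the random number is greater than or equal to the mapping value) follows the wording of the claim. The names `sigmoid` and `binarize` are illustrative.

```python
import math
import random


def sigmoid(x):
    """Standard logistic function (assumed form), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(subset, rng):
    """Map a continuous feature subset to a 0/1 selection vector."""
    bits = []
    for x in subset:
        mapped = sigmoid(x)   # mapping value of this dimension
        r = rng.random()      # random number in [0, 1)
        # Rule as worded in the claim: r >= mapping value gives 1, otherwise 0.
        bits.append(1 if r >= mapped else 0)
    return bits


print(binarize([50.0, -50.0, 0.0], random.Random(0)))
```

Under this rule a large positive coordinate almost always maps to 0 and a large negative one to 1, because the mapping value then approaches 1 or 0 respectively.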

5. The medical feature selection method of claim 4, wherein optimizing the population by the improved aurora optimization algorithm specifically includes:
I. Initialization stage: create a success archive and a failure archive, wherein the success archive is used to store the historical best feature subset of the iteration process and the failure archive is used to store the historical worst feature subset of the iteration process; initialize the population by Latin hypercube sampling and binary mapping, and set the maximum number of iterations of the algorithm.
II. Archive-driven population evolution strategy stage.
III. Collision detection and perturbation stage: determine whether the current population meets the preset collision condition. If the collision condition is met, the feature subsets are randomly perturbed through a chaotic perturbation mechanism; if the collision condition is not met, the algorithm proceeds directly to the directional selection mechanism based on the reference feature subset.
IV. Directional selection mechanism stage based on the reference feature subset: using the reference feature subset as the benchmark, the population is updated through the directional selection mechanism based on the reference feature subset to obtain the updated new population.
V. Archive update stage: a fitness function is constructed from the classification accuracy achieved by a fuzzy K-nearest neighbor classifier on each feature subset and from the size of the feature subset. The fitness values of all feature subsets in the updated new population are obtained through the fitness function, and the feature subsets are ranked by fitness value. The best feature subset in the updated new population is stored in the success archive and the worst feature subset is stored in the failure archive. The success archive and the failure archive are updated iteratively according to a first-in-first-out strategy so that the storage capacity of both archives remains constant.
VI. Iteration control stage: determine whether the current iteration count has reached the preset maximum iteration count. If not, increment the current iteration count by 1 and return to stage II. If the maximum iteration count has been reached, terminate the iteration and output the optimal feature subset.
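The archive update stage of claim 5 can be sketched with fixed-capacity queues. The claim states only that fitness combines classification accuracy and subset size, so the weighted form in `fitness` and the default `alpha` are assumptions, and the accuracy is passed in as a number instead of being computed by a fuzzy K-nearest neighbor classifier. `collections.deque` with `maxlen` is used because it discards the oldest entry on append, which gives first-in-first-out behavior with constant capacity.

```python
from collections import deque


def fitness(accuracy, n_selected, n_total, alpha=0.99):
    """Hypothetical fitness (lower is better): weighted error rate plus subset-size ratio."""
    return alpha * (1.0 - accuracy) + (1.0 - alpha) * n_selected / n_total


def update_archives(population, scores, success, failure):
    """Store the best subset in the success archive and the worst in the failure archive.

    success, failure: deque objects created with a fixed maxlen.
    """
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0])
    success.append(list(ranked[0][1]))    # lowest score = best subset
    failure.append(list(ranked[-1][1]))   # highest score = worst subset


success = deque(maxlen=2)
failure = deque(maxlen=2)
population = [[1, 0, 1], [1, 1, 1], [0, 0, 1]]
accuracies = [0.90, 0.92, 0.70]           # placeholder values, not measured results
scores = [fitness(a, sum(p), len(p)) for a, p in zip(accuracies, population)]
update_archives(population, scores, success, failure)
print(list(success), list(failure))
```

Calling `update_archives` once per iteration keeps both archives at no more than `maxlen` entries without any explicit eviction code.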

6. A medical feature selection apparatus, comprising: an improved aurora optimization algorithm construction module, configured to construct an improved aurora optimization algorithm by adding an archive-driven population evolution strategy and a directional selection mechanism based on a reference feature subset to the original aurora optimization algorithm; wherein the archive-driven population evolution strategy is used to switch the search mode as the optimization iterations progress: in the early stage of iteration, a search mode with the centroid of the failure archive as the reference feature subset is adopted to guide the population away from the centroid of the failure archive and thereby avoid inferior feature regions; in the late stage of iteration, a search mode with the centroid of the success archive as the reference feature subset is adopted to guide the population to converge toward the centroid of the success archive; wherein the failure archive is used to store the historical worst feature subset of the iteration process, and the success archive is used to store the historical best feature subset of the iteration process. The directional selection mechanism based on the reference feature subset builds on the archive-driven population evolution strategy: it constructs a directional update vector by comparing the selected reference feature subset with the current feature subset, and guides the directional update of the feature subset based on this vector. Specifically, it includes: a first weight is constructed to control the local precession motion, and a second weight is constructed to control the global walk motion, both weights changing nonlinearly with the ratio of the current iteration number to the maximum iteration number. For each feature subset in the population, the precession velocity and the aurora egg walk step length are obtained respectively.
The precession velocity and the aurora egg walk step length are weighted and summed using the first weight and the second weight to obtain the motion driving coefficient. The difference between the value of each dimension of the reference feature subset and the corresponding value of the current feature subset is used as the directional update vector. A binary mask vector is constructed with the same length as the feature dimension of the feature subset. The binary mask vector is applied to the product of the motion driving coefficient and the directional update vector, and only the products in the mask-marked dimensions are retained, to obtain the directional dimension update amount. The overall difference between the current feature subset and a feature subset randomly selected from the population is used as the basis for random perturbation to obtain the random perturbation direction vector; dimension masking and amplitude adjustment are performed on the random perturbation direction vector based on the binary mask vector and random numbers, retaining only the perturbation results in the mask-marked dimensions and adjusting the perturbation intensity, to obtain the random dimension perturbation increment. The current feature subset, the directional dimension update amount, and the random dimension perturbation increment are summed to obtain the feature subset after the directional dimension update. A medical feature selection module is configured to acquire the target medical dataset, generate a population containing multiple feature subsets based on the target medical dataset, and optimize the population by the improved aurora optimization algorithm to obtain the optimal feature subset.

7. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the medical feature selection method of any one of claims 1 to 5.

8. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the medical feature selection method of any one of claims 1 to 5.