A blood drug concentration prediction method, model and electronic device

By constructing an individualized blood drug concentration prediction model using multiple objective optimization algorithms and machine learning algorithms, the problem of inaccurate blood drug concentration prediction during sertraline drug use was solved, achieving precise and safe drug use and reducing the incidence of adverse reactions.

CN122201846A · Pending · Publication Date: 2026-06-12 · BEIJING ANDING HOSPITAL CAPITAL MEDICAL UNIV +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING ANDING HOSPITAL CAPITAL MEDICAL UNIV
Filing Date
2026-04-27
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In the process of personalized medication, existing technologies are unable to accurately predict the blood concentration of sertraline-type SSRIs, leading to inaccurate dosage and adverse reactions.

Method used

Multiple objective optimization algorithms (such as differential evolution, genetic evolution, and particle swarm optimization) are used to iteratively optimize pharmacokinetic parameters. By combining random forest and stepwise regression to screen covariates, an individualized blood drug concentration prediction model is constructed. The best algorithm is selected through weighted scoring, and parameters such as clearance rate, apparent volume of distribution, and absorption rate constant are introduced into the model to fill in missing values, thus forming a personalized concentration-time prediction model.

Benefits of technology

It enables accurate prediction of blood drug concentrations, reduces adverse reactions, and improves medication safety and efficacy. It can quickly and accurately calculate pharmacokinetic parameters and blood drug concentrations, adapt to individual differences, and provide personalized medication recommendations.


Abstract

The application discloses a blood drug concentration prediction method, a model and an electronic device. It addresses how to predict blood drug concentration accurately in a big data environment, a technical problem that urgently needs solving to improve blood drug concentration prediction for personalized medication. The method comprises: obtaining a clinical data set, which is divided into a to-be-tested data set and a training set; optimizing the training set iteratively with each of two or more target optimization algorithms until an optimal result is reached; calculating effect indexes and efficiency indexes of each target optimization algorithm, screening the algorithms against preset effect index and efficiency index critical values, obtaining a comprehensive score for each remaining algorithm by weighted scoring of its effect and efficiency indexes, and taking the algorithm with the highest score as the best target optimization algorithm; training a preset blood drug concentration prediction model according to the best target optimization algorithm; and inputting the to-be-tested data set into the trained model, which outputs the predicted blood drug concentration value.

Description

Technical Field

[0001] This invention relates to the field of information processing technology, and more specifically, to a method, model, and electronic device for predicting blood drug concentration.

Background Technology

[0002] With the increasing availability of detection methods, there are more and more indicator variables used for predicting blood drug concentrations in personalized medication, which helps to improve the accuracy of blood drug concentration prediction. How to achieve accurate prediction of blood drug concentrations in a big data environment is a technical problem that urgently needs to be solved to improve the prediction of blood drug concentrations in personalized medication.

[0003] For example, in the process of individualized drug administration, the dosage of sertraline needs to be determined based on the individual patient's genetic, physiological, and pathological characteristics. However, because different patients have different sensitivities to sertraline, the dosage of sertraline is easily inaccurate, leading to adverse reactions in patients.

[0004] Therefore, how to quantitatively describe the absorption, distribution, metabolism, and excretion processes of sertraline-type SSRIs (selective serotonin reuptake inhibitors) during patient medication, and accurately predict blood drug concentrations after administration, thereby effectively improving the efficacy and safety of medication use, is a problem that the industry urgently needs to solve.

Summary of the Invention

[0005] The purpose of this invention is to solve the above problems by providing a blood drug concentration prediction method, model, and electronic device, which addresses the issue of how to achieve accurate prediction of blood drug concentration in a big data environment and improve the accuracy of blood drug concentration prediction for personalized medication.

[0006] To address the above problems, the present invention provides the following technical solution. Firstly, a method for predicting blood drug concentrations includes: acquiring a clinical dataset, which is divided into a test dataset and a training set; optimizing the training set iteratively with each of two or more objective optimization algorithms until the optimal result is obtained; calculating the performance and efficiency metrics of each objective optimization algorithm, filtering the algorithms against preset performance and efficiency thresholds, obtaining a comprehensive score by weighting the performance and efficiency metrics of each algorithm, and selecting the algorithm with the highest score as the best objective optimization algorithm; training a preset blood drug concentration prediction model based on the best objective optimization algorithm; and inputting the dataset to be tested into the trained model and outputting the predicted blood drug concentration values.

[0007] In relevant embodiments, before the clinical dataset is divided into the test dataset and the training set, the method further includes: applying the random forest algorithm to the covariates in the clinical dataset to rank the importance of the indicators in the dataset; screening covariates by stepwise regression on the importance-ranked dataset and selecting a set number of them as the core covariate set; determining the route of administration and the pharmacokinetic structural equation from the pharmacokinetic mechanism and the core covariate set, establishing the blood drug concentration-time structural equation, and introducing the clearance rate CL, the apparent volume of distribution V, and the absorption rate constant ka to form an individualized concentration-time prediction model; and determining the patient's individual pharmacokinetic parameters CL, V, and ka from the individualized concentration-time prediction model and the measured blood drug concentration and time data.

[0008] In relevant embodiments, determining the pharmacokinetic parameters from the individualized concentration-time prediction model and the measured concentration-time data includes: inputting the measured blood drug concentration and time data into the individualized concentration-time prediction model and estimating the pharmacokinetic parameters (clearance rate CL, apparent volume of distribution V, and absorption rate constant ka) through regression equations; obtaining individualized pharmacokinetic parameters from the basic parameters and the core covariates; and substituting these, together with the dosage and dosing regimen, into the pharmacokinetic structural equation to obtain the final pharmacokinetic parameters.

[0009] In relevant embodiments, when there are missing values in the core covariate set, a covariate replacement algorithm is used to fill in the missing values.

[0010] In related embodiments, after covariates are screened by stepwise regression on the importance-ranked clinical dataset, the method further includes forming a covariate set from all the obtained covariates. When the concomitant medications in the dataset to be tested are included in the core covariate set, the blood drug concentration model is trained based on the dataset to be tested and the predicted blood drug concentration is output. When the concomitant medications in the dataset to be tested are not included in the core covariate set, a matching concomitant medication is selected from the covariate set and added to the core covariate set to form a personalized core covariate set.
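The branching described above can be sketched as a small set operation. This is only an illustration under stated assumptions: the function name `personalized_core_set` and the covariate and drug names are hypothetical, and "matching" is interpreted here as simple membership in the full covariate set, which the text does not specify.

```python
def personalized_core_set(core_set, full_set, patient_comedications):
    """Return the core covariate set, extended where needed for one patient.

    Assumption: a concomitant medication "matches" if it appears in the
    full covariate set produced by stepwise regression.
    """
    core = set(core_set)
    # Medications the patient takes that the core set does not cover
    uncovered = [m for m in patient_comedications if m not in core]
    # Add only those that exist in the full covariate set
    additions = {m for m in uncovered if m in full_set}
    return core | additions


# Hypothetical example data
core = {"age", "weight", "CYP2C19_genotype", "omeprazole"}
full = core | {"fluvoxamine", "smoking"}

same = personalized_core_set(core, full, ["omeprazole"])
extended = personalized_core_set(core, full, ["fluvoxamine"])
```

In the first call the patient's medication is already covered, so the core set is returned unchanged; in the second, the matching medication is added to form the personalized set.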

[0011] In relevant embodiments, in the trained pharmacokinetic model dataset, the individual core covariates are substituted into the pharmacokinetic structural equation to obtain prior parameters, and then Bayesian updates are performed using the individual's subsequent concentration monitoring data to obtain individual posterior parameters; based on the individual posterior parameters and the dosing regimen, the predicted blood drug concentration is determined and output.

[0012] In relevant embodiments, the pharmacokinetic parameters are regressed in exponential form. From the symbol definitions, the formula takes the form $P = a \cdot e^{b_1 x_1} \cdot x_2^{b_2}$, where $P$ is the pharmacokinetic parameter, $a$ is the baseline value, $e$ is the base of the natural logarithm, and $b_1$ is the log-linear coefficient of covariate $x_1$. $x_1$ is a categorical variable, which can be one or more of gender, smoking status, alcohol consumption, and genotype; $x_2$ is a numerical variable, which can be one or more of age, height, and weight; and $b_2$ is the power exponent of covariate $x_2$.
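As a worked illustration, the sketch below evaluates a covariate model of the assumed form `a * exp(b1 * x_cat) * x_num ** b2`, with one categorical and one numerical covariate. The function name, the coefficient values, and the normalization of weight to 70 kg are hypothetical and are not taken from the patent.

```python
import math


def individual_parameter(a, b1, x_cat, b2, x_num):
    """Baseline a, scaled by a log-linear categorical effect and a power-law numeric effect."""
    return a * math.exp(b1 * x_cat) * (x_num ** b2)


# Hypothetical values: baseline clearance 40 L/h, smoking coefficient 0.2,
# body weight normalized to 70 kg with exponent 0.75.
cl_reference = individual_parameter(40.0, 0.2, 0, 0.75, 70.0 / 70.0)
cl_smoker = individual_parameter(40.0, 0.2, 1, 0.75, 70.0 / 70.0)
cl_heavier = individual_parameter(40.0, 0.2, 0, 0.75, 90.0 / 70.0)
```

With the categorical covariate at 0 and the normalized numeric covariate at 1, the parameter reduces to the baseline `a`, which is a convenient sanity check on fitted coefficients.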

[0013] In relevant embodiments, the clinical dataset includes: Medication details: date of administration, time of administration, dosage, blood drug concentration, number of consecutive administrations, and dosing interval; Physiological parameters: age, sex, height, weight, and liver and kidney function; Lifestyle: whether the patient has underlying medical conditions, smokes, or drinks alcohol; Genotype and concomitant medications: genotypes of enzymes involved in drug metabolism, and the types and frequencies of concomitant medications.

[0014] Secondly, a blood drug concentration prediction model includes: The data acquisition module is used to collect clinical datasets; The data optimization module is used to optimize clinical datasets and execute prediction methods. The algorithm selection module is used to calculate the performance and efficiency metrics of each objective optimization algorithm. After filtering based on preset performance and efficiency thresholds, a comprehensive score is obtained by weighting the performance and efficiency metrics of each algorithm, and the algorithm with the highest score is selected as the best objective optimization algorithm. The blood drug concentration prediction module is used to input the dataset to be tested into the trained blood drug concentration prediction model to obtain the individual blood drug concentration prediction value.

[0015] Thirdly, an electronic device includes a memory and a processor, the memory storing computer-readable instructions that, when executed by the processor, cause the processor to perform a blood drug concentration prediction method.

[0016] Compared with the prior art, the present invention has the following beneficial effects: (1) The invention iterates two or more target optimization algorithms to their optimal results. After screening against the preset effect index and efficiency index thresholds, a comprehensive score is obtained for each algorithm by weighted scoring of its effect and efficiency indexes, and the algorithm with the highest score is taken as the best target optimization algorithm, which calculates the pharmacokinetic parameters more accurately. Compared with optimization by a single algorithm, this is better suited to SSRI (selective serotonin reuptake inhibitor) medication recommendations and reduces the doctor's judgment time.

[0017] (2) By taking into full account the individual characteristics of patients, the present invention can accurately predict the patient’s response to drugs, thereby avoiding the use of drugs that may have adverse effects on patients; this helps to reduce the incidence of adverse drug reactions and improve the safety of patients’ medication.

[0018] (3) The personalized medication recommendation model constructed in this invention has efficient predictive ability, can quickly and accurately calculate the patient's pharmacokinetic parameters and blood drug concentration prediction values, can process a large amount of clinical data in a short time, and provide doctors with timely and effective medication advice.

Attached Figure Description

[0019] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art are briefly introduced below. The drawings described below are some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort. Figure 1 is a flowchart of an embodiment of the present invention.

[0020] Figure 2 is a flowchart of covariate selection.

[0021] Figure 3 is a flowchart of the covariate replacement algorithm.

[0022] Figure 4 is a schematic diagram of a blood drug concentration prediction model for personalized medicine.

[0023] Figure 5 is a schematic diagram of an electronic device.

Detailed Implementation

[0024] To make the objectives, technical solutions, and advantages of this invention clearer, the invention is described in further detail below with reference to Figures 1 to 5. The described embodiments should not be regarded as limitations on the present invention. All other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0025] Before providing a further detailed description of the embodiments of the present invention, the nouns and terms involved in the embodiments of the present invention will be explained, and the nouns and terms involved in the embodiments of the present invention shall be interpreted as follows.

[0026] Objective optimization algorithms refer to mathematical methods or computational strategies used to find the optimal solution to a problem. Their core is to find the solution that satisfies the optimal value of the objective function in the feasible solution space through an iterative or search process. In this embodiment, objective optimization algorithms include differential evolution algorithm, genetic evolution algorithm, and particle swarm optimization algorithm.

[0027] Performance metrics: used to quantify the performance of objective optimization algorithms in solving problems, mainly including convergence (the degree to which the solution is close to the true optimal solution) and diversity (the uniformity of the distribution of the solution set).

[0028] Efficiency metrics: These measure the resource consumption and computational performance of an algorithm when solving a problem. They primarily focus on time efficiency and space efficiency, while also considering extended dimensions such as convergence speed and parallelism.

[0029] Random Forest Algorithm: This is a machine learning algorithm based on ensemble learning that improves the stability and accuracy of the model by combining the predictions of multiple decision trees (voting or averaging).

[0030] Stepwise regression is a statistical method that constructs the optimal regression model by dynamically selecting variables. Its core lies in balancing the explanatory power and simplicity of the model through a "both in and out" mechanism.

[0031] Structural equations of pharmacokinetics are mathematical models used to quantitatively describe the dynamic processes of drugs in the body. Their core is to integrate kinetic parameters of absorption, distribution, metabolism, and excretion (ADME) through mathematical equations.

[0032] Pharmacokinetic parameters include clearance rate CL, apparent volume of distribution V, and absorption rate constant ka.

[0033] Basic parameters include covariate information such as patient age, gender, concomitant medication information, and time.

[0034] Final pharmacokinetic parameters: including typical population values and covariate coefficients obtained after covariate modeling and parameter iteration.

[0035] Covariate replacement algorithm: This is a method for handling missing covariates in causal inference and statistical modeling. Its core goal is to reduce model bias and improve the reliability of analysis by reasonably estimating missing values.
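The patent's covariate replacement algorithm is defined in its Figure 3, which is not reproduced in this text. As a stand-in only, the sketch below fills missing numeric covariates with the median and missing categorical covariates with the most frequent value; this is a common baseline imputation and is an assumption, not the patented procedure. The function name `fill_missing` and the record fields are hypothetical.

```python
from statistics import median, mode


def fill_missing(records, numeric_keys, categorical_keys):
    """Return copies of the records with None values filled per column.

    Numeric columns use the median of observed values; categorical
    columns use the most frequent observed value. Raises
    statistics.StatisticsError if a column has no observed values.
    """
    filled = [dict(r) for r in records]  # leave the input untouched
    for key in numeric_keys:
        observed = [r[key] for r in records if r.get(key) is not None]
        fill = median(observed)
        for r in filled:
            if r.get(key) is None:
                r[key] = fill
    for key in categorical_keys:
        observed = [r[key] for r in records if r.get(key) is not None]
        fill = mode(observed)
        for r in filled:
            if r.get(key) is None:
                r[key] = fill
    return filled


# Hypothetical patient records with gaps
patients = [
    {"weight_kg": 60.0, "smoker": "no"},
    {"weight_kg": None, "smoker": "no"},
    {"weight_kg": 80.0, "smoker": None},
    {"weight_kg": 70.0, "smoker": "yes"},
]
completed = fill_missing(patients, ["weight_kg"], ["smoker"])
```

A model-based replacement (for example, predicting the missing covariate from the others) would follow the same interface; only the computation of the fill value would change.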

[0036] Prior parameters: The prior distribution of parameters from the internal population model.

[0037] Individual posterior parameters: the posterior means and random effects of the individual pharmacokinetic parameters CL, V, ka, etc., obtained by updating the prior parameters with the individual's concentration data.

[0038] With reference to Figure 1, a method for predicting blood drug concentration according to the present invention includes: Step 100: Obtain the clinical dataset, which is divided into the test dataset and the training set. The clinical dataset includes: Medication details: date of administration, time of administration, dosage, blood drug concentration, number of consecutive administrations, and dosing interval; Physiological parameters: age, sex, height, weight, and liver and kidney function; Lifestyle: whether the patient has underlying medical conditions, smokes, or drinks alcohol; Genotype and concomitant medications: genotypes of enzymes involved in drug metabolism, and the types and frequencies of concomitant medications.

[0039] As shown in Table 1, the personalized medication indicator system table

[0040] The above is an example; other features related to blood drug concentration can be introduced as needed.

[0041] Step 200: The training set is optimized and iterated to the optimal result using two or more objective optimization algorithms. In this embodiment, three objective optimization algorithms, namely differential evolution algorithm, genetic evolution algorithm and particle swarm optimization algorithm, are selected to iterate the pharmacokinetic parameters, calculate the relatively optimal solution of the pharmacokinetic parameters for each patient, and evaluate the results.

[0042] The iterative algorithm for individual pharmacokinetic parameters is as follows. To establish a personalized blood drug concentration prediction model and recommend medication, the pharmacokinetic parameters must first be determined. These parameters cannot be read directly from the clinical dataset, so they are calculated iteratively. Specifically, the three unknown pharmacokinetic parameters in the following formula are derived from the existing data in the clinical dataset. The formula for calculating blood drug concentration is:

$$C(t) = \frac{F \cdot D \cdot k_a}{V\,(k_a - k_e)}\left(e^{-k_e t} - e^{-k_a t}\right)$$

where $C(t)$ is the predicted blood drug concentration, $k_a$ is the absorption rate constant, $F$ is the bioavailability, $k_e$ is the clearance (elimination) rate constant, $V$ is the apparent volume of distribution, $D$ is the dosage of medication, and $t$ is the time interval between medication administration and blood collection. The known variables in the data are the blood drug concentration, the medication dosage, and the time interval; the unknown parameters are the absorption rate constant $k_a$, the clearance rate constant $k_e$, and the apparent volume of distribution $V$. The clearance rate constant also satisfies the relationship

$$k_e = \frac{CL}{V}$$

where $CL$ is the clearance rate and $V$ is the apparent volume of distribution.
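A minimal sketch of this calculation follows, assuming the standard one-compartment model with first-order absorption and elimination after a single oral dose, C(t) = F·D·ka / (V·(ka − ke)) · (e^(−ke·t) − e^(−ka·t)) with ke = CL/V. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def elimination_rate(cl, v):
    """ke = CL / V."""
    return cl / v


def predict_concentration(dose, f, ka, ke, v, t):
    """Concentration at time t after a single oral dose (one-compartment model)."""
    if math.isclose(ka, ke):
        # Limiting form of the equation as ka approaches ke (avoids dividing by zero)
        return f * dose * ka * t * math.exp(-ka * t) / v
    return f * dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


# Hypothetical values: 50 mg dose, F = 1, ka = 1.0 /h, CL = 50 L/h, V = 1000 L
ke = elimination_rate(50.0, 1000.0)
t_max = math.log(1.0 / ke) / (1.0 - ke)  # time of peak concentration for ka = 1.0
c_peak = predict_concentration(50.0, 1.0, 1.0, ke, 1000.0, t_max)
c_late = predict_concentration(50.0, 1.0, 1.0, ke, 1000.0, 48.0)
```

The peak time follows from setting the derivative of the concentration curve to zero, which gives t_max = ln(ka/ke) / (ka − ke).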

[0043] According to the patient information, all quantities in the blood drug concentration formula other than the pharmacokinetic parameters are known; the pharmacokinetic parameters are the unknowns that need to be iterated.

[0044] Optimization of initial values for pharmacokinetic parameters based on particle swarm optimization algorithm

Particle Swarm Optimization (PSO) is an evolutionary computation technique that originated from the study of the foraging behavior of bird flocks. Based on observation of the collective activity of animals, it uses information sharing among individuals in the group so that the movement of the whole group evolves from disorder to order in the problem solution space, thereby obtaining the optimal solution.

[0045] In the algorithm, each solution is called a "particle." Every particle has a fitness value determined by the function being optimized, and a velocity that determines the direction and distance of its flight. All particles follow the current best particle through the solution space. The process is as follows. First, initialize a swarm of particles with random positions and velocities. Evaluate the fitness of each particle and compare it with the fitness of the particle's own best position pbest (the individual extreme value); if the new value is better, it becomes the particle's pbest. Then compare it with the fitness of gbest (the population extreme value); if it is better, it becomes the new gbest. Adjust the velocity and position of each particle, and repeat these steps until the maximum number of iterations is reached or a position is found that satisfies a preset minimum fitness threshold. The corresponding relationships in the particle swarm algorithm are shown in Table 2.

[0046] Table 2. Correspondence between Particle Swarm Optimization Algorithms

[0047] The particle swarm optimization algorithm has five main parameters: population size, maximum number of iterations, inertia weight, individual memory, and group memory. The reference ranges for each parameter are shown in Table 3.

[0048] Table 3. Important parameters of the particle swarm algorithm

[0049] The inertial weight w represents the influence of the velocity of the previous generation of particles on the velocity of the current generation of particles, that is, the degree of confidence a particle has in its current state of motion. The larger w is, the stronger the ability to explore new regions and the stronger the global optimization ability, but the weaker the local optimization ability. Conversely, the smaller w is, the weaker the global optimization ability and the stronger the local optimization ability.
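The particle swarm procedure can be sketched as follows for the pharmacokinetic fitting task. This is an illustration under stated assumptions, not the patented implementation: the one-compartment concentration model, the synthetic measurements, the parameter bounds, and the defaults `w=0.7`, `c1=1.5`, `c2=1.5` (standing in for inertia weight, individual memory, and group memory) are all hypothetical, since Tables 2 and 3 are not reproduced in this text.

```python
import math
import random


def concentration(dose, f, ka, ke, v, t):
    """One-compartment oral model (assumed form); requires ka != ke."""
    return f * dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


def pso(objective, bounds, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box bounds; returns (gbest, gbest_val, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = []
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward own best + pull toward swarm best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # keep inside bounds
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


# Synthetic "measurements" generated from hypothetical true parameters (ka, ke, V)
DOSE, F_BIO = 50.0, 1.0
TIMES = [1.0, 2.0, 4.0, 8.0, 12.0, 24.0]
TRUE = (1.0, 0.05, 1000.0)
OBSERVED = [concentration(DOSE, F_BIO, *TRUE, t) for t in TIMES]


def sse(params):
    """Sum of squared errors between predicted and observed concentrations."""
    ka, ke, v = params
    return sum((concentration(DOSE, F_BIO, ka, ke, v, t) - c) ** 2
               for t, c in zip(TIMES, OBSERVED))


# Bounds keep ka strictly above ke, so the model never divides by zero
BOUNDS = [(0.3, 3.0), (0.01, 0.2), (200.0, 3000.0)]
best, best_err, history = pso(sse, BOUNDS)
```

Raising `w` makes particles keep more of their previous velocity and explore more widely, while lowering it concentrates the search near the best positions found so far, which matches the trade-off described above.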

[0050] Pharmacokinetic parameter optimization based on genetic algorithm

Genetic algorithms are objective optimization algorithms that mimic the genetic characteristics of biological evolution. They represent the solutions to a problem as "chromosomes," which are iterated and optimized in the algorithm by simulating natural selection and genetic mechanisms to seek the optimal solution to the problem.

[0051] The core elements of a genetic algorithm are encoding, initialization, the fitness function, selection, crossover, mutation, and iteration. Encoding: maps the solution space of the problem to the search space of the genetic algorithm, representing the solutions as chromosomes, usually through binary encoding, real-number encoding, or other forms of encoding. Initialization: randomly generates an initial population consisting of a certain number of chromosomes, each representing a potential solution. Fitness function: evaluates the fitness of each chromosome, i.e., the quality of the solution it represents, and is usually related to the objective function of the problem. Selection: based on the fitness function, chromosomes with higher fitness are selected for reproduction, simulating the "survival of the fittest" principle in nature. Crossover: two chromosomes are randomly selected from the population and their genes are exchanged in some way (such as single-point or multi-point crossover) to produce new chromosomes, simulating gene recombination in biological evolution. Mutation: a gene is randomly selected from the chromosome and its value is changed, simulating gene mutation in biological evolution; this helps maintain population diversity and prevents the algorithm from getting trapped in local optima too early. Iteration: repeated selection, crossover, and mutation generate a new population, whose fitness is then evaluated. The process continues until a termination condition is met, such as reaching a preset number of iterations or finding a solution that meets the requirements.

[0052] Commonly used parameters in genetic algorithms include population size, crossover rate, mutation rate, number of iterations, and fitness function. Population size is the number of chromosomes in the population. It affects the search range and convergence speed of the algorithm. An excessively large population size can significantly increase computational cost, while an excessively small population size may cause the algorithm to get trapped in local optima prematurely. The crossover rate determines the probability of crossover operations occurring. A high crossover rate may lead to an overly random search, while a low crossover rate may reduce search efficiency. The mutation rate determines the probability of mutation operations occurring. A high mutation rate may lead to an overly random search, while a low mutation rate may cause the algorithm to get trapped in local optima. The number of iterations is the number of times the algorithm runs. Too many iterations can significantly increase computational cost, while too few iterations may prevent the algorithm from finding the optimal solution. The fitness function is used to evaluate the fitness of chromosomes. The choice of fitness function directly affects the performance and results of the algorithm.

[0053] Genetic algorithms simulate biological evolution by using operations such as selection, crossover, and mutation to find the optimal solution in a predefined search space. They possess advantages such as strong global search capabilities and ease of integration with other algorithms, and have been widely applied in machine learning, optimization problems, and intelligent control.
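The genetic algorithm loop can be sketched with real-number encoding as below. It is a minimal illustration under stated assumptions: tournament selection, single-point crossover, resampling mutation, and elitism (carrying the best chromosome forward unchanged) are choices made for this sketch and are not specified by the patent. Fitness is expressed as a cost to minimize, so a lower cost means a fitter chromosome; the quadratic cost is a toy stand-in for the concentration prediction error.

```python
import random


def genetic_algorithm(cost, bounds, pop_size=40, n_gen=60,
                      crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimize cost over box bounds; returns (best, best_cost, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialization: each chromosome is a list of real-valued genes
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(n_gen):
        elite = min(pop, key=cost)
        history.append(cost(elite))
        new_pop = [elite[:]]  # elitism: the best chromosome survives unchanged
        while len(new_pop) < pop_size:
            # Selection: the better of three random chromosomes becomes a parent
            p1 = min(rng.sample(pop, 3), key=cost)
            p2 = min(rng.sample(pop, 3), key=cost)
            child = p1[:]
            # Crossover: single cut point, head from p1 and tail from p2
            if dim > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, dim - 1)
                child = p1[:cut] + p2[cut:]
            # Mutation: occasionally resample a gene within its bounds
            for d in range(dim):
                if rng.random() < mutation_rate:
                    lo, hi = bounds[d]
                    child[d] = rng.uniform(lo, hi)
            new_pop.append(child)
        pop = new_pop
    best = min(pop, key=cost)
    return best, cost(best), history


def toy_cost(x):
    """Hypothetical stand-in objective with its minimum at (1, 1, 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


GA_BOUNDS = [(-5.0, 5.0)] * 3
ga_best, ga_cost, ga_history = genetic_algorithm(toy_cost, GA_BOUNDS)
```

Because of elitism, the best cost recorded per generation can never get worse, which makes the run easy to monitor; without it, crossover and mutation can lose the best solution found so far.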

[0054] Random Term Optimization Based on Differential Evolution Algorithm

The basic idea of the differential evolution algorithm comes from the mutation, crossover and selection operations in biological evolution. However, it searches in a continuous space, and it differs from the traditional genetic algorithm in encoding method and evolution operations.

[0055] Population initialization: A set of initial solutions is randomly generated from the solution space of the optimization problem. This set of solutions constitutes the initial population. Each solution is a vector, and the dimension of the vector corresponds to the number of variables in the optimization problem. For example, in pharmacokinetics parameter optimization, each parameter is a dimension of the vector.

[0056] Mutation: This is a crucial step in the differential evolution algorithm. For each individual in the population, a mutated individual is generated by weighting the vector difference between two different individuals in the population and adding it to a third individual. Mathematically, this is expressed as:

$$V_{i,G} = X_{r_1,G} + F \cdot \left(X_{r_2,G} - X_{r_3,G}\right)$$

where $V_{i,G}$ is the mutant individual of the $i$-th individual in generation $G$; $X_{r_1,G}$, $X_{r_2,G}$, and $X_{r_3,G}$ are three distinct individuals randomly selected from the generation-$G$ population, all different from the current individual $X_{i,G}$; and $F$ is a scaling factor that controls the scaling of the difference vector and determines the intensity of the mutation.

[0057] Crossover: To increase population diversity, the mutant individual is crossed with the current individual to generate an experimental individual. Common crossover methods include binomial crossover and exponential crossover. Taking binomial crossover as an example, for each dimension a crossover probability $CR$ decides whether the value of the mutant individual or the value of the current individual is used. Mathematically, this is expressed as:

$$U_{i,j,G} = \begin{cases} V_{i,j,G}, & \text{if } rand_j \le CR \text{ or } j = j_{rand} \\ X_{i,j,G}, & \text{otherwise} \end{cases}$$

where $U_{i,j,G}$ is the value of the experimental individual in the $j$-th dimension; $rand_j$ is a random number uniformly distributed between 0 and 1; $j_{rand}$ is a dimension randomly selected from 1 to $n$ (the number of problem dimensions) to ensure that at least one dimension of the experimental individual comes from the mutant individual; $V_{i,j,G}$ is the mutation value of the $i$-th individual ($i$ is the individual number) in the $j$-th dimension of the $G$-th generation; and $X_{i,j,G}$ is the original value of the $i$-th individual in the $j$-th dimension of the $G$-th generation.

[0058] Selection operation: Compare the fitness values of the experimental individual and the current individual, and select the individual with better fitness to enter the next generation of the population. The fitness value is defined according to the specific optimization problem. In pharmacokinetic parameter optimization, it may be the objective function that minimizes the error between the model-predicted blood drug concentration and the actual measured value. If the fitness of the experimental individual is better than that of the current individual, the experimental individual enters the next generation; otherwise, the current individual directly enters the next generation. In this way, the population evolves towards a better solution in each generation.

[0059] Common parameters

Population size $NP$: the number of individuals in the population. A larger population covers a wider search space and raises the likelihood of finding the global optimum, but it also increases computational cost and time. In practical applications, the size is adjusted to the complexity of the problem and the available computational resources; generally, a value of $NP$ between […] is considered optimal.

[0060] Scaling factor $F$: controls the scaling of the difference vector and determines the strength of the mutation. Its value typically lies between […]. When $F$ is small, the mutant differs little from the original individual and the algorithm tends toward local search. When $F$ is large, the mutant differs significantly from the original individual and the algorithm tends toward global search. Common values are […].

[0061] Crossover probability $CR$: determines the probability that each dimension of the experimental individual takes the value of the mutant individual. Its value lies between […]. The larger $CR$ is, the more the experimental individual resembles the mutant individual, which increases population diversity but may cause the algorithm to converge prematurely; the smaller $CR$ is, the more of its own information the experimental individual retains, which may slow the algorithm's convergence. Values between […] are typical.

[0062] Maximum number of iterations $G_{max}$: the maximum number of iterations the algorithm can run. The algorithm stops when this number is reached, which limits its computation time and resource consumption. If the number of iterations is set too small, the algorithm may not converge to the optimal solution; if it is set too large, computational resources are wasted. In practical applications, it is adjusted according to the complexity of the problem and the algorithm's convergence performance.

[0063] Algorithm flow. Step 1, initialization: set the population size NP, the scaling factor F, the crossover probability, the maximum number of iterations and other parameters; randomly generate the initial population and calculate the fitness value of each individual.

[0064] Step 2, mutation: in each generation, for each individual in the population, generate a mutant individual according to the mutation formula.

[0065] Step 3, crossover: cross the mutant individual with the current individual to generate a trial individual.

[0066] Step 4, selection: compare the fitness values of the trial individual and the current individual, and select the one with the better fitness to enter the next generation of the population.

[0067] Step 5, termination check: check whether the maximum number of iterations has been reached or another termination condition is met (such as no significant improvement in fitness over multiple generations). If the termination condition is met, the individual with the best fitness in the current population is output as the optimal solution; otherwise, return to step 2 and continue with the next generation.
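The flow in the preceding paragraphs can be illustrated with a minimal sketch. It assumes the common DE/rand/1/bin variant, a box-bounded search space, and illustrative settings (`np_size=20`, `f=0.5`, `cr=0.9`, `max_iter=200`) that are not taken from this specification; the function names and the toy objective `sphere` are hypothetical, and only the maximum-iteration termination condition is implemented.

```python
import random


def differential_evolution(objective, bounds, np_size=20, f=0.5, cr=0.9,
                           max_iter=200, seed=0):
    """Minimize `objective` inside `bounds` (DE/rand/1/bin, assumed variant)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: random initial population and its fitness values.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_size)]
    fit = [objective(x) for x in pop]
    for _ in range(max_iter):  # Step 5: stop at the maximum iteration count.
        new_pop, new_fit = list(pop), list(fit)
        for i in range(np_size):
            # Step 2: mutation, a base vector plus a scaled difference vector.
            r1, r2, r3 = rng.sample([k for k in range(np_size) if k != i], 3)
            mutant = [pop[r1][d] + f * (pop[r2][d] - pop[r3][d])
                      for d in range(dim)]
            # Clip the mutant back into the search box (an added assumption).
            mutant = [min(max(m, lo), hi) for m, (lo, hi) in zip(mutant, bounds)]
            # Step 3: crossover, each dimension comes from the mutant with
            # probability cr; j_rand forces at least one mutant component.
            j_rand = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < cr or d == j_rand) else pop[i][d]
                     for d in range(dim)]
            # Step 4: selection, the better of trial and current survives.
            trial_fit = objective(trial)
            if trial_fit <= fit[i]:
                new_pop[i], new_fit[i] = trial, trial_fit
        pop, fit = new_pop, new_fit
    best = min(range(np_size), key=lambda k: fit[k])
    return pop[best], fit[best]


def sphere(x):
    """Toy objective with its minimum at (1, 1, 1); stands in for a fitting error."""
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

In a pharmacokinetic setting, the objective would be the fitting error between observed and predicted concentrations, and each individual would encode candidate values of the parameters being estimated.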

[0068] The differential evolution algorithm has advantages such as simple principle, easy implementation, fast convergence speed, and strong robustness. It does not require calculating the derivative of the objective function, making it suitable for various complex optimization problems, especially when the objective function is non-differentiable or difficult to differentiate.

[0069] The pharmacokinetic parameters obtained by iterating with the three algorithms (particle swarm optimization, genetic algorithm and differential evolution) were compared. Based on the weighted performance and efficiency indicators, the particle swarm optimization algorithm was selected as the optimal algorithm.

[0070] Step 300: Calculate the performance index and the efficiency index of each objective optimization algorithm, and filter the algorithms against a preset performance index threshold (e.g., fitting error ≤ 5%, prediction accuracy ≥ 90%) and a preset efficiency index threshold (e.g., iteration steps ≤ 300, optimization time ≤ 120 seconds). In this example, the genetic algorithm and the differential evolution algorithm do not meet the thresholds and are removed, so only the particle swarm optimization algorithm enters the weighted scoring stage. A comprehensive score is then obtained for each remaining algorithm by weighting its performance index (e.g., weight 70%) and its efficiency index (e.g., weight 30%), and the algorithm with the highest score is selected as the best objective optimization algorithm. Step 400: Train the preset blood drug concentration prediction model using the best objective optimization algorithm. Step 500: Input the dataset to be tested into the trained blood drug concentration prediction model, and output the predicted blood drug concentration value.
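The threshold filtering and weighted scoring of Step 300 can be sketched as follows. The thresholds and the 70%/30% weights are the example values given above; how the individual metrics are combined into a performance index and an efficiency index is not specified, so the averaging and normalization used here are assumptions, and the three result records are made-up inputs for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AlgorithmResult:
    name: str
    fitting_error: float        # fraction, 0.04 means 4 %
    prediction_accuracy: float  # fraction, 0.93 means 93 %
    iteration_steps: int
    optimization_time_s: float


MAX_ERROR, MIN_ACCURACY = 0.05, 0.90   # example performance thresholds
MAX_STEPS, MAX_TIME_S = 300, 120.0     # example efficiency thresholds


def passes_thresholds(r):
    """Keep only algorithms that meet every preset threshold."""
    return (r.fitting_error <= MAX_ERROR
            and r.prediction_accuracy >= MIN_ACCURACY
            and r.iteration_steps <= MAX_STEPS
            and r.optimization_time_s <= MAX_TIME_S)


def composite_score(r, w_perf=0.7, w_eff=0.3):
    """Weighted score; the index definitions below are assumptions."""
    # Performance index: mean of accuracy and (1 - fitting error).
    performance = 0.5 * (r.prediction_accuracy + (1.0 - r.fitting_error))
    # Efficiency index: mean unused fraction of the step and time budgets.
    efficiency = 0.5 * ((1.0 - r.iteration_steps / MAX_STEPS)
                        + (1.0 - r.optimization_time_s / MAX_TIME_S))
    return w_perf * performance + w_eff * efficiency


def select_best(results):
    """Filter by thresholds, then return the highest-scoring algorithm."""
    candidates = [r for r in results if passes_thresholds(r)]
    return max(candidates, key=composite_score) if candidates else None


# Made-up example records, not measured results.
results = [
    AlgorithmResult("PSO", 0.04, 0.93, 180, 60.0),
    AlgorithmResult("GA", 0.07, 0.91, 250, 100.0),   # fails the error threshold
    AlgorithmResult("DE", 0.04, 0.92, 350, 90.0),    # fails the step threshold
]
best = select_best(results)
```

Returning `None` when no algorithm passes keeps the "no candidate" case explicit rather than silently falling back to an algorithm that failed a threshold.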

[0071] In one embodiment, before the clinical dataset is divided into the test dataset and the training set, the method further includes the following steps. As shown in Figure 3, the random forest algorithm is applied to the covariates in the clinical dataset to determine the importance ranking of the indicators in the clinical dataset. As shown in Figure 2, the random forest algorithm performs a first round of variable screening over the large number of features, and stepwise regression then performs a second round of variable screening, which addresses the high number and dimensionality of the features and compensates for the shortcomings of the nonlinear mixed-effects model NONMEM in external validation.

[0072] For the clinical dataset ranked by indicator importance, covariates are screened by stepwise regression, and a set number of them are selected as the core covariate set. Based on the pharmacokinetic mechanism and the core covariate set, the route of administration and the pharmacokinetic structural equation are determined, a blood drug concentration-time pharmacokinetic structural equation is established, and the parameters clearance CL, apparent volume of distribution V and absorption rate constant ka are introduced to form an individualized concentration-time prediction model. For example: the route of administration and the structural equation are determined and the concentration-time equation is established; the optimal combination of covariates is screened from the core covariate set; the pharmacokinetic parameters CL, V and ka are estimated iteratively; and the blood drug concentration is calculated from the pharmacokinetic structural equation, yielding the individualized prediction model.

[0073] Based on the individualized concentration-time prediction model and actual blood drug concentration and time data, pharmacokinetic parameters are determined. Specifically, the time it takes for the patient to digest the drug is taken as a criterion and incorporated into the iterative model, and the predicted concentration is output at the specified time point.

[0074] The pharmacokinetic parameters are modeled by exponential regression, because these parameters are theoretically non-negative.

[0075] Here a is the baseline value and e is the base of the natural logarithm; b1 is the log-linear coefficient of a categorical covariate, which is one or more of gender, smoking status, alcohol consumption and genotype; and b2 is the power exponent of a numerical covariate, which is one or more of age, height and weight.
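The regression formula itself is not reproduced in this text. One functional form consistent with the definitions above is sketched below; the symbols $P$ (a pharmacokinetic parameter such as CL), $X_{\mathrm{cat}}$ (a categorical covariate coded numerically) and $X_{\mathrm{num}}$ (a positive numerical covariate) are assumptions introduced for illustration and may differ from the original notation.

$$P = a \cdot e^{\,b_1 X_{\mathrm{cat}}} \cdot X_{\mathrm{num}}^{\,b_2}$$

Taking logarithms gives a model that is linear in the coefficients, which is why $b_1$ is called a log-linear coefficient:

$$\ln P = \ln a + b_1 X_{\mathrm{cat}} + b_2 \ln X_{\mathrm{num}}$$

With $a > 0$ and $X_{\mathrm{num}} > 0$, every factor on the right-hand side of the first equation is positive, so $P$ stays positive for any values of $b_1$ and $b_2$. With several covariates, each categorical covariate would contribute its own exponential factor and each numerical covariate its own power factor.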

[0076] In one embodiment, determining the pharmacokinetic parameters based on the individualized concentration-time prediction model and the actual blood drug concentration and time data includes the following. The actual blood drug concentration and time data are input into the individualized concentration-time prediction model, and the basic parameters are estimated through the regression equations. Specifically, a two-level (population and individual) pharmacokinetic structural model is established for the given route of administration and core covariate set, the optimal covariate combination is screened, and the time-concentration curve is calculated through constrained iteration and posterior optimization of the pharmacokinetic parameters CL, V and ka, thereby generating the individualized concentration-time prediction model. Individualized pharmacokinetic parameters are then obtained from the basic parameters and the core covariates and, combined with the dose and dosing regimen, are substituted into the pharmacokinetic structural equation to obtain the final pharmacokinetic parameters.

[0077] In one embodiment, when the core covariate set contains missing values, a covariate imputation algorithm is used to fill them. Missingness is classified according to whether it is related to the data themselves (to the unobserved value itself, or to the observed values of other variables). This classification directly determines the choice of the subsequent imputation method: for example, data missing at random can be handled by covariate imputation and similar methods, whereas data missing not at random must be combined with additional external information.

[0078] In one embodiment, after covariates are screened by stepwise regression in the clinical dataset ranked by indicator importance, the method further includes forming a covariate set from all the covariates obtained. When the concomitant medications in the dataset to be tested are included in the core covariate set, the blood drug concentration model is trained based on the dataset to be tested, and the predicted blood drug concentration is output. When the concomitant medications in the dataset to be tested are not included in the core covariate set, a matching concomitant medication is selected from the covariate set and added to the core covariate set to form a personalized core covariate set. In one embodiment, using the trained pharmacokinetic model, the individual's core covariates are substituted into the pharmacokinetic structural equation to obtain prior parameters; Bayesian updating is then performed with the individual's subsequent concentration monitoring data to obtain individual posterior parameters; and the predicted blood drug concentration is determined from the individual posterior parameters and the dosing regimen and is output.
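The Bayesian update from prior to posterior parameters can be illustrated with a deliberately small sketch. It is not the estimation procedure of this specification: it assumes a one-compartment model with first-order absorption, updates only the clearance CL, uses a log-normal prior (spread `omega`) and a log-scale residual error (spread `sigma`), and finds the maximum a posteriori value by a simple grid search. All names and numeric values are hypothetical, and the "observations" are simulated without noise.

```python
import math


def concentration(dose, f_bio, ka, cl, v, t):
    """Assumed structural model: one compartment, first-order absorption."""
    ke = cl / v  # elimination rate constant
    return (f_bio * dose * ka / (v * (ka - ke))
            * (math.exp(-ke * t) - math.exp(-ka * t)))


def map_clearance(cl_prior, omega, sigma, observations, dose, f_bio, ka, v,
                  grid_points=2000):
    """Grid-search MAP estimate of CL from (time, concentration) pairs."""
    lo, hi = cl_prior / 5.0, cl_prior * 5.0
    best_cl, best_obj = cl_prior, float("inf")
    for i in range(grid_points + 1):
        cl = lo * (hi / lo) ** (i / grid_points)  # log-spaced candidate
        # Prior penalty: distance from the covariate-based prior value.
        obj = ((math.log(cl) - math.log(cl_prior)) / omega) ** 2
        # Data penalty: mismatch with the monitored concentrations.
        for t, c_obs in observations:
            c_pred = concentration(dose, f_bio, ka, cl, v, t)
            obj += ((math.log(c_obs) - math.log(c_pred)) / sigma) ** 2
        if obj < best_obj:
            best_cl, best_obj = cl, obj
    return best_cl


# Hypothetical patient: the prior suggests CL = 10, the simulated data follow CL = 20.
true_cl, prior_cl = 20.0, 10.0
obs = [(t, concentration(100.0, 1.0, 1.0, true_cl, 100.0, t))
       for t in (4.0, 12.0, 24.0)]
posterior_cl = map_clearance(prior_cl, omega=0.3, sigma=0.2, observations=obs,
                             dose=100.0, f_bio=1.0, ka=1.0, v=100.0)
```

The posterior estimate lies between the prior value and the value implied by the data, and it moves closer to the data as more monitoring samples are added or as `sigma` decreases.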

[0079] The personalized core covariate set is input into the regression equation, the blood drug concentration prediction model is evaluated with the optimized and trained pharmacokinetic parameters, and the predicted blood drug concentration is output.

[0080] The final number of covariates is finite and includes multiple concomitant medications. However, in real-world scenarios, some "special patients" may not have taken any of the medications listed in the final covariates. This results in the parameters calculated through the regression equation failing to accurately represent the characteristics of these patients, leading to significant deviations in the prediction results. The solution proposed in this invention is to record the order of importance of all medications during the covariate selection process. When encountering the aforementioned "special patients," subsequent covariates can be added as the final covariates for that patient, making the model more personalized and thereby reducing prediction inaccuracies.
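The handling of "special patients" described above can be sketched as a lookup over the recorded importance ranking. The function name, the medication labels, and the rule of adding only the single highest-ranked matching medication are assumptions made for illustration.

```python
def personalize_core_covariates(ranked_medications, core_medications,
                                patient_medications):
    """Return the core medication covariates to use for one patient.

    ranked_medications: all concomitant medications, most important first,
        as recorded during covariate selection.
    core_medications: medications kept in the final (core) covariate set.
    patient_medications: medications this patient actually takes.
    """
    core = list(core_medications)
    # Ordinary patient: at least one core medication applies, keep the core set.
    if any(med in core for med in patient_medications):
        return core
    # Special patient: add the highest-ranked medication the patient takes.
    for med in ranked_medications:
        if med in patient_medications and med not in core:
            return core + [med]
    # The patient takes no ranked medication: nothing can be added.
    return core


# Hypothetical labels, listed from most to least important.
ranking = ["drug_A", "drug_B", "drug_C", "drug_D", "drug_E"]
core_set = ["drug_A", "drug_B"]

ordinary = personalize_core_covariates(ranking, core_set, {"drug_B", "drug_E"})
special = personalize_core_covariates(ranking, core_set, {"drug_E", "drug_D"})
```

Here `special` extends the core set with `drug_D`, because it ranks above `drug_E`, while `ordinary` leaves the core set unchanged.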

[0081] In one embodiment, random effects, also called random variation, include inter-individual variation, intra-individual variation (residual error) and inter-occasion variation. Inter-individual variation refers to the random error between different patients, excluding deterministic variation. Covariates are an important source of inter-individual variation and can explain part of it. Population typical values are usually functions of covariates, and changes in covariates affect the typical values of the pharmacokinetic/pharmacodynamic parameters. For example, if the apparent volume of distribution V increases with body weight, then the pharmacokinetic parameter V is a function of body weight; determining the specific form of this function is the task of modeling.

[0082] Intra-individual variation refers to the variation caused by different investigators, experimental methods and changes in the patient's own behavior over time, as well as by model specification error. Intra-individual variation (residual variation) describes the difference between the predicted and observed values for an individual. It usually arises from measurement error, model bias, and dosing or sampling errors. It is most commonly represented by additive and multiplicative models.
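For reference, the two residual error models mentioned above are commonly written as follows. The symbols $C_{\mathrm{obs}}$ (observed concentration), $C_{\mathrm{pred}}$ (model-predicted concentration) and $\varepsilon$ (a zero-mean random error with variance $\sigma^2$) are assumptions introduced here, not notation from this specification.

$$\text{Additive:}\quad C_{\mathrm{obs}} = C_{\mathrm{pred}} + \varepsilon$$

$$\text{Multiplicative (proportional):}\quad C_{\mathrm{obs}} = C_{\mathrm{pred}}\,(1 + \varepsilon)$$

In the additive model the size of the error is the same at all concentrations, whereas in the multiplicative model it scales with the predicted concentration. A closely related multiplicative form, $C_{\mathrm{obs}} = C_{\mathrm{pred}}\, e^{\varepsilon}$, keeps the observed value positive.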

[0083]

[0084] Here a is the baseline value and e is the base of the natural logarithm; b1 is the log-linear coefficient of a categorical covariate, such as gender, smoking status, alcohol consumption or genotype; b2 is the power exponent of a numerical covariate, such as age, height or weight; and a random disturbance term is introduced into the model. The disturbance term enters the regression equation multiplicatively and describes the random effects on the pharmacokinetic parameters and the blood drug concentration.

[0085] For predicting individual blood drug concentrations, the blood drug concentration calculation formula needs to be used:

[0086] The quantities in the formula are: the blood drug concentration; the absorption rate constant ka; the bioavailability F; the clearance rate constant; the apparent volume of distribution V; the administered dose; and the time interval between the patient's medication administration and blood collection.
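The formula itself is not reproduced in this text. As an illustration only, the sketch below assumes the standard one-compartment model with first-order absorption after a single oral dose, which uses exactly the quantities listed above, and takes the elimination rate constant as CL / V. The function name and the numeric values are hypothetical and are not sertraline parameters.

```python
import math


def predicted_concentration(dose, f_bio, ka, cl, v, t):
    """Concentration at time t after a single oral dose (assumed model).

    dose: administered dose; f_bio: bioavailability F (0..1);
    ka: absorption rate constant; cl: clearance; v: apparent volume of
    distribution; t: time between administration and blood collection.
    Units must be consistent (for example mg, L/h, L, h gives mg/L).
    """
    ke = cl / v  # elimination rate constant derived from CL and V
    if math.isclose(ka, ke):
        # Limiting form of the same expression when ka equals ke.
        return f_bio * dose * ka / v * t * math.exp(-ke * t)
    return (f_bio * dose * ka / (v * (ka - ke))
            * (math.exp(-ke * t) - math.exp(-ka * t)))


# Hypothetical values for illustration.
ka, cl, v = 1.0, 20.0, 100.0
t_peak = math.log(ka / (cl / v)) / (ka - cl / v)  # time of peak concentration
c_peak = predicted_concentration(100.0, 1.0, ka, cl, v, t_peak)
```

The concentration is zero at the moment of dosing, rises while absorption dominates, peaks at `t_peak`, and then declines at a rate governed by CL / V.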

[0087] Finally, a random disturbance term is incorporated to predict patient blood drug concentrations. The three pharmacokinetic parameters in the blood drug concentration calculation formula have already been used to derive regression equations through parameter iteration and covariate screening. After transforming the standard formula for the relationship between blood drug concentration and time into a linear equation, actual blood drug concentration and time data are substituted into the regression equation to estimate the basic parameters. A random disturbance term is then added to the regression equation in a multiplicative form, forming a regression equation containing a multiplicative random disturbance term.

[0088] Therefore, pharmacokinetic parameters can be calculated based on a patient's physiological parameters, lifestyle, genotype, and concomitant medications. These parameters are then incorporated into the blood drug concentration calculation formula, along with the patient's medication history, to obtain the predicted blood drug concentration. The predicted blood drug concentration is compared with the actual value, and the model's effectiveness is evaluated using assessment indicators. This ultimately forms a personalized medication recommendation model used to predict blood drug concentrations at different dosages and to recommend appropriate dosages within a given concentration range. In summary, the personalized medication recommendation model has been successfully constructed.

[0089] Model performance evaluation. The personalized medication recommendation model predicts a patient's blood drug concentration at a given dose. Because blood drug concentration is a numerical variable, its prediction is a regression problem, and quantitative evaluation metrics are needed to assess the prediction results. For a regression problem, the evaluation metrics can include the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE) and sum of squared errors (SSE).

[0090] (1) Mean squared error (MSE). The MSE is the average of the squared differences between the predicted and actual values: $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$, where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value and $n$ is the number of samples.

[0091] MSE is highly sensitive to outliers because it uses squared differences: when the prediction for a point differs greatly from the true value, the squared difference accounts for a large share of the total error. MSE is therefore often used when large deviations should be penalized heavily.

[0092] (2) Root mean squared error (RMSE). The RMSE is the square root of the MSE: $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$. The RMSE has the same units as the true and predicted values, which makes it easier to interpret. RMSE is also sensitive to outliers, but because it takes the square root, it is slightly less sensitive to them than MSE.

[0093] (3) Mean absolute error (MAE). The MAE is the average of the absolute differences between the predicted and actual values: $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$. MAE is less sensitive to outliers than MSE and RMSE because it uses absolute values instead of squares, which makes it more robust on data containing outliers.

[0094] (4) Sum of squared errors (SSE). The SSE is the sum of the squared differences between all predicted and actual values. Its formula is the same as that of the MSE without the division by the sample size n: $\mathrm{SSE} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$. SSE is primarily used in the least squares method of linear regression, which minimizes the squared error between the predicted and actual values. Although SSE and MSE are calculated similarly, they have different meanings: SSE is the accumulated squared error, while MSE is the average squared error.
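The four metrics can be computed together, as in the following minimal sketch. The function name and the example concentrations are hypothetical and serve only to show the arithmetic.

```python
import math


def regression_metrics(actual, predicted):
    """Return SSE, MSE, RMSE and MAE for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    sse = sum(e * e for e in errors)          # sum of squared errors
    mse = sse / n                             # mean squared error
    rmse = math.sqrt(mse)                     # same units as the data
    mae = sum(abs(e) for e in errors) / n     # mean absolute error
    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "MAE": mae}


# Made-up observed and predicted concentrations.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
# Errors are -2, 2, -3, 1, so SSE = 18, MSE = 4.5 and MAE = 2.
```

In this example the single error of magnitude 3 contributes half of the SSE but only three eighths of the summed absolute error, which shows the stronger weight that squared metrics give to large deviations.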

[0095] Empirical studies have shown that the accuracy of blood drug concentration prediction generally meets the requirements. When patients receive treatment, doctors can first refer to the initial dosing regimen provided by the model, and then continuously observe and record the data during subsequent treatment. By back-calculating the model, the prediction accuracy is verified, and the prediction results are optimized by introducing a random perturbation term based on the patient's historical data, thereby forming a continuous personalized medication recommendation plan.

[0096] Based on multi-dimensional data covering patients' genes, physiology, pathology and lifestyle, this invention aims to uncover the key factors that influence drug metabolism and efficacy in the patient's body and to build a precise and efficient personalized medication system. This invention has the following characteristics: 1. Multi-dimensional data are collected on patients' medication use, physiological parameters, lifestyle, genotype and concomitant medications. The collected data are cleaned, and missing values are handled by appropriate methods such as deleting entire records, filling in the mean or mode, or imputation. Outliers are identified and handled using statistical methods and rule-based detection. Duplicate records are also removed to ensure data accuracy and consistency, providing a reliable data foundation for subsequent analysis.

[0097] 2. In the process of iterating individual pharmacokinetic parameters, there are multiple algorithms to choose from. This invention compares various objective optimization algorithms such as particle swarm optimization, genetic algorithm, and differential evolution algorithm, and evaluates the iterative results of each algorithm from the perspectives of performance indicators and efficiency indicators.

[0098] 3. Addressing the Challenges of High-Dimensional Data and Special Patient Issues. With societal development, personalized medicine research faces the challenge of processing high-dimensional data, and traditional models have limitations when dealing with special patient cases. This invention employs the Random Forest algorithm from machine learning to perform a first round of variable selection from a large number of features. Leveraging its ability to handle high-dimensional data and its strong adaptability, it derives a ranking of indicator importance. Based on this, stepwise regression is used for a second round of selection to determine the required variables and derive standard formulas for pharmacokinetic parameters, effectively solving the high-dimensionality problem.

[0099] 4. To address the situation where some "special patients" in real-world scenarios may not have taken the drugs listed in the final covariates, a covariate replacement algorithm is proposed. During the covariate selection process, the importance order of all drugs is recorded. When a special patient is encountered, subsequent covariates are replaced, making the model more personalized and reducing prediction bias.

[0100] 5. Precision Medication Efficacy: This invention, through in-depth research into the individual characteristics of patients, including their genes, physiology, and pathology, combined with pharmacogenomics and population pharmacokinetics theories, can tailor personalized medication plans for each patient. Doctors can accurately calculate the optimal dosage of the drug based on the patient's specific condition, ensuring the drug achieves the best therapeutic effect in the body while minimizing adverse drug reactions. This invention is expected to help more patients achieve better treatment results and recover their health more quickly.

[0101] 6. Safe and Effective Medication: Traditional medication methods often employ a "one-size-fits-all" approach, ignoring individual patient differences and leading to adverse drug reactions in some patients. This invention, by comprehensively considering individual patient characteristics, can accurately predict patient responses to medications, thereby avoiding the use of drugs that may have adverse effects. This helps reduce the incidence of adverse drug reactions and improves patient medication safety.

[0102] 7. Optimizing Medication Process Effectiveness: The personalized medication recommendation model constructed in this invention possesses highly efficient predictive capabilities, quickly and accurately calculating patients' pharmacokinetic parameters and predicted blood drug concentrations. By employing advanced algorithms and models such as particle swarm optimization and random forests, this invention can process large amounts of clinical data in a short time, providing doctors with timely and effective medication recommendations. This model has promising clinical application prospects, providing medical institutions with a convenient medication support tool, helping to improve medical efficiency, reduce doctors' workload, and simultaneously provide patients with higher-quality medical services.

[0103] In summary, the personalized medication recommendation model of this invention has significant expected effects, and can provide patients with more accurate, safe and effective drug treatment plans, thus promoting the development of the field of personalized medicine.

[0104] In one embodiment, as shown in Figure 4, a blood drug concentration prediction model is provided, including: a data acquisition module 601, used to acquire a clinical dataset, where the clinical dataset includes medication information (date of administration, time of administration, dose, blood drug concentration, number of consecutive administrations and dosing interval), physiological parameters (age, sex, height, weight, and liver and kidney function), lifestyle (presence of comorbid diseases, smoking and alcohol consumption), and genotype and concomitant medications (genotypes of enzymes involved in drug metabolism, and the types and frequencies of concomitant medications); a data optimization module 602, used to optimize the clinical dataset; an algorithm selection module 603, used to calculate the performance and efficiency indicators of each objective optimization algorithm and, after filtering according to the preset performance and efficiency thresholds, to obtain a comprehensive score by weighting the performance and efficiency indicators of each algorithm and select the algorithm with the highest score as the best objective optimization algorithm; and a blood drug concentration prediction module 604, used to input the dataset to be tested into the trained blood drug concentration prediction model to obtain the individual predicted blood drug concentration value.

[0105] In one embodiment, as shown in Figure 5, an electronic device is provided, including a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the blood drug concentration prediction method.

[0106] Electronic devices can be desktop computers, laptops, handheld computers, and cloud servers, among other electronic devices. Electronic devices may include, but are not limited to, processors and memory. Those skilled in the art will understand that the figures are merely examples of electronic devices and do not constitute a limitation on the electronic device; it may include more or fewer components than illustrated, or different components.

[0107] The processor can be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.

[0108] Memory can be an internal storage unit of an electronic device, such as a hard drive or RAM. Memory can also be an external storage device of an electronic device, such as a plug-in hard drive, SmartMedia Card (SMC), Secure Digital (SD) card, or Flash Card. Memory can also include both internal and external storage units. Memory is used to store computer programs and other programs and data required by the electronic device.

[0109] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can also be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in a different order than those marked in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram and / or flowchart, and combinations of blocks in block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0110] In addition, the functional modules in the various embodiments of the present invention can be integrated together to form an independent part, or each module can exist independently, or two or more modules can be integrated to form an independent part.

[0111] If the aforementioned functions are implemented as software functional modules and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0112] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the invention should be included within the scope of protection of the invention. It should be noted that similar reference numerals and letters in the following figures denote similar items; therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.

[0113] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

1. A method for predicting blood drug concentration, characterized in that it comprises: acquiring a clinical dataset and dividing it into a test dataset and a training set; iteratively optimizing on the training set using two or more objective optimization algorithms until an optimal result is obtained; calculating the performance and efficiency metrics of each objective optimization algorithm and, after filtering according to preset performance and efficiency thresholds, obtaining a comprehensive score by weighting the performance and efficiency metrics of each objective optimization algorithm, and selecting the algorithm with the highest score as the best objective optimization algorithm; training a preset blood drug concentration prediction model according to the best objective optimization algorithm; and inputting the dataset to be tested into the trained blood drug concentration prediction model and outputting the predicted blood drug concentration value.

2. The prediction method according to claim 1, characterized in that, before the clinical dataset is divided into the test dataset and the training set, the method further comprises: applying the random forest algorithm to the covariates in the clinical dataset to determine the importance ranking of the indicators in the clinical dataset; for the clinical dataset ranked by indicator importance, screening covariates by stepwise regression and selecting a set number of them as the core covariate set; based on the pharmacokinetic mechanism and the core covariate set, determining the route of administration and the pharmacokinetic structural equation, establishing a blood drug concentration-time pharmacokinetic structural equation, and introducing the parameters clearance CL, apparent volume of distribution V and absorption rate constant ka to form an individualized concentration-time prediction model; and, based on the individualized concentration-time prediction model and actual blood drug concentration and time data, determining the individual pharmacokinetic parameters clearance CL, apparent volume of distribution V and absorption rate constant ka.

3. The prediction method according to claim 2, characterized in that determining the pharmacokinetic parameters based on the individualized concentration-time prediction model and the actual blood drug concentration and time data comprises: inputting the actual blood drug concentration and time data into the individualized concentration-time prediction model and estimating the pharmacokinetic parameters through regression equations, the pharmacokinetic parameters including clearance CL, apparent volume of distribution V and absorption rate constant ka; obtaining individualized pharmacokinetic parameters based on the basic parameters and the core covariates; and, combined with the dose and dosing regimen, substituting them into the pharmacokinetic structural equation to obtain the final pharmacokinetic parameters.

4. The prediction method according to claim 2, characterized in that, when there are missing values in the core covariate set, a covariate imputation algorithm is used to fill in the missing values.

5. The prediction method according to claim 2, characterized in that, for the clinical dataset ranked by indicator importance, after covariates are screened by stepwise regression, the method further comprises: forming a covariate set from all the covariates obtained; when the concomitant medications in the dataset to be tested are included in the core covariate set, training the blood drug concentration model based on the dataset to be tested and outputting the predicted blood drug concentration; and, when the concomitant medications in the dataset to be tested are not included in the core covariate set, selecting a matching concomitant medication from the covariate set and adding it to the core covariate set to form a personalized core covariate set.

6. The prediction method according to claim 2, characterized in that, in the trained pharmacokinetic model, the individual's core covariates are substituted into the pharmacokinetic structural equation to obtain prior parameters; a Bayesian update is then performed using the individual's subsequent concentration monitoring data to obtain individual posterior parameters; and the predicted blood drug concentration is determined from the individual posterior parameters and the dosing regimen and is output.
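
The claim does not specify how the Bayesian update is carried out. The sketch below shows one deliberately simple case with a closed-form answer, under stated assumptions: only clearance is updated, the prior on ln(CL) is normal with standard deviation `omega`, the monitoring data are steady-state average concentrations with log-normal error `sigma`, and the average steady-state concentration equals F x dose / (CL x tau). All names and the error model are assumptions made for illustration.

```python
import math


def update_clearance(cl_prior, omega, css_observed, dose_mg, tau_h, sigma, f=1.0):
    """Conjugate normal update of ln(CL) from steady-state concentrations.

    Returns (posterior median of CL, posterior SD of ln CL).
    Each observation implies a clearance value CL = f * dose / (tau * Css).
    """
    implied_log_cl = [math.log(f * dose_mg / (tau_h * c)) for c in css_observed]
    prior_precision = 1.0 / omega ** 2
    data_precision = len(implied_log_cl) / sigma ** 2
    post_precision = prior_precision + data_precision
    post_mean = (
        prior_precision * math.log(cl_prior) + sum(implied_log_cl) / sigma ** 2
    ) / post_precision
    return math.exp(post_mean), math.sqrt(1.0 / post_precision)


def predict_css(cl, dose_mg, tau_h, f=1.0):
    """Average steady-state concentration (mg/L) for a given regimen."""
    return f * dose_mg / (cl * tau_h)
```

The posterior clearance always lies between the prior value and the value implied by the data, and it moves closer to the data as more monitoring samples arrive. A full implementation would typically update CL, V, and ka jointly by numerical optimization instead of this closed form.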

7. The prediction method according to claim 2, characterized in that the pharmacokinetic parameter clearance CL is modeled by exponential regression according to the following formula: $CL = a \cdot e^{b_1 X_1} \cdot X_2^{b_2}$, where $a$ is the baseline value; $e$ is the base of the natural logarithm; $b_1$ is the log-linear coefficient of covariate $X_1$, a categorical variable that is one or more of gender, smoking status, alcohol consumption, and genotype; $X_2$ is a numerical variable that is one or more of age, height, and weight; and $b_2$ is the power exponent of covariate $X_2$.
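
A minimal sketch of such a covariate model, assuming the common form CL = a · exp(b1 · x_cat) · x_num^b2 with one factor per covariate, is given below. The function name, the 0/1 coding of categorical covariates, and any scaling of numerical covariates (for example weight divided by a reference weight) are assumptions, not details taken from the claim.

```python
import math


def individual_clearance(baseline, categorical_effects, numerical_effects):
    """Individual clearance from a baseline value and covariate effects.

    categorical_effects: iterable of (b1, x_cat) pairs, where x_cat is an
        assumed 0/1 indicator (e.g. smoker = 1) and b1 its log-linear
        coefficient.
    numerical_effects: iterable of (b2, x_num) pairs, where x_num is a
        positive numerical covariate (assumed already scaled) and b2 its
        power exponent.
    """
    cl = baseline
    for b1, x_cat in categorical_effects:
        cl *= math.exp(b1 * x_cat)
    for b2, x_num in numerical_effects:
        cl *= x_num ** b2
    return cl
```

With every indicator at 0 and every scaled numerical covariate at 1, the function returns the baseline value, which is what makes `a` interpretable as the clearance of a reference individual.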

8. The prediction method according to claim 1, characterized in that the clinical dataset includes: medication details, including date of administration, time of administration, dosage, blood drug concentration, number of consecutive administrations, and dosing interval; physiological parameters, including age, sex, height, weight, and liver and kidney function; lifestyle information, including the presence of underlying diseases, smoking status, and alcohol consumption; and genotype and concomitant medications, including the genotypes of enzymes involved in drug metabolism and the types and frequencies of concomitant medications.
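
For illustration, one record of such a clinical dataset could be represented as below. This is only a sketch of a possible layout: every field name, type, and unit is an assumption, and optional fields default to `None` so that missing covariates remain visible to a later imputation step.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass
class ClinicalRecord:
    # Medication details
    dose_time: datetime                      # date and time of administration
    dose_mg: float
    concentration_ng_ml: Optional[float]     # measured blood drug concentration
    n_consecutive_doses: int
    dosing_interval_h: float
    # Physiological parameters
    age_years: Optional[float] = None
    sex: Optional[str] = None
    height_cm: Optional[float] = None
    weight_kg: Optional[float] = None
    liver_function: Dict[str, float] = field(default_factory=dict)
    kidney_function: Dict[str, float] = field(default_factory=dict)
    # Lifestyle
    has_underlying_disease: Optional[bool] = None
    smoker: Optional[bool] = None
    drinks_alcohol: Optional[bool] = None
    # Genotype and concomitant medications
    enzyme_genotypes: Dict[str, str] = field(default_factory=dict)
    comedication_frequency: Dict[str, str] = field(default_factory=dict)
```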

9. A blood drug concentration prediction model, characterized by comprising: a data acquisition module, used to collect clinical datasets; a data optimization module, used to optimize the clinical datasets and execute the prediction method of any one of claims 2-7; an algorithm selection module, used to calculate the performance and efficiency metrics of each objective optimization algorithm, filter the algorithms against preset performance and efficiency thresholds, obtain a comprehensive score for each remaining algorithm by weighting its performance and efficiency metrics, and select the algorithm with the highest score as the best objective optimization algorithm; and a blood drug concentration prediction module, used to input the dataset to be tested into the trained blood drug concentration prediction model to obtain the individual's predicted blood drug concentration.
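
The selection logic of the algorithm selection module can be sketched as a threshold filter followed by a weighted score. The sketch assumes that both metrics are already normalized so that higher is better; the weights, thresholds, and metric values in the usage note are hypothetical.

```python
def select_best_algorithm(metrics, min_performance, min_efficiency,
                          w_performance=0.7, w_efficiency=0.3):
    """Pick the optimization algorithm with the highest weighted score.

    metrics: dict mapping algorithm name to a dict with the keys
        "performance" and "efficiency" (assumed normalized, higher is better).
    Returns (best_name, scores), where scores covers only the algorithms
    that passed both thresholds.
    """
    scores = {}
    for name, m in metrics.items():
        if m["performance"] < min_performance or m["efficiency"] < min_efficiency:
            continue  # filtered out by the preset thresholds
        scores[name] = (
            w_performance * m["performance"] + w_efficiency * m["efficiency"]
        )
    if not scores:
        raise ValueError("no algorithm meets the preset thresholds")
    best_name = max(scores, key=scores.get)
    return best_name, scores
```

A call might pass metrics for differential evolution, a genetic algorithm, and particle swarm optimization; an algorithm that fails either threshold is excluded before scoring, even if its weighted score would otherwise be the highest.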

10. An electronic device comprising a memory and a processor, the memory storing computer-readable instructions, characterized in that, when the computer-readable instructions are executed by the processor, the processor executes the blood drug concentration prediction method according to any one of claims 1-8.