Method for constructing a model for regulating analysis of sinusitis based on data of inhibition of nasal polyp cells
By constructing an LK analysis prediction model based on nasal polyp cell inhibition data, the problem of the inability to accurately predict the early progression of CRSwNP in existing technologies has been solved, enabling reliable prediction of CRSwNP patients' conditions and improving the accuracy of clinical diagnosis and treatment.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- TIANJIN MEDICAL UNIVERSITY GENERAL HOSPITAL
- Filing Date
- 2026-03-12
- Publication Date
- 2026-06-26
AI Technical Summary
Existing CRSwNP disease analysis technology cannot provide objective and reliable predictions in the early stages of disease. It relies on the assessment of existing clinical phenotypes and lacks standardized judgment logic, leading to inconsistencies in doctors' experience-based judgments and making it impossible to accurately predict disease progression.
By collecting Lund-Kennedy scores and nasal polyp cell inhibition data from CRSwNP patients, an LK analysis prediction model was constructed, including sample screening, index preprocessing, and lag order acquisition, and an ARIMAX model was established for prediction.
It provides clinicians with an objective and reliable tool for predicting disease progression, improving the accuracy and reliability of predictions, reducing model complexity, and enhancing interpretability.
Figure CN122290949A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of CRSwNP disease analysis technology, specifically to a method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data.
Background Technology
[0002] CRSwNP disease analysis technology is a multi-dimensional, full-process clinical analysis technology system based on the pathophysiological mechanism of chronic sinusitis with nasal polyps. It integrates quantitative assessment of clinical phenotype, analysis of imaging features, detection of molecular and cellular markers, longitudinal time-series data modeling, and prognostic risk stratification prediction. The core objectives of this technology are to accurately quantify the severity of the disease, dynamically monitor disease progression, analyze the inflammatory driving mechanism, and predict disease risk, providing technical support for the individualized diagnosis and treatment of CRSwNP patients.
[0003] When analyzing and predicting the Lund-Kennedy scores of CRSwNP patients, existing CRSwNP disease analysis techniques often rely on complex scoring systems to assess disease severity, such as the Lund-Kennedy nasal endoscopy scoring system and the Lund-Mackay sinus CT scoring system. However, these scoring systems depend on organic lesions that are already present, such as nasal polyp morphology, mucosal edema, and sinus opacification on imaging. In the progression of CRSwNP, tissue morphology is often the last abnormality to appear, while the corresponding markers change much earlier than the scores. By the time the score rises, the patient's inflammation has already progressed to the stage of polyp regrowth and extensive mucosal edema, so the scores cannot support early warning or early intervention. Furthermore, the prediction of disease progression during treatment typically relies on the physician's clinical experience and lacks standardized judgment logic and thresholds. Physicians' experience depends heavily on existing clinical phenotypes, such as polyp recurrence or symptom relief, so progression can only be recognized after it has already occurred rather than predicted before symptoms appear. Moreover, physicians with different levels of experience are not sufficiently consistent in their judgments of the disease development of the same patient. Therefore, when analyzing and predicting Lund-Kennedy scores in CRSwNP patients, existing CRSwNP disease analysis technologies cannot provide clinicians with objective and reliable analytical tools to help them reliably predict the disease progression of CRSwNP patients during treatment.
Summary of the Invention
[0004] This invention aims to at least partially address one of the technical problems in the prior art. It obtains raw sample data of CRSwNP patients by collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data; then, it performs screening to remove unqualified raw sample data from CRSwNP patients, obtaining standard sample data; further, it performs index preprocessing and obtains the optimal lag order to obtain reference sample data; and finally, it constructs an LK analysis prediction model to analyze and predict the Lund-Kennedy scores of CRSwNP patients. This addresses the problem that existing CRSwNP disease analysis techniques, when analyzing and predicting Lund-Kennedy scores of CRSwNP patients, cannot provide clinicians with objective and reliable analytical and predictive tools to help them reliably predict the disease progression of CRSwNP patients during treatment.
[0005] To achieve the above objectives, this application provides a method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data, comprising the following steps:
[0006] Lund-Kennedy scores and corresponding nasal polyp cell inhibition data were collected from CRSwNP patients to obtain raw sample data of CRSwNP patients;
[0007] The original sample data was screened to remove the original sample data of unqualified CRSwNP patients, and standard sample data was obtained.
[0008] The index is preprocessed based on standard sample data, and the optimal lag order is obtained to get reference sample data.
[0009] Based on reference sample data, an LK analysis prediction model was constructed, and the Lund-Kennedy score of CRSwNP patients was analyzed and predicted.
[0010] Further, collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data from CRSwNP patients to obtain raw sample data of CRSwNP patients includes the following sub-steps:
[0011] Patients with chronic sinusitis and nasal polyps are designated as CRSwNP patients; any CRSwNP patient is designated as the first patient.
[0012] During the treatment of the first patient, nasal polyp tissue and peripheral blood were collected from the first patient at the first time interval. Based on the collected peripheral blood, the relative expression levels of miR-143-3p, TET1 mRNA, IFN-γ protein, and the proportion of IFN-γ+CD4+Th1 cells were obtained. Based on the nasal polyp tissue, the CD4+ T cell infiltration density of the nasal polyp tissue was obtained. The number of collections was recorded to obtain the nasal polyp cell inhibition data of the first patient. The first time interval was T1.
[0013] Further, collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data from CRSwNP patients to obtain raw sample data for CRSwNP patients also includes the following sub-steps:
[0014] The Lund-Kennedy score of CRSwNP patients is recorded as the LK score; the LK score of the first patient is obtained simultaneously each time nasal polyp cell inhibition data is collected, and the disease score data of the first patient is obtained.
[0015] The nasal polyp cell inhibition data and disease score data of the first patient were combined and recorded as the original sample data of the first patient. The original sample data of multiple first patients were obtained repeatedly.
[0016] Further, the original sample data is screened to remove ineligible CRSwNP patient data, resulting in standard sample data. This process includes the following sub-steps:
[0017] The relative expression levels of miR-143-3p, TET1 mRNA, and IFN-γ protein, the proportion of IFN-γ+CD4+Th1 cells, and the CD4+ T cell infiltration density in nasal polyp tissue are uniformly recorded as analytical indicators.
[0018] For the first patient, if any analytical indicator or LK score is missing in the original sample data of the first patient, the first patient is marked as an ineligible patient; otherwise, the patient is marked as a basic patient. The original sample data of all basic patients are obtained repeatedly.
[0019] Further, the process of filtering the original sample data to remove ineligible CRSwNP patient data and obtaining standard sample data includes the following sub-steps:
[0020] If the first patient is the basic patient, any one of the analytical indicators is designated as the first indicator, and the collected first indicators are arranged in the order of collection and recorded as the first indicator sequence.
[0021] Calculate the coefficient of variation of the first indicator sequence, denoted as the first coefficient of the first patient; denote any data point in the first indicator sequence as X_i, where i is the position number, and calculate (X_{i+1} - X_i) / X_i, denoted as the adjacent rate of change corresponding to X_i;
[0022] Based on the first indicator sequence, calculate all corresponding adjacent rates of change and take their average, which is recorded as the second coefficient of the first patient.
[0023] Repeatedly calculate the first and second coefficients of the first indicator for all basic patients, and normalize each of them separately. Then add the normalized first and second coefficients of the first patient to obtain the indicator coefficient of the first indicator for the first patient. Repeatedly obtain the indicator coefficients of all analytical indicators of the first patient, calculate their mean, and normalize it. Record the result as the fluctuation score of the first patient.
[0024] Further, the process of filtering the original sample data to remove ineligible CRSwNP patient data and obtaining standard sample data includes the following sub-steps:
[0025] The relative expression levels of miR-143-3p and TET1 mRNA are designated as pair 1; the relative expression levels of TET1 mRNA and IFN-γ protein are designated as pair 2; and the relative expression levels of IFN-γ protein and the proportion of IFN-γ+CD4+Th1 cells are designated as pair 3. In pair 1, the expected changes of the two analytical indicators are opposite, while in pair 2 and pair 3, the expected changes of the two analytical indicators are the same.
[0026] For pair 1, the two analytical indicators are arranged in the order of collection. Over all pairs of adjacent collections, the number of times the change directions of the two analytical indicators match the expected relationship is counted, and its proportion of the total number of adjacent collection pairs is denoted as the synergy compliance rate of pair 1.
[0027] Repeatedly obtain the synergy compliance rate corresponding to pair 2 and pair 3, then calculate the mean of all synergy compliance rates, denoted as BR, and record 1-BR as the synergy score of the first patient.
[0028] Further, the process of filtering the original sample data to remove ineligible CRSwNP patient data and obtaining standard sample data includes the following sub-steps:
[0029] Calculate the mean of the fluctuation score and the synergy score of the first patient, and record it as the comprehensive abnormal score of the first patient; repeat the process to obtain the comprehensive abnormal scores of all basic patients, and calculate the mean AP and standard deviation AB of all comprehensive abnormal scores;
[0030] If the first patient's comprehensive abnormal score is greater than AP + 3 × AB, the patient is marked as an unqualified patient; otherwise, the patient is marked as a qualified patient. All qualified patients are obtained repeatedly, and the original sample data of all qualified patients are recorded as the standard sample data.
[0031] Furthermore, the process of preprocessing the indicators based on the standard sample data and obtaining the optimal lag order to get the reference sample data also includes the following sub-steps:
[0032] If the first patient is a qualified patient, then for the first indicator sequence of the first indicator, calculate X_i / X_1, denoted as the relative change value of X_i, and replace X_i with X_i / X_1, where X_1 is the first data point in the first indicator sequence; repeat the replacement for all data in the first indicator sequence to obtain the relative change sequence of the first indicator;
[0033] Repeatedly acquire the relative change sequences of all analytical indicators and LK scores. Based on all the relative change sequences, the relative change values of miR-143-3p relative expression level, TET1 mRNA relative expression level, IFN-γ protein relative expression level, IFN-γ+CD4+Th1 cell proportion and CD4+T cell infiltration density of nasal polyp tissue collected at any one time are recorded as AE1, AE2, AE3, AE4 and AE5 in sequence.
[0034] Calculate (AE2+AE3+AE4+AE5) / AE1, and denote it as the corresponding inflammatory factor coefficient. Based on all the relative change sequences, repeatedly obtain the inflammatory factor coefficient corresponding to each acquisition, and arrange them in the acquisition order, and denote it as the first factor sequence.
[0035] Furthermore, the process of preprocessing the indicators based on the standard sample data and obtaining the optimal lag order to get the reference sample data also includes the following sub-steps:
[0036] The data in the first factor sequence are sequentially denoted as AY_1 to AY_n, where n is the total number of data points in the first factor sequence; the relative change sequence of the LK score of the first patient is denoted as the first score sequence, and its data are sequentially denoted as AF_1 to AF_n;
[0037] Set the candidate lag orders to 1, 2, and 3; for candidate lag order 1, calculate the Pearson correlation coefficient between AY_1 to AY_{n-1} of the first factor sequence and AF_2 to AF_n of the first score sequence, denoted as the evaluation coefficient of candidate lag order 1;
[0038] Repeatedly calculate the evaluation coefficients of all candidate lag orders, and record the candidate lag order with the largest evaluation coefficient as the possible lag order; repeatedly obtain the possible lag orders of all qualified patients, and record the possible lag order with the largest proportion as the optimal lag order k1;
[0039] Denote AY_1 to AY_{n-k1} as the second factor sequence and AF_{1+k1} to AF_n as the second score sequence; the second factor sequence and the second score sequence of the first patient are recorded as patient sample data, and the patient sample data of all qualified patients are repeatedly obtained and recorded as reference sample data.
[0040] Furthermore, based on the reference sample data, an LK analysis prediction model was constructed, and the Lund-Kennedy score of CRSwNP patients was analyzed and predicted, including the following sub-steps:
[0041] The ARIMAX model is designated as the initial model. The input of the initial model is a partial second factor sequence and the corresponding partial second score sequence, and the output is the LK score for the next k2 collections. The initial model is trained using reference sample data, and the LK analysis prediction model is obtained after the training is completed, where k2 is the set number of collections.
[0042] For CRSwNP patients to be analyzed, corresponding nasal polyp cell inhibition data and disease score data were collected, and the corresponding second factor sequence and second score sequence were obtained by index preprocessing. These were then input into the LK analysis prediction model to obtain the LK score of the CRSwNP patient for the next k2 times.
[0043] The beneficial effects of this invention are as follows: This invention obtains raw sample data of CRSwNP patients by collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data; the raw sample data is screened to remove unqualified raw sample data of CRSwNP patients, resulting in standard sample data; based on the standard sample data, index preprocessing is performed, and the optimal lag order is obtained to obtain reference sample data; an LK analysis and prediction model is constructed based on the reference sample data, and the Lund-Kennedy scores of CRSwNP patients are analyzed and predicted; when analyzing and predicting the Lund-Kennedy scores of CRSwNP patients, it can provide clinicians with an objective and reliable analysis and prediction tool, helping clinicians to reliably predict the disease progression of CRSwNP patients during treatment;
[0044] By calculating the coefficient of variation and the adjacent rates of change and combining them into a fluctuation score, this invention turns simple differences in mean values into an indicator of instability. By computing the synergy compliance rate over indicator pairs, it can detect whether different indicators change together in the expected direction during follow-up. It can eliminate samples with high measurement noise as well as individuals with biological abnormalities or special conditions, thereby ensuring the internal consistency of the sample. After each indicator is converted into a sequence of values relative to its first measurement, the scaling effect caused by different baseline expression levels among patients is eliminated, improving the reliability and accuracy of subsequent modeling. By merging multiple inflammation-related indicators into a single factor sequence, the information is simplified into an easily interpretable inflammation metric, which reduces input dimensions and model complexity while retaining key biological signals and enhancing model interpretability.
Attached Figure Description
[0045] Figure 1 is a flowchart of the steps of the method of the present invention;
[0046] Figure 2 is a flowchart of the process for obtaining the comprehensive abnormal score in the present invention;
[0047] Figure 3 is a flowchart of the process for obtaining the optimal lag order in the present invention;
[0048] Figure 4 is a schematic diagram of the electronic device of the present invention.
Detailed Implementation
[0049] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0050] Example 1: As shown in Figure 1, this application provides a method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data, including the following steps:
[0051] Step S1 involves collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data from CRSwNP patients to obtain the raw sample data of CRSwNP patients. Step S1 includes the following sub-steps:
[0052] Step S101: Record patients with chronic sinusitis and nasal polyps as CRSwNP patients; record any CRSwNP patient as the first patient;
[0053] Step S102: During the treatment of the first patient, nasal polyp tissue and peripheral blood of the first patient are collected at a first time interval. Based on the collected peripheral blood, the relative expression levels of miR-143-3p, TET1 mRNA, and IFN-γ protein and the proportion of IFN-γ+CD4+Th1 cells are obtained. Based on the nasal polyp tissue, the CD4+ T cell infiltration density of the nasal polyp tissue is obtained. The number of collections is recorded to obtain the nasal polyp cell inhibition data of the first patient. The first time interval is T1. In this embodiment, T1 = 30 days, that is, data is collected once every 30 days. T1 can be set flexibly and generally lies in the range [20 days, 60 days].
[0054] miR-143-3p is a microRNA. In CRSwNP patients, miR-143-3p expression is significantly downregulated compared to normal levels, and its expression level is significantly negatively correlated with the LK score. TET1 is a direct target gene of miR-143-3p and a core intermediate signaling node in the pathway. TET1 mRNA and protein expression are significantly upregulated in peripheral blood and nasal polyp tissues of CRSwNP patients, showing a significant negative correlation with miR-143-3p expression and a significant positive correlation with the LK score. The density of CD4+ T cell infiltration is significantly increased in nasal polyp tissues of CRSwNP patients, and this is significantly positively correlated with the LK score.
[0055] IFN-γ is a characteristic core effector of Th1 cells and the final downstream effector molecule of the pathway. It can directly damage the nasal mucosal epithelial barrier, induce mucosal edema, promote inflammatory cell infiltration, and drive nasal polyp tissue remodeling. It is a core effector marker of inflammation in CRSwNP. The serum IFN-γ protein level in CRSwNP patients is significantly elevated and is significantly positively correlated with LK score.
[0056] IFN-γ+CD4+Th1 cells are the core effector cells of CRSwNP inflammation. After activation, they secrete large amounts of IFN-γ, driving local chronic inflammation of the sinuses and the formation of nasal polyps. The proportion of IFN-γ+CD4+Th1 cells in the peripheral blood of CRSwNP patients is significantly increased and is significantly positively correlated with LK score.
[0057] Step S103: Record the Lund-Kennedy score of the CRSwNP patient as the LK score; when collecting nasal polyp cell inhibition data for the first patient each time, simultaneously obtain the LK score of the first patient to obtain the first patient's disease score data.
[0058] Step S104: Combine the nasal polyp cell inhibition data and disease score data of the first patient and record them as the original sample data of the first patient. Repeat the acquisition of the original sample data of multiple first patients.
[0059] In practice, the Lund-Kennedy score is a widely used endoscopic scoring system in otolaryngology. It is mainly used to objectively assess the health status of the nasal cavity and sinuses, reflecting clinical characteristics such as mucosal edema, mucus, and polyp size, and to quantify the severity of CRSwNP.
[0060] Step S2 involves screening the original sample data, removing unqualified CRSwNP patient data, and obtaining standard sample data. Step S2 includes the following sub-steps:
[0061] Step S201: The relative expression levels of miR-143-3p, TET1 mRNA, and IFN-γ protein, the proportion of IFN-γ+CD4+Th1 cells, and the CD4+ T cell infiltration density in nasal polyp tissue are uniformly recorded as analytical indicators.
[0062] Step S202: For the first patient, if any analytical indicator or LK score is missing in the original sample data of the first patient, the first patient is marked as an unqualified patient; otherwise, the patient is marked as a basic patient. The original sample data of all basic patients are obtained repeatedly. Complete data is required for subsequent calculations. If there are missing data in the middle, imputation will be required. However, imputation will artificially smooth fluctuations and change data characteristics, which will affect subsequent judgment and modeling.
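The completeness screen of step S202 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the field names in `FIELDS`, the per-visit dictionary layout, and the patient identifiers are all hypothetical, and a missing measurement is assumed to be stored as `None`.

```python
from typing import Dict, List, Optional

# Hypothetical keys for the five analytical indicators plus the LK score.
FIELDS = ["mir143", "tet1", "ifng", "th1", "cd4_density", "lk"]

Visit = Dict[str, Optional[float]]


def is_basic_patient(visits: List[Visit]) -> bool:
    """True only if the patient has visits and no field is missing in any of them."""
    return bool(visits) and all(
        visit.get(field) is not None for visit in visits for field in FIELDS
    )


def select_basic_patients(raw: Dict[str, List[Visit]]) -> Dict[str, List[Visit]]:
    """Keep patients with complete records; incomplete ones are dropped, not imputed."""
    return {pid: visits for pid, visits in raw.items() if is_basic_patient(visits)}


complete = {field: 1.0 for field in FIELDS}
raw = {
    "P01": [dict(complete), dict(complete)],
    "P02": [dict(complete), {**complete, "tet1": None}],  # one missing value
}
basic = select_basic_patients(raw)
```

Here `P02` is excluded as a whole because a single TET1 value is missing, which mirrors the text's preference for exclusion over imputation.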
[0063] Step S203: If the first patient is the basic patient, any one of the analysis indicators is designated as the first indicator, and the collected first indicators are arranged in the order of collection and recorded as the first indicator sequence.
[0064] Step S204: As shown in Figure 2, calculate the coefficient of variation of the first indicator sequence, denoted as the first coefficient of the first patient; denote any data point in the first indicator sequence as X_i, where i is the position number, and calculate (X_{i+1} - X_i) / X_i, denoted as the adjacent rate of change corresponding to X_i; the first coefficient measures the overall dispersion of the indicator;
[0065] Step S205: Based on the first indicator sequence, calculate all corresponding adjacent rates of change and take their average, which is recorded as the second coefficient of the first patient; the second coefficient measures the degree of short-term fluctuation of the indicator. If a patient's indicator shows low overall dispersion but repeated large short-term oscillations, the sample should still be regarded as strongly volatile and may be marked as abnormal. Under normal, standardized treatment, the LK score and the 5 analytical indicators change in a regular and relatively smooth manner, without irregular, violent fluctuations.
[0066] Step S206: Repeatedly calculate the first coefficient and second coefficient of the first indicator for all basic patients, and then perform normalization processing on each. Then add the normalized first coefficient and second coefficient of the first patient to obtain the indicator coefficient of the first indicator of the first patient. Repeatedly obtain the indicator coefficients of all analytical indicators of the first patient, calculate the mean, and perform normalization processing, which is recorded as the fluctuation score of the first patient. The higher the fluctuation score, the more abnormal the patient's data is. Normalization processing is used to transform the scale of different indicators into a uniform range, i.e. [0, 1].
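Steps S204 to S206 can be sketched as below. Several details are assumptions because the text does not fix them: the coefficient of variation uses the population standard deviation, normalization is min-max scaling across patients, and the adjacent rates of change are averaged with their signs, as written (an absolute-value variant would behave differently). Indicator values are assumed to be positive, and the patient and indicator names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def coefficient_of_variation(seq: List[float]) -> float:
    """First coefficient: standard deviation divided by mean."""
    return pstdev(seq) / mean(seq)


def mean_adjacent_rate(seq: List[float]) -> float:
    """Second coefficient: average of (X[i+1] - X[i]) / X[i]."""
    return mean((seq[i + 1] - seq[i]) / seq[i] for i in range(len(seq) - 1))


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fluctuation_scores(data: Dict[str, Dict[str, List[float]]]) -> Dict[str, float]:
    """data[patient][indicator] is that indicator's sequence in collection order."""
    patients = list(data)
    indicators = list(next(iter(data.values())))
    totals = {p: 0.0 for p in patients}
    for ind in indicators:
        first = min_max([coefficient_of_variation(data[p][ind]) for p in patients])
        second = min_max([mean_adjacent_rate(data[p][ind]) for p in patients])
        for p, a, b in zip(patients, first, second):
            totals[p] += a + b  # indicator coefficient of this indicator
    means = [totals[p] / len(indicators) for p in patients]
    return dict(zip(patients, min_max(means)))


data = {
    "steady": {"tet1": [1.0, 1.0, 1.0, 1.0], "ifng": [2.0, 2.0, 2.0, 2.0]},
    "drifting": {"tet1": [1.0, 1.1, 1.2, 1.3], "ifng": [2.0, 2.1, 2.2, 2.3]},
    "erratic": {"tet1": [1.0, 3.0, 1.0, 3.0], "ifng": [2.0, 6.0, 2.0, 6.0]},
}
scores = fluctuation_scores(data)
```

In this toy cohort the constant patient receives the lowest score, the oscillating patient the highest, and the smoothly drifting patient falls in between.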
[0067] In step S207, the relative expression levels of miR-143-3p and TET1 mRNA are designated as pair 1; the relative expression levels of TET1 mRNA and IFN-γ protein are designated as pair 2; and the relative expression levels of IFN-γ protein and the proportion of IFN-γ+CD4+Th1 cells are designated as pair 3. In pair 1, the expected changes of the two analytical indicators are opposite; that is, when one analytical indicator increases, the other should decrease, and vice versa. In pair 2 and pair 3, the expected changes of the two analytical indicators are the same; that is, when one analytical indicator increases, the other should also increase, and vice versa.
[0068] Step S208: For pair 1, arrange the two analytical indicators in the order of collection. Over all pairs of adjacent collections, count the number of times the change directions of the two analytical indicators match the expected relationship, and record the proportion of this count to the total number of adjacent collection pairs as the synergy compliance rate of pair 1; a high synergy compliance rate indicates that the patient's data follows known biological laws over time, reducing the probability of measurement errors or interference from non-target pathological processes;
[0069] Step S209: Repeatedly obtain the synergy compliance rate corresponding to pair 2 and pair 3, and then calculate the mean of all synergy compliance rates, denoted as BR. Record 1-BR as the synergy score of the first patient. The higher the synergy score, the more serious the deviation of the data from known biological laws, and the more abnormal the data.
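The pairing logic of steps S207 to S209 can be sketched as follows. The dictionary keys are hypothetical, and one choice is an assumption the text leaves open: a collection in which either indicator does not change at all is counted as a mismatch.

```python
from typing import Dict, List, Tuple

# (indicator_a, indicator_b, expected): -1 = opposite directions, +1 = same direction.
PAIRS: List[Tuple[str, str, int]] = [
    ("mir143", "tet1", -1),  # pair 1
    ("tet1", "ifng", +1),    # pair 2
    ("ifng", "th1", +1),     # pair 3
]


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def compliance_rate(a: List[float], b: List[float], expected: int) -> float:
    """Share of adjacent collections whose change directions match `expected`."""
    steps = len(a) - 1
    hits = 0
    for i in range(steps):
        da = sign(a[i + 1] - a[i])
        db = sign(b[i + 1] - b[i])
        # Assumption: an unchanged value (sign 0) never counts as a match.
        if da != 0 and db != 0 and da * db == expected:
            hits += 1
    return hits / steps


def synergy_score(patient: Dict[str, List[float]]) -> float:
    """1 - BR, where BR is the mean compliance rate over the three pairs."""
    rates = [compliance_rate(patient[a], patient[b], e) for a, b, e in PAIRS]
    return 1.0 - sum(rates) / len(rates)


patient = {
    "mir143": [1.0, 1.2, 1.4, 1.6],  # rising
    "tet1": [3.0, 2.5, 2.0, 1.5],    # falling, opposite to miR-143-3p
    "ifng": [5.0, 4.0, 3.0, 2.0],    # falling with TET1
    "th1": [9.0, 8.0, 8.5, 7.0],     # one step moves against IFN-gamma
}
score = synergy_score(patient)
```

Pairs 1 and 2 comply in all three adjacent collections and pair 3 in two of three, so BR = (1 + 1 + 2/3) / 3 and the synergy score is 1/9.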
[0070] Step S210: Calculate the mean of the fluctuation score and the synergy score of the first patient, and record it as the comprehensive abnormal score of the first patient; repeat the process to obtain the comprehensive abnormal scores of all basic patients, and calculate the mean AP and standard deviation AB of all comprehensive abnormal scores;
[0071] Step S211: If the comprehensive abnormal score of the first patient is greater than AP + 3 × AB, mark the patient as an unqualified patient; otherwise, mark the patient as a qualified patient; repeat the process to obtain all qualified patients, and record the original sample data of all qualified patients as the standard sample data.
[0072] For example, if the mean AP of all comprehensive abnormal scores is 0.22 and the standard deviation AB is 0.08, then AP + 3 × AB = 0.46. Suppose the indicator coefficients of the five analytical indicators for a certain patient are 0.20, 0.25, 0.23, 0.27, and 0.23, respectively, and that after calculating the mean and normalizing, the fluctuation score is 0.149. The three synergy compliance rates for this patient are 0.8, 1.0, and 0.8, with a mean BR of 0.867, so the synergy score is 1 - 0.867 = 0.133. The comprehensive abnormal score for this patient is therefore (0.149 + 0.133) / 2 = 0.141. Since 0.141 < 0.46, this patient is a qualified patient.
[0073] In a specific implementation, the comprehensive abnormal score can also be calculated as a weighted average, with weights for the fluctuation score and the synergy score set according to actual needs.
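The scoring and threshold of steps S210 and S211, including the optional weighting, can be sketched as below. The weight parameter `w_fluct`, the use of the population standard deviation for AB, and the toy cohort are assumptions made for illustration only.

```python
from statistics import mean, pstdev
from typing import Dict, List


def comprehensive_score(fluct: float, synergy: float, w_fluct: float = 0.5) -> float:
    """Weighted average of the two scores; w_fluct = 0.5 gives the plain mean."""
    return w_fluct * fluct + (1.0 - w_fluct) * synergy


def qualified_patients(scores: Dict[str, float], k: float = 3.0) -> List[str]:
    """Keep patients whose score does not exceed AP + k * AB."""
    values = list(scores.values())
    ap = mean(values)    # AP: mean of all comprehensive abnormal scores
    ab = pstdev(values)  # AB: standard deviation (population form assumed)
    limit = ap + k * ab
    return [pid for pid, s in scores.items() if s <= limit]


# The worked example from the text: 0.149 and 0.133 average to 0.141.
example = comprehensive_score(0.149, 0.133)

# Hypothetical cohort: 19 ordinary patients and one clear outlier.
cohort = {f"P{i:02d}": 0.2 for i in range(1, 20)}
cohort["P20"] = 0.9
kept = qualified_patients(cohort)
```

One consequence of a three-sigma rule worth keeping in mind: with a population standard deviation, no single patient can exceed AP + 3 × AB unless the cohort has more than ten members, so the screen is only meaningful for reasonably large cohorts.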
[0074] Step S3 involves preprocessing the indicators based on the standard sample data and obtaining the optimal lag order to obtain the reference sample data. Step S3 includes the following sub-steps:
[0075] Step S301: As shown in Figure 3, if the first patient is a qualified patient, then for the first indicator sequence of the first indicator, calculate X_i / X_1, denoted as the relative change value of X_i, and replace X_i with X_i / X_1, where X_1 is the first data point of the first indicator sequence; repeat the replacement for all data in the first indicator sequence to obtain the relative change sequence of the first indicator. The subsequent model requires a stationary input time series, but the LK scores and the absolute values of the core indicators vary greatly among patients. Using the original values directly would not only yield a non-stationary sequence but would also mask each individual's own treatment trend. Replacing the original values with relative change values eliminates individual baseline differences and makes the time series of different patients comparable in scale.
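The baseline normalization of step S301 amounts to dividing every value by the first one. A minimal sketch follows; the guard against a zero baseline is an added assumption, since the text does not say how a first value of zero (possible for an LK score) should be handled.

```python
from typing import List


def relative_change(seq: List[float]) -> List[float]:
    """Replace each X_i with X_i / X_1, where X_1 is the first value."""
    baseline = seq[0]
    if baseline == 0:
        # Assumption: a zero baseline is treated as an error rather than guessed at.
        raise ValueError("first value is zero; relative change is undefined")
    return [x / baseline for x in seq]


# Two hypothetical patients with different baselines but the same trend.
patient_a = relative_change([4.0, 3.0, 2.0])
patient_b = relative_change([8.0, 6.0, 4.0])
```

Both patients map to the same sequence `[1.0, 0.75, 0.5]`, which is the comparability across baselines that the step is meant to provide.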
[0076] Step S302: Repeatedly obtain the relative change sequences of all analytical indicators and of the LK score. Based on all the relative change sequences, for any one collection, record the relative change values of the miR-143-3p relative expression level, the TET1 mRNA relative expression level, the IFN-γ protein relative expression level, the proportion of IFN-γ+CD4+Th1 cells, and the CD4+ T cell infiltration density of nasal polyp tissue as AE1, AE2, AE3, AE4, and AE5, respectively.
[0077] Step S303: Calculate (AE2+AE3+AE4+AE5) / AE1, and record it as the corresponding inflammatory factor coefficient. Based on all the relative change sequences, repeatedly obtain the inflammatory factor coefficient corresponding to each collection, and arrange them according to the collection order, and record them as the first factor sequence. The model used for subsequent modeling is extremely sensitive to the collinearity of multiple exogenous variables. Directly putting all 5 analysis indicators into the model will lead to inaccurate coefficient estimation, model overfitting, and a significant decrease in prediction accuracy. However, converting the 5 analysis indicators into inflammatory factor coefficients can make the model coefficient estimation more stable, completely avoid overfitting, and reduce the complexity of the model.
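The inflammatory factor coefficient of step S303 can be sketched as below. The dictionary keys are hypothetical names for the five relative change sequences; the formula itself, (AE2 + AE3 + AE4 + AE5) / AE1, is taken from the text.

```python
from typing import Dict, List


def inflammatory_factor(ae1: float, ae2: float, ae3: float, ae4: float, ae5: float) -> float:
    """(AE2 + AE3 + AE4 + AE5) / AE1 for a single collection."""
    return (ae2 + ae3 + ae4 + ae5) / ae1


def first_factor_sequence(rel: Dict[str, List[float]]) -> List[float]:
    """One coefficient per collection, in collection order."""
    return [
        inflammatory_factor(a1, a2, a3, a4, a5)
        for a1, a2, a3, a4, a5 in zip(
            rel["mir143"], rel["tet1"], rel["ifng"], rel["th1"], rel["cd4_density"]
        )
    ]


# Relative change sequences always start at 1.0 (each value is divided by the first).
rel = {
    "mir143": [1.0, 2.0],       # miR-143-3p doubles under treatment
    "tet1": [1.0, 0.5],
    "ifng": [1.0, 0.5],
    "th1": [1.0, 0.5],
    "cd4_density": [1.0, 0.5],  # the four pro-inflammatory indicators halve
}
factors = first_factor_sequence(rel)
```

The coefficient is 4.0 at baseline by construction and drops to 1.0 in the second collection, so a falling value indicates improving inflammation in this toy case.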
[0078] Step S304: Denote the data in the first factor sequence sequentially as AY_1 to AY_n, where n is the total number of data points in the first factor sequence; the relative change sequence of the LK score of the first patient is denoted as the first score sequence, and its data are sequentially denoted as AF_1 to AF_n;
[0079] Step S305: Set the candidate lag orders to 1, 2, and 3; for candidate lag order 1, calculate the Pearson correlation coefficient between AY_1 to AY_{n-1} of the first factor sequence and AF_2 to AF_n of the first score sequence, denoted as the evaluation coefficient of candidate lag order 1. Conventional modeling uses the analytical indicators of a given period to predict the LK score of the same period by default. In essence, this explains the disease rather than predicting it, and it wastes the core rule that changes in CRSwNP inflammatory indicators lead the LK score by 1 to 3 months: inflammation changes first and is reflected in the endoscopic LK score 1 to 3 months later.
[0080] Step S306: Repeatedly calculate the evaluation coefficients of all candidate lag orders, and record the candidate lag order with the largest evaluation coefficient as the possible lag order; repeatedly obtain the possible lag orders of all qualified patients, and record the possible lag order with the largest proportion as the optimal lag order k1;
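A minimal sketch of steps S305 and S306 is given below for illustration. The function names are hypothetical. The sketch follows the description in selecting the largest (signed) Pearson coefficient; the handling of constant sequences and of ties (the smaller lag and the first-encountered majority win) is an assumption, since the description does not specify it.

```python
from collections import Counter
from math import sqrt
from typing import Iterable, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant sequence is treated as uncorrelated
    return cov / sqrt(vx * vy)


def possible_lag(ay: Sequence[float], af: Sequence[float],
                 candidates: Tuple[int, ...] = (1, 2, 3)) -> int:
    """Lag k whose pairing AY_1..AY_{n-k} vs AF_{1+k}..AF_n correlates most."""
    scores = {k: pearson(ay[:len(ay) - k], af[k:]) for k in candidates}
    return max(candidates, key=lambda k: scores[k])


def optimal_lag(patients: Iterable[Tuple[Sequence[float], Sequence[float]]]) -> int:
    """Most frequent possible lag over all qualified patients (k1)."""
    votes = Counter(possible_lag(ay, af) for ay, af in patients)
    return votes.most_common(1)[0][0]
```

Here each patient is represented as a pair (first factor sequence, first score sequence), and each sequence must contain at least five values so that the lag-3 pairing still has two points.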
[0081] Step S307: Record AY_1 to AY_{n-k1} as the second factor sequence, and AF_{1+k1} to AF_n as the second score sequence; record the second factor sequence and second score sequence of the first patient as patient sample data, and repeatedly obtain the patient sample data of all eligible patients, recorded as reference sample data;
[0082] In practice, by obtaining the optimal lag order, the model can use the coefficients of inflammatory factors 1 to 3 months in advance to predict future LK scores, using real forward predictive signals rather than synchronous data from the same period, thereby improving clinical value and predictive reliability.
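The index alignment in step S307 can be sketched as follows. This is an illustrative helper with a hypothetical name, assuming both sequences have the same length n and that 1 <= k1 < n.

```python
from typing import List, Sequence, Tuple


def align_with_lag(ay: Sequence[float], af: Sequence[float],
                   k1: int) -> Tuple[List[float], List[float]]:
    """Pair AY_1..AY_{n-k1} with AF_{1+k1}..AF_n (1-based indices)."""
    n = len(ay)
    if n != len(af):
        raise ValueError("sequences must have the same length")
    if not 1 <= k1 < n:
        raise ValueError("k1 must satisfy 1 <= k1 < n")
    second_factor = list(ay[:n - k1])  # drops the last k1 factor values
    second_score = list(af[k1:])       # drops the first k1 score values
    return second_factor, second_score
```

After alignment, the j-th entry of the second factor sequence sits next to the LK value observed k1 collections later, which is what lets the model use earlier inflammatory data to predict a later score.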
[0083] Step S4 involves constructing an LK analysis prediction model based on reference sample data and analyzing and predicting the Lund-Kennedy score for CRSwNP patients. Step S4 includes the following sub-steps:
[0084] Step S401: The ARIMAX model is designated as the initial model. The input of the initial model is a partial second factor sequence and the corresponding partial second score sequence, and the output is the LK score for the next k2 times. The initial model is trained using reference sample data to obtain the LK analysis and prediction model, where k2 is the set number of times. In this embodiment, k2 is 1, and is generally 1 or 2; that is, predicting the LK score for the next 1 or 2 times.
[0085] Step S402: For CRSwNP patients to be analyzed, collect the corresponding nasal polyp cell inhibition data and disease score data, perform index preprocessing to obtain the corresponding second factor sequence and second score sequence, and input them into the LK analysis prediction model to obtain the LK score of the CRSwNP patient in the next k2 times.
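To make the role of the exogenous input concrete, the following sketch fits a deliberately simplified stand-in for the ARIMAX model: a first-order autoregression with one exogenous regressor, y_t = c + phi * y_{t-1} + beta * x_t, estimated by ordinary least squares with no differencing or moving-average terms. It is not the model of this embodiment. All function names are hypothetical, and fitting on a single patient's aligned sequences is an assumption made for brevity. In practice, a time-series library that supports exogenous regressors would normally be used instead.

```python
from typing import List, Sequence, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough variation in the data")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_arx1(factor: Sequence[float],
             score: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of score[t] = c + phi*score[t-1] + beta*factor[t].

    `factor` and `score` are the lag-aligned second factor and second score
    sequences of one patient.
    """
    if len(factor) != len(score) or len(score) < 4:
        raise ValueError("need two aligned sequences of length >= 4")
    rows = [[1.0, score[t - 1], factor[t]] for t in range(1, len(score))]
    targets = [score[t] for t in range(1, len(score))]
    # Normal equations (X^T X) w = X^T y for the three parameters.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    c, phi, beta = solve_linear(xtx, xty)
    return c, phi, beta


def forecast_next(params: Tuple[float, float, float],
                  last_score: float, next_factor: float) -> float:
    """One-step-ahead forecast, corresponding to k2 = 1."""
    c, phi, beta = params
    return c + phi * last_score + beta * next_factor
```

Because the sequences are aligned with lag k1, the factor value needed for the next forecast (`next_factor`) is an inflammatory factor coefficient that has already been measured, so no future exogenous data has to be guessed.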
[0086] In the actual implementation process, other machine learning models can also be selected as the initial model according to the actual application scenario. For example, if there are enough samples, the TFT model or GRU model can also be selected as the initial model.
[0087] Example 2: Referring to Figure 4, a schematic diagram of an electronic device is provided. The device may include a processor, a communication interface, a memory, and a communication bus. The processor, communication interface, and memory communicate with each other via the communication bus. The memory stores computer-readable instructions, and the processor can call these instructions. When the processor executes the computer-readable instructions, it performs the steps of the method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data, to achieve the following functions: collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data from CRSwNP patients to obtain raw sample data; filtering the raw sample data to remove unqualified raw sample data from CRSwNP patients to obtain standard sample data; performing index preprocessing based on the standard sample data and obtaining the optimal lag order to obtain reference sample data; constructing an LK analysis prediction model based on the reference sample data and analyzing and predicting the Lund-Kennedy scores of CRSwNP patients.
[0088] Furthermore, the logical instructions in the aforementioned memory may be implemented as software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0089] Example 3: This application also provides a computer-readable storage medium storing a computer program thereon. When the computer program is executed by a processor, it performs the steps of the sinusitis regulation analysis model construction method based on nasal polyp cell inhibition data to achieve the following functions: collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data of CRSwNP patients to obtain raw sample data of CRSwNP patients; screening the raw sample data to remove unqualified raw sample data of CRSwNP patients to obtain standard sample data; performing index preprocessing based on the standard sample data and obtaining the optimal lag order to obtain reference sample data; constructing an LK analysis prediction model based on the reference sample data and analyzing and predicting the Lund-Kennedy scores of CRSwNP patients.
[0090] Based on the above description of the embodiments, the embodiments of the present invention can be provided as methods, systems, or computer program products. Based on this understanding, the above technical solutions, in essence or in terms of their contribution to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or certain parts of the embodiments.
[0091] In the embodiments provided in this application, it should be understood that the disclosed system or method can be implemented in other ways. The embodiments described above are merely illustrative. For example, the division of modules or units is only a logical functional division, and there may be other division methods in actual implementation. Furthermore, multiple modules or units may be combined or integrated into another system, or some features may be ignored or not executed. Additionally, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between systems, modules, or units through some communication interfaces, and may be electrical, mechanical, or of other forms.
[0092] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application.
Claims
1. A method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data, characterized in that it includes the following steps: collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data from CRSwNP patients to obtain raw sample data of the CRSwNP patients; screening the raw sample data to remove the raw sample data of unqualified CRSwNP patients and obtain standard sample data; performing index preprocessing based on the standard sample data and obtaining the optimal lag order to obtain reference sample data; and constructing an LK analysis prediction model based on the reference sample data, and analyzing and predicting the Lund-Kennedy score of CRSwNP patients.
2. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 1, characterized in that, The process of collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data from CRSwNP patients to obtain raw sample data for CRSwNP patients includes the following sub-steps: Patients with chronic sinusitis and nasal polyps are designated as CRSwNP patients; any CRSwNP patient is designated as the first patient. During the treatment of the first patient, nasal polyp tissue and peripheral blood were collected from the first patient at the first time interval. Based on the collected peripheral blood, the relative expression levels of miR-143-3p, TET1 mRNA, IFN-γ protein, and the proportion of IFN-γ+CD4+Th1 cells were obtained. Based on the nasal polyp tissue, the CD4+ T cell infiltration density of the nasal polyp tissue was obtained. The number of collections was recorded to obtain the nasal polyp cell inhibition data of the first patient. The first time interval was T1.
3. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 2, characterized in that, Collecting Lund-Kennedy scores and corresponding nasal polyp cell inhibition data from CRSwNP patients to obtain raw sample data for CRSwNP patients also includes the following sub-steps: The Lund-Kennedy score of CRSwNP patients is recorded as the LK score; the LK score of the first patient is obtained simultaneously each time nasal polyp cell inhibition data is collected, and the disease score data of the first patient is obtained. The nasal polyp cell inhibition data and disease score data of the first patient were combined and recorded as the original sample data of the first patient. The original sample data of multiple first patients were obtained repeatedly.
4. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 3, characterized in that, The process of filtering the raw sample data to remove ineligible CRSwNP patient data and obtaining standard sample data includes the following sub-steps: The relative expression levels of miR-143-3p, TET1 mRNA, IFN-γ protein, IFN-γ+CD4+Th1 cells, and CD4+T cell infiltration density in nasal polyp tissue were uniformly recorded as analytical indicators. For the first patient, if any analytical indicator or LK score is missing in the original sample data of the first patient, the first patient is marked as an ineligible patient; otherwise, the patient is marked as a basic patient. The original sample data of all basic patients are obtained repeatedly.
5. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 4, characterized in that the process of filtering the raw sample data to remove ineligible CRSwNP patient data and obtaining standard sample data also includes the following sub-steps: If the first patient is a basic patient, any one of the analytical indicators is designated as the first indicator, and the collected values of the first indicator are arranged in the order of collection and recorded as the first indicator sequence. Calculate the coefficient of variation of the first indicator sequence, denoted as the first coefficient for the first patient; denote any data point in the first indicator sequence as X_i, where i is the position number, and calculate (X_{i+1} - X_i) / X_i, denoted as the adjacent rate of change corresponding to X_i. Based on the first indicator sequence, all corresponding adjacent rates of change are repeatedly calculated, and their average value is recorded as the second coefficient for the first patient. Repeatedly calculate the first and second coefficients of the first indicator for all basic patients, and normalize them separately. Then add the normalized first and second coefficients of the first patient to obtain the indicator coefficient of the first indicator for the first patient. Repeatedly obtain the indicator coefficients of all analytical indicators of the first patient, calculate their mean, and normalize it. Record this as the fluctuation score of the first patient.
6. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 5, characterized in that, The process of filtering the raw sample data to remove ineligible CRSwNP patient data and obtaining standard sample data also includes the following sub-steps: The relative expression levels of miR-143-3p and TET1 mRNA are designated as pair 1; the relative expression levels of TET1 mRNA and IFN-γ protein are designated as pair 2; and the relative expression levels of IFN-γ protein and the proportion of IFN-γ+CD4+Th1 cells are designated as pair 3. In pair 1, the expected changes of the two analytical indicators are opposite, while in pair 2 and pair 3, the expected changes of the two analytical indicators are the same. For pair 1, the two analytical indicators are arranged in the order of collection, and the proportion of the number of times the change direction of the two analytical indicators matches the expected value in all adjacent collections is recorded according to the corresponding sequence. This proportion is denoted as the coordination compliance rate of pair 1. Repeatedly obtain the synergy compliance rate corresponding to pair 2 and pair 3, then calculate the mean of all synergy compliance rates, denoted as BR, and record 1-BR as the synergy score of the first patient.
7. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 6, characterized in that the process of filtering the raw sample data to remove ineligible CRSwNP patient data and obtaining standard sample data also includes the following sub-steps: Calculate the mean of the fluctuation score and the synergy score of the first patient, and record it as the comprehensive abnormal score of the first patient; repeat the process to obtain the comprehensive abnormal scores of all basic patients, and calculate the mean AP and standard deviation AB of all comprehensive abnormal scores; if the first patient's comprehensive abnormal score is greater than AP + 3 × AB, the patient is marked as an ineligible patient; otherwise, the patient is marked as an eligible patient. All eligible patients are obtained repeatedly, and the original sample data of all eligible patients are recorded as the standard sample data.
8. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 7, characterized in that preprocessing the indicators based on standard sample data and obtaining the optimal lag order to obtain reference sample data includes the following sub-steps: If the first patient is a qualified patient, then for the first indicator sequence of the first indicator, calculate X_i / X_1, denoted as the relative change value of X_i, and replace X_i with X_i / X_1, where X_1 is the first data point in the first indicator sequence; repeat the replacement for all data in the first indicator sequence to obtain the relative change sequence of the first indicator. Repeatedly obtain the relative change sequences of all analytical indicators and of the LK score. Based on all the relative change sequences, the relative change values of the miR-143-3p relative expression level, TET1 mRNA relative expression level, IFN-γ protein relative expression level, IFN-γ+CD4+Th1 cell proportion, and CD4+ T cell infiltration density of nasal polyp tissue collected at any one time are recorded as AE1, AE2, AE3, AE4, and AE5 in sequence. Calculate (AE2+AE3+AE4+AE5) / AE1, and denote it as the corresponding inflammatory factor coefficient. Based on all the relative change sequences, repeatedly obtain the inflammatory factor coefficient corresponding to each collection, arrange the coefficients in the order of collection, and denote the result as the first factor sequence.
9. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 8, characterized in that preprocessing the indicators based on standard sample data and obtaining the optimal lag order to obtain reference sample data also includes the following sub-steps: The data in the first factor sequence are sequentially denoted as AY_1 to AY_n, where n is the total number of data in the first factor sequence; the relative change sequence of the LK score of the first patient is denoted as the first score sequence, and its data are sequentially denoted as AF_1 to AF_n. Set the candidate lag orders to 1, 2, and 3; for candidate lag order 1, calculate the Pearson correlation coefficient between AY_1 to AY_{n-1} of the first factor sequence and AF_2 to AF_n of the first score sequence, denoted as the evaluation coefficient of candidate lag order 1. Repeatedly calculate the evaluation coefficients of all candidate lag orders, and record the candidate lag order with the largest evaluation coefficient as the possible lag order; repeatedly obtain the possible lag orders of all qualified patients, and record the possible lag order with the largest proportion as the optimal lag order k1. Record AY_1 to AY_{n-k1} as the second factor sequence, and AF_{1+k1} to AF_n as the second score sequence; the second factor sequence and the second score sequence of the first patient are recorded as patient sample data, and the patient sample data of all eligible patients are repeatedly obtained and recorded as reference sample data.
10. The method for constructing a sinusitis regulation analysis model based on nasal polyp cell inhibition data according to claim 9, characterized in that, The LK analysis prediction model was constructed based on reference sample data, and the Lund-Kennedy score of CRSwNP patients was analyzed and predicted, including the following sub-steps: The ARIMAX model is referred to as the initial model. The input of the initial model is a partial second factor sequence and the corresponding partial second score sequence, and the output is the LK score for the next k2 times. The initial model is trained using reference sample data, resulting in the LK analysis and prediction model, where k2 is the set number of iterations. For CRSwNP patients to be analyzed, corresponding nasal polyp cell inhibition data and disease score data were collected, and the corresponding second factor sequence and second score sequence were obtained by index preprocessing. These were then input into the LK analysis prediction model to obtain the LK score of the CRSwNP patient for the next k2 times.