A method for predicting gestational hypertension in early pregnancy and its application
A predictive model built from the transcription start site features of specific genes addresses the shortcomings of early-pregnancy prediction of gestational hypertension, enabling efficient and accurate risk assessment for gestational hypertension and preeclampsia and making early intervention and risk reduction possible.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BGI GENOMICS CO LTD
- Filing Date
- 2024-12-30
- Publication Date
- 2026-06-30
AI Technical Summary
Current technologies for predicting gestational hypertension in early pregnancy are inadequate, and more accurate, non-invasive detection methods are needed.
Using transcription start site features of the LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 genes, a predictive model was constructed to assess the risk of gestational hypertension, including the prediction of gestational hypertension and preeclampsia.
It enables effective prediction of the risk of gestational hypertension in early pregnancy, provides opportunities for early intervention, reduces medical costs, and improves maternal and infant outcomes.
Description
Technical Field
[0001] This invention relates to the field of biomedicine, specifically to a set of genes associated with gestational hypertension, methods for predicting gestational hypertension and preeclampsia in early pregnancy using these genes, and related applications.
Background Technology
[0002] Hypertensive disorders in pregnancy (HDP) pose a significant threat to maternal health worldwide and are a key contributor to maternal morbidity and mortality. They endanger the physical health of pregnant women and place a heavy health and economic burden on society and families. To predict HDP accurately, reduce potential harm to pregnant women, and avoid unnecessary medical interventions, researchers have been working to develop more precise, non-invasive predictive technologies. With the rapid development of artificial intelligence and biomedical technology, biomarkers have received widespread attention as non-invasive diagnostic tools and have shown great potential for predicting HDP. Current technologies include methods and products that predict HDP from biochemical indicators (including protein biomarkers and metabolic biomarkers) or from a combination of biochemical and clinical indicators. These methods require several types of information to be collected, and their clinical application and predictive performance are limited. Gene detection methods that use SNP sites and miRNAs to predict HDP have also emerged, but their effectiveness and clinical application remain insufficient.
[0003] Therefore, finding a better and simpler method to predict gestational hypertension in early pregnancy remains an urgent problem to be solved.
Summary of the Invention
[0004] To address the above problems, this invention provides a set of genes associated with gestational hypertension, a method for predicting gestational hypertension in early pregnancy using these genes, and related applications.
[0005] The first aspect of the present invention provides the application of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in the prediction of gestational hypertension.
[0006] In some embodiments, the above applications include the use of reagents for detecting at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in cfDNA samples in the preparation of products for predicting gestational hypertension.
[0007] In some embodiments, the above applications include the use of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in the construction of a gestational hypertension prediction model.
[0008] In some embodiments, the above application includes: obtaining transcription start site feature values of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in the cfDNA sample of the pregnant woman to be tested, inputting the transcription start site feature values into a prediction model, and determining whether there is a risk of gestational hypertension based on the output results.
[0009] The transcription start site feature value mentioned above is the corrected number of sequences in the transcription start site region, calculated using the following formula:
[0010] $$\mathrm{TSS\ rpkm}_i = \frac{\text{reads number of TSS}_i \times 10^{9}}{\text{total reads number} \times \text{length of TSS}_i}$$
[0011] where reads number of TSS$_i$ is the number of sequences aligned to the transcription start site region of the relevant gene; total reads number is the total number of valid aligned sequencing reads for each sample; and length of TSS$_i$ is the length of the region, defined as the promoter region from -1 kb to +1 kb around the transcription start site of each transcript. The factor $10^{9}$ expresses the value per kilobase of region and per million aligned reads, as the name rpkm indicates.
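As a minimal sketch of the calculation described above, the function below computes the feature value for one region. The function name `tss_rpkm`, its parameter names, and the default region length of 2000 bp (from the -1 kb to +1 kb definition) are illustrative assumptions, and the 1e9 scaling follows the usual RPKM convention rather than a value stated explicitly in the text.

```python
def tss_rpkm(reads_in_tss: int, total_reads: int, tss_length_bp: int = 2000) -> float:
    """Depth- and length-corrected read count for one TSS region.

    reads_in_tss  : reads aligned to the TSS region (reads number of TSS_i)
    total_reads   : valid aligned reads in the sample (total reads number)
    tss_length_bp : region length in base pairs (length of TSS_i); 2000 is an
                    assumption derived from the -1 kb to +1 kb window
    """
    if total_reads <= 0 or tss_length_bp <= 0:
        raise ValueError("total_reads and tss_length_bp must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million reads); assumed RPKM scaling.
    return reads_in_tss * 1e9 / (total_reads * tss_length_bp)


# Illustrative numbers only, not data from the study.
example_value = tss_rpkm(reads_in_tss=40, total_reads=20_000_000)
```

With these made-up inputs, 40 reads in a 2 kb window out of 20 million aligned reads gives a value of 1.0.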
[0012] Preeclampsia is a severe form of gestational hypertension, and this invention also provides the application of at least one of the aforementioned related genes in the prediction of preeclampsia. It should be understood that the above-described applications in gestational hypertension are also applicable to the prediction of preeclampsia.
[0013] As one application of the above, a second aspect of the present invention provides a method for constructing a gestational hypertension prediction model, the method comprising:
[0014] Obtaining transcription start site features of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in pregnant women's cfDNA samples as modeling factors, and constructing a training sample set from these modeling factors and the corresponding pregnant women, the pregnant women including those with gestational hypertension and those without gestational hypertension.
[0015] Training several types of models on the above training sample set, evaluating the trained models, and determining the best prediction model based on the model evaluation index.
[0016] As one application of the above, a third aspect of the present invention provides a method for predicting gestational hypertension. The method includes: obtaining transcription start site feature values of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 from the cfDNA sample of the pregnant woman to be tested; inputting the transcription start site feature values into a prediction model obtained by the construction method of the second aspect of the present invention; and determining whether there is a risk of gestational hypertension based on the output results.
[0017] A fourth aspect of the present invention provides a predictive product for assessing the risk of gestational hypertension, the product comprising a memory for storing a program; and a processor for executing the program stored in the memory to implement the prediction method described in the third aspect of the present invention. A computer-readable storage medium is also provided, on which a program is stored that can be executed by a processor to implement the prediction method as described in the third aspect of the present invention.
[0018] As one application of the above, a fifth aspect of the present invention provides a method for constructing a preeclampsia prediction model, the method comprising:
[0019] Obtaining transcription start site features of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in pregnant women's cfDNA samples as modeling factors, and constructing a training sample set from these modeling factors and the corresponding pregnant women, the pregnant women including those with preeclampsia and those without preeclampsia.
[0020] Training several types of models on the above training sample set, evaluating the trained models, and determining the best prediction model based on the model evaluation index.
[0021] As one application of the above, a sixth aspect of the present invention provides a method for predicting preeclampsia, the method comprising: obtaining transcription start site feature values of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in a cfDNA sample of a pregnant woman to be tested; inputting the transcription start site feature values into a prediction model obtained by the construction method of the fifth aspect of the present invention; and determining whether there is a risk of preeclampsia based on the output results.
[0022] A seventh aspect of the present invention provides a predictive product for preeclampsia risk assessment, the product comprising a memory for storing a program; and a processor for executing the program stored in the memory to implement the prediction method described in the sixth aspect of the present invention. A computer-readable storage medium is also provided, on which a program is stored that can be executed by a processor to implement the prediction method as described in the sixth aspect of the present invention.
[0023] An eighth aspect of the present invention provides a computer-readable storage medium storing a prediction model obtained by the construction method provided in the second or fifth aspect above.
[0024] A ninth aspect of the present invention provides a kit for predicting gestational hypertension and/or preeclampsia by detecting at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in a cfDNA sample, the kit comprising primers for specifically amplifying at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108.
[0025] The beneficial effects of this invention are as follows: maternal blood samples are collected in early pregnancy, cfDNA information is obtained through routine sequencing, the transcription start site feature values of the above 7 genes are calculated, and a predictive model for the risk of gestational hypertension (including preeclampsia) is constructed. This predictive model can effectively and robustly predict the risk of gestational hypertension (including preeclampsia) in early pregnancy, helping doctors intervene early in pregnancy and reduce the risk of gestational hypertension.
[0026] The key advantages of this invention are: convenient blood collection in early pregnancy and a detection method that facilitates clinical application. Furthermore, the detection can be performed before 20 weeks of gestation, allowing for timely intervention in gestational hypertension and timely guidance on aspirin use in cases of preeclampsia, effectively improving maternal and infant outcomes. The model's high predictive accuracy provides more precise clinical intervention guidance and reduces medical costs.
Description of the Drawings
[0027] Figure 1 is the ROC curve of the optimal model for predicting gestational hypertension in this embodiment of the invention.
[0028] Figure 2 is the ROC curve of the optimal model for predicting preeclampsia in this embodiment of the invention.
Detailed Implementation
[0029] This invention, through the study of cfDNA samples in early pregnancy, identified a group of genes associated with the risk of gestational hypertension: LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108.
[0030] Specifically, this invention collects blood samples from pregnant women at 10 to 17+6 weeks of gestation, extracts plasma cfDNA, constructs a library, and sequences it. In this embodiment, sequencing was performed on the BGI Genomics T7 sequencing platform in PE100 mode at a depth of approximately 20X. The purpose of sequencing is to obtain the transcription start site feature values (TSS rpkm) of the aforementioned genes, so the sequencing platform or method itself does not limit this invention, and those skilled in the art can determine an appropriate sequencing depth for the sequencing method used. The FastQ data obtained from sequencing undergo quality control, and samples that pass quality control proceed to reference genome alignment. After alignment, multi-aligned sequences and PCR duplicate sequences are removed. The transcription start site feature values (TSS rpkm) are then calculated from the aligned data: the number of sequences aligned to each transcription start site region (reads number of TSS$_i$) is divided by the total number of valid aligned sequencing reads for the sample (total reads number) and by the length of the region (length of TSS$_i$), which corrects for sequencing depth. The region for each transcript is the promoter region from -1 kb to +1 kb around the transcription start site.
[0031] $$\mathrm{TSS\ rpkm}_i = \frac{\text{reads number of TSS}_i \times 10^{9}}{\text{total reads number} \times \text{length of TSS}_i}$$
[0032] After the above calculations, this invention selects seven transcription start site features involving the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 that are significantly different and stable in the training set from the whole genome as feature values and predictive markers for subsequent model construction.
[0033] Furthermore, the present invention uses the aforementioned transcription start site feature value (TSS rpkm) as a modeling factor, and constructs a training sample set with these modeling factors and their corresponding pregnant women, wherein the pregnant women include pregnant women with gestational hypertension and pregnant women without gestational hypertension.
[0034] Based on the above training sample set, several types of models are trained, and the trained models are evaluated. The best prediction model is determined based on the model evaluation index.
[0035] In some specific embodiments, a random forest model is used, taking the above seven feature values as model input values and outputting whether the pregnant woman has gestational hypertension, to construct a gestational hypertension prediction model. Besides the random forest model mentioned above, other machine learning, deep learning, reinforcement learning, and other algorithms can also be used to construct prediction models suitable for this invention.
[0036] In some specific embodiments, the transcription start site features of all seven genes can be used as input in model construction. In other specific embodiments, such as those exemplified in Table 4 of the present invention, one, two, three, four, five, or six of the aforementioned seven genes can also be used as input to construct the prediction model, achieving similar prediction results. Therefore, different prediction models correspond to different inputs.
[0037] It should be noted that, as mentioned in the above description, the seven genes and their various combinations can be fully utilized in model construction to obtain multiple different prediction models and ROC curves (Receiver Operating Characteristic curves). Based on the different ROC curves, the sensitivity, specificity, positive predictive value, and negative predictive value of each model in predicting gestational hypertension can be calculated, and the optimal model can be selected.
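The four indicators named above can be computed from binary predictions as in the following minimal sketch. The function name `confusion_metrics` and the 0/1 label encoding (1 for gestational hypertension, 0 for no gestational hypertension) are assumptions made for illustration.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # cases correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-cases correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged samples that are cases
        "npv": ratio(tn, tn + fn),          # cleared samples that are non-cases
    }
```

Comparing these values across models built from different gene combinations is one way to carry out the model selection described above.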
[0038] In one specific embodiment of the present invention, the threshold can be set to the risk value at which the specificity of the model is 90%, or to the value at which the specificity is 80% or 85%. It should be understood that other methods can also be used to determine the threshold.
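One way to derive such a threshold is from the model scores of the pregnant women without gestational hypertension, as in the sketch below. It assumes that a sample is called high risk when its score is strictly above the threshold; the function name `threshold_at_specificity` and the example scores are hypothetical.

```python
import math
from typing import Sequence


def threshold_at_specificity(control_scores: Sequence[float],
                             target_specificity: float = 0.90) -> float:
    """Smallest observed control score t such that at least the target
    fraction of controls score <= t (and are therefore called low risk)."""
    if not control_scores:
        raise ValueError("control_scores must not be empty")
    if not 0.0 < target_specificity <= 1.0:
        raise ValueError("target_specificity must be in (0, 1]")
    ranked = sorted(control_scores)
    # Number of controls that must fall at or below the threshold; the small
    # epsilon guards against floating-point overshoot in the product.
    needed = math.ceil(target_specificity * len(ranked) - 1e-9)
    return ranked[needed - 1]


# Made-up control scores for illustration only.
controls = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.90]
t90 = threshold_at_specificity(controls, 0.90)
```

With these ten made-up scores, nine controls fall at or below 0.45, so 0.45 is the threshold for 90% specificity.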
[0039] It should be understood that once the prediction model has been constructed by the above method, predicting the risk of gestational hypertension involves calculating, from the sequencing results of the pregnant woman being tested, the transcription start site feature values for the gene combination used in that prediction model. The model then calculates the risk value, which is compared with a set threshold: values above the threshold are considered high risk, and values below the threshold are considered low risk. It should also be understood that the method for obtaining transcription start site feature values described herein for model construction applies equally to the prediction method for gestational hypertension.
[0040] In some specific embodiments, the method for predicting gestational hypertension disclosed in this invention includes:
[0041] 1) Obtain samples from pregnant women. These samples include, but are not limited to, plasma, whole blood, serum, urine, saliva, amniotic fluid, cerebrospinal fluid, and nipple aspirate fluid.
[0042] 2) Extract cfDNA from the sample and construct a sequencing library for sequencing. The sequencing depth is approximately 20X.
[0043] 3) The FastQ data obtained from the sequencing were quality controlled, and the quality-controlled samples were aligned with the reference genome. After alignment, multiple alignment sequences and PCR repetitive sequences were removed. The transcription start site characteristic value (TSS rpkm) was calculated from the aligned data.
[0044] The transcription start site feature value mentioned above is the corrected number of sequences in the transcription start site region, calculated using the following formula:
[0045] $$\mathrm{TSS\ rpkm}_i = \frac{\text{reads number of TSS}_i \times 10^{9}}{\text{total reads number} \times \text{length of TSS}_i}$$
[0046] where reads number of TSS$_i$ is the number of sequences aligned to the transcription start site region of the relevant gene; total reads number is the total number of valid aligned sequencing reads for each sample; and length of TSS$_i$ is the length of the region, defined as the promoter region from -1 kb to +1 kb around the transcription start site of each transcript.
[0047] 4) Input the obtained transcription start site feature values into the prediction model to obtain the risk value for gestational hypertension. Compare the risk value with a predetermined threshold: values higher than the threshold are considered high risk, and values lower than the threshold are considered low risk.
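Step 4) can be sketched as follows. The `model` argument is a hypothetical callable standing in for whatever trained prediction model is used, the fixed gene order in `GENES` is an assumption about how the input vector is assembled, and treating a risk value exactly equal to the threshold as low risk is also an assumption, since the text only specifies values above and below the threshold.

```python
from typing import Callable, Mapping, Sequence, Tuple

GENES = ("LOC124902572", "EEF1A1P16", "LOC105369767", "GOLM2",
         "CDRT7", "MEIS3", "LOC105376108")


def predict_risk(features: Mapping[str, float],
                 model: Callable[[Sequence[float]], float],
                 threshold: float,
                 genes: Sequence[str] = GENES) -> Tuple[float, str]:
    """Return (risk value, 'high risk' or 'low risk') for one sample."""
    missing = [g for g in genes if g not in features]
    if missing:
        raise KeyError(f"missing TSS feature values for: {missing}")
    # Assemble the input vector in a fixed gene order (assumed convention).
    vector = [features[g] for g in genes]
    risk = model(vector)
    # Equality is treated as low risk here; the source leaves this case open.
    label = "high risk" if risk > threshold else "low risk"
    return risk, label


# Toy stand-in model and made-up feature values, for illustration only.
toy_features = {gene: 0.5 for gene in GENES}
toy_risk, toy_label = predict_risk(toy_features, lambda v: sum(v) / len(v), threshold=0.4)
```

A model trained on a subset of the genes would be called with the matching `genes` sequence so that the input vector has the layout the model expects.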
[0048] As mentioned earlier, preeclampsia is a severe form of gestational hypertension. Studies have shown that administering low-dose aspirin to pregnant women before 16 weeks of gestation can significantly reduce the risk of preeclampsia. During the research, the inventors found that the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 also have predictive value for preeclampsia. Therefore, the methods described above for constructing gestational hypertension prediction models and for predicting gestational hypertension also apply to preeclampsia. The difference is that, during model construction, the training sample set includes pregnant women who developed preeclampsia and those who did not, so that a prediction model suited to preeclampsia risk is constructed, and that, in application, the model outputs high or low risk of preeclampsia. Those skilled in the art, after reading the above description for gestational hypertension, will understand how to use the LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 genes to construct a preeclampsia prediction model and how to use that model to predict preeclampsia in pregnant women. The construction method and prediction method for preeclampsia are therefore not elaborated here. The following embodiments illustrate more specifically the construction and performance of the prediction models for gestational hypertension and preeclampsia.
[0049] Meanwhile, this invention provides a kit for predicting gestational hypertension or preeclampsia by detecting at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in a sample. The sample contains cfDNA. The detection refers to the extraction or acquisition of the relevant genes from the sample. In one specific embodiment, the genes can be obtained from the sample through specific amplification. In this case, the kit includes primers for specifically amplifying any one, two, three, four, five, six, or all seven genes of LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108.
[0050] The present invention will now be described in further detail with reference to specific embodiments and accompanying drawings.
[0051] Example 1
[0052] 1. Sample collection
[0053] 1.1. Inclusion criteria:
[0054] Inclusion criteria for gestational hypertension: ① Abnormal blood pressure: new-onset systolic blood pressure ≥140 mmHg and/or diastolic blood pressure ≥90 mmHg after 20 weeks of gestation, confirmed by two blood pressure measurements at least 4 hours apart, with blood pressure returning to normal after delivery. ② No history of hypertension: hypertension occurring for the first time during pregnancy, with no previous history.
[0055] Inclusion criteria for preeclampsia: After 20 weeks of gestation, pregnant women with systolic blood pressure ≥140 mmHg and / or diastolic blood pressure ≥90 mmHg, accompanied by any one of the following: urinary protein quantification ≥0.3g / 24h, or urinary protein / creatinine ratio ≥0.3, or random urinary protein ≥(+) (testing method when quantification of protein is not available); no proteinuria but accompanied by involvement of any of the following organs or systems: abnormal changes in vital organs such as the heart, lungs, liver, and kidneys, or the blood system, digestive system, or nervous system, or placental-fetal involvement, etc.
[0056] Inclusion criteria for healthy controls: full-term pregnancies without pregnancy complications, with good fetal growth at birth, and with no obstetric, medical, or surgical complications during pregnancy. Exclusion criteria: ① Other pregnancy complications; ② Severe heart, liver, or kidney dysfunction; ③ Autoimmune diseases or malignant tumors. Pregnancies affected by chromosomal abnormalities, congenital abnormalities, premature birth, or multiple gestation were also excluded.
[0057] 1.2. Participants:
[0058] Based on the above inclusion criteria, we collected residual blood samples from pregnant women who underwent NIPT screening in early pregnancy at six partner hospitals, and selected the samples tested at 10 to 17+6 weeks of gestation.
[0059] Table 1. Brief Introduction to Sample Sources
[0060]
[0061]
[0062] 2. Sequencing of cfDNA:
[0063] Blood samples from pregnant women were collected in EDTA-K2 blood collection tubes, and plasma separation was completed within 6 hours of collection. Plasma separation conditions: centrifuge at 1,600 × g and 4°C for 10 minutes, take the supernatant, and centrifuge again at 16,000 × g and 4°C for 10 minutes. The resulting supernatant plasma was stored at -80°C. The plasma input volume for cfDNA extraction was 200 μL. cfDNA was extracted with nucleic acid extraction reagents (BGI [Hubei Medical Device Preparation No. 20150250]), and libraries were constructed with the MGIEasy cell-free DNA library preparation kit set (MGI). Libraries with a concentration greater than 8 ng/μL were considered qualified. Sequencing was performed on the MGISEQ T7 sequencing platform in PE100 mode at a depth of about 20X.
[0064] 3. Obtaining cfDNA characteristics:
[0065] 3.1. Preprocessing of sequencing data:
[0066] Perform quality control on the obtained fastq data of sequencing, use the fastp software to obtain the quality control report, and perform the next step of reference genome alignment for the samples passing the quality control. Use the bwa software to align the sequencing sequences to the hg38 version of the genome, and then remove the multi-aligned sequences and PCR duplicate sequences.
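The removal of multi-aligned and PCR duplicate sequences can be pictured as a filter over alignment records, as in the sketch below. It assumes that duplicates were already flagged by an upstream duplicate-marking step (no tool is named in the text) and that a mapping-quality cutoff is used as the proxy for multi-aligned reads; the `Alignment` record, the function `keep_alignment`, and the cutoff of 30 are hypothetical. The flag bits are those defined by the SAM format.

```python
from typing import NamedTuple

# Flag bits defined in the SAM format specification.
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_DUPLICATE = 0x400
FLAG_SUPPLEMENTARY = 0x800


class Alignment(NamedTuple):
    """Minimal stand-in for one alignment record (hypothetical type)."""
    name: str
    flag: int
    mapq: int


def keep_alignment(aln: Alignment, min_mapq: int = 30) -> bool:
    """True if the read is mapped, primary, not a marked duplicate, and
    mapped with at least min_mapq (assumed proxy for unique alignment)."""
    excluded = FLAG_UNMAPPED | FLAG_SECONDARY | FLAG_DUPLICATE | FLAG_SUPPLEMENTARY
    return (aln.flag & excluded) == 0 and aln.mapq >= min_mapq


# Made-up records for illustration.
records = [
    Alignment("r1", 99, 60),           # properly paired primary alignment
    Alignment("r2", 99 | 0x400, 60),   # marked as PCR/optical duplicate
    Alignment("r3", 163, 0),           # ambiguous placement (MAPQ 0)
    Alignment("r4", 99 | 0x100, 60),   # secondary alignment
]
kept = [r.name for r in records if keep_alignment(r)]
```

Only reads that pass this filter would count toward the total number of valid aligned reads used in the feature calculation.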
[0067] 3.2. Calculation of cfDNA characteristic values:
[0068] Count the number of sequences aligned to each transcription start site region (reads number of TSS$_i$), then divide it by the total number of valid aligned sequencing reads for the sample (total reads number) and by the length of the region (length of TSS$_i$) to correct for sequencing depth. The region for each transcript spans -1 kb to +1 kb around the transcription start site, 2 kb in total.
[0069] $$\mathrm{TSS\ rpkm}_i = \frac{\text{reads number of TSS}_i \times 10^{9}}{\text{total reads number} \times \text{length of TSS}_i}$$
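The counting step can be illustrated with a small sketch. It assumes 0-based coordinates on a single chromosome, a half-open window, and that a read is assigned to the region by its leftmost aligned position; these conventions and the function names are illustrative assumptions, not details given in the text.

```python
from typing import Iterable, Tuple


def tss_window(tss_pos: int, flank: int = 1000) -> Tuple[int, int]:
    """Half-open window [start, end) spanning -1 kb to +1 kb around the TSS."""
    return max(0, tss_pos - flank), tss_pos + flank


def count_reads_in_window(read_starts: Iterable[int], window: Tuple[int, int]) -> int:
    """Number of reads whose leftmost aligned position lies inside the window."""
    start, end = window
    return sum(1 for pos in read_starts if start <= pos < end)


# Made-up coordinates for illustration.
window = tss_window(5000)
n_reads = count_reads_in_window([3999, 4000, 5000, 5999, 6000], window)
```

Here the window is (4000, 6000), 2 kb long, and three of the five made-up read positions fall inside it.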
[0070] In this example, select the transcription start site characteristic values (TSS rpkm) of 7 genes that are significantly different and stable in the training set from the whole genome as the characteristic values and prediction markers for subsequent model construction. The specific information is shown in Table 2.
[0071] Table 2. Transcription start site information of 7 genes
[0072]
[0074] 4. Constructing the prediction model for gestational hypertension
[0075] 4.1. Model Training
[0076] In this example, the transcription start site feature values (TSS rpkm) of the seven selected genes are used to construct random forest models that predict the risk of gestational hypertension and preeclampsia. Specifically, the random forest model is tuned to select the most suitable combination of hyperparameters. With the hyperparameters fixed, 10-fold cross-validation is repeated 100 times on the training set, and the model with the best AUC (area under the curve) on the validation set is selected as the final model.
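The repeated cross-validation loop can be sketched in plain Python as below. The `fit` argument is a hypothetical callable that trains a model with fixed hyperparameters and returns a scoring function; it stands in for random forest training, which is not implemented here. The folds are not stratified and folds containing a single class are skipped, both of which are simplifying assumptions.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Predictor = Callable[[Sequence[float]], float]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n: int, k: int, rng: random.Random) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k folds."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]


def repeated_cv(X: Sequence[Sequence[float]], y: Sequence[int],
                fit: Callable[[list, list], Predictor],
                repeats: int = 100, k: int = 10,
                seed: int = 0) -> Tuple[Optional[Predictor], List[float]]:
    """Repeat k-fold cross-validation; keep the model with the best fold AUC."""
    rng = random.Random(seed)
    best_model, best_auc, aucs = None, -1.0, []
    for _ in range(repeats):
        for fold in kfold_indices(len(y), k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            fold_labels = [y[i] for i in fold]
            if len(set(fold_labels)) < 2:
                continue  # AUC is undefined when a fold holds only one class
            model = fit([X[i] for i in train], [y[i] for i in train])
            fold_auc = auc(fold_labels, [model(X[i]) for i in fold])
            aucs.append(fold_auc)
            if fold_auc > best_auc:
                best_model, best_auc = model, fold_auc
    return best_model, aucs
```

In practice `fit` would wrap a random forest implementation, and stratified folds may be preferable when cases are rare.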
[0077] 4.2. Model Prediction
[0078] After the optimal model is obtained by training on the training set, this example uses it to perform risk prediction on the validation set. The AUC of the gestational hypertension model is approximately 0.78, and the AUC of the preeclampsia model is approximately 0.81. Specific indicator results are shown in Table 3, and the ROC curves (receiver operating characteristic curves) are shown in Figure 1 and Figure 2.
[0079] Table 3. Evaluation Metrics of the Optimal Model
[0080]
[0081] 4.3. Other Models
[0082] In addition to using all seven TSS feature values (TSS rpkm) as model features, this example also evaluated the modeling performance of all combinations of the TSS feature values (TSS rpkm) of the seven genes. Some combinations achieved comparably good predictive results. The final results are shown in Tables 4 and 5.
[0083] Table 4. Prediction results of models built from combinations of the 7 features for gestational hypertension
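Enumerating the feature combinations can be sketched as follows. Because the order of the inputs does not define a different feature set, the sketch lists combinations rather than permutations, which for seven genes gives 2^7 - 1 = 127 non-empty subsets; the function name `gene_subsets` is hypothetical.

```python
from itertools import combinations
from typing import Iterator, Sequence, Tuple

GENES = ("LOC124902572", "EEF1A1P16", "LOC105369767", "GOLM2",
         "CDRT7", "MEIS3", "LOC105376108")


def gene_subsets(genes: Sequence[str] = GENES) -> Iterator[Tuple[str, ...]]:
    """Yield every non-empty combination of genes, smallest subsets first."""
    for size in range(1, len(genes) + 1):
        yield from combinations(genes, size)


all_subsets = list(gene_subsets())
```

Each subset would then be used as the feature set for one candidate model and evaluated in the same way as the seven-gene model.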
[0084]
[0085]
[0086]
[0087]
[0088] *The dataset AUC summarizes 10 model predictions with fixed hyperparameters and is reported as mean ± half the range.
[0089] Table 5. Prediction results of models built from combinations of the 7 features for preeclampsia
[0090]
[0091]
[0092]
[0093]
[0094] *The dataset AUC summarizes 10 model predictions with fixed hyperparameters and is reported as mean ± half the range.
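The summary statistic used in the table footnotes, mean ± half the range, can be computed as in this short sketch. The function name and the AUC values are made up for illustration and are not results from the tables.

```python
from typing import Sequence, Tuple


def mean_and_half_range(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, (max - min) / 2) for a list of repeated AUC estimates."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    half_range = (max(values) - min(values)) / 2
    return mean, half_range


# Made-up AUC values, for illustration only.
mean_auc, spread = mean_and_half_range([0.70, 0.74, 0.72, 0.78])
summary = f"{mean_auc:.3f} ± {spread:.3f}"
```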
[0095] The above examples illustrate the present invention only to aid in understanding it and are not intended to limit the scope of the invention. Those skilled in the art can make various simple deductions, modifications, or substitutions based on the principles of this invention.
Claims
1. The application of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in the prediction of gestational hypertension or preeclampsia.
2. Application of reagents for detecting at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in cfDNA samples in the preparation of products for predicting gestational hypertension or preeclampsia.
3. Application of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in the construction of a predictive model for gestational hypertension or preeclampsia.
4. The application of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in the prediction of gestational hypertension or preeclampsia, wherein the application includes: obtaining the transcription start site feature value of at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in the cfDNA sample of the pregnant woman to be tested, inputting the transcription start site feature value into the prediction model, and determining from the output result whether there is a risk of gestational hypertension or preeclampsia; the transcription start site feature value is the corrected number of sequences in the transcription start site region, calculated using the following formula:
$$\mathrm{TSS\ rpkm}_i = \frac{\text{reads number of TSS}_i \times 10^{9}}{\text{total reads number} \times \text{length of TSS}_i}$$
5. A method for constructing a predictive model for gestational hypertension, characterized in that the construction method includes: obtaining transcription start site features of at least one of LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 from pregnant women's cfDNA samples as modeling factors, and constructing a training sample set from these modeling factors and the corresponding pregnant women, the pregnant women including those with and without gestational hypertension, wherein the transcription start site feature is the corrected number of sequences in the transcription start site region; and training several types of models on the training sample set, evaluating the trained models, and determining the best prediction model based on the model evaluation index.
6. A method for predicting gestational hypertension, characterized in that the prediction method includes: obtaining transcription start site feature values of at least one of LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in the cfDNA sample of the pregnant woman to be tested, wherein the transcription start site feature value is the corrected number of sequences in the transcription start site region; inputting the transcription start site feature values into the prediction model obtained by the construction method as described in claim 5; and determining the risk of gestational hypertension based on the output results.
7. A predictive product for assessing the risk of gestational hypertension, characterized in that it includes: a memory for storing a program; and a processor for implementing the prediction method as described in claim 6 by executing the program stored in the memory.
8. A method for constructing a preeclampsia prediction model, characterized in that the construction method includes: obtaining transcription start site features of at least one of LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 from pregnant women's cfDNA samples as modeling factors, and constructing a training sample set from these modeling factors and the corresponding pregnant women, the pregnant women including those with and without preeclampsia, wherein the transcription start site feature is the corrected number of sequences in the transcription start site region; and training several types of models on the training sample set, evaluating the trained models, and determining the best prediction model based on the model evaluation index.
9. A method for predicting preeclampsia, characterized in that the prediction method includes: obtaining transcription start site feature values of at least one of LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3, and LOC105376108 in the cfDNA sample of the pregnant woman to be tested, wherein the transcription start site feature value is the corrected number of sequences in the transcription start site region; inputting the transcription start site feature values into the prediction model obtained by the construction method as described in claim 8; and determining the risk of preeclampsia based on the output results.
10. A predictive product for preeclampsia risk assessment, characterized in that it includes: a memory for storing a program; and a processor for implementing the prediction method as described in claim 9 by executing the program stored in the memory.
11. A computer-readable storage medium, characterized in that the medium stores a program that can be executed by a processor to implement the method as described in claim 6 or 9.
12. A computer-readable storage medium, characterized in that the predictive model obtained by the construction method as described in claim 5 or 8 is stored on the medium.
13. A gestational hypertension or preeclampsia prediction kit for detecting at least one of the genes LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108 in a cfDNA sample, said kit comprising primers that specifically amplify at least one of LOC124902572, EEF1A1P16, LOC105369767, GOLM2, CDRT7, MEIS3 and LOC105376108.