Screening markers for pancreatic cancer susceptible populations

By utilizing high-frequency mutation sites in the CFTR gene and other germline mutant genes, biomarker compositions and prediction systems were developed, solving the problem of geographical and racial differences in early pancreatic cancer detection and enabling precise screening of susceptible populations for pancreatic cancer.

CN117757935B · Active · Publication Date: 2026-06-16 · THE NAVAL MEDICAL UNIV OF PLA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
THE NAVAL MEDICAL UNIV OF PLA
Filing Date
2024-01-24
Publication Date
2026-06-16


Abstract

The application belongs to the field of biomedicine and particularly relates to the use of a germline mutant gene in the preparation of a screening reagent for a tumor-susceptible population, wherein the germline mutant gene is selected from one or more of the BRCA1, BRCA2, PALB2, NBN, BARD1, RAD51D and CFTR germline mutant genes; the reagent further comprises a preparation for detecting a germline mutation site of the CFTR gene, wherein the CFTR germline mutation site is selected from one or more of p.L69F, p.R74W, p.L88X, p.L100I, p.P140S, p.V201M, p.R297Q, p.G451R, p.E681V, p.H949P, p.G970D, p.R1097C, p.A1136T, p.G1298A, p.C1355Y, p.A1364V, p.R1453W and p.S1456R. The marker can be used to screen a tumor-susceptible population efficiently, and the method is better suited to Asian populations and gives more accurate results.

Description

Technical Field

[0001] This invention belongs to the field of biomedicine, specifically relating to the screening of pancreatic cancer susceptible populations, and more specifically to a set of biomarker compositions for screening pancreatic cancer susceptible populations; it also relates to the application of the biomarker compositions in the preparation of diagnostic reagents for pancreatic cancer susceptible populations.

Background Technology

[0002] Pancreatic cancer is currently the 7th leading cause of cancer-related death worldwide, and its mortality rate is closest to its incidence rate among malignant tumors (495,773 new cases and 466,003 deaths worldwide in 2020). Its 5-year survival rate is 9% to 11%, which seriously endangers human life and health (SUNG H, FERLAY J, SIEGEL RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries[J]. CA Cancer J Clin, 2021, 71(3): 209-249.). The latest data shows that in 2022, China had 134,374 new cases of pancreatic cancer and 131,203 deaths from pancreatic cancer. The incidence rate of pancreatic cancer ranked 8th among all cancers and the mortality rate ranked 6th. Among them, the incidence rate of males ranked 5th and the mortality rate ranked 3rd, both higher than that of females, showing a certain gender difference (XIA CF, DONG XS, LI H, et al. Cancer statistics in China and United States, 2022: profiles, trends, and determinants[J]. Chin Med J (Engl), 2022, 135(5): 584-590). The health statistics released by the American Cancer Society differ slightly from those in China. In 2022, the American Cancer Society projected 62,210 new cases of pancreatic cancer and 49,830 deaths from pancreatic cancer, ranking only 12th in incidence among all cancers. However, it is worth noting that the overall mortality rate of pancreatic cancer in the United States has surpassed that of breast cancer, becoming the third leading cause of cancer death. This trend is prevalent in both developed countries and developing countries with rapid economic growth (SIEGEL RL, MILLER KD, FUCHS HE, et al. Cancer statistics, 2022[J]. CA Cancer J Clin, 2022, 72(1): 7-33.). The incidence of pancreatic ductal adenocarcinoma (PDAC) varies greatly with race and region, with the highest incidence in Western Europe and North America and the lowest in East Africa and Central and South Asia, with a difference of more than 10 times (WILD CP, WEIDERPASS E, STEWART BW. World cancer report: cancer research for cancer prevention[M]. Lyon: IARC Publications, 2020).

[0003] Early symptoms of pancreatic cancer are often subtle and atypical, making diagnosis difficult. Most patients are already in the middle or late stages when they seek medical attention and have lost the opportunity for radical surgery. Studies have shown that testing for the serum marker CA19-9 is sufficient for the general population, while imaging screening for pancreatic cancer is essential for high-risk individuals (PREVENTIVE SERVICES TASK FORCE US, OWENS DK, DAVIDSON KW, et al. Screening for pancreatic cancer: US preventive services task force reaffirmation recommendation statement[J]. JAMA, 2019, 322(5): 438-444). The Pancreatic Cancer Early Detection Consortium proposes the use of standardized imaging and MRI reporting templates in pancreatic cancer screening to improve the consistency and accuracy of screening (HUANG CC, SIMEONE DM, LUK L, et al. Standardization of MRI screening and reporting in individuals with elevated risk of pancreatic ductal adenocarcinoma: consensus statement of the PRECEDE consortium[J]. AJR Am J Roentgenol, 2022, 219(6): 903-914.). Meanwhile, a recent prospective study (OVERBEEK KA, LEVINK IJM, KOOPMANN BDM, et al. Long-term yield of pancreatic cancer surveillance in high-risk individuals[J]. Gut, 2022, 71(6): 1152-1160.) shows that annual physical examinations with endoscopic ultrasound and MRI/cholangiopancreatography can bring significant benefits to asymptomatic individuals with a genetic predisposition to pancreatic cancer. Artificial intelligence has also been used to analyze pancreatic image features to improve the efficiency and accuracy of PDAC risk prediction and information processing, thus better serving high-risk individuals (QURESHI TA, JAVED S, SARMADI T, et al. Artificial intelligence and imaging for risk prediction of pancreatic cancer: a narrative review[J]. Chin Clin Oncol, 2022, 11(1): 1.). The results show that the accuracy of artificial intelligence in classifying pancreatic cancer risk reached 89.3%, with sensitivity and specificity reaching 86.0% and 93.0%, respectively. However, early pancreatic cancer screening based solely on imaging modalities is clearly insufficient to provide reliable predictive results. Recently, several novel serum biomarkers have been proposed, and studies have reported that serum protein N-glycan can serve as one of the early detection indicators of pancreatic cancer risk in high-risk individuals (LEVINK IJM, KLATTE DCF, HANNA-SAWIRES RG, et al. Longitudinal changes of serum protein N-glycan levels for earlier detection of pancreatic cancer in high-risk individuals[J]. Pancreatology, 2022, 22(4): 497-506.). A group of long non-coding RNAs with oncogenic effects in pancreatic cancer can also serve as novel biomarkers and potential intervention targets for pancreatic cancer (DA PAIXÃO VF, SOSA OJ, DA SILVA PELLEGRINA DV, et al. Annotation and functional characterization of long noncoding RNAs deregulated in pancreatic adenocarcinoma[J]. Cell Oncol, 2022, 45(3): 479-504.). Another study identified the serum miRNA signature of PDAC patients and determined a fingerprint of nine miRNAs (miR-205-5p, 934, 192-5p, 194-5p, 194-3p, 215-5p, 375-3p, 552-3p and 1251-5p) for identifying early PDAC patients (KANDIMALLA R, SHIMURA T, MALLIK S, et al. Identification of serum miRNA signature and establishment of a nomogram for risk stratification in patients with pancreatic ductal adenocarcinoma[J]. Ann Surg, 2022, 275(1): e229-e237.). Another study suggests that the methylation level of miRNAs can also serve as an important biomarker for the early diagnosis of pancreatic cancer, with sensitivity and specificity even superior to carcinoembryonic antigen (CEA) and CA19-9 (KONNO M, KOSEKI J, ASAI A, et al. Distinct methylation levels of mature microRNAs in gastrointestinal cancers[J]. Nat Commun, 2019, 10(1): 3888.). Finally, progress has been made in the application of extracellular vesicles in pancreatic cancer monitoring. By measuring the proportion of IgG-positive extracellular vesicles in small plasma samples, the effectiveness of treatment in pancreatic cancer patients can be rapidly assessed (COUTO N, ELZANOWSKA J, MAIA J, et al. IgG+ extracellular vesicles measure therapeutic response in advanced pancreatic cancer[J]. Cells, 2022, 11(18): 2800.).

[0004] Recent multi-omics studies on pancreatic cancer and its precancerous lesions have revealed some important mechanisms of genetic alterations that cause the development and progression of pancreatic cancer. Studies have found that about 10% of familial clustering of pancreatic cancer is caused by hereditary tumor syndromes and hereditary pancreatitis resulting from rare high-risk genetic variants (Raimondi, S., P. Maisonneuve, and A.B. Lowenfels, Epidemiology of pancreatic cancer: an overview. Nat Rev Gastroenterol Hepatol, 2009. 6(12): p. 699-708). Germline mutations in BRCA1, BRCA2, ATM, PALB2, CDKN2A, MLH1 and MSH2, STK11 and TP53 genes are major susceptibility genes for familial pancreatic cancer (Roberts, NJ, et al., Whole Genome Sequencing Defines the Genetic Heterogeneity of Familial Pancreatic Cancer. Cancer Discov, 2016. 6(2): p. 166-75. & Zhen, DB, et al., BRCA1, BRCA2, PALB2, and CDKN2A mutations in familial pancreatic cancer: a PACGENE study. Genet Med, 2015. 17(7): p. 569-77. & Kastrinos, F., et al., Risk of pancreatic cancer in families with Lynch syndrome. JAMA, 2009. 302(16): p. 1790-5.). As the population undergoing gene testing for pancreatic cancer patients expands, more pancreatic cancer susceptibility gene mutations are being discovered in pancreatic cancer patients without a family history of familial cancer syndromes (Wood, LD, MB Yurgelun, and MG Goggins, Genetics of Familial and Sporadic Pancreatic Cancer. Gastroenterology, 2019. 156(7): p. 2041-2055.). High-risk individuals are defined as those with a family history of PDAC and/or a known pathogenic germline mutation. There is currently no consensus on which screening method should be used (GOGGINS M, OVERBEEK KA, BRAND R, et al. Management of patients with increased risk for familial pancreatic cancer: updated recommendations from the International Cancer of the Pancreas Screening (CAPS) Consortium[J]. Gut, 2020, 69(1): 7-17.).

[0005] In conclusion, imaging methods, biomarkers, and susceptibility genes for early pancreatic cancer detection cannot be precisely targeted to the individual. Furthermore, given the significant differences in pancreatic cancer incidence across different ethnicities and geographical regions, these methods clearly have limitations. In the era of precision medicine, an individual's susceptibility to pancreatic cancer is crucial for prevention, early screening, and treatment.

Summary of the Invention

[0006] In germline mutation research, this invention discovered a group of frequently mutated germline genes, particularly the cystic fibrosis transmembrane conductance regulator (CFTR), which exhibits high-frequency mutations in pancreatic cancer in the Chinese population. Furthermore, harmful germline mutation sites for CFTR were identified. Based on this, this invention was completed.

[0007] In a first aspect, the present invention provides the use of a germline mutant gene in the preparation of a screening reagent for tumor-susceptible populations, wherein the germline mutant gene is selected from one or more of BRCA1, BRCA2, PALB2, NBN, BARD1, RAD51D and CFTR germline mutant genes.

[0008] The germline mutant gene mentioned therein refers to the gene in which the frequency of the mutant allele is ≥0.5.

[0009] Furthermore, the germline mutant gene preferably includes the CFTR gene.

[0010] Furthermore, the tumor-susceptible population includes, but is not limited to, those susceptible to pancreatic cancer, cervical cancer, endometrial cancer, ovarian cancer, hepatocellular carcinoma, small cell lung cancer, non-small cell lung cancer, gastric cancer, colon cancer, intrahepatic bile duct cancer, extrahepatic bile duct cancer, and urothelial carcinoma.

[0011] Furthermore, the population is preferably East Asian, and even more preferably Chinese.

[0012] Furthermore, the reagent further comprises a formulation for detecting germline mutation sites of the CFTR gene, wherein the CFTR germline mutation sites are selected from mutations at one or more sites selected from p.L69F, p.R74W, p.L88X, p.L100I, p.P140S, p.V201M, p.R297Q, p.G451R, p.E681V, p.H949P, p.G970D, p.R1097C, p.A1136T, p.G1298A, p.C1355Y, p.A1364V, p.R1453W, and p.S1456R.

[0013] Furthermore, the CFTR germline mutation site is selected from one or more mutations at sites selected from p.L69F, p.L88X, p.E681V, p.R1097C, p.C1355Y and p.S1456R.

[0014] Furthermore, the CFTR germline mutation site is selected from mutations at the p.L88X and/or p.R1097C sites.

[0015] Furthermore, the mutations at the site include modified mutations, single nucleotide or heterozygous insertion mutations, deletion mutations, truncation mutations, and/or missense mutations.

[0016] Furthermore, the mutation at the site is preferably a missense mutation or a nonsense mutation.

[0017] Furthermore, the reagent is a test kit.

[0018] Furthermore, the test kit also includes reagents or instruments for detecting the biomarkers.

[0019] Secondly, the present invention provides a biomarker composition for screening pancreatic cancer susceptible populations, wherein the biomarker is selected from one or more germline mutant genes of BRCA1, BRCA2, PALB2, NBN, BARD1, RAD51D and CFTR, wherein the germline mutant gene refers to the gene in which the frequency of the mutant allele is ≥0.5.

[0020] Furthermore, the germline mutant gene preferably includes the CFTR gene.

[0021] Furthermore, the biomarker composition further comprises germline mutation sites of the CFTR gene selected from one or more sites chosen from p.L69F, p.R74W, p.L88X, p.L100I, p.P140S, p.V201M, p.R297Q, p.G451R, p.E681V, p.H949P, p.G970D, p.R1097C, p.A1136T, p.G1298A, p.C1355Y, p.A1364V, p.R1453W, and p.S1456R.

[0022] Furthermore, the CFTR germline mutation site is selected from one or more mutations at sites selected from p.L69F, p.L88X, p.E681V, p.R1097C, p.C1355Y and p.S1456R.

[0023] Furthermore, the CFTR germline mutation site is selected from mutations at the p.L88X and/or p.R1097C sites.

[0024] Furthermore, the mutations at the site include modified mutations, single nucleotide or heterozygous insertion mutations, deletion mutations, truncation mutations, and/or missense mutations.

[0025] Furthermore, the mutation at the site is preferably a missense mutation or a nonsense mutation.

[0026] Thirdly, the present invention provides a prediction system for pancreatic cancer susceptible populations, the system comprising a data acquisition module and a prediction module;

[0027] The data acquisition module is used to acquire data on germline mutations in one or more of the BRCA1, BRCA2, PALB2, NBN, BARD1, RAD51D and CFTR genes of the test subject, wherein a germline mutation gene is identified as a gene with a mutation allele frequency ≥0.5.

[0028] Furthermore, the data acquisition module is used to further acquire mutation data of one or more sites among p.L69F, p.R74W, p.L88X, p.L100I, p.P140S, p.V201M, p.R297Q, p.G451R, p.E681V, p.H949P, p.G970D, p.R1097C, p.A1136T, p.G1298A, p.C1355Y, p.A1364V, p.R1453W, and p.S1456R of the CFTR gene of the test subject.

[0029] Furthermore, the data acquisition module is used to acquire mutation data of one or more sites among p.L69F, p.L88X, p.E681V, p.R1097C, p.C1355Y and p.S1456R of the CFTR gene of the test subject.

[0030] Furthermore, the data acquisition module is used to acquire mutation data of the p.L88X and/or p.R1097C sites of the CFTR gene of the test subject.

[0031] The prediction module is used to predict whether a test subject is susceptible to pancreatic cancer based on germline mutation data of genes obtained by the data acquisition module and mutation data of corresponding sites of the CFTR gene.

[0032] Furthermore, the prediction system also includes reagents or instruments for collecting and processing test subject samples.

[0033] Furthermore, the samples include cells, cell lysates, platelets, serum, plasma, vitreous fluid, lymph, synovial fluid, follicular fluid, semen, amniotic fluid, milk, whole blood, blood-derived cells, urine, cerebrospinal fluid, extracts from oral swabs, saliva, sputum, tears, sweat, mucus, tissue culture fluid, tissue extracts, homogenized tissue, cell extracts, etc.

[0034] Furthermore, the prediction system also includes a detection module for detecting whether one or more genes among the BRCA1, BRCA2, PALB2, NBN, BARD1, RAD51D, and CFTR genes of the test subject have germline mutations.

[0035] Furthermore, the detection module is used to detect whether one or more sites in the test subject's CFTR gene, namely p.L69F, p.R74W, p.L88X, p.L100I, p.P140S, p.V201M, p.R297Q, p.G451R, p.E681V, p.H949P, p.G970D, p.R1097C, p.A1136T, p.G1298A, p.C1355Y, p.A1364V, p.R1453W, and p.S1456R, have been mutated.

[0036] Furthermore, the detection module is used to detect whether one or more sites of the CFTR gene of the test subject, namely p.L69F, p.L88X, p.E681V, p.R1097C, p.C1355Y and p.S1456R, have been mutated.

[0037] Furthermore, the detection module is used to detect whether one or more sites of the p.L88X and p.C1355Y of the CFTR gene in the test subject have been mutated.
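
The two-module structure described above can be sketched in Python as follows. The gene and site lists come from this document; the decision rule (flag a subject when any listed gene carries a germline mutation or any listed CFTR site is mutated) is an assumption made for illustration, since the text does not specify how the prediction module weighs its inputs. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set

# Genes and CFTR sites named in this document.
PANEL_GENES = frozenset(
    {"BRCA1", "BRCA2", "PALB2", "NBN", "BARD1", "RAD51D", "CFTR"}
)
CFTR_SITES = frozenset({
    "p.L69F", "p.R74W", "p.L88X", "p.L100I", "p.P140S", "p.V201M",
    "p.R297Q", "p.G451R", "p.E681V", "p.H949P", "p.G970D", "p.R1097C",
    "p.A1136T", "p.G1298A", "p.C1355Y", "p.A1364V", "p.R1453W", "p.S1456R",
})


@dataclass
class SubjectData:
    """Germline findings retained for one test subject (hypothetical type)."""
    mutated_genes: Set[str] = field(default_factory=set)
    cftr_sites: Set[str] = field(default_factory=set)


def acquire(raw_genes: Iterable[str], raw_sites: Iterable[str]) -> SubjectData:
    """Data acquisition module: keep only listed genes and listed CFTR sites."""
    return SubjectData(
        mutated_genes=set(raw_genes) & PANEL_GENES,
        cftr_sites=set(raw_sites) & CFTR_SITES,
    )


def predict(data: SubjectData) -> bool:
    """Prediction module.

    ASSUMPTION: a subject is flagged as susceptible when at least one
    listed gene or listed CFTR site is mutated. The document does not
    state the actual decision rule.
    """
    return bool(data.mutated_genes or data.cftr_sites)
```

For example, `predict(acquire(["CFTR"], ["p.L88X"]))` flags the subject, while a subject with no findings in the listed genes or sites is not flagged.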

[0038] Fourthly, the present invention provides a CFTR gene knockout cell model, wherein the CFTR gene of the cell is knocked out, and the cell is PANC03.27 or SU86.86.

[0039] Fifthly, the present invention provides a cell model of CFTR gene overexpression, wherein the cells are PANC-1 or ASPC1.

[0040] Beneficial effects:

[0041] The high-frequency germline mutation genes and harmful germline mutation sites of CFTR germline mutations obtained by screening in this application can be used to efficiently screen susceptible populations for tumors. This method is more suitable for Asian populations and its effect is more accurate.

Attached Figure Description

[0042] Figure 1 Germline mutation analysis workflow.

[0043] Figure 2 Germline mutation annotation and filtering process.

[0044] Figure 3 Germline mutation pathogenicity classification. P = "pathogenic", LP = "likely pathogenic", VUS_D = variant of uncertain significance damage, VUS_ND = variant of uncertain significance non-damage, LB = likely benign, B = benign. *High-impact variant types include frameshift mutations, stop codon mutations, or classical splice site mutations. **Hazardous mutations predicted by the algorithm must simultaneously meet the following conditions: SIFT = D, CADD ≥ 20, and MetaSVM = D. ***The mutation frequency should be ≤0.005 for autosomal recessive genetic populations and ≤0.001 for non-autosomal recessive genetic populations.

[0045] Figure 4 Study cohort design.

[0046] Figure 5 The number of germline mutations identified and the distribution of VAFs. CSGs = cancer susceptibility genes; VUS_D = variant of uncertain significance damage; VAF = variant allele frequency.

[0047] Figure 6 Overlap between the PANEL genes and tumor susceptibility genes. CSGs = cancer susceptibility genes.

[0048] Figure 7 The germline mutation landscape of pancreatic cancer in the Chinese population. Mutated genes are sorted by pathway and mutation frequency, showing genes related to hereditary pancreatitis pathway, homologous recombination repair pathway, mismatch repair pathway, other DNA repair pathways, Fanconi anemia pathway and DNA damage repair pathway, as well as other high-frequency mutated tumor susceptibility genes.

[0049] Figure 8 Mutant allele frequency distribution. From left to right, genes with high mutation frequency are shown.

[0050] Figure 9 Systematic analysis was used to identify loss of heterozygosity (LOH) events. A. LOH events in oncogenes and tumor suppressor genes were identified by comparing the frequencies of variant alleles in tumor and normal samples. Each point represents a mutation. Diagonals represent theoretical germline mutations for which the VAF of normal (control) and tumor samples is identical. B. Changes in somatic copy number detected in tumors indicate the presence of "significant" LOHs in each gene. C. The number of germline mutations for various types of LOHs was displayed in tumor susceptibility genes, highlighting LOHs resulting from the loss of wild-type alleles in tumor suppressor genes. VAF = variant allele frequency; ****: p < 0.0001; **: p < 0.01; ns: not significant.

[0051] Figure 10 CFTR mutation sites and loss of heterozygosity (LOH) event sites. A. CFTR mutation sites: The mutation sites in the three pancreatic cancer cohorts are shown above the chromosome, and the mutation sites in the healthy control cohort are shown below. * indicates LOH sites. B. View the VAF distribution of CFTR LOH sites in the IGV browser. The IGV of the control samples is shown above, and the IGV of the tumor samples is shown below.

[0052] Figure 11 Pan-cancer CFTR mutation status. A. Overall mutation frequency of CFTR in pan-cancer. B. Distribution frequency of different CFTR mutation sites in pan-cancer. PAAD = pancreatic cancer, UCC = cervical cancer, UCEC = endometrial cancer, ESCA = esophageal cancer, OV = ovarian cancer, LIHC = hepatocellular carcinoma, SCLC = small cell lung cancer, STAD = gastric cancer, COAD = colon cancer, ICC = intrahepatic cholangiocarcinoma, NSCLC = non-small cell lung cancer, EHCC = extrahepatic cholangiocarcinoma, UC = urothelial carcinoma.

[0053] Figure 12 Comparison of CFTR expression in pancreatic cancer and adjacent normal tissues. A. Differences in CFTR RNA expression between pancreatic cancer and adjacent normal tissues, from left to right: comparison of 108 pairs of pancreatic cancer tissues and adjacent normal tissues in this study; comparison of pancreatic cancer organoids and normal pancreatic tissue organoids; comparison of pancreatic cancer tissues and adjacent normal tissues from a public database; comparison of microdissected pancreatic cancer tissues and adjacent normal tissues from a public database. B/C. Quantitative analysis comparing the immunohistochemical staining intensity of CFTR in pancreatic cancer tissues and adjacent normal tissues. ORG = organoid, LCM = microdissection, CP = chronic pancreatitis, PDAC = pancreatic ductal adenocarcinoma. ****: p<0.0001; ***: p<0.001; **: p<0.01.

[0054] Figure 13 Germline mutations in CFTR in large tissue samples lead to decreased CFTR expression. A. Changes in CFTR expression in normal/tumor tissues; B. Differences in CFTR expression between wild-type and mutant samples; C. Correlation between pathological assessment of tumor cell content and ABSOLUTE assessment of tumor purity; D. Differences in CFTR expression between wild-type and mutant samples after ABSOLUTE tumor purity correction. WT = wild-type, MT = mutant.

[0055] Figure 14 Germline mutations in CFTR in organoids lead to decreased CFTR expression. A. Differences in CFTR expression between wild-type and mutant samples in organoids; B. Organoids with the CFTR p.E681V mutant specifically express the mutant allele. *: p<0.05.

Figure 15 Germline mutations in CFTR resulted in decreased CFTR protein expression. A. Differences in CFTR protein expression between wild-type and mutant samples; B/C. CFTR expression in mutant samples as determined by immunohistochemical/immunofluorescence staining. **: p<0.01.

[0056] Figure 16 ATAC data quality control results. A. Insertion fragment distribution: Typical fragment size distribution maps show enrichment around 100 and 200 bp, indicating fragments combining free nucleosomes and mononucleosomes. B. Sequence distribution around transcription start sites: Typical transcription start site enrichment maps show nucleosome-free fragments enriched at the transcription start site, while mononucleosome fragments are missing at the transcription start site but enriched flanking it. C. ATAC signal enrichment at the EMC7 housekeeping gene transcription start site.

[0057] Figure 17 Comparison of CFTR promoter region chromatin open sequencing (assay for transposase-accessible chromatin with high-throughput sequencing, ATAC-seq) signals between normal pancreatic organoids and pancreatic cancer organoids. ****: p<0.0001.

[0058] Figure 18 Epigenetic regulation of CFTR RNA expression. A. Organoids were divided into high-expression and low-expression groups according to the level of CFTR RNA expression. B/C. ATAC signals in the CFTR promoter and enhancer regions of the high- and low-expression groups. ****: p<0.0001.

Figure 19 Methylation-specific PCR analysis was performed on the methylation status of CpG islands. Blue-labeled samples represent low CFTR expression, red-labeled samples represent high CFTR expression, ddH2O served as a negative control, M represents methylation, and U represents unmethylation.

Figure 20 CFTR expression levels in pancreatic cancer cell lines (data from the Cancer Cell Lines Encyclopedia (CCLE) database).

[0059] Figure 21 qPCR and WB were used to identify the transfection efficiency of cell lines. A/B. Changes in CFTR at the transcriptional and protein levels after transfection with the target plasmid into four pancreatic cancer cell lines. C. The transfection efficiency was detected by qPCR and WB after constructing CFTR overexpression and mutant stable transfectants using the PANC-1 cell line. ***: p<0.001.

[0060] Figure 22 Comparison of gene expression levels at mutation sites. An asterisk (*) indicates a statistically significant difference between OE, or any mutation site, and EV. A hash (#) indicates a statistically significant difference between OE and an OE mutation site.

[0061] Figure 23 CFTR inhibits the malignant phenotype of pancreatic cancer cells. A. CCK8 assay to analyze the effect of CFTR on the proliferative function of pancreatic cancer cells. B. Plate colony assay to analyze the effect of CFTR on the proliferative function of pancreatic cancer cells. ***: p<0.001; **: p<0.01; *: p<0.05.

[0062] Figure 24 Analysis of the effect of CFTR on the proliferative function of pancreatic cancer cells using plate cloning experiments.

[0063] Figure 25 Effects of CFTR on the migration and invasion abilities of PANC-1 cells. EV = empty transfection plasmid, OE = overexpression plasmid. ***: p<0.001.

[0064] Figure 26 The effects of mutations at different sites in CFTR on the migration and invasion ability of PANC-1 cells.

[0065] Figure 27 CFTR promotes apoptosis in pancreatic cancer cells. A. Apoptosis assay to detect the effect of CFTR on the apoptosis rate of PANC-1 cells. B. Cell cycle assay to detect the effect of CFTR on the cell cycle of PANC-1 cells. ***: p<0.001; *: p<0.05.

[0066] Figure 28 The CCK8 assay was used to analyze the effect of CFTR mutation on the proliferation function of PANC-1 cells. OE = overexpression, EV = empty plasmid.

Detailed Implementation

[0067] The specific embodiments of the present invention will be further described below. It should be noted that these descriptions are for the purpose of aiding understanding the present invention, but do not constitute a limitation thereof. Furthermore, the technical features involved in the embodiments described below can be combined with each other as long as they do not conflict with each other.

[0068] Unless otherwise specified, the experimental methods used in the following embodiments are conventional methods, and the experimental materials used in the following embodiments are all available through conventional commercial channels.

[0069] The "pancreatic cancer susceptible population" mentioned in this invention refers to the population with a high risk of developing pancreatic cancer. If the expression level of nucleic acid or protein at any one or more gene loci in the above biomarker combination is downregulated or not expressed, it indicates that the subject has a higher risk of developing pancreatic cancer than the normal population and is more likely to become a pancreatic cancer patient. This is called the pancreatic cancer susceptible population.

[0070] The term "annotation" as used in this article refers to the classification of genes into several types using annotations from qualified or unqualified laboratories in the ClinVar database (20210123), InterVar database (20180118), SIFT database, MetaSVM database, and CADD database. For example, P = "pathogenic", LP = "likely pathogenic", VUS_D = variant of uncertain significance damage, VUS_ND = variant of uncertain significance non-damage, LB = likely benign, and B = benign. This invention utilizes multiple databases, including ClinVar, InterVar, OMIM, SIFT, MetaSVM, and CADD, primarily for annotating gene detection results. Initial annotation is performed using database screening criteria, followed by filtering based on the population mutation frequency of the locus. Finally, manual review yields the final pathogenicity classification. Given the large number of VUS involved in this invention, the harmfulness of VUS was predicted using three databases (SIFT, MetaSVM, and CADD) and the mutation type (whether it is a frameshift mutation, stop codon mutation, or classical splicing site mutation). More harmful mutations (VUS_D) were selected as candidate sites for downstream analysis, thereby improving the accuracy of VUS annotation. This was validated in subsequent verification, demonstrating a significant difference between carriers of harmful CFTR germline mutations and healthy individuals without CFTR germline mutations.
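
The damage-prediction step for VUS can be sketched as below. The thresholds (SIFT = D, CADD ≥ 20, MetaSVM = D; population frequency ≤0.005 for autosomal recessive genes and ≤0.001 otherwise) and the high-impact variant types are taken from the Figure 3 legend. How these conditions combine is an assumption made here for illustration: a VUS is labelled VUS_D when it passes the frequency filter and is either a high-impact type or predicted damaging by all three tools. Field names, function names, and the "FILTERED" label are hypothetical.

```python
from dataclasses import dataclass

# High-impact variant types named in the Figure 3 legend.
HIGH_IMPACT_TYPES = {"frameshift", "stop_codon", "splice_site"}


@dataclass
class Variant:
    variant_type: str          # e.g. "missense", "frameshift"
    sift: str                  # "D" (damaging) or another label
    cadd: float                # CADD score
    metasvm: str               # "D" (damaging) or another label
    population_frequency: float
    autosomal_recessive: bool  # inheritance mode of the gene


def passes_frequency_filter(v: Variant) -> bool:
    """Population frequency cutoff from the Figure 3 legend."""
    limit = 0.005 if v.autosomal_recessive else 0.001
    return v.population_frequency <= limit


def predicted_damaging(v: Variant) -> bool:
    """All three predictors must agree (SIFT = D, CADD >= 20, MetaSVM = D)."""
    return v.sift == "D" and v.cadd >= 20 and v.metasvm == "D"


def classify_vus(v: Variant) -> str:
    """Label a VUS as VUS_D, VUS_ND, or FILTERED.

    ASSUMPTION: the way the criteria are combined here (frequency filter
    first, then high-impact type OR three-tool agreement) is illustrative
    and is not stated in the text.
    """
    if not passes_frequency_filter(v):
        return "FILTERED"
    if v.variant_type in HIGH_IMPACT_TYPES or predicted_damaging(v):
        return "VUS_D"
    return "VUS_ND"
```

Requiring agreement of all three predictors is a conservative choice: a single tool calling a variant damaging is not enough to promote it to VUS_D under this sketch.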

[0071] The term "loss of heterozygosity" (LOH), as used in this article, refers to a common phenomenon in cancer development. An individual inherits a pathogenic allele, making that gene (usually a tumor suppressor gene) heterozygous. This state alone does not lead to tumor development because the other allele remains functional. If the functional allele then undergoes a point mutation or is inactivated through other mechanisms (such as copy number variation (CNV) or epigenetic alterations), LOH results and the gene completely loses its tumor suppressor function. LOH is a common mechanism of tumor suppressor gene inactivation and is the basis for the "two-hit" hypothesis of cancer.
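
As a rough illustration of how LOH can be flagged from sequencing data, the sketch below compares the variant allele frequency (VAF) of a germline variant in the tumor sample with that in the matched normal sample, in the spirit of the comparison described for Figure 9. The 0.2 shift threshold, the labels, and the function name are assumptions chosen for illustration only; they are not values stated in this document, and a real analysis would also account for tumor purity and copy number.

```python
def classify_loh(normal_vaf: float, tumor_vaf: float,
                 min_shift: float = 0.2) -> str:
    """Classify a germline variant by the VAF shift between normal and tumor.

    ASSUMPTION: min_shift = 0.2 is a hypothetical cutoff, not taken
    from the document.
    """
    for name, vaf in (("normal_vaf", normal_vaf), ("tumor_vaf", tumor_vaf)):
        if not 0.0 <= vaf <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {vaf}")

    shift = tumor_vaf - normal_vaf
    if shift >= min_shift:
        # Mutant allele enriched in tumor: consistent with wild-type allele loss.
        return "LOH_wild_type_loss"
    if shift <= -min_shift:
        # Mutant allele depleted in tumor: consistent with mutant allele loss.
        return "LOH_mutant_loss"
    # Tumor VAF stays close to the normal VAF (near the diagonal).
    return "no_LOH"
```

Under these assumptions, a heterozygous germline variant with a normal VAF of 0.5 and a tumor VAF of 0.85 would be labelled as loss of the wild-type allele.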

[0072] The term "mutation" as used in this article refers to a nucleotide change at a specific site on a chromosome. The CFTR mutations discussed in this article are changes occurring at sites within the CFTR gene. For example, p.L69F denotes the substitution of amino acid L by F at residue 69 of the protein, arising from a change at base 117509074 on chromosome 7 (chr7). For details, see Table 2 for CFTR mutation sites and annotation information.
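
To make the site notation concrete, the following minimal Python sketch splits each CFTR site name listed in this document into reference residue, position, and variant residue, and labels it as missense or nonsense. It assumes the one-letter amino acid code with X denoting a stop codon; the type and function names are illustrative and are not part of the invention.

```python
import re
from typing import NamedTuple

# The 18 CFTR germline mutation sites listed in this document.
CFTR_SITES = [
    "p.L69F", "p.R74W", "p.L88X", "p.L100I", "p.P140S", "p.V201M",
    "p.R297Q", "p.G451R", "p.E681V", "p.H949P", "p.G970D", "p.R1097C",
    "p.A1136T", "p.G1298A", "p.C1355Y", "p.A1364V", "p.R1453W", "p.S1456R",
]


class ProteinChange(NamedTuple):
    ref: str        # reference amino acid (one-letter code)
    position: int   # residue number in the protein
    alt: str        # variant amino acid, or "X" for a stop codon
    kind: str       # "missense", "nonsense", or "synonymous"


_SITE_PATTERN = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def parse_site(site: str) -> ProteinChange:
    """Parse a name such as 'p.L69F' into its components."""
    match = _SITE_PATTERN.match(site)
    if match is None:
        raise ValueError(f"unrecognized site name: {site!r}")
    ref, position, alt = match.group(1), int(match.group(2)), match.group(3)
    if alt == "X":
        kind = "nonsense"      # ASSUMPTION: X marks a premature stop codon
    elif alt == ref:
        kind = "synonymous"
    else:
        kind = "missense"
    return ProteinChange(ref, position, alt, kind)


parsed = [parse_site(site) for site in CFTR_SITES]
nonsense_sites = [s for s, p in zip(CFTR_SITES, parsed) if p.kind == "nonsense"]
```

Under this reading, p.L88X is the only nonsense site in the list and the remaining 17 sites are missense substitutions.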

[0073] Example

[0074] The research object, database, and software of this invention are as follows:

[0075] FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/);

[0076] Fastp (https://github.com/OpenGene/fastp);

[0077] BWA (https://github.com/lh3/bwa);

[0078] Samtools (http://www.htslib.org/);

[0079] GATK (https://gatk.broadinstitute.org/hc/en-us);

[0080] VarScan2 (https://varscan.sourceforge.net/);

[0081] ANNOVAR (https://annovar.openbioinformatics.org/en/latest/);

[0082] ControlFreeC (http://boevalab.inf.ethz.ch/FREEC/);

[0083] GISTIC (https://github.com/broadinstitute/gistic2);

[0084] IGV (https://igv.org/);

[0085] R (https://www.r-project.org/).
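
The tools listed above are commonly chained into a read-to-variant workflow. The sketch below only builds the command lines (it does not run them) for one plausible ordering: fastp trimming, BWA-MEM alignment, samtools sorting and indexing, and GATK HaplotypeCaller. The ordering, file names, read-group values, and thread count are assumptions for illustration; the workflow actually used in this document is the one shown in Figure 1.

```python
from typing import List


def build_germline_commands(sample: str, ref_fasta: str,
                            fq1: str, fq2: str,
                            threads: int = 4) -> List[List[str]]:
    """Return argument lists for a hypothetical germline calling run.

    ASSUMPTION: output file names and the read-group string are
    illustrative. Nothing is executed here.
    """
    trimmed1 = f"{sample}.trim_1.fq.gz"
    trimmed2 = f"{sample}.trim_2.fq.gz"
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.g.vcf.gz"
    # Literal "\t" sequences; BWA converts them to tabs in the SAM header.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    return [
        # Adapter and quality trimming of paired-end reads.
        ["fastp", "-i", fq1, "-I", fq2, "-o", trimmed1, "-O", trimmed2],
        # Alignment to the reference genome with a read group attached.
        ["bwa", "mem", "-t", str(threads), "-R", read_group,
         "-o", sam, ref_fasta, trimmed1, trimmed2],
        # Coordinate sorting and indexing of the alignments.
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        ["samtools", "index", bam],
        # Germline variant calling in GVCF mode.
        ["gatk", "HaplotypeCaller", "-R", ref_fasta, "-I", bam,
         "-O", vcf, "-ERC", "GVCF"],
    ]
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` once the tools and reference index are in place.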

[0086] Research subjects

[0087] Based on the prospective database of pancreatic tumors from the Department of Hepatobiliary and Pancreatic Surgery, First Affiliated Hospital of Naval Medical University, clinicopathological data of patients who underwent pancreatectomy between 2016 and 2020 were reviewed. Based on the pancreatic tumor organoid bank of the Department of Hepatobiliary and Pancreatic Surgery, First Affiliated Hospital of Naval Medical University, organoid samples successfully constructed between June 2018 and December 2020 were collected. From September 2020 to December 2021, clinicopathological data of patients who underwent pancreatectomy were prospectively collected, and patient blood, pancreatic tumor tissue, and adjacent tissues were collected for experiments.

[0088] 1. Inclusion criteria

[0089] ① Retrospective cases: a. Age greater than 18 years and less than 80 years; b. Underwent radical resection of pancreatic cancer due to pancreatic tumor; c. Previous genomic sequencing data (with tumor and paired adjacent normal or blood sequencing data), including WGS (whole genome sequencing), WES (whole exome sequencing), and PANEL gene sequencing.

[0090] ② Prospective cases: a. Age greater than 18 years and less than 80 years; b. Patients who have undergone radical resection of pancreatic cancer due to pancreatic tumors; c. Patients who will eventually undergo gene sequencing (with tumor and paired adjacent normal tissue or blood sequencing data, including WGS, WES, PANEL sequencing); d. Signed informed consent for biological sample collection and sequencing analysis.

[0091] 2. Exclusion Criteria

[0092] ① Postoperative pathological diagnosis other than pancreatic ductal adenocarcinoma (PDAC), pancreatic adenosquamous carcinoma (PASC), or intraductal papillary mucinous neoplasm (IPMN) with invasive carcinoma;

[0093] ② The sequencing data is corrupted or the sequencing quality control is not up to standard.

[0094] 3. Research Cohort

[0095] Patients are categorized based on their gene sequencing data type:

[0096] ①The discovery cohort consists of patients who underwent WGS and WES sequencing in both retrospective and prospective cases;

[0097] ② The validation cohort consists of patients undergoing PANEL gene sequencing from both retrospective and prospective cases;

[0098] ③ The functional validation cohort consists of organoids from the pancreatic tumor organoid library that have been sequenced by WGS.

[0099] ④ Data on the healthy control group were obtained from the Institute of Computing Technology, Chinese Academy of Sciences.

[0100] 4. Data Collection

[0101] Retrospective clinical and pathological data were extracted from the pancreatic tumor prospective database of the Department of Hepatobiliary and Pancreatic Surgery, First Affiliated Hospital of Naval Medical University. Case report forms were collected from prospective cases after patients consented to be included in the cohort. Key information included sex, age, body mass index (BMI), smoking history, alcohol consumption history, personal cancer history, family cancer history, pathological type, and TNM stage. The TNM staging of the tumor was assessed according to the 8th edition of the American Joint Committee on Cancer (AJCC) Staging Manual.

[0102] 5. Follow-up plan

[0103] Follow-up in this study was conducted in two ways: regular postoperative follow-up by LinkDoc Technology Co., Ltd. (Beijing, China) and review of outpatient medical records. Follow-up began at patient discharge and ended at patient death or the follow-up cutoff date of December 2022. Overall survival (OS) was defined as the patient's postoperative survival time, and disease-free survival (DFS) as the time from surgery to disease recurrence or death. Follow-up frequency: monthly during the first 6 months after surgery, quarterly during the first 2 years, and every 6 months during the first 5 years. Follow-up methods included outpatient visits, telephone, email, and chat software. Follow-up content included postoperative re-examination results, adjuvant therapy, recurrence status and evidence, and time and cause of death. Patients with a postoperative follow-up period of less than 6 months were defined as lost to follow-up.

[0104] Example 1: CFTR germline mutation screening

[0105] 1. Pancreatic tissue and blood samples

[0106] All pancreatic samples and blood were obtained from pancreatic tumor patients who underwent surgery at the Department of Hepatobiliary and Pancreatic Surgery, First Affiliated Hospital of Naval Medical University. Normal pancreatic tissue was taken from surgical samples located 3 cm away from the tumor. Tumor tissue or normal tissue was determined by pathological examination.

[0107] 2. DNA sequencing and analysis

[0108] (1) Library construction and sequencing

[0109] For selected tumor tissue samples, tumor DNA was extracted using the QIAamp DNA Extraction Kit. DNA was extracted from peripheral blood mononuclear cells using the BloodMini Kit to serve as the patient's paired normal sample DNA. DNA quality and yield were measured using a Qubit fluorometer and the Qubit dsDNA HS assay kit. Library preparation was performed using the Illumina TruSeq kit. Depending on the cohort, exome capture before sequencing was performed using the NimbleGen SeqCap EZ Human Exome; PANEL gene capture was performed using the YouSu™ PANEL kit (OrigiMed) and the Agilent SureSelect XT HS (Agilent Technologies); WGS required no special processing. DNA was fragmented to approximately 350 bp by sonication, end-repaired, and ligated to Illumina sequencing adapters. Finally, the DNA libraries were sequenced on the Illumina HiSeq X Ten / Illumina NovaSeq 6000 platform (2 × 150 bp paired-end reads).

[0110] (2) Germline mutation detection

[0111] The mutation detection process is shown in Figure 1. The genomic sequencing data were processed according to the standard procedures of GATK and VarScan2.

[0112] First, FastQC software was used to perform quality control on all raw WGS, WES, and PANEL sequencing data (FASTQ files), and Fastp software was used to remove adapters and low-quality sequences. The reads were then aligned to the human reference genome (hg38) using BWA (MEM), and PCR duplicates were removed using GATK MarkDuplicates. Finally, GATK BaseRecalibrator was used for base quality score recalibration to obtain the final BAM file.
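By way of a non-limiting illustration, the preprocessing steps above can be laid out as an ordered list of command lines. The sketch below only builds the argument lists and does not run anything. All file names, the read-group string, the thread count, and the known-sites file are hypothetical placeholders, and the ApplyBQSR step is an assumption (the text names only BaseRecalibrator). The actual parameters used in the study are not stated.

```python
def build_preprocessing_commands(sample, fastq_r1, fastq_r2, reference,
                                 known_sites, threads=8):
    """Return the preprocessing commands as argv lists (illustrative sketch)."""
    trimmed_r1 = f"{sample}.trimmed_R1.fastq.gz"
    trimmed_r2 = f"{sample}.trimmed_R2.fastq.gz"
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    recal_table = f"{sample}.recal.table"
    final_bam = f"{sample}.final.bam"
    # GATK needs read-group information; bwa expands the literal "\t" itself.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # Quality control of the raw reads
        ["fastqc", fastq_r1, fastq_r2],
        # Adapter and low-quality trimming
        ["fastp", "-i", fastq_r1, "-I", fastq_r2,
         "-o", trimmed_r1, "-O", trimmed_r2],
        # Alignment to the reference genome with BWA-MEM
        ["bwa", "mem", "-t", str(threads), "-R", read_group,
         "-o", sam, reference, trimmed_r1, trimmed_r2],
        ["samtools", "sort", "-o", sorted_bam, sam],
        # Duplicate marking
        ["gatk", "MarkDuplicates", "-I", sorted_bam,
         "-O", dedup_bam, "-M", metrics],
        # Base quality score recalibration (model, then application)
        ["gatk", "BaseRecalibrator", "-I", dedup_bam, "-R", reference,
         "--known-sites", known_sites, "-O", recal_table],
        ["gatk", "ApplyBQSR", "-I", dedup_bam, "-R", reference,
         "--bqsr-recal-file", recal_table, "-O", final_bam],
    ]
```

Each list could be passed to `subprocess.run(cmd, check=True)` in order; keeping the commands as data makes the sequence easy to inspect before anything is executed.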

[0113] The HaplotypeCaller tool from GATK was used to detect variants in each sample. The VariantRecalibrator and ApplyVQSR tools from GATK were then used for quality control and filtering of the variant calls. Finally, the SelectVariants tool from GATK was used to generate the variant results (variant call format, VCF file) for each single sample.

[0114] For paired tumor and adjacent-tissue (or blood) samples, the somatic command of VarScan2 was used to detect germline and somatic variants simultaneously and produce VCF files; the processSomatic command was then used to obtain high-confidence variant files, and the somaticFilter command was called to filter out false-positive variants. Finally, the germline and somatic variant results were obtained separately.

[0115] The germline variant results from the two processes above were then merged, taking the intersection of the results, using the bcftools concat tool to obtain the final germline variant data (VCF file).

[0116] (3) Annotation and filtering of mutation results

[0117] The VCF files were annotated using ANNOVAR software. The resulting variants were then rigorously evaluated based on their annotations in databases such as ClinVar and InterVar, and were ultimately categorized as "pathogenic," "likely pathogenic," "variants of uncertain significance (VUS)," "likely benign," or "benign" (Figure 2).

[0118] The classification of variants is based on the guidelines recommended by the American College of Medical Genetics and Genomics and the Association for Molecular Pathology, adapted into the following steps (Figure 3):

[0119] The first step uses the ClinVar database (20210123) annotation results, taking the "pathogenic", "possibly pathogenic", "variant of unknown significance", "possibly benign", and "benign" classifications assigned by qualified laboratories.

[0120] The second step uses the annotation results from the InterVar database (20180118) for variants that are unannotated in ClinVar or whose ClinVar annotation does not come from a qualified laboratory.

[0121] The third step reclassifies the "variants of unknown significance" obtained in steps 1 and 2 as "possibly pathogenic" if they are high-impact variants such as frameshift indels, stop codons, or known splice sites.

[0122] The fourth step applies population frequency filtering to the "pathogenic" and "possibly pathogenic" results obtained above (for autosomal recessive genes, the population frequency must be less than or equal to 0.005; for non-autosomal recessive genes, less than or equal to 0.001). A variant that fails the filter is downgraded to "variant of unknown significance" and adjusted based on its status in the SIFT, MetaSVM, and CADD databases: if both SIFT and MetaSVM label it "D" and the CADD score is greater than or equal to 20, it is labeled "variant of unknown significance, damaging (VUS_D)"; otherwise, it is labeled "variant of unknown significance, non-damaging (VUS_ND)." Mutations whose final annotation was "pathogenic," "possibly pathogenic," or VUS_D were manually reviewed and defined as harmful mutations in this study; the review included literature review to determine pathogenicity, gene database verification, clarification of the relationship between pathogenic phenotypes and tumor syndromes, and assessment of the impact of the mutations on gene function and their mechanisms.
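By way of a non-limiting illustration, the four classification steps can be sketched as a single decision function. The `Variant` record and its field names are hypothetical and are not taken from ANNOVAR output. The sketch also assumes that every variant ending as a variant of unknown significance, not only the downgraded ones, is split into VUS_D and VUS_ND; the text does not state this explicitly. The manual review step is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

PATHOGENIC = "pathogenic"
POSSIBLY_PATHOGENIC = "possibly pathogenic"
VUS = "VUS"


@dataclass
class Variant:
    """Hypothetical per-variant annotation record used only for this sketch."""
    clinvar: Optional[str] = None      # class from a qualified ClinVar submitter
    intervar: Optional[str] = None     # InterVar class
    high_impact: bool = False          # frameshift indel, stop codon, splice site
    population_af: float = 0.0         # population allele frequency
    autosomal_recessive: bool = False  # inheritance mode of the gene
    sift: Optional[str] = None         # "D" or "T"
    metasvm: Optional[str] = None      # "D" or "T"
    cadd: Optional[float] = None       # CADD score


def classify(v):
    # Steps 1 and 2: prefer a qualified ClinVar label, else fall back to InterVar.
    label = v.clinvar if v.clinvar is not None else v.intervar
    if label is None:
        return "unclassified"
    # Step 3: high-impact variants of unknown significance are upgraded.
    if label == VUS and v.high_impact:
        label = POSSIBLY_PATHOGENIC
    # Step 4: population frequency filter, with a looser limit for recessive genes.
    if label in (PATHOGENIC, POSSIBLY_PATHOGENIC):
        limit = 0.005 if v.autosomal_recessive else 0.001
        if v.population_af > limit:
            label = VUS
    # Split unknown-significance variants by in-silico predictions (assumption:
    # applied to all of them, not only to the downgraded ones).
    if label == VUS:
        damaging = (v.sift == "D" and v.metasvm == "D"
                    and v.cadd is not None and v.cadd >= 20)
        label = "VUS_D" if damaging else "VUS_ND"
    return label
```

Under these assumptions, a ClinVar "pathogenic" variant with a population frequency of 0.003 stays "pathogenic" in an autosomal recessive gene but is downgraded in any other gene.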

[0123] (4) Copy number variation analysis

[0124] Copy number variation analysis was performed using ControlFreeC software.

[0125] ① Calculation and segmentation of copy number profiles (WGS and WES data were each run with the corresponding default configuration file);

[0126] ② Calculation and segmentation of smoothed B-allele frequency (BAF) profiles: allele content is characterized by the BAFs at known single nucleotide polymorphism (SNP) sites (NCBI dbSNP Build 146).

[0127] ③ Predict the final genotype status, i.e., the copy number and allele content of each segment: independently predict the genotype status of each genome segment by selecting the allele content corresponding to the maximum log-likelihood.

[0128] ④ Use GISTIC to integrate and analyze CNV results.

[0129] (5) Analysis of LOH and "second-hit" events

[0130] ① Obtain the VCF file and distinguish between truncation mutations and missense mutations, and calculate them separately;

[0131] ② Each tumor variant site is assessed on two aspects of its variant allele frequency (VAF): whether, by Fisher's exact test, the VAF at the tumor site is significantly higher than that at the corresponding site in the matched normal sample; and whether the VAF at the tumor site is significantly higher than the characteristic VAF of the general population of genes with somatic mutations.

[0132] ③ Summarize and combine the results from ②, transform and correct the p-values, and rank them using the standard Benjamini-Hochberg FDR calculation;

[0133] ④ Calculation of mutation “background”: Obtain the reference allele depth and mutation allele depth for other genes with somatic mutations in the population, thereby obtaining a null distribution that can be used to calculate the tail p value.

[0134] ⑤ When comparing the VAF of tumors with that of normal controls, the event in which the former is significantly larger than the latter is called LOH;

[0135] ⑥ Based on the above LOH results, define the LOH type according to the CNV results: wild-type allele deletion LOH and mutant allele amplification LOH.
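By way of a non-limiting illustration, the tumor-versus-normal VAF comparison and the Benjamini-Hochberg correction in the steps above can be sketched in plain Python. Only the first of the two VAF criteria (tumor versus matched normal) is covered; the comparison against the background null distribution and the CNV-based typing are not. The function names, the tuple layout of `sites`, and the 0.05 threshold are assumptions made for illustration.

```python
from math import comb


def fisher_greater(tumor_alt, tumor_ref, normal_alt, normal_ref):
    """One-sided Fisher's exact p-value that the tumor VAF exceeds the normal VAF."""
    tumor_depth = tumor_alt + tumor_ref
    total_alt = tumor_alt + normal_alt
    total = tumor_depth + normal_alt + normal_ref
    upper = min(tumor_depth, total_alt)
    # Hypergeometric tail: at least tumor_alt mutant reads fall in the tumor sample.
    tail = sum(
        comb(total_alt, x) * comb(total - total_alt, tumor_depth - x)
        for x in range(tumor_alt, upper + 1)
    )
    return tail / comb(total, tumor_depth)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_loh(sites, alpha=0.05):
    """sites: list of (name, tumor_alt, tumor_ref, normal_alt, normal_ref)."""
    pvalues = [fisher_greater(*counts) for _, *counts in sites]
    adjusted = benjamini_hochberg(pvalues)
    return [site[0] for site, q in zip(sites, adjusted) if q < alpha]
```

The exact test is written out with `math.comb` so the sketch has no third-party dependency; a statistics library would normally be used instead.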

[0136] 3. Germline mutation data from healthy control population

[0137] Germline mutation data (vcf files) from healthy controls were obtained from the Institute of Computing Technology, Chinese Academy of Sciences. A total of 2944 peripheral blood WES samples were obtained, with an average sequencing depth of 150-250X. Mutation detection, filtering, and annotation methods were the same as described above.

[0138] 4. CFTR germline mutation data for other cancer types

[0139] Other tumor CFTR germline mutation data (vcf files) were obtained from Shanghai Zhiben Medical Technology Co., Ltd., including 12 types of tumors such as cervical cancer, endometrial cancer, esophageal cancer, ovarian cancer, hepatocellular carcinoma, small cell lung cancer, non-small cell lung cancer, gastric cancer, colon cancer, intrahepatic cholangiocarcinoma, extrahepatic cholangiocarcinoma, and urothelial carcinoma. The data were derived from patient peripheral blood PANEL sequencing, with an average sequencing depth of 600-1,000X. Mutation detection, filtering, and annotation methods were the same as above.

[0140] 5. Statistical methods

[0141] Statistical analysis was performed using R software. Differences in the distribution of count data between the two cohorts and the correlation of clinicopathological features were analyzed using the χ² test or Fisher's exact test. Ordinal categorical variables were analyzed using the independent samples Mann-Whitney U rank-sum test. A two-sided p-value < 0.05 was considered statistically significant.

[0142] 6. Results Analysis

[0143] 6.1 Basic Information about the Cohorts

[0144] The discovery cohort consisted of 389 cases; the validation cohort, 665 cases; the functional validation cohort, 69 cases; the healthy controls, 853 cases; and the pan-cancer cohort, 9,253 cases (from the sequencing company).

[0145] 6.2 Demographic Characteristics of the Cohort

[0146] Demographic data from the three cohorts (Table 1) show that males were slightly predominant, with 233 (59.9%), 416 (62.6%), and 45 (65.2%) males in the three cohorts, respectively; patients were predominantly middle-aged and elderly, with 49 (12.6%), 81 (12.2%), and 8 (11.6%) patients under 50 years of age, respectively; and the BMI was 22.6, 23.5, and 23.1 kg/m², respectively. Patients with a history of smoking numbered 145 (37.3%), 254 (38.2%), and 30 (43.5%), respectively; patients with a history of alcohol consumption numbered 61 (15.7%), 95 (14.3%), and 7 (10.1%), respectively.

[0147] Table 1. Demographic and pathological characteristics of the cohort

[0148]

[0149]

[0150] * The percentage of each subclass of a variable = the number of cases in the subclass / (the number of cases in the cohort - the number of cases with missing data).

[0151] Staging is based on AJCC assessment only for pancreatic ductal adenocarcinoma and adenosquamous carcinoma. BMI = Body Mass Index, PDAC = pancreatic ductal adenocarcinoma, PASC = pancreatic adenosquamous carcinoma, IPMN = intraductal papillary mucinous tumor of the pancreas.

[0152] 6.3 Germline Mutation Detection, Filtering, and Annotation

[0153] (1) Discovery cohort

[0154] The study included sequencing data from 108 WGS and 281 WES samples from 389 patients. Comprehensive germline mutation analysis revealed 358,307,124 single nucleotide variants (SNVs) and 198,351,057 insertion-deletion (INDEL) mutations at the whole genome level, of which 6,568,825 mutations were located in exons (6,423,173 SNVs and 145,652 INDELs).

[0155] After rigorous filtering and pathogenicity annotation, there were 210 "pathogenic" mutation sites, 1,406 "possibly pathogenic" mutation sites, 6,791 variants of uncertain significance damage (VUS_D), 193,438 variants of uncertain significance non-damage (VUS_ND), 5,574,471 "possibly benign" / "benign" mutation sites, and 792,509 unclassified mutation sites.

[0156] Among 150 cancer susceptibility genes (CSGs), 16 pathogenic mutation sites, 15 possibly pathogenic mutation sites, and 120 VUS_D mutation sites were identified (Figure 5A). The variant allele frequency (VAF) distribution was centered around 0.5, consistent with the characteristics of germline mutations (Figure 5A, Figure 5B).

[0157] (2) Validation cohort

[0158] The validation cohort was sequenced with two PANELs, covering the exons and some introns of 612 and 242 common tumor-associated genes, respectively, and including a portion of the CSGs. PANEL1 (281 cases) detected 633,671 germline mutation sites, and PANEL2 (384 cases) detected 12,339 germline mutation sites. In the 150 CSGs, PANEL1 identified 44 "pathogenic" mutation sites, 23 "possibly pathogenic" mutation sites, and 109 VUS_D mutation sites, while PANEL2 identified 10 "pathogenic" mutation sites, 9 "possibly pathogenic" mutation sites, and 110 VUS_D mutation sites (Figure 6).

[0159] (3) Functional validation cohort

[0160] The functional validation cohort contained WGS data from organoid / patient peripheral blood paired samples of 69 pancreatic cancer organoids, covering 15.97% of pathogenic (classified as "pathogenic," "possibly pathogenic," and VUS_D) germline mutations. Two "possibly pathogenic" mutation sites and 19 VUS_D mutation sites were detected across 150 CSGs.

[0161] 6.4 Landscape of germline mutations in pancreatic cancer in the Chinese population

[0162] Systematic analysis of WGS and WES data from the 389 pancreatic cancer patients in the discovery cohort revealed that 133 patients carried a total of 167 germline mutations in CSGs. These germline mutations were concentrated in six major pathways: the hereditary pancreatitis pathway, homologous recombination repair pathway, mismatch repair pathway, other DNA repair pathways, Fanconi anemia pathway, and DNA damage repair pathway. These mutated genes were also validated in the validation and functional validation cohorts. The mutated genes were sorted by pathway and mutation frequency, showing the genes of the six pathways above as well as other frequently mutated tumor susceptibility genes (Figure 7).

[0163] CFTR was the gene with the highest mutation frequency, at 3.6%, 4.4%, and 5.8% in the discovery, validation, and functional validation cohorts, respectively. These comprised 5 pathogenic mutations, 4 possibly pathogenic mutations, and 27 VUS_D mutations, involving a total of 18 mutation sites: 2 pathogenic, 1 possibly pathogenic, and 15 VUS_D (see Table 2). The harmful CFTR germline mutations detected were rare in healthy controls, the Chinese population (ChinaMAP), and the gnomAD East Asian population. The VAF of the mutated gene was approximately 50%, consistent with germline mutation characteristics (Figure 8). The high-frequency mutation sites of CFTR were p.E681V, p.R1097C, p.R74W, p.G970D, and p.R1453W.

[0164] Table 2 CFTR mutation sites and annotation information

[0165]

[0166]

[0167] Note: P = pathogenic, LP = possibly pathogenic, VUS = variant of unknown significance, VUS_D = damaging variant of unknown significance; T = tolerated; D = deleterious;

[0168] novel = not reported in the CFTR database or the pancreatitis database.

[0169] 6.4 Systematic identification of "second-hit" events

[0170] To better explore the biological impact of pathogenic mutations, LOH events were systematically detected through statistical analysis of VAF differences and CNV analysis in tumor / normal paired samples. Among the CSGs in the discovery cohort, 3 "significant" LOH events and 18 "probable" LOH events were identified, all of which occurred in tumor suppressor genes (Figure 9A). To validate the accuracy of the LOH calls and differentiate the types of LOH, CNVs in the tumor / normal paired samples were further examined, and the data were integrated using GISTIC. The results showed that "significant" and "probable" LOHs exhibited similar CNV deletion changes (Figure 9B), indicating that the "probable" LOH events may have failed to reach statistical significance because of insufficient sequencing depth or low tumor purity. Among these "probable" LOH events, 11 / 18 were due to deletion of the wild-type allele. Integrating the results of the VAF statistical analysis and the CNV analysis, 35 LOH events distributed across 26 tumor suppressor genes were ultimately identified (Figure 9C). Most LOHs were caused by deletion of the wild-type allele (26 / 35).

[0171] CFTR was the gene with the highest frequency of LOH. VAF statistical analysis revealed two "significant" LOHs and three "probable" LOHs, all of which were confirmed by CNV analysis to involve CNV deletion of the wild-type allele. The LOH sites were p.P140S, p.R279Q, p.E681V, p.C1355Y, and p.S1456R, located in the ABC transmembrane type 1-1 domain, regulatory domain R, ABC transporter 2 domain, and disordered regulatory domain of CFTR, respectively (Figure 10A). Comparing the VAF distribution of tumor / control samples in the IGV browser further confirmed the authenticity of the LOH events (Figure 10B).

[0172] 6.5 Germline mutations of CFTR in other cancer types

[0173] CFTR germline mutation data were obtained for a total of 9,253 other tumors, including 79 cases of cervical cancer, 56 cases of endometrial cancer, 551 cases of esophageal cancer, 258 cases of ovarian cancer, 1,161 cases of hepatocellular carcinoma, 222 cases of small cell lung cancer, 2,004 cases of non-small cell lung cancer, 797 cases of gastric cancer, 1,217 cases of colon cancer, 493 cases of intrahepatic cholangiocarcinoma, 229 cases of extrahepatic cholangiocarcinoma, and 95 cases of urothelial carcinoma.

[0174] Among these tumors, cervical cancer and endometrial cancer had the highest frequencies of CFTR germline mutations, at 3.8% and 3.6%, respectively, which may be related to the higher mutation frequency of CFTR in women (see Table 4). CFTR mutations were not detected in urothelial carcinoma, and the CFTR mutation frequency in the remaining tumors was less than 2% (see Table 4 and Figure 11A). p.E681V, p.R74W, p.R1453W, and p.G970D were hotspot mutation sites across cancer types, with p.E681V mutated in all tumors except urothelial carcinoma (Figure 11B). Excluding tumors of the female reproductive system, the mutation frequency of CFTR in pancreatic cancer was significantly higher than in other cancer types.

[0175] Example 2: Correlation Analysis Between Germline Mutations and Pancreatic Cancer

[0176] The CFTR mutation sites of the 842 pancreatic cancer patients in the discovery cohort, validation cohort (PANEL2), and functional validation cohort were compared with those of healthy controls. The frequency of CFTR mutations was significantly higher in pancreatic cancer patients than in healthy controls (4.28% vs. 1.56%, p<0.001) (Table 3). At the level of individual mutation sites, p.G970D, p.G1298A, and p.R1453W showed significant differences between the two cohorts (p<0.05). Seven other sites (p.L69F, p.L100I, p.H949P, p.A1136T, p.C1355Y, p.A1364V, and p.S1456R) did not differ significantly between the two groups (p=0.052), but none of these mutations was found in the healthy controls.

[0177] Table 3 Comparison of CFTR mutation frequencies in pancreatic cancer and healthy control cohorts.

[0178]

[0179] An analysis of the demographic and clinical characteristics of the 842 pancreatic cancer patients with and without harmful CFTR mutations revealed that patients carrying harmful CFTR mutations more often had a family history of cancer (44.4% vs. 21.3%, p = 0.001) and were more often female (61.1% vs. 38.0%, p < 0.001) than patients without such mutations (Table 4).

[0180] Table 4. Clinical characteristics of pancreatic cancer patients carrying / not carrying harmful germline mutations of CFTR

[0181]

[0182] BMI = Body Mass Index

[0183] In summary, 18 harmful germline mutation sites for CFTR were identified. Among them, three sites, p.G970D, p.G1298A, and p.R1453W, showed significant differences between the two cohorts. No mutations were found in the healthy control cohort for p.L69F, p.L100I, p.H949P, p.A1136T, p.C1355Y, p.A1364V, and p.S1456R. Furthermore, LOH events were detected at five sites, p.P140S, p.R279Q, p.E681V, p.C1355Y, and p.S1456R.

[0184] Example 3: Expression Feature Analysis of CFTR

[0185] 1. DNA sequencing and analysis

[0186] The method was as described in Example 1.

[0187] (1) Tumor purity assessment

[0188] The Absolute R package was used to quantify the absolute copy number, tumor purity, and ploidy of cells. The parameters were: min.ploidy = 0.95, max.ploidy = 10, max.sigma.h = 0.015, copy_num_type = "total", sigma.p = 0, max.as.seg.count = 15000000, max.non.clonal = 0.05, max.neg.genome = 0.005, and min.mut.af = 0.05. The remaining parameters used their default values. The tumor purity information of the sample was obtained from the output.

[0189] 2. RNA sequencing and analysis

[0190] (1) Library construction and sequencing

[0191] Total RNA was isolated from tumor tissue and adjacent normal tissue using TRIzol reagent, and RNA purification was performed using the RNeasy Mini Kit according to the manufacturer's instructions. RNA degradation and contamination were assessed by agarose gel electrophoresis. RNA purity was measured using a spectrophotometer, and RNA integrity was determined using the RNA Nano 6000 Assay Kit on a Bioanalyzer 2100 system. Stranded RNA-seq libraries were constructed using the Ultra™ RNA Library Prep Kit according to the manufacturer's instructions.

[0192] mRNA was enriched and fragmented from total RNA using Oligo dT magnetic beads; first-strand cDNA was synthesized using random hexamer primers; second-strand cDNA was then synthesized using DNA polymerase I and RNase H; fragment size selection, adapter ligation, and PCR amplification were then performed; and the quality of the cDNA library was evaluated on a Bioanalyzer 2100 system.

[0193] Index-tagged sequencing samples were clustered on the flow cell of the cBot Cluster Generation System using the Illumina PE Cluster Kit (Illumina, San Diego, California, USA), and the cDNA libraries were sequenced on the Illumina HiSeq X Ten platform to generate 150 bp paired-end reads.

[0194] (2) Quality control, alignment, and transcript quantification

[0195] RNA data quality was analyzed using FastQC. Sequencing adapters were removed and low-quality sequences were filtered out using Fastp software. Sequencing data were mapped to the human reference genome hg38 using HISAT2 software. Gene expression levels (read counts) were calculated using featureCounts software with the parameter -pBt exon and other parameters set to default. GRCh38 (V95) was used as the genome annotation file.

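[0196] The read counts were normalized using FPKM (fragments per kilobase of transcript per million mapped fragments), as shown in the following formula:

[0197] $$\mathrm{FPKM} = \frac{\text{fragments mapped to the gene} \times 10^{9}}{\text{total mapped fragments} \times \text{gene length (bp)}}$$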
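By way of a non-limiting illustration, FPKM normalization of a read count can be computed as follows. The sketch uses the standard definition of FPKM; the function and parameter names are assumptions, and the gene length would in practice come from the annotation file used with featureCounts.

```python
def fpkm(read_count, gene_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling factors.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)
```

For example, 500 fragments on a 2,000 bp gene in a library of 10,000,000 mapped fragments gives `fpkm(500, 2000, 10_000_000)`, which evaluates to 25.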

[0198] 3. RNA data from public databases

[0199] Pancreatic cancer gene expression data were searched on the R2 platform (Genomics Analysis and Visualization Platform, http://r2.amc.nl), four datasets that met the requirements were selected, and their transcriptome expression data were downloaded. GSE28735 was used to compare CFTR between cancer and adjacent tissues in bulk tissue sequencing; E-MEXP-1121 was used to compare CFTR between cancer and adjacent tissues in microdissected bulk tissue sequencing; the Bailey data were used to analyze the prognostic difference between high and low CFTR expression in bulk tumor tissues; and GSE17891 was used to analyze the prognostic difference between high and low CFTR expression in microdissected bulk tumor tissues.

[0200] 4. ATAC-seq and analysis

[0201] (1) Library construction and sequencing

[0202] Organoids were collected from culture dishes, the matrix gel was washed off with pre-cooled PBS, and nuclei were collected according to the manufacturer's instructions. ATAC sequencing libraries were constructed using the TruePrep™ DNA Library Prep Kit V2 (Vazyme, TD501) and sequenced on the Illumina Nova platform.

[0203] (2) Data processing and analysis

[0204] The data were quality controlled using FastQC software. Fastp software was used to trim Illumina adapter sequences, transposase sequences, and low-quality sequences from the paired-end ATAC-seq reads. BWA-MEM was then used to map the reads to the human reference genome hg38, and Samtools was used to filter out mitochondrial reads, non-unique alignments, and duplicate reads. After alignment, the ATACseqQC R package was used for further quality control. MACS2 software was used to call peaks. The BAM files were converted to bigWig files using deepTools bamCoverage with RPKM normalization. Finally, multiBigwigSummary BED-file was used to integrate the signal intensities of all bigWig files within specific chromosomal regions, generating a signal intensity matrix.
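By way of a non-limiting illustration, the peak calling and signal summarization steps above can be written as command lines. The sketch only builds the argument lists; the file names, the paired-end BAM format flag, the genome size setting, and the output names are hypothetical choices, not parameters stated in the text.

```python
def build_atac_commands(sample, filtered_bam, regions_bed, bigwigs):
    """Return ATAC-seq peak and signal commands as argv lists (illustrative)."""
    return [
        # Peak calling on the filtered alignments (paired-end mode assumed)
        ["macs2", "callpeak", "-t", filtered_bam, "-f", "BAMPE",
         "-g", "hs", "-n", sample],
        # BAM to bigWig conversion with RPKM normalization
        ["bamCoverage", "-b", filtered_bam, "-o", f"{sample}.bw",
         "--normalizeUsing", "RPKM"],
        # Signal matrix over the regions of interest across all bigWig files
        ["multiBigwigSummary", "BED-file", "-b", *bigwigs,
         "--BED", regions_bed, "-o", "signal.npz",
         "--outRawCounts", "signal.tab"],
    ]
```

The tab-separated file written by `--outRawCounts` holds one row per region and one column per bigWig file, which corresponds to the signal intensity matrix described above.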

[0205] 5. Pathological and immunohistochemical evaluation

[0206] (1) Paraffin sample acquisition and tissue sectioning

[0207] Based on patient information from the discovery and validation cohorts, paraffin blocks of 200 tumor tissues and adjacent tissues from the Department of Pathology, First Affiliated Hospital of Naval Medical University were used to prepare sections of tumor tissues and adjacent tissues, with 10 consecutive sections for each sample.

[0208] (2) Hematoxylin-eosin (HE) staining, followed by mounting with neutral resin mounting medium.

[0209] (3) Immunohistochemical staining

[0210] (4) Quantitative analysis of immunohistochemical staining

[0211] (5) Immunofluorescence staining: The sections were observed and images were acquired under a Nikon fluorescence microscope. The ultraviolet excitation wavelength was 330-380nm and the emission wavelength was 420nm; the FITC green light excitation wavelength was 465-495nm and the emission wavelength was 515-555nm.

[0212] 6. Methylation-specific PCR

[0213] (1) Organoid resuscitation and culture

[0214] Immerse the cryovials in a 37°C water bath for approximately 3-5 minutes, until the contents are completely thawed. Transfer the liquid from the cryovials to labeled centrifuge tubes, wash twice with basal culture medium, and centrifuge at approximately 1,000 rpm for 3 minutes. Based on the amount of cell pellet, inoculate suspension plates with appropriate amounts of corresponding culture medium and matrix gel. Observe the organoid culture under a microscope. When the density of organoid suspension growth reaches approximately 70%, proceed to the next experimental step.

[0215] (2) Cell Culture

[0216] PANC-1 cells were cultured in DMEM medium (with 10% FBS and 0.2% ant-mpp) for 2 days until the cells reached 90% confluence in the culture dish.

[0217] CFPAC-1 cells were cultured in IMDM medium (with 10% FBS and 0.2% ant-mpp) for 3 days until the cells reached 90% confluence in the culture dish.

[0218] (3) DNA extraction from organoids and cell lines

[0219] ① Organoids / cell lines were digested and resuspended into cell suspension, then centrifuged at 10,000 rpm for 1 min. After discarding the supernatant, 200 μL of buffer GA was added, and then the suspension was shaken until completely resuspended.

[0220] ② Add 20 μL of Proteinase K solution and mix well.

[0221] ③ Add 200 μL of buffer GB, mix thoroughly by inverting, and place at 70°C for 10 min. The solution will become clear. Briefly centrifuge to remove water droplets from the inner wall of the tube cap.

[0222] ④ Add 200 μL of anhydrous ethanol, shake thoroughly to mix, and centrifuge briefly again to remove water droplets from the inner wall of the tube cap.

[0223] ⑤ Add the solution and flocculent precipitate obtained in the previous step to the adsorption column CB3, centrifuge at 12,000 rpm for 30 seconds, discard the waste liquid, and put the adsorption column CB3 back into the collection tube.

[0224] ⑥ Add 500 μL of buffer GD to the adsorption column CB3, centrifuge at 12,000 rpm for 30 seconds, discard the waste liquid, and place the adsorption column CB3 into the collection tube.

[0225] ⑦ Add 600 μL of washing buffer PW to the adsorption column CB3, centrifuge at 12,000 rpm for 30 seconds, discard the waste liquid, and place the adsorption column CB3 into the collection tube.

[0226] ⑧ Repeat step ⑦.

[0227] ⑨ Place the adsorption column CB3 back into the collection tube, centrifuge at 12,000 rpm for 2 minutes, and discard the waste liquid. After placing the adsorption column CB3 at room temperature for several minutes, thoroughly dry any residual rinsing liquid in the adsorption material.

[0228] ⑩ Transfer the adsorption column CB3 into a clean centrifuge tube, add 50-200 μL of elution buffer TE to the middle of the adsorption membrane, let it stand at room temperature for 2 min, centrifuge at 12,000 rpm for 2 min, and collect the solution into a centrifuge tube for later use.

[0229] (4) Bisulfite treatment of DNA samples

[0230] Bisulfite conversion was performed according to the instruction manual of the bisulfite kit.

[0231] (5) Purification of DNA after bisulfite conversion

[0232] ① Briefly centrifuge the PCR tube and transfer the solution intact to a clean 1.5ml microcentrifuge tube.

[0233] ② Add 560 μL of freshly prepared Buffer BL containing 10 μg / ml carrier RNA to each sample. Vortex the solution and then centrifuge briefly.

[0234] ③ Place the EpiTect spin columns and collection tubes on a suitable rack. Transfer the solution from step ② to the corresponding EpiTect spin column. Centrifuge the spin columns at the highest speed for 1 minute. Discard the flow-through and return each spin column to its collection tube.

[0235] ④ Add 500 μL of Buffer BW to each spin column and centrifuge at the highest speed for 1 min. Discard the flow-through and return the spin column to the collection tube.

[0236] ⑤ Add 500 μL of Buffer BD to each spin column and incubate at room temperature for 15 min. Centrifuge the spin column at the highest speed for 1 min. Discard the flow-through and return the spin column to the collection tube.

[0237] ⑥ Add 500 μL of Buffer BW to each spin column and centrifuge at the highest speed for 1 min. Discard the flow-through and return the spin column to the collection tube.

[0238] ⑦ Place the spin column into a new 2 ml collection tube and centrifuge at the highest speed for 1 minute to remove all residual liquid.

[0239] ⑧ Place the open spin column into a clean 1.5 ml microcentrifuge tube and incubate it in a heating block at 56℃ for 5 minutes.

[0240] ⑨ Place the spin column into a clean 1.5 ml microcentrifuge tube. Dispense 20 μL of Buffer EB onto the center of the membrane. Elute the purified DNA by centrifuging at approximately 12,000 rpm for 1 min.

[0241] (6) Methylation-specific PCR

[0242] ① Using bisulfite-treated genomic DNA as a template, a 400 bp fragment was amplified using a methylation-specific PCR kit, and a 20 μL reaction system was constructed. The following systems were prepared using methylation primers and non-methylation primers respectively:

[0243] Methylated primer, forward strand (M-forward): 5′-GGAGGAGGAAGGTAGGTTTC-3′

[0244] Methylated primer, reverse strand (M-reverse): 5′-GACCTCTCTTTAAATCCAATTAACAAC-3′

[0245] Unmethylated primer, forward strand (U-forward): 5′-GGAGGAGGAAGGTAGGTTTT-3′

[0246] Unmethylated primer, reverse strand (U-reverse): 5′-AACCTCTCTTAAATCCAATTAACAAC-3′
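The logic behind pairing a methylated and an unmethylated primer set can be illustrated with a minimal in silico sketch. It assumes the usual bisulfite model, in which unmethylated cytosines read as T after conversion while methylated CpG cytosines remain C. The function names and the toy template sequence are hypothetical and are not taken from the kit or from this description; only the two forward primer sequences come from the text above.

```python
def bisulfite_convert(seq: str, methylated: bool) -> str:
    """Simulate bisulfite conversion of one DNA strand.

    Every C becomes T, except a C in a CpG context when `methylated` is True.
    """
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        in_cpg = base == "C" and i + 1 < len(seq) and seq[i + 1] == "G"
        if base == "C" and not (methylated and in_cpg):
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def diff_positions(a: str, b: str) -> list:
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Forward primers as listed in the description.
M_FORWARD = "GGAGGAGGAAGGTAGGTTTC"
U_FORWARD = "GGAGGAGGAAGGTAGGTTTT"

# Hypothetical toy template, used only to show the conversion rule.
toy = "ACGTCCG"
print(bisulfite_convert(toy, methylated=True))
print(bisulfite_convert(toy, methylated=False))
print(diff_positions(M_FORWARD, U_FORWARD))
```

In this sketch the two forward primers differ only at their 3′-terminal base (C versus T), which is the position that discriminates a retained from a converted cytosine.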

[0247]

[0248] ② Setting up the PCR reaction cycle:

[0249]

[0250] ③Agarose gel electrophoresis.

[0251] 7. Survival Analysis

[0252] The samples were divided into high and low groups based on the median CFTR expression levels. Survival time was defined in Example 1, Materials and Methods – Study Cohort Construction – Follow-up Plan. Survival analysis was performed using the Kaplan-Meier method, and significance was tested using the Log-rank method.
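The Kaplan-Meier estimate and the two-group log-rank test named above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the software actually used in the study: the function names are hypothetical, `events` uses 1 for an observed death and 0 for censoring, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi2, p_value) with 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

In use, the high and low CFTR expression groups would each supply one list of survival times and one list of event indicators; the values themselves are not given in this description.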

[0253] 8. Experimental Results

[0254] 8.1 CFTR Expression in Pancreatic Cancer and Adjacent Tissues

[0255] (1) CFTR RNA expression in pancreatic cancer and adjacent tissues

[0256] RNA sequencing results of 108 pancreatic cancer and adjacent normal tissue samples revealed that CFTR expression was downregulated in cancerous tissue compared to surrounding normal tissue (p<0.01).

[0257] Further validation in pancreatic cancer organoids and normal pancreatic tissue organoids (83 vs. 5) revealed that CFTR expression was downregulated, with a more significant difference (p<0.001).

[0258] Analysis of CFTR RNA expression differences between cancerous and adjacent tissues in a public pancreatic cancer database revealed that, in both bulk tissue sequencing (45 vs. 45) and microdissected samples (6 vs. 6), CFTR expression in adjacent tissues was significantly higher than that in cancerous tissues (p<0.01; Figure 12A).

[0259] (2) Expression of CFTR protein in pancreatic cancer and adjacent tissues

[0260] Immunohistochemical staining was performed on surgical pathological specimens from 200 patients in the discovery and validation cohorts to assess CFTR protein expression, and ImageJ software was used for quantitative analysis to evaluate CFTR expression intensity.

[0261] Pancreatic cancer tissue and adjacent pancreatic epithelial cells were delineated on HE-stained sections, and immunohistochemical staining of CFTR was then performed on adjacent serial sections. Quantitative analysis revealed that the CFTR expression level in normal ductal cells was significantly higher than that in tumor cells (p<0.0001; Figure 13). In patients with chronic pancreatitis, CFTR expression in reactive normal ductal cells was likewise significantly higher than that in tumor cells (p<0.0001; Figure 12B, C).

[0262] 8.2 CFTR germline mutations cause decreased CFTR RNA expression

[0263] The results above show that CFTR expression is decreased in tumor tissues, and similar changes were also observed between cancer and adjacent tissues of samples carrying CFTR germline mutations (Figure 13A). Comparison of CFTR expression in normal pancreatic tissues of CFTR germline mutation carriers and non-carriers showed that expression tended to be lower in carriers, but the difference was not statistically significant (p>0.05; Figure 13B).

[0264] To more accurately compare CFTR expression in pancreatic tumor tissues from CFTR germline mutation carriers and non-carriers, expression levels were first corrected for using tumor purity. Tumor cell content was assessed in HE-stained sections from 30 pancreatic cancer patients. Tumor purity was then assessed using the ABSOLUTE R package based on sequencing data from these 30 patients, and the correlation between the two was analyzed.

[0265] The results showed a good correlation between the tumor cell content assessed by pathology and the tumor purity assessed by ABSOLUTE (R = 0.55, p < 0.01; Figure 13C). Therefore, the tumor purity calculated using ABSOLUTE was used to correct CFTR expression levels in the tumor tissue RNA sequencing results. After correction, CFTR expression levels in pancreatic tumor tissues of CFTR germline mutation carriers tended to be lower than those of non-carriers, but the difference was not statistically significant (p>0.05; Figure 13D).
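As a hedged illustration of the correlation step, the sketch below computes a Pearson coefficient and its t statistic. The description does not say which correlation coefficient was used, so Pearson is an assumption here, and the function names are hypothetical. The only figures taken from the text are R = 0.55 and the 30 patients.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Reported values: R = 0.55 across 30 patients.
t_stat = correlation_t_statistic(0.55, 30)
print(round(t_stat, 2))
```

Under the Pearson assumption, the resulting t statistic (about 3.48 with 28 degrees of freedom) exceeds the two-sided 1% critical value of roughly 2.76, which is consistent with the reported p < 0.01.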

[0266] To test this hypothesis, we further compared CFTR expression between carriers and non-carriers of CFTR germline mutations in organoids. Interestingly, in pancreatic cancer organoids, where interference from normal tissue is excluded, the four organoids carrying CFTR germline mutations (p.R297Q, p.E681V ×2, p.H949P) showed significantly lower CFTR expression levels than CFTR wild-type organoids (p<0.05; Figure 14A). We also found that the CFTR p.E681V mutant organoids not only showed reduced expression levels but also specifically expressed the mutant allele (A) in the RNA (Figure 14B).

[0267] 8.3 CFTR germline mutations cause decreased CFTR protein expression

[0268] We also evaluated CFTR expression in wild-type and mutant pancreatic cancer tissues using immunohistochemical staining. We compared the staining intensity of 27 wild-type CFTR samples and 10 mutant CFTR samples and found that CFTR expression in wild-type samples was significantly higher than that in mutant samples (p<0.01; Figure 15A). In addition, immunohistochemistry and immunofluorescence staining of mutant samples showed punctate aggregation of CFTR protein in the cytoplasm (Figure 15B, C).

[0269] 8.4 Epigenetic Effects on CFTR Expression

[0270] (1) CFTR chromatin openness

[0271] To explore the mechanism leading to decreased CFTR expression, we analyzed the openness of the CFTR promoter using a combination of organoid RNA sequencing and ATAC-seq data. The sequencing data quality control results are as follows:

[0272] Insert fragment size distribution: the peaks corresponding to the nucleosome-free region (<100 bp) and to mono-, di-, and tri-nucleosomes (~200, 400, and 600 bp, respectively) decrease periodically (Figure 16A). Signal distribution around transcription start sites: insert fragments are enriched around the transcription start sites of genes (Figure 16B). ATAC signal in housekeeping gene promoter regions: housekeeping gene expression is not tissue-specific, so the signal at these promoters can be used as a quality indicator for ATAC-seq (Figure 16C).

[0273] Comparison of CFTR chromatin openness in pancreatic cancer organoids and normal pancreatic organoids showed that the CFTR promoter region of normal pancreatic organoids was more open than that of pancreatic cancer organoids (p<0.0001; Figure 17).

[0274] The 63 organoids with both RNA sequencing and ATAC-seq data were divided into a high CFTR expression group (n=31) and a low CFTR expression group (n=32) based on the median CFTR expression level (Figure 18A). The CFTR promoter region is defined as the 1.5 kb upstream of the CFTR transcription start site, and the two enhancer regions are defined as the regions 35 kb and 44 kb upstream of the CFTR transcription start site (Figure 18B). The chromatin opening signal in the high CFTR expression group was significantly higher than that in the low expression group in both the promoter region and the two enhancer regions (p<0.001; Figure 18C), and chromatin opening signals in the promoter and enhancer regions correlated well with CFTR RNA expression levels (R = 0.65, 0.51, 0.46, ...; Figure 18C).
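The grouping and region definitions above can be sketched as follows. This is a minimal illustration, assuming that samples exactly at the median fall into the low group (which reproduces the 31/32 split for 63 samples) and that intervals are 0-based and half-open. The function names, sample identifiers, and the transcription start site coordinate are hypothetical placeholders, not values from the study.

```python
from statistics import median


def split_by_median(expression):
    """Split sample IDs into (high, low) groups around the median value."""
    cutoff = median(expression.values())
    high = [s for s, v in expression.items() if v > cutoff]
    low = [s for s, v in expression.items() if v <= cutoff]
    return high, low


def promoter_interval(tss, strand, length=1500):
    """Half-open interval covering `length` bp upstream of the TSS."""
    if strand == "+":
        return (tss - length, tss)
    if strand == "-":
        return (tss + 1, tss + 1 + length)
    raise ValueError("strand must be '+' or '-'")


# Hypothetical expression values for 63 organoids.
expression = {f"organoid_{i}": float(i) for i in range(63)}
high, low = split_by_median(expression)
print(len(high), len(low))

# Hypothetical TSS coordinate, used only to show the arithmetic.
print(promoter_interval(10_000, "+"))
```

The enhancer regions are not sketched because the description gives their distance from the transcription start site but not their width.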

[0275] (2) CFTR promoter methylation

[0276] Methylation-specific PCR analysis was performed on 12 organoid samples and two pancreatic cancer cell lines. The 12 organoids were divided into high-expression and low-expression groups based on CFTR expression levels. PANC-1 cells did not express CFTR, while CFPAC-1 cells expressed F508del CFTR. The organoids in the high-expression group and CFPAC-1 cells exhibited unmethylated bands, while the organoids in the low-expression group and PANC-1 cells exhibited methylated bands. Notably, organoids PC-133 and PC-135 showed both methylated bands and faint unmethylated bands (Figure 19). This result suggests that CFTR expression is regulated by promoter methylation.

[0277] Example 4: Tumor-suppressive function of CFTR and its mutant subtypes

[0278] The plan is to establish four pancreatic cancer cell lines with CFTR knockdown or overexpression, and then to establish stable CFTR point-mutation cell lines using PANC-1 cells, in order to explore the tumor-suppressive phenotype of CFTR in pancreatic cancer cells. Nine mutation sites are examined: p.R31C, p.L69F, p.L88X, p.E217G, p.E681V, p.R1097C, p.Q1352H, p.C1355Y, and p.S1456R.

[0279] The cell lines used in this invention are pancreatic cancer PANC03.27, pancreatic cancer SU86.86, pancreatic cancer ASPC1, and pancreatic cancer PANC-1; all of which are derived from the Cell Bank of the Chinese Academy of Sciences.

[0280] 1. Construction of CFTR knockdown and overexpression cell lines

[0281] Pancreatic cancer cell lines and their CFTR expression levels were retrieved from the Cancer Cell Line Encyclopedia (CCLE) database. Cell lines with high CFTR expression (SU86.86, PANC03.27) and low CFTR expression (PANC-1, ASPC1) were selected (Figure 20).

[0282] (1) Construction of CFTR shRNA plasmid vector

[0283] The design, plasmid construction, and extraction of CFTR shRNA sequences were commissioned to Suzhou Genecast Co., Ltd. Two CFTR shRNAs (named CFTR shRNA 1 and shRNA2, respectively) and shNC (blank control) were designed.

[0284] (2) Construction of CFTR overexpression plasmid vector

[0285] The construction and extraction of CFTR overexpression plasmid (CFTR OE) and empty transfer plasmid (CFTR EV) were commissioned to Suzhou GeneGene Co., Ltd.

[0286] (3) Cell preparation and transfection

[0287] Four cell lines, PANC03.27, SU86.86, ASPC1, and PANC-1, were revived and cultured in 10cm dishes. After reaching a density of 80%, they were seeded into 12-well plates. After overnight incubation, the cells were mostly adherent and transfection began when the cell density reached approximately 80%.

[0288] PANC03.27 and SU86.86 cell lines were transfected with CFTR shRNA 1, CFTR shRNA 2, and shNC plasmid; ASPC1 and PANC-1 cell lines were transfected with CFTR OE and CFTR EV plasmids.

[0289] ① For each combination of cells and plasmid, prepare the transfection mixture as follows: Add 100 μL of Opti-MEM I to a 1.5 ml EP tube, then add 2 μg of plasmid (tube A). Take another 1.5 ml EP tube, add 100 μL of Opti-MEM I and 4 μL of Lipofectamine 2000, mix well (tube B), and incubate at room temperature for 5 min. Then mix tubes A and B and incubate at room temperature for 20 min.

[0290] ② Aspirate the culture medium from the 12-well plate and add 800 μL of serum-free culture medium to each well (DMEM or RPMI 1640, depending on the cell type).

[0291] ③ Add the transfection mixture dropwise to the 12-well plate, mix well, and incubate in an incubator for 4-6 hours.

[0292] ④ Remove the transfection solution, add 1 ml of the corresponding complete culture medium, and continue culturing at 37°C and 5% CO2 for 24 hours.

[0293] 2. Construction of stable CFTR-overexpressing cell lines

[0294] (1) Vector construction, plasmid extraction and virus packaging

[0295] Using the PANC-1 pancreatic cancer cell line, CFTR overexpression and 9 stable transgenic lines with CFTR point mutation overexpression were constructed. Vector construction, plasmid extraction, and virus packaging were outsourced to Heyuan Biotechnology (Shanghai) Co., Ltd., and the following 11 viral vectors were constructed:

[0296]

[0297] (2) Screening of stable cell lines

[0298] ①Cell plating

[0299] PANC-1 cells were prepared into a cell suspension of 1×10⁵ cells / mL, and 2mL was seeded into each well, i.e., 2×10⁵ cells / well, for one 6-well plate.

[0300] ② Lentivirus infection

[0301] Infect the cells 12 to 20 hours after plating.

[0302] Virus volume calculation: (cell count × MOI value / stock virus titer) × 10³ = virus volume (μL). The virus volumes used were as follows:

[0303]

[0304] Add polybrene: Add 10 μL of 1 mg / mL polybrene to each well, so that the final concentration of polybrene in the cell sample is 5 μg / mL.
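The virus volume formula and the polybrene dilution above reduce to simple arithmetic, sketched below. The function names are hypothetical, the titer is assumed to be expressed per mL, and the MOI and titer in the example are placeholder values because the table of actual virus volumes is not reproduced here. The added reagent volume is ignored when computing the final concentration, as in the text.

```python
def virus_volume_ul(cell_count, moi, titer_per_ml):
    """Virus volume in µL: (cell count × MOI / stock titer) × 10³."""
    return cell_count * moi / titer_per_ml * 1e3


def final_concentration_ug_per_ml(stock_mg_per_ml, added_ul, well_volume_ml):
    """Final concentration after adding a small volume of stock to a well.

    1 mg/mL equals 1 µg/µL, so the added mass in µg is stock × added volume.
    The added volume itself is neglected relative to the well volume.
    """
    added_ug = stock_mg_per_ml * added_ul
    return added_ug / well_volume_ml


# 2×10⁵ cells per well (from the plating step); MOI and titer are placeholders.
print(virus_volume_ul(2e5, moi=10, titer_per_ml=1e8))

# 10 µL of 1 mg/mL polybrene in a 2 mL well, as stated in the text.
print(final_concentration_ug_per_ml(1.0, 10, 2.0))
```

The second call reproduces the stated final polybrene concentration of 5 μg / mL.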

[0305] Change the culture medium 12-20 hours after infection: discard the culture medium and add 2 mL of fresh culture medium to each well.

[0306] ③ Screening of stable strains

[0307] After 72 hours, add puromycin to a final concentration of 1 μg / mL. Replace the medium with fresh puromycin-containing medium every 2-3 days. After about two weeks of drug selection, take fluorescence images and cryopreserve the stable cell lines.

[0308] 3. Cell Culture

[0309] PANC-1 cell lines were cultured in DMEM (10% FBS) complete medium, SU86.86 and ASPC1 cell lines were cultured in RPMI-1640 (10% FBS) complete medium, and PANC03.27 was cultured in RPMI-1640 (15% FBS, 10 U / ml recombinant human insulin) complete medium. Eleven stable transgenic PANC-1 lines were cultured in DMEM (with 10% FBS, 0.2% ant-mpp, and 1 μg / mL puromycin). Cells were passaged after 2-3 days when they reached 80%-90% confluence in the culture dish.

[0310] 4. CFTR qPCR identification

[0311] (1) RNA extraction

[0312] ① Lysis: After aspirating the culture medium from the culture dish and rinsing with PBS, add 350 μL of lysis buffer LB from the RNA extraction kit and mix thoroughly by pipetting repeatedly.

[0313] ② DNA removal: Transfer the lysate to a DNA removal column in a collection tube, centrifuge at 12,000 rpm for 2 min, and retain the filtrate;

[0314] ③ Protein removal: Take the supernatant and add protein removal solution PL;

[0315] ④ RNA adsorption: Transfer the liquid to RNA adsorption column A2, centrifuge at 12,000 rpm for 1 min, then discard the filtrate;

[0316] ⑤ Add 500 μL of binding solution BD, centrifuge at 12,000 rpm for 1 min, then discard the filtrate;

[0317] ⑥ Washing: Place RNA adsorption column A2 into the collection tube, add 700 μL of washing buffer W, centrifuge at 12,000 rpm for 1 min, and discard the filtrate; repeat twice;

[0318] ⑦ Residue removal: Centrifuge the empty column at 12,000 rpm for 2 min (to remove residual washing buffer W);

[0319] ⑧ Collect RNA: Place the RNA adsorption column into a new centrifuge tube, add 50 μL of RNase-free H2O, incubate at room temperature for 2 min, then centrifuge at 12,000 rpm for 1 min. Collect the filtrate, which is the RNA solution.

[0320] (2) RNA reverse transcription

[0321] ① Residual DNA removal: Prepare RNase-free centrifuge tubes as follows and incubate at 42°C for 2 min;

[0322]

[0323] ② Reverse transcription reaction system: Add 5 μL of 4× Hifair Ⅲ SuperMix Plus directly to the tube from ①;

[0324] ③ Reverse transcription program settings:

[0325]

[0326] (3) qPCR detection

[0327] ① Use the Fast Cell Direct SYBR Green RT-qPCR Kit, with a reaction volume of 20 μL. Prepare the following system:

[0328] CFTR primer sequence (forward): 5′-AAGCTGTCAAGCCGTGTTCT-3′

[0329] CFTR primer sequence (reverse): 5′-AAGTCCACAGAAGGCAGACG-3′

[0330] GAPDH primer sequence (forward): 5′-GTCTTCACCACCATGGAGAA-3′

[0331] GAPDH primer sequence (reverse): 5′-TAAGCAGTTGGTGGTGCAG-3′

[0332]

[0333] ② Setting up the PCR reaction cycle:

[0334]
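The description does not state how relative CFTR expression was calculated from the qPCR data. A common choice with a GAPDH reference is the 2^(-ΔΔCt) method, sketched below purely as an assumption. It presumes roughly 100% amplification efficiency for both primer pairs; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of the target gene versus control by the 2^(-ΔΔCt) method.

    ΔCt = Ct(target) - Ct(reference) within each sample;
    ΔΔCt = ΔCt(sample) - ΔCt(control).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl
    return 2 ** -(delta_ct_sample - delta_ct_control)


# Hypothetical Ct values: CFTR and GAPDH in a treated sample and its control.
fold_change = relative_expression(
    ct_target=24.0, ct_reference=18.0,
    ct_target_ctrl=26.0, ct_reference_ctrl=18.0,
)
print(fold_change)
```

With these placeholder values the target crosses threshold two cycles earlier than in the control after normalization, which corresponds to a four-fold increase.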

[0335] 5. Identification of CFTR protein and FLAG tag

[0336] (1) Total protein extraction

[0337] ① Take the Western and IP cell lysis buffer and the protease inhibitor PMSF out of -20℃ storage, add 10 μL of PMSF to every 990 μL of lysis buffer, and place on ice;

[0338] ② Remove the sterilized EP tube, label it, and place it on ice;

[0339] ③ Remove the cells from the incubator, discard the cell culture medium, and wash once with PBS;

[0340] ④ Discard the PBS, add an appropriate amount of pre-cooled Western blotting and IP cell lysis buffer (containing PMSF), lyse the cells on ice for 30 minutes, scrape off the cells, and transfer the sample into an EP tube;

[0341] ⑤ Transfer the centrifuge tube containing lysed cells to a pre-cooled high-speed low-temperature centrifuge and centrifuge at 4°C and 12,000 rpm for 5 min.

[0342] (2) Determination of protein concentration

[0343] ① Cell protein solution: Extract 3 μL from the original solution, add 27 μL of PBS, and dilute 10 times.

[0344] ② Preparation of BCA working solution: Based on the number of samples, prepare an appropriate amount of BCA working solution according to the volume ratio of A to B of 50:1 and mix thoroughly.

[0345] ③ Prepare standard protein gradients: Dilute 2,000 μg / mL BSA standard to 0, 25, 125, 250, 500, 750, 1000, 1500, and 2000 μg / mL.

[0346] ④ Protein concentration determination: Add 25 μL of sample or standard to each well of a 96-well plate; add 200 μL of BCA working solution to each well and incubate at 37℃ for 30 min; measure the absorbance at 562 nm (A562) using a microplate reader; calculate the protein concentration of the sample based on the standard curve and the sample volume used.
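The standard-curve calculation can be sketched as an ordinary least-squares line fitted to the BSA standards, followed by back-calculation with the 10-fold dilution from step ①. The function names are hypothetical and the absorbance readings are made-up, perfectly linear placeholders; real BCA curves often bend at high concentrations, so a linear fit is a simplifying assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x


def protein_concentration(a562, slope, intercept, dilution_factor=10):
    """Concentration of the undiluted sample in µg/mL."""
    return (a562 - intercept) / slope * dilution_factor


# BSA standard concentrations from the protocol (µg/mL).
standards = [0, 25, 125, 250, 500, 750, 1000, 1500, 2000]
# Placeholder absorbances, generated as 0.1 + 0.001 × concentration.
absorbances = [0.1 + 0.001 * c for c in standards]

slope, intercept = fit_line(standards, absorbances)
print(round(protein_concentration(0.6, slope, intercept), 1))
```

Here a diluted sample reading of 0.6 maps to 500 μg / mL on the placeholder curve, and therefore to 5,000 μg / mL in the undiluted lysate.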

[0347] (3) SDS-PAGE gel electrophoresis

[0348] Based on the calculated concentrations, the loading volume for each well was determined and the sample stock solutions were diluted accordingly. Loading buffer was then added (loading buffer : cell solution = 1:4), and the samples were mixed, boiled at 95°C for 5 minutes, placed on ice, and centrifuged at 12,000 g for 5 minutes at 4°C before electrophoresis.

[0349] (6) Membrane transfer

[0350] Prepare in advance a PVDF membrane of the same size as the gel (activated with methanol for 5 min, then equilibrated in buffer for 15 min) and filter paper. After electrophoresis, gently scrape off the stacking gel, rinse the gel in ddH2O, and then rinse it in 1× transfer buffer. Place the separating gel on the filter paper, then the membrane on top of the gel, remove any air bubbles, and finally cover with another sponge pad before placing the assembly in the transfer tank.

[0351] (7) Immunoblotting

[0352] The membrane was transferred to an incubation box containing TBST with 5% skim milk powder and blocked on a decolorizing shaker at room temperature for 1 hour. The primary antibody was diluted to an appropriate concentration with TBST containing 5% skim milk powder. The membrane was removed from the blocking solution, covered with the primary antibody solution with residual air bubbles removed, and incubated overnight on a decolorizing shaker at 4°C. The membrane was washed three times with TBST on a decolorizing shaker at room temperature for 10 minutes each time. The secondary antibody was diluted and applied to the membrane in the same manner. After incubation at room temperature for 1-2 hours, the membrane was washed three times with TBST on a decolorizing shaker at room temperature for 10 minutes each time, and the chemiluminescent reaction was carried out.

[0353] (8) Chemiluminescence

[0354] Mix equal volumes of reagents A and B; place the membrane with the protein side up on a white plate, add the luminescent liquid droplet onto the membrane to fully cover the membrane surface, and then place it in an imaging device to take a picture.

[0355] 6. Cell proliferation activity assay

[0356] (1) Cell Counting Kit-8 (CCK-8) Experiment

[0357] Digest the cells thoroughly with a digestive solution (slightly over-digestion is acceptable), allowing for good cell dispersion without the need for vigorous pipetting. Incomplete digestion requires vigorous pipetting, which can mechanically damage the cells and affect experimental results. After digestion, perform cell counting and adjust the cell count to 20,000 / mL. Seed control, CFTR-overexpressing PANC-1 cells, and CFTR-overexpressing mutant PANC-1 cells into 96-well plates (approximately 2,000 cells per well), with 6 replicates per group. Allow the cells to adhere. Dilute CCK-8 with fresh complete culture medium (CCK-8: medium = 1:10), mix thoroughly, remove the old medium from the wells, and add 110 μL of freshly prepared CCK-8-containing medium to each well. Measure the absorbance at 450 nm using a microplate reader at 0, 24, 48, and 72 hours.
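The volumes implied by the CCK-8 protocol can be checked with a short calculation. The function names are hypothetical; the numbers (20,000 cells / mL, about 2,000 cells per well, a 1:10 CCK-8 to medium ratio, and 110 μL per well) come from the text above.

```python
def seeding_volume_ul(cells_per_well, density_per_ml):
    """Volume of cell suspension (µL) that delivers the desired cells per well."""
    return cells_per_well / density_per_ml * 1000


def mix_volumes(total_ul, parts_reagent, parts_medium):
    """Split a total volume into (reagent, medium) by a parts ratio."""
    total_parts = parts_reagent + parts_medium
    return (
        total_ul * parts_reagent / total_parts,
        total_ul * parts_medium / total_parts,
    )


# About 2,000 cells per well from a 20,000 cells/mL suspension.
print(seeding_volume_ul(2_000, 20_000))

# 110 µL per well at CCK-8 : medium = 1:10.
print(mix_volumes(110, 1, 10))
```

The sketch indicates that each well receives about 100 μL of cell suspension, and that the 110 μL of working solution corresponds to 10 μL of CCK-8 in 100 μL of medium.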

[0358] (2) Plate colony formation experiment

[0359] Control and CFTR-overexpressing PANC-1 cells were cultured on a large scale. Cells in logarithmic growth phase were digested with trypsin, resuspended in complete culture medium, and counted. The cell number in each experimental group was adjusted to 1,000 cells / well, and each group was seeded in triplicate. The medium was changed and the cell status was observed every 3 days, and the cells were cultured for 14 days. Then, the cells were washed once with PBS, and 1 mL of 4% PFA was added to each well for fixation for 15 min. The cells were washed three times with PBS. 1 mL of crystal violet staining solution was added to each well for staining for 10 min. The cells were then photographed.

[0360] 7. Detection of in vitro migration / invasiveness capabilities

[0361] (1) Migration Experiment

[0362] ① Cell starvation pre-culture: Switch the cells to be used in the experiment to low-serum medium containing 1% FBS and culture for 12 hours.

[0363] ② Preparation of cell suspension stock solution: Digest the cells, resuspend them in 1% FBS medium, count them, and adjust the cell concentration to 4×10⁵ / mL.

[0364] ③ Preparation of the lower chamber: Add 500 μL of complete culture medium to the lower chamber of the culture plate, place the upper chamber into it, and set aside for use.

[0365] ④ Inoculation into the upper chamber: After resuspending the prepared cell suspension stock solution, inoculate it into the upper chamber.

[0366] ⑤ Cell Culture: Place the cell culture plate in a CO2 incubator for incubation. ⑥ Cell Migration Observation: Observe cell migration after 24 hours, depending on experimental needs.

[0367] ⑦ Cell staining: Wipe away the cells on the upper surface of the upper chamber with a cotton swab, transfer the chamber to a new 24-well plate, fix (4% PFA), stain with crystal violet, and then wash.

[0368] ⑧ Cell migration photographing and counting: After washing, the upper chamber is inverted on the microscope for photographing, and the cells are counted in each field of view after the photographs are taken.

[0369] (2) Invasion Experiment

[0370] The invasion experiment requires thawing the Matrigel overnight at 4°C. Matrigel gels easily at room temperature; therefore, the tubes and pipette tips used for coating the basement membrane should be pre-cooled to -20°C before the experiment. Basement membrane coating:

[0371] ① Dilute Matrigel with serum-free, pre-cooled cell culture medium at a ratio of 1:3;

[0372] ② Add 100 μL of diluted matrix gel to the Transwell chamber;

[0373] ③ Place the well plate coated with Matrigel in a 37°C incubator and let it stand for 4 hours;

[0374] ④ Gently wash the gel with serum-free cell culture medium.

[0375] Then proceed with the formal operation:

[0376] ① Cell starvation pre-culture: Replace the cells to be used in the experiment with 1% FBS low serum medium and culture for 12 hours.

[0377] ② Preparation of cell suspension stock solution: Digest the cells, resuspend them in 1% FBS medium, count them, and adjust the cell concentration to 4×10⁵ / mL.

[0378] ③ Preparation of the lower chamber: Add 500 μL of complete culture medium to the lower chamber of the culture plate, place the upper chamber into it, avoiding air bubbles, and set aside for use.

[0379] ④ Inoculation into the upper chamber: After resuspending the prepared cell suspension stock solution, inoculate it into the upper chamber.

[0380] ⑤ Cell culture: Place the cell culture plate in a CO2 incubator for incubation.

[0381] ⑥ Cell migration observation: Depending on the experimental needs, cell migration is generally observed after 24 hours.

[0382] ⑦ Cell staining: Wipe away the cells on the upper surface of the upper chamber with a cotton swab, transfer the chamber to a new 24-well plate, fix (4% PFA), stain with crystal violet, and then wash.

[0383] ⑧ Photograph and count cells migrating.

[0384] 8. Apoptosis analysis

[0385] After culturing control and CFTR-overexpressing PANC-1 cells to a sufficient cell volume, the supernatant was transferred to 15 mL centrifuge tubes. The wells were washed once with PBS, and the cells were digested with digestion solution. Cells were collected from the supernatant in the 15 mL centrifuge tubes and centrifuged at 500 g for 5 min. The supernatant was discarded, and 1 mL of pre-chilled PBS was added to each group to wash the cells. The cells were centrifuged at 500 g for 5 min twice. The supernatant was discarded, and 400 μL of binding buffer was added to each group to resuspend the cells in flow cytometry tubes. 5 μL of Annexin V-FITC was added and gently mixed, and the cells were incubated at room temperature in the dark for 15 min. 10 μL of propidium iodide staining solution was added, and the cells were gently mixed and incubated on ice in the dark for 5 min. The cells were then analyzed by flow cytometry.

[0386] 9. Flow cytometry for cell cycle analysis

[0387] After culturing control and CFTR-overexpressing PANC-1 cells to a sufficient cell volume, the well plates were removed, the supernatant was aspirated, and the cells were washed once with PBS. Cells were digested with digestion solution, collected, and centrifuged at 1000g for 5 min. The supernatant was discarded, and 1 mL of pre-chilled PBS was added to each group for washing the cells. The cells were then centrifuged at 1000g for 5 min. The supernatant was discarded, and 2 mL of pre-chilled 75% ethanol (at -20℃) was added for fixation. Cells were fixed for 2 h. After centrifugation at 1000g for 5 min, the supernatant was discarded, and the cells were washed with 1 mL of pre-chilled PBS. After centrifugation at 1000g for 5 min, the supernatant was discarded, and 1 μL of RNase A (working concentration 20 μg / ml) was added. The cells were incubated at 37℃ for 30 min to fully degrade intracellular RNA. The cells were then centrifuged at 1000g for 5 min, and the supernatant was discarded. The cell pellet was slowly and thoroughly resuspended with 0.5 mL of propidium iodide staining solution. The cells were then incubated at 37℃ in the dark for 30 min. The results were analyzed using flow cytometry and ModFitLT Software.

[0388] II. Experimental Results

[0389] 1. Identification results of CFTR overexpression and interference cell lines

[0390] To verify the biological function of CFTR in pancreatic cancer, we transfected four pancreatic cancer cell lines with the target plasmid and the control plasmid, respectively. After 48 hours, we screened the cells with puromycin for 7 days. After the surviving cells recovered to normal, we verified the transfection effect by qPCR and WB.

[0391] The results showed that after treatment with the two shRNAs, the expression levels of CFTR in PANC03.27 and SU86.86 were significantly lower at both the RNA and protein levels compared with the control group.

[0392] After ASPC1 and PANC-1 cells were transfected with the overexpression plasmid, the expression levels of CFTR were significantly upregulated at both the RNA and protein levels compared with the control group (Figure 21A, B).

[0393] To verify the impact of CFTR mutations on its biological function, we constructed stable cell lines with CFTR overexpression and empty plasmid control using the PANC-1 cell line, as well as stable cell lines with CFTR overexpression using the nine point mutations mentioned in the method. Low concentrations of puromycin were added to the complete culture medium to maintain the growth advantage of the stable cells.

[0394] To confirm expression of the constructs, we used the 3×FLAG tag carried by the plasmid to identify protein expression in the cells. Western blotting showed a protein band at around 180 kDa in the wild-type CFTR-overexpressing line and in eight of the point-mutant CFTR-overexpressing stable cell lines. No bands were detected in PANC-1 cells, empty plasmid control cells, or p.L88X CFTR-overexpressing cells (Figure 21C). Wild-type CFTR-overexpressing PANC-1 cells showed higher protein expression than the point-mutant overexpressing cell lines. p.L88X is a truncating mutation; further examination showed that CFTR RNA expression was upregulated in p.L88X CFTR-overexpressing cells compared with PANC-1 cells and empty plasmid control cells (Figure 22). In summary, all 10 stable cell lines constructed expressed CFTR as expected.

[0395] 2. CFTR inhibits the proliferation of pancreatic cancer cells.

[0396] To investigate the function of CFTR in pancreatic cancer cells, CFTR expression was knocked down in two pancreatic cancer cell lines that highly expressed CFTR, and CFTR was overexpressed in two pancreatic cancer cell lines that did not express CFTR. CCK-8 proliferation assays were used to compare each line with its respective control group.

[0397] Decreased CFTR expression in PANC03.27 and SU86.86 cell lines promoted cell proliferation, while overexpression of CFTR in ASPC1 and PANC-1 cell lines inhibited cell proliferation (Figure 23A).

[0398] Meanwhile, the colony formation assay showed that, compared with the colony numbers of PANC03.27, SU86.86 cells with CFTR knockdown and ASPC1, PANC-1 cells with CFTR overexpression on day 14, the colony formation number significantly increased after CFTR knockdown and significantly decreased after overexpression. Figure 23 B).

[0399] As shown in Figure 24, overexpressed CFTR (OE CFTR) has a strong tumor-suppressive effect. Except for p.E217G, all of the mutation sites impair the function of OE CFTR. The most pronounced are the p.L88X and p.C1355Y mutation sites, which completely abolish the function of OE CFTR and leave the cells with the same colony-forming ability as the empty-vector control (EV).
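One way to express how much of the tumor-suppressive effect a mutant retains in a colony formation assay is to scale its colony count between the EV and OE CFTR controls. The following is a minimal sketch under stated assumptions: the function name `residual_suppression` is hypothetical, and the colony counts are placeholder values chosen for illustration, not measurements from this application.

```python
def residual_suppression(ev_colonies: float,
                         oe_colonies: float,
                         mutant_colonies: float) -> float:
    """Fraction of the OE CFTR suppressive effect retained by a mutant.

    1.0 means the mutant suppresses colony formation as strongly as OE CFTR;
    0.0 means its colony count equals the EV control (function abolished).
    """
    effect = ev_colonies - oe_colonies
    if effect <= 0:
        raise ValueError("OE CFTR must form fewer colonies than EV")
    return (ev_colonies - mutant_colonies) / effect


# Hypothetical placeholder counts, NOT data from the application.
ev, oe = 200.0, 50.0
print(residual_suppression(ev, oe, 50.0))   # mutant behaves like OE CFTR -> 1.0
print(residual_suppression(ev, oe, 200.0))  # mutant behaves like EV -> 0.0
```

On this scale, a site described as completely abolishing the function of OE CFTR would score close to 0, and a site with no effect on function would score close to 1.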

[0400] 3. CFTR inhibits the invasion of pancreatic cancer cells.

[0401] To investigate the effect of CFTR on the invasive ability of pancreatic cancer cells, we conducted migration and invasion experiments using PANC-1 cells and CFTR-overexpressing PANC-1 cells. Compared with the control group, CFTR overexpression inhibited the migration and invasion ability of PANC-1 cells (Figure 25). Figure 26 shows the effects of mutations at different CFTR sites on PANC-1 cell migration.

[0402] 4. CFTR promotes apoptosis in pancreatic cancer cells.

[0403] To verify whether CFTR induces apoptosis in PANC-1 cells, the apoptosis rates of PANC-1 cells and CFTR-overexpressing PANC-1 cells were measured by flow cytometry. The results showed that CFTR overexpression promoted apoptosis in PANC-1 cells (Figure 27A). In addition, cell cycle analysis revealed that CFTR overexpression led to more cells arresting in the G0-G1 phase, while the proportion of cells in the S and G2 phases decreased (Figure 27B).

[0404] 5. Comparison of proliferation between PANC-1 cells overexpressing the nine CFTR mutants and PANC-1 cells overexpressing wild-type CFTR.

[0405] The results showed that overexpression of CFTR significantly inhibited tumor cell proliferation, whereas the inhibitory effect of the nine mutant CFTRs on cell proliferation was weaker (Figure 28A). Given the punctate aggregation of CFTR protein in the cytoplasm together with the weakened tumor-suppressive function, it is speculated that the mutant CFTR may fold abnormally. Western blotting revealed that wild-type CFTR formed two bands, at around 150 kDa and around 200 kDa. The two band sizes represent folded CFTR after processing in the endoplasmic reticulum and in the Golgi apparatus, respectively. However, six sites (p.E271G, p.E681V, p.R1097C, p.Q1352H, p.C1355Y, and p.S1456R) showed only the 150 kDa band, indicating that the protein did not enter the Golgi apparatus for further processing and thus failed to generate mature CFTR protein, resulting in its inactivation (Figure 28B).
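The band-pattern reasoning in the preceding paragraph can be written as a simple rule: a lane with the approximately 200 kDa band contains mature CFTR, while a lane with only the approximately 150 kDa band contains CFTR that was not processed further. The sketch below is illustrative only. The function name `maturation_status` and the nominal band values are assumptions made for the example; the band patterns listed are the ones stated in the text for wild-type CFTR and the six named sites.

```python
IMMATURE_KDA = 150  # nominal size of the band seen without further processing
MATURE_KDA = 200    # nominal size of the additional band seen for wild-type CFTR


def maturation_status(bands_kda: set) -> str:
    """Classify a Western blot lane from the set of nominal band sizes."""
    if MATURE_KDA in bands_kda:
        return "mature"
    if IMMATURE_KDA in bands_kda:
        return "immature only"
    return "no band"


# Band patterns as described in the text (sizes are approximate).
observed = {
    "wild-type": {150, 200},
    "p.E271G": {150},
    "p.E681V": {150},
    "p.R1097C": {150},
    "p.Q1352H": {150},
    "p.C1355Y": {150},
    "p.S1456R": {150},
}

for construct, bands in observed.items():
    print(construct, "->", maturation_status(bands))
```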

Claims

1. The use of a reagent for detecting germline mutant genes in the preparation of screening reagents for pancreatic cancer susceptible populations, wherein the germline mutant gene is a CFTR germline mutant gene, and the germline mutant gene refers to a gene in which the frequency of the mutant allele is ≥0.5; the CFTR germline mutation site is selected from one or more of p.L88X and p.C1355Y.

2. The use as described in claim 1, characterized in that the reagent is a test kit.

3. A biomarker composition for screening pancreatic cancer susceptible populations, wherein the biomarker is a CFTR germline mutant gene, wherein the germline mutant gene refers to a gene with a mutant allele frequency ≥0.5; the germline mutation site of the CFTR germline mutant gene is selected from one or more of p.L88X and p.C1355Y.

4. A prediction system for pancreatic cancer susceptible populations, the system comprising a data acquisition module and a prediction module; the data acquisition module is used to acquire mutation data for one or more of the p.L88X and p.C1355Y sites of the CFTR gene in test subjects, and the prediction module is used to predict whether a test subject is susceptible to pancreatic cancer based on the germline mutation data of the gene obtained by the data acquisition module and the mutation data of the corresponding sites of the CFTR gene.
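For illustration only, the two modules recited in claim 4 could be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the claimed system: the `VariantCall` record, the function names, and the decision rule (a subject is flagged when at least one target site carries a mutant allele with frequency ≥0.5) are all hypothetical, since the claims do not specify a data format or a prediction rule.

```python
from dataclasses import dataclass
from typing import Iterable, List

TARGET_SITES = {"p.L88X", "p.C1355Y"}  # sites named in the claims
GERMLINE_MIN_ALLELE_FREQUENCY = 0.5    # threshold stated in the claims


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one detected variant in a test subject."""
    gene: str
    protein_change: str
    allele_frequency: float


def acquire_cftr_calls(calls: Iterable[VariantCall]) -> List[VariantCall]:
    """Data acquisition module (sketch): keep CFTR calls at the target sites."""
    return [c for c in calls
            if c.gene == "CFTR" and c.protein_change in TARGET_SITES]


def predict_susceptible(calls: Iterable[VariantCall]) -> bool:
    """Prediction module (sketch): assumed rule, flag a subject if any
    target-site call meets the germline allele-frequency threshold."""
    return any(c.allele_frequency >= GERMLINE_MIN_ALLELE_FREQUENCY
               for c in acquire_cftr_calls(calls))


# Hypothetical input, not data from the application.
subject = [VariantCall("CFTR", "p.C1355Y", 0.52),
           VariantCall("CFTR", "p.R74W", 0.49)]
print(predict_susceptible(subject))
```

Separating acquisition from prediction mirrors the module split in the claim, so the site list or the decision rule can be changed independently.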