A CRISPR in vivo screening method for identifying microenvironment-adaptive genes and application thereof
By screening tumor microenvironment adaptive genes in mice and utilizing differential expression analysis and sgRNA library screening technology, the problems of resource waste and high cost in existing technologies have been solved, achieving efficient and accurate screening of tumor microenvironment adaptive genes and reducing costs.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHONGQING MEDICAL UNIVERSITY
- Filing Date
- 2026-03-16
- Publication Date
- 2026-06-12
Abstract
Description
Technical Field
[0001] This invention relates to the field of tumor functional gene screening technology, and in particular to a CRISPR in vivo screening method for identifying microenvironment-adaptive genes and its application.
Background Technology
[0002] Tumor dormancy is a crucial biological basis for treatment failure and recurrence of malignant tumors. Clinical observations have revealed that approximately 30%-40% of breast cancer patients develop distant metastases several years or even more than a decade after primary tumor resection. The fundamental mechanism lies in the fact that a small number of tumor cells enter a dormant state, evading the killing effects of traditional radiotherapy and chemotherapy, and are reactivated and proliferate under suitable microenvironmental stimulation. With the development of single-cell sequencing technology, researchers have discovered significant differences between dormant and proliferating tumor cells at the transcriptomic level. However, static expression profiling analysis is insufficient to reveal the causal relationship between gene function and microenvironmental adaptation.
[0003] Current gene function studies primarily rely on in vitro cell models, using siRNA / shRNA libraries for screening. However, in vitro culture cannot simulate the complex microenvironmental stresses in vivo, including key factors such as hypoxia gradients, nutrient competition, immune surveillance, and stromal cell signaling. In recent years, the maturity of CRISPR-Cas9 gene editing technology has brought revolutionary breakthroughs to functional genomics research, especially lentivirus-mediated whole-genome library screening, which has made significant progress in vitro. However, applying these technologies to the in vivo environment still faces major challenges: whole-genome libraries contain tens of thousands of sgRNAs, requiring extremely high coverage to ensure statistical significance in in vivo screening, which is technically difficult and costly; the lack of targeted screening strategies for specific biological processes (such as dormancy) leads to a large amount of resources being wasted on irrelevant genes.
[0004] To address this technical bottleneck, this application establishes a targeted CRISPR in vivo screening system based on dormancy-related differentially expressed genes and expands functional coverage through an epigenetic module, providing an efficient and precise technical platform for analyzing tumor adaptive survival mechanisms in the real in vivo microenvironment.
Summary of the Invention
[0005] The purpose of this invention is to address the shortcomings of existing technologies by proposing a CRISPR in vivo screening method and its application for identifying microenvironment-adaptive genes.
[0006] To achieve the above objectives, the present invention adopts the following technical solution: a CRISPR in vivo screening method for identifying microenvironment-adaptive genes, comprising the following steps:
S1: Obtain transcriptome sequencing data of dormant tumor cells and proliferating tumor cells from public databases, perform differential expression analysis using a differential expression analysis tool, and select upregulated genes with log2FoldChange > 1 and padj < 0.05 as the core candidate gene set;
S2: Design at least two sgRNAs for each gene in the core candidate gene set, and add positive controls, negative controls, neutral controls and epigenetic / functional regulatory gene controls to form an sgRNA library;
S3: Clone the sgRNA library into a lentiviral vector, and evaluate the coverage and uniformity of the library by transformation, large-scale plasmid extraction, and next-generation sequencing (NGS);
S4: Package the CRISPR library plasmids using a second-generation lentivirus packaging system, and determine the viral titer by quantitative polymerase chain reaction (qPCR);
S5: Infect target cells stably expressing Cas9 protein with the lentiviral library at a low functional multiplicity of infection (MOI), obtain the positive cell population with stably integrated sgRNA by flow cytometry sorting, and collect cell samples at time T0;
S6: Orthotopically inoculate the sorted cells into immunocompetent syngeneic mice (BALB/c), and collect tumor samples at a fixed time endpoint (4 weeks after inoculation) so that the tumors in the library group undergo sufficient microenvironmental selection pressure of equal duration in vivo;
S7: At the screening endpoint, collect tumor tissue, extract genomic DNA, amplify the integrated sgRNA region by PCR, and construct an NGS sequencing library;
S8: Compare and quantitatively analyze the sequencing data of the T0 and endpoint samples. By calculating the log fold change and statistical significance of sgRNA abundance, identify genes that are significantly enriched or depleted under microenvironmental selection pressure, and perform functional enrichment analysis.
[0007] As a further aspect of the present invention: in step S1, the transcriptome sequencing data are either obtained from public databases or generated by in-house sequencing of tumor cells. Dormant tumor cells are compared with proliferating tumor cells, and the upregulated core candidate genes are selected after differential expression analysis.
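The selection rule in step S1 can be illustrated with a minimal Python sketch. It assumes the differential expression results have already been exported as rows with `gene`, `log2FoldChange` and `padj` fields; the gene names and values below are hypothetical placeholders, not data from GSE198715.

```python
# Hypothetical rows mimicking an exported differential expression table.
# Gene names and values are placeholders, not real results.
results = [
    {"gene": "GeneA", "log2FoldChange": 2.3, "padj": 0.001},
    {"gene": "GeneB", "log2FoldChange": 0.8, "padj": 0.01},
    {"gene": "GeneC", "log2FoldChange": 1.5, "padj": 0.20},
    {"gene": "GeneD", "log2FoldChange": -2.0, "padj": 0.0001},
    {"gene": "GeneE", "log2FoldChange": 1.1, "padj": None},
]


def select_upregulated(rows, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Keep genes with log2FoldChange > cutoff and padj < cutoff."""
    selected = []
    for row in rows:
        padj = row["padj"]
        if padj is None:
            # Genes without an adjusted p-value cannot pass the filter.
            continue
        if row["log2FoldChange"] > lfc_cutoff and padj < padj_cutoff:
            selected.append(row["gene"])
    return selected


core_candidates = select_upregulated(results)
print(core_candidates)
```

Only genes that satisfy both thresholds are retained; downregulated genes and genes with a missing adjusted p-value are excluded.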
[0008] As a further aspect of the present invention: in step S2, the sgRNA library consists of the following parts:
Core candidate gene sgRNAs: four sgRNAs designed for each of the 431 core candidate genes, 1724 sgRNAs in total;
Epigenetic / functional regulatory gene sgRNAs: targeting 13 epigenetic or functional regulatory genes, 28 sgRNAs in total;
Positive control sgRNAs: Rpa3 and Pcna, 3 sgRNAs each, 6 sgRNAs in total;
Negative control sgRNAs: non-targeting sgRNAs, 173 in total;
Neutral control sgRNAs: 5 sgRNAs targeting the Rosa26 locus.
Together, these parts contain a total of 1936 sgRNAs.
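The library composition can be checked with a short tally. This is only an arithmetic sketch; the dictionary keys are descriptive labels chosen here, not identifiers from the patent.

```python
# Tally of the sgRNA library parts listed above.
library = {
    "core candidate genes": 431 * 4,
    "epigenetic / functional regulators": 28,
    "positive controls (Rpa3, Pcna)": 2 * 3,
    "negative controls (non-targeting)": 173,
    "neutral controls (Rosa26)": 5,
}

total = sum(library.values())
for name, count in library.items():
    # Show each part with its share of the whole library.
    print(f"{name}: {count} ({count / total:.1%})")
print("total:", total)
```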
[0009] As a further aspect of the present invention: In step S3, the library is constructed using Gibson Assembly recombination technology, in which the enzyme-digested and recovered pL-CRISPR.EFS.GFP vector is ligated with the PCR-enriched sgRNA library, and after transformation, 1.5E+06 clones are obtained, with a cloning fold of 774. NGS verification shows that the library coverage is 100% and the uniformity (Skew Ratio) is 1.93.
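The cloning fold follows from the definition used later in the description (total number of clones divided by the number of sgRNAs). A minimal check, assuming the reported value of 774 is the truncated quotient:

```python
# Cloning fold = total clones / number of sgRNAs in the library.
total_clones = 1.5e6
library_size = 1936

cloning_fold = total_clones / library_size
print(f"cloning fold: {cloning_fold:.1f}")
print("truncated:", int(cloning_fold))
```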
[0010] As a further embodiment of the present invention: In step S4, the lentivirus packaging uses transfer plasmid pL-CRISPR.EFS.GFP, packaging plasmid psPAX2 and envelope plasmid pMD2.G in a mass ratio of 4:3:2. PEI transfection reagent is used to transfect HEK293T cells. Viral supernatant is collected 48-72 hours after transfection, concentrated and purified, and the viral titer is determined by qPCR. The titer is not less than 2.16E+08 TU / mL.
[0011] As a further embodiment of the present invention: In step S5, the target cells are mouse breast cancer cell lines 4T07 that have been stably integrated with the SpCas9 gene via lentivirus and have been functionally verified. Lentiviral infection is performed with a functional MOI of 0.2-0.3, and 7 μg / mL of polybrene is added to improve the infection efficiency. After 24 hours of infection, the culture medium is replaced and cultured until the virus is fully integrated. Green fluorescent protein (GFP) positive cell populations are then sorted out for subsequent experiments.
[0012] As a further aspect of the present invention: in step S6, female BALB/c mice (6 weeks old) are used as the in vivo model. 1E+06 GFP-positive cells carrying the sgRNA library are orthotopically inoculated into the mammary fat pads (MFP) of the mice. A fixed calendar time point is used as the screening endpoint: tumor samples from all experimental animals are collected simultaneously at week 4 (day 28) after cell inoculation.
[0013] As a further aspect of the present invention: In step S8, the differential analysis uses the RRA algorithm to compare the changes in sgRNA abundance between the endpoint tumor sample and the T0 cell sample, and screens out genes that are significantly enriched in the endpoint sample as "positive screening genes" and genes that are significantly depleted in the endpoint sample as "negative screening genes".
[0014] A method for identifying tumor microenvironment adaptive target genes, using a CRISPR in vivo screening method to identify negative selection genes, wherein the negative selection genes are involved in, but are not limited to, the following biological processes or pathways: extracellular matrix remodeling, cytoskeleton regulation, chemotaxis, PI3K-Akt signaling pathway, TGF-beta signaling pathway, glycolysis / gluconeogenesis, HIF-1 signaling pathway, mitophagy, and cytokine-cytokine receptor interactions; the method is used for the development of antitumor drug targets.
[0015] One approach for CRISPR in vivo screening to identify genes for microenvironment adaptation includes: (1) Screening for tumor microenvironment adaptive functional genes, including screening and identifying key functional genes related to tumor microenvironment adaptation; (2) Identifying tumor therapeutic targets includes analyzing the survival, metabolism and immune escape mechanisms of tumor cells under hypoxic, nutrient competition, physical stress or immunosuppressive microenvironment conditions; (3) Applications in the preparation of tumor diagnostic markers, including providing data support and theoretical basis for the development of therapeutic targets, diagnostic markers or intervention strategies targeting the tumor microenvironment.
[0016] Compared with existing technologies, this invention provides a CRISPR in vivo screening method and its application for identifying microenvironment adaptation genes, which has the following beneficial effects: 1. By integrating bioinformatics data and based on differential expression data of dormant and proliferating tumor cells, core candidate genes were identified through rigorous differential screening criteria. Combined with a comprehensive and reasonable sgRNA design (covering core candidate genes, epigenetic / functional genes, and multiple controls), the study was able to accurately screen genes related to tumor microenvironment adaptation. A group of functional genes that showed significant abundance changes in the tumor microenvironment were successfully identified, providing key gene information for in-depth exploration of the adaptation mechanism of the tumor microenvironment.
[0017] 2. During library construction, strict control was maintained at every stage, from sgRNA design and synthesis to vector cloning and transformation. Through reasonable assembly methods, high transformation efficiency, and precise computational control, the final plasmid library achieved a total clone count of 1.5E+06, a fold increase of 774x, and 100% coverage as measured by NGS, with good homogeneity (Skew Ratio = 1.93), ensuring the high quality and comprehensiveness of the library.
[0018] 3. The library of this invention contains only 1936 sgRNAs. Compared to whole-genome libraries, this invention reduces the size by approximately 100 times. This significantly reduces the cell coverage required for a single in vivo screening (from tens of millions of cells to approximately 2 million cells), greatly saving on experimental animal costs and sequencing expenses while ensuring 100% coverage and good uniformity in NGS measurements, making complex in vivo screening more feasible and economical.
[0019] 4. A second-generation lentiviral packaging system was employed to precisely control the mass ratio of packaging plasmids to transfer plasmids. Suitable cells were selected for packaging, and transfection was performed using the PEI transfection reagent. Virus collection and processing methods were optimized, and viral activity was accurately assessed through viral titer determination. In the exploration of the MOI gradient, the functional MOI value was determined using a Poisson distribution model to ensure single-cell integration of a single sgRNA sequence. This provided optimal conditions for stable integration of the library into target cells and effective screening, improving the accuracy and specificity of the screening.
[0020] 5. A suitable mouse breast cancer cell line was selected as the target cell line. A cell library successfully integrating sgRNA was obtained through rigorous infection and sorting procedures. This library was then inoculated into the mammary fat pad of mice to establish an in situ tumor model, simulating the real in vivo microenvironment. During the screening process, a fixed calendar time point (week 4 post-inoculation) was used as a unified endpoint, replacing the traditional volume-triggered endpoint strategy. This fundamentally eliminated the problem of asynchronous sampling time caused by differences in tumor growth rates among individuals, ensuring that all animals in the same batch experienced the same duration of in vivo selection pressure. This significantly improved the reproducibility and statistical reliability of the screening results, while also complying with animal experiment ethics requirements, making the screening results closer to the actual in vivo situation.
[0021] 6. Through high-throughput library construction, sequencing, and differential analysis, combined with KEGG and GO enrichment analysis, we can not only accurately calculate changes in sgRNA read abundance and screen for differentially expressed genes, but also deeply analyze the functions and mechanisms of these genes from multiple levels, including biological processes and signaling pathways. The results showed that negatively selected genes were mainly enriched in several biological processes closely related to tumor microenvironment adaptation, involving multiple regulatory links of tumor cells in response to microenvironmental stress. This comprehensively revealed the adaptation strategies of tumor cells in the microenvironment and verified the technical feasibility and stability of the screening system.
[0022] The method of this invention has clearly defined parameters for each step and standardized operation, and can be implemented in laboratories with routine molecular biology experimental conditions, exhibiting good reproducibility.
Attached Figure Description
[0023] Figure 1 is a schematic diagram of the technical process of the CRISPR in vivo screening method for identifying microenvironment-adaptive genes proposed in this invention; Figures 2 to 16 relate to the same method.
Figure 2 is a statistical chart of the next-generation sequencing quality of the plasmid library.
Figure 3 is a distribution map of sgRNA read counts.
Figure 4 is the qPCR standard curve of the ALB standard.
Figure 5 is the qPCR standard curve of the RRE standard.
Figure 6 is a diagram of the optimal functional MOI confirmed by MOI gradient exploration.
Figure 7 is a distribution diagram of the raw data composition.
Figure 8 is a table of sequencing quality and error rate.
Figure 9 is a graph of the relationship between base quality value and error rate.
Figure 10 is a GC content distribution diagram.
Figure 11 is a summary table of sgRNA sequencing statistics.
Figure 12 is a distribution diagram of sequence alignment rates.
Figure 13 is a statistical chart of the number of missing sgRNAs.
Figure 14 is a Gini index distribution diagram.
Figure 15 is the list of negative screening genes.
Figure 16 is the list of positive screening genes.
Detailed Implementation
[0024] The technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments.
[0025] Example 1: A CRISPR in vivo screening method for identifying microenvironment-adaptive genes. First, the experimental materials are prepared, including:
1. Bioinformatics data
Differential expression data: from a study published in Cell (PMID: 35447074).
RNA-seq data: GEO database, accession GSE198715.
Comparison: dormant tumor cells vs. proliferating tumor cells.
2. Candidate gene screening materials
Screening criteria: after DESeq2 analysis, upregulated genes with log2FoldChange > 1 and padj < 0.05 were selected.
Final number of genes: 431.
3. sgRNA design materials
sgRNA design software: CHOPCHOP v3.
Genome version: mouse genome, mm10 (GRCm38).
Design principles: prefer sequences located near the 5' end of exons with a GC content of 40-60%, avoid runs of ≥4 consecutive T, and prioritize sequences with high off-target scores.
1) Core candidate genes
Number of genes: 431.
[0026] Design strategy: 4 sgRNAs per gene.
Total: 431 × 4 = 1724 sgRNAs.
2) Additional epigenetic / functional regulators
Thirteen epigenetic-related genes (such as chromatin remodeling factors), with a variable number of sgRNAs per gene: Kat2a (2), Chd3 (1), Chd6 (3), Baz1b (2), Baz2a (1), Ehmt1 (2), Hdac11 (1), Hdac2 (2), Prmt1 (1), Prmt2 (1), Smarca2 (5), Smarca4 (3), Brwd1 (4).
Total: 28 sgRNAs.
3) Positive controls
Rpa3 (3 sgRNAs) + Pcna (3 sgRNAs).
Total: 6 sgRNAs.
4) Negative controls
NC sgRNA-1 to NC sgRNA-173.
Total: 173 sgRNAs.
5) Neutral controls (Rosa26 controls)
Five sequences targeting the Rosa26 locus (Rosa_JS, Rosa_Addgene, Rosa_Asr_1, Rosa_Asr_2, Rosa_Asr_3).
Total: 5 sgRNAs.
4. Vector and library construction materials
CRISPR vector: the sgRNA delivery vector is pL-CRISPR.EFS.GFP (the U6 promoter drives the sgRNA, and the EFS promoter drives the GFP reporter gene).
Competent cell type: electroporation-competent cells.
Assembly method: the digested and recovered vector is recombined with the PCR-enriched library in a Gibson Assembly reaction.
Transformation efficiency (total clones, CFU): 1.5E+06.
Final total number of clones in the plasmid library: 1.5E+06.
Cloning fold: 774x.
NGS-measured coverage: 100%.
Uniformity: Skew Ratio = 1.93.
Mouse strain / age / sex: BALB/c, 6 weeks, female.
Number of cells injected: 1E+06.
In vivo screening time: 4 weeks (so that tumor cells undergo sufficient microenvironmental selection pressure in vivo, including hypoxia, nutrient competition, and stromal signaling, with the same duration of selection pressure across animals).
Endpoint sampling time: day 28 (week 4) after cell inoculation; endpoint sampling was performed simultaneously on all animals in both the library group and the empty vector group.
At this time, the tumor volume in the library group ranged from approximately 450 to 610 mm³, which yields enough cells to meet the NGS sequencing coverage depth requirement of at least 500× per sgRNA. A fixed time endpoint was used instead of a volume-triggered endpoint, eliminating the systematic interference of asynchronous sampling with sgRNA abundance comparisons.
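Two of the sgRNA design principles listed in Example 1 (GC content of 40-60% and no run of four or more consecutive T) are simple sequence rules and can be sketched as a pre-filter. This is not CHOPCHOP's own scoring; the function names and the three 20-nt sequences are hypothetical and serve only to exercise the rules.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases in a spacer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_basic_filters(seq, gc_min=0.40, gc_max=0.60):
    """Apply the GC-content window and the poly-T exclusion rule."""
    seq = seq.upper()
    if re.search(r"T{4,}", seq):
        # Four or more consecutive T are avoided by the design principles.
        return False
    return gc_min <= gc_fraction(seq) <= gc_max


# Hypothetical 20-nt spacers, invented for illustration only.
candidates = {
    "guide_ok": "GACCTGAAGCTGTACGATCA",
    "guide_polyT": "GACCTTTTGCTGTACGATCA",
    "guide_lowGC": "ATATTAGAAATCTAAGATCA",
}

for name, seq in candidates.items():
    print(name, f"GC={gc_fraction(seq):.2f}", passes_basic_filters(seq))
```

Position within the exon and off-target scoring are not modeled here; they would come from the design tool's own output.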
[0027] Example 2: A CRISPR in vivo screening method for identifying microenvironment-adaptive genes, comprising the following steps:
S1: Differential gene screening method
Download the GSE198715 expression matrix.
Perform differential analysis using DESeq2.
Filtering criteria: log2FoldChange > 1, padj < 0.05.
Extract the genes upregulated in dormant cells.
S2: CRISPR sub-library design method
Design four sgRNAs for each core candidate gene.
Add negative controls (173 × 1), positive controls (2 × 3), neutral controls (5), and additional epigenetic / functional gene controls (28).
Exclude sgRNAs with low off-target scores.
Add adapter sequences.
Synthesize the oligonucleotides in batch.
S3: Library construction process
Phase I objective: to synthesize a gRNA pool by adding adapters to the designed gRNA sequences.
1) A sufficient amount of PL615-Lib-synth was synthesized.
2) The amount of oligos obtained met expectations and could be used for enrichment from the chip-synthesized pool.
Phase II objective: to clone the synthesized gRNA pool into the specified vector.
1) Enrichment of sgRNA: dissolve the chip-synthesized dry powder in deionized water.
[0028] Perform the PCR reaction.
2) Preparation of the vector: enzyme digestion.
3) Recovery and purification.
4) Recombination reaction: the enzyme-digested and recovered vector and the PCR-enriched library were used in a recombination reaction.
5) Vector transformation
Preliminary experiment: transform competent cells by electroporation, plate dilutions, count colonies, and estimate the yield.
Formal experiment: the final number of clones in the plasmid library was 1.5E+06, which is 774 times the size of the library. Theoretical coverage is calculated as coverage = total number of clones / number of sgRNAs.
Large-scale plasmid extraction: approximately 1 mg of endotoxin-free plasmid was produced.
Results: 1 mg of endotoxin-free plasmid was extracted and dissolved in endotoxin-free deionized water.
Phase III objective: to perform amplicon PCR on the plasmid library, sequence and analyze the amplicons using NGS, and evaluate the plasmid library.
1) Using the plasmid library as a template, the amplicons required for NGS were amplified with a high-fidelity enzyme. After QC, the amplicons were of the correct size with a single visible band, conforming to sequencing standards.
2) The amplicons were sequenced by NGS (HiSeq), and the results were analyzed.
Results:
1) Next-generation sequencing of the samples was performed on an Illumina HiSeq instrument.
2) Coverage and uniformity analysis: bioinformatics analysis of the NGS data showed that, among the 1936 sgRNAs in the plasmid library, 0 had zero reads, giving 100% coverage. Two sgRNAs had duplicated sequences. 10% of the sgRNAs had ≤1132 reads and 90% had ≤2181 reads, giving a ratio (Skew Ratio) of 2181 / 1132 ≈ 1.93.
[0029] S4: Lentiviral Packaging & Titer Determination
1) Lentiviral packaging system: this study uses a second-generation lentivirus packaging system for virus production. The plasmids used in the packaging are:
Transfer plasmid: pL-CRISPR.EFS.GFP
Packaging plasmid: psPAX2
Envelope plasmid: pMD2.G
The mass ratio of transfer plasmid to packaging plasmid to envelope plasmid is 4:3:2.
2) Lentiviral packaging: HEK293T cells were selected as the virus packaging cells. Transfection was performed when the cell density was approximately 80%–90%, using PEI as the transfection reagent.
[0030] Add the following to each dish (based on a standard 15 cm dish):
Transfer plasmid (20 μg): pL-CRISPR.EFS.GFP library plasmid
Packaging plasmid (15 μg): typically psPAX2
Envelope plasmid (10 μg): usually pMD2.G
After transfection, change the culture medium and continue culturing for 48-72 hours, then collect the virus-containing supernatant.
3) Virus collection and processing: viral supernatant was collected twice, at 48 hours and 72 hours post-transfection. The supernatant was filtered through a 0.45 μm filter membrane.
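The per-dish plasmid masses (20, 15 and 10 μg) reduce to the stated 4:3:2 mass ratio. The sketch below verifies this and adds a hypothetical helper, `scale_masses`, for splitting a different total DNA mass in the same ratio; scaling to other dish formats is an assumption, not something the description specifies.

```python
from functools import reduce
from math import gcd

# Per-dish plasmid masses in micrograms, as listed for a 15 cm dish.
masses_ug = {"transfer": 20, "packaging (psPAX2)": 15, "envelope (pMD2.G)": 10}

# Reduce the masses to their smallest integer ratio.
divisor = reduce(gcd, masses_ug.values())
ratio = tuple(mass // divisor for mass in masses_ug.values())
print("mass ratio:", ratio)


def scale_masses(total_ug, ratio=(4, 3, 2)):
    """Split a total DNA mass (ug) according to the given mass ratio."""
    parts = sum(ratio)
    return tuple(total_ug * r / parts for r in ratio)


print(scale_masses(45))
```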
[0031] The virus treatment method involves ultrafiltration concentration or PEG8000 precipitation and purification of the original virus solution.
4) Virus titer determination
Detection method: 24 hours before the experiment, HT1080 cells were seeded in 24-well plates at approximately 8.0E+04 cells / well and cultured at 37°C under 5% CO2. Before the experiment, the cells were observed under a microscope to confirm that they were plump, evenly distributed, and free from contamination before proceeding with the subsequent steps.
[0032] Virus loading: Two gradients of virus stock solution were used for infection, with 0.5 μL and 0.05 μL of virus stock solution, and two replicates were used for each gradient; Virus culture medium change: Carefully change the culture medium 24 hours after viral infection; Genomic DNA was extracted 72 hours after viral infection; viral titer was detected by qPCR.
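A qPCR titer calculation can be sketched as follows. The per-cell copy number uses the formula given in Example 5 (2 × lentiviral copy number / ALB copy number). The conversion to TU/mL (cells at infection × copies per cell / virus volume in mL) is a commonly used convention and is an assumption here, as is taking the seeded cell number as the cell number at infection. The copy numbers in the example are hypothetical, not measurements from the patent.

```python
def copies_per_cell(lenti_copies, alb_copies):
    """Integrated lentiviral copies per cell; ALB is present at 2 copies per diploid cell."""
    return 2.0 * lenti_copies / alb_copies


def titer_tu_per_ml(cells_at_infection, copies_cell, virus_volume_ul):
    """Assumed conversion: TU/mL = cells x copies per cell / volume (mL)."""
    return cells_at_infection * copies_cell / (virus_volume_ul / 1000.0)


# Hypothetical qPCR copy numbers read off the RRE and ALB standard curves.
rre_copies = 5.0e4
alb_copies = 1.0e5

vcn = copies_per_cell(rre_copies, alb_copies)
titer = titer_tu_per_ml(cells_at_infection=8.0e4, copies_cell=vcn, virus_volume_ul=0.5)
print(f"copies per cell: {vcn:.2f}")
print(f"titer: {titer:.2e} TU/mL")
```

In practice the two virus volumes (0.5 μL and 0.05 μL) and their replicates would each give an estimate, and a well in the linear range would be chosen.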
[0033] S5: MOI gradient exploration & library target cell acquisition 4T07 breast cancer cells were selected as target cells for lentivirus infection of a library. To determine the infection conditions suitable for this CRISPR library screening system, gradient infection experiments were performed on lentiviral libraries.
[0034] Different viral loads were set, and the proportion of GFP-positive cells was detected by flow cytometry to assess lentiviral infection efficiency. The functional MOI was calculated using a Poisson distribution model.
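The Poisson model behind the functional MOI can be written out explicitly. If integrations per cell follow a Poisson distribution with mean MOI, the GFP-positive fraction is 1 − e^(−MOI), so MOI = −ln(1 − GFP-positive fraction). The following sketch (function names are chosen here for illustration) also shows why a low MOI favors one sgRNA per cell.

```python
import math


def functional_moi(gfp_positive_fraction):
    """Functional MOI from the GFP-positive fraction under a Poisson model."""
    return -math.log(1.0 - gfp_positive_fraction)


def infected_fraction(moi):
    """Expected fraction of cells with at least one integration."""
    return 1.0 - math.exp(-moi)


def single_copy_fraction_among_infected(moi):
    """Among infected cells, the fraction carrying exactly one integration."""
    return moi * math.exp(-moi) / (1.0 - math.exp(-moi))


for moi in (0.2, 0.3):
    print(
        f"MOI {moi}: infected {infected_fraction(moi):.1%}, "
        f"single-copy among infected {single_copy_fraction_among_infected(moi):.1%}"
    )
```

Under this model, roughly 90% of infected cells carry a single integration at MOI 0.2 and roughly 86% at MOI 0.3, which is consistent with choosing a functional MOI of 0.2-0.3.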
[0035] Methods for lentiviral infection and construction of the cell library for in vivo screening: the mouse breast cancer cell line 4T07 was selected as the target cell for infection experiments.
[0036] The day before infection, cells were seeded into culture flasks to ensure that the cell confluence was approximately 30% at the time of infection.
[0037] The CRISPR library lentivirus was added to the cell culture system at a calculated functional multiplicity of infection (MOI) of approximately 0.2, and polybrene was added to a final concentration of 7 μg / mL to enhance viral infection efficiency.
[0038] After 24 hours of infection, the culture medium was replaced with fresh complete culture medium for continued incubation.
[0039] T0 sample collection: the culture medium was changed 24 hours after infection and culture was continued. Preferably, after the virus had completed integration, a portion of the cells was collected as the T0 sample for subsequent baseline sgRNA abundance sequencing analysis.
[0040] Library-positive cell enrichment: the remaining cells were sorted by flow cytometry to obtain the GFP-positive cell library with successfully integrated sgRNA.
[0041] S6: Construction of the mouse in vivo screening model & collection of library samples
Using female BALB/c mice (6 weeks old) as the in vivo model, 1E+06 GFP-positive cells carrying the sgRNA library were orthotopically inoculated into the mammary fat pads (MFP) of the mice. A fixed calendar time point was used as the screening endpoint: at week 4 (day 28) after cell inoculation, all experimental animals were sampled synchronously. The fixed time endpoint was determined on the following basis:
(1) The tumor volume of the library group at week 4 ranged from 450 to 610 mm³, and the number of cells obtained from a single animal was not less than 3×10⁶, which meets the number of cells required to obtain a sequencing coverage depth of not less than 500× for each sgRNA in the library;
(2) Using a fixed time endpoint instead of a volume-triggered endpoint ensures that all animals in the same experimental batch experience the same duration of in vivo selection pressure and eliminates the systematic bias in sgRNA abundance caused by asynchronous sampling;
(3) When sampled at week 4, the tumor is still in the active proliferation period and the tissue DNA has good integrity, which meets the requirements for high-quality NGS library construction.
Animals in the empty vector group are sampled synchronously on the same calendar date; the tumors in the empty vector group are used only for volume comparison and not for NGS library construction.
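The coverage figures in this step can be checked with simple arithmetic, taking coverage as cells per sgRNA. This is a sketch using only numbers stated above (1936 sgRNAs, 1E+06 inoculated cells, at least 3×10⁶ recovered cells, 500× target).

```python
# Coverage here means the average number of cells per sgRNA.
library_size = 1936
cells_inoculated = 1_000_000
min_cells_recovered = 3_000_000
required_coverage = 500

cells_needed = library_size * required_coverage
coverage_at_inoculation = cells_inoculated / library_size
coverage_at_endpoint = min_cells_recovered / library_size

print("cells needed for 500x:", cells_needed)
print(f"coverage at inoculation: {coverage_at_inoculation:.0f}x")
print(f"coverage at endpoint (minimum): {coverage_at_endpoint:.0f}x")
```

Both the inoculum and the minimum endpoint yield exceed the 968,000 cells implied by 500× coverage of a 1936-sgRNA library.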
[0042] S7: High-throughput library preparation, sequencing, and differential analysis
1) Genomic DNA extraction: genomic DNA was extracted from the endpoint tumor tissue and from the cell samples collected at time T0.
[0043] 2) PCR amplification of the sgRNA region & next-generation sequencing library preparation: the sgRNA region integrated into the genome was amplified using specific primers. For PCR amplification products that do not contain any sequencing adapter sequences, after passing quality control, 20-50 ng of DNA fragments are taken and an Illumina platform high-throughput sequencing library is constructed through end repair, 5' phosphorylation, dA-tailing, adapter ligation, clean-up, and enrichment. To ensure that the gRNA sequences in the sample are fully amplified, genomic DNA should be extracted from a number of cells equal to 300× to 500× the number of gRNAs in the library.
[0044] For each reaction, 1 μg of genomic DNA was amplified (using as many reaction tubes as needed until the genomic DNA was completely consumed). After introducing spacer bases of different lengths and partial sequencing adapters, a second round of amplification was performed to introduce the complete sequencing adapters and indices. After purification, the high-throughput sequencing library for the Illumina platform was obtained.
S8: Bioinformatics analysis of the sequencing results
Differential analysis, in comparison with the T0 sample:
Calculate the changes in sgRNA read abundance.
Calculate the log2 fold change.
Perform statistical analysis (RRA algorithm).
Perform KEGG and GO enrichment analyses on the significantly enriched or depleted genes.
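The abundance comparison in S8 starts from a per-sgRNA log2 fold change between the endpoint and T0 samples. The sketch below shows only that first step, assuming counts-per-million normalization and a pseudocount of 1 (both assumptions); it does not implement the RRA ranking statistics, and the sgRNA names and read counts are hypothetical.

```python
import math


def counts_per_million(counts):
    """Normalize raw read counts to counts per million within one sample."""
    total = sum(counts.values())
    return {sgrna: c * 1e6 / total for sgrna, c in counts.items()}


def log2_fold_changes(t0_counts, endpoint_counts, pseudocount=1.0):
    """log2(endpoint / T0) on normalized counts, with a pseudocount."""
    t0 = counts_per_million(t0_counts)
    end = counts_per_million(endpoint_counts)
    return {
        sgrna: math.log2((end.get(sgrna, 0.0) + pseudocount) / (t0[sgrna] + pseudocount))
        for sgrna in t0
    }


# Hypothetical read counts for four sgRNAs.
t0 = {"sgA": 500, "sgB": 500, "sgC": 500, "sgD": 500}
endpoint = {"sgA": 100, "sgB": 500, "sgC": 900, "sgD": 500}

lfc = log2_fold_changes(t0, endpoint)
for sgrna, value in lfc.items():
    print(sgrna, round(value, 2))
```

A negative value indicates depletion at the endpoint (a candidate negative screening sgRNA) and a positive value indicates enrichment; gene-level significance would then be assessed by a dedicated tool.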
[0045] Example 3: A CRISPR in vivo screening method for identifying microenvironment-adaptive genes, based on the experiments in Examples 1 and 2. The experimental item in this example is: next-generation sequencing statistics of the plasmid library sample. The experimental steps include:
1. Amplicon preparation: Using the plasmid library as a template, PCR amplification was performed with a high-fidelity enzyme to obtain the amplicon fragments required for NGS. Quality control requirements: the amplified products must be of the correct size, appear as a single band, and meet sequencing standards.
2. Next-generation sequencing: High-throughput sequencing of the amplicon samples was performed on the Illumina HiSeq platform. Sequencing strategy: paired-end sequencing to ensure sequence accuracy.
3. Data analysis process: Raw data quality control: low-quality reads were filtered out to obtain clean data. sgRNA sequence alignment: sequencing reads were matched against the designed sgRNA library. Statistical metrics calculation: key parameters such as coverage, uniformity, and sequencing quality.
4. Experimental results: as shown in Figure 2, where:
Raw bases (G): the number of bases in the raw data, in units of G;
Raw reads: the number of reads in the raw data;
gRNA Clean_reads: the number of reads in the filtered clean data;
Effective rate (%): the proportion of clean reads obtained by filtering relative to the raw reads;
gRNA Clean bases (G): the number of bases in the gRNA clean data, in units of G;
Match reads: the number of reads that match perfectly;
Match read rate (%): the proportion of gRNA reads that align to the gRNA library;
Grna_mean_depth: the average sequencing depth of gRNAs, i.e., the total number of gRNA reads aligned to the reference sequences divided by the number of gRNAs matched in the gRNA library;
Max_depth: the highest sequencing depth of gRNA, i.e., the number of aligned reads for the gRNA with the most aligned reads among all gRNAs in the gRNA library;
Median_depth: the median sequencing depth of gRNAs;
Totalsgrnas: the total number of gRNAs in the gRNA library;
Zerosgrnas: the number of gRNAs in the sgRNA library that are missing (zero reads);
Q20 (%): the percentage of bases with a sequencing quality value greater than 20 out of the total bases;
Q30 (%): the percentage of bases with a sequencing quality value greater than 30 out of the total bases;
Error rate (%): the average error rate over all bases;
GC (%): the total GC base content.
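As an illustration of how the depth metrics defined above relate to a table of per-sgRNA read counts, the following Python sketch computes them from a dictionary of counts. The function name and the toy data are hypothetical, and taking the median over matched sgRNAs only (rather than over the whole library) is an assumption, since the source does not specify it.

```python
from statistics import median


def library_depth_stats(counts, library_size):
    """Summarize per-sgRNA read counts using the metric names from Figure 2.

    counts: mapping of sgRNA id -> number of perfectly matched reads.
    library_size: number of sgRNAs in the designed library (Totalsgrnas).
    """
    matched = [reads for reads in counts.values() if reads > 0]
    total_matched_reads = sum(matched)
    return {
        "Totalsgrnas": library_size,
        # sgRNAs with zero reads, including ids absent from the count table
        "Zerosgrnas": library_size - len(matched),
        # total aligned reads divided by the number of matched sgRNAs
        "Grna_mean_depth": total_matched_reads / len(matched) if matched else 0.0,
        "Max_depth": max(matched, default=0),
        # assumption: median taken over matched sgRNAs only
        "Median_depth": median(matched) if matched else 0,
    }


# Toy example (hypothetical): a five-guide library with two missing guides.
toy_counts = {"sg_a": 10, "sg_b": 20, "sg_c": 0, "sg_d": 30}
stats = library_depth_stats(toy_counts, library_size=5)
print(stats)
```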
[0046] Example 4: A CRISPR in vivo screening method for identifying microenvironment-adaptive genes, based on the experiments in Examples 1 and 2. The experimental item in this example is: plasmid library coverage and uniformity analysis. The experimental steps include:
1. The sequencing reads were compared with the designed library of 1936 sgRNAs;
2. The number of valid matching sequences was counted;
3. Coverage was calculated (number of detected sgRNAs / total number of designed sgRNAs);
4. The experimental results are shown in Figure 3. Bioinformatics analysis of the NGS data revealed that, among the 1936 sgRNAs in the plasmid library, none had zero reads, giving 100% coverage. Two sgRNAs had repeated sequences, and the total number of sgRNAs included in the library was 1936. 10% of the sgRNAs had ≤1132 reads and 90% had ≤2181 reads, giving a 90th-to-10th percentile ratio of 2181 / 1132 ≈ 1.93.
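The coverage and uniformity figures above follow from simple arithmetic on the per-sgRNA read counts, sketched below in Python. The nearest-rank percentile rule is an assumption (the source does not state how the 10% and 90% thresholds were derived), and the function names are hypothetical. Only the two read-count thresholds, 1132 and 2181, come from the text.

```python
import math


def coverage(counts, library_size):
    """Fraction of designed sgRNAs detected with at least one read."""
    detected = sum(1 for reads in counts.values() if reads > 0)
    return detected / library_size


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100 (assumed definition)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def skew_ratio(counts):
    """Ratio of the 90th-percentile to the 10th-percentile read count."""
    values = list(counts.values())
    p10 = nearest_rank_percentile(values, 10)
    p90 = nearest_rank_percentile(values, 90)
    return math.inf if p10 == 0 else p90 / p10


# Reproduce the ratio reported in the text from its two thresholds.
reported_ratio = round(2181 / 1132, 2)
print(reported_ratio)

# Toy library (hypothetical): ten guides with read counts 1..10.
toy_counts = {f"sg{i}": i for i in range(1, 11)}
print(coverage(toy_counts, library_size=10), skew_ratio(toy_counts))
```

A ratio close to 1 means the sgRNAs are evenly represented, while larger values indicate a more skewed library.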
[0047] Example 5: A CRISPR in vivo screening method for identifying microenvironment-adaptive genes, based on the experiments in Examples 1 and 2. The experimental item in this example is: lentivirus packaging and qPCR detection of viral titer. The testing steps include:
1. Using a second-generation lentiviral packaging system, viral stock was obtained by transfecting HEK293T cells (transfer plasmid pL-CRISPR.EFS.GFP, packaging plasmid psPAX2, and envelope plasmid pMD2.G in a mass ratio of 4:3:2). Genomic DNA was extracted from HT1080 cells 72 hours after infection.
2. Using the ALB gene as an internal control, standard curves were plotted with the RRE and ALB standards, respectively. The qPCR reaction volume was 20 μL, containing: 10 μL of ChamQ SYBR Color qPCR Master Mix (Vazyme, catalog number Q411) with ROX reference dye, 0.4 μL each of 10 μM forward and reverse primers, and 1.2 μL of genomic DNA, made up to volume with deionized water. The reaction was run on an ABI 7500 real-time PCR instrument as follows: pre-denaturation at 95℃ for 3 minutes, followed by 40 cycles of amplification (95℃ for 15 seconds → 60℃ for 30 seconds), with fluorescence signals collected during the extension phase. After amplification, melting curve analysis was performed to confirm amplification specificity. Finally, the lentiviral copy number per cell was calculated using the formula: "Lentiviral copy number per cell = 2 × Lentiviral copy number / ALB copy number".
3. Test results: The ALB gene was selected as the internal reference gene for qPCR in the detection.
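The calculation chain of this example (standard curve → copy numbers → copies per cell → titer) can be sketched in Python as follows. The per-cell formula is the one quoted above and the titer formula is the one given in paragraph [0048]. The log-linear standard-curve form Ct = slope × log10(copies) + intercept is the usual qPCR convention and is an assumption here, as are all function names and the numeric inputs. The factor 2 presumably reflects the two ALB alleles in a diploid genome.

```python
def copies_from_ct(ct, slope, intercept):
    """Invert a log-linear standard curve: Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def lentiviral_copies_per_cell(lentiviral_copies, alb_copies):
    """Lentiviral copy number per cell = 2 x lentiviral copies / ALB copies."""
    if alb_copies <= 0:
        raise ValueError("ALB copy number must be positive")
    return 2.0 * lentiviral_copies / alb_copies


def titer_tu_per_ml(cells_infected, copies_per_cell, virus_volume_ml):
    """Titer = infected cells x copies per cell / amount of virus used (mL)."""
    if virus_volume_ml <= 0:
        raise ValueError("virus volume must be positive")
    return cells_infected * copies_per_cell / virus_volume_ml


# Hypothetical inputs, for illustration only (not data from the source).
rre_copies = copies_from_ct(ct=31.0, slope=-3.0, intercept=40.0)
per_cell = lentiviral_copies_per_cell(lentiviral_copies=500.0, alb_copies=2000.0)
titer = titer_tu_per_ml(cells_infected=1e5, copies_per_cell=per_cell, virus_volume_ml=0.001)
print(rre_copies, per_cell, titer)
```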
[0048] A standard curve was plotted using the ALB standard, as shown in Figure 4; a standard curve was plotted using the RRE standard, as shown in Figure 5. The experimental data were analyzed, and the lentiviral and ALB copy numbers were calculated from the standard curves. Titer calculation formulas: Lentiviral copy number per cell = 2 × Lentiviral copy number / ALB copy number; Titer (TU / mL) = Number of cells infected with lentivirus × Lentiviral copy number per cell / Amount of virus used. The test results are calculated as shown in the table below:
Example 6: A CRISPR in vivo screening method for identifying microenvironment-adaptive genes, based on the experiments in Examples 1 and 2. The experimental item in this example is: exploration of the MOI gradient of the lentiviral library. The experimental steps include:
1. 4T07 breast cancer cells were selected as target cells. Different amounts of virus were added to produce a gradient of infection. The infection efficiency was assessed by detecting the proportion of GFP-positive cells using flow cytometry, and the functional MOI value was calculated using a Poisson distribution model.
2. To ensure that a single cell integrates a single sgRNA sequence, a functional MOI of 0.2–0.3 was ultimately selected as the formal screening criterion.
[0049] 3. The low-MOI infection strategy ensures that most cells integrate only a single sgRNA sequence; 4. The experimental results are shown in Figure 6.
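The relationship between the GFP-positive fraction, the functional MOI, and single-copy integration in Example 6 can be illustrated with a short Python sketch. The source states only that a Poisson distribution model was used. The specific formulas below (MOI = −ln(1 − GFP-positive fraction), and the zero-truncated Poisson share of cells with exactly one integration) are the standard textbook forms and are assumed here. Function names are hypothetical.

```python
import math


def functional_moi(gfp_positive_fraction):
    """Poisson estimate of MOI from the fraction of infected (GFP-positive) cells.

    P(no integration) = exp(-MOI), so MOI = -ln(1 - fraction infected).
    """
    if not 0.0 <= gfp_positive_fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -math.log(1.0 - gfp_positive_fraction)


def single_integration_share(moi):
    """Among infected cells, the share that carry exactly one integration."""
    if moi <= 0:
        raise ValueError("MOI must be positive")
    p_exactly_one = moi * math.exp(-moi)
    p_at_least_one = 1.0 - math.exp(-moi)
    return p_exactly_one / p_at_least_one


for moi in (0.2, 0.3):
    print(moi, round(single_integration_share(moi), 3))
```

Under this model, at a functional MOI of 0.2 to 0.3 roughly 86% to 90% of infected cells carry a single integration, which is consistent with the stated aim that most cells integrate only one sgRNA.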
[0050] Example 7: A CRISPR in vivo screening method for identifying microenvironment-adaptive genes, based on the experiments in Examples 1 and 2. The experimental item in this example is: evaluation of library construction and sequencing data. The experimental steps include:
1. sgRNA sequences were first extracted from the raw data. Valid sequences were obtained from both the merged paired-end reads and the reads that could not be merged, and the proportions of clean reads and discarded reads were counted.
2. Subsequent base quality analysis showed that Q20 was ≥98.77% and Q30 was ≥96.71% for all samples. Error rate distribution statistics showed that the average error rate of each sample was between 0.26% and 0.35%, and the GC content ranged from 54.29% to 54.61%.
3. Finally, the abundance distribution of sgRNAs in the T0 sample and the endpoint tumor samples (PT-1 / 2 / 3) was compared, and the Gini index and the number of missing sgRNAs were calculated.
4. Experimental results: gRNA data extraction and statistics; the composition of the raw data is shown in Figure 7, where:
(1) Clean reads from combined reads: the number of gRNA sequences extracted from reads whose two ends were successfully merged, and their proportion in the raw data;
(2) Clean reads from uncombined reads: the number of gRNA sequences extracted from reads whose two ends failed to merge, and their proportion in the raw data (depending on the library construction method);
(3) Discard reads: reads that do not contain gRNA sequences, and their proportion in the raw data.
[0051] The sequencing quality and error rate distribution statistics are shown in Figure 8; the base quality values and error rates are shown in Figure 9; the base content distribution statistics are shown in Figure 10; the summary table of sgRNA sequencing statistics for the T0 versus endpoint samples is shown in Figure 11; the mapping ratio distribution is shown in Figure 12; the distribution of the number of missing sgRNAs is shown in Figure 13; the distribution of the Gini index for each sample is shown in Figure 14; the negative selection genes in the list of genes with significant changes are shown in Figure 15; and the positive selection genes are shown in Figure 16.
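The two per-sample evenness metrics used in Example 7, the Gini index and the number of missing sgRNAs, can be computed from a list of per-sgRNA read counts as in the Python sketch below. This is the standard Gini coefficient applied to raw counts. Whether the source computed it on raw or log-transformed counts is not stated, so that choice is an assumption, and the function names and toy data are hypothetical.

```python
def gini_index(read_counts):
    """Gini coefficient of sgRNA read counts: 0 = perfectly even, near 1 = dominated by few guides."""
    values = sorted(read_counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted counts (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def missing_sgrnas(read_counts):
    """Number of sgRNAs with zero reads in the sample."""
    return sum(1 for value in read_counts if value == 0)


# Toy samples (hypothetical): an even T0-like sample and a skewed endpoint-like sample.
even_sample = [5, 5, 5, 5]
skewed_sample = [0, 0, 0, 10]
print(gini_index(even_sample), missing_sgrnas(even_sample))
print(gini_index(skewed_sample), missing_sgrnas(skewed_sample))
```

An increase in both values from T0 to the endpoint tumors would be expected when in vivo selection and engraftment bottlenecks reduce library diversity.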
[0052] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes, characterized in that it includes the following steps:
S1: Obtain transcriptome sequencing data of dormant tumor cells and proliferating tumor cells from public databases, perform differential expression analysis using differential expression analysis tools, and screen upregulated differential genes with log2FoldChange > 1 and padj < 0.05 as the core candidate gene set;
S2: Design at least two sgRNAs for each gene in the core candidate gene set, and introduce positive controls, negative controls, neutral controls and epigenetic / functional regulatory gene controls to form an sgRNA library;
S3: Clone the sgRNA library into a lentiviral vector, and evaluate the coverage and uniformity of the library by transformation, large-scale plasmid extraction, and next-generation sequencing (NGS);
S4: Package the CRISPR library plasmids using a second-generation lentivirus packaging system, and determine the viral titer by quantitative polymerase chain reaction (qPCR);
S5: Infect target cells stably expressing Cas9 protein with the lentiviral library at a low functional multiplicity of infection (MOI), obtain positive cell populations with stably integrated sgRNA by flow cytometry sorting, and collect cell samples at time T0;
S6: Orthotopically inoculate the sorted cells into immunocompetent syngeneic mice (BALB / c), and collect tumor samples at a fixed time endpoint (4 weeks after inoculation) to ensure that the tumors in the library group undergo sufficient microenvironmental selection pressure of equal duration in vivo;
S7: When the screening endpoint is reached, collect tumor tissue, extract genomic DNA, amplify the integrated sgRNA region by PCR, and construct an NGS sequencing library;
S8: Compare and quantitatively analyze the sequencing data of the T0 and endpoint samples; by calculating the log2 fold change and statistical significance of sgRNA abundance, screen genes that are significantly enriched or depleted under microenvironmental selection pressure, and perform functional enrichment analysis.
2. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes according to claim 1, characterized in that, in step S1, the transcriptome sequencing data come from public databases or from tumor cell transcriptome data obtained by in-house sequencing. The groups compared are dormant tumor cells and proliferating tumor cells. After differential expression analysis, the upregulated core candidate genes are screened out.
3. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes according to claim 1, characterized in that, in step S2, the sgRNA library consists of the following parts:
Core candidate gene sgRNAs: four sgRNAs designed for each of the 431 core candidate genes, for a total of 1724 sgRNAs;
Epigenetic / functional regulatory gene sgRNAs: targeting 13 epigenetic or functional regulatory genes, totaling 28 sgRNAs;
Positive control sgRNAs: targeting Rpa3 and Pcna, 3 each, for a total of 6 sgRNAs;
Negative control sgRNAs: non-targeting sgRNAs, totaling 173;
Neutral control sgRNAs: 5 sgRNAs targeting the Rosa26 site;
The above parts contain a total of 1936 sgRNAs.
4. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes according to claim 1, characterized in that, in step S3, the library was constructed using Gibson Assembly recombination technology. The enzyme-digested and recovered pL-CRISPR.EFS.GFP vector was ligated with the PCR-enriched sgRNA library. After transformation, 1.5E+06 clones were obtained, with a cloning fold of 774. NGS verification showed that the library coverage was 100% and the uniformity (Skew Ratio) was 1.93.
5. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes according to claim 1, characterized in that, in step S4, the lentivirus packaging used transfer plasmid pL-CRISPR.EFS.GFP, packaging plasmid psPAX2, and envelope plasmid pMD2.G in a mass ratio of 4:3:2. PEI transfection reagent was used to transfect HEK293T cells. Viral supernatant was collected 48-72 hours after transfection, concentrated and purified, and the viral titer was determined by qPCR. The titer was not lower than 2.16E+08 TU / mL.
6. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes according to claim 1, characterized in that, in step S5, the target cells were the mouse breast cancer cell line 4T07. Lentiviral infection was performed at a functional MOI of 0.2-0.3, and 7 μg / mL of polybrene was added to improve the infection efficiency. After 24 hours of infection, the culture medium was changed and the cells were cultured until the virus was fully integrated. Green fluorescent protein (GFP) positive cell populations were then sorted for subsequent experiments.
7. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes according to claim 1, characterized in that, in step S6, female BALB / c mice (6 weeks old) were used as the in vivo model. 1E+06 GFP-positive cells carrying the sgRNA library were orthotopically inoculated into the mammary fat pads (MFP) of the mice. A fixed calendar time point was used as the screening endpoint: tumor samples from all experimental animals were collected simultaneously at week 4 (day 28) after cell inoculation.
8. A CRISPR in vivo screening method for identifying microenvironment-adaptive genes according to claim 1, characterized in that, in step S8, the differential analysis uses the RRA algorithm to compare the changes in sgRNA abundance between the endpoint tumor samples and the T0 cell sample, and identifies genes that are significantly enriched in the endpoint samples as "positive selection genes" and genes that are significantly depleted in the endpoint samples as "negative selection genes".
9. A method for identifying tumor microenvironment adaptive target genes, characterized in that the method described in any one of claims 1-8 is used to identify negative selection genes, and the negative selection genes are involved in biological processes or pathways including but not limited to: extracellular matrix remodeling, cytoskeleton regulation, chemotaxis, PI3K-Akt signaling pathway, TGF-beta signaling pathway, glycolysis / gluconeogenesis, HIF-1 signaling pathway, mitophagy, and cytokine-cytokine receptor interaction; the method is used for the development of antitumor drug targets.
10. An application of the method according to any one of claims 1-9 in CRISPR in vivo screening for identifying microenvironment-adaptive genes, comprising:
(1) Screening for tumor microenvironment adaptive functional genes, including screening and identifying key functional genes related to tumor microenvironment adaptation;
(2) Identifying tumor therapeutic targets, including analyzing the survival, metabolism and immune escape mechanisms of tumor cells under hypoxic, nutrient-competition, physical-stress or immunosuppressive microenvironment conditions;
(3) Applications in the preparation of tumor diagnostic markers, including providing data support and a theoretical basis for the development of therapeutic targets, diagnostic markers or intervention strategies targeting the tumor microenvironment.