Rapid screening method for reprogramming factors based on CRISPR screening combined with single-cell sequencing analysis

By combining CRISPR screening with single-cell sequencing, the problems of low efficiency, limited throughput, and insufficient resolution in existing technologies for screening reprogramming factors have been solved. This approach enables efficient, comprehensive, and accurate screening of reprogramming factors and reveals the molecular regulatory network of the reprogramming process.

CN122303404A · Pending · Publication Date: 2026-06-30 · HANGZHOU WUWEN QINGXIN ARTIFICIAL INTELLIGENCE BASIC TECHNOLOGY RESEARCH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HANGZHOU WUWEN QINGXIN ARTIFICIAL INTELLIGENCE BASIC TECHNOLOGY RESEARCH CO LTD
Filing Date
2026-03-25
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing technologies for screening reprogramming factors are inefficient and limited in throughput. They struggle to resolve heterogeneity and molecular regulatory networks at the single-cell level, lack stage specificity, and are costly, and therefore cannot comprehensively analyze the reprogramming process.

Method used

The method combines CRISPR screening with single-cell sequencing. Genome-wide CRISPR gene knockout and/or activation libraries are constructed, target cells are infected at a low multiplicity of infection (MOI), cells are sampled at multiple time points for single-cell sequencing, and dimensionality reduction and pseudo-time analysis are used to identify key factors.

Benefits of technology

This method enables high-throughput, high-resolution, and stage-specific screening of reprogramming factors, improving screening efficiency and accuracy. It also reveals the molecular regulatory network and synergistic effects of the reprogramming process, providing a tool for a deeper understanding of the reprogramming mechanism.



Abstract

This invention discloses a rapid screening method for reprogramming factors based on combined CRISPR screening and single-cell sequencing analysis, belonging to the field of biotechnology. The method includes: constructing a genome-wide CRISPR gene knockout and / or gene activation lentiviral library; infecting target cells at a low multiplicity of infection (MOI) and then selecting positive cells; introducing reprogramming factors to initiate reprogramming and dynamically sampling cells at multiple key time points; constructing single-cell transcriptome and sgRNA libraries for high-throughput sequencing; integrating single-cell data from all time points and using dimensionality reduction, clustering, and pseudo-time analysis to define cell states and construct reprogramming trajectories; and correlating sgRNA perturbation information with cell states to identify key factors that promote or inhibit reprogramming at specific stages. This invention achieves high-throughput, high-resolution, and stage-specific reprogramming factor screening across the entire genome, significantly improving screening efficiency and accuracy.

Description

Technical Field

[0001] This invention relates to biotechnology, and more particularly to a rapid screening method for reprogramming factors based on a combination of CRISPR screening and single-cell sequencing analysis.

Background Technology

[0002] Somatic cell reprogramming into induced pluripotent stem cells (iPS cells) is a major breakthrough in regenerative medicine, with enormous application prospects in disease modeling, drug screening, and cell therapy. Since Yamanaka first discovered in 2006 that four transcription factors (Oct4, Sox2, Klf4, and c-Myc) could reprogram mouse fibroblasts into pluripotent stem cells, iPS cell technology has developed rapidly.

[0003] However, somatic cell reprogramming is an extremely complex biological process involving multiple levels of molecular events, including epigenetic remodeling, metabolic reprogramming, and alterations in signaling pathways. The reprogramming process can be broadly divided into initiation, maturation, and stabilization stages, each controlled by a sophisticated gene regulatory network. Currently, the precise molecular regulatory mechanisms involved in reprogramming are not fully understood, and many potential reprogramming factors remain to be discovered.

[0004] Existing techniques for screening reprogramming factors mainly suffer from the following defects and shortcomings: First, traditional screening methods are inefficient and have limited throughput. Traditional gene function research methods mainly employ a gene-by-gene validation strategy, observing the impact of overexpressing or knocking out candidate genes on reprogramming efficiency. This method is time-consuming and labor-intensive, and cannot meet the needs of systematic screening across the entire genome.

[0005] Second, existing methods struggle to comprehensively analyze the molecular regulatory networks at each stage of reprogramming. Reprogramming is a multi-step, asynchronous process involving multiple intermediate states. Traditional population-level analysis methods can only reflect the average changes in the cell population, failing to capture heterogeneity at the single-cell level and making it difficult to accurately identify genes that play a key role at specific time points or transitional stages.

[0006] Third, the cost is high and the resolution is insufficient. Although CRISPR screening technology has improved screening throughput to some extent in recent years, existing CRISPR screening methods are mainly based on population-level phenotypic screening and cannot resolve the effects of gene perturbation at single-cell resolution. This makes it difficult to discover many reprogramming factors that play a key role in individual cells but are masked at the population level.

[0007] Fourth, there is a lack of stage specificity. Current technologies cannot pinpoint the specific stage of reprogramming in which the selected genes play a role, limiting a deeper understanding of the reprogramming mechanism and the development of targeted regulatory strategies.

[0008] Therefore, developing an efficient, comprehensive, high-resolution method that can accurately screen reprogramming factors is of great scientific significance and practical application value for a deeper understanding of reprogramming mechanisms, optimization of reprogramming schemes, and discovery of new reprogramming factors.

Summary of the Invention

[0009] Purpose of the invention: The purpose of this invention is to provide a rapid, efficient, and high-resolution screening method for identifying key factors regulating various stages of cell reprogramming across the entire genome.

[0010] Technical solution: A rapid screening method for reprogramming factors based on combined CRISPR screening and single-cell sequencing analysis, including the following steps:
S1. Construct a CRISPR gene knockout sgRNA lentiviral library and / or a CRISPR gene activation sgRNA lentiviral library covering the entire genome.
S2. Infect the target cell population with the lentiviral library at a low multiplicity of infection (MOI) of 0.3-0.5, ensuring that each cell receives at most one sgRNA. After infection, screen to obtain positive cells.
S3. Introduce the reprogramming factor into the screened cells and sample the cells at multiple key time points during the reprogramming induction process.
S4. Prepare single-cell suspensions from cell samples collected at each time point, construct libraries containing both cellular cDNA and sgRNA amplicons, and perform high-throughput sequencing.
S5. Integrate and analyze single-cell transcriptome data from all time points, define different cell states during the reprogramming process using dimensionality reduction and cluster analysis, and construct the reprogramming trajectory using pseudo-time series analysis.
S6. Associate the sgRNA perturbation information of each single cell with its cell state and position on the pseudo-time trajectory, and statistically analyze the enrichment or depletion of specific gene perturbations at different time points and in different cell states to identify genes with significant regulatory effects at specific stages or state transitions.

[0011] Furthermore, the CRISPR gene activation sgRNA lentiviral library is constructed using the dCas9-VPR system or the dCas9-SAM system.

[0012] Furthermore, the reprogramming factor includes one or more of Oct4, Sox2, Klf4, and c-Myc, which are introduced into cells via electroporation.

[0013] Furthermore, the key time points include day 0, day 3, day 7, day 10, and day 14 after reprogramming induction.

[0014] Furthermore, the single-cell suspension preparation utilizes the 10x Genomics Chromium platform for single-cell capture, lysis, and reverse transcription.

[0015] Furthermore, the high-throughput sequencing is performed on the Illumina NovaSeq platform, with a transcriptome sequencing depth greater than 50,000 reads / cell and with the sgRNA library sequenced deeply enough to ensure sufficient coverage.

[0016] Furthermore, the dimensionality reduction analysis includes UMAP or t-SNE analysis, and the pseudo-time series analysis employs the Monocle or Slingshot method.

[0017] Furthermore, in step S6, statistical models, including mixed-effects models or negative binomial regression models, are applied to correct for batch effects and library complexity.

[0018] Furthermore, step S7 is included: based on the analysis results, candidate genes with significant promoting or inhibiting effects in key stages of reprogramming or state transitions are screened out, and single-gene CRISPR-KO or overexpression verification is performed.

[0019] A reprogramming factor obtained according to the above method, which plays a promoting or inhibiting role in the initial, intermediate or stable stages of reprogramming.

[0020] Beneficial effects: (1) High throughput and systematicity: Utilizing CRISPR-KO and CRISPRa libraries, this invention enables high-throughput, unbiased screening of all genes knocked out and activated across the entire genome. It can screen tens of thousands of genes at once and directly link them to specific stages, significantly shortening the screening cycle, improving screening efficiency, and accelerating the discovery of key regulatory factors. Compared to traditional gene-by-gene verification methods, this invention offers significant advantages in terms of time and labor costs.

[0021] (2) High resolution and dynamism: By combining single-cell sequencing technology, the impact of gene perturbation (knockout or activation) on the reprogramming process can be analyzed at the single-cell level, accurately capturing the heterogeneity in the cell population and identifying reprogramming factors that play a key role in individual cells but may be masked at the population level, greatly improving the accuracy and sensitivity of screening. At the same time, this invention can analyze the heterogeneity and dynamic changes of the reprogramming process at single-cell resolution, accurately capturing the cell state at different time points.

[0022] (3) Stage-specific identification: By sorting and analyzing single cells at different time points of reprogramming (corresponding to the initiation, maturation, and stabilization stages), the specific role of the screened genes in the reprogramming process can be clearly identified. This helps to gain a deeper understanding of the complex molecular regulatory network in the reprogramming process and provides precise targets for targeted regulation of the reprogramming process. This is an advantage that traditional bulk (population-level) screening cannot achieve.

[0023] (4) Comprehensiveness and unbiasedness: Whether it is gene knockout based on CRISPR or gene activation based on dCas9, both operate on the whole genome of the target cell, and can unbiasedly screen out all genes that may affect each stage of reprogramming, avoiding the problem of traditional methods only focusing on known candidate genes and missing important regulatory factors.

[0024] (5) Revealing synergistic effects and networks: Single-cell data can be used to analyze the effects of multiple gene perturbations (in the same cell or different cells) on cell state, which helps to discover synergistic gene networks and provides a new perspective for systematically understanding the molecular mechanisms of reprogramming.

Description of the Drawings

[0025] Figure 1 is a schematic diagram of the method flow of the present invention.

Detailed Implementation

[0026] To make the technical solution of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0027] Example: As shown in Figure 1, a rapid screening method for reprogramming factors based on combined CRISPR screening and single-cell sequencing analysis includes the following steps. Step 1: Constructing the CRISPR screening library. For the target cells (such as fibroblasts), construct lentiviral libraries of CRISPR gene knockout (CRISPR-KO) sgRNAs and / or CRISPR gene activation (CRISPRa) sgRNAs covering the entire genome. The CRISPRa system can be either the dCas9-VPR system or the dCas9-SAM system. The library design should include multiple sgRNAs per gene to improve the reliability of screening.

[0028] Step 2: Library Transduction and Selection. The constructed lentiviral library was used to infect the target cell population at a low multiplicity of infection (MOI = 0.3-0.5) so that each cell received at most one sgRNA, achieving a "one cell, one sgRNA" correspondence. At 72 hours post-infection, uninfected cells were removed by adding a selection drug (such as puromycin) or by fluorescence-based sorting, yielding a positive cell population carrying sgRNAs.
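As a rough check on the "one cell, one sgRNA" goal, the sketch below assumes that the number of lentiviral integrations per cell follows a Poisson distribution with mean equal to the MOI. This is a common simplification and is not stated in the patent. Under that assumption, about 86% of infected cells carry exactly one sgRNA at MOI 0.3 and about 77% at MOI 0.5, so a minority of infected cells may still carry more than one. The function names are illustrative.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations (Poisson assumption)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def infection_profile(moi: float) -> dict:
    """Fractions of uninfected, singly infected, and multiply infected cells."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    infected = 1.0 - p0
    return {
        "uninfected": p0,
        "single": p1,
        "multiple": 1.0 - p0 - p1,
        # Fraction of cells surviving selection that carry exactly one sgRNA.
        "single_among_infected": p1 / infected,
    }


# The MOI range named in the method.
for moi in (0.3, 0.5):
    profile = infection_profile(moi)
    summary = ", ".join(f"{name}={value:.3f}" for name, value in profile.items())
    print(f"MOI={moi}: {summary}")
```

Lower MOI values raise the single-sgRNA fraction but leave more cells uninfected, so more starting cells are needed for the same library representation.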

[0029] Step 3: Reprogramming Induction and Dynamic Sampling. Classic reprogramming factors (such as Oct4, Sox2, Klf4, and c-Myc) were introduced into the selected cells via electroporation to initiate the reprogramming process. During the reprogramming induction process, samples were taken at several key time points (e.g., day 0, day 3, day 7, day 10, and day 14 post-induction). At each time point, a subset of cells was collected for subsequent single-cell analysis.

[0030] Step 4: Single-cell sample preparation and sequencing. Single-cell suspensions were prepared from cell samples collected at each time point to ensure cell viability and monodispersity. Single-cell capture, cell lysis, and reverse transcription were performed using the 10x Genomics Chromium platform to construct a library containing both cellular cDNA (for transcriptome analysis) and sgRNA amplicons (for sgRNA identification). High-throughput sequencing was performed on the Illumina NovaSeq platform, with a transcriptome sequencing depth greater than 50,000 reads / cell and with sgRNA sequencing deep enough to accurately identify the sgRNA carried by each cell.
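The patent does not specify how the sgRNA carried by each cell is called from the sgRNA amplicon reads. One simple possibility, sketched below, is to assign a cell its dominant sgRNA only when that sgRNA has enough UMIs and accounts for most of the cell's sgRNA UMIs. The thresholds (`min_umis`, `min_fraction`), the guide names, and the function name are hypothetical.

```python
def assign_sgrna(umi_counts: dict, min_umis: int = 3, min_fraction: float = 0.8):
    """Return the dominant sgRNA for one cell, or None if the call is ambiguous.

    umi_counts maps sgRNA identifiers to UMI counts observed in that cell.
    Both thresholds are illustrative assumptions, not values from the patent.
    """
    total = sum(umi_counts.values())
    if total == 0:
        return None
    top_guide, top_count = max(umi_counts.items(), key=lambda item: item[1])
    if top_count >= min_umis and top_count / total >= min_fraction:
        return top_guide
    return None


# Toy per-cell sgRNA UMI counts with made-up guide names.
cells = {
    "cell_A": {"sgOct4_1": 12, "sgNT_3": 1},   # one clearly dominant guide
    "cell_B": {"sgKlf4_2": 5, "sgSox2_1": 4},  # two guides: doublet or multiple infection
    "cell_C": {"sgNT_3": 1},                   # too few UMIs to call
}
assignments = {cell: assign_sgrna(counts) for cell, counts in cells.items()}
print(assignments)
```

Cells without a confident call would typically be excluded from the perturbation analysis rather than assigned to a guide.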

[0031] Step 5: Single-cell data analysis. An integrated analysis of single-cell transcriptome data from all time points was performed, including: (1) Cell state definition: Using dimensionality reduction analysis (such as UMAP or t-SNE) and cluster analysis methods, the different cell states / clusters that appear during the reprogramming process are identified and defined, such as the initial fibroblast state, early intermediate state, late intermediate state, pluripotent stem cell-like state, mature iPSC state, and arrested / apoptotic state.

[0032] (2) Trajectory construction: Pseudo-time series analysis (such as Monocle or Slingshot) is used to construct reprogramming trajectories and to clarify the transition relationships and temporal order between different cell states.
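To illustrate what a pseudo-time value represents, the sketch below uses a deliberately simplified stand-in: each cell is projected onto the straight line from the centroid of the starting state to the centroid of the end state in a low-dimensional embedding. This is not the Monocle or Slingshot algorithm, which fit branching or curved trajectories. The cell names, coordinates, and function names are hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length coordinate lists."""
    count = len(points)
    return [sum(p[i] for p in points) / count for i in range(len(points[0]))]


def linear_pseudotime(embedding, start_cells, end_cells):
    """Project cells onto the start-to-end centroid axis.

    Returns a value per cell, scaled so the start centroid maps to 0
    and the end centroid maps to 1.
    """
    start = centroid([embedding[c] for c in start_cells])
    end = centroid([embedding[c] for c in end_cells])
    axis = [e - s for s, e in zip(start, end)]
    norm_sq = sum(a * a for a in axis)
    if norm_sq == 0:
        raise ValueError("start and end centroids coincide")
    return {
        cell: sum((x - s) * a for x, s, a in zip(coords, start, axis)) / norm_sq
        for cell, coords in embedding.items()
    }


# Toy 2-D embedding (for example, two principal components).
embedding = {
    "fib_1": [0.0, 0.0], "fib_2": [0.0, 2.0],
    "mid_1": [5.0, 1.0],
    "ipsc_1": [10.0, 0.0], "ipsc_2": [10.0, 2.0],
}
pseudotime = linear_pseudotime(embedding, ["fib_1", "fib_2"], ["ipsc_1", "ipsc_2"])
print(sorted(pseudotime.items(), key=lambda item: item[1]))
```

A linear projection cannot represent branches such as the arrested / apoptotic state, which is one reason dedicated trajectory tools are used in practice.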

[0033] (3) Gene perturbation effect association analysis: The sgRNA perturbation carried by each single cell (i.e., knockout or activation of a specific gene) is associated with the cell's state / cluster and its position on the pseudo-time trajectory. For each gene perturbation (represented by its sgRNAs), the enrichment or depletion of the cells carrying it is statistically analyzed across time points and cell states / clusters.

[0034] Enrichment analysis principle: Significant enrichment of a gene's knockout sgRNAs in a certain stage (such as an early intermediate state) suggests that the gene may be an inhibitor of the progression of that stage (after knockout, cells are more likely to enter that state). Significant enrichment of a gene's activation sgRNAs in a certain stage suggests that the gene may be a promoting factor for the progression of that stage (after activation, cells are more likely to enter that state). Conversely, significant depletion may indicate that the gene is either a promoting factor (cells have difficulty entering the state after knockout) or a repressive factor (cells have difficulty entering the state after activation).
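The enrichment logic above can be sketched with a hypergeometric test on cell counts. This is one simple option and is not necessarily the statistical model used in the patent, which elsewhere mentions mixed-effects and negative binomial regression models. The cell counts, the `alpha` cutoff, and the function names are illustrative assumptions, and a genome-wide screen would also need multiple-testing correction.

```python
import math


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(exactly k of n sgRNA-carrying cells lie in a state holding K of N cells)."""
    return math.comb(K, k) * math.comb(N - K, n - k) / math.comb(N, n)


def enrichment_test(k: int, n: int, K: int, N: int, alpha: float = 0.05) -> dict:
    """One-sided hypergeometric tests for enrichment and depletion.

    N: all assigned cells, K: cells in the state of interest,
    n: cells carrying the sgRNA, k: cells carrying the sgRNA and in the state.
    """
    lowest, highest = max(0, n + K - N), min(n, K)
    p_enriched = sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))
    p_depleted = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
    if p_enriched < alpha:
        direction = "enriched"
    elif p_depleted < alpha:
        direction = "depleted"
    else:
        direction = "not significant"
    return {"expected": n * K / N, "p_enriched": p_enriched,
            "p_depleted": p_depleted, "direction": direction}


# Interpretation table following the enrichment analysis principle.
ROLE = {
    ("KO", "enriched"): "candidate inhibitor",
    ("KO", "depleted"): "candidate promoter",
    ("activation", "enriched"): "candidate promoter",
    ("activation", "depleted"): "candidate inhibitor",
}


def interpret(library: str, direction: str) -> str:
    return ROLE.get((library, direction), "no call")


# Toy example: 25 of 50 knockout-carrying cells fall in a state holding 200 of 1000 cells.
result = enrichment_test(k=25, n=50, K=200, N=1000)
print(result["direction"], "->", interpret("KO", result["direction"]))
```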

[0035] Step 6: Identification and Validation of Key Factors. Based on the single-cell RNA sequencing analysis, a list of candidate genes with significant promoting or inhibiting effects in key stages or state transitions of reprogramming was selected. These genes are the key factors regulating each stage of reprogramming.

[0036] To further verify the reliability of the screening results, top-ranked candidate factors (such as Top 5 promoting factors and Top 5 repressing factors) can be selected and single-gene CRISPR-KO or overexpression experiments can be performed in target cells to observe their effects on reprogramming efficiency and speed, and to verify the accuracy of the screening results.
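Selecting the top-ranked candidates for validation can be sketched as follows, assuming each sgRNA already has a score such as a log2 enrichment of its cells in a state of interest. The patent does not define the score or the aggregation rule. Taking the median across a gene's sgRNAs and requiring at least two sgRNAs per gene are assumptions, and the gene and guide names are made up.

```python
from collections import defaultdict
from statistics import median


def rank_genes(sgrna_scores: dict, top_n: int = 5, min_guides: int = 2):
    """Aggregate sgRNA-level scores to gene level and return the extremes.

    sgrna_scores maps (gene, sgrna_id) to a score, where a positive value
    means cells carrying that sgRNA are enriched in the state of interest.
    Returns (most_enriched, most_depleted) as lists of (gene, median score).
    """
    by_gene = defaultdict(list)
    for (gene, _sgrna_id), score in sgrna_scores.items():
        by_gene[gene].append(score)
    # Genes supported by too few sgRNAs are dropped as unreliable.
    gene_scores = {gene: median(scores) for gene, scores in by_gene.items()
                   if len(scores) >= min_guides}
    ranked = sorted(gene_scores.items(), key=lambda item: item[1], reverse=True)
    most_enriched = [(g, s) for g, s in ranked if s > 0][:top_n]
    most_depleted = [(g, s) for g, s in reversed(ranked) if s < 0][:top_n]
    return most_enriched, most_depleted


# Toy scores with hypothetical gene and guide names.
scores = {
    ("GeneA", "g1"): 2.0, ("GeneA", "g2"): 1.0,
    ("GeneB", "g1"): -1.5, ("GeneB", "g2"): -0.5,
    ("GeneC", "g1"): 3.0,  # single guide only, excluded by min_guides
}
most_enriched, most_depleted = rank_genes(scores)
print(most_enriched, most_depleted)
```

Whether an enriched gene counts as a promoting or repressing factor depends on the library type, as described in the enrichment analysis principle.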

[0037] This invention provides a revolutionary high-throughput, high-resolution screening method that, by cleverly combining CRISPR whole-genome knockout / activation screening with single-cell transcriptome sequencing, can rapidly and systematically identify gene factors that play key regulatory roles (promoting or inhibiting) at various specific stages of cell reprogramming. This method overcomes the limitations of traditional screening methods and provides a powerful tool for a deeper understanding of reprogramming mechanisms, optimization of reprogramming schemes, and discovery of new reprogramming factors, showing broad prospects in basic research and regenerative medicine applications.

[0038] The embodiments described above are merely illustrative of several implementations of the present invention, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these modifications and improvements all fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent should be determined by the appended claims.

Claims

1. A rapid screening method for reprogramming factors based on CRISPR screening combined with single-cell sequencing analysis, characterized in that it includes the following steps:
S1. Construct a CRISPR gene knockout sgRNA lentiviral library and / or a CRISPR gene activation sgRNA lentiviral library covering the entire genome.
S2. Infect the target cell population with the lentiviral library at a low multiplicity of infection (MOI) of 0.3-0.5, ensuring that each cell receives at most one sgRNA. After infection, screen to obtain positive cells.
S3. Introduce the reprogramming factor into the screened cells and sample the cells at multiple key time points during the reprogramming induction process.
S4. Prepare single-cell suspensions from cell samples collected at each time point, construct libraries containing both cellular cDNA and sgRNA amplicons, and perform high-throughput sequencing.
S5. Integrate and analyze single-cell transcriptome data from all time points, define different cell states during the reprogramming process using dimensionality reduction and cluster analysis, and construct the reprogramming trajectory using pseudo-time series analysis.
S6. Associate the sgRNA perturbation information of each single cell with its cell state and position on the pseudo-time trajectory, and statistically analyze the enrichment or depletion of specific gene perturbations at different time points and in different cell states to identify genes with significant regulatory effects at specific stages or state transitions.

2. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that the CRISPR gene activation sgRNA lentiviral library is prepared using either the dCas9-VPR system or the dCas9-SAM system.

3. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that the reprogramming factors include one or more of Oct4, Sox2, Klf4, and c-Myc, which are introduced into cells via electroporation.

4. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that the key time points include day 0, day 3, day 7, day 10, and day 14 after reprogramming induction.

5. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that the single-cell suspension is processed on the 10x Genomics Chromium platform for single-cell capture, lysis, and reverse transcription.

6. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that the high-throughput sequencing is performed on the Illumina NovaSeq platform, with a transcriptome sequencing depth of more than 50,000 reads / cell and with sgRNA sequencing deep enough to ensure sufficient coverage.

7. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that the dimensionality reduction analysis includes UMAP or t-SNE analysis, and the pseudo-time series analysis uses the Monocle or Slingshot method.

8. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that, in step S6, the statistical models used include mixed-effects models or negative binomial regression models, which are used to correct for batch effects and library complexity.

9. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to claim 1, characterized in that it also includes step S7: based on the analysis results, candidate genes with significant promoting or inhibiting effects in key stages of reprogramming or state transitions are screened, and single-gene CRISPR-KO or overexpression verification is performed.

10. The rapid screening method for reprogramming factors based on CRISPR screening and single-cell sequencing combined analysis according to any one of claims 1 to 9, characterized in that the factors play a promoting or inhibiting role in the initial, intermediate, or stable stages of reprogramming.