Methods for detecting incorrectly-edited polynucleotide sequence
Rolling Circle Amplification (RCA) enables efficient and cost-effective detection and characterization of incorrectly-edited polynucleotide sequences, addressing the inefficiencies of existing methods by providing high-throughput and universal applicability for both on-target and off-target effects in genetic-editing procedures.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- COUNTAGEN AB
- Filing Date
- 2025-12-16
- Publication Date
- 2026-06-25
Abstract
Description
[0001] METHODS
[0002] The present invention relates to methods for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure. The invention also relates to kits, populations of DNA molecules, and uses related to detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure. The present invention also relates to methods for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure. The invention also relates to kits, populations of DNA molecules, and uses related to detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
[0003] Performing genome engineering with precision and accuracy has become the cornerstone for developing effective treatments for genetic diseases. The recent discovery and rapid development of engineered endonucleases, such as Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-based genetic editing, have significantly facilitated the precise targeted modification of gene sequences, with many so-called Cell and Gene Therapies (CGTs) reaching clinical and commercial stages.
[0004] CRISPR-Cas RNA-guided nucleases (RGNs) are powerful genome-editing tools with a wide range of research and potential clinical applications. Despite continued advancements in therapeutic delivery, precision, and efficacy, there remains a generalized concern about the specificity of genetic editing and how targeted procedures might affect or modify unintended genomic targets, posing potential risks of genotoxicity. These risks need to be carefully characterized throughout the entire development cycle of CGTs. For example, when S. pyogenes Cas9 nuclease introduces double-stranded DNA breaks (DSBs) near the protospacer adjacent motif (PAM), repair via non-homologous end joining (NHEJ) can lead to variable-length insertion or deletion mutations (indels).
[0005] For clinical applications, such as gene therapy, detecting even low-frequency alterations (such as indels and / or translocations) is crucial for several reasons. The unintended genetic-editing alterations to the genome can potentially cause new mutations or disrupt other important genes, potentially leading to adverse effects, such as cancer or genetic disorders. Detecting these alterations ensures that only the intended modifications are made (Atkins et al. 2021, Front. Genome Ed. 3:673022) and helps minimize these risks, improving the overall safety of the therapy. In addition, verifying that the intended genetic modification has been successfully made ensures that the therapy will be effective. This is particularly important for treating genetic disorders where precise corrections are needed (Chehelgerdi et al. 2024, Mol Cancer 23(9)). Also, accurate detection and documentation of genetic-editing alterations may be necessary to meet regulatory standards and guidelines. Regulatory agencies often require thorough safety and efficacy evaluations of genetic-editing therapies and detecting and / or characterizing genetic-editing alterations (both on-target and off-target) is essential for meeting these standards and ensuring safety of the therapy.
[0006] Unintended alterations from genetic editing can generally be grouped into two categories: on-target and off-target. Importantly, the unintended alterations can occur at both on-target and off-target sites. On-target alterations refer to alterations that occur at the intended editing site. Off-target alterations, on the other hand, include alterations at unintended locations within the genome due to the gene-editing procedure.
[0007] Identifying alterations (such as insertions or deletions ("indels"), translocations and / or complex genomic rearrangements) across the genome remains a significant challenge. In the context of clinical applications in humans, the human genome comprises over three billion base pairs, making it difficult to comprehensively search for off-target alterations. Small off-target alterations can occur anywhere, including non-coding regions, possibly at very low frequencies, complicating detection.
[0008] On-target editing is commonly characterized by Amplicon Sequencing, where techniques like Sanger sequencing or Illumina sequencing are used to analyse 100-1000 base pairs (bp) around the intended genetic-editing target. Such approaches are however not capable of detecting off-target alterations that have occurred at genomic locations beyond the sequenced region and further from the intended genetic-editing site.
[0009] A number of methods have been developed to assess off-target effects, allowing a genome-wide survey of the outcomes for CRISPR-based editing procedures. These methods can be divided into six main categories:
[0010] (i) Whole Genome Sequencing (WGS);
[0011] (ii) in vitro genomic DNA cleavage;
(iii) anchored-primer target enrichment;
[0012] (iv) in situ end-capture;
[0013] (v) chromatin immunoprecipitation sequencing (ChIP-seq); and
[0014] (vi) translocation enrichment.
[0015] WGS enables full-genome surveying, but it is often considered inefficient due to a limited sensitivity to detect low frequency alterations (low signal-to-noise ratio). The majority of the data collected by WGS corresponds to unedited genomic DNA, which significantly reduces coverage depth at sequence locations of interest. As a result, WGS is limited by throughput, cost, and efficiency.
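The effect of coverage depth on the detection of low-frequency alterations can be illustrated with a minimal Python sketch. It assumes a simple binomial model in which each read at a locus independently carries the alteration with a probability equal to its frequency, and it ignores sequencing error. The function name, the 30x and 5000x depths, the 0.1% frequency and the three-read calling threshold are illustrative assumptions and are not taken from this disclosure.

```python
from math import comb


def prob_at_least_k_variant_reads(depth: int, allele_freq: float, k: int) -> float:
    """Probability that at least k of `depth` reads carry the alteration.

    Assumes a binomial model: reads are independent and each one carries
    the alteration with probability `allele_freq`.
    """
    # Sum the probabilities of observing 0 .. k-1 variant reads.
    p_fewer_than_k = sum(
        comb(depth, i) * allele_freq**i * (1.0 - allele_freq) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer_than_k


# Hypothetical scenario: an alteration present in 0.1% of genomes,
# called only when at least 3 supporting reads are seen.
p_30x = prob_at_least_k_variant_reads(depth=30, allele_freq=0.001, k=3)
p_5000x = prob_at_least_k_variant_reads(depth=5000, allele_freq=0.001, k=3)

print(f"30x coverage:   {p_30x:.2e}")
print(f"5000x coverage: {p_5000x:.2f}")
```

Under these assumptions the alteration is almost never called at typical WGS depth, whereas a depth in the thousands would be needed at every locus to call it reliably, which is consistent with the throughput and cost limitation described above.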
[0016] In vitro genomic DNA cleavage techniques, such as Digenome-seq (Kim et al., 2015, Nat Methods 12, 237-243) and SITE-Seq (Cameron et al., 2017, Nat Methods 14, 600-606) rely on CRISPR-induced cleavage of genomic DNA, followed by ligation of sequencing adaptors to enrich and increase coverage in these regions via PCR-based workflows. CIRCLE-seq (Tsai et al., 2017, Nat Methods 14, 607-614) involves circularizing genomic DNA fragments and performing in vitro cleavage downstream, so only cleaved circles are available for library preparation. However, these methods often detect more unintended activity in vitro than occurs in cellular environments and do not account for endogenous DNA repair mechanisms.
[0017] Anchored-primer target enrichment methods, such as GUIDE-seq (Tsai et al., 2015, Nat Biotechnol 33, 187-197), tag and enrich cleavage sites using the non-homologous end-joining (NHEJ) repair pathway, simulating natural gene-editing patterns. Tagged cleavage sites are then amplified by PCR and prepared for sequencing. GUIDE-seq has reported issues with low reproducibility, particularly in complex biological systems like primary cells.
[0018] In situ end-capture methods, such as BLISS (Yan et al., 2017, Nat Commun 8, 15058) and INDUCE-seq (Dobbs et al., 2022, Nat Commun 13, 3989), label cleavage sites directly on fixed and permeabilized cells and tissues. Cleavage sites are tagged via in situ ligation, followed by DNA extraction and library preparation. However, BLISS and INDUCE-seq detect cleavage events at specific time points, which may not represent the full extent of events over time.
[0019] ChIP-seq identifies protein-DNA interactions by using antibodies to capture proteins bound to DNA, followed by sequencing to map their binding sites. DISCOVER-Seq (Wienert et al., 2020, Nat Protoc 15, 1775-1799) targets the MRE11 protein, which detects CRISPR-induced DNA breaks prior to repair. Although DISCOVER-Seq may be more precise in challenging contexts like primary cells or in vivo systems where other methods (such as GUIDE-seq) may struggle, it has a sensitivity threshold of 0.3%. In addition, research suggests that only a minority of the sites identified by ChIP-seq correspond to actual cleavage by active Cas9 (Duan et al. 2014, Cell Res. 24(8): 1009-12).
[0020] Translocation enrichment-based methods, such as HTGTS (High-Throughput, Genome-Wide Translocation Sequencing) (Hu et al., 2016, Nat Protoc 11, 853-871) and UDiTaS (Unbiased Detection of Insertions, Translocations, and Deletions) (Giannoukos et al., 2018, BMC Genomics 19, 212) are designed to detect large-scale genomic alterations, such as translocations, insertions, and deletions. HTGTS works by capturing translocations that occur when different chromosomes fuse at cleavage sites. UDiTaS uses universal and gene-specific primers combined with Tn5 transposase, PCR amplification and sequencing to simultaneously detect indels, translocations, and large rearrangements. Both methods offer a view of genomic changes beyond simple cleavage site identification, encompassing the outcomes of DNA repair, which are crucial for evaluating the broader impact of gene-editing events. However, these techniques miss smaller cleavage events.
[0021] Importantly, a number of the above methods, including Digenome-seq, SITE-Seq, CIRCLE-seq, GUIDE-seq, BLISS, INDUCE-seq and ChIP-seq, use "next generation sequencing" as a means to collect and interpret data. However, the high cost of sequencing makes these methods impractical and difficult to scale, particularly when dealing with large cell populations or complex organisms. Furthermore, sequencing-based methods fail to detect some off-target alterations, often due to the nature of site targeting and the specific conditions under which each method operates (Kim et al., 2016, Genome Res. 26(3):406-15); involve multiple steps and procedures which add complexity and inefficiency; or use enzymatic reactions known to introduce errors at a rate similar to that of off-target alterations, potentially masking or falsely calling off-targets. Thus, sequencing-based methods tend to provide variable results depending on the method used (introducing bias), and are not yet practical for routine use due to complexity and cost.
[0022] In practice, it is often necessary to use a combination of methods to gain a complete picture of gene-editing outcomes. However, this multi-method approach is time-consuming, expensive, and not universally applicable across different experimental contexts. Furthermore, variability in sensitivity, reproducibility, and resolution among these techniques poses additional challenges for regulatory approval and clinical translation of gene therapies.
[0023] Therefore, there is a need for unbiased, universal, genome-wide methods that can comprehensively and accurately detect and / or characterise one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure, including both on-target and off-target effects, across a range of cell types and experimental conditions. Such advancements would not only streamline the characterization process but also improve safety and efficacy evaluations, paving the way for more reliable therapeutic applications of genome engineering.
[0024] Against this background, the present inventors have developed methods, uses and kits for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure, which are surprisingly advantageous. As explained below and in the accompanying examples, the method of the invention detects and / or characterises one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure using Rolling Circle Amplification (RCA).
[0025] RCA has been applied to cancer profiling (Huang et al., 2020, Nanoscale, 12 (4), 2445-2451), and the detection of pathogens (Neumann et al., 2018, Clin. Chem, clinchem.2018.292979) for potential use in clinical diagnostics. However, RCA has not previously been used for detecting and / or characterising incorrectly-edited polynucleotide sequences from a genetic-editing procedure. Prior to the present invention, that was not deemed possible because, for RCA to work, it is necessary to design and use a probe that targets the incorrectly-edited site in the genetically-edited polynucleotide sequence; without knowing where in the polynucleotide sequence that incorrectly-edited site was, it was not possible to design such a probe. For that reason, prior to the present invention, those in the field were (as already discussed above) developing methods for detecting incorrectly-edited sites that are much more expensive, complicated and inefficient than the present invention, which uses the relatively cheap, simple and effective RCA process.
[0026] As explained below, and shown in the accompanying Examples, the present invention provides methods, uses and kits that have an unbiased ability to detect and / or characterise incorrectly-edited polynucleotide sequences from a genetic-editing procedure across the whole genome (genome-wide), and that are cost effective and practical enough for routine use. Advantageously, the generated Rolling Circle Amplification Products (RCPs) can be analysed directly via optical imaging, or undergo further processing steps, such as sequencing, if required. Conveniently, further steps, such as enrichment, can be performed to provide even more accurate data.
[0027] In a first aspect, the invention provides a method for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure, the method comprising the steps of:
[0028] (i) providing a sample from a genetic-editing procedure, the sample comprising one or more incorrectly-edited polynucleotide sequence;
[0029] (ii) performing Rolling Circle Amplification, to generate one or more RCA-Products from the one or more incorrectly-edited polynucleotide sequence in the sample; and
[0030] (iii) detecting and / or characterising the one or more incorrectly-edited polynucleotide sequences based on the one or more RCA-Products generated in step (ii).
[0031] Thus, the invention provides a method for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure; the method is based on RCA which, as shown in the accompanying Examples, provides a number of advantages when compared to current methods. As explained in detail below, RCA-based methods are highly-specific, amenable to multiplexed reactions and high-throughput analysis / screening, and can be performed with standard equipment and inexpensive reagents.
[0032] The present invention therefore provides particularly advantageous methods for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure, that address shortcomings of the prior art. Unlike the prior art techniques, and as explained below and in the accompanying examples, the present inventors' method avoids the multiple complex steps using expensive and specialist laboratory equipment. Notably, the present invention can be performed without the need for DNA sequencing; but where sequencing is desired, it is compatible with sequencing platforms, sequencing protocols and instruments.
[0033] Step (i) of the method of the invention comprises providing a sample from a genetic-editing procedure. By "genetic-editing procedure" we include any procedure in which a target polynucleotide sequence is "edited" by addition and / or deletion and / or mutation and / or translocation of one or more nucleotide base in that molecule. The term "genetic-editing" also encompasses "genome-editing" and "gene-editing". Genetic-editing can lead to various types of effects in an organism, depending on the technology and the goals of the modification. Genetic-editing can involve, but is not limited to: disabling, or "knocking out", a specific gene to study its function or eliminate its effects (for example, knocking out a gene that causes a disease); adding, or inserting, a gene into a genome to introduce beneficial traits (for example, disease resistance in crops); replacing a faulty gene with a healthy copy (often used in gene therapy to treat genetic disorders); and / or making precise changes to the DNA sequence (such as correcting a mutation). Several genome editing tools have been developed, including ZFNs, TALENs, and the now-popular CRISPR / Cas9 system, and each of these genetic-editing tools are included under what is referred to herein as "genetic-editing procedure".
[0034] By "a sample from a genetic-editing procedure" we include a sample comprising or consisting of the resulting product of the genetic-editing procedure. As will be appreciated, such a sample may comprise one or more incorrectly-edited polynucleotide sequences. Furthermore, the sample may also comprise one or more unedited polynucleotide sequence and / or one or more correctly-edited polynucleotide sequence.
[0035] Many genetic-editing processes (such as CRISPR-Cas9) involve two main steps:
[0036] 1. Breaking DNA: an enzyme with catalytic properties (such as Cas9 protein guided by a specific guide RNA sequence (sgRNA)) binds to the target polynucleotide sequence and introduces a Double-Strand Break (DSB) at that location. This break is important as it signals the cell's repair mechanisms to repair the broken polynucleotide sequence in Step 2; and
[0037] 2. Repairing DNA: after the DSB is created, the cell's natural repair processes attempt to repair it. There are two primary pathways for repairing a DSB: a. Non-Homologous End Joining (NHEJ): an error-prone repair mechanism that directly rejoins the broken DNA ends without a template, often resulting in small insertions or deletions (indels); or b. Homology-Directed Repair (HDR): a more precise repair mechanism that uses a homologous DNA template to repair the DSB, allowing for specific edits such as gene correction or insertion.
[0038] By "incorrectly-edited polynucleotide sequence", we include a polynucleotide sequence that comprises an unintended alteration at any given nucleotide compared to the original, unedited polynucleotide sequence.
[0039] Importantly, and as explained above, the incorrect-editing (unintended alteration) can occur at both on-target and off-target sites. By "on-target" site, we include the specific, intended location in a polynucleotide sequence where the gene-editing tool, such as CRISPR-Cas9, was designed (for example, by engineering gRNA complementary to a specific target sequence) to make a modification. By "off-target" site, we include any location in a polynucleotide sequence, other than the intended on-target site, where unintended alterations may occur (for example, due to inaccurate activity of the gene-editing tool). In other words, incorrect editing may happen when a gene-editing tool makes an unintended alteration at the intended on-target site, and / or makes an unintended alteration at an off-target site. Therefore, the incorrectly-edited polynucleotide sequence may be a polynucleotide sequence that has undergone any on-target and / or off-target error during the gene-editing process.
[0040] When incorrect-editing takes place at and / or within the on-target site, it may include one or more from the group comprising: a replication error, mispairing, mutation (including insertions or deletions), chromosomal rearrangement, translocation or aberration. The incorrect-editing at and / or within the on-target site may include a single nucleotide alteration, or the incorrect-editing may include more than one nucleotide alteration at and / or within the on-target site.
[0041] When incorrect-editing takes place at and / or within an off-target site, it may include one or more from the group comprising: a replication error, mispairing, mutation (including insertions or deletions), chromosomal rearrangement, translocation or aberration. The incorrect-editing at and / or within an off-target site may include a single nucleotide alteration, or the incorrect-editing may include more than one nucleotide alteration at and / or within an off-target site. Examples of off-target site incorrect-editing are presented in Figure 6. The alteration within an on-target site and / or off-target site may also include formation of an unintended DSB, without any sequence alteration. As shown in the accompanying Examples, the present method is particularly advantageous in detecting and / or characterising DSBs in a polynucleotide sequence.
[0042] In some embodiments of the methods disclosed herein, the incorrectly-edited polynucleotide sequence comprises at least one alteration at any given nucleotide compared to the original polynucleotide sequence. In some embodiments of the methods disclosed herein, the incorrectly-edited polynucleotide sequence comprises 1 to 5 alterations compared to the original polynucleotide sequence, 1 to 10 alterations compared to the original polynucleotide sequence, or 1 to 100 alterations compared to the wild type polynucleotide sequence.
[0043] By "polynucleotide sequence" we include any biopolymer composed of nucleotide monomers in a chain, for example DNA and / or cDNA and / or RNA. Each polynucleotide sequence may be present in the sample in an amount of from about 0.01 ng to about 5000 ng, such as from about 0.01 ng to about 2000 ng, for example from about 0.1 ng to about 1000 ng. Put another way, in step (i) each polynucleotide sequence may be present in an amount of from about 0.01 ng / µL to about 500 ng / µL, such as from about 0.05 ng / µL to about 250 ng / µL.
[0044] The term "detecting" is used herein to include the step of determining the presence of the incorrectly-edited polynucleotide sequence in the target polynucleotide molecule. It will be understood that in the methods of the invention the target polynucleotide molecule is detected by detecting the RCA-Products of Step (ii), which serve as a "reporter" for the incorrectly-edited polynucleotide sequence. Accordingly, detecting the RCA-Products in Step (iii) may include determining, measuring, assessing, or assaying the presence or absence or amount or location of the RCA-Products in any way. The presence of RCA-Product in the sample (i.e. the confirmation of its presence or amount) is indicative or identificatory of the presence of the incorrectly-edited polynucleotide sequence in the target polynucleotide molecule. Thus, detection of the RCA-Products generated in step (ii) allows presence of the incorrectly-edited polynucleotide sequence to be determined.
[0045] The term "characterising" is used herein to include the step of assessing one or more specific attributes, features, or properties of the one or more incorrectly-edited polynucleotide sequence from a genetic-editing procedure. Accordingly, by "characterising" we include one or more from the group comprising: determining the nucleotide sequence of the one or more incorrectly-edited polynucleotide sequence; determining the type and / or nature of the edit within the one or more incorrectly-edited polynucleotide sequence; determining the structure (for example, secondary) of the one or more incorrectly-edited polynucleotide sequence; determining the inter- and / or intramolecular interactions between and / or within the one or more incorrectly-edited polynucleotide sequence; identifying and / or categorizing and / or mapping the one or more incorrectly-edited polynucleotide sequence to known genes, transcripts, and / or isoforms; functional analysis (such as linking the one or more incorrectly-edited polynucleotide sequence to biological functions and pathways to understand its role in a certain process); and / or annotation or identification of functional motifs in the sequence.
[0046] In an embodiment, the invention provides a method wherein incorrectly-edited polynucleotide sequences from a genetic-editing procedure are enriched and / or captured in step (i), before step (ii). Enriching and / or capturing incorrectly-edited polynucleotide sequences from a genetic-editing procedure may help ensure that the incorrectly-edited polynucleotide sequences are adequately represented in the final data, increasing accuracy of the assay.
[0047] By "enriched" or "enrichment" we include one or more steps to purify and / or concentrate incorrectly-edited polynucleotide sequences from a sample, enhancing their detection and / or characterisation and analysis. Enrichment techniques increase the proportion of incorrectly-edited polynucleotide sequences in a sample (for example, from an undesirably large sample volume) and assist in the removal of any inhibiting components (in the context of the present invention, such inhibiting components could be, for example, unbound, unligated probes). Enrichment of incorrectly-edited polynucleotide sequences can be achieved through several methods, such as selective amplification (for example, PCR using primers that specifically bind to a sequence which is common to all incorrectly-edited polynucleotide sequences); or size selection (gel electrophoresis or other methods to select incorrectly-edited polynucleotide sequences of a specific size).
[0048] By "captured" or "capturing" we include one or more steps to isolate incorrectly-edited polynucleotide sequences from a sample. Capturing of incorrectly-edited polynucleotide sequences can be achieved through several methods, such as hybrid capture and / or immunoprecipitation. In some embodiments, incorrectly-edited polynucleotide sequences are captured using probes that hybridize to a sequence which is common to all incorrectly-edited polynucleotide sequences, which can in turn be isolated using magnetic beads or other methods. In some embodiments, incorrectly-edited polynucleotide sequences from a genetic-editing procedure are captured in step (i) and before step (ii) using magnetic beads. In some embodiments, the magnetic beads may be coated with one or more ligand (such as streptavidin) to bind incorrectly-edited polynucleotide sequences (such as avidin-conjugated probes able to hybridise to incorrectly-edited polynucleotide sequences).
[0049] In some embodiments, incorrectly-edited polynucleotide sequences are captured by immunoprecipitation, where antibodies are used to bind and isolate target polynucleotide sequences. For example, in chromatin immunoprecipitation (ChIP), antibodies specific to DNA-binding proteins are used to capture DNA-protein complexes.
[0050] Step (ii) of the method of the invention comprises performing Rolling Circle Amplification, to generate RCA-Products from the one or more incorrectly-edited polynucleotide sequence in the sample.
[0051] Unlike prior art-based approaches, the method of the present invention uses Rolling Circle Amplification ("RCA") as a tool for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure. Specifically, following the genetic-editing procedure, RCA is used to selectively generate RCA-Products (also referred to herein as "RCA Products", "RCP" or "RCPs") from the one or more incorrectly-edited polynucleotide sequences. After RCA the RCA-Products are present in a liquid sample, which can undergo further processing steps to analyse the RCA-Products, either as a liquid sample or as a solid sample.
[0052] RCA is a well-known single molecule amplification method that allows for digital quantification of nucleic acids. RCA uses highly processive polymerases on circular single-stranded polynucleotide substrates to generate a long ssDNA (i.e. single-stranded DNA) concatemer in the hundreds of nanometres- to micrometre-range (Baner et al., 1998, Nucleic Acids Res, 26 (22), 5073-5078). RCA is often combined with DNA probes, such as "padlock probes" (PLPs), which are sequence-specific oligonucleotides binding in a circular manner to the target strand which can then be covalently linked by a ligation step. A PLP-based RCA assay offers extreme stringency with single base precision (Nilsson et al., 1994, Science, 265 (5181), 2085-2088). Those probes can then be amplified with an enzyme capable of amplifying circular DNA substrates. The enzyme commonly used in RCA is Phi29 DNA polymerase. Phi29 polymerase offers several advantages making it particularly effective for RCA. Phi29 DNA polymerase has a 3' to 5' exonuclease proofreading activity, which allows it to correct errors during DNA synthesis. This property greatly enhances the fidelity of Phi29 DNA polymerase. In contrast, Thermus aquaticus (Taq) DNA polymerase, commonly used in PCR, lacks 3' to 5' proofreading activity, resulting in a higher error rate. Phi29 DNA polymerase has an error rate of about 1 in 10^6 nucleotides, whereas Taq DNA polymerase has an error rate of about 1 in 10^4 to 10^5 nucleotides. Therefore, RCA that uses Phi29 DNA polymerase has significantly higher fidelity (or accuracy) compared to PCR-based amplification that uses Taq DNA polymerase. By "high fidelity amplification" we include the meaning of amplification that results in amplicons that have very few or no sequence changes relative to the corresponding original sequence. Consequently, due to its high fidelity, Phi29 DNA polymerase is more suitable for applications requiring accurate DNA replication, such as whole genome amplification and sequencing. The inventors' method relies on RCA, which can accurately replicate DNA with minimal errors, making it particularly useful in detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure.
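The practical meaning of the error rates quoted above can be worked through with a short Python sketch. It assumes a single pass of synthesis over a template, with errors occurring independently at each nucleotide; accumulation of errors over PCR cycles is not modelled. The 1000-nucleotide template length and the function names are illustrative assumptions.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Expected number of misincorporations in one copy of the template."""
    return length_nt * error_rate


def prob_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that one copy contains no misincorporation at all,
    assuming independent errors at each position."""
    return (1.0 - error_rate) ** length_nt


TEMPLATE_NT = 1000  # hypothetical template length

# Error rates as quoted in the text (per nucleotide synthesised).
rates = {
    "Phi29 (about 1 in 10^6)": 1e-6,
    "Taq, lower bound (about 1 in 10^5)": 1e-5,
    "Taq, upper bound (about 1 in 10^4)": 1e-4,
}

for name, rate in rates.items():
    print(
        f"{name}: expected errors per copy = "
        f"{expected_errors(TEMPLATE_NT, rate):.3f}, "
        f"error-free copies = {prob_error_free(TEMPLATE_NT, rate):.1%}"
    )
```

Under these assumptions, a polymerase with an error rate of 1 in 10^6 introduces about one error per thousand copies of a 1000-nucleotide template, whereas a rate of 1 in 10^4 introduces about one error per ten copies.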
[0053] Step (iii) of the method of the invention requires detecting and / or characterising the one or more incorrectly-edited polynucleotide sequences based on the RCA-Products generated in step (ii).
[0054] In a preferred embodiment, the invention provides a method wherein step (iii) comprises quantifying the one or more RCA-Product and / or determining some or all of the nucleotide sequence of one or more RCA-Product.
[0055] In the methods of the invention, quantitative and qualitative determinations, measurements or assessments are included, including semi-quantitative. Such determinations, measurements or assessments may be relative, for example when two or more different polynucleotide sequences in a sample are being detected, or absolute. As such, the term "quantifying" when used in the context of quantifying the one or more RCA-Products can refer to absolute or to relative quantification. Absolute quantification may be accomplished by inclusion of known concentration(s) of one or more control polynucleotide sequence and / or referencing the detected level of the polynucleotide sequence with known control gene sequence (e.g. through generation of a standard curve). Relative quantification can be accomplished by comparison of detected levels or amounts between two or more different RCA-Products to provide a relative quantification of each of the two or more RCA-Product, i.e., relative to each other.
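The two modes of quantification described above can be sketched in Python as follows. This is a minimal illustration only: it assumes a linear relationship between the known input amount of a control sequence and the detected signal, fitted by ordinary least squares. All function names, labels and numerical values are made up for the purpose of the example and do not come from this disclosure.

```python
def fit_standard_curve(known_inputs, signals):
    """Ordinary least-squares fit of signal = slope * input + intercept."""
    n = len(known_inputs)
    mean_x = sum(known_inputs) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in known_inputs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known_inputs, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def absolute_amount(signal, slope, intercept):
    """Absolute quantification: read an unknown off the standard curve."""
    return (signal - intercept) / slope


def relative_amounts(counts):
    """Relative quantification: each RCA-Product as a fraction of the total."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Hypothetical control dilution series (input copies) and detected signal.
control_inputs = [100, 1000, 10000]
control_signals = [52, 502, 5002]

slope, intercept = fit_standard_curve(control_inputs, control_signals)
estimated_copies = absolute_amount(252, slope, intercept)

# Hypothetical counts for two different RCA-Products in the same sample.
fractions = relative_amounts({"product_A": 30, "product_B": 70})

print(f"estimated input copies: {estimated_copies:.0f}")
print(f"relative amounts: {fractions}")
```

A linear fit is chosen here only for simplicity; a real assay may call for a different calibration model.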
[0056] The RCA-Products may be detected using any convenient method. The RCA-Products may be detected directly, preferably by one or more detectable moiety that binds to the RCA-Products. Alternatively, the RCA-Product may be detected indirectly, i.e. where the detectable moiety is a member of a signal producing system made up of two or more components.
[0057] In an embodiment, the invention provides a method wherein step (iii) comprises quantifying the RCA-Products generated in step (ii), in order to determine the amount of the incorrectly-edited polynucleotide sequence.
[0058] By "quantifying one or more RCA-Product" we include quantifying in a digital manner. The term "digital manner" refers to the precise detection and enumeration of individual RCA-Products, which allows for the precise detection and enumeration of the specific target polynucleotide sequences (such as incorrectly-edited polynucleotide sequences) at a single-molecule level. In other words, when RCA-Products are quantified digitally, each detection event corresponds to one original target polynucleotide sequence. Advantageously, this manner of counting enables detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure with high precision.
[0059] In a particular embodiment, RCA-Products can be detected and quantified by any single molecule detection scheme that allows for differential detection, including but not limited to, amplified single-molecule detection, by immobilisation on a solid support such as a microarray, and / or enrichment on a filter membrane. In relation to step (iii), it will be appreciated that any single molecule detection method can be used. Such single molecule detection methods may include fluorescence microscopy, such as epifluorescence microscopy.
[0060] In an embodiment, the invention provides a method wherein the number of RCA-Products in the sample is greater than 1 RCA-Product, such as greater than 10, 100, 200, 400, 500, 1000, 2000, 5000, 10,000, 100,000, or greater than 500,000 RCA-Products. In an embodiment, the invention provides a method wherein instead of, or in addition to, quantifying one or more RCA-Product, some or all of the nucleotide sequence of one or more RCA-Product is determined.
[0061] By "determining the nucleotide sequence" we include the process of identifying the precise composition and / or arrangement of nucleotide bases within the polynucleotide sequence (such as RCA-Product) and / or defining the presence of mutations, polymorphisms, motifs, repeats, or variations that may be present. It will be appreciated that by determining the nucleotide sequence of one or more RCA-Products generated in step (ii) it is possible to discern the nucleotide sequence of one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure and thereby identify any incorrect gene-editing events that have taken place.
[0062] In an embodiment, the invention provides a method wherein determining some or all of the nucleotide sequence of one or more RCA-Products generated in step (ii) comprises sequencing. Preferably, determining the nucleotide sequence of one or more RCA-Products generated in step (ii) is achieved by next-generation sequencing (NGS), for example, by sequencing-by-ligation (SBL), sequencing-by-synthesis or strand sequencing.
[0063] In the embodiment comprising sequencing-by-ligation (SBL), some or all of the nucleotide sequence of one or more RCA-Product generated in step (ii) is determined directly, preferably by sequencing detectable moieties that bind to the RCA-Products. Sequencing detectable moieties comprises imaging, such as fluorescent microscopy (Ke et al., 2013, Nat Methods 10, 857-860). It will be appreciated that for SBL, RCA-Products are immobilised on a surface.
[0064] In the embodiment comprising next-generation sequencing (NGS), some or all of the nucleotide sequence of one or more RCA-Products generated in step (ii) is determined, for example, by amplifying one or more RCA-Products, or fragments thereof, by PCR, where the amplification products may be sequenced.
[0065] In an embodiment, the invention provides a method wherein the RCA-Product is subjected to shearing (cutting, clipping, trimming or monomerising) into fragments; and preparing the fragments for sequencing, e.g., high-throughput sequencing, by end-repair / A-tailing / ligation of a sequencing adapter, e.g., a single-tailed sequencing adapter. In some embodiments, long tandem repeats of the RCA-Product are sheared, for example with a restriction endonuclease enzyme, to produce monomeric fragments. These monomeric fragments can then be used for various downstream applications, for example, library preparation for next-generation sequencing. This approach can simplify the process and reduce time, as the DNA can often be used directly in sequencing without extensive purification. Producing monomers from RCA-Products and using the monomers for sequencing library assembly is known in the art.
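By way of illustration only, monomerisation of a tandem-repeat RCA-Product with a restriction endonuclease can be modelled in silico as below. The monomer sequence is hypothetical, the choice of EcoRI (recognition site GAATTC, cut after the first G) is an assumption made for the example, and the model treats the product as a single text string rather than as a real DNA molecule.

```python
def digest(sequence, site, cut_offset):
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases into the recognition site at which
    the cut is made (1 for EcoRI on the strand written 5'->3').
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    # Drop empty fragments that arise when a cut falls at either end.
    return [f for f in fragments if f]


# Hypothetical 21-nt repeat unit: it starts with AATTC and ends with G, so
# GAATTC is recreated at every junction between consecutive repeats.
MONOMER = "AATTCACGTTGCAGGTCCATG"
concatemer = MONOMER * 4  # stand-in for a long tandem-repeat RCA-Product

fragments = digest(concatemer, "GAATTC", 1)
```

In this toy case every fragment is one copy of the repeat unit, which is the property that makes the monomers convenient inputs for sequencing library preparation.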
[0066] In an embodiment the sequence of one or more nucleotide of one or more RCA-Product is determined. In some embodiments of the method disclosed herein, the sequence of at least 2 nucleotides, at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, at least 55 nucleotides, at least 60 nucleotides, at least 65 nucleotides, at least 70 nucleotides, at least 75 nucleotides, at least 80 nucleotides, at least 85 nucleotides, at least 90 nucleotides, at least 95 nucleotides or at least 100 nucleotides, or more than 100 nucleotides of one or more RCA-Product is determined.
[0067] In some embodiments of the method of the invention, the nucleotide sequence determined comprises 1 nucleotide, 1 to 5 nucleotides, 1 to 10 nucleotides, 1 to 20 nucleotides, 1 to 50 nucleotides, 1 to 100 nucleotides, 1 to 1000 nucleotides or more than 1000 nucleotides of one or more RCA-Product.
[0068] In a preferred embodiment, the invention provides a method, wherein the method further comprises step (i-a), which step is performed prior to step (ii) and comprises: generating circular single-stranded polynucleotide substrates from the one or more incorrectly-edited polynucleotide sequence in the sample.
[0069] By "circular single-stranded polynucleotide substrates" we include substrates produced using the probes described herein followed by ligation. For example, the circular single-stranded polynucleotide substrates may comprise two or more segments complementary to the target sequence (such as one or more incorrectly-edited polynucleotide sequence) connected by a linker polynucleotide sequence. As explained in detail below, when the complementary segments hybridise to the target polynucleotide sequence, the probe can be ligated and become circularised. In a preferred embodiment, the invention provides a method wherein the circular single-stranded polynucleotide substrates are generated using one or more oligonucleotide probe which specifically targets the one or more incorrectly-edited polynucleotide sequence. This may be achieved by using one or more probe that targets the one or more incorrectly-edited site and that becomes circular only once it hybridises to the respective target, or by circularising the target nucleic acid fragment using selector probes (explained elsewhere herein).
[0070] In a preferred embodiment, the invention provides a method wherein step (i-a) comprises: integrating an Oligonucleotide-Tag at the incorrectly-edited site in the one or more incorrectly-edited polynucleotide sequence;
[0071] - generating circular single-stranded polynucleotide substrates using a Probe that targets the Oligonucleotide-Tag at the incorrectly-edited site.
[0072] As will be clear to those skilled in the art in light of the teachings of the invention herein, the first step of the genetic-editing procedure (for example, when performed using genome-editing nucleases such as CRISPR / Cas9) introduces a double-strand break into the polynucleotide sequence; that double-strand break then permits the Oligonucleotide-Tag to be integrated into the edited polynucleotide. As will be appreciated, the Oligonucleotide-Tag then enables detection and / or characterisation of incorrectly-edited polynucleotide sequences.
[0073] The Oligonucleotide-Tag is a double stranded polynucleotide sequence. The Oligonucleotide-Tag may comprise modifications in the sequence backbone, for example one or more modification from the list comprising: boranophosphates, phosphorothioates, phosphoramidates, methylphosphonates, amides, triazole linkages, ureas, squaramides, Peptide Nucleic Acids (PNA), morpholino modifications, carbamates, and / or ethylphosphonates.
[0074] In some embodiments, the Oligonucleotide-Tag comprises one or more Peptide Nucleic Acid. Peptide Nucleic Acids are synthetic analogs where the sugar-phosphate backbone is replaced with a peptide-like backbone made of repeating N-(2-aminoethyl)-glycine units. PNA modification may have various advantages for the Oligonucleotide-Tag, for example, enhanced resistance to nuclease degradation, thereby maintaining or improving stability of the Oligonucleotide-Tag. Preferably, the Oligonucleotide-Tag is a double stranded DNA (dsDNA) oligonucleotide. Preferably, the nucleotide sequence of the Oligonucleotide-Tag is different to the nucleotide sequence of the polynucleotide sequences in the sample (for example, DNA sample from a genetic-editing procedure). In that context, by "different" we include that the nucleotide sequence of the Oligonucleotide-Tag is not present in and / or is not complementary to any sequence present in the sample. In some embodiments, the nucleotide sequence of the Oligonucleotide-Tag has no more than 1%, 10% or 20% sequence identity to polynucleotide sequences in the sample.
[0075] In some embodiments, the Oligonucleotide-Tag is between 15 and 75 nucleotides in length, for example, between 15-50 nucleotides in length, between 50-75 nucleotides in length, between 30-35 nucleotides in length, between 60-65 nucleotides in length, or between 50-65 nucleotides in length. In a preferred embodiment, the Oligonucleotide-Tag is between 15 and 50 nucleotides in length, for example between 20-40 or between 30-35 nucleotides in length.
[0076] In some embodiments, the Oligonucleotide-Tag includes a sequence which serves as a restriction enzyme recognition site. The presence of the restriction enzyme recognition site allows for the specific cleavage of the Oligonucleotide-Tag during subsequent analysis. Advantageously, this helps in distinguishing the Oligonucleotide-Tag integration events from other sequences in the sample, making it easier to detect and / or characterise one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure.
[0077] In a preferred embodiment, one or both of the 5' ends of the Oligonucleotide-Tag are phosphorylated.
[0078] In a preferred embodiment, the Oligonucleotide-Tag comprises one or more exonuclease protection moiety. The exonuclease protection moiety may be located within the Oligonucleotide-Tag sequence (i.e. internally) or at a terminal nucleotide (i.e. external).
[0079] Exonuclease protection may have various advantages for the Oligonucleotide-Tag, such as to enhance resistance to nuclease degradation, thereby maintaining or improving stability of the Oligonucleotide-Tag. In some embodiments, the exonuclease protection moiety may involve replacing one of the non-bridging oxygen atoms in the phosphate backbone with a sulphur atom to form a phosphorothioate linkage.
[0080] In some embodiments, the Oligonucleotide-Tag comprises at least one phosphorothioate linkage, at least 2 phosphorothioate linkages, at least 5 phosphorothioate linkages, at least 10 phosphorothioate linkages, at least 15 phosphorothioate linkages or at least 20 phosphorothioate linkages. In some embodiments, the Oligonucleotide-Tag comprises between 1-20 phosphorothioate linkages, between 2-10 phosphorothioate linkages or between 3-5 phosphorothioate linkages.
[0081] In some embodiments, one or more end of the Oligonucleotide-Tag comprises a phosphorothioate linkage. By "one or more end" we include one or more 3' and / or 5' distal nucleotide of the Oligonucleotide-Tag. Any Oligonucleotide-Tag will have four such ends. In some embodiments, one or both 3' ends of the Oligonucleotide-Tag comprise phosphorothioate linkages, and / or one or both 5' ends of the Oligonucleotide-Tag comprise phosphorothioate linkages. In one embodiment both 3' ends and both 5' ends of the Oligonucleotide-Tag comprise phosphorothioate linkages.
[0082] In some embodiments, the Oligonucleotide-Tag is biotinylated, e.g., comprises one or more biotin moiety. The presence of the one or more biotin moiety may allow enrichment and / or capture of polynucleotide sequences where the Oligonucleotide-Tag was incorporated. In some embodiments, the Oligonucleotide-Tag comprises both one or more phosphorothioate linkage and one or more biotin moiety.
[0083] In some embodiments, one or more different Oligonucleotide-Tags is used (i.e. Oligonucleotide-Tags that differ in terms of nucleotide sequence and / or nucleotide length and / or the restriction enzyme recognition site present and / or other modifications, such as phosphorylation status and / or phosphorothioate linkage and / or biotinylation). In some embodiments, two or more different Oligonucleotide-Tags are used, for example three or more different Oligonucleotide-Tags, four or more different Oligonucleotide-Tags, five or more different Oligonucleotide-Tags, 10 or more different Oligonucleotide-Tags, 15 or more different Oligonucleotide-Tags, 20 or more different Oligonucleotide-Tags, 25 or more different Oligonucleotide-Tags, or 30 or more different Oligonucleotide-Tags. In some embodiments, the number of different Oligonucleotide-Tags used is between 1 and 30, preferably between 1 and 10, more preferably between 1 and 5. In an embodiment, the invention provides a method wherein the Oligonucleotide-Tag comprises two complementary oligonucleotides which comprise or consist of the sequences of SEQ. ID. NO: 1 and SEQ. ID NO: 2:
[0084] G*T*T*TAATTGAGTTGTACTGAGGAGCTGCA*T*A* (SEQ. ID. NO: 1)
T*A*T*GCAGCTCCTCAGTACAACTCAATTAA*A*C* (SEQ. ID NO: 2)
(wherein * designates a phosphorothioate linkage)
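By way of illustration only, the complementarity of the two strands listed above can be checked computationally. The sequences are copied from SEQ. ID. NO: 1 and SEQ. ID NO: 2; the helper names are hypothetical, and the sketch simply counts the * marks as written without interpreting where each linkage sits in the backbone.

```python
SEQ_ID_NO_1 = "G*T*T*TAATTGAGTTGTACTGAGGAGCTGCA*T*A*"
SEQ_ID_NO_2 = "T*A*T*GCAGCTCCTCAGTACAACTCAATTAA*A*C*"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strip_marks(annotated):
    """Remove the * phosphorothioate marks, leaving only the bases."""
    return annotated.replace("*", "")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


top = strip_marks(SEQ_ID_NO_1)
bottom = strip_marks(SEQ_ID_NO_2)

# The two oligonucleotides form a blunt duplex if one strand is the exact
# reverse complement of the other.
forms_blunt_duplex = reverse_complement(top) == bottom

tag_length = len(top)
marks_per_strand = (SEQ_ID_NO_1.count("*"), SEQ_ID_NO_2.count("*"))
```

Run as written, the check confirms that the two strands are exact reverse complements of 31 nucleotides each, which falls within the 30-35 nucleotide range mentioned above, with six * marks on each strand.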
[0085] It will be appreciated that different Oligonucleotide-Tags will be integrated randomly at any incorrectly-edited sites, and that different tags can then be used to identify and characterise differently edited sites in the polynucleotide sequences in the sample.
[0086] Preferably, where more than one different Oligonucleotide-Tag is used, each different Oligonucleotide-Tag comprises a unique (different) nucleotide sequence. Each different Oligonucleotide-Tag can be targeted with a different Probe, each specifically designed to target each different Oligonucleotide-Tag. In the embodiment where more than one Oligonucleotide-Tag is used, more than one Probe is used in a multiplex system, where different RCA-Products which originated from a different probe can be detected using different detectable moieties, preferably spectrally separated fluorophores. RCA-Products labelled with spectrally separated fluorophores can also be detected separately (for example, using different microscope filters) which improves accuracy of the detection, for example, by preventing or eliminating interference originating from spectral overlapping of RCA-Products (spectral crowding). Advantageously, this approach enables more accurate detection and / or characterisation of incorrectly-edited polynucleotide sequences from a genetic-editing procedure.
[0087] By "integrating" we include any means to covalently attach and incorporate the Oligonucleotide-Tag into the double strand break of a polynucleotide sequence. Two non-limiting examples of how the Oligonucleotide-Tag may be integrated include harnessing the cell's natural repair mechanisms (such as non-homologous end joining (NHEJ) or homology-directed repair (HDR) explained above) or enzymatic ligation when polynucleotide sequences are isolated from the cells. The integration method may depend on the sample type, and the skilled person will be aware of how to most optimally integrate a given Oligonucleotide-Tag into the polynucleotide sequences in the sample. For example, if the Oligonucleotide-Tag is toxic to cells, it may not be integrated by the natural cell repair mechanisms and may need to be integrated via enzymatic ligation.
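By way of illustration only, the bookkeeping behind such a multiplex read-out can be sketched as follows. The tag names, the fluorophore assignments and the list of detection events are hypothetical; the sketch assumes that each detected RCA-Product has already been assigned to exactly one fluorescence channel.

```python
from collections import Counter

# Hypothetical assignment: one spectrally separated fluorophore per tag.
CHANNEL_BY_TAG = {"tag_A": "Cy3", "tag_B": "Cy5"}


def count_by_tag(detected_channels, channel_by_tag=CHANNEL_BY_TAG):
    """Tally detection events per Oligonucleotide-Tag.

    `detected_channels` holds one channel name per detected RCA-Product.
    A channel that is not assigned to any tag raises KeyError.
    """
    if len(set(channel_by_tag.values())) != len(channel_by_tag):
        raise ValueError("each tag needs its own channel")
    tag_by_channel = {channel: tag for tag, channel in channel_by_tag.items()}
    counts = Counter(tag_by_channel[channel] for channel in detected_channels)
    return {tag: counts.get(tag, 0) for tag in channel_by_tag}


# Three hypothetical detection events from one imaged field.
per_tag = count_by_tag(["Cy3", "Cy5", "Cy3"])
```

Requiring a one-to-one mapping between tags and channels is what allows each detection event to be attributed unambiguously to a single tag.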
[0088] In an embodiment, the invention provides a method wherein the Probe comprises nucleotide sequence capable of binding to the Oligonucleotide-Tag.
[0089] Preferably, the Probe comprises nucleotide sequence that is complementary to the target polynucleotide molecule. Accordingly, one or more part of the Probe may comprise nucleotide sequence that hybridises with the Oligonucleotide-Tag at the incorrectly-edited site in a target-dependent manner. That one or more part of the Probe forms a first ligation end of the Probe. In a preferred embodiment, another part within the same Probe molecule, or the one or more part of another Probe (for example, when "split-like" probes are utilised) can simultaneously bind a target polynucleotide sequence which is adjacent to the incorrectly-edited site. That other part within the same Probe molecule, or the one or more part of another Probe, forms a second ligation end of the Probe. For example, if the Probe comprises two parts comprising nucleotide sequences complementary to the target polynucleotide molecule, one part is designed to hybridise to the Oligonucleotide-Tag incorporated at the incorrectly-edited site to form the first ligation end, and the other part is designed to hybridise to sequence adjacent to the site where the Oligonucleotide-Tag was incorporated to form the second ligation end.
[0090] Upon annealing to the target site, the one or more part which comprises nucleotide sequence that is complementary to the target forms one or more ligation site. As explained above, ligation ends are required to be in direct juxtaposition and hybridised to their respective complementary base pairs in the target polynucleotide molecule in order for ligation to take place.
[0091] Accordingly, it will be appreciated that the ligatable ends may be brought into juxtaposition for ligation in various ways, depending on the design of the probe and / or its parts and type of the incorrectly-edited polynucleotide sequence detected.
[0092] Once a ligation product has been formed, the circular single-stranded polynucleotide substrates are amplified, and the resulting RCA-Product detected and / or characterised, thereby to detect and / or characterise the incorrectly-edited polynucleotide sequences. In an embodiment, the invention provides a method wherein the Probe comprises degenerate nucleotide sequence capable of binding at, or adjacent to, nucleotide sequence at the incorrectly-edited site.
[0093] It will be appreciated that the degenerate nucleotide sequence in the Probe enables it to bind to, or adjacent to, nucleotide sequence at the incorrectly-edited site. It is therefore not necessary for the sequence of the incorrectly-edited site to be known when designing the Probe sequence: by including degenerate nucleotide sequence, one or more molecule in a population of Probes will have sufficient homology to sequence at, or adjacent to, the nucleotide sequence of the incorrectly-edited site.
[0094] In a preferred embodiment, the Probe therefore contains a first part comprising nucleotide sequence that is complementary to the Oligonucleotide-Tag, and a second part comprising degenerate nucleotide sequence. In that embodiment, the first part of the Probe is capable of binding to the Oligonucleotide-Tag at the incorrectly-edited site in the one or more incorrectly-edited polynucleotide sequence, and the second part of the Probe is capable of binding at, or adjacent to, nucleotide sequence at the incorrectly-edited site. Such a Probe is therefore capable of binding to the incorrectly-edited site in the one or more incorrectly-edited polynucleotide sequence, and can then be circularised and used as a substrate for Rolling Circle Amplification according to the present method.
[0095] Thus, in that embodiment, the Oligonucleotide-Tag and the Probe are used to identify and target a hitherto unknown incorrectly-edited site in the one or more incorrectly-edited polynucleotide sequence.
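By way of illustration only, the reason a degenerate part allows an unknown flanking sequence to be targeted can be sketched as follows. The sketch assumes, hypothetically, that each degenerate position is written as N and may be any of A, C, G or T, and it ignores hybridisation thermodynamics entirely.

```python
from itertools import product


def expand_degenerate(arm):
    """Yield every concrete sequence encoded by an arm containing N."""
    choices = ["ACGT" if base == "N" else base for base in arm]
    for combination in product(*choices):
        yield "".join(combination)


def population_size(arm):
    """Number of distinct molecules in a fully mixed Probe population."""
    return 4 ** arm.count("N")


# A hypothetical three-base degenerate arm covers all 64 possible triplets,
# so whatever the unknown flanking sequence is, one member matches it.
members = set(expand_degenerate("NNN"))
covers_any_flank = "TGC" in members

# At the lower end of the 15 to 50 degenerate nucleotide range.
size_15 = population_size("N" * 15)
```

The population size grows as a power of four with the number of degenerate positions, so only a small fraction of the Probe molecules in a long-arm population will match any given site exactly.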
[0096] A ligation site is the site of ligation of a terminal 5' phosphate group at the 5' ligatable end of a Probe, to the terminal 3' hydroxyl group at the 3' ligatable end of a Probe. Ligation may occur when ligatable ends align and hybridise to their respective complementary base pairs in the target nucleic acid molecule. A ligation site, once formed, is the junction between the nucleotides at the 3' and 5' ligatable ends of the Probe, and its position is defined by the phosphodiester bond formed. In a preferred embodiment, the ligation site is located at the junction spanning the Oligonucleotide Tag and the adjacent polynucleotide sequence upstream or downstream from the Oligonucleotide Tag.
[0097] In a preferred embodiment, the one or more degenerate nucleotide in the Probe may be at a 3' or 5' ligatable end at a ligation site. The term "at" in this context refers specifically to the terminal nucleotide at an end of an oligonucleotide, i.e. the nucleotide which provides the hydroxyl or phosphate group for ligation. Thus, in one embodiment, reference to the Probe comprising one or more degenerate nucleotide at the ligation site refers to either the terminal 3' nucleotide of a 3' ligation end, or the terminal 5' nucleotide of a 5' ligation end of the Probe. Preferably, the 3' ligatable end comprises one or more degenerate nucleotide.
[0098] In an embodiment, the Probe comprises one degenerate nucleotide at the terminal 5' nucleotide at the 5' ligatable end. In an embodiment, the Probe comprises one degenerate nucleotide at the terminal 3' nucleotide of a 3' ligatable end. In some embodiments, the Probe comprises at least two degenerate nucleotides at the 5' and / or 3' ligatable end, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or at least 55 degenerate nucleotides at the 5' and / or 3' ligatable end.
[0099] In some embodiments, the Probe comprises between 1 to 50 degenerate nucleotides at the 5' and / or 3' ligatable end, preferably between 15 to 50 degenerate nucleotides at the 5' or 3' ligatable end.
[0100] In a most preferred embodiment, the Probe comprises between 15 to 50 degenerate nucleotides at the 3' ligatable end.
[0101] In some embodiments, the Probe will hybridise to the Oligonucleotide-Tag, or part thereof, integrated into the incorrectly-edited site in the one or more incorrectly-edited polynucleotide sequence.
[0102] Preferably, the Probe may be phosphorylated on the 5' terminus to permit ligation.
[0103] In some embodiments, the Probe comprises at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides, at least 100 nucleotides, at least 110 nucleotides, at least 120 nucleotides, or at least 130 nucleotides.
[0104] In some embodiments, the Probe comprises between about 30 nucleotides to about 500 nucleotides, preferably between about 40 nucleotides to about 200, more preferably between about 50 nucleotides to about 150 nucleotides, most preferably from about 60 nucleotides to about 120 nucleotides.
[0105] As is well known, "GC content" is used to describe the amount and / or percentage of nitrogenous bases in a DNA or RNA molecule that are either guanine (G) or cytosine (C). This measure indicates the proportion of G and C bases out of the total bases, which also include adenine (A) and thymine (T) in DNA, or adenine (A) and uracil (U) in RNA. In some embodiments, the Probe, or a part thereof, comprises GC content of about 10% to about 80%, preferably between about 20% to about 70%, more preferably between about 30% to about 65%, most preferably from about 40% to about 60%.
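By way of illustration only, GC content as defined above can be computed as the percentage of G and C bases among all bases. The example sequence and the helper names are hypothetical; the default bounds correspond to the most preferred range of about 40% to about 60%.

```python
def gc_content(sequence):
    """Percentage of bases in `sequence` that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / len(sequence)


def within_gc_range(sequence, low=40.0, high=60.0):
    """True if the GC content lies within [low, high] percent."""
    return low <= gc_content(sequence) <= high


# Hypothetical ten-base probe segment: three G and three C out of ten bases.
example_gc = gc_content("ACGTGGCCAT")
example_ok = within_gc_range("ACGTGGCCAT")
```

The same function applies to RNA, since U is simply counted among the non-GC bases.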
[0106] The amount of the Probe that needs to be added to a sample may be chosen to ensure a low concentration in the reaction mixture, minimizing non-target specific interactions. This helps prevent the probe from randomly binding to non-target polynucleotide sequences in the sample. However, the Probe is generally used in excess relative to the target molecule. In some cases, the Probe concentration in the reaction mixture after combining with the sample ranges from about 1 fM to 1 pM, such as from about 1 pM to about 1 nM, including from about 1 pM to about 100 nM, e.g., 1, 2, 5, 10, 20, or 50 nM.
[0107] In an embodiment, the invention provides a method wherein the Probe is selected from the group comprising: a padlock probe, a molecular inversion probe; a gap-fill probe; a split-like probe; a Lotus probe; or a combination thereof.
[0108] As used herein the term "padlock probe" includes linear single stranded nucleic acid molecules which comprise sequences complementary to a target nucleic acid molecule at their 3' and 5' ends, such that upon hybridisation of a padlock probe to its target nucleic acid molecule, the ends of the probe are brought into juxtaposition, providing ligatable ends at a ligation site for ligation. When the complementary segments hybridise and ligate onto the DNA target, the padlock probes become circularised.
[0109] Adaptation of padlock probes and methods resulted in the development of techniques in which the ends of the Probes bind to sequences in the target nucleic acid molecule which are separated by one or more bases, and in which the "gap" between the respective ends of the Probe is filled. Accordingly, in some embodiments, the Probe may be designed to bind to two or more non-contiguous sequences within the target nucleic acid molecule, with a gap between the respective complementary targets of the probe or probe parts. When the ligation ends of the probe or probe parts hybridise to a target and leave a gap, the gap may be filled by extending the hybridised 3' end of the probe using a nucleic acid polymerase enzyme (Hardenbol et al. 2003. Nat Biotechnol 21, 673-678) and the target nucleic acid molecule as a template for extension. Once the 3' end of the probe has been extended to be adjacent to the 5' ligatable end, the two ends may be joined by a ligation reaction. Alternatively, a gap can be filled by using one or more gap-fill oligonucleotide (Weibrecht et al. 2013, Nature protocols 8, 355-372). In both embodiments, the term "gap-fill Probe" relates to Probes which have probe arms that are designed to flank a sequence of interest after hybridisation, thus leaving a gap on the editing site.
[0110] It should also be appreciated that the Probe could also be designed as a "gap-fill" probe, for example, when one arm of the Probe is designed to hybridise next to a site where the Oligonucleotide-Tag is integrated into the target polynucleotide sequence. Alternatively, one arm of the Probe may be designed to hybridise at the site where the Oligonucleotide-Tag is integrated into the target polynucleotide sequence, while the other arm of the Probe, comprising degenerate bases, hybridises so as to leave a gap relative to the first arm of the Probe. The said gap may be filled as explained above in relation to "gap-fill" probes.
[0111] Embodiments in which the ligatable 3' end of the Probe is generated by extension using a nucleic acid polymerase enzyme require the incorporation of nucleotides into an extension product, and thus in such embodiments a mixture of nucleoside triphosphates is required in the reaction mixture in order for extension to take place. In certain embodiments, one or more of the nucleoside triphosphates (one, two, three or all four nucleoside triphosphates) (nATP, nGTP, nCTP, dTTP / rUTP) provided may be deoxyribonucleoside triphosphates. This may allow at least one deoxyribonucleotide to be incorporated into the 3' ligatable end during the course of extending the 3' end of a Probe.
[0112] Any convenient polymerase enzyme which may use DNA as a template for extension, preferably a non-displacing polymerase, may be used to perform the "gap-fill" extension, including DNA polymerase enzymes.
[0113] A "gap-fill" Probe may be provided in two or more parts, i.e. as two or more oligonucleotides, which parts are ligated to form a circular oligonucleotide. A first oligonucleotide may comprise a first target-specific binding site at its 5' end and a second target-specific binding site at its 3' end, and may hybridise to the target nucleic acid molecule with a gap in-between its target-specific binding sites.
[0114] In some embodiments, a gap can be at least 1 nucleotide base, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20 or at least 25 nucleotides.
[0115] In some embodiments, a gap can be between 1 to 50 nucleotide bases, between 2 to 40 nucleotide bases, between 4 to 30 nucleotide bases or preferably between 6 to 20 nucleotide bases.
[0116] However, rather than extending the 3' end of the Probe to provide the 3' ligatable end for ligation, a second oligonucleotide (and optionally third, fourth etc. oligonucleotides) may be provided, which comprises a target-specific binding site(s) complementary to a cognate probe-binding site in the target nucleic acid molecule situated between the respective cognate probe-binding sites for the first and second target-specific binding sites. It will be appreciated that the length of the gap-fill oligonucleotide will be equal to the length of the gap between the Probe's target-specific binding sites. In some embodiments, a gap in between the Probe's target-specific binding sites is between 1 to 50 nucleotide bases, between 2 to 40 nucleotide bases, between 4 to 30 nucleotide bases or preferably between 6 to 20 nucleotide bases.
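By way of illustration only, the relationship between the gap and a single gap-fill oligonucleotide can be sketched as follows. The target sequence and the binding-site coordinates are hypothetical; the sketch assumes that the two probe arms bind the half-open intervals given on a target written 5' to 3', and that the gap-fill oligonucleotide is the reverse complement of the unbound stretch between them.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gap_fill_oligo(target, first_site, second_site):
    """Design the oligonucleotide that fills the gap between two arm sites.

    `first_site` and `second_site` are (start, end) half-open intervals on
    `target`, with the first site lying 5' of the second.
    """
    gap_start, gap_end = first_site[1], second_site[0]
    if gap_end < gap_start:
        raise ValueError("binding sites overlap")
    return reverse_complement(target[gap_start:gap_end])


# Hypothetical 24-nt target; arms bind bases 0-7 and 14-23, leaving 6 bases.
target = "ACGTACGTTTGACCAGGATCCATG"
oligo = gap_fill_oligo(target, (0, 8), (14, 24))
gap_length = 14 - 8
```

Here the oligonucleotide length equals the gap length by construction, and the six-base gap falls within the preferred range of 6 to 20 nucleotide bases.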
[0117] In an embodiment, the invention provides a method wherein a gap is filled by hybridising a library of oligonucleotides with degenerate bases ("degenerate oligonucleotide library"). In some embodiments, length of the oligonucleotides of the degenerate oligonucleotide library is between 1 to 50 nucleotide bases, between 2 to 40 nucleotide bases, between 4 to 30 nucleotide bases, or preferably between 6 to 20 nucleotide bases.
[0118] It will be appreciated that the proportion of degenerate bases in any one or more oligonucleotide of the degenerate oligonucleotide library (i.e. ratio of degenerate bases to a total number of bases forming the oligonucleotide) may vary depending on the experiment. In some embodiments, the proportion of degenerate bases in any one or more oligonucleotide of the degenerate oligonucleotide library is at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90%. In some embodiments, the proportion of degenerate bases in any one or more oligonucleotide of the degenerate oligonucleotide library is from about 10% to about 100%, or from about 20% to about 90%, or from about 30% to about 80%.
[0119] Circularisation of the "gap-fill" Probe comprises the ligation of the 5' and 3' ligatable ends of the first oligonucleotide to the 3' and 5' ligatable ends of the gap-fill oligonucleotides. When more than one gap-fill oligonucleotide is used, there will be more than one ligation site and thus more than one ligation event.
[0120] Preferably, the gap-fill oligonucleotides may be phosphorylated on the 5' terminus to permit ligation.
[0121] In some embodiments, "gap-fill" Probes may be designed to flank the intended gene editing site, making "gap-fill" Probes particularly useful to detect insertion and deletions within the intended targeted site.
[0122] As is known in the art, a "molecular inversion probe" is a type of "gap-fill Probe", which is a linear single stranded nucleic acid molecule which comprises sequences complementary to a target nucleic acid molecule at their 3' and 5' ends. The complementary 3' and 5' ends are designed such that a gap delimited by the hybridized ends of the molecular inversion probe remains over the target region. The size of the gap ranges from a single nucleotide to several hundred nucleotides, depending on the application. As explained above, the gap may be filled by DNA polymerase using free nucleotides and the ends of the molecular inversion probe are ligated by ligase, resulting in a fully circularized probe.
[0123] By the term "split-like probe" we include Probes that consist of at least two separate Probes that have been designed in a way to be partially complementary to the polynucleotide target sequence and to a connector sequence. Upon hybridization to the polynucleotide target sequence, the two or more Probes come into close proximity and their parts forming a linker get linked by the connector oligonucleotide. The nick in the linker can be closed via a ligase, while the nick or gap in the polynucleotide target sequence can be closed either by a ligase alone or in conjunction with a polymerase or hybridisation of a degenerate oligonucleotide library. Ligation of a linker provided in two or more parts may be templated by a connector oligonucleotide, which is capable of hybridising to each of the two or more parts of a linker, thereby to bring respective 5' and 3' ends into juxtaposition for ligation. The connector oligonucleotide thus comprises regions of complementarity to sequences at the ends of each of the two or more parts of a linker sequence.
[0124] The connector oligonucleotide may be a separate oligonucleotide, e.g. an oligonucleotide added to the sample together with the Probe or separately, or pre-hybridised to the Probe, or may be another target nucleic acid molecule, or another part of the same target nucleic acid molecule (i.e. another, separate, target sequence at a different position within the same target nucleic acid molecule). Accordingly, the ligation template may be a synthetic or natural oligonucleotide, and may be an RNA molecule or DNA molecule.
[0125] By the term "Lotus probe" we include Probes that are already partially circular but require the specific hybridization of a polynucleotide target sequence to form a functional circular oligonucleotide molecule. The hybridized target fragment can either create two nick sites or a gap that can be subsequently closed by a ligase or in the case of the gap in conjunction with a polymerase. These Probes are dumbbell shaped and consist either of a single probe or multiple separate probes.
[0126] By the term "Trilock probe" we include Probes that consist of four separate probes: a common backbone probe, two probes partially complementary to the polynucleotide target and a "Trilock linker" which brings them all together by being partially complementary to the other probes. The two probe arms complementary to the polynucleotide target can either be perfectly matching to create a nick site that can be sealed with a ligase, or create a gap that is filled by a polymerase before being sealed by a ligase.
[0127] Probes described here, or parts of a probe, may include additional sequences to introduce a sequence into a ligation product (and RCA Product). These sequences can be tags or detection sequences, such as barcodes, identificatory motifs, or binding sites for detection probes or primers. The additional sequences can be located at the 3' or 5' end of a Probe (preferably opposite the ligatable end), or within a circularisable backbone oligonucleotide not hybridised to the target nucleic acid.
[0128] Certain sequence motifs (tags) can be introduced into the probe sequence for various purposes. A universal or common tag sequence allows different ligated probes to be processed together in a multiplex setting. For example, introduction of a unique binding site for a universal amplification primer will enable simultaneous amplification of different ligated probes, for example, in library amplification by PCR or RCA.
[0129] Additionally, a tag sequence can be used to label different probes (a "target" tag, marker or barcode) or to tag different samples for pooling before a common amplification step (a "sample" tag, marker or barcode).
[0130] In some embodiments, Probes with the same target barcode may comprise a further unique sequence. That can be achieved by introducing a Unique Molecular Identifier (UMI) into the Probe sequence. A UMI is a short, random sequence of nucleotides. Given that the sequence of a UMI is random, UMIs serve as molecular "tags" that uniquely label each original Probe molecule, allowing for precise tracking and single molecule counting throughout the sequencing process. In some embodiments, the UMI comprises between 4 and 20 nucleotides.
[0131] By the term "sequencing probe" we include Probes that comprise one or more sequencing tags in the linker to enable sequencing of the RCA-Product by being compatible with NGS chemistry. The sequencing probe may comprise one or more sequencing tags from the following: Probe target tag, one or more adapter sequence, binding site for ligation probes, one or more index, and / or one or more flanking primer sites. The sequencing probes are useful in conjunction with sequencing by hybridization, sequencing by ligation, or other next-generation sequencing techniques for multiplexed detection of multiple target nucleic acids in a sample.
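As a non-limiting illustration of how UMIs support single molecule counting, the following Python sketch computes the number of distinct UMI sequences available for a given length (4 to the power of the length, assuming the four bases A, C, G and T at every position) and counts distinct UMIs per target barcode. The function names and the example reads are hypothetical, and the sketch assumes error-free UMI reads; in practice, sequencing errors within UMIs would typically need to be accounted for.

```python
import random
from collections import defaultdict

BASES = "ACGT"


def umi_diversity(length: int) -> int:
    """Number of distinct UMI sequences of the given length (4 ** length)."""
    return len(BASES) ** length


def random_umi(length: int, rng: random.Random) -> str:
    """Draw one random UMI; every position is chosen uniformly from A, C, G, T."""
    return "".join(rng.choice(BASES) for _ in range(length))


def count_molecules(reads):
    """Count distinct UMIs per target barcode.

    `reads` is an iterable of (target_barcode, umi) pairs. Reads sharing the
    same barcode and UMI are treated as copies of one original Probe molecule.
    """
    umis_per_target = defaultdict(set)
    for target, umi in reads:
        umis_per_target[target].add(umi)
    return {target: len(umis) for target, umis in umis_per_target.items()}


# Hypothetical reads: the first two are copies of the same original molecule.
reads = [
    ("T1", "ACGTACGT"),
    ("T1", "ACGTACGT"),
    ("T1", "TTGACCAA"),
    ("T2", "GGCATTCA"),
]
counts = count_molecules(reads)
example_umi = random_umi(12, random.Random(0))
```

Under these assumptions, a UMI of 4 nucleotides offers 256 distinct sequences and a UMI of 20 nucleotides offers 4^20 (about 1.1 x 10^12), and the example reads are counted as two molecules for target T1 and one for target T2.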
[0132] Multiple Probes can be added to a sample for a multiplex assay, which is useful when multiple different target nucleic acid molecules are present and need to be detected simultaneously. These assays can detect tens, hundreds, thousands, or even tens of thousands of nucleic acid molecules in a sample. Therefore, multiplex assays may include at least 2 distinct probes that can hybridize (directly or indirectly) to different target polynucleotide sequences, thereby detecting different analytes. For example, multiplex assays might use 3, 4, 5, 10, 20, 30, 40, or 50 probes, or even 100, 200, 500, 1000, 10000, or more Probes.
[0133] In an embodiment, the invention provides a method wherein the Probe circularises on recognition of its target polynucleotide sequence. By "target polynucleotide molecule," we include any specific polynucleotide sequence, such as a target DNA or RNA sequence, that the Probe is designed to recognise and hybridise to. In the context of the present invention, the target polynucleotide sequence may comprise incorrectly-edited polynucleotide sequence at and / or within an off-target site; and / or at and / or within an on-target site. The Probes may comprise parts designed to hybridise to non-overlapping sequences on the target polynucleotide molecule, and when these parts hybridize to their respective target sequences, the probe forms a circular structure, allowing for precise detection and amplification. This method can permit detection of multiple target sequences, including both RNA and DNA, in the same sample.
[0134] In some embodiments, the length of the target polynucleotide molecule may be longer or shorter than the length of the Probe's parts designed to hybridise to non-overlapping sequences on the target polynucleotide molecule. In some embodiments, the target polynucleotide sequence comprises at least 2 nucleotides, at least 5 nucleotides, at least 10 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides, or at least 100 nucleotides.
[0135] In some embodiments, the target polynucleotide sequence comprises between about 2 nucleotides and about 100 nucleotides, preferably between about 5 nucleotides and about 70 nucleotides, more preferably between about 15 nucleotides and about 50 nucleotides, most preferably from about 30 nucleotides to about 50 nucleotides.
[0136] Circularisation occurs when the ends of the polynucleotide sequence (which can be a probe or a target) hybridize next to each other, forming the circle, and a DNA ligase joins the ligation ends, completely closing the circular molecule. For gap-fill probes and / or molecular inversion probes, circularisation happens after a polymerase or a gap-fill oligonucleotide fills the gap between the ends of the probe and a ligase then joins the ends.
[0137] In an embodiment, the invention provides a method wherein circularisation of the Probe is mediated and / or improved by one or more Joining probe; optionally wherein the one or more Joining probe is a Selector probe.
[0138] By "joining probe" we include Probes that hybridize next to each other and, after ligation, form a single sequence strand. It will be appreciated that, in an embodiment, the invention provides a method wherein the circularisation of the Probe is mediated and / or improved by one or more Selector probe.
[0139] As used herein the term "selector probe" includes probes comprising sequences at their ends complementary to sequences in a target nucleic acid molecule.
[0140] In some embodiments, a selector probe is a double stranded DNA oligonucleotide with a longer nucleic acid strand overhanging both ends of a shorter nucleic acid strand, which longer strand comprises at its ends sequences complementary to sequences in a target nucleic acid molecule, for example as described in US 7,883,849. In some embodiments, a selector probe is a single stranded DNA oligonucleotide that comprises at its ends sequences complementary to sequences in a target nucleic acid molecule (as described in Johansson et al., 2011, Nucleic Acids Res, 39(2):e8).
[0141] Following hybridisation, the selector Probe and the target nucleic acid fragment are joined by ligation to give circular single-stranded polynucleotide substrates. Alternative methods for selection and amplification of a nucleic acid involve the general circularisation of genomic fragments as described in Drmanac et al. 2010 (Drmanac et al., 2010, Science. 327(5961):78-81) and US 8,518,640. Selector probe-based methods for selecting a target nucleic acid sequence are also provided in WO 03 / 044229. After exonucleolytic digestion of any non-circular targets, RCA is performed thus generating multiple copies of the target polynucleotide sequence.
[0142] In some embodiments, selector probes may comprise degenerate sequence and / or tags or detection sequences (such as barcodes, identificatory motifs, UMIs, or binding sites for detection probes or primers). In a specific embodiment, a selector Probe comprises two overhang sequences at both ends, wherein one end is complementary to the target sequence and the other end comprises degenerate sequence. Target sequence can be either a wild-type sequence or the Oligonucleotide-Tag sequence. Examples of such selector Probes are presented in Figures 3 and 4. It should be appreciated that those selector Probes are particularly useful to detect long DNA rearrangements, such as chromosomal aberration at both off-target and on-target sites.
[0143] In the embodiment where a selector probe comprises tags or detection sequences (such as a capture sequence, barcode or UMI), the selector probe will bind digested gDNA as depicted in Figures 3 and 4. In this embodiment, the target DNA hybridises to the two overhang sequences at both ends of the selector probe, wherein one end is complementary to the integrated Oligonucleotide-tag (Figure 3) or a target sequence (depicted as "on-target" site (known) in Figure 4). As depicted in Figures 3 and 4, the 3' and 5' ends of the target DNA hybridise with the selector probe to form double stranded DNA, such that a gap is formed between the 3' and 5' ends of the target DNA. The part of the selector probe sequence which is single stranded in this construct may then comprise tags or detection sequences, and the gap is formed on the corresponding side of the target DNA that surrounds the tags or detection sequences. The gap can be filled using polymerisation or gap-fill oligo as explained elsewhere herein. The skilled person will appreciate that the choice of the method for filling the gap will depend on the type of the selector probe in the system. For example, if a UMI sequence is engineered into a selector probe, a polymerase will be more suitable as the unknown sequence of the UMI makes designing the complementary gap-fill oligo impossible. In some embodiments, the selector probe comprises any one or more from the list comprising: Oligonucleotide-tag complementary sequence, tags, capture sequences, barcode, UMIs, detection sequence, or a degenerate sequence.
[0144] In a preferred embodiment, the invention provides a method wherein circular single-stranded polynucleotide substrates are formed by ligation of the Probe; optionally wherein ligation is performed by a ligase, preferably a ligase with specific intramolecular ligation activity.
[0145] Where the Probe is a multi-part padlock, the connector oligonucleotide may be contacted with the sample and allowed to hybridise, and the gap oligonucleotide(s) or joining probes may then be added, if used, and allowed to hybridise. Alternatively, all the parts of a multi-part probe may be added together.
[0146] Ligation involves creating a phosphodiester bond between the 3' OH group at the end of one ligation end of a probe, and the 5' phosphate group at the second ligation end of the same or another probe or oligonucleotide. Ligation can be achieved by enzymatic and / or chemical methods. Ligation achieved by chemical methods may involve click chemistry, native chemical ligation, thiol-maleimide ligation, Staudinger and / or tetrazine ligation; preferably, ligation achieved by chemical methods involves click chemistry.
[0147] DNA ligases are commonly used for ligating oligonucleotides enzymatically. Ligation reactions may require cofactors (such as ATP or NAD+) to provide energy for bond formation. The ligase recognises the ends of the DNA strands (such as the ligatable ends at the ligation site) and seals them together, producing a continuous DNA strand. Enzymatic ligation is highly specific and efficient. Preferably, a ligase enzyme is selected which is capable of catalysing the formation of a phosphodiester bond in a target-specific manner.
[0148] Preferably, the ligase is a thermostable DNA ligase, which catalyses the formation of a phosphodiester bond between two adjacent deoxyribonucleotides hybridized to a target nucleic acid molecule.
[0149] A suitable ligase and any necessary or desirable reagents can be combined with the reaction mixture and maintained under conditions that allow for the ligation of hybridized oligonucleotides. Ligation reaction conditions are well known to those skilled in the art. During ligation, the reaction mixture may be kept at temperatures ranging from about 4°C to 105°C, 4°C to 80°C, 10°C to 70°C, or 15°C to 60°C, typically from 30°C to 60°C, for a duration ranging from about 5 seconds to 16 hours, such as from 1 minute to 1 hour. In other embodiments, the reaction mixture may be maintained at temperatures ranging from 35°C to 45°C, such as 37°C to 42°C, for a duration ranging from 5 seconds to 16 hours, such as from 1 minute to 1 hour, including from 2 minutes to 8 hours.
[0150] In an embodiment, the invention provides a method wherein Rolling Circle Amplification of the circular single-stranded polynucleotide substrates is initiated by the target polynucleotide sequence or by an amplification primer that is complementary to the circular single-stranded polynucleotide substrate.
[0151] In a further embodiment, Rolling Circle Amplification of the circular single-stranded polynucleotide substrates is initiated by more than one amplification primer.
[0152] In an embodiment, the Probe (and consequently any circular single-stranded polynucleotide substrate produced by ligating said Probe) comprises more than one amplification primer-binding site or sequence. The Probe may comprise one or more universal (i.e. identical) amplification primer-binding sites or sequences, or the Probe may comprise one or more different amplification primer-binding sites or sequences. In an embodiment where the one or more amplification primer-binding sites share the same sequence, a universal primer binding said sequence may be used, and multiple copies of the universal primer may bind the Probe and each may initiate RCA. In another embodiment, the one or more amplification primer-binding sites have one or more different sequences; in that embodiment, one or more different primers may be used to bind the Probe and initiate RCA.
[0153] In an embodiment, two or more amplification primers are used. In an embodiment, three or more amplification primers are used. In an embodiment, four or more amplification primers are used. In an embodiment, five or more amplification primers are used. In an embodiment, ten or more amplification primers are used.
[0154] Once circular single-stranded polynucleotide substrates are formed, they are amplified in order to generate a number of complementary copies, which may increase the sensitivity of the detection method of the present invention. In some embodiments, the circular single-stranded polynucleotide substrates are amplified using any convenient method known in the art, such as PCR or a variant thereof, Strand Displacement Amplification (SDA), Helicase-Dependent Amplification (HDA), Loop-Mediated Isothermal Amplification (LAMP) or Smart Amplification Process (SMAP).
[0155] In a preferred embodiment, the circular single-stranded polynucleotide substrates are amplified by Rolling Circle Amplification (RCA). Rolling Circle Amplification using a circular single-stranded polynucleotide substrate for amplification results in a concatenated RCA-Product comprising multiple tandem repeats, each having a sequence complementary to the circular single-stranded polynucleotide substrate. In an embodiment, the polymerase that performs RCA requires a 3'OH end to initiate RCA.
[0156] In some embodiments, the target nucleic acid molecule may comprise or provide a 3' end (3'OH) which can serve as a primer for RCA amplification. When RCA is initiated, the polymerase itself digests the 3' end until it reaches the circularised part and then can initiate RCA.
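The relationship between the circular substrate and the concatenated RCA-Product described above can be illustrated in silico. The following Python sketch is an idealised model only: it assumes a DNA circle written 5' to 3' from an arbitrary linearisation point, assumes that synthesis starts at that point, and ignores primer position, branching and incomplete final repeats. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def rca_product(circle: str, repeats: int) -> str:
    """Idealised RCA-Product: `repeats` tandem copies, each complementary to the circle.

    Copying the circle `repeats` times around yields the reverse complement of
    the circle sequence repeated in tandem (written 5' to 3').
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return reverse_complement(circle) * repeats


# Hypothetical 4-nucleotide circle, far shorter than any real Probe.
circle = "ATGC"
product = rca_product(circle, 3)
```

In this toy example each repeat of the product is GCAT, the reverse complement of ATGC, so the reverse complement of the whole product reads as three consecutive passes around the circle.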
[0157] Alternatively, RCA may be performed by contacting the sample with a separate primer complementary to and capable of hybridising to the one or more circular single-stranded polynucleotide substrates, and the 3' end of said primer may be extended to provide a concatenated RCA product.
[0158] By "amplification primer" we include a polynucleotide sequence that is fully or partially complementary to the one or more circular single-stranded polynucleotide substrate. In an embodiment the amplification primer can be modified with a moiety to protect it from exonuclease activity. Such moieties can be selected from the group consisting of ortho methyl RNA bases and alpha-thiol phosphate linkages.
[0159] Amplification of the circular single-stranded polynucleotide substrates is carried out using a DNA polymerase enzyme, which is capable of synthesizing a DNA oligonucleotide from dNTPs. The term "DNA polymerase" here refers to any enzyme that can incorporate dNTPs with both DNA and RNA polymerase activity. Examples of polymerase enzymes for this step include Phi 29 DNA Polymerase, Vent DNA polymerase, T7 RNA Polymerase and Bst DNA polymerase.
[0160] In an embodiment, the invention provides a method wherein the one or more RCA-Products generated in step (ii) are labelled with a detectable moiety; optionally, wherein the detectable moiety is selected from the group comprising: a fluorophore; a chromophore; or a combination thereof.
[0161] In an embodiment, the invention provides a method wherein the RCA-Products are labelled directly with a detectable moiety. A directly detectable moiety is one that can be directly detected without the use of additional reagents, while an indirectly detectable label is one that is detectable by employing one or more additional reagents. As explained below, the Probe described herein, or one or more parts of a Probe, may comprise a sequence which may serve to introduce a sequence into the RCA-Product, for example a tag (i.e. a detection sequence, barcode or identificatory motif), for detection. Tags may be designed with different needs / purposes, for example, to introduce a common sequence to enable either different probes in a multiplex setting to be processed together (for example, to introduce a binding site for a universal or common amplification primer) or to introduce a unique sequence to enable them to be distinguished from one another (for example, a barcode or a Unique Molecular Identifier (UMI)).
[0162] In an embodiment, different detectable moieties are used to label the RCA-Products generated from different incorrectly-edited polynucleotide sequences. In that embodiment, the RCA-Products are differentially-labelled, allowing differentiation of RCA-Products that originated from different Probes (for example, designed to bind different Oligonucleotide-Tags, or when different types of Probes are combined in one reaction). In an embodiment, the detectable moiety may be selected from the list consisting of fluorophores, radioisotopic labels, chemiluminescent labels, phosphorescent labels, chromophores, nanoparticles and particles, such as gold particles, silver particles, quantum dots and the like. In a preferred embodiment, the detectable moiety is a spectrally separated fluorophore selected from the list consisting of Cyanine 3, Cyanine 5, Alexa Fluor family dyes (such as 488 and 750), fluorescein (FITC), Atto family dyes (such as ATTO 550 and ATTO 488), quantum dots, and synthetic fluorophores. The labelling reagent employed in such embodiments can be a fluorescently tagged nucleotide(s), e.g. fluorescently tagged CTP (such as Cy3-CTP, Cy5-CTP) etc.
[0163] The detectable moieties as described herein can be conjugated to an oligonucleotide to form a "detection probe". Accordingly, detection probes can be labelled by various reporter detectable moieties as defined above.
[0164] In an embodiment, the invention provides a method wherein step (iii) comprises detecting the one or more RCA-Products by microscopy; optionally wherein microscopy is selected from the group comprising: bright-field microscopy; fluorescence microscopy; or a combination thereof.
[0165] In an embodiment, the invention provides a method wherein step (iii) comprises determining the sequence of the one or more RCA-Products, for example by DNA sequencing.
[0166] As used herein, the term "DNA sequencing" refers to sequencing techniques such as sequencing by ligation, sequencing by synthesis or sequencing by hybridization as well as Sanger sequencing or next-generation sequencing. Such methods are well known to those skilled in the art of molecular biology. Examples 2 and 3 below demonstrate detection of the RCA-Products by DNA sequencing using the MinION sequencing platform (Oxford Nanopore Technologies). This platform offers several advantages, such as real-time and portable sequencing technology. It can sequence very long DNA fragments and is known for its versatility and ability to provide rapid results. Other platforms that can be used with the methods of the invention are Illumina Sequencing, Pacific Biosciences (PacBio) Sequencing (Metzker, Nat Rev Genet 11, 31-46 (2010)) or Sequencing-by-ligation (as explained elsewhere herein). Illumina Sequencing uses sequencing-by-synthesis technology, where fluorescently labelled nucleotides are incorporated into a growing DNA strand and detected in real-time. PacBio's Single Molecule Real-Time (SMRT) sequencing technology allows for long-read sequencing, which is beneficial for studying complex genomic regions and structural variations. Ion Torrent Sequencing detects hydrogen ions released during DNA synthesis, allowing for semiconductor-based sequencing.
[0167] In some embodiments, the method further comprises a step of monomerising RCA-Products prior to sequencing. In some embodiments the method further comprises a step of synthesising the second strand of the monomers.
[0168] In some embodiments, the method further comprises a step of ligating adaptors to each end of the monomer. Adaptors are short, double-stranded DNA sequences that are attached to the ends of the monomers. This allows the monomers to bind to the sequencing platform, enabling the sequencing process.
[0169] In some embodiments, the adaptors are added during PCR amplification. Primers with adaptor sequences can be used to amplify the monomers, incorporating the adaptors into the amplified product.
[0170] In an embodiment, the invention provides a method wherein the one or more RCA-Products are flown through a microfluidic channel so as to concentrate the one or more RCA-Products into a specified area, for example into a single view area of a microscope.
[0171] The liquid flow may be achieved through use of a pump system, or the flow may be achieved passively. This concentrating step may be referred to as microfluidic enrichment.
[0172] The enrichment / immobilization of the RCA-Products allows for lower amounts / concentrations of polynucleotides to be used compared to currently known procedures. Particular amounts / concentrations of polynucleotides that can be used are outlined above.
[0173] Quantification is achieved by imaging immobilized RCA-Products with a fluorescence microscope. Resulting images may be processed with image processing software that performs top-hat filtering, spot registration, spot size filtering and spot quantification algorithms, where each spot corresponds to an RCA-Product resulting from a target molecule. This counting of individual spots or positive entities is termed digital quantification, as discussed herein.
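Purely by way of example, the image processing steps named above (top-hat filtering, spot registration, spot size filtering and spot quantification) can be sketched in plain Python as follows. This is a simplified illustration and not the software used in the Examples: it assumes a small greyscale image held as a list of rows, a square structuring element, 4-connected spots and hypothetical values for the threshold and the permitted spot sizes.

```python
def _window_filter(img, radius, reducer):
    """Apply min or max over a (2r+1) x (2r+1) window, clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = reducer(
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            )
    return out


def white_tophat(img, radius):
    """Image minus its morphological opening (erosion followed by dilation)."""
    opened = _window_filter(_window_filter(img, radius, min), radius, max)
    return [[p - o for p, o in zip(row, orow)] for row, orow in zip(img, opened)]


def count_spots(img, radius=2, threshold=50, min_size=2, max_size=20):
    """Count connected bright regions whose pixel area lies in [min_size, max_size]."""
    filtered = white_tophat(img, radius)
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if seen[y][x] or filtered[y][x] < threshold:
                continue
            # Spot registration: flood fill (4-connectivity) and measure the area.
            stack, size = [(y, x)], 0
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                size += 1
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and filtered[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Spot size filtering: ignore regions that are too small or too large.
            if min_size <= size <= max_size:
                count += 1
    return count


# Synthetic 12 x 12 image: uneven background, two 2 x 2 spots, one hot pixel.
image = [[10 + x for x in range(12)] for _ in range(12)]
for y, x in [(2, 2), (2, 3), (3, 2), (3, 3), (8, 7), (8, 8), (9, 7), (9, 8)]:
    image[y][x] = 200
image[5][10] = 200  # single hot pixel, removed by the size filter
n_spots = count_spots(image)
```

In this synthetic image the top-hat step removes the uneven background, the two 2 x 2 spots are counted, and the single hot pixel is rejected by the size filter, so two spots are reported.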
[0174] In an embodiment, the invention provides a method wherein step (iii) comprises immobilizing the one or more RCA-Products on a surface; for example, by electrostatic interaction, covalent interaction and / or steric interaction with said surface.
[0175] The "immobilization on a surface" (or "immobilization on a solid phase") can be achieved through various methods.
[0176] In one embodiment, the target polynucleotide sequences can initially be captured by immobilized (or immobilizable) capture probes as explained above. The amplified RCA-Product can then be generated such that it is attached to the target polynucleotide sequence molecules as described above. In other words, the capture probe may be an immobilized (or immobilizable) probe that is specific to the incorrectly-edited polynucleotide sequence, containing a complementary binding sequence. Subsequently, the immobilized target polynucleotide molecule can be detected and / or characterised as explained elsewhere herein.
[0177] In some embodiments, the RCA-Product is immobilized to the surface. Preferably, immobilisation of the RCA-Product to the surface is achieved to aid detecting the RCA-Product, for example, via microscopic imaging. Immobilization of RCA-Products can be achieved by electrostatic interaction, i.e. utilising changes in surface charge between RCA-Product and the surface, for example by coating the solid surface with positively charged polymers or molecules, such as poly-L-lysine or aminosilanes which can then attract RCA-Products which are negatively charged due to their phosphate backbone. In alternative embodiments, RCA-Products are adsorbed onto surfaces (such as gold or silica) that have been modified to carry a positive charge.
[0178] In some embodiments, the RCA-Products are immobilized to the surface by introducing and / or chemically modifying one or more functional groups of the RCA-Products, so that covalent bonds are formed with the substrate surface. For example, the phosphate group of the RCA-Products can be derivatized to create a reactive moiety. Alternatively, chemical crosslinkers can be used to attach DNA to the surface. Some agents (such as glutaraldehyde and carbodiimides) are known to form covalent bonds between the DNA and functional groups on the solid support. In yet another alternative embodiment, RCA-Products are immobilized using photochemical or enzymatic reactions.
[0179] The solid support can be any of the well-known supports or matrices commonly used or proposed for immobilization and separation. These supports may include particles (e.g., beads that can be magnetic or non-magnetic), sheets, gels, filters, membranes, fibres, capillaries, or microtiter strips, tubes, plates, or wells.
[0180] The support material can be made of glass, silica, latex, or a polymeric substance. Suitable materials are those that offer a high surface area for analyte binding. These supports may have an irregular surface and can be porous or particulate, such as particles, fibres, webs, sinters, or sieves. Particulate materials like beads are particularly useful due to their higher binding capacity, especially polymeric beads.
[0181] In a further embodiment, the target nucleic acid molecule itself may be immobilised (or immobilizable) on the solid phase e.g. by non-specific absorption.
[0182] In an embodiment, the invention provides a method wherein the method comprises the step of attaching the one or more RCA-Products to magnetic beads to provide bead-bound RCA-Products; optionally where the one or more RCA-Products are immobilized by providing a magnetic source so as to attract the bead-bound RCA-Products to a position on the surface.
[0183] As explained above, the term "capturing" includes techniques to isolate molecules, such as incorrectly-edited target polynucleotide sequences or RCA-Products. Capturing RCA-Products can be achieved by hybrid capture which relies on magnetic beads.
[0184] In some embodiments, RCA-Products are captured using probes that hybridize to a sequence which is complementary to the RCA-Products, enabling the RCA-Products to be isolated using magnetic beads or other methods. In an embodiment, the RCA-Products are captured using magnetic beads. Conveniently, the magnetic beads may be coated with one or more ligands (such as streptavidin) to bind the RCA-Products (such as avidin-conjugated probes able to hybridise to the sequence complementary to the RCA-Product).
[0185] The magnetic beads may have an average size of from about 10 nm to about 5 µm, for example from about 10 nm to about 2 µm, such as about 500 nm to about 2 µm. In this regard, the magnetic beads may have an average diameter from about 10 nm to about 5 µm, for example from about 10 nm to about 2 µm, such as about 500 nm to about 2 µm or about 50 nm to about 200 nm. In a preferred embodiment, magnetic beads have an average size of from about 30 nm to about 200 µm.
[0186] Monodisperse magnetic beads, which are beads with a uniform size distribution (e.g., having a diameter standard deviation of less than 5%), offer the advantage of highly consistent reaction reproducibility. Magnetic beads are particularly beneficial for manipulation and separation processes. The term "magnetic" refers to the support's ability to acquire a magnetic moment when exposed to a magnetic field, meaning it is paramagnetic and can be moved by that field. In other words, a support containing magnetic particles can be easily removed through magnetic aggregation, providing a quick, simple, and efficient method for separating the particles (such as RCA-Products) after the binding steps.
[0187] In an embodiment, the invention provides a method wherein the surface is a glass surface, optionally a glass surface which is modified to interact with an RCA-Product.
[0188] In an embodiment the glass surface may be modified with positively charged homopolymers, for example poly-L-Lysine, poly-D-lysine or aminosilane.
[0189] RCA-Products may also contain affinity molecules such as biotin by introducing biotin-modified dNTPs into the RCA reaction mix. Biotinylated RCA-Products can be immobilized on a glass surface modified with streptavidin.
[0190] In an embodiment, the invention provides a method wherein the surface is a porous membrane, such as a filter membrane, for example a porous hydrophilic membrane; optionally wherein the one or more RCA-Products are immobilized by filtering through the porous membrane.
[0191] The liquid sample may be drawn through the porous membrane by gravity filtration, by applying a vacuum pump, or by capillary forces.
[0192] In an embodiment, the liquid sample is drawn through the porous membrane by capillary forces by applying the liquid sample to one side of the porous membrane and applying an absorption layer to the other side of the porous membrane to put the absorption layer and porous membrane into liquid connection and suck the liquid from the liquid sample.
[0193] For the avoidance of doubt, the porous membrane is permeable for the liquid in the liquid sample, but substantially impermeable to the RCA-Products.
[0194] By the term "capillary force(s)" as used herein, this refers to the sucking or wicking of liquid through the porous membrane so as to immobilize the RCA-Products on the surface of the membrane.
[0195] In an embodiment the area of the porous membrane corresponds to a single field of view of an optical sensing device.
[0196] In an embodiment the porous membrane has a thickness of from about 0.01 µm to about 100 µm, such as from about 0.05 µm to 0.5 µm, for example from about 0.07 µm to about 0.2 µm, or wherein the filter membrane has a thickness of about 0.1 µm.
[0197] In another embodiment, the porous membrane has a surface area of from about 2 to about 20 mm², such as from about 5 to about 15 mm², for example from about 5 to about 10 mm².
[0198] In a further embodiment, the porous membrane is substantially circular in shape, such as circular in shape, wherein the porous membrane has a diameter in the range of from about 0.1 to about 10 mm, such as from about 0.5 mm to about 10 mm, for example from about 1 mm to about 5 mm, or from about 1 mm to about 3 mm, or wherein the filter membrane is circular having a diameter of about 2 mm.
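The relationship between a circular membrane's diameter and its surface area can be worked through with a short calculation. The Python sketch below is illustrative only. The function name is an assumption, and the range of about 2 to about 20 mm² is the broadest surface-area range stated for the porous membrane.

```python
import math


def circular_membrane_area_mm2(diameter_mm):
    """Area in mm^2 of a circular membrane with the given diameter in mm."""
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    radius_mm = diameter_mm / 2.0
    return math.pi * radius_mm ** 2


# A circular membrane of about 2 mm diameter has an area of pi * (1 mm)^2.
area = circular_membrane_area_mm2(2.0)
in_broadest_range = 2.0 <= area <= 20.0
print(f"{area:.2f} mm^2, within about 2 to about 20 mm^2: {in_broadest_range}")
```

For a diameter of about 2 mm this gives roughly 3.1 mm², which lies within the broadest stated surface-area range.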
[0199] In an embodiment, the invention provides a method wherein the sample further comprises one or more correctly-edited polynucleotide sequence and / or one or more unedited polynucleotide sequence.
[0200] As explained herein, during genetic-editing procedures unintended alterations can occur at both on-target and off-target sites. The method of the invention is particularly advantageous to detect and characterise incorrectly-edited polynucleotide sequences. In a further embodiment, the method of the invention can also be performed with additional steps that permit the detection and / or characterisation of correctly-edited polynucleotide sequence and / or unedited polynucleotide sequence in the sample.
[0201] In an embodiment, the invention provides a method further comprising the step of generating RCA-Products from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence.
[0202] RCA-Products from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence can be generated using Rolling Circle Amplification ("RCA").
[0203] It will be appreciated that RCA-Products from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence after RCA can be processed together with the RCA-Products from the one or more incorrectly-edited polynucleotide sequence.
[0204] Preferably, step (i-a) additionally comprises generating circular single-stranded polynucleotide substrates from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence. Those circular single-stranded polynucleotide substrates are produced using Probes followed by ligation. That is preferably performed in one of two ways.
[0205] - In one preferred embodiment, probes are specifically engineered to comprise sequences allowing the detection and / or characterisation of the one or more correctly-edited polynucleotide sequences and / or unedited polynucleotide sequence in the sample.
[0206] In that embodiment, a probe (such as a padlock probe or a gap-fill probe) is engineered to comprise sequences complementary to a target nucleic acid molecule at its 3' and 5' ends, such that upon hybridisation of a probe to its target nucleic acid molecule, the ends of the probe are brought into juxtaposition, providing ligatable ends for ligation. When the 3' and 5' ligatable ends of the probe hybridise and ligate, the probe becomes circularized. Thus, a probe having ligatable 3' and 5' ends engineered to specifically recognise and bind the one or more correctly-edited polynucleotide sequence, will permit identification of a correctly-edited polynucleotide sequence in the sample; and a probe having ligatable 3' and 5' ends engineered to specifically recognise and bind the one or more unedited polynucleotide sequence, will permit identification of an unedited polynucleotide sequence in the sample. Such probes are disclosed, for example, in WO 2022 / 194886.
[0207] When probes specific for one or more correctly-edited polynucleotide sequence and / or one or more unedited polynucleotide sequence are provided in step (i-a), the ligation and subsequent amplification of the circular single-stranded polynucleotide substrates can be achieved together with the ligation and amplification of the circular single-stranded polynucleotide substrates obtained from the one or more incorrectly-edited polynucleotide sequence, as explained herein.
[0208] - In an alternative, preferred embodiment, the one or more correctly-edited polynucleotide sequence is detected and / or characterised by virtue of the presence at the correctly-edited site of an integrated Oligonucleotide-Tag.
[0209] Such Oligonucleotide-Tags are as already discussed herein, and will be integrated into the polynucleotide sequence during the genetic-editing procedure. As already discussed above, genetic-editing involves the formation of a Double Strand Break (DSB) at the intended on-target site and, when that occurs, the Oligonucleotide-Tag (as defined herein) can be integrated. In that embodiment, one or more Probe (as already described herein) can be used to detect and / or characterise the one or more correctly-edited polynucleotide sequence, and the disclosures herein apply also to this embodiment of the invention.
[0210] Thus, and to reiterate, in this embodiment, a Probe is used that contains a first part comprising nucleotide sequence that is complementary to the Oligonucleotide-Tag, and a second part comprising degenerate nucleotide sequence. The first part of the Probe is therefore capable of binding to the Oligonucleotide-Tag at the correctly-edited site in the one or more correctly-edited polynucleotide sequence, and the second part of the Probe is capable of binding at, or adjacent to, the sequence at the correctly-edited site. Alternatively, a Probe is used that contains a first part comprising nucleotide sequence that is complementary to the Oligonucleotide-Tag, and a second part comprising a sequence complementary to the sequence of the expected edit. The first part of the Probe is therefore capable of binding to the Oligonucleotide-Tag at the correctly-edited site in the one or more correctly-edited polynucleotide sequence, and the second part of the Probe is capable of binding at, or adjacent to, the sequence at the correctly-edited site.
[0211] As already discussed herein, upon hybridisation of a Probe to its target nucleic acid molecule, the ends of the Probe are brought into juxtaposition, providing ligatable ends for ligation. When the 3' and 5' ligatable ends of the Probe hybridise and ligate, the Probe becomes circularized. RCA can then be performed, as already discussed herein.
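The requirement that the two ends of a Probe hybridise in direct juxtaposition before ligation can be illustrated in code. The Python sketch below is a simplified model under stated assumptions: exact Watson-Crick matching only, no degenerate bases, no gap-fill step, and hypothetical arm and target sequences. It is not a probe-design tool and does not represent the applicant's procedure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_ligation_junction(target, arm_5prime, arm_3prime):
    """Locate where the two probe arms meet on the target strand, or return None.

    On the circularised probe, the 3' arm is followed directly by the 5' arm,
    so the target (read 5'->3') must contain reverse_complement(arm_5prime)
    immediately followed by reverse_complement(arm_3prime).
    Returns the 0-based target index of the first base bound by the 3' arm.
    """
    site = reverse_complement(arm_5prime) + reverse_complement(arm_3prime)
    start = target.upper().find(site)
    if start == -1:
        return None  # arms do not hybridise adjacently; no ligatable nick
    return start + len(arm_5prime)


# Hypothetical sequences chosen only for illustration.
arm_5p = "AAGG"   # 5' end of the probe
arm_3p = "TTCA"   # 3' end of the probe
adjacent_target = "GGCCTTTGAACC"    # arm binding sites are contiguous
gapped_target = "GGCCTTATGAACC"     # one extra base separates the sites

print(find_ligation_junction(adjacent_target, arm_5p, arm_3p))
print(find_ligation_junction(gapped_target, arm_5p, arm_3p))
```

In this model a single intervening base on the target prevents the ends from being juxtaposed, so no junction is reported. In practice a gap-fill probe would address that case by filling the gap before ligation.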
[0212] When Probes specific for one or more correctly-edited polynucleotide sequence are provided in step (i-a), the ligation and subsequent amplification of the circular single-stranded polynucleotide substrates can be achieved together with the ligation and amplification of the circular single-stranded polynucleotide substrates obtained from the one or more incorrectly-edited polynucleotide sequence, as explained herein.
[0213] In the embodiments of the invention in which step (i-a) additionally comprises generating circular single-stranded polynucleotide substrates from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence, the following preferred embodiments are contemplated:
[0214] - Preferably, the Probe comprises degenerate nucleotide sequence capable of binding at, or adjacent to, nucleotide sequence at the correctly-edited site. As explained above, it will be appreciated that the degenerate nucleotide sequence in the Probe enables it to bind to, or adjacent to, nucleotide sequence at the correctly-edited site. By including degenerate nucleotide sequence, one or more molecule in a population of Probes will have sufficient homology to sequence at, or adjacent to, the nucleotide sequence of the correctly-edited site. In some embodiments, the Probe will hybridise to the Oligonucleotide-Tag, or part thereof, integrated into the correctly-edited site in the one or more correctly-edited polynucleotide sequence. It will be appreciated from the above paragraphs that ligation ends are required to be in direct juxtaposition and hybridised to their respective complementary base pairs in the target polynucleotide molecule in order for ligation to take place, and that the ligatable ends may be brought into juxtaposition for ligation in various ways, depending on the design of the probe and / or its parts and type of the correctly-edited polynucleotide sequence detected.
[0215] The preferred positions and number of degenerated bases in the Probe for detecting the one or more incorrectly-edited polynucleotide sequences will be applicable to the embodiment where the one or more correctly-edited polynucleotide sequences are detected.
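To illustrate why a population of Probes containing degenerate positions can include at least one molecule matching an unknown flanking sequence, the following Python sketch enumerates the concrete sequences represented by a degenerate arm. It uses the standard IUPAC nucleotide codes. The example arm sequence and function names are hypothetical, and the number of degenerate positions shown is not a value taken from the disclosure.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_variants(degenerate_seq):
    """Number of distinct concrete sequences encoded by a degenerate sequence."""
    total = 1
    for base in degenerate_seq.upper():
        total *= len(IUPAC[base])
    return total


def expand_degenerate(degenerate_seq):
    """List every concrete sequence encoded by a degenerate sequence."""
    pools = [IUPAC[base] for base in degenerate_seq.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical arm: two fixed bases followed by four fully degenerate positions.
arm = "ACNNNN"
print(count_variants(arm))          # 4**4 concrete sequences
print(expand_degenerate("ACNN")[:4])
```

Each additional fully degenerate position multiplies the population size by four, which is one practical consideration when choosing how many degenerate bases to include.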
[0216] Preferably, the Probe for detecting the one or more correctly-edited polynucleotide sequences or the one or more unedited polynucleotide sequences is selected from the group comprising: a padlock probe; a molecular inversion probe; a gap-fill probe; a split-like probe; a Lotus probe; or a combination thereof. As explained already above, the Probes can be engineered to comprise sequences complementary to a target nucleic acid of the one or more correctly-edited polynucleotide sequence or the one or more unedited polynucleotide sequence as disclosed, for example, in WO 2022 / 194886. Alternatively, the Probes for detecting the one or more correctly-edited polynucleotide sequence may comprise degenerate and / or Oligonucleotide-Tag complementary sequences as explained in detail above.
[0217] Preferably, the Probe for detecting the one or more correctly-edited polynucleotide sequence or the one or more unedited polynucleotide sequence, or parts of a probe, may include additional sequences to introduce a sequence into a ligation product (and RCA-Product). These sequences can be tags or detection sequences, such as barcodes, identificatory motifs, or binding sites for detection probes or primers. The additional sequences can be located at the 3' or 5' end of a Probe (preferably opposite the ligatable end), or within a circularisable backbone oligonucleotide not hybridised to the target nucleic acid. Additionally, a unique tag sequence can be used to label different probes (a "target" tag, or marker or barcode) or to tag different samples for pooling before a common amplification step (a "sample" tag, or marker or barcode) or a unique sequence (such as UMI). Preferably, the Probe for detecting the one or more correctly-edited polynucleotide sequence or the one or more unedited polynucleotide sequence circularises on recognition of the one or more correctly-edited polynucleotide sequence or the one or more unedited sequence. The circularisation occurs when the ends of the polynucleotide sequence (these can be a probe or a target) hybridize next to each other, forming the circle, and a DNA ligase joins the ligation ends, completely closing the circular molecule. For gap-fill probes and / or molecular inversion probes, circularization happens after a polymerase or a gap-fill oligonucleotide fills the gap in between the ends of the probe and then a ligase joins the ends.
[0218] Preferably, the Probe for detecting the one or more correctly-edited polynucleotide sequence or the one or more unedited polynucleotide sequence is mediated and / or improved by one or more Joining probe; optionally wherein the one or more Joining probe is a Selector probe.
[0219] Selector probes used in this embodiment of the invention are as described elsewhere herein. For example, the Selector probe as depicted in Figure 3 and Figure 4 may be adapted to bind correctly-edited target sequence (for example, by virtue of the sequence complementary to the Oligonucleotide-Tag integrated into the correctly-edited target sequence, and / or the sequence complementary to the unedited sequence adjacent to the edited sequence). In both variants, the degenerate sequence may bind to the unedited or correctly-edited target sequence. Accordingly, Selector probes as explained herein can detect correctly-edited and / or unedited target sequences.
[0220] Preferably, circular single-stranded polynucleotide substrates from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence are formed by ligation of the Probe; optionally wherein ligation is performed by a ligase, preferably a ligase with specific intramolecular ligation activity.
[0221] Preferably, Rolling Circle Amplification of the circular single-stranded polynucleotide substrates from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence is initiated by the target polynucleotide sequence or by an amplification primer that is complementary to the circular single-stranded polynucleotide substrate.
[0222] Preferably, the primer can be modified with a moiety to protect it from exonuclease activity. Such moieties can be selected from the group consisting of ortho methyl RNA bases and alpha-thiol phosphate linkages, as explained elsewhere herein.
[0223] Preferably, the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence generated in step (ii) are labelled with a detectable moiety; optionally, the detectable moiety is selected from the group comprising: a fluorophore; a chromophore; or a combination thereof.
[0224] - As explained elsewhere herein, the Probe or one or more parts of a Probe, may comprise a sequence which may serve to introduce a sequence into the RCA-Product, for example a tag (i.e. a detection sequence, barcode or identificatory motif), for detection. In an embodiment, different detectable moieties are used to label the RCA-Products generated from the one or more incorrectly-edited polynucleotide sequences, the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence. In that embodiment, the RCA-Products are differentially-labelled, allowing differentiation of RCA-Products that originated from different Probes (for example, designed to bind different Oligonucleotide-Tags, or when different types of Probes are combined in one reaction). The detectable moieties as described herein can be conjugated to an oligonucleotide to form a "detection probe". Accordingly, detection probes can be labelled by various reporter detectable moieties as defined above.
[0225] Preferably, step (iii) comprises detecting the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence by microscopy; optionally wherein microscopy is selected from the group comprising: bright-field microscopy; fluorescence microscopy; or a combination thereof. Preferably, step (iii) comprises determining the sequence of the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence, the one or more unedited polynucleotide sequence and / or the one or more incorrectly-edited polynucleotide sequence. More preferably, step (iii) comprises determining the sequence of the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence, the one or more unedited polynucleotide sequence and / or the one or more incorrectly-edited polynucleotide sequence by DNA sequencing. It will be appreciated that the processing of the RCA-Products (such as capture, monomerization etc) can be performed simultaneously for all RCA-Products.
[0226] Preferably, the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence, the one or more unedited polynucleotide sequence and / or the one or more incorrectly-edited polynucleotide sequence are flowed through a microfluidic channel so as to concentrate the one or more RCA-Products into a specified area, for example into a single view area of a microscope.
[0227] Preferably, step (iii) comprises immobilizing the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence, the one or more unedited polynucleotide sequence and / or the one or more incorrectly-edited polynucleotide sequence on a surface; for example, by electrostatic interaction, covalent interaction and / or steric interaction with said surface.
[0228] Preferably, the method comprises the step of attaching the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence, the one or more unedited polynucleotide sequence and / or the one or more incorrectly-edited polynucleotide sequence to magnetic beads to provide bead-bound RCA-Products; optionally the one or more RCA-Products are immobilized by providing a magnetic source so as to attract the bead-bound RCA-Products to a position on the surface. In an embodiment, the surface is a glass surface, optionally a glass surface which is modified to interact with an RCA-Product.
[0229] Preferably, the surface is a porous membrane, such as a filter membrane, for example a porous hydrophilic membrane; optionally wherein the one or more RCA-Products from the one or more correctly-edited polynucleotide sequence, the one or more unedited polynucleotide sequence and / or the one or more incorrectly-edited polynucleotide sequence are immobilized by filtering through the porous membrane.
[0230] In an embodiment, the invention provides a method wherein the efficiency of the genetic-editing procedure is determined based on the relative amounts of the incorrectly-edited polynucleotide sequences and / or the one or more correctly-edited polynucleotide sequences and / or the one or more unedited polynucleotide sequences in the sample.
[0231] As explained above, the method of the invention may be used to detect more than one target polynucleotide sequence. It is useful for identifying sequences of interest, such as incorrectly-edited sequences in genetic-editing procedures, as well as correctly-edited or unedited sequences, to assess the efficiency of the editing process. As explained elsewhere herein, correctly-edited and / or unedited polynucleotide sequences can also be targeted with the method of the invention, and detecting and / or characterising said sequences together with detecting and / or characterising the one or more incorrectly-edited polynucleotide sequences can be used to assess the efficiency of the genetic-editing procedure.
[0232] As explained above, if incorrectly-edited polynucleotide sequences are detected in a multiplex setting, they may be quantified relative to one another and / or may be quantified relative to the total number of polynucleotide sequences in the sample.
[0233] The means for quantifying the one or more incorrectly-edited polynucleotide sequence are explained elsewhere herein, and can also be used for quantifying the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence.
[0234] Accordingly, the invention provides a method wherein incorrectly-edited polynucleotide sequences are detected and quantified relative to the correctly-edited polynucleotide sequences and / or unedited polynucleotide sequence in the sample. The one or more incorrectly-edited polynucleotide sequences may be quantified relative to the total number of polynucleotide sequences in the sample by also quantifying the total number of the correctly-edited polynucleotide sequences and / or unedited polynucleotide sequence in the sample. Accordingly, in an embodiment, the method further comprises the step of quantifying the total number of polynucleotide sequences in the sample.
[0235] By "total number of polynucleotide sequences in the sample" we include the combined total number of incorrectly-edited polynucleotide sequences, correctly-edited polynucleotide sequences and unedited polynucleotide sequences in the sample. In an embodiment, the total number of incorrectly-edited polynucleotide sequences may be quantified followed by quantifying the total number of correctly-edited polynucleotide sequences, quantifying the total number of unedited polynucleotide sequences, and totalling the amounts. The quantification of each sequence type may be done in any order. Alternatively, the total number of incorrectly-edited polynucleotide sequences, correctly-edited polynucleotide sequences and unedited polynucleotide sequences can be quantified and totalled simultaneously.
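The relative quantification described above amounts to simple arithmetic on per-class counts. The Python sketch below is a minimal illustration, assuming that each detected RCA-Product has already been assigned to one of the three classes. The function name and the example counts are hypothetical and are not experimental data.

```python
def relative_amounts(incorrectly_edited, correctly_edited, unedited):
    """Return the total count and the fraction of each sequence class in the sample."""
    counts = {
        "incorrectly_edited": incorrectly_edited,
        "correctly_edited": correctly_edited,
        "unedited": unedited,
    }
    if any(value < 0 for value in counts.values()):
        raise ValueError("counts must be non-negative")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no polynucleotide sequences were counted")
    fractions = {name: value / total for name, value in counts.items()}
    return total, fractions


# Hypothetical counts of RCA-Products per class.
total, fractions = relative_amounts(
    incorrectly_edited=100, correctly_edited=700, unedited=200
)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of {total}")
```

Because the total is the sum of the three classes, the order in which the classes are counted does not affect the result.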
[0236] Preferably, the efficiency of the genetic-editing procedure is determined based on the relative amounts of the incorrectly-edited polynucleotide sequence, the correctly-edited polynucleotide sequences and / or unedited polynucleotide sequences in the sample.
[0237] For example, the number of incorrectly-edited polynucleotide sequences can be determined relative to a control with the remainder relating to unedited sequences. Should the total amount of combined edited and non-edited sequences not equal 100%, this difference can be attributed to incorrect or non-expected edits.
[0238] In an embodiment the efficiency of the gene-editing procedure may be determined by adding a detectable moiety that targets the same polynucleotide that is being edited, but in a different position to that which is being edited.
[0239] This allows for accurate quantification of the efficiency of the method because, depending on the method, a gene may be edited but not with the right sequence. The detectable moiety theoretically binds to all polynucleotide sequences in the sample, as it binds to a different position on the polynucleotide to that which is being edited, thus allowing the total number of sequences to be determined. The moieties targeting the edited and non-edited sequence will provide quantification of the total number of correctly-edited and non-edited sequences, whilst the difference between the total number of polynucleotide sequences and the combination of correctly-edited and non-edited sequences will allow quantification of the incorrectly-edited sequences. The inclusion of this control probe allows for very accurate analysis of efficiency without relying on other more expensive and complicated methods, such as ddPCR and qPCR.
[0240] In a second aspect, the invention provides the use of Rolling Circle Amplification for detecting and / or characterising incorrectly-edited polynucleotide sequences from a gene-editing procedure.
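The control-probe calculation described above can be sketched as follows. This Python example assumes that the control signal counts every polynucleotide sequence carrying the target locus and that the edit-specific and unedited-specific signals are counted on the same sample. The function name, the clamping of negative differences to zero and the example counts are illustrative assumptions.

```python
def infer_incorrectly_edited(total_from_control, correctly_edited, unedited):
    """Estimate incorrectly-edited sequences as total minus (correct + unedited).

    Assumption: counting noise could make the difference slightly negative,
    so the estimate is clamped at zero rather than reported as negative.
    """
    if min(total_from_control, correctly_edited, unedited) < 0:
        raise ValueError("counts must be non-negative")
    difference = total_from_control - (correctly_edited + unedited)
    return max(0, difference)


# Hypothetical counts: control signal, edit-specific signal, unedited-specific signal.
incorrect = infer_incorrectly_edited(
    total_from_control=1000, correctly_edited=700, unedited=200
)
print(incorrect, f"{incorrect / 1000:.1%}")
```

Clamping is one possible design choice. An implementation that needs to flag inconsistent measurements could instead raise an error when the difference is negative.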
[0241] In a third aspect, the invention provides a kit of parts comprising:
[0242] (i) an Oligonucleotide-Tag as defined herein;
[0243] (ii) a Probe, as defined herein; and
[0244] (iii) one or more reagent for performing Rolling Circle Amplification.
[0245] In some embodiments of the kit disclosed herein, the kit further comprises one or more of the following:
[0246] - one or more reagent for performing genetic-editing of a polynucleotide sequence;
- one or more reagent for capturing an RCA-Product;
- one or more monomerization reagent;
- one or more Probe controls and / or spike-in templates;
[0247] - one or more library preparation reagent; and / or
- instructions for performing the method according to any one of Claims 1 to 21.
[0248] In some embodiments, the kit further comprises one or more of the following:
[0249] - Lysis buffer;
[0250] - Neutralisation reagent;
[0251] - Fragmentation reagent;
[0252] - One or more fragmentation enzyme (such as AluI, DdeI, MseI);
[0253] - Hybridisation and Ligation (Probing) Buffer;
[0254] - One or more ligation enzyme (such as Ampligase / Tth Ligase / Taq DNA ligase);
[0255] - The RCA Buffer;
[0256] - One or more amplification enzymes (such as Phi29 polymerase or Exonuclease I);
- Hybridisation buffer;
[0257] - Biotinylated oligonucleotides;
[0258] - Streptavidin-coated magnetic beads;
- Magnetic Rack;
[0259] - Monomerisation buffer;
[0260] - One or more monomerization enzyme (such as AluI, SapI);
[0261] - 2nd strand synthesis buffer with 2nd strand synthesis primer;
[0262] - One or more 2nd strand synthesis enzyme (such as T4 DNA polymerase / Klenow fragment polymerase).
[0263] In a fourth aspect, the invention provides a population of DNA molecules obtained or obtainable by a method according to the first aspect of the invention, or by the use according to the second aspect of the invention.
[0264] Preferably, the population of DNA molecules comprises one or more RCA-Products comprising nucleotide sequence corresponding to an incorrectly-edited polynucleotide sequence from a genetic-editing procedure and nucleotide sequence corresponding to an Oligonucleotide-Tag.
[0265] In a fifth aspect, the invention provides a method for detecting and / or characterising a double-strand break in a polynucleotide sequence, the method comprising the steps of:
[0266] (i) providing a sample comprising a polynucleotide sequence having a double-strand break;
[0267] (ii) integrating an Oligonucleotide-Tag at the double-strand break;
[0268] (iii) performing Rolling Circle Amplification, to generate one or more RCA-Products from the double-strand break in the sample; and
[0269] (iv) detecting and / or characterising the double-strand break in the polynucleotide sequence based on the one or more RCA-Products generated in step (iii).
[0270] In some embodiments of the method according to the fifth aspect as disclosed herein, the double-strand break is at an unknown site in the one or more polynucleotide sequence.
[0271] It will be clear from the description of the invention herein, that the present invention provides a method by which a double-strand break in a polynucleotide sequence can be detected and / or characterised. As explained, such double-strand breaks may occur at one or more unknown sites in a polynucleotide sequence, for example as a result of genetic-editing procedures. Detecting the presence of such double-strand breaks, and / or characterising the double-strand break (for example, determining its location and / or sequence) can be an important step of molecular analysis, but detection and characterisation is challenging when the location of the double-strand break is unknown.
[0272] It will be apparent that the embodiments of the method of the first aspect of the invention are applicable to the fifth aspect of the invention. Those embodiments apply accordingly to the fifth aspect of the invention, in detecting and / or characterising a double-strand break in a polynucleotide sequence.
[0273] In a sixth aspect, the invention provides for the use of Rolling Circle Amplification for detecting and / or characterising a double-strand break in a polynucleotide sequence.
[0274] Preferably in the method of the fifth aspect of the invention or the use of the sixth aspect of the invention, the polynucleotide sequence having a double-strand break is generated in a genetic-editing procedure.
[0275] Preferably, in the methods and uses of the invention, the sample from a genetic-editing procedure comprises or consists of: a crude DNA extract; whole genome amplified DNA; and / or purified genomic DNA.
[0276] The sample may be any sample, from any source or of any origin, in which it is desired to detect a target polynucleotide sequence in one or more incorrectly-edited polynucleotide sequence. A sample may thus be any clinical or non-clinical sample, and may be any biological, clinical or environmental sample in which the target polynucleotide sequence may occur. Representative samples include any biological material which may contain a target polynucleotide sequence, including for example whole blood and blood-derived products such as plasma, serum and buffy coat, blood cells, urine, faeces, cerebrospinal fluid or any other body fluids (e.g. respiratory secretions, saliva, milk, etc.), tissues, biopsies, cell cultures, cell suspensions, cell culture constituents, foods and allied products, or clinical and environmental samples. Such biological material may also comprise all types of eukaryotic cells (mammalian and non-mammalian animal cells, plant cells, algae, fungi, protozoa), prokaryotic cells (bacteria (i.e. from Eubacteria), archaeal cells (i.e. from Archaebacteria), mycoplasmas), virus, protoplasts and / or organelles.
[0277] The sample may be a fresh sample that has not undergone preservation, or a frozen, cryopreserved sample or fixed sample. By "fixed sample" we include samples treated to preserve cellular and tissue morphology by cross-linking proteins and stabilizing cellular structures, for example with formaldehyde, paraformaldehyde, and / or alcohol. Another example of "fixed sample" is a formalin-fixed, paraffin-embedded (FFPE) sample.
[0278] The sample may be pre-treated in any convenient or desired way to prepare for use in the methods and uses of the invention.
[0279] By "crude DNA extract" we include a sample comprising one or more polynucleotide sequence (preferably one or more DNA molecule) that has been isolated from one or more cell or one or more tissue with minimal processing. This means that the DNA is extracted quickly and with fewer purification steps, resulting in a mixture that still contains some cellular components, such as proteins, lipids, and other cellular debris.
[0280] Crude DNA extracts are often used in applications where high purity is not essential, such as initial screening experiments. The main advantage of using crude extracts is that they are faster and less labour-intensive to prepare compared to highly purified DNA samples.
[0281] Using crude DNA extracts in molecular procedures present several challenges. As explained above, crude extracts often contain residual cellular components and / or other cellular debris that can inhibit a reaction. Additionally, substances used during the extraction process (such as phenol, ethanol, or salts) can remain in the crude extract. The above contaminants can interfere with an enzyme (for example, affecting binding of primers or the activity of a DNA polymerase), leading to reduced efficiency or completely compromised amplification.
[0282] Additionally, the quality and quantity of DNA in crude extracts can vary significantly between samples. This inconsistency can result in variable results, making it difficult to reproduce experiments or compare data across different samples. In turn, crude extracts may not be suitable for applications requiring high sensitivity, such as detecting low- abundance targets.
[0283] In some embodiments the method comprises the step of preparing the sample prior to step (ii) of the method of the invention. In some embodiments, the preparation steps include any one from the list comprising: DNA extraction from the sample; DNA purification from the sample extract, denaturation of the DNA; and / or whole genome amplification. In some embodiments of the methods disclosed herein, the DNA is extracted from at least 1 cell, at least 10 cells, at least 100 cells, at least 1,000 cells, at least 10,000 cells, at least 25,000 cells, at least 50,000 cells, at least 75,000 cells, at least 100,000 cells, at least 250,000 cells, at least 500,000 cells, at least 750,000 cells, at least 1,000,000 cells, at least 10,000,000 cells, or at least 100,000,000 cells, or more. In a preferred embodiment, the DNA is extracted from at least 10 cells.
[0284] In some embodiments of the methods disclosed herein, the DNA is extracted from between 1 to 3,000,000 cells, 1 to 1,000,000 cells, 1 to 100,000 cells, 1 to 50,000 cells, 1 to 10,000 cells, or from between 1 to 1,000 cells. Preferably the DNA is extracted from between 1 to 1,000,000 cells, more preferably from between 10 to 1,000,000 cells.
[0285] In a seventh aspect, the invention provides a method, use, a kit of parts or a population of DNA molecules substantially as described herein, with reference to the accompanying description, examples and / or figures.
[0286] It will be apparent that, in addition to the aspects and embodiments of the invention described above, the method of the invention can also be used for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
[0287] Accordingly, in a further aspect, the present invention relates to methods for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure. The invention also relates to kits, populations of DNA molecules, and uses related to detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure; wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
[0288] As is well known, genetic-editing technologies have enabled the integration of payload sequences into target polynucleotide sequences. The term "payload sequence" generally refers to a specific polynucleotide sequence that contains, or encodes, genetic information that it is desired to integrate into a larger polynucleotide sequence such as a cell genome. A payload sequence may, for example, encode a therapeutic protein intended to produce a therapeutic effect, for example, in a recipient cell. Such payloads may include functional gene cassettes, regulatory elements, and / or therapeutic constructs.
[0289] Whilst it is known that payload sequences can be integrated into larger polynucleotide sequences via genetic-editing techniques, the process itself can often result in the payload sequence being integrated at an unknown site. However, current molecular biology approaches for detecting and / or characterising integrated sequences often require prior knowledge of the precise integration site, limiting their utility in cases of random or unknown site integration.
[0290] In relation to payload delivery, a disadvantage of random, unknown or unintended genomic site integration is loss or unpredictability of the function of the payload sequence. In some cases, random or unknown site integration increases a risk of insertional mutagenesis and / or cancer. More specifically, when a payload sequence integrates into a random, unknown or unintended genomic site, it could lead to:
[0291] - insertional mutagenesis, where the incorrectly integrated payload disrupts a host gene (e.g. integrating within a tumor suppressor or essential gene) or inserts near an oncogene, activating it through introduced promoters or enhancers. Both of these can increase the risk of malfunction or malignant transformation, and the development of cancer;
- genomic instability, where incorrect payload integration breaks or rearranges DNA, leading to chromosomal abnormalities (deletions, duplications, translocations), and over time, makes cells more prone to malfunction or malignant transformation, and the development of cancer;
- loss of function of the payload (for example, a loss of therapeutic effect), e.g. if a payload integrates in a recessive or heterochromatic region, the payload sequence might not be functional (for example, it might not be expressed, either optimally or at all);
[0292] - disruption of regulatory networks, particularly when a payload comprises promoters or enhancers which, if integrated near endogenous genes, may misregulate them, causing unintended gene expression; and / or
- mosaicism and variability: with random integration for applications inside host cells (such as human-derived cells), there is a risk that the payload sequence will integrate differently across different cells. This can lead to inconsistent therapy, where the cells in which payload integration was successful express the therapeutic gene strongly, while the cells in which the payload was incorrectly integrated, or not integrated at all, exhibit a defective phenotype.
[0293] The above risks can limit efficacy and complicate safety assessment of the cell or organism that is generated using gene-editing techniques.
[0294] Detecting and / or characterising the presence of payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure across the genome, currently remains a significant challenge. These challenges are discussed elsewhere herein in relation to identifying alterations (such as insertions or deletions ("indels"), translocations and / or complex genomic rearrangements), and apply equally to detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure.
[0295] Detecting a payload sequence which was integrated into an expected "on-target site" can be conducted by e.g. Amplicon Sequencing, discussed elsewhere herein.
[0296] At present, detecting a payload sequence which has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure can be conducted by methods developed to assess off-target effects, such as: (i) Whole Genome Sequencing (WGS); (ii) in vitro genomic DNA cleavage; (iii) anchored-primer target enrichment; (iv) in situ end-capture; (v) chromatin immunoprecipitation sequencing (ChIP-seq); and (vi) translocation enrichment. Those methods, and specific examples, are discussed in detail herein.
[0297] Importantly, a number of those methods suffer from a series of disadvantages, such as: (i) limited throughput and efficiency, and high cost; (ii) bias towards detecting more unintended activity in vitro than occurs in cellular environments, and not accounting for endogenous DNA repair mechanisms; (iii) low reproducibility; (iv) bias towards detecting cleavage events at specific time points, which may not represent the full extent of events over time; and (v) low sensitivity and poor accuracy. In addition, all the above methods use next generation sequencing, which, due to high sequencing cost, makes them impractical when dealing with large cell populations or complex organisms. In practice, it is often necessary to use a combination of methods to gain a complete picture of gene-editing outcomes, which again raises cost. Therefore, there remains a need for universal, targeted, genome-wide methods that can comprehensively and accurately detect and characterise the presence of payload sequences regardless of their integration site, across a range of cell types and experimental conditions. Such a method would give several important benefits in gene therapy development and clinical use. For example, early characterisation of incorrect payload integration sites would allow researchers / clinicians to:
- identify those events before they expand clonally and cause oncogenic transformations;
[0298] - characterise whether payloads become integrated in transcriptionally active regions or silent ones, which correlates with whether the payload sequence (such as a therapeutic gene) is likely to be expressed; and
- refine vector design for delivering payload sequences (for example, if a vector consistently leads to the payload integrating in random or incorrect locations, modifications can be made to reduce biased integration).
[0299] In addition, agencies like the European Medicines Agency (EMA) and U.S. Food and Drug Administration (FDA), which are regulatory agencies that oversee medicines and therapies including gene therapies, expect evidence that integrations are being monitored.
[0300] Against this background, the present inventors have developed methods, uses and kits for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure, which are surprisingly advantageous.
[0301] As already discussed above in relation to the preceding aspects and embodiments, the invention provides a method for detecting incorrectly-edited polynucleotide sequences from a genetic-editing procedure. In one embodiment, the invention comprises integrating an Oligonucleotide-Tag, and identifying its site of integration using Rolling Circle Amplification (RCA).
[0302] It will be appreciated that this same general concept and approach may be used in order to detect and / or characterise the presence of one or more payload sequence following a genetic-editing procedure, wherein the payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure. Just as RCA has not been previously used for detecting and / or characterising incorrectly-edited polynucleotide sequences from a genetic-editing procedure (as discussed and explained elsewhere herein), it has not been previously used for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
[0303] As explained herein, and shown in the accompanying Examples, the present invention provides methods, uses and kits that can detect and / or characterise the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure, across the whole genome (genome-wide), and that are cost effective and practical enough for routine use. Advantageously, the generated RCA Products (RCPs) can be analysed directly via optical imaging, or undergo further processing steps, such as sequencing, if required. Conveniently, further steps, such as enrichment, can be performed to provide even more accurate data.
[0304] Accordingly, the method of this further aspect of the invention offers several advantages over the methods in the art:
[0305] Cost efficiency:
[0306] - No reliance on NGS: the method does not require sequencing, though it remains compatible with sequencing if needed.
[0307] - Reduced costs: particularly because the method uses an approach combining targeted detection using Probes with RCA amplification, which can be performed with inexpensive reagents and standard lab equipment, making it practical for routine use.
[0308] - Scalability: because it avoids whole genome sequencing and / or non-target amplification, it is suitable for applications involving large cell numbers or whole organisms.
[0309] High sensitivity and specificity:
[0310] - Single-molecule resolution: RCA enables digital detection, characterisation and quantification of individual payload integration events.
[0311] - High fidelity amplification: use of a high-fidelity polymerase ensures low error rates compared to PCR-based methods.
[0312] - Direct detection of payload: compared to the first aspect of detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure, the present method does not rely on incorporating Oligonucleotide-Tags. Rather, it detects integrated payload sequences directly, particularly when a payload becomes integrated at one or more unknown sites.
[0313] Unbiased Genome-Wide detection:
[0314] - No prior knowledge of integration site required: use of degenerate probes allows detection of payloads at unknown, random or unintended genomic locations.
[0315] - Applicable to both on-target and off-target integration: The method is not limited to predefined loci, but can detect unintended integration events.
[0316] High reproducibility and multiplexing capability:
[0317] - Simplified workflow: fewer steps and reduced reliance on complex enzymatic reactions or multi-method combinations improve consistency.
[0318] - Simultaneous detection of multiple payloads: probes can be barcoded or labelled with spectrally distinct fluorophores, enabling parallel analysis, or characterisation by sequencing.
[0319] Regulatory and clinical utility:
[0320] - Supports safety assessments: enables early identification of insertional mutagenesis or integration near oncogenes.
[0321] - Facilitates regulatory compliance: provides comprehensive data on payload integration, supporting EMA / FDA expectations for gene therapy products.
[0322] In an eighth aspect, the invention provides a method for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure, the method comprising the steps of:
[0323] (i) providing a sample from a genetic-editing procedure, the sample comprising the one or more payload sequence;
[0324] (ii) generating circular single-stranded polynucleotide substrates from the one or more payload sequence, and performing Rolling Circle Amplification to generate RCA-Products from the one or more payload sequence in the sample; and
(iii) detecting and / or characterising the presence of the one or more payload sequence based on the RCA-Products generated in step (ii).
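By way of illustration only, the amplification part of step (ii) can be sketched as a toy in-silico model. The Python sketch below is not part of the claimed method: the function names `reverse_complement` and `rolling_circle_product`, the example sequence, and the assumption of an idealised polymerase that copies the circular template an exact number of times starting from its 3' end are all hypothetical simplifications introduced here. It shows only that an RCA-Product is a tandem concatemer of the complement of the circular template.

```python
# Toy model of Rolling Circle Amplification (illustrative assumption, not the
# patented protocol): an idealised polymerase copies a circular single-stranded
# template `rounds` times, yielding a tandem concatemer.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def rolling_circle_product(circle: str, rounds: int) -> str:
    """Return an idealised RCA product for a circular template.

    `circle` is the circular template written 5'->3' from an arbitrary start
    point; the product is `rounds` tandem copies of its reverse complement.
    """
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    return reverse_complement(circle) * rounds


if __name__ == "__main__":
    # Hypothetical 4-nt circle, used only to keep the example readable.
    print(rolling_circle_product("AACG", 3))  # prints CGTTCGTTCGTT
```

In this simplified picture, each additional pass around the circle adds one more copy of the same repeat unit, which is why a single circularised molecule can give rise to a long, locally concentrated product.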
[0325] Thus, the invention provides a method for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure. The method is based on RCA which, as shown in the accompanying Examples, provides a number of advantages when compared to current methods. As explained in detail below, RCA-based methods are highly-specific, amenable to multiplexed reactions and high-throughput analysis / screening, and can be performed with standard equipment and inexpensive reagents.
[0326] Unlike the prior art techniques, and as explained below and in the accompanying Examples, the present invention avoids the multiple complex steps using expensive and specialist laboratory equipment. Notably, the present invention can be performed without the need for DNA sequencing; but where sequencing is desired, it is compatible with sequencing platforms, sequencing protocols and instruments.
[0327] By "payload sequence" we include a polynucleotide sequence for integration into a target polynucleotide (such as genomic DNA) and which is subsequently integrated into that polynucleotide during a genetic-editing procedure. The payload sequence comprises one or more functional genetic elements designed to achieve a specific biological effect, such as therapeutic gene expression and / or regulatory control. Examples of these functional genetic elements are discussed in detail below.
[0328] In some embodiments, the payload sequence comprises a DNA sequence. In some embodiments, the payload sequence comprises an RNA sequence. In some embodiments, the payload sequence comprises one or more non-coding RNA selected from the group comprising: siRNA, shRNA, microRNA, and antisense RNA. In some embodiments, the payload sequence comprises a heteropolymer comprising DNA and RNA sequence. In some embodiments, the payload sequence comprises a single stranded DNA sequence and / or double stranded DNA sequence.
[0329] For the avoidance of doubt, the payload sequence comprises functional genetic material intended to be inserted into the genome. As explained in detail below, the payload sequence preferably comprises any one or more from the following: gene expression cassette, coding sequence, regulatory element, and / or other functional domain relevant to therapeutic, diagnostic, and / or research applications.
[0330] The term "integrated" is disclosed elsewhere herein in the context of detecting and / or characterising incorrectly-edited polynucleotide sequences, where integration refers in particular to the covalent attachment of the Oligonucleotide-Tag at a double-strand break (DSB) site within the DNA. In that context, the integration of the Oligonucleotide-Tag was performed either in vivo or ex vivo, before detecting and / or characterising the unintended genetic alterations.
[0331] In the present aspect of the invention, which is directed to detecting and / or characterising one or more integrated payload sequence following a genetic-editing procedure, the term "integration" encompasses the biological process by which the payload sequence becomes covalently incorporated into the host polynucleotide sequence. Integration may be mediated by endogenous cellular mechanisms, including but not limited to homology-directed repair (HDR), non-homologous end joining (NHEJ), site-specific recombination, or vector-mediated insertion. All of those mechanisms are described in detail herein. The integration may result in single-copy or multi-copy events, and may occur in the cell nucleus (e.g. genomic DNA, gDNA etc.) or outside the cell nucleus (e.g. mitochondrial DNA), depending on the delivery system and / or cell type. The present invention is particularly advantageous in detecting one or more payload sequence integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
[0332] The integration of the one or more payload sequence into a polynucleotide sequence at an unknown site does not require ex vivo ligation. All references to "integration" of one or more payload sequence into a polynucleotide sequence relate to biological events occurring within cells, whereby the payload sequence is incorporated into the DNA as part of the genetic-editing procedure.
[0333] In some embodiments, integration may occur within euchromatin, such as transcriptionally active regions, thereby permitting expression of the payload sequence. In other embodiments, integration may occur within heterochromatin, including facultative heterochromatin or constitutive heterochromatin, wherein transcriptional activity may be limited or repressed. In some embodiments, integration may occur in intergenic regions, in intronic regions of endogenous genes, or within exonic regions. In some embodiments, integration occurs in proximity to regulatory elements, including promoters, enhancers, silencers, or insulator sequences, thereby altering expression of host or payload sequences. In some embodiments, integration may occur within or adjacent to repetitive elements, such as satellite DNA, minisatellites, microsatellites, short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs), or endogenous retroviral elements. In certain embodiments, the payload sequence may integrate at or near telomeric regions or centromeric regions of the genome.
[0334] The term "genetic-editing procedure" is disclosed elsewhere herein in the context of detecting and / or characterising incorrectly-edited polynucleotide sequences. Unless stated otherwise, the term should be construed to apply mutatis mutandis to the present aspect for detecting and / or characterising the presence of one or more payload sequences following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site.
[0335] Genome-editing tools disclosed elsewhere herein (such as ZFNs, TALENs, CRISPR / Cas9) can also be used to integrate a payload sequence into a polynucleotide sequence. A payload sequence can additionally become integrated into one or more polynucleotide sequences by viral vector-mediated integration (e.g. retroviruses, lentiviruses), episomal viral vectors (e.g. AAV), transposon-based systems (Sleeping Beauty, PiggyBac, Tol2), non-viral DNA vector technologies (e.g. "MegaBulb DNA" and "doggybone DNA" (dbDNA), or similar minicircle DNA constructs) and lipid nanoparticles (LNPs).
[0336] In some embodiments, the invention provides a payload sequence in the form of a MegaBulb DNA (also referred to herein as a minicircle DNA). MegaBulb DNA comprises a minimised circular DNA molecule in which non-essential bacterial backbone elements, including but not limited to replication origins and antibiotic resistance markers, are removed. The resulting construct contains only a therapeutic expression cassette and optional regulatory elements. By virtue of its reduced sequence complexity, MegaBulb DNA demonstrates improved transgene expression, decreased risk of unwanted recombination, and a reduced likelihood of integration into the host genome, thereby providing a safer alternative to conventional plasmid DNA vectors.
[0337] In other embodiments, the invention provides a payload sequence in the form of a Doggybone DNA (dbDNA). Doggybone DNA comprises a linear, double-stranded DNA molecule produced by an enzymatic, cell-free amplification process. The termini of the Doggybone DNA molecule are covalently closed hairpin loops, thereby stabilizing the construct and reducing susceptibility to nuclease degradation. dbDNA vectors are free of bacterial backbone sequences and are capable of episomal maintenance in target cells with a low risk of genomic integration. The cell-free production method further enables rapid, scalable, and contamination-free manufacturing suitable for applications in gene therapy, vaccine development, and engineered cell therapies such as CAR-T.
[0338] By the term "presence" we include the physical occurrence, integration, or existence of one or more payload sequence within a polynucleotide sequence in a sample obtained from a genetic-editing procedure. For example, the payload sequence may be integrated into a genome (e.g. genomic DNA (gDNA), mitochondrial DNA (mtDNA), or chloroplast DNA (cpDNA)), integrated in a single copy, or integrated in multiple copies, or partially integrated (e.g. when only a part of a payload sequence becomes integrated). Accordingly, the "presence" may include qualitative determination of the presence, absence, amount, and / or location of the payload sequence. In some embodiments, the payload sequence is integrated into a polynucleotide sequence at an unknown site in a single copy. Such a single-copy integration event can be verified through, e.g. Southern blotting.
[0339] The term "sample from a genetic-editing procedure" is disclosed elsewhere herein in the context of detecting and / or characterising incorrectly-edited polynucleotide sequences. Unless stated otherwise, the term should be construed to apply mutatis mutandis to the present aspect related to detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure.
[0340] By the term "unknown site" we include a location at any given position (nucleotide, or between two nucleotides) within a polynucleotide sequence — such as gDNA, mtDNA, or cpDNA — where a payload sequence has been integrated as a result of a genetic-editing procedure, and where the precise nucleotide position, locus, or genomic context of the integration is not predetermined, not targeted, or not known prior to detection. Accordingly, the term "unknown site" encompasses random integration events resulting from non-targeted delivery systems; "off-target" integration events occurring outside the intended editing locus; integration into regions of the genome that are not precharacterised or mapped; and situations where the payload sequence is present, but its genomic location is not identifiable without further analysis. As explained in detail below, the inventors have shown that the detection and / or characterisation of the presence of one or more payload sequence at an "unknown site" may be achieved using probes comprising degenerate nucleotide sequences, which allow hybridisation to variable or undefined regions adjacent to the payload sequence.
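As a purely illustrative sketch of the degenerate-probe concept described above, the following Python snippet expands a recognition pattern written with standard IUPAC ambiguity codes into a regular expression and locates it in sequences with different flanking contexts. The payload-end motif `GATTACA`, the four-base degenerate stretch, the example genome fragments, and the function names are hypothetical and chosen only for readability. The sketch also matches in sequence space (the pattern is written in the same sense as the strand being searched) and does not model hybridisation thermodynamics.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def degenerate_to_regex(pattern: str) -> str:
    """Translate a degenerate (IUPAC) sequence into a regex string."""
    return "".join(IUPAC[base] for base in pattern.upper())


def find_sites(pattern: str, target: str) -> list[int]:
    """Return 0-based start positions where `pattern` matches `target`.

    A lookahead is used so that overlapping matches are also reported.
    """
    regex = "(?=" + degenerate_to_regex(pattern) + ")"
    return [m.start() for m in re.finditer(regex, target.upper())]


if __name__ == "__main__":
    # Hypothetical pattern: a known payload end followed by four undefined
    # flanking bases (N = any base).
    probe_pattern = "GATTACA" + "NNNN"
    # The same payload end embedded in two different, "unknown" contexts.
    print(find_sites(probe_pattern, "CCCGATTACATTGCAA"))  # prints [3]
    print(find_sites(probe_pattern, "TTGATTACAGGCCTA"))   # prints [2]
    # A fragment without the payload end gives no match.
    print(find_sites(probe_pattern, "ACGTACGTACGTACGT"))  # prints []
```

The design point illustrated is that the fixed part of the pattern anchors recognition to the payload, while the degenerate part tolerates whatever host sequence happens to lie next to it.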
[0341] To contextualize "random" integration or integration at an "unknown site", a payload sequence may be integrated either at a predetermined, intended and / or known location (referred to herein as an "on-target site"), or at an unknown site (which would constitute an "off-target site"). Integration at an on-target site can typically be detected using conventional methods, such as PCR amplification with site-specific primers, or targeted sequencing approaches, since the expected integration locus is known in advance. In contrast, integration at an unknown site (off-target site), where the payload sequence is present but its genomic location is not known, requires alternative detection strategies. The method of the eighth aspect is particularly advantageous for detecting payload sequences integrated into a polynucleotide sequence at unknown or "off-target" sites, using probes comprising degenerate nucleotide sequences and Rolling Circle Amplification (RCA).
[0342] For the avoidance of doubt, in some embodiments, integration of a payload sequence at an unknown site may be a desired outcome of the genetic-editing procedure. For example, certain delivery systems or experimental designs may intentionally rely on non-targeted integration to achieve stable expression or broad genomic distribution of integrated payload. In such cases, the precise location of integration is still not known prior to detection. The present invention provides methods that are particularly suited to detecting such non-targeted integration events.
[0343] In some embodiments of the methods disclosed herein, one or more payload sequence has been integrated into a polynucleotide sequence in at least one unknown site, at least two unknown sites, at least three unknown sites, at least four unknown sites, at least five unknown sites, at least 10 unknown sites, at least 20 unknown sites, at least 50 unknown sites, or at least 100 unknown sites.
[0344] The term "polynucleotide sequence" is disclosed elsewhere herein in the context of detecting and / or characterising incorrectly-edited polynucleotide sequences. Unless stated otherwise, the term should be construed to apply mutatis mutandis to the present aspect for detecting and / or characterising the presence of one or more payload sequences following a genetic-editing procedure. In the context of detecting and / or characterising the presence of one or more payload sequence, the term "detecting" is used herein to include the step of determining the presence of the one or more payload sequence following a genetic-editing procedure in the target polynucleotide molecule. It will be understood that in the methods of the invention the target polynucleotide molecule is detected by detecting the RCA-Products of Step (ii), which serve as a "reporter" for the one or more integrated payload sequence. Accordingly, detecting the RCA-Products in Step (iii) may include determining, measuring, assessing, or assaying the presence or absence or amount or location of the RCA-Products in any way. The presence of RCA-Product in the sample (i.e. the confirmation of its presence or amount) is indicative or identificatory of the presence of the one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site. Thus, detection of the RCA-Products generated in step (ii) allows accurate determination of the presence of the one or more integrated payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site.
[0345] In the context of detecting and / or characterising the presence of one or more payload sequence, the term "characterising" is used herein to include the step of assessing one or more specific attributes, features, or properties of the one or more integrated payload sequence. Accordingly, by "characterising" we include one or more from the group comprising: determining the nucleotide sequence of the one or more integrated payload sequence, and the polynucleotide sequence which the payload was integrated into; determining the type and / or nature of the payload within the polynucleotide sequence; determining the structure (for example, secondary) of the one or more integrated payload sequence; determining the inter- and / or intramolecular interactions between and / or within the one or more integrated payload sequence; identifying and / or categorizing and / or mapping the one or more integrated payload sequence to known genes, transcripts, and / or isoforms; functional analysis (such as linking the one or more incorrectly-edited polynucleotide sequence to biological functions and pathways to understand its role in a certain process); and / or annotation or identification of functional motifs in the sequence. As explained in detail below, in some embodiments, sequencing of the RCA-Products from the integrated payload sequence in step (iii) is a preferred form of characterisation, as it enables precise identification of the integrated payload sequence and its integration context, including the sequence of unknown and / or off-target sites. In some embodiments, the invention provides a method wherein the one or more integrated payload sequence is enriched and / or captured in step (i), before step (ii). Enriching and / or capturing said one or more integrated payload sequence may help ensure that the one or more integrated payload sequence are adequately represented in the final data, increasing accuracy of the assay.
[0346] The term "enriched" or "enrichment" is defined elsewhere herein in the context of purifying or concentrating incorrectly-edited polynucleotide sequences from a sample. Unless stated otherwise, the term should be construed to apply mutatis mutandis to the present method for detecting and / or characterising the presence of one or more payload sequences following a genetic-editing procedure. Accordingly, references to "enrichment" of incorrectly-edited polynucleotide sequences should be understood to encompass analogous steps for increasing the relative abundance, purity, or detectability of integrated payload sequences in a sample. Such enrichment may be achieved through selective amplification (e.g. using primers or probes specific to the payload sequence), size selection, or removal of inhibitory components (such as unbound or unligated probes), thereby enhancing the sensitivity and specificity of downstream detection and / or characterisation.
[0347] The term "captured" or "capturing" is defined elsewhere herein in the context of isolating incorrectly-edited polynucleotide sequences from a sample. Unless stated otherwise, the term should be construed to apply mutatis mutandis to the present method for detecting and / or characterising the presence of one or more integrated payload sequences. Accordingly, references to "capturing" incorrectly-edited polynucleotide sequences should be understood to encompass analogous steps for isolating payload sequences from a sample. Such capturing may be achieved through methods including, but not limited to, hybrid capture, immunoprecipitation, or other probe-based or affinity-based techniques suitable for retrieving payload sequences or their amplification products.
[0348] In some embodiments, payload sequences integrated into a polynucleotide sequence at an unknown site are captured using probes that hybridize to a sequence which is common for all integrated payload sequences, which can be in turn isolated using magnetic beads or other methods. In some embodiments, payload sequences integrated into a polynucleotide sequence at an unknown site are captured in step (i) and before step (ii) using magnetic beads. In some embodiments, the magnetic beads may be coated with a one or more ligand (such as streptavidin) to bind incorrectly-edited polynucleotide sequences (such as avidin-conjugated probes able to hybridise to payload sequences).
[0349] In some embodiments, payload sequences integrated into a polynucleotide sequence at an unknown site are captured by immunoprecipitation, where antibodies are used to bind and isolate target polynucleotide sequences. For example, in chromatin immunoprecipitation (ChIP), antibodies specific to DNA-binding proteins are used to capture DNA-protein complexes.
[0350] Step (ii) of the method of the eighth aspect comprises generating circular single-stranded polynucleotide substrates from the one or more payload sequence, and performing Rolling Circle Amplification to generate RCA-Products from the one or more payload sequence in the sample.
[0351] The term "circular single-stranded polynucleotide substrates" is defined elsewhere herein in the context of ligating Probes that target an Oligonucleotide-Tag at an incorrectly-edited site. References to circular single-stranded polynucleotide substrates should be understood to encompass substrates generated from payload sequences using Probes described herein, followed by ligation. These substrates may comprise two or more segments complementary to the payload sequence, connected by a linker polynucleotide sequence. When the complementary segments hybridise to the target polynucleotide sequence, the probe can be ligated and become circularized.
[0352] Advantages of RCA are discussed elsewhere herein in the context of performing RCA to generate one or more RCA Products from the one or more incorrectly-edited polynucleotide sequence in the sample. Those advantages apply mutatis mutandis to the present aspect of detecting and / or characterising the presence of one or more integrated payload sequences.
[0353] In addition, the term "high fidelity amplification" is defined elsewhere herein in the context of accurate DNA replication. Unless stated otherwise, the term should be construed to apply mutatis mutandis to the present aspect for detecting and / or characterising the presence of one or more integrated payload sequence. The Inventors' method relies on RCA, which can accurately replicate DNA with minimal errors, making it particularly useful in detecting and / or characterising one or more integrated payload sequence. Step (iii) of the method of the invention requires detecting and / or characterising the presence of the one or more payload sequence based on the RCA-Products generated in step (ii).
[0354] In some embodiments, the one or more payload sequence comprises a polynucleotide sequence of more than 100 bases in length.
[0355] The payload sequence described herein is of greater length than short oligonucleotide elements such as linkers, adapters, Oligonucleotide-tags, or primer sequences, as the payload comprises functional genetic information beyond simple hybridization or ligation domains. Unlike Oligonucleotide-tags and adapters, which may be in the order of a few tens of nucleotides in length (e.g. the Oligonucleotide-Tag disclosed elsewhere herein is between 15 and 75 nucleotides in length), payload sequences typically extend into the kilobase range, thereby accommodating open reading frames, regulatory regions, or other functional components required to achieve an intended effect.
[0356] In some embodiments, the one or more payload sequence comprises a polynucleotide sequence of at least 100 nucleotides in length. In some embodiments, the one or more payload sequence is at least 200 nucleotides in length. In some embodiments, the one or more payload sequence is at least 300 nucleotides in length. In some embodiments, the one or more payload sequence is at least 400 nucleotides in length. In some embodiments, the one or more payload sequence is at least 500 nucleotides (0.5 kilobases (kb)) in length. In some embodiments, the one or more payload sequence is at least 1000 nucleotides (1 kb) in length. In some embodiments, the one or more payload sequence is at least 2000 nucleotides (2 kb) in length.
[0357] In some embodiments, the one or more payload sequence has a length of between about 0.5 kilobases (kb) and about 50 kb. In some embodiments, the one or more payload sequence has a length of between about 1 kb and about 20 kb. In some embodiments, the one or more payload sequence has a length of between about 1 kb and about 10 kb. In some embodiments, the one or more payload sequence has a length of between about 2 kb and about 8 kb.
[0358] In some embodiments, the one or more payload sequence comprises one or more functional gene expression element. By "functional gene expression element" we include any nucleotide sequence within the payload sequence that affects, or contributes to, gene expression and / or regulation. In a preferred embodiment, the one or more payload sequence integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure comprises one or more functional gene expression cassette, optionally wherein the functional gene expression cassette comprises one or more of the following: promoter, coding sequence, regulatory element, polyadenylation signal, enhancer, untranslated region, and vector design feature (such as selection markers, cloning sites, origins of replication).
[0359] In some embodiments, the one or more payload sequence integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure has a seamless vector design (i.e. a design that does not comprise a bacterial sequence). Advantageously, seamless vector design improves safety and reduces immunogenicity in therapeutic contexts.
[0360] In yet another embodiment, the one or more payload sequence integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure comprises one or more recombination site. In some embodiments, the one or more recombination site facilitates site-specific integration of a payload into the host genome, and / or can be used to process the one or more integrated payload sequence after it has been incorporated into a polynucleotide sequence.
[0361] In some embodiments, the one or more payload sequence integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure does not comprise any chemical modifications.
[0362] In some embodiments, the integration of the one or more payload sequence at an unknown site of the polynucleotide sequence is achieved by homology-directed repair, non-homologous end-joining, or transposition.
[0363] Payload sequences may be integrated into polynucleotide sequences through a variety of mechanisms and systems, each with distinct biological and technical characteristics.
[0364] In some embodiments, the one or more payload sequence is integrated into a polynucleotide sequence through lambda integrase systems. The lambda integrase system targets specific attH4X sites within human LINE-1 elements, offering approximately 900 potential integration sites genome-wide. In some embodiments, a payload sequence is integrated into a polynucleotide sequence by use of viral vectors, such as lentiviral vectors or Adeno-Associated Virus (AAV) vectors. Lentiviral vectors preferentially integrate into transcriptionally active regions of the genome, often resulting in stable transgene expression. In some embodiments, the payload is integrated into a polynucleotide sequence using CRISPR-directed systems. CRISPR-directed systems enable targeted integration at predetermined genomic loci using single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), or circular DNA templates.
[0365] By "transposition" we include a transposase-mediated polynucleotide sequence transfer, such as a payload sequence transfer. During transposition (e.g. payload transposition, where a payload sequence functions similarly to a transposon), a transposase recognises sequences (e.g. Terminal Inverted Repeats, TIRs) flanking the payload sequence. The transposase excises that payload sequence and integrates it into a new genomic target site. Integration via transposition typically produces target-site duplications at the insertion loci.
[0366] As discussed above, exemplary transposon-based systems suitable for payload integration include Sleeping Beauty, PiggyBac and Tol2.
[0367] Integration patterns may vary depending on the tissue type (for example, liver, muscle, hematopoietic cells, and neural tissue). Mechanistically, integration is influenced by the DNA repair pathways active in the target cell. In human somatic cells, non-homologous end joining (NHEJ) is the dominant repair pathway, often resulting in imprecise integration. In contrast, homology-directed repair (HDR) provides a more precise mechanism for payload integration, particularly when a homologous template is available.
[0368] HDR is a template-dependent DNA repair mechanism that enables precise integration of a payload sequence at one or more polynucleotide sequence. While HDR is typically associated with targeted integration at known loci using homologous flanking sequences, it may also result in integration at unexpected or unintended sites under specific conditions. For example, when donor templates contain homology arms that are partially complementary to multiple genomic regions, or when off-target double-stranded breaks (DSBs) are present, HDR may facilitate integration at non-predetermined loci, thereby producing payload integration at unknown sites. This phenomenon is particularly relevant in complex biological systems or therapeutic contexts where multiple DSBs are induced (e.g. multiplexed CRISPR editing), homology arms exhibit partial or degenerate homology to multiple genomic regions, cellular repair pathways are modulated or imprecise (e.g. in stem cells, embryos, or cancer cells), or where payload delivery vectors (e.g. AAV) are present in high copy number and may recombine with endogenous sequences. The method of the eighth aspect is particularly advantageous in detecting and / or characterising the presence of one or more payload sequence which, following a genetic-editing procedure, has been integrated into unknown sites.
[0369] In an embodiment, step (ii) comprises generating circular single-stranded polynucleotide substrates using a Probe that targets the one or more payload sequence at the unknown site.
[0370] In a preferred embodiment, the invention provides a method wherein circular single-stranded polynucleotide substrates are generated using one or more oligonucleotide Probe which specifically targets the one or more payload sequence integrated into an unknown site in the polynucleotide sequence. This may be achieved by using one or more Probe that targets one or more payload sequence integrated into an unknown site at a polynucleotide sequence, and that becomes circular only once it hybridises to the respective target; or by circularising the polynucleotide sequence, into which the one or more payload sequence was integrated, using selector probes (explained elsewhere herein). Accordingly, the circularisation occurs only upon successful hybridisation of the Probe to the integrated payload sequence, or alternatively, by circularising the target polynucleotide sequence comprising the integrated payload sequence, or fragment thereof, using selector probes. Both mechanisms are explained in detail herein.
[0371] Importantly, in contrast to the method for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure (described elsewhere herein), which relies on the integration of an Oligonucleotide-Tag at a double-strand break to enable detection of incorrectly-edited polynucleotide sequences, the inventors have shown that one or more integrated payload sequence can be detected and characterised directly, i.e. without the need for the integration of an Oligonucleotide-Tag or engineered DSBs. The ability to detect payload sequences directly, without relying on engineered DSBs or pre-integrated tags, represents a substantial advancement in the field of genome engineering and payload validation. In an embodiment, the invention provides a method wherein the Probe comprises a nucleotide sequence capable of binding to the one or more payload sequence.
[0372] Binding of the Probe to the target polynucleotide molecule is described elsewhere herein in the context of hybridising to an Oligonucleotide-Tag. References herein to probe hybridisation and ligation should be understood to encompass analogous steps for probes hybridising to the integrated payload sequence.
[0373] Preferably, the Probe comprises a nucleotide sequence that is complementary to the integrated payload sequence. One or more part of the Probe may hybridise to the payload sequence in a target-dependent manner, forming a first ligation end. In a preferred embodiment, another part of the same Probe molecule, or a separate Probe (e.g. in split-like configurations), may hybridise to a region adjacent to the payload integration site, forming the second ligation end.
[0374] Upon hybridisation, the ligatable ends (i.e. the first and second ligation end) are brought into juxtaposition and ligated to form a circular single-stranded polynucleotide substrate. This substrate is then amplified via Rolling Circle Amplification (RCA), and the resulting RCA-Product detected and / or characterised to confirm the presence, structure, and integration context of the payload sequence.
[0375] In an embodiment, the Probe comprises a degenerate nucleotide sequence capable of binding at, or adjacent to, nucleotide sequence at the one or more payload sequence.
[0376] The use of degenerate nucleotide sequence in the Probe is explained elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences. Those considerations apply mutatis mutandis to the present method for detecting and / or characterising one or more integrated payload sequences following a genetic-editing procedure. In the context of payload sequence detection, degenerate probes advantageously provide means of identifying integration sites / events without prior knowledge of the exact integration site. Preferably, the Probe comprises a first part that is complementary to a known sequence / region within the payload sequence (e.g. a first ligation end), and a second part (e.g. a second ligation end) comprising one or more degenerate nucleotides capable of hybridising at, or adjacent to the unknown site in the target polynucleotide sequence. In a preferred embodiment, the ligation site is located at the junction spanning the integrated payload sequence and the adjacent polynucleotide sequence upstream or downstream from the integrated payload sequence. Preferably, the 3' ligatable end comprises one or more degenerate nucleotide.
[0377] By "degenerate nucleotide" we include any nucleotide position in a polynucleotide sequence (e.g. within the Probe or oligonucleotide sequence) that is intentionally designed to represent multiple possible base identities. For example, a single oligonucleotide design with one degenerate position would comprise a set of oligonucleotide variants, where each variant has a different base permitted at that position. More specifically, if a Probe contains one degenerate base represented by "N" (which can be A, T, C, or G), then that single design represents four distinct oligonucleotide sequences, each with A, T, C, or G at that position. Degenerate nucleotides are used to introduce sequence variability and enable hybridisation to target sequences with unknown or variable nucleotide composition. This is particularly advantageous when detecting payload sequences integrated at unknown sites, where the exact flanking sequence is not known. Examples of degenerate bases include, but are not limited to N: A, T, C, or G; R: A or G; Y: C or T / U; S: G or C; W: A or T / U; K: G or T / U; M: A or C; H: A, C, or T / U; B: G, C, or T / U; V: A, C, or G; D: A, G, or T / U.
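The relationship between a single degenerate Probe design and the set of oligonucleotide variants it represents can be illustrated with a short sketch. The code below is illustrative only and is not part of the disclosed method: it assumes a DNA alphabet (T standing in for T / U), uses the degenerate base codes listed above, and the function names `expand` and `library_size` are hypothetical.

```python
from itertools import product

# Degenerate base codes as listed in the text (DNA alphabet; T stands in for T / U).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def library_size(design):
    """Number of distinct oligonucleotides represented by one degenerate design."""
    size = 1
    for base in design.upper():
        size *= len(IUPAC[base])
    return size

def expand(design):
    """Enumerate every concrete sequence represented by a degenerate design."""
    pools = [IUPAC[base] for base in design.upper()]
    return ["".join(combo) for combo in product(*pools)]

# One "N" position yields four variants, as described in the text.
print(expand("ACGTN"))          # ['ACGTA', 'ACGTC', 'ACGTG', 'ACGTT']

# Fully degenerate arms grow quickly: 15 "N" positions give 4**15 variants,
# so the library size is computed rather than enumerated.
print(library_size("N" * 15))   # 1073741824
```

Restricting the degenerate positions to a narrower code such as S (G or C) reduces the library to 2 variants per position, which is one practical reason a design might prefer restricted codes over N.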
[0378] As discussed above, one or more degenerate nucleotide may be incorporated at the 5' and / or 3' ends of a Probe, or within internal regions, and may be used in combination to generate libraries of probes capable of binding to a diverse range of target sequences.
[0379] Figure 9 illustrates various configurations / designs of the Probe comprising degenerate nucleotide sequences, and how the Probe can bind a payload sequence integrated at an unknown ("off-target") or the expected "on-target" site. In the examples in Figure 9, padlock probes are illustrated.
[0380] The specific drawings in Figure 9 visualise gap-fill padlock probes: upon hybridisation, the two arms of the probe are separated by a gap over the target junction or payload region. In such configurations, the gap must be filled prior to ligation — either by polymerase-mediated extension using the target as a template, or by introducing one or more gap-fill oligonucleotides that occupy the intervening bases — so that the 3' and 5' ligatable ends are brought into direct juxtaposition for circularisation as described elsewhere herein. For completeness, padlock probes designed without a gap (i.e. perfectly-matching arms that bind in direct juxtaposition at the ligation site) can also be used in this context. When the integration site is known, a junction-specific padlock with contiguous arms can be ligated without gap-fill. Conversely, when the flanking genomic sequence is unknown (Figure 9D), a padlock comprising degenerate bases in one arm can hybridise adjacent to the payload, and if hybridisation leaves a discontinuity, a gap-fill step is performed before ligation and RCA.
[0381] Accordingly, the gap may be introduced by Probe design, for example where one arm targets a known payload sequence and is intentionally separated from the other arm to flank the junction (as illustrated in Figures 9A-C); alternatively, a gap may arise when a degenerate arm binds an unknown genomic sequence adjacent to the payload. Because the degenerate arm encompasses any possible sequence, probes are expected to form gaps of varied size across different integration contexts; suitable chemistries for closing such gaps include polymerase extension and / or hybridisation of a library of degenerate gap-fill oligonucleotides, as detailed elsewhere herein.
[0382] In some embodiments, the Probe comprises one or more arms, or parts thereof, having a melting temperature suitable for specific hybridisation to the target polynucleotide sequence. In a preferred embodiment, the melting temperature of the one or more Probe arm, or part thereof, is between about 30°C to about 60°C, more preferably between about 35°C to about 55°C, and most preferably between about 40°C to about 50°C.
[0383] The melting temperature may be influenced by factors including probe length, GC content, and the presence of degenerate bases, and may be adjusted accordingly to accommodate different payload sequence contexts or integration environments. In an embodiment, the Probe comprises between 15 to 20 degenerate nucleotides, and / or wherein the degenerate nucleotides are selected from the group consisting of S, C, or G. In an embodiment, the Probe comprises a GC content sufficient to produce a melting temperature of greater than 70°C, preferably greater than 80°C.
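The dependence of melting temperature on probe length, GC content, and degenerate bases can be illustrated with a rough estimate. The sketch below is an assumption-laden approximation, not the method used by the Inventors: it uses two widely quoted rule-of-thumb formulas (the Wallace rule for oligonucleotides shorter than 14 nucleotides, and a basic GC-content formula for longer ones) that ignore salt, probe concentration, and nearest-neighbour effects. For a degenerate arm it reports the lowest and highest estimate across the variants the design represents. The function names are hypothetical.

```python
# Degenerate base codes (DNA alphabet; T stands in for T / U).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def basic_tm(gc_count, length):
    """Rule-of-thumb melting temperature in degrees Celsius (approximation only)."""
    if length < 14:
        # Wallace rule: 4 degrees per G/C, 2 degrees per A/T.
        return 4.0 * gc_count + 2.0 * (length - gc_count)
    # Basic GC-content formula for longer oligonucleotides.
    return 64.9 + 41.0 * (gc_count - 16.4) / length

def tm_range(arm):
    """Lowest and highest estimated Tm across all variants of a degenerate arm."""
    arm = arm.upper()
    # Positions that are G/C in every variant.
    min_gc = sum(1 for b in arm if set(IUPAC[b]) <= {"G", "C"})
    # Positions that are G/C in at least one variant.
    max_gc = sum(1 for b in arm if set(IUPAC[b]) & {"G", "C"})
    return basic_tm(min_gc, len(arm)), basic_tm(max_gc, len(arm))

for arm in ("N" * 20, "S" * 20):
    low, high = tm_range(arm)
    print(f"{arm}: {low:.1f} to {high:.1f} degrees C")
```

Under these approximations a 20-nucleotide arm of N positions spans a wide range of estimated melting temperatures, whereas a 20-nucleotide arm restricted to S positions is estimated above 70°C for every variant. A practical design would normally be checked with a nearest-neighbour thermodynamic model under the actual buffer conditions.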
[0384] In an embodiment, the Probe is selected from the group comprising: a padlock probe; a selector probe; a molecular inversion probe; a gap-fill probe; a split-like probe; a Lotus probe; or a combination thereof. Padlock probes, selector probes, molecular inversion probes, gap-fill probes, split-like probes, Lotus probes, Trilock probes, and associated probe designs, aspects and considerations are discussed elsewhere herein (in the context of detecting incorrectly-edited polynucleotide sequences) and apply mutatis mutandis to the present method for detecting and / or characterising one or more integrated payload sequence. Accordingly, probes with ligatable ends described herein, probes designed to hybridise to non-contiguous target regions, probes requiring gap-fill oligonucleotides or polymerase-mediated extension, and probes incorporating degenerate oligonucleotide libraries may equally be used for payload sequence detection.
[0385] Further, probe variants such as molecular inversion probes (which flank a gap over the target region), split-like probes (which rely on connector oligonucleotides to bring probe parts into proximity), Lotus probes (which are partially circular and complete circularisation upon hybridisation), and Trilock probes (which use a backbone and linker system to assemble multiple probe components) may be adapted to bind payload sequences and their adjacent genomic regions of unknown sequence. Connector oligonucleotides used in split-like and Trilock probes may be synthetic or natural, and may comprise DNA or RNA.
[0386] Accordingly, for the present invention, the term "Probe" is used broadly to encompass various probe formats, including padlock probes, gap-fill probes, molecular inversion probes, split-like probes, Lotus probes, Trilock probes, and selector probes.
[0387] A Probe may also comprise one or more additional sequence from the list comprising: sample tag, primer binding site or sequence (e.g. amplification primer binding site or sequence, which can either be the same universal sequence, or one or more different sequence), and Unique Molecular Identifiers (UMIs), which facilitate multiplexed detection, sample pooling, and single-molecule quantification. Probes may further be adapted to be compatible with next generation sequencing readout ("sequencing probes"). In some embodiments, sequencing probes comprise adapter sequences, ligation sites, and flanking primer regions compatible with next-generation sequencing (NGS) workflows. These features are equally compatible with payload-derived RCA-Products, and may be used to count and / or distinguish payloads, track integration events, and / or enable high-throughput analysis.
[0388] Multiplex assays may include multiple probes targeting different payloads or payload variants, and may be configured to detect tens, hundreds, or thousands of distinct payload integration events in a single reaction. The probe designs and tagging strategies described herein are therefore broadly applicable to the detection and characterisation of payload sequences integrated at unknown sites. For example, multiplex assays might use 3, 4, 5, 10, 20, 30, 40, or 50 probes, or even 100, 200, 500, 1000, 10000, or more Probes.
[0389] In an embodiment, the Probe circularises on recognition of its target polynucleotide sequence, and / or the target polynucleotide sequence circularises on recognition of the Probe.
[0390] The term "target polynucleotide molecule" is defined elsewhere herein to mean any polynucleotide sequence, such as a target DNA or RNA sequence, that the Probe is designed to recognise (target) and hybridise to. That term, and the associated considerations and aspects, apply mutatis mutandis to the method for detecting and / or characterising the presence of one or more integrated payload sequence. In the context of the present invention, the target polynucleotide sequence may comprise one or more payload integration site. The degenerate portion of the Probe is particularly advantageous in binding to an unknown site within the target polynucleotide molecule. The method permits detection of integration events in multiple target sequences, including both RNA and DNA, in the same sample.
[0391] It will be appreciated that two circularisation mechanisms may take place:
[0392] - the Probe becomes circularised upon hybridising to the target sequence. For example, padlock probes and related variants typically circularise upon hybridisation and ligation to the target sequence. For gap-fill probes and / or molecular inversion probes, circularisation happens after a polymerase or a gap-fill oligonucleotide fills the gap between the ends of the probe and a ligase then joins the ends.
[0393] - the target polynucleotide sequence becomes circularised upon hybridising to the Probe. When selector probes are used, the target polynucleotide sequence circularises upon binding to the selector probe. In other words, selector probes hybridise to the ends of a target fragment and, upon ligation, cause the target to form a circular single-stranded polynucleotide substrate. In an embodiment, circularisation of the Probe is mediated and / or improved by one or more Joining probe.
[0394] The term "Joining probe" is defined elsewhere herein.
[0395] As explained elsewhere herein, selector probes hybridise to the ends of a target nucleic acid fragment and, upon ligation, cause the target fragment itself to circularise, thereby forming a circular single-stranded polynucleotide substrate. Joining probes, as disclosed elsewhere herein, are designed to hybridise adjacent to one another and, upon ligation, form a single continuous probe strand. Accordingly, the primary role of selector probes is to mediate target circularisation, rather than probe concatenation.
[0396] In an embodiment, circular single-stranded polynucleotide substrates are formed by ligation of the Probe and / or the target polynucleotide sequence; optionally wherein ligation is performed by a ligase, preferably a ligase with specific intramolecular ligation activity.
[0397] Ligation chemistry, conditions (e.g. ligation temperature and incubation durations), typical and preferred enzymes for achieving ligation, cofactors, are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the present method for detecting and / or characterising one or more integrated payload sequence.
[0398] Preferably, the ligase is a thermostable DNA ligase, which catalyses the formation of a phosphodiester bond between two adjacent deoxyribonucleotides hybridized to a target nucleic acid molecule.
[0399] In an embodiment, Rolling Circle Amplification of the circular single-stranded polynucleotide substrates is initiated by the target polynucleotide sequence, Probe, and / or by one or more amplification primer that is complementary to the circular single-stranded polynucleotide substrate.
[0400] In a further embodiment, Rolling Circle Amplification of the circular single-stranded polynucleotide substrates is initiated by more than one amplification primer.
[0401] In an embodiment, the Probe (and consequently any circular single-stranded polynucleotide substrate produced by ligating said Probe) comprises more than one amplification primer-binding site or sequence. As explained elsewhere herein, the Probe may comprise one or more universal (i.e. identical) amplification primer-binding site or sequence, or the Probe may comprise one or more different amplification primer-binding site or sequence. In an embodiment where the one or more amplification primer-binding sites share the same sequence, a universal primer binding said sequence may be used, and multiple copies of the universal primer may bind the Probe and each may initiate RCA. In another embodiment, the one or more amplification primer-binding sites have one or more different sequence; in that embodiment, one or more different primers may be used to bind the Probe and initiate RCA.
[0402] In an embodiment, two or more amplification primers are used. In an embodiment, three or more amplification primers are used. In an embodiment, four or more amplification primers are used. In an embodiment, five or more amplification primers are used. In an embodiment, ten or more amplification primers are used.
[0403] In an embodiment, the Rolling Circle Amplification is branched and / or hyperbranched Rolling Circle Amplification. It will be appreciated that the copy number, sequences, and positioning of amplification primer-binding sites may be varied, if necessary, to affect the amplification process.
[0404] In an embodiment, "branched" Rolling Circle Amplification (RCA) is the amplification in which one or more secondary primers anneal to the growing RCA product (e.g. within repeated sequence units of the concatemer) and initiate additional extension events, thereby generating branch structures from the original product.
[0405] In an embodiment, "hyperbranched" RCA refers to amplification in which such branching is iterative or occurs at multiple levels, so that secondary products themselves serve as templates for further priming, producing a highly branched network of products and increased amplification kinetics relative to linear RCA.
[0406] In an embodiment, branched and / or hyperbranched RCA can be achieved by providing multiple amplification primer-binding sites on the Probe and / or by selecting primers that anneal to sequences within the RCA product. As explained above, the copy number, sequences, and positioning of amplification primer-binding sites may be varied, if necessary, to affect the amplification process. Convenient methods known in the art, such as PCR or a variant thereof, SDA, HDA, LAMP or SMAP, are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the present method for detecting and / or characterising one or more integrated payload sequence.
[0407] The considerations regarding Rolling Circle Amplification (RCA) and the use of circular single-stranded polynucleotide substrates as substrates for RCA are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the present method for detecting and / or characterising one or more integrated payload sequence.
[0408] In an embodiment, the polymerase that performs RCA requires a 3'OH end to initiate RCA. Several scenarios are contemplated for RCA initiation in the context of payload sequence detection.
[0409] In one embodiment, the target nucleic acid molecule serves as the primer for RCA. This occurs when e.g. padlock probes or variants thereof are used, as these probes circularise upon hybridisation to the target, and the target provides a free 3' hydroxyl (3'OH) group to initiate amplification. If the 3'OH is not in close proximity to the ligated padlock probe, the polymerase digests the 3' end of the target until it reaches the circularised Probe and can then initiate RCA.
[0410] In another embodiment, one or more separate amplification primer is employed, which is at least partially complementary to the circular single-stranded polynucleotide substrate and provides the necessary 3'OH for polymerase extension; all considerations regarding amplification primers disclosed elsewhere herein are applicable to the method for detecting and / or characterising one or more integrated payload sequence.
[0411] In yet another embodiment, the selector probe provides the 3'OH end from which RCA is initiated. In this scenario, the target payload sequence becomes circularised upon binding the selector probe, and RCA is initiated from the bound selector probe.
[0412] In all of the above cases, amplification is carried out using a DNA polymerase enzyme capable of synthesising DNA from dNTPs; suitable polymerases include, but are not limited to, Phi 29 DNA Polymerase, Vent DNA polymerase, T7 DNA Polymerase, and Bst DNA polymerase. In an embodiment, step (iii) comprises quantifying the one or more RCA-Product and / or determining some or all of the nucleotide sequence of the one or more RCA-Product.
[0413] In an embodiment, the invention provides a method wherein step (iii) comprises quantifying the RCA-Products generated in step (ii), in order to determine the amount of the one or more integrated payload sequence following a genetic-editing procedure.
[0414] The principles of quantifying RCA-Products are disclosed elsewhere herein in the context of incorrectly-edited polynucleotide sequences. Unless stated otherwise, those principles apply mutatis mutandis to detecting and / or characterising one or more payload sequences integrated into a polynucleotide sequence at an unknown site. Accordingly, references to quantifying RCA-Products, whether by absolute or relative means, and detecting RCA-Products using direct or indirect methods (e.g. via detectable moieties or signal-producing systems), are applicable to payload-derived RCA-Products.
[0415] In particular, integrated payload detection may involve quantifying RCA-Products to assess integration efficiency, and / or determining nucleotide sequence to confirm payload identity and its integration location. Where appropriate, known concentration controls, standard curves, and comparative analysis between multiple payload sequences may be employed. By "quantifying one or more RCA-Product" we include quantifying in a digital manner. References to "quantifying one or more RCA-Product in a digital manner" encompass the precise detection and enumeration of individual RCA-Products derived from payload sequences, enabling counting of integrated payload sequences at single-molecule resolution. Each detection event corresponds to one original payload sequence, thereby allowing high-precision analysis of payload integration events. In particular, single-molecule detection schemes discussed herein may be employed to differentiate and quantify payload-derived RCA-Products. It will be appreciated that RCA-Products derived from payload sequences can be counted, for example, by microscopy or by sequencing.
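By way of illustration only, the digital counting and standard-curve quantification referred to above might be sketched as follows. This is a minimal sketch and not the applicant's analysis software: the function names, the event labels, and the control amounts and counts are all hypothetical, and a simple least-squares line is assumed as the form of the standard curve.

```python
from collections import Counter


def count_rca_products(events):
    """Count detection events per label; each event is treated as one RCA-Product."""
    return Counter(events)


def fit_standard_curve(known_amounts, observed_counts):
    """Fit counts = slope * amount + intercept by ordinary least squares."""
    n = len(known_amounts)
    mean_x = sum(known_amounts) / n
    mean_y = sum(observed_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in known_amounts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(known_amounts, observed_counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_amount(count, slope, intercept):
    """Invert the fitted line to estimate the input amount for a given count."""
    return (count - intercept) / slope


# Hypothetical detection events (one label per detected RCA-Product).
events = ["payload", "payload", "reference", "payload", "reference"]
counts = count_rca_products(events)

# Hypothetical known-concentration controls and the counts observed for them.
slope, intercept = fit_standard_curve([0, 100, 200, 400], [5, 105, 205, 405])
estimated = estimate_amount(305, slope, intercept)
```

In this sketch the intercept plays the role of a background count observed in the absence of input, so that the estimate for an unknown sample is corrected for background before scaling.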
[0416] In an embodiment, the invention provides a method wherein the number of RCA-Products in the sample is greater than 1 RCA-Product, such as greater than 10, 100, 200, 400, 500, 1000, 2000, 5000, 10,000, 100,000, or greater than 500,000 RCA-Products. In an embodiment, the invention provides a method wherein instead of, or in addition to, quantifying one or more RCA-Product, some or all of the nucleotide sequence of one or more RCA-Product is determined.
[0417] The term "determining the nucleotide sequence" is defined elsewhere herein in the context of discerning the nucleotide sequence of one or more incorrectly-edited polynucleotide in a sample. Unless stated otherwise, the term applies mutatis mutandis to detecting and / or characterising the presence of one or more integrated payload sequence following a genetic-editing procedure. Accordingly, references to "determining the nucleotide sequence" encompass the identification of the precise composition and / or arrangement of nucleotide bases within payload-derived RCA-Products, including the detection of motifs, functional elements, integration junctions, integration sites, or sequence variants relevant to payload characterisation.
[0418] In an embodiment, sequencing may be performed to determine the nucleotide sequence of the payload-derived RCA-Products. In a preferred embodiment, the nucleotide sequence of one or more RCA-Product is determined using next-generation sequencing (NGS) techniques, such as sequencing-by-ligation (SBL), sequencing-by-synthesis, or strand sequencing. In some embodiments, payload-derived RCA-Products may be immobilised on a surface, and the nucleotide sequence of one or more RCA-Product is determined using sequencing-by-ligation (SBL).
[0419] In some embodiments of the method disclosed herein, the sequence of at least 2 nucleotides, at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, at least 55 nucleotides, at least 60 nucleotides, at least 65 nucleotides, at least 70 nucleotides, at least 75 nucleotides, at least 80 nucleotides, at least 85 nucleotides, at least 90 nucleotides, at least 95 nucleotides or at least 100 nucleotides, or more than 100 nucleotides of one or more RCA-Product is determined.
[0420] In some embodiments of the method of the invention, the nucleotide sequence determined comprises 1 nucleotide, 1 to 5 nucleotides, 1 to 10 nucleotides, 1 to 20 nucleotides, 1 to 50 nucleotides, 1 to 100 nucleotides, 1 to 1000 nucleotides or more than 1000 nucleotides of one or more RCA-Product. The approach of fragmenting RCA-Products into monomeric units for downstream sequencing is disclosed elsewhere herein. Unless stated otherwise, the approach applies mutatis mutandis to the present invention. Accordingly, references to shearing, cutting, clipping, trimming, or monomerising RCA-Products should be understood to encompass analogous steps for processing payload-derived RCA-Products. In particular, long tandem repeats generated by Rolling Circle Amplification may be enzymatically cleaved (e.g. using restriction endonucleases) to produce monomeric fragments suitable for high-throughput sequencing. These fragments may be prepared for sequencing via standard workflows, including end-repair, A-tailing, and ligation of sequencing adapters. This strategy advantageously enables efficient library preparation and facilitates accurate characterisation of payload integration events, without requiring extensive purification.
[0421] In an embodiment, step (iii) comprises determining the sequence of the one or more RCA-Products by DNA sequencing.
[0422] Sequencing methods, sequencing platforms (such as those used in Examples 2 and 3 below), the steps of monomerising RCA-Products, and / or optional adaptors, are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the method for detecting and / or characterising one or more integrated payload sequence following a genetic-editing procedure.
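The monomerisation of a tandem-repeat RCA-Product described above can be illustrated in silico. The following is a minimal sketch under stated assumptions: the repeat unit is a hypothetical sequence, the function name is illustrative, and the sequence is handled as plain text, so strand state and enzyme behaviour are not modelled. The recognition site AGCT with a blunt cut after the second base corresponds to AluI.

```python
def monomerise(concatemer, site="AGCT", cut_offset=2):
    """Cut a linear sequence at each occurrence of `site`.

    The cut is placed `cut_offset` bases into the site (AG^CT for AluI).
    Returns the list of non-empty fragments in order.
    """
    fragments = []
    start = 0
    pos = concatemer.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(concatemer[start:cut])
        start = cut
        pos = concatemer.find(site, pos + 1)
    fragments.append(concatemer[start:])
    return [fragment for fragment in fragments if fragment]


# Hypothetical 15-nt repeat unit containing a single AGCT site.
unit = "TTGACCAGCTGGAAC"
concatemer = unit * 4  # stands in for a short RCA-Product
fragments = monomerise(concatemer)
internal_monomers = fragments[1:-1]  # full-length monomers between cut sites
```

The first and last fragments are partial because the concatemer does not begin or end at a cut site; the internal fragments are full-length monomers of identical sequence, which is the property exploited for library preparation.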
[0423] In an embodiment, the one or more RCA-Products generated in step (ii) are labelled with a detectable moiety; optionally, wherein the detectable moiety is selected from the group comprising: a fluorophore; a chromophore; or a combination thereof.
[0424] Mechanisms for labelling RCA-Products (i.e. direct and indirect), Probe engineering to enable labelling, various detectable moieties and combinatorial labelling, are all discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the present method for detecting and / or characterising one or more payload sequences following a genetic-editing procedure, wherein the payload sequence has been integrated into a polynucleotide sequence at an unknown site.
[0425] In an embodiment, step (iii) comprises detecting the one or more RCA-Products by microscopy; optionally wherein microscopy is selected from the group comprising: bright-field microscopy; fluorescence microscopy; or a combination thereof. In an embodiment, the invention provides a method wherein the one or more RCA-Products are flowed through a microfluidic channel so as to concentrate the one or more RCA-Products into a specified area, for example into a single view area of a microscope.
[0426] Considerations regarding liquid flow, concentrating, enrichment and immobilization of the RCA-Products (as well as considerations regarding the solid support), and quantification of the RCA-Products by imaging immobilized RCA-Products with a fluorescence microscope, are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the present method for detecting and / or characterising one or more payload sequences following a genetic-editing procedure, wherein the payload sequence has been integrated into a polynucleotide sequence at an unknown site.
[0427] In an embodiment, the invention provides a method wherein step (iii) comprises immobilizing the one or more RCA-Products on a surface; for example, by electrostatic interaction, covalent interaction and / or steric interaction with said surface.
[0428] In an embodiment, the method comprises the step of attaching the one or more RCA-Products to magnetic beads to provide bead-bound RCA-Products; optionally wherein the one or more RCA-Products are immobilized by providing a magnetic source so as to attract the bead-bound RCA-Products to a position on the surface.
[0429] Considerations regarding "capturing" RCA-Products, particularly wherein the RCA-Products are captured using magnetic beads, and the related aspects, are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to detecting and / or characterising one or more integrated payload sequence.
[0430] In an embodiment, the invention provides a method wherein the surface is a glass surface, optionally a glass surface which is modified to interact with one or more RCA-Product. In an embodiment, the glass surface may be modified with positively charged homopolymers, for example poly-L-lysine, poly-D-lysine or aminosilane.
[0431] In an embodiment, the invention provides a method wherein the surface is a porous membrane, such as a filter membrane, for example a porous hydrophilic membrane; optionally wherein the one or more RCA-Products are immobilized by filtering through the porous membrane.
[0432] Considerations regarding various surfaces (i.e. the glass surface and / or membranes), introducing affinity molecules onto RCA-Products to provide means for capturing RCA-Products on the surface, filtering RCA-Products through membranes, and the related aspects, are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the present method for detecting and / or characterising one or more payload sequences following a genetic-editing procedure, wherein the payload sequence has been integrated into a polynucleotide sequence at an unknown site.
[0433] In an embodiment, the area of the porous membrane corresponds to a single field of view of an optical sensing device, such as a microscope or fluorescence microscope.
[0434] Considerations regarding the porous membrane (including membrane characteristics, e.g. thickness, shape, size) are discussed elsewhere herein in the context of detecting incorrectly-edited polynucleotide sequences, and apply mutatis mutandis to the present method for detecting and / or characterising one or more payload sequences following a genetic-editing procedure, wherein the payload sequence has been integrated into a polynucleotide sequence at an unknown site.
[0435] In a ninth aspect, the invention provides the use of Rolling Circle Amplification for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure; wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
[0436] In a tenth aspect, the invention provides a kit of parts comprising:
[0437] (i) one or more payload sequence as defined herein;
[0438] (ii) a Probe, as defined herein; and
[0439] (iii) one or more reagent for performing Rolling Circle Amplification.
[0440] In some embodiments of the kit disclosed herein, the kit further comprises one or more of the following:
- one or more reagent for performing genetic-editing of a polynucleotide sequence;
- one or more reagent for capturing an RCA-Product;
[0441] - one or more monomerization reagent;
- one or more Probe controls and / or spike-in templates;
[0442] - one or more library preparation reagent; and / or
- instructions for performing the method as disclosed herein.
[0443] In some embodiments, the kit further comprises one or more of the following: lysis buffer; neutralisation reagent; fragmentation reagent; one or more fragmentation enzyme (such as AluI, DdeI, MseI); hybridisation and ligation (Probing) buffer; one or more ligation enzyme (such as Ampligase / Tth Ligase / Taq DNA ligase); RCA buffer; one or more amplification enzymes (such as Phi29 polymerase or Exonuclease I); hybridisation buffer; biotinylated oligonucleotides; streptavidin-coated magnetic beads; magnetic rack; monomerisation buffer; one or more monomerization enzyme (such as AluI, SapI); 2nd strand synthesis buffer with 2nd strand synthesis primer; and one or more 2nd strand synthesis enzyme (such as T4 DNA polymerase / Klenow fragment polymerase).
[0444] In an eleventh aspect, the invention provides a population of DNA molecules obtained or obtainable by a method according to the eighth aspect of the invention, or by the use according to the ninth aspect of the invention.
[0445] Preferably, the population of DNA molecules comprises one or more RCA-Products comprising nucleotide sequence corresponding to the one or more payload sequence from a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
[0446] In some embodiments, the population of DNA molecules obtained or obtainable by a method for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, are derived from homozygous clones, heterozygous clones, mosaic organisms, or multi-copy payload insertions.
[0447] In some embodiments, the method comprises the step of preparing the sample prior to step (ii) of the method of the invention. In some embodiments, the preparation steps include any one from the list comprising: DNA extraction from the sample; DNA purification from the sample extract; denaturation of the DNA; and / or whole genome amplification. In a twelfth aspect, the invention provides a method, use, kit of parts or population of DNA molecules substantially as described herein, with reference to the accompanying description, examples and / or figures.
[0448] Embodiments of the invention will now be described, by way of example only, with reference to the accompanying figures, in which:
[0449] Figure 1 shows an exemplary assay workflow for detecting and characterising incorrectly-edited polynucleotide sequences and Double Strand Breaks (DSB) using Probes comprising degenerate nucleotide sequences.
[0450] (A) Cells: RNA-guided nuclease (RGN)-induced DSBs in the genomes of living cells are tagged, for example, by integration of a double-stranded Oligonucleotide-Tag by means of non-homologous end joining (NHEJ). gDNA extraction: Genomic DNA (gDNA) is extracted from these cells. Cell lysis and gDNA purification can be performed by column or magnetic bead-based methods as well as traditional ethanol-based precipitation methods. A crude lysate can also be used for these purposes. Extracted DNA will have RGN-induced DSBs tagged with the Oligonucleotide-Tag sequence. gDNA fragmentation: gDNA is fragmented to an average size of 500 to 1000 bp to facilitate the downstream RCA process. This can be achieved by enzymatic fragmentation with restriction endonucleases or by mechanical shearing such as sonication. Denaturation, Probe hybridization & ligation: gDNA is denatured by heating. Padlock probes with the 5' arm complementary to the Oligonucleotide-Tag and the 3' arm with degenerate nucleotides will hybridize only to the tagged DSBs in the fragmented gDNA. A thermostable ligase will join the ends of the padlock probes.
[0451] (B) RCA + Exonucleolysis: Reacted probes will undergo RCA while the rest of the gDNA fragments will be processed by a single-strand DNA exonuclease (Phi29 polymerase and Exonuclease I). RCP capture and clean up: Generated RCPs are captured on streptavidin-coated magnetic beads with oligos complementary to the RCPs to facilitate clean up from excess enzymes, primers, salts and processed untagged gDNA. RCP monomerization: RCPs are monomerized and released from the magnetic beads by using a restriction endonuclease (AluI, SapI). Second strand synthesis: Monomers from RCPs are ssDNA molecules. A primer complementary to the RCP monomer and a DNA polymerase, such as T4 DNA polymerase, are added to generate the complementary strand to enable downstream sequencing adaptor ligation and sample indexing. Adaptor and sample indexing: The double-stranded RCP monomers are subsequently subjected to an A-tailing reaction to enable the ligation of sequencing adaptors and sample indexing barcodes with T4 DNA ligase. Sequencing: Resulting barcoded RCP monomers are submitted to NGS to characterise the sequence(s) adjacent to the Oligonucleotide-Tag sequence. These sequences are mapped to the genome by pairwise alignment to find the location in the genome where the RGN-induced DSB occurred.
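The final step of the workflow in Figure 1, in which the sequence adjacent to the Oligonucleotide-Tag is recovered from each read before mapping, might be sketched as follows. This is an illustrative sketch only: the tag and read sequences are hypothetical placeholders rather than the sequences of this disclosure, exact string matching is assumed in place of error-tolerant matching, and alignment of the recovered flank to a reference genome is outside its scope.

```python
def flanking_sequence(read, tag, flank_length=20):
    """Return up to `flank_length` bases immediately downstream of `tag`.

    Returns None when the tag is not found in the read.
    """
    index = read.find(tag)
    if index == -1:
        return None
    start = index + len(tag)
    return read[start:start + flank_length]


# Hypothetical tag and reads, for illustration only.
tag = "ACGTTGCAAC"
tagged_read = "GGCC" + tag + "TTAGGCATCGA"
untagged_read = "GGCCTTAGGCATCGA"

flank = flanking_sequence(tagged_read, tag, flank_length=6)
missing = flanking_sequence(untagged_read, tag, flank_length=6)
```

Reads without the tag return no flank and would be excluded, so that only sequences adjacent to a tagged site are passed on for alignment.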
[0452] Figure 2 shows an exemplary assay workflow for detecting and characterising chromosomal aberrations using Probes comprising degenerate nucleotide sequences.
[0453] (A) Cells: Living cells are subjected to targeted gene editing using an RNA-guided nuclease (RGN) and sgRNA. This causes intended and unintended modifications. Chromosomal aberrations are expected to contain a known upstream or downstream sequence from the intended DSB caused by the RGN and an unknown sequence adjacent to it corresponding to an off-target site. gDNA extraction: Genomic DNA (gDNA) is extracted from these cells. Cell lysis and gDNA purification can be performed by column or magnetic bead-based methods as well as traditional ethanol-based precipitation methods. A crude lysate can also be used for these purposes. Extracted DNA will have sequences resulting from chromosomal rearrangements where at least one part of the sequence is known, corresponding to the on-target sequence. gDNA fragmentation: gDNA is fragmented to an average size of 500 to 1000 bp to facilitate the downstream RCA process. This can be achieved by enzymatic fragmentation with restriction endonucleases or by mechanical shearing using sonication. Denaturation, Probe hybridization & ligation: gDNA is denatured by heating, so that padlock probes with the 5' arm complementary to the known downstream or upstream on-target sequence and a 3' arm with degenerate nucleotides will hybridize to the unknown off-target sequence in the fragmented gDNA. A thermostable ligase will join the ends of the padlock probes.
[0454] (B) RCA + Exonucleolysis: Reacted probes will undergo RCA while the rest of the gDNA fragments will be processed by a single-strand DNA exonuclease (Phi29 polymerase and Exonuclease I). RCP capture and clean up: Generated RCPs are captured on streptavidin-coated magnetic beads with oligos complementary to the RCPs to facilitate clean up from excess enzymes, primers, salts and processed untagged gDNA. RCP monomerization: RCPs are monomerized and released from the magnetic beads by using a restriction endonuclease (AluI, SapI). Second strand synthesis: Monomers from RCPs are ssDNA molecules. A primer complementary to the RCP monomer and a DNA polymerase, such as T4 DNA polymerase, are added to generate the complementary strand to enable downstream sequencing adaptor ligation and sample indexing. Adaptor and sample indexing: The double-stranded RCP monomers are subsequently subjected to an A-tailing reaction to enable the ligation of sequencing adaptors and sample indexing barcodes with T4 DNA ligase. Sequencing: Resulting barcoded RCP monomers are submitted to NGS to obtain the sequence(s) adjacent to known upstream or downstream on-target sequences. These sequences are mapped to the genome by pairwise alignment to find the location(s) in the genome where chromosomal aberrations have occurred.
[0455] Figure 3 shows an exemplary assay workflow for detecting and characterising incorrectly-edited polynucleotide sequences and Double Strand Breaks (DSB) using selector probes.
[0456] (A) Cells: RGN-induced DSBs in the genomes of living cells are tagged by integration of a double-stranded Oligonucleotide-Tag by means of non-homologous end joining. gDNA extraction: Genomic DNA (gDNA) is extracted from these cells. Cell lysis and gDNA purification can be performed by column or magnetic bead-based methods as well as traditional ethanol-based precipitation methods. A crude lysate can also be used for these purposes. Extracted DNA will contain RGN-induced DSBs tagged with the Oligonucleotide-Tag. The Oligonucleotide-Tag contains a specific restriction site for a specific restriction endonuclease. gDNA fragmentation: gDNA is fragmented by enzymatic fragmentation with the restriction endonuclease that will cut at the Oligonucleotide-Tag and upstream of the original DSB. Denaturation, Probe hybridization & ligation: gDNA is denatured by heating. A selector probe with one arm complementary to the digested Oligonucleotide-Tag and the other arm with degenerate nucleotides will promote the circularization by hybridization of only those DNA fragments where the Oligonucleotide-Tag is integrated. A thermostable ligase will join the ends of the circularised target.
[0457] (B) RCA + Exonucleolysis: Reacted probes will undergo RCA while the rest of the gDNA fragments will be processed by a single-strand DNA exonuclease (Phi29 polymerase and Exonuclease I). RCP capture and clean up: Generated RCPs are captured on streptavidin-coated magnetic beads with oligos complementary to the RCPs to facilitate clean up from excess enzymes, primers, salts and processed untagged gDNA. RCP monomerization: RCPs are monomerized and released from the magnetic beads by using a restriction endonuclease (AluI, SapI). Second strand synthesis: Monomers from RCPs are ssDNA molecules. A primer complementary to the RCP monomer and a DNA polymerase, such as T4 DNA polymerase, are added to generate the complementary strand to enable downstream sequencing adaptor ligation and sample indexing. Adaptor and sample indexing: The double-stranded RCP monomers are subsequently subjected to an A-tailing reaction to enable the ligation of sequencing adaptors and sample indexing barcodes with T4 DNA ligase. Sequencing: Resulting barcoded RCP monomers are submitted to NGS to obtain the sequence(s) adjacent to the Oligonucleotide-Tag sequence. These sequences are mapped to the genome by pairwise alignment to find the location in the genome where the RGN-induced DSB occurred.
Figure 4 shows an exemplary assay workflow for detecting and characterising chromosomal aberrations using selector probes.
[0458] (A) Cells: Living cells are subjected to targeted gene editing using RGN and sgRNA. This causes intended and unintended modifications. Chromosomal aberrations are expected to contain a known upstream or downstream sequence from the on-target sequence and an unknown sequence adjacent to it corresponding to an off-target site. gDNA extraction: Genomic DNA (gDNA) is extracted from these cells. Cell lysis and gDNA purification can be performed by column or magnetic bead-based methods as well as traditional ethanol-based precipitation methods. A crude lysate can also be used for these purposes. Extracted DNA will have sequences resulting from rearrangements where at least one part of the sequence is known. gDNA fragmentation: gDNA is fragmented with one or more restriction endonucleases to produce fragments with a known end at the known on-target site and an unknown end, corresponding to the unknown fragment. Denaturation, Probe hybridization & ligation: gDNA is denatured by heating. A selector probe with one arm complementary to the known on-target fragment and the other arm with degenerate nucleotides will promote the circularization by hybridization of only those fragments that contain the known on-target sequence adjacent to a chromosomal aberration. A thermostable ligase will join the ends of the selector probe.
[0459] (B) RCA + Exonucleolysis: Reacted probes will undergo RCA while the rest of the gDNA fragments will be processed by a single-strand DNA exonuclease (Phi29 polymerase and Exonuclease I). RCP capture and clean up: Generated RCPs are captured on streptavidin-coated magnetic beads with oligos complementary to the RCPs to facilitate clean up from excess enzymes, primers, salts and processed untagged gDNA. RCP monomerization: RCPs are monomerized and released from the magnetic beads by using a restriction endonuclease (AluI, SapI). Second strand synthesis: Monomers from RCPs are ssDNA molecules. A primer complementary to the RCP monomer and a DNA polymerase, such as T4 DNA polymerase, are added to generate the complementary strand to enable downstream sequencing adaptor ligation and sample indexing. Adaptor and sample indexing: The double-stranded RCP monomers are subsequently subjected to an A-tailing reaction to enable the ligation of sequencing adaptors and sample indexing barcodes with T4 DNA ligase. Sequencing: Resulting barcoded RCP monomers are submitted to NGS to obtain the sequence(s) adjacent to known upstream or downstream on-target sequences. These sequences are mapped to the genome by pairwise alignment to find the location(s) in the genome where chromosomal aberrations have occurred.
Figure 5 corresponds to the data of Example 1 and shows the detection and characterisation of incorrectly-edited polynucleotide sequences using Probes comprising degenerate nucleotide sequences. (A to F) Absolute numbers of RCA-Products originating from different Probes (as per panel in Figure 5F) are presented for different sample mixes. Average numbers from two replicates are presented as box-plots. (F) Schematic of binding different Probes to their respective targets. "N": degenerate nucleotides. gDNA: human genomic DNA, non-edited; Target 1: Synthetic tagged incorrectly-edited site 1; Target 2: Synthetic tagged incorrectly-edited site 2; Probe 1 targets non-edited site 1; Probe 2 targets non-edited site 2; Probe 3 targets tagged incorrectly-edited site 1; Probe 4 targets tagged incorrectly-edited site 2; Probe 5 targets all tagged incorrectly-edited sites.
[0460] Figure 6 shows non-limiting examples of incorrectly-edited polynucleotide sequences and double strand breaks that can be detected with the methods of the invention.
[0461] Figure 7 shows a sequencing workflow starting from edited cells as in Example 2.
[0462] Figure 8 shows a sequencing workflow starting from edited genomic DNA as in Example 3.
[0463] Figure 9 illustrates an exemplary probe design for detecting payload integration, and various scenarios for detecting integrated payload sequences. A and B: The probe design can be selected with high flexibility by spanning the junction at either the 5' or 3' end of the integrated payload sequence. Additionally, designing a Payload Internal Probe allows identification and quantification of free and integrated payload. C: Represents a situation where a payload sequence is correctly integrated into the target sequence ("on-site"), e.g. the human genome. Because the payload and adjacent sequence are known, a junction-specific padlock probe can be designed to target the integrated site. D: Represents a situation where a payload sequence is integrated into the target sequence at an unknown ("off-site") site. Because the sequence adjacent to the payload sequence is not known, a junction-specific padlock probe cannot be used. Instead, a padlock probe with degenerate nucleotides ("N-Lock Probe"; "N" represents degenerate nucleotides) may be used to target that unknown integration site.
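The distinction drawn in Figure 9 between a junction-specific probe arm and a degenerate ("N-Lock") arm can be illustrated with a short sketch. This is a simplified illustration under stated assumptions: all sequences are hypothetical, hybridisation is modelled as exact base-by-base complementarity with "N" matching any base, and thermodynamics, mismatches and ligation are not modelled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def arm_can_hybridise(arm, target_site):
    """True if `arm` is complementary to `target_site`, with N matching any base."""
    expected = reverse_complement(target_site)
    if len(arm) != len(expected):
        return False
    return all(a == "N" or a == e for a, e in zip(arm, expected))


def degenerate_variants(arm):
    """Number of distinct sequences represented by an arm containing N positions."""
    return 4 ** arm.count("N")


# Hypothetical sequences, for illustration only.
known_site = "GTACGT"      # known sequence, e.g. within the payload
unknown_site = "TTGCAA"    # genomic sequence at an unknown integration site
specific_arm = "ACGTAC"    # designed against the known site
degenerate_arm = "NNNNNN"  # degenerate arm of an N-Lock Probe

specific_on_known = arm_can_hybridise(specific_arm, known_site)
specific_on_unknown = arm_can_hybridise(specific_arm, unknown_site)
degenerate_on_unknown = arm_can_hybridise(degenerate_arm, unknown_site)
pool_size = degenerate_variants(degenerate_arm)
```

The specific arm pairs only with the site it was designed against, whereas the degenerate arm, which represents a pool of 4^n sequences for n degenerate positions, can pair with a flanking sequence that is not known in advance.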
[0464] Figure 10 shows a comparison of the number of genome-mapped restriction sites. For both panels A and B, conditions are as follows: i) Untreated control: human genomic DNA (hgDNA) is neither subjected to endonuclease treatment nor dsODN-tag ligation; ii) Non-digested control: hgDNA is not subjected to endonuclease treatment but is processed for dsODN-tag ligation; iii) Non-ligated control: hgDNA is subjected to endonuclease treatment but not processed for dsODN-tag ligation; and iv) Test sample: hgDNA is subjected to both endonuclease treatment and dsODN-tag ligation. Time of incubation (in minutes) with the restriction endonuclease used is indicated on the axis labels. The number of mapped restriction sites is indicated on the y-axis. (A) Number of genome-mapped restriction sites following incubation with PmeI restriction endonuclease; (B) number of genome-mapped restriction sites following incubation with HindIII restriction endonuclease.
[0465] Figure 11 shows positions of PmeI restriction sites mapped on the reference human genome (hg38). Each vertical bar corresponds to a site (coordinates) detected and mapped on the human genome.
[0466] Figure 12 shows positions of HindIII restriction sites mapped on the reference human genome (hg38). Each vertical bar corresponds to a site (coordinates) detected and mapped on the human genome.
[0467] Figure 13 shows an exemplary annotated read showing a reacted probe capturing a PmeI restriction site on the human genome.
[0468] Figure 14 shows a schematic representation of samples used in Example 5. Condition i) (Untreated control sample); Condition ii) (test sample) with integrated payload.
[0469] Figure 15 shows the positions of mapped payload insertion sites on the reference human genome for two conditions in Example 5. Condition (i) represents a negative control (untreated sample), and Condition (ii) is the test sample with an integrated payload sequence. Each vertical bar corresponds to a mapped insertion site (genomic coordinates) on the human genome. In the test sample, one insertion site is expected (labelled) from the payload; however, several other sites were also mapped, indicating unknown ("off-target") insertions.
[0470] EXAMPLES
[0471] Example 1
[0472] Specificity of the disclosed probe design to detect incorrectly-edited sites tagged with the Oligonucleotide-Tag sequence
[0473] This example demonstrates the specificity of the probe design described in this disclosure to detect incorrectly-edited sites or double strand breaks that have been tagged with the Oligonucleotide-Tag sequence.
[0474] For this, three sample mixes were prepared: i) gDNA: human genomic DNA; ii) gDNA + spiked-in Target 1 (wherein Target 1 is a mixture of two oligonucleotides of complementary sequence (SEQ. ID. NO: 3 and 4) mimicking an incorrectly-edited site (called "site 1") in the human TP53 gene, and integrated Oligonucleotide-Tag comprising SEQ. ID. NO: 1 and 2); and iii) gDNA + spiked-in Target 2 (wherein Target 2 is a mixture of two oligonucleotides of complementary sequence (SEQ. ID. NO: 5 and 6) mimicking an incorrectly-edited site (called "site 2") in the human ACTA2 gene, and integrated Oligonucleotide-Tag comprising SEQ. ID. NO: 1 and 2). Sequences of the spiked-in templates are presented in Table 1.
[0475] The sample mixes were tested with 3 different probe mixes: a) a mix containing Probes 1 and 2 (SEQ. ID. NO: 7 and 8), targeting non-edited (wild-type) sites; b) a mix containing Probes 3 and 4 (SEQ. ID. NO: 9 and 10), specifically targeting incorrectly-edited sites 1 and 2, respectively; and c) a mix containing Probe 5 (SEQ. ID. NO: 11), targeting the Oligonucleotide-Tag of the incorrectly-edited sites and containing degenerate nucleotides. For each combination, DNA fragmentation, hybridization, ligation, RCA, labelling and fluorescence microscopy were performed as described below. Synthetic targets (SEQ. ID. NO: 1-6) were obtained from IDT. Padlock probes (SEQ. ID. NO: 7-11) were obtained from Genewiz. Sequences of the Probes used are presented in Table 1.
[0476] Table 1. Specific sequences of oligonucleotides used in the example
[0477] (underlined: Oligonucleotide-Tag corresponding sequence in the Targets and Probes; PO4: phosphate group; "N": degenerate base)
[0478] Fragmentation, Probe Hybridization & Ligation & RCA
[0479] 100 ng of human genomic DNA (Sigma Aldrich) in 10 µL was subjected to enzymatic fragmentation by mixing with 9.75 µL Fragmentation Buffer (Countagen) and 0.75 µL Fragmentation enzyme (Countagen) for a total reaction volume of 20 µL. For spike-in mixes, 4 fM of each synthetic target was added. The mix was incubated at 37 °C for 15 min, followed by 95 °C for 2 min. Next, the ligation mixtures were prepared by adding 1.5 µL of Probes, 8.2 µL Probing buffer (Countagen) and 0.25 µL Probing enzyme (Countagen) for a final volume of 30 µL. The mixture was incubated at 98 °C for 10 min followed by 55 °C for 20 min. For rolling circle amplification, 9.2 µL Amplification Buffer (Countagen), 0.8 µL Amplification Enzyme 1 (Countagen) & 0.2 µL Amplification Enzyme 2 (Countagen) were added for a total volume of 40 µL. The amplification mixture was incubated at 37 °C for 3 h followed by 65 °C for 2 min.
[0480] Signal enrichment and detection
[0481] To detect and capture RCA-Products, 8 µL of Labelling Buffer (Countagen), 1 µL of Magnetic Beads (Countagen) and 1 µL of Labelling Probes (Countagen) were added to the Amplification mix for further incubation at 75 °C for 2 min followed by 55 °C for 10 min. Tubes containing the Labelled products were placed on a Magnetic Stand (Countagen) for 15 min. Subsequently, the supernatant was discarded and 15 µL of Resuspension Buffer (Countagen) was added to each tube.
[0482] Samples were briefly centrifuged and vortexed. A Chip (Countagen) and Chip holder (Countagen) were assembled. The full volume of each sample (15 µL) was applied to a channel of the readout Chip. After 30 min incubation at room temperature, each channel of the readout Chip was imaged with a Leica DMi8 inverted fluorescence microscope with a 20x magnification objective, a field of view of 0.65 x 0.65 µm2 and the following light sources and Filter Channels: LED470 (Ex: 480 / 40, Em: 527 / 30), Y3 (Ex: 545 / 25, Em: 605 / 70), Y5 (Ex: 620 / 60, Em: 700 / 75). One field of view and 5 z-stacks at a 1.25 µm interval were acquired.
[0483] Image analysis was performed using GeneAbacus Image Analyzer (GAIA) desktop software (version 1.2.0, Countagen). Briefly, acquired multipage *.TIFF files were uploaded to the software with a threshold of 1.2. The resulting *.csv file with the quantification for each sample was used to generate the graphs shown in Figure 5.
[0484] Results
[0485] The results are summarised in Figure 5. Probes 1 and 2 (Panels A and B) do not detect the spiked synthetic targets that mimic incorrectly edited sites tagged with the Oligonucleotide-tag sequence, as the signal from these Probes is unaffected by the presence of the spiked targets. In contrast, Panels C and D show that Probes 3 and 4, which specifically target incorrectly edited sites tagged with the Oligonucleotide-tag sequence, produce signal only when their respective target sites are present in the mixture. Finally, Panel E demonstrates that Probe 5 shows a signal only when the spiked, incorrectly-edited sites tagged with the Oligonucleotide-tag sequence are present. This indicates that probes containing degenerate nucleotides can specifically detect the Oligonucleotide-tag sequence together with the adjacent variable sequence that mimics the incorrectly-edited sequence. The observed decrease in signal count is expected, because the degenerate bases dilute the effective concentration of probe molecules that are exactly complementary to a given incorrectly-edited sequence and can react with the specific spiked sequences.
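The dilution effect noted above can be illustrated with a short calculation. The following is a minimal sketch only, assuming that each degenerate position "N" is an equimolar mixture of the four bases; the function names and the example values are hypothetical and are not taken from Table 1.

```python
def exact_match_fraction(n_degenerate: int) -> float:
    """Fraction of probe molecules whose degenerate positions all match one given target.

    Assumes (hypothetically) that every 'N' is an equimolar mix of A, C, G and T.
    """
    if n_degenerate < 0:
        raise ValueError("n_degenerate must be non-negative")
    return 0.25 ** n_degenerate


def effective_concentration(total_conc: float, n_degenerate: int) -> float:
    """Concentration of exactly complementary probe, in the same unit as total_conc."""
    return total_conc * exact_match_fraction(n_degenerate)
```

For example, under these assumptions a probe with two degenerate positions supplied at 16 nM would contain 1 nM of molecules that are exactly complementary to any single target; partially matched probes may still hybridise, so this is a lower bound on reactive probe rather than a prediction of signal.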
[0486] Example 2
[0487] Sequencing Workflow starting from edited cells
[0488] Gene Editing and Integration of Oligonucleotide tag
[0489] This example demonstrates the workflow when detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure is achieved by DNA sequencing.
[0490] In a first step, a sample following a genetic-editing procedure is provided. Integration of the Oligonucleotide-Tag can be achieved as explained elsewhere herein, or for example as described in Malinin et al., 2021, Nat Protoc 16, 5592-5615.
gDNA crude extraction
[0491] In the next step, crude genomic DNA (gDNA) is harvested. The process begins with harvesting between 30,000 and 50,000 cells; the cells are collected in a sterile environment to prevent contamination. An appropriate volume of lysis buffer is added to the harvested cells. The lysis buffer typically contains detergents and enzymes that help break down the cell membrane and nuclear envelope, releasing the crude genomic DNA (gDNA) into the solution. The cell-lysis buffer mixture is incubated at 100°C for 60 minutes, which helps to denature proteins and other cellular components, facilitating the release of gDNA. After the incubation period, the temperature is gradually ramped down from 100°C to 20°C at a rate of approximately 2°C per minute. This controlled cooling process helps to prevent the formation of secondary structures in the DNA, ensuring high-quality gDNA extraction. Once the mixture has cooled to 20°C, the neutralisation reagent is added. The neutralisation reagent contains components that neutralise the lysis buffer, stabilizing the gDNA and making it suitable for downstream applications.
gDNA fragmentation
[0492] The fragmentation reagent and a fragmentation (restriction) enzyme (such as AluI, DdeI, or MseI), at a concentration of 300 mU / µL, are added to the sample. The mixture is incubated at 37°C for 60 minutes to allow the restriction enzyme to cleave the DNA at specific recognition sites, resulting in fragmented gDNA.
[0493] Denaturation, Probe Hybridization, and Ligation
[0494] The probing buffer is added to the gDNA sample. This buffer facilitates the hybridization of the Probes to the target DNA sequences. A probe mix, containing 5 nM of each probe, as well as a ligation enzyme, at a concentration of 65 mU / µL (such as Tth Ligase, Ampligase, or Taq DNA Ligase), are added to the mixture. The mixture is incubated at 98°C for 2 minutes to denature the DNA, allowing the probes to bind to their complementary sequences. Following denaturation, the temperature is reduced to 55°C, and the mixture is incubated for 30 minutes. This step allows the ligation enzyme to seal the nicks in the DNA, completing the hybridization and ligation process.
[0495] Rolling Circle Amplification (RCA)
[0496] The RCA (Rolling circle amplification) buffer is added to the sample. This buffer provides the necessary components for the amplification reaction such as dNTPs. Amplification enzymes, including Phi29 polymerase at a concentration of 100 mU / µL and DNA exonuclease I at a concentration of 1 U / µL, are added to the mixture. The mixture is incubated at 37°C for 120 minutes. This incubation period allows the Phi29 polymerase to perform the rolling circle amplification, generating RCA-Products, while the exonuclease removes any unreacted probes from the sample. Following the amplification, the temperature is increased to 65°C, and the mixture is incubated for an additional 10 minutes. This step inactivates the enzymes and completes the amplification process.
[0497] RCP Capture and Clean-Up
[0498] The capture buffer, containing biotin-conjugated oligo at a concentration of 5 nM, is added to the sample. The mixture is incubated at 72°C for 2 minutes to ensure proper binding of the oligonucleotides. Following this, the temperature is reduced to 55°C, and the mixture is incubated for an additional 10 minutes to stabilize the binding. The beads are collected using a magnetic rack. This process typically takes about 15 minutes, allowing the beads to be separated from the supernatant. The supernatant is carefully discarded, and the beads are resuspended in molecular-grade water (mQH2O).
RCP Monomerization
[0499] The monomerisation buffer and a restriction enzyme, at a concentration of 100 mU / µL (such as AluI or SapI), are added to the mixture. The mixture is incubated at 37°C for 5 minutes to allow the restriction enzyme to cleave the DNA. Following this, the temperature is increased to 65°C, and the mixture is incubated for an additional 2 minutes to inactivate the enzyme. The beads are collected using a magnetic rack. This process typically takes about 15 minutes, allowing the beads to be separated from the supernatant. The magnetic beads are carefully discarded, leaving the monomerized product in the solution.
[0500] Second Strand Synthesis
[0501] The RCP monomers are cleaned up using SPRI beads at a 0.45x ratio. This step helps to purify the monomers by removing any unwanted components. NEB 1x buffer and 100 nM of the second strand primer are added to the cleaned-up monomers. The mixture is incubated at 55°C for 2 minutes to allow the primer to anneal to the template. T4 DNA polymerase, at a concentration of 100 mU / µL, and dNTPs, at a concentration of 100 mM each, are added to the mixture. The mixture is incubated at 12°C for 10 minutes to allow the synthesis of the second strand. Following this, the temperature is increased to 75°C, and the mixture is incubated for an additional 10 minutes to complete the second strand synthesis.
[0502] A-Tailing
[0503] The double-stranded RCP monomers are cleaned up using SPRI beads at a 0.45x ratio. Terminal transferase reaction buffer is added to the cleaned-up monomers. ddATP at a concentration of 20 nM, CoCl2, and terminal transferase at a concentration of 2 U / µL are added to the mixture. The mixture is incubated at 37°C for 30 minutes. This incubation period allows the terminal transferase to add adenine (A) residues to the 3' ends of the DNA.
[0504] Adaptor Ligation
[0505] The double-stranded RCP monomers are cleaned up using SPRI beads at a 0.45x ratio. This step helps to purify the monomers by removing any unwanted components. The DNA concentration is measured using the Qubit dsDNA HS Assay Kit. Blunt / TA Ligase Master Mix and a sequencing adaptor, at a concentration of 150 mM, are added to the mixture. The adaptor is designed to ligate to the ends of the DNA fragments, preparing them for sequencing. The mixture is incubated at 25°C for 15 minutes.
Preparation for MinION Sequencing
[0506] The double-stranded RCP monomers are cleaned up using SPRI beads at a 0.45x ratio. The DNA concentration is measured using the Qubit dsDNA HS Assay Kit. The DNA is diluted to 100 fmol in Elution Buffer. This step prepares the DNA sample for optimal performance in the MinION sequencing process.
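Because the Qubit assay reports a mass concentration while the loading amount is specified in fmol, a conversion is needed. The following is a minimal sketch of that conversion, assuming an average mass of 660 g/mol per base pair of double-stranded DNA (a common approximation) and a known average fragment length; the function names and the example length are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA; an approximation, not a value from this disclosure


def fmol_to_ng(fmol: float, length_bp: float) -> float:
    """Mass in ng of `fmol` femtomoles of dsDNA fragments of average length `length_bp`."""
    moles = fmol * 1e-15
    grams = moles * length_bp * AVG_BP_MASS
    return grams * 1e9


def volume_for_fmol(target_fmol: float, conc_ng_per_ul: float, length_bp: float) -> float:
    """Volume in µL of a sample at `conc_ng_per_ul` that contains `target_fmol` of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return fmol_to_ng(target_fmol, length_bp) / conc_ng_per_ul
```

Under these assumptions, 100 fmol of fragments averaging 300 bp corresponds to about 19.8 ng, so a sample measured at 9.9 ng / µL would require about 2 µL.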
[0507] MinION Sequencing
[0508] The MinION Sequencing Protocol is performed following the vendor's recommended instructions. This typically includes loading the prepared DNA sample onto the MinION flow cell and running the sequencing software as per the manufacturer's guidelines.
[0509] Example 3
[0510] Sequencing Workflow starting from edited genomic DNA
[0511] In vitro RGN cleavage and Ligation of Oligonucleotide tag
[0512] This example demonstrates the workflow when detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure is achieved by DNA sequencing, starting with genomic DNA.
[0513] In a first step, in vitro RNA-guided nuclease (RGN) cleavage and ligation of the Oligonucleotide tag are performed on the sample, for example as described in Cameron et al., 2017, Nat Methods 14, 600-606. The remaining downstream steps (gDNA fragmentation; Denaturation, Probe hybridization and Ligation; Rolling circle amplification (RCA); RCP capture and clean up; RCP monomerization; Second strand synthesis; A-tailing; Adaptor Ligation; Preparation for MinION Sequencing; and MinION sequencing) can be performed as in Example 2.
[0514] Example 4
[0515] Capability of the disclosed probe design and procedure to detect and characterise endonuclease-induced double strand breaks.
[0516] Example 4 demonstrates the capability of the disclosed probe design and method to detect double-strand breaks induced by endonucleases. To evaluate the ability of the disclosed method to capture genome-wide double-strand breaks, human genomic DNA was treated with restriction endonucleases, mimicking double-strand breaks caused by RNA-guided endonucleases such as CRISPR / Cas9. A common double-stranded Oligonucleotide-tag (dsODN) was then integrated at the double-strand break induced by restriction endonucleases for further capture and detection by sequencing.
[0517] Results are depicted in Figures 10-13. In this example, human genomic DNA (hgDNA) is subjected to the following conditions: i) Untreated control: hgDNA is neither subjected to endonuclease treatment nor dsODN-tag ligation. ii) Non-digested control: hgDNA is not subjected to endonuclease treatment but processed for dsODN-tag ligation. iii) Non-ligated control: hgDNA is subjected to endonuclease treatment but not processed for dsODN-tag ligation. iv) Test Sample: hgDNA is subjected to both endonuclease treatment and dsODN-tag ligation.
[0518] Resulting samples from each condition were further processed in the same manner by Probe Hybridization with Probe 7 (SEQ. ID. NO: 16), Gap-fill & Ligation, Amplification, End-repair, Sample Index Ligation, Sequencing Adaptor Ligation & Sequencing by the methods described below. dsOligonucleotide-tag and Gap-fill padlock probe (SEQ. ID. NOs: 14-16) were obtained from IDT.
[0519] Table 2. Specific sequences of oligonucleotides used in Example 4
[0520] (^Padlock probe targeting the Oligonucleotide-Tag-2 sequence and containing degenerate bases in the 5' arm. Underlined: Oligonucleotide-Tag corresponding sequence in Probe 7, PO4: phosphate group; "N" degenerate base)
[0521] Endonuclease treatment, End repair, A-tailing & dsOligonucleotide-tag Ligation
[0522] For samples with endonuclease treatment: 1 µg human genomic DNA (Sigma Aldrich) was mixed with 5 Units of restriction endonuclease (HindIII-High Fidelity or PmeI, NEB) in a final volume of 30 µL.
[0523] The mix was incubated for either 5 or 15 min at 37°C, followed by inactivation at 80°C for 10 min.
[0524] Resulting samples were cleaned up using SPRIReagent (Beckman Coulter) at a 1.8x ratio, following the manufacturer's protocol. Briefly, 1.8 volumes of reagent per µL of sample were added to each sample and incubated for 10 min. Samples were put on a magnet for 5 min and the supernatant was discarded. Two washes with 200 µL 80 % Ethanol were performed. Samples were removed from the magnet and the pelleted magnetic beads were left to dry for 1 min. Beads were resuspended in 30 µL of Low Tris-EDTA buffer (10 mM Tris, 0.1 mM EDTA, IDT) and incubated for 3 min at Room Temperature. The samples were then put on the magnet for an additional 3 min, and 30 µL of supernatant was recovered for downstream processing.
[0525] 25 µL of the cleaned DNA was mixed with 3.5 µL Ultra II End-prep Reaction Buffer (NEB) and 1.5 µL Ultra II End-prep Enzyme Mix (NEB) and incubated at 20°C for 20 min, followed by 65°C for 5 min. The resulting mix was cleaned up with 1.8x SPRIReagent as described above, with the difference that the final resuspension was performed in 15 µL of milliQ water.
[0526] For samples with dsODN tag ligation: dsODN-2 sense and antisense oligonucleotides (SEQ. ID. NOs: 14 and 15) were resuspended and mixed at a final concentration of 100 µM in Nuclease-Free Duplex Buffer (30 mM HEPES, pH 7.5; 100 mM potassium acetate, IDT), followed by incubation for 5 min at 95°C and cooling down at Room Temperature for annealing.
[0527] 11 µL of purified end-repaired samples were mixed with 1.5 µL of annealed dsODN and 12.5 µL of Blunt / TA Ligase Master Mix (NEB). The mix was incubated at room temperature for 20 min, followed by the addition of 2.5 µL 100 mM EDTA. Mixes were purified with 1.8x SPRIReagent with a final elution in 20 µL of milliQ water. Final concentration was determined by Qubit (Thermofisher Scientific).
[0528] Probe Hybridization, Gap-fill, exonuclease treatment & amplification
[0529] 5 ng of each sample from conditions i to iv was mixed with 2.8 pmol Probe 7 (SEQ. ID. NO: 16) and 1x Hybridization Buffer (Countagen). The mix was incubated at 98°C for 5 min followed by 16 h incubation at 55°C. Extension and ligation were performed by the addition of 1 µL Hybridization Buffer, 0.3 µL 10 mM dNTPs, 1 µL Ligation Enzyme (Countagen) & 0.2 µL Extension Enzyme (Countagen) to a final reaction volume of 20 µL. This mixture was incubated at 55°C for 15 min, followed by 72°C for 15 min.
[0530] Next, 1 µL of Exonuclease enzyme 1 (Countagen) and 0.5 µL Exonuclease enzyme 2 (Countagen) in 5 µL were added and the mixtures were incubated at 37°C for 60 min, followed by 80°C for 20 min. For amplification, 5 µL of the resulting products were mixed with 1 µL 10 mM Primers 1 and 2 (SEQ. ID. NOs: 17 and 18), 0.4 µL 10 mM dNTPs, 4 µL Phusion HF buffer (Thermo Scientific) and 0.2 µL Phusion™ Hot Start II DNA Polymerase (Thermo Scientific). The samples were incubated at 98°C for 3 min, followed by 30 cycles of 98°C for 30 s, 60°C for 30 s and 72°C for 15 s, and a final step at 72°C for 3 min. Products were purified with 1.8x SPRI Reagent as described above, with a final elution volume of 15 µL mQ water. All conditions were run in duplicate.
[0531] Library Preparation and Sequencing
[0532] Sequencing libraries were prepared using Native Barcoding Kit 24 V14 (Oxford Nanopore) and following the protocol for Ligation sequencing amplicons (NBA_9168_v114_revR_30Jan2025, Oxford Nanopore). Briefly, 200 fmol of each amplicon was subjected to End repair and A-tailing using the Ultra II End-prep Reaction Buffer and Ultra II End-prep Enzyme Mix (NEB). Samples were cleaned with 1.8x AMPure XP beads (Beckman Coulter) following the same protocol described above for the SPRIReagent and eluted in mQ water. Sample indexing was performed by adding individual Native Barcodes (NB01-24, Oxford Nanopore) and using Blunt / TA Ligase Master Mix (NEB). The reaction was stopped by adding EDTA (Oxford Nanopore).
[0533] Barcoded samples were pooled and cleaned up with a 0.8x ratio of AMPure XP beads. 30 µL of this pool was mixed with Native Adapter (NA) (Oxford Nanopore), NEBNext Quick Ligation Reaction Buffer (NEB) and Quick T4 DNA Ligase (NEB). Clean-up was performed using a 0.4x ratio of AMPure XP beads, using Short Fragment Buffer (Oxford Nanopore) for the washes and Elution Buffer (Oxford Nanopore) for the final elution. 100 fmol of the resulting library was mixed with Sequencing Buffer & Library Beads (Oxford Nanopore) before loading it into the MinION flow cell.
[0534] Sequencing was performed following the manufacturer's instructions (MinION, Oxford Nanopore). A 24 h sequencing run was performed with default parameters and using the inline Adapter Trimming, Sample Demultiplexing and Super Accurate Base Calling Algorithms from the MinKNOW software (v 6.5.15, Oxford Nanopore).
[0535] Data analysis
[0536] A custom pipeline was written in Python (v 3.14) to implement the following workflow: Demultiplexed raw FASTQ files for each sample were initially filtered for read quality (Q score >10) and size (70-600 bp). Following this, sequences containing the structure ACTACGGATTCATGCGCTCT(N, 18-250)GTTTAATTGAGTTGTCATATGTTA were extracted. Here, 'N' represents the captured sequence, ranging from a minimum of 18 bp (corresponding to the length of the degenerate arm of Probe 7) up to a maximum of 250 bp (the estimated maximum size of probe extension).
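The filtering and extraction steps described above can be sketched as follows. This is an illustrative sketch only, not the custom pipeline itself: the function names are hypothetical, the read quality is taken as the arithmetic mean of the per-base Phred scores (an assumption, since the averaging method is not stated), and only the read orientation shown above is searched.

```python
import re

LEFT = "ACTACGGATTCATGCGCTCT"
RIGHT = "GTTTAATTGAGTTGTCATATGTTA"
# Captured sequence: 18-250 bases between the two fixed flanks (shortest match preferred).
CAPTURE_RE = re.compile(f"{LEFT}([ACGTN]{{18,250}}?){RIGHT}")


def parse_fastq(lines):
    """Yield (name, sequence, quality) from an iterable of FASTQ lines (4 lines per record)."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip()[1:], seq, qual


def mean_phred(quality, offset=33):
    """Arithmetic mean of per-base Phred scores (assumed definition of the read Q score)."""
    if not quality:
        return 0.0
    return sum(ord(c) - offset for c in quality) / len(quality)


def filter_and_extract(records, min_q=10.0, min_len=70, max_len=600):
    """Return {read name: captured sequence} for reads passing the size and quality filters."""
    captured = {}
    for name, seq, qual in records:
        if not (min_len <= len(seq) <= max_len):
            continue
        if mean_phred(qual) <= min_q:
            continue
        match = CAPTURE_RE.search(seq.upper())
        if match:
            captured[name] = match.group(1)
    return captured
```

In use, `filter_and_extract(parse_fastq(open(path)))` would return the captured sequences for one demultiplexed file, which can then be written out in FASTA format.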
[0537] The extracted captured sequences were saved in a FASTA file. Next, a search was performed for partial restriction sites (GTTT / AAAC for PmeI and AGCT for HindIII) at the first or last 6 bases of these captured sequences. Captured sequences with partial restriction sites were then shortlisted and aligned against the reference human genome (GRCh38 / hg38, NCBI RefSeq Assembly GCF_000001405.26) using Nucleotide BLAST (version 2.15.0+, NCBI), requiring a minimum similarity of 0.75 and an alignment length of at least 16 bases. The captured sequences with the highest score per query were further down-selected, and the 10 bp flanking bases from the mapped human genome site references were extracted for each.
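The shortlisting step, in which partial restriction sites are sought within the first or last 6 bases of each captured sequence, can be sketched as below. The dictionary and function names are hypothetical, and the sketch assumes that a motif must lie entirely within the 6-base window to count.

```python
# Partial sites expected at a tagged break, as listed in the text above.
PARTIAL_SITES = {
    "PmeI": ("GTTT", "AAAC"),
    "HindIII": ("AGCT",),
}


def has_partial_site(captured, enzyme, window=6):
    """True if a partial restriction site lies within the first or last `window` bases."""
    seq = captured.upper()
    ends = (seq[:window], seq[-window:])
    return any(motif in end for motif in PARTIAL_SITES[enzyme] for end in ends)


def shortlist(captured_by_read, enzyme):
    """Keep only the captured sequences that carry a partial site at either end."""
    return {name: seq for name, seq in captured_by_read.items() if has_partial_site(seq, enzyme)}
```

A sequence carrying the motif only in its interior is rejected by this sketch, which reflects the expectation that the break, and hence the partial site, sits at the junction with the Oligonucleotide-tag.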
[0538] A subsequent search for the complete restriction sites (PmeI: GTTTAAAC, HindIII: AAGCTT) was performed within these extended captured sequences. The resulting coordinates of these were reported in a *.bed file, which was then visualized using the Integrative Genomics Viewer (version 2.19.7) to map the captured coordinates to the genome.
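The search for complete restriction sites and the reporting of their coordinates can be sketched as follows. This is a hedged illustration: the function names and the four-column BED layout are assumptions, and `window_start` is taken to be the 0-based genomic coordinate of the first base of the extended captured sequence. Both recognition sequences are palindromic, so searching one strand is sufficient.

```python
FULL_SITES = {"PmeI": "GTTTAAAC", "HindIII": "AAGCTT"}


def find_full_sites(extended_seq, enzyme):
    """Return the 0-based offsets of every complete restriction site in the sequence."""
    site = FULL_SITES[enzyme]
    seq = extended_seq.upper()
    offsets = []
    pos = seq.find(site)
    while pos != -1:
        offsets.append(pos)
        pos = seq.find(site, pos + 1)
    return offsets


def bed_lines(chrom, window_start, extended_seq, enzyme, name="."):
    """Format each site as a BED line (0-based start, end-exclusive)."""
    site_len = len(FULL_SITES[enzyme])
    return [
        f"{chrom}\t{window_start + off}\t{window_start + off + site_len}\t{name}"
        for off in find_full_sites(extended_seq, enzyme)
    ]
```

Writing the returned lines to a file, one per line, gives a track that a genome browser such as the Integrative Genomics Viewer can load.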
[0539] Results
[0540] The primary findings of the experiment are synthesized and presented in Figures 10-12. Overall, a clear and quantifiable increase in the number of mapped restriction sites was observed specifically in human genomic DNA (hgDNA) samples that underwent treatment with the restriction endonuclease and subsequent tagging with dsODN (condition (iv)). This observation confirms the successful enzymatic cleavage, dsODN-tagging and detection with the disclosed probe design and method.
[0541] The number of detected breaks increased with the restriction incubation time: 15 min of incubation with the PmeI restriction enzyme resulted in a greater number of successfully detected restriction sites than 5 min. Moreover, a significant difference was quantified in the number of detected breaks when comparing the two restriction enzymes used, HindIII and PmeI. As theoretically predicted based on their recognition sequences, HindIII yielded around 20 times more detected breaks than PmeI. The human genome (approximately 3.2 x 10^9 base pairs) is estimated to contain around 781,000 HindIII cutting sites (a 6-base cutter) compared to only about 49,000 PmeI sites (an 8-base cutter). This theoretical distribution suggests that HindIII should cut the genome approximately 16 times more frequently than PmeI. The experimental results align closely with this expectation, demonstrating that the method can quantify enzyme-specific endonuclease activity.
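The theoretical site counts quoted above follow from a simple estimate, reproduced here as a minimal sketch. It assumes a random sequence with equal base frequencies, so that a given recognition sequence of length n occurs once every 4^n positions; the function name is hypothetical, and real genomes deviate from this model.

```python
def expected_sites(genome_bp: float, site_len: int) -> float:
    """Expected number of occurrences of one specific site of `site_len` bases.

    Assumes independent positions with equal frequencies of A, C, G and T.
    """
    return genome_bp / 4 ** site_len


GENOME_BP = 3.2e9
hindiii = expected_sites(GENOME_BP, 6)  # 6-base cutter
pmei = expected_sites(GENOME_BP, 8)     # 8-base cutter
ratio = hindiii / pmei
```

With a genome of 3.2 x 10^9 bp this gives 781,250 expected HindIII sites, about 48,828 expected PmeI sites, and a ratio of 4^2 = 16.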
[0542] The experiment also detected endogenous DNA breaks, which represent pre-existing double-strand breaks not induced by the restriction enzyme treatment. The data indicated a very low background level of these endogenous breaks. Specifically, an average of only 1 to 2 endogenous breaks was detected in control conditions (i-iii), suggesting excellent DNA integrity prior to enzymatic treatment. For the HindIII control samples, a slightly higher but still relatively low average of 19 endogenous breaks was observed. This minimal background confirms that the vast majority of the detected sites following full treatment (iv) are indeed newly formed restriction-mediated cleavage points, not method artifacts or pre-existing DNA damage, thus validating the specificity of the overall method.
[0543] Example 5
[0544] Capability of the disclosed probe design and procedure to detect and map integrated payload-genome junctions
[0545] This prophetic example demonstrates the capability of the disclosed probe design and method to detect and map the junctions between a payload sequence (e.g. a lentiviral vector, AAV, or CAR-T construct) integrated at a correct ("on-target") site and the known or unknown flanking genomic DNA at the integration site.
[0546] To evaluate the ability of the disclosed method to capture genome-wide integration sites, this example describes a procedure using genomic DNA from a cell line known to contain an integrated payload. The method utilizes a padlock probe design with one arm specific to the known payload sequence and a second, degenerate arm that hybridizes to the unknown flanking genomic DNA.
[0547] In this prophetic example, human genomic DNA (hgDNA) is subjected to the following conditions: i) Untreated control : hgDNA from a non-transduced / non-edited human cell line (e.g., wild-type HEK293T), and ii) Test sample: hgDNA from a human cell line known to contain one or multiple integrations of a specific payload (e.g., lentivirally-transduced HEK293T cells), as outlined in Figure 14.
[0548] Resulting samples from each condition are processed equally by Probe Hybridization with Probe 8 (SEQ ID NO: 19), Gap-fill & Ligation, Amplification, End-repair, Sample Index Ligation, Sequencing Adaptor Ligation & Sequencing by the methods described below.
[0549] Table 3. Specific sequences of oligonucleotides used in Example 5
[0550] (^Padlock probe targeting a Payload Target Sequence (SEQ ID NO:20) and containing degenerate bases in the 5' arm; PO4: phosphate group; "N" degenerate base;
[0551] ** Hypothetical Payload Target Sequence. A hypothetical 21-base pair sequence found within the integrated payload, targeted by the 3' arm of Probe 8 (SEQ ID NO: 19))
[0552] Methods
[0553] Sample Preparation: Genomic DNA is extracted from the Untreated control (i) and condition (ii) (test sample) cell lines (as per Figure 14) using a standard genomic DNA purification kit (e.g. QIAamp DNA Mini Kit, Qiagen). Final DNA concentration is determined by Qubit (Thermofisher Scientific) and the DNA is diluted to a working concentration in Tris-EDTA buffer.
[0554] Probe Hybridization, Gap-fill, exonuclease treatment & amplification
[0555] 5 ng of each sample from conditions (i) and (ii) is mixed with 2.8 pmol Probe 8 (SEQ ID NO: 19) and 1x Hybridization Buffer (Countagen). The mix is incubated at 98°C for 5 min followed by 16 h incubation at 55°C.
[0556] Extension and ligation are performed by the addition of 1 µL Hybridization Buffer, 0.3 µL 10 mM dNTPs, 1 µL Ligation Enzyme (Countagen) & 0.2 µL Extension Enzyme (Countagen) to a final reaction volume of 20 µL. This mixture is incubated at 55°C for 15 min, followed by 72°C for 15 min.
[0557] Next, 1 µL of Exonuclease enzyme 1 (Countagen) and 0.5 µL Exonuclease enzyme 2 (Countagen) in 5 µL are added and the mixtures are incubated at 37°C for 60 min, followed by 80°C for 20 min.
[0558] For amplification, 5 µL of the resulting products are mixed with 1 µL 10 mM Primers 1 (SEQ ID NO: 17) and 2 (SEQ ID NO: 18), 0.4 µL 10 mM dNTPs, 4 µL Phusion HF buffer (Thermo Scientific) and 0.2 µL Phusion Hot Start II DNA Polymerase (Thermo Scientific). The samples are incubated at 98°C for 3 min, followed by 30 cycles of 98°C for 30 s, 60°C for 30 s and 72°C for 15 s, and a final step at 72°C for 3 min. Products are purified with 1.8x SPRI Reagent as described in Example 4.
[0559] Library Preparation and Sequencing
[0560] Sequencing libraries are prepared using the Native Barcoding Kit 24 V14 (Oxford Nanopore) and sequenced on a MinION flow cell, following the protocols as described in Example 4.
[0561] Data analysis
[0562] A custom pipeline is used for the following workflow: Demultiplexed raw FASTQ files for each sample are initially filtered for read quality (Q score >10) and size (70-600 bp). Following this, sequences containing the known probe structure are extracted. This structure consists of the 3' arm sequence targeting the payload (reverse complement of SEQ ID NO:20) and the 5' arm sequence (captured genomic DNA) captured by the degenerate bases.
[0563] The extracted captured sequences (representing the flanking genomic DNA) are saved in a FASTA file. These sequences are then aligned against the reference human genome (e.g. GRCh38 / hg38, NCBI RefSeq Assembly GCF_000001405.26) using Nucleotide BLAST, requiring a minimum similarity of 0.75 and an alignment length of at least 16 bases. The coordinates of the successfully mapped captured sequences are reported in a BED file. These coordinates represent the specific integration sites of the payload within the human genome.
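The alignment filtering and BED reporting described above can be sketched as follows. This is an illustrative sketch under stated assumptions: it presumes that BLAST was run with the standard 12-column tabular output (`-outfmt 6`), interprets the minimum similarity of 0.75 as a percent identity of at least 75, keeps the highest bit score per query, and uses hypothetical function names. The BED score column is set to 0 because bit scores do not fit the 0-1000 range of that field.

```python
def best_hits(blast_lines, min_identity=75.0, min_length=16):
    """Return {query: (bitscore, subject, sstart, send)} for the top-scoring passing hit per query."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        query, subject = f[0], f[1]
        pident, length = float(f[2]), int(f[3])
        sstart, send, bitscore = int(f[8]), int(f[9]), float(f[11])
        if pident < min_identity or length < min_length:
            continue
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject, sstart, send)
    return best


def to_bed(best):
    """Convert 1-based inclusive BLAST subject coordinates to 0-based, end-exclusive BED lines."""
    rows = []
    for query, (_score, chrom, sstart, send) in sorted(best.items()):
        start, end = min(sstart, send) - 1, max(sstart, send)
        strand = "+" if sstart <= send else "-"
        rows.append(f"{chrom}\t{start}\t{end}\t{query}\t0\t{strand}")
    return rows
```

Each returned line represents one candidate integration site; hits on the minus strand are reported with their coordinates reordered so that the start is always smaller than the end.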
[0564] Expected Results
[0565] As outlined in Figure 15, it is expected that the analysis will yield a clear and quantifiable number of mapped integration sites specifically in the Test Sample (condition ii). The Untreated Control (condition i) is expected to show zero (or minimal) mapped sites, demonstrating the high specificity of the method for the integrated payload and its flanking genomic junction.
[0566] It is further expected that the method will identify multiple, unique integration sites across the genome, consistent with the known integration patterns of the payload (e.g., semi-random integration for a lentiviral vector). The resulting BED file, when visualized in a genome browser, is expected to show distinct peaks (representing mapped integration sites) distributed across various chromosomes as illustrated in Figure 15. This result would validate the method's capability for genome-wide, unbiased mapping of integrated payloads.
[0567] It will be appreciated that the scientific principles behind the results discussed in Example 5 follow the results obtained in Example 4.
[0568] Both Examples rely on the same fundamental detection principle: hybridisation of Probes followed by RCA to capture junctions between a known sequence and an unknown adjacent sequence. In Example 4, the known sequence is the integrated Oligonucleotide-Tag at endonuclease-induced double-strand breaks, and the unknown sequence is the flanking genomic DNA. In Example 5, the known sequence is a region within the payload, and the unknown sequence is the flanking genomic DNA at the integration site. In both cases, the probe design incorporates a specific arm targeting the known sequence and a degenerate arm to accommodate variability in the unknown region, enabling ligation and circularisation even when the adjacent sequence is not predetermined. In embodiments where a Probe binds with a gap, the gap is first closed, either by polymerase-mediated extension using the target as template or by hybridisation of one or more gap-fill oligonucleotides, and only thereafter are the juxtaposed 3' and 5' ends ligated to generate circular single-stranded polynucleotide substrates.
[0569] The experimental data from Example 4 demonstrate that probes with degenerate arms can successfully hybridise and ligate at junctions where one side of the target is unknown, producing RCA products that were subsequently detected and mapped genome-wide. This validates the core concept that junction-specific padlock probes combined with RCA can capture integration or break sites without prior knowledge of the flanking sequence. Because payload integration creates analogous junctions, where the payload sequence is contiguous with an unknown genomic region, a similar probe architecture and RCA workflow can be applied. Furthermore, the enzymatic steps (ligation, exonuclease cleanup, RCA) and sequencing pipeline used in Example 4 are directly transferable to Example 5, supporting the expectation that payload-genome junctions will yield detectable RCA products under equivalent conditions.
[0570] In short, the plausibility of Example 5 is underpinned by the demonstrated performance of degenerate probe-based RCA in Example 4, which confirms that unknown genomic contexts do not preclude successful detection and characterisation when only one side of the junction is defined. This shared mechanistic basis ensures that the method for payload detection is not speculative but a logical extension of the validated approach for endonuclease-induced breaks.
[0571] REFERENCES
[0572] Atkins A, Chung C-H, Allen AG, Dampier W, Gurrola TE, Sariyer IK, Nonnemacher MR and Wigdahl B (2021) Off-Target Analysis in Gene Editing and Applications for Clinical Translation of CRISPR / Cas9 in HIV-1 Therapy. Front. Genome Ed. 3:673022.
[0573] Chehelgerdi, M., Chehelgerdi, M., Khorramian-Ghahfarokhi, M. et al. Comprehensive review of CRISPR-based gene editing: mechanisms, challenges, and applications in cancer therapy. Mol Cancer 23, 9 (2024).
[0574] Duan J, Lu G, Xie Z, Lou M, Luo J, Guo L, Zhang Y. Genome-wide identification of CRISPR / Cas9 off-targets in human genome. Cell Res. 2014 Aug;24(8): 1009-12.
[0575] Kim D, Kim S, Kim S, Park J, Kim JS. Genome-wide target specificities of CRISPR-Cas9 nucleases revealed by multiplex Digenome-seq. Genome Res. 2016 Mar;26(3):406-15.
[0576] Baner, J.; Nilsson, M.; Mendel-Hartvig, M.; Landegren, U. Signal Amplification of Padlock Probes by Rolling Circle Replication. Nucleic Acids Res. 1998, 26 (22), 5073-5078.
[0577] Nilsson, M.; Malmgren, H.; Samiotaki, M.; Kwiatkowski, M.; Chowdhary, B. R; Landegren, U. Padlock Probes: Circularizing Oligonucleotides for Localized DNA Detection. Science. 1994, 265 (5181), 2085-2088.
[0578] Huang, R.; He, L.; Li, S.; Liu, H.; Jin, L.; Chen, Z.; Zhao, Y; Li, Z.; Deng, Y; He, N. A Simple Fluorescence Aptasensor for Gastric Cancer Exosome Detection Based on Branched Rolling Circle Amplification. Nanoscale 2020, 12 (4), 2445-2451.
[0579] Neumann, F.; Hernandez-Neuta, I.; Grabbe, M.; Madaboosi, N.; Albert, J.; Nilsson, M. Padlock Probe Assay for Detection and Subtyping of Seasonal Influenza. Clin. Chem. 2018, clinchem.2018.292979.
[0580] Kim, D., Bae, S., Park, J. et al. Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells. Nat Methods 12, 237-243 (2015).
[0581] Cameron, P., Fuller, C., Donohoue, P. et al. Mapping the genomic landscape of CRISPR-Cas9 cleavage. Nat Methods 14, 600-606 (2017). Tsai, S., Nguyen, N., Malagon-Lopez, J. et al. CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets. Nat Methods 14, 607-614 (2017).
[0582] Tsai, S., Zheng, Z., Nguyen, N. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat Biotechnol 33, 187-197 (2015).
[0583] Ke, R., Mignardi, M., Pacureanu, A. et al. In situ sequencing for RNA analysis in preserved tissue and cells. Nat Methods 10, 857-860 (2013).
[0584] Drmanac et al., Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science. 2010 Jan 1;327(5961):78-81.
[0585] Hardenbol P, Baner J, Jain M, Nilsson M, Namsaraev EA, Karlin-Neumann GA, Fakhrai-Rad H, Ronaghi M, Willis TD, Landegren U, Davis RW. Multiplexed genotyping with sequence-tagged molecular inversion probes. Nat Biotechnol. 2003 Jun;21(6):673-8.
[0586] Weibrecht, I., Lundin, E., Kiflemariam, S. et al. In situ detection of individual mRNA molecules and protein complexes or post-translational modifications using padlock probes combined with the in situ proximity ligation assay. Nat Protoc 8, 355-372 (2013). https://doi.org/10.1038/nprot.2013.006
[0587] Yan, W., Mirzazadeh, R., Garnerone, S. et al. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks. Nat Commun 8, 15058 (2017). https://doi.org/10.1038/ncomms15058
[0588] Dobbs, F.M., van Eijk, P., Fellows, M.D. et al. Precision digital mapping of endogenous and induced genomic DNA breaks by INDUCE-seq. Nat Commun 13, 3989 (2022). https://doi.org/10.1038/s41467-022-31702-9
[0589] Wienert, B., Wyman, S.K., Yeh, C.D. et al. CRISPR off-target detection with DISCOVER-seq. Nat Protoc 15, 1775-1799 (2020).
[0590] Hu, J., Meyers, R., Dong, J. et al. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing. Nat Protoc 11, 853-871 (2016). Giannoukos, G., Ciulla, D.M., Marco, E. et al. UDiTaS™, a genome editing detection method for indels and genome rearrangements. BMC Genomics 19, 212 (2018). https://doi.org/10.1186/s12864-018-4561-9
[0591] Johansson H, Isaksson M, Sorqvist EF, Roos F, Stenberg J, Sjoblom T, Botling J, Micke P, Edlund K, Fredriksson S, Kultima HG, Ericsson O, Nilsson M. Targeted resequencing of candidate genes using selector probes. Nucleic Acids Res. 2011 Jan;39(2):e8. doi: 10.1093/nar/gkq1005. Epub 2010 Nov 8. PMID: 21059679; PMCID: PMC3025563.
[0592] Malinin, N.L., Lee, G., Lazzarotto, C.R. et al. Defining genome-wide CRISPR-Cas genome-editing nuclease activity with GUIDE-seq. Nat Protoc 16, 5592-5615 (2021).
[0593] Cameron, P., Fuller, C., Donohoue, P. et al. Mapping the genomic landscape of CRISPR-Cas9 cleavage. Nat Methods 14, 600-606 (2017).
[0594] Metzker, M. Sequencing technologies - the next generation. Nat Rev Genet 11, 31-46 (2010). https://doi.org/10.1038/nrg2626
Claims
1. A method for detecting and / or characterising one or more incorrectly-edited polynucleotide sequences from a genetic-editing procedure, the method comprising the steps of: (i) providing a sample from a genetic-editing procedure, the sample comprising one or more incorrectly-edited polynucleotide sequence; (ii) performing Rolling Circle Amplification, to generate one or more RCA-Products from the one or more incorrectly-edited polynucleotide sequence in the sample; and (iii) detecting and / or characterising the one or more incorrectly-edited polynucleotide sequences based on the one or more RCA-Products generated in step (ii).
2. The method of Claim 1, wherein step (iii) comprises quantifying the one or more RCA-Product and / or determining some or all of the nucleotide sequence of the one or more RCA-Product.
3. The method of Claim 1 or 2, wherein the method further comprises step (i-a), which step is performed prior to step (ii) and comprises: generating circular single-stranded polynucleotide substrates from the one or more incorrectly-edited polynucleotide sequence in the sample.
4. The method of Claim 3, wherein step (i-a) comprises: integrating an Oligonucleotide-Tag at the incorrectly-edited site in the one or more incorrectly-edited polynucleotide sequence; generating circular single-stranded polynucleotide substrates using a Probe that targets the Oligonucleotide-Tag at the incorrectly-edited site.
5. The method of Claim 4, wherein the Probe comprises nucleotide sequence capable of binding to the Oligonucleotide-Tag.
6. The method of Claim 4 or 5, wherein the Probe comprises degenerate nucleotide sequence capable of binding at, or adjacent to, nucleotide sequence at the incorrectly-edited site.
7. The method of any of Claims 4-6, wherein the Probe is selected from the group comprising: a padlock probe; a molecular inversion probe; a gap-fill probe; a split-like probe; a Lotus probe; or a combination thereof.
8. The method of any of Claims 4-7, wherein the Probe circularises on recognition of its target polynucleotide sequence.
9. The method of Claim 8, wherein circularisation of the Probe is mediated and / or improved by one or more Joining probe.
10. The method of Claim 8, wherein circularisation of the Probe is mediated and / or improved by one or more Selector probe.
11. The method of any of Claims 8-10, wherein circular single-stranded polynucleotide substrates are formed by ligation of the Probe; optionally wherein ligation is performed by a ligase, preferably a ligase with specific intramolecular ligation activity.
12. The method of any of Claims 3-11, wherein Rolling Circle Amplification of the circular single-stranded polynucleotide substrates is initiated by the target polynucleotide sequence or by an amplification primer that is complementary to the circular single-stranded polynucleotide substrate.
13. The method of any preceding claim, wherein step (iii) comprises determining the sequence of the one or more RCA-Products by DNA sequencing.
14. The method of any preceding claim, wherein the one or more RCA-Products generated in step (ii) are labelled with a detectable moiety; optionally, wherein the detectable moiety is selected from the group comprising: a fluorophore; a chromophore; or a combination thereof.
15. The method of any preceding claim, wherein step (iii) comprises detecting the one or more RCA-Products by microscopy; optionally wherein microscopy is selected from the group comprising: bright-field microscopy; fluorescence microscopy; or a combination thereof.
16. The method according to any preceding claim, wherein the method comprises the step of attaching the one or more RCA-Products to magnetic beads to provide bead-bound RCA-Products; optionally wherein the one or more RCA-Products are immobilized by providing a magnetic source so as to attract the bead-bound RCA-Products to a position on the surface.
17. The method of any preceding claim, wherein the sample further comprises one or more correctly-edited polynucleotide sequence and / or one or more unedited polynucleotide sequence.
18. The method of Claim 17, further comprising the step of generating one or more RCA-Products from the one or more correctly-edited polynucleotide sequence and / or the one or more unedited polynucleotide sequence.
19. The method according to any preceding claim, wherein the efficiency of the genetic-editing procedure is determined based on the relative amounts of the incorrectly-edited polynucleotide sequences, and / or the one or more correctly-edited polynucleotide sequences and / or the one or more unedited polynucleotide sequences in the sample.
20. The use of Rolling Circle Amplification for detecting and / or characterising incorrectly-edited polynucleotide sequences from a gene-editing procedure.
21. A kit of parts comprising: (i) an Oligonucleotide-Tag as defined in any of Claims 4-19; (ii) a Probe, as defined in any of Claims 4-19; and (iii) one or more reagent for performing Rolling Circle Amplification.
22. A kit according to Claim 21, further comprising one or more of the following: one or more reagent for performing genetic-editing of a polynucleotide sequence; one or more reagent for capturing an RCA-Product; one or more monomerization reagent; one or more Probe controls and / or spike-in templates; one or more library preparation reagent; and / or instructions for performing the method according to any one of Claims 1 to 19.
23. A population of DNA molecules obtained or obtainable by a method according to any of Claims 1 to 19, or by the use according to Claim 20.
24. A population of DNA molecules according to Claim 23, wherein the population comprises one or more RCA-Products comprising nucleotide sequence corresponding to an incorrectly-edited polynucleotide sequence from a genetic-editing procedure and nucleotide sequence corresponding to an Oligonucleotide-Tag.
25. A method for detecting and / or characterising a double-strand break in a polynucleotide sequence, the method comprising the steps of: (i) providing a sample comprising a polynucleotide sequence having a double-strand break; (ii) integrating an Oligonucleotide-Tag at the double-strand break; (iii) performing Rolling Circle Amplification, to generate one or more RCA-Products from the double-strand break in the sample; and (iv) detecting and / or characterising the double-strand break in the polynucleotide sequence based on the one or more RCA-Products generated in step (iii).
26. The method according to Claim 25, wherein the double-strand break is at an unknown site in the one or more polynucleotide sequence.
27. The use of Rolling Circle Amplification for detecting and / or characterising a double-strand break in a polynucleotide sequence.
28. The method according to Claim 25 or 26, or the use according to Claim 27, wherein the polynucleotide sequence having a double-strand break is generated in a genetic-editing procedure.
29. A method for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure, wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure, the method comprising the steps of: (i) providing a sample from a genetic-editing procedure, the sample comprising the one or more payload sequence; (ii) generating circular single-stranded polynucleotide substrates from the one or more payload sequence, and performing Rolling Circle Amplification to generate RCA-Products from the one or more payload sequence in the sample; and (iii) detecting and / or characterising the presence of the one or more payload sequence based on the RCA-Products generated in step (ii).
30. The method of Claim 29, wherein the one or more payload sequence comprises a polynucleotide sequence of more than 100 bases in length.
31. The method of Claim 29 or 30, wherein the one or more payload sequence comprises one or more functional gene expression element.
32. The method of any of Claims 29 to 31, wherein the integration of the one or more payload sequence at an unknown site of the polynucleotide sequence is achieved by homology-directed repair.
33. The method of any of Claims 29 to 32, wherein step (ii) comprises: generating circular single-stranded polynucleotide substrates using a Probe that targets the one or more payload sequence at the unknown site.
34. The method of Claim 33, wherein the Probe comprises a nucleotide sequence capable of binding to the one or more payload sequence.
35. The method of Claim 33 or 34, wherein the Probe comprises a degenerate nucleotide sequence capable of binding at, or adjacent to, nucleotide sequence at the one or more payload sequence.
36. The method of any of Claims 33 to 35, wherein the Probe is selected from the group comprising: a padlock probe; a selector probe; a molecular inversion probe; a gap-fill probe; a split-like probe; a Lotus probe; or a combination thereof.
37. The method of any of Claims 33 to 36, wherein the Probe circularises on recognition of its target polynucleotide sequence, and / or the target polynucleotide sequence circularises on recognition of the Probe.
38. The method of Claim 37, wherein circularisation of the Probe is mediated and / or improved by one or more Joining probe.
39. The method of any of Claims 33 to 38, wherein circular single-stranded polynucleotide substrates are formed by ligation of the Probe and / or the target polynucleotide sequence; optionally wherein ligation is performed by a ligase, preferably a ligase with specific intramolecular ligation activity.
40. The method of any preceding claim, wherein Rolling Circle Amplification of the circular single-stranded polynucleotide substrates is initiated by the target polynucleotide sequence, Probe, and / or by at least one amplification primer that is complementary to the circular single-stranded polynucleotide substrate.
41. The method of any preceding claim, wherein step (iii) comprises quantifying the one or more RCA-Product and / or determining some or all of the nucleotide sequence of the one or more RCA-Product.
42. The method of any preceding claim, wherein step (iii) comprises determining the sequence of the one or more RCA-Products by DNA sequencing.
43. The method of any preceding claim, wherein the one or more RCA-Products generated in step (ii) are labelled with a detectable moiety; optionally, wherein the detectable moiety is selected from the group comprising: a fluorophore; a chromophore; or a combination thereof.
44. The method of any preceding claim, wherein step (iii) comprises detecting the one or more RCA-Products by microscopy; optionally wherein microscopy is selected from the group comprising: bright-field microscopy; fluorescence microscopy; or a combination thereof.
45. The method according to any preceding claim, wherein the method comprises the step of attaching the one or more RCA-Products to magnetic beads to provide bead-bound RCA-Products; optionally wherein the one or more RCA-Products are immobilized by providing a magnetic source so as to attract the bead-bound RCA-Products to a position on the surface.
46. The use of Rolling Circle Amplification for detecting and / or characterising the presence of one or more payload sequence following a genetic-editing procedure; wherein the one or more payload sequence has been integrated into a polynucleotide sequence at an unknown site by the genetic-editing procedure.
47. A kit of parts comprising: (i) one or more payload sequence as defined in any of Claims 29-45; (ii) a Probe, as defined in any of Claims 33-45; and (iii) one or more reagent for performing Rolling Circle Amplification.
48. A kit according to Claim 47, further comprising one or more of the following: one or more reagent for performing genetic-editing of a polynucleotide sequence; one or more reagent for capturing an RCA-Product; one or more monomerization reagent; one or more Probe controls and / or spike-in templates; one or more library preparation reagent; and / or instructions for performing the method according to any one of Claims 29 to 45.
49. A population of DNA molecules obtained or obtainable by a method according to any of Claims 29 to 45, or by the use according to Claim 46.
50. A population of DNA molecules according to Claim 49, wherein the population comprises one or more RCA-Products comprising nucleotide sequence corresponding to the one or more payload sequence from a genetic-editing procedure.
51. A method, use, kit of parts or population of DNA molecules substantially as described herein, with reference to the accompanying description, examples and / or figures.