Diversified base editing

JP2025520697A5 · Pending · Publication Date: 2026-06-26 · BASF AGRICULTURAL SOLUTIONS US LLC

Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: BASF AGRICULTURAL SOLUTIONS US LLC
Filing Date: 2023-06-23
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Current base editing systems, particularly those based on Cas9, produce only a narrow range of mutations and are therefore poorly suited to directed evolution in plants: they cannot induce a broad spectrum of mutations efficiently and precisely. Cas12a-based base editing systems, meanwhile, have remained largely ineffective.

Method used

Development of a Cas12a-based diversified base editor system that includes a CRISPR-Cas moiety derived from Cas12a, combined with cytosine and adenine deaminases, and optimized for targeted diversification in plant cells, achieving high base editing efficiency and a broad mutation spectrum.

Benefits of technology

The system enables efficient and targeted diversification of nucleic acid sequences in plant cells, allowing for directed evolution with high activity and precision, overcoming limitations of existing Cas9-based systems.

✦ Generated by Eureka AI based on patent content.


Abstract

The present invention relates to the field of increasing genetic diversity in a targeted manner. In particular, this relates to the provision of methods and means for diversifying a target sequence using a base editor having an expanded mutation spectrum, including a Cas12a diversified base editing system and its use.

Description

Technical Field

[0001] The present invention relates to the field of increasing genetic diversity in a targeted manner. In particular, this relates to the provision of methods and means for diversifying targeted sequences using a base editor having an expanded mutation spectrum, including a Cas12a diversified base editing system and its use.

Background Art

[0002] In agriculture and other fields, traits are continuously being improved. Classical approaches to achieve this usually rely on random mutagenesis, for example UV- or EMS-induced mutagenesis. These mutagenesis approaches enable the discovery of new mutants, but they are extremely time-consuming and labor-intensive.

[0003] Furthermore, these strategies are not targeted and thus induce random mutations across the genome, and therefore do not enable the directed evolution or manipulation of the locus of interest without the risk of simultaneously causing unwanted mutations in the genome of interest.

[0004] In contrast, targeted genetic modification can be achieved by CRISPR-Cas approaches. These approaches enable the precise editing of the target gene position, but most of them are limited to insertions and deletions, and the standard CRISPR-Cas approach does not enable directed evolution. To enable directed evolution, strategies relying on the in vitro generation of random or semi-random mutagenesis libraries have been developed. However, since these approaches are carried out outside the organism of interest, they do not enable easy phenotypic analysis of the generated mutations.

[0005] With the creation of base editors, the CRISPR-Cas system has been successfully modified to induce targeted point mutations instead of cleaving target DNA. Currently, there are mainly two types: cytosine / cytidine base editors (CBEs) and adenine / adenosine base editors (ABEs). CBEs are typically created by fusing a cytidine deaminase domain to catalytically weakened Cas9 (either inactivated Cas9 (D10A / H840A) or nickase Cas9 (D10A)). Various cytidine deaminases, including APOBEC1 (A1), A3A, A3B, PmCDA1, AID, and their derivatives, have been used for base editing (Rees and Liu, 2018). CBEs catalyze the deamination of cytosine to uracil on the non-target DNA strand, ultimately causing a C-G to T-A mutation (see Komor et al., 2016; Komor et al., 2017 for CBEs). Regarding Cas9 variants suitable for base editors, nCas9 is thought to be more active than dCas9 because nicking of the target strand allows the non-target strand to be used as a template for mismatch-mediated repair (e.g., Eid et al., 2018). Nevertheless, early base editors only allow single types of conversions, either C to T or A to G, and are thus not suitable when high diversification potential is of interest.

[0006] Recent developments have shown that these two types of base editors can be combined into so-called dual base editors, but the resulting C-to-T and A-to-G conversions still provide only limited diversification. Therefore, there is a great need in the art for a system that allows diversification closer to random mutagenesis while being targeted, i.e., specifically inducing modifications at the locus of interest and enabling in situ, i.e., in the target cell or organism, targeted diversification.

[0007] Another limitation is that currently, base editing systems are restricted to Cas9 base editors. Cas12a (also called Cpf1) has gained particular interest in recent years as an alternative to Cas9 among other CRISPR-Cas systems, but Cas12a base editor systems have remained mostly ineffective, especially in plants. Furthermore, a functional Cas12a dual base editing system has not been described to date.

[0008] Accordingly, an object of the present invention is to provide a novel and particularly optimized base editor system comprising a Cas12a diversified base editor for enabling in situ targeted diversification with an improved editing scope while having high overall activity and base editing efficiency, which can be used for directed evolution approaches.

Summary of the Invention

Means for Solving the Problem

[0009] In a first aspect, there is provided a method for targeted diversified base editing of at least one target nucleic acid segment, the method comprising: (a) providing at least one cell or construct comprising at least one target nucleic acid segment; (b) introducing into the target cell or contacting with the target construct (i) at least one diversified base editor (DBE), or at least one nucleic acid molecule encoding the same; and (ii) at least one suitable guide RNA or at least one nucleic acid molecule encoding the same; (c) enabling the formation of a complex of (i) at least one diversified base editor and (ii) at least one suitable guide RNA; and (d) obtaining at least one cell or construct comprising at least one modified target nucleic acid segment, wherein the total base editing efficiency of introducing at least one substitution of any kind into at least the on-target nucleic acid segment is at least 0.2%, 0.5%, 1%, 5%, 10%, 15%, 20%, or at least 25%, with an upper limit of 100% or less; and / or the ratio of C to G substitutions is at least 0.1%, 0.5%, 1%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or at least 90% of the ratio of C to T substitutions; and / or the ratio of C to A substitutions is at least 0.1%, 0.5%, 1%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or at least 90% of the ratio of C to T substitutions; and / or at least one modification of the target nucleic acid segment occurs in an extended base editing window; and wherein the method does not include a method for treating and / or diagnosing the human or animal body by surgery or therapy performed on the human or animal body, and / or a process for modifying the germline genetic identity of human beings.
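The relative thresholds in this aspect (C to G and C to A substitutions expressed as a percentage of C to T substitutions) can be illustrated with a short Python sketch. It is an illustration under stated assumptions only: the reads are assumed to be already aligned to the reference segment, base for base and without indels, and the function name `c_substitution_ratios` and the dictionary keys are hypothetical, not taken from the patent. Because the three substitution rates share the same denominator, the ratio of rates reduces to a ratio of counts.

```python
from collections import Counter


def c_substitution_ratios(reference: str, reads: list[str]) -> dict[str, float]:
    """Express C>G and C>A substitution counts as a percentage of the C>T count.

    Assumption (not from the patent): every read is pre-aligned to the
    reference segment and has the same length, i.e. no indels are present.
    """
    counts: Counter[str] = Counter()
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be pre-aligned to the reference segment")
        for ref_base, read_base in zip(reference, read):
            # Only substitutions that start from a reference cytosine are counted.
            if ref_base == "C" and read_base in ("T", "G", "A"):
                counts[read_base] += 1
    if counts["T"] == 0:
        raise ValueError("no C-to-T substitutions observed; the ratio is undefined")
    return {
        "C>G as % of C>T": 100.0 * counts["G"] / counts["T"],
        "C>A as % of C>T": 100.0 * counts["A"] / counts["T"],
    }


if __name__ == "__main__":
    # Toy data: two reads with C>T, one read with C>G, one unedited read.
    example = c_substitution_ratios("ACCA", ["ATCA", "ATCA", "AGCA", "ACCA"])
    print(example)
```

In this toy data set the C to G count is half the C to T count, so the first value is 50%, which would satisfy every C to G threshold listed above up to and including 50%.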

[0010] In one embodiment of the first aspect, the diversified base editor includes a CRISPR-Cas moiety derived from a class 2 type II CRISPR-Cas endonuclease, such as a Cas9 endonuclease, or from a class 2 type V CRISPR-Cas endonuclease. Preferably, the diversified base editor includes a CRISPR-Cas moiety derived from a Cas12a endonuclease.

[0011] In another embodiment of the first aspect, at least one target cell is a prokaryotic cell including a bacterial cell or an archaeal cell, or a eukaryotic cell including an insect cell, a mammalian cell, or a plant cell.

[0012] In another embodiment of the first aspect, at least one target cell is a plant cell including a plant protoplast.

[0013] In another embodiment of the first aspect, at least one diversified base editor includes (i) one or more cytosine deaminase moieties, (ii) one or more adenine deaminase moieties, (iii) one or more CRISPR-Cas moieties (preferably, the CRISPR-Cas domain does not cleave both strands of double-stranded DNA), (iv) one, two, three, or more nuclear localization sequences; and (v) at least one linker region, preferably one or more linker regions between (i) and (ii), and optionally one or more linker regions between (ii) and (iii).

[0014] In another embodiment of the first aspect, at least one diversified base editor of step (b-i) is in the form of a fusion protein. Preferably, the moieties (i), (ii) and (iii) as defined above are arranged in the N-terminal to C-terminal direction in the order (i)-(ii)-(iii), with one or more linker regions between the moieties. More preferably, one, two, three or more nuclear localization sequences (iv) are located at the C-terminus of the diversified base editor, or one or more nuclear localization sequences (iv) are located at the N-terminus and one or more nuclear localization sequences (iv) are located at the C-terminus of the diversified base editor.

[0015] In another embodiment of the first aspect, the diversified base editor comprises at least one additional moiety, preferably, the at least one additional moiety is selected from an ssDNA, ssRNA, or dsRNA binding protein moiety including an MS2 protein moiety, an affinity tag binding protein, a uracil glycosylase inhibitor moiety and / or a uracil glycosylase moiety, or any combination thereof.

[0016] In one embodiment, the at least one additional moiety comprises at least one uracil DNA N-glycosylase (UNG), optionally, a uracil DNA N-glycosylase (eUNG) derived from Escherichia coli, and optionally, at least one UNG is delivered in trans with at least one diversified base editor of the present invention. For delivery of at least one UNG, optionally, at least one eUNG, it may be desirable to express at least one UNG or eUNG from a strong promoter such as the 35S promoter (SEQ ID NO: 59). In a preferred embodiment, at least one UNG, optionally, at least one eUNG is delivered in trans with at least one diversified base editor and at least one base editor is present in the form of a fusion protein as disclosed herein.

[0017] In another embodiment of the first aspect, one or more adenine deaminase moieties and / or one or more cytosine deaminase moieties are linked to at least one ssRNA or dsRNA binding protein moiety, preferably at least one MS2 protein moiety, and at least one suitable guide RNA is adapted to enable interaction with at least one ssRNA or dsRNA binding protein moiety, preferably one or more adenine base editor moieties and / or one or more cytosine base editor moieties are linked to at least one MS2 protein moiety, and the suitable guide RNA is adapted to include two MS2 stem-loops, optionally, the suitable guide RNA is a sequence selected from SEQ ID NO: 38 to SEQ ID NO: 41, or a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto.

[0018] In another embodiment of the first aspect, the diversified base editor comprises an amino acid sequence selected from any one of SEQ ID NO: 1 to SEQ ID NO: 27 and SEQ ID NO: 52, or a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to the respective reference sequence.

[0019] In a second aspect, there is provided an edited cell, tissue, organ, material or whole organism obtained or obtainable by the method according to the first aspect.

[0020] In a third aspect, there is provided a diversified base editor, or a diversified base editor complex further comprising at least one suitable guide RNA, or at least one nucleic acid molecule encoding the same, wherein the diversified base editor is as defined in the first aspect.

[0021] In a fourth aspect, a vector or expression construct, or two or more vectors and expression constructs are provided, each vector and / or expression construct comprising at least one nucleic acid molecule of the third aspect, different parts of the diversified base editor being encoded on the same vector or expression construct or on different vectors or expression constructs, and / or the diversified base editor, or parts thereof, and at least one suitable guide RNA being encoded on the same vector or expression construct or on different vectors or expression constructs.

[0022] In a fifth aspect, there is provided a cell comprising at least one diversified base editor or at least one diversified base editor complex of the third aspect, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct of the fourth aspect, wherein the cell is a prokaryotic cell including a bacterial cell or an archaeal cell, or a eukaryotic cell including an insect cell, a mammalian cell including a human cell, or a plant cell including a plant protoplast. Preferably, the cell is a plant cell including a plant protoplast. Optionally, the plant cell including a plant protoplast is a cell of a plant belonging to the superfamily Viridiplantae, in particular of a monocotyledonous or dicotyledonous plant, including forage or foliage leguminous plants, ornamental plants, edible crops, trees or shrubs, selected from the list including Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp., Artocarpus spp., Asparagus officinalis, Avena spp. (e.g., Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa sp., Benincasa hispida, Bertholletia excelsa, Beta vulgaris, Brassica spp. (e.g., Brassica napus, Brassica rapa ssp. [canola, rapeseed, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g., Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g., Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g., Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g., Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g., Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g., Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g., Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g., Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, or Ziziphus spp., or a cell derived therefrom.

[0023] In a sixth aspect, there is provided a kit comprising at least one diversified base editor or at least one diversified base editor complex of the third aspect, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct of the fourth aspect; or at least one cell of the fifth aspect.

[0024] In a seventh aspect, there is provided the use of at least one diversified base editor or at least one diversified base editor complex of the third aspect, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct of the fourth aspect; or at least one cell of the fifth aspect; or at least one kit of the sixth aspect, for the targeted directed evolution of at least one target nucleic acid segment, preferably for the targeted directed evolution of at least one target nucleic acid segment in a plant, and / or for optimizing or modifying a trait in a plant, including optimizing or modifying a yield-related trait, a disease or pathogen resistance-related trait, wherein the disease is caused by, or the pathogen is selected from, a virus, bacterium, fungus, nematode, or insect, a herbicide resistance-related trait, or an abiotic stress-related trait including a salinity or drought stress-related trait, and further including the use for the identification of at least one lead gene.

Definitions

[0025] The terms "adenine deaminase" and "adenosine deaminase" are used interchangeably herein. Similarly, the terms "cytidine deaminase" and "cytosine deaminase" are used interchangeably herein.

[0026] The term "base editor complex", as used herein, refers to a complex of at least one base editor and at least one guide RNA suitable for at least one CRISPR-Cas moiety of the at least one base editor. While the present invention includes base editors comprising two or more polypeptides that form a diversified base editor via non-covalent binding, these are also referred to as diversified base editors or DBEs, and are referred to as base editor complexes only when they also include at least one suitable guide RNA. However, a reference to a diversified base editor or DBE without an explicit reference to a complex does not exclude the possibility that the base editor may be present in a complex with at least one suitable guide RNA.

[0027] As used herein, the term "base editing window" generally refers to the region in a genomic sequence that contains the target nucleic acid segment to be modified, and the base editing window is the window in which a diversified base editor guided by a suitable guide RNA can theoretically induce at least one targeted nucleotide exchange as base editing. This window is defined by the suitable guide RNA and the region to be modified, particularly, the structure of the diversified base editor induced by the genomic region and the physical accessibility of the diversified base editor.

[0028] As used herein, "diversified base editor" or "DBE" refers to a base editor that includes at least one cytosine deaminase moiety, at least one adenosine deaminase moiety, at least one CRISPR-Cas moiety, and at least one nuclear localization sequence. The CRISPR-Cas moiety may be modified to cleave only one strand of the target DNA, or may be modified not to cleave either strand of the target DNA. The DBE may further include one or more additional moieties such as an ssDNA, ssRNA, or dsRNA binding protein moiety, a uracil glycosylase inhibitor moiety and / or a uracil glycosylase moiety. The moieties are linked to each other covalently and / or non-covalently. Non-covalent linkage may also be achieved by covalent and / or non-covalent binding of one or more moieties other than the CRISPR-Cas moiety to a suitable guide RNA that non-covalently interacts with the CRISPR-Cas moiety or with a group of moieties containing the CRISPR-Cas moiety. Covalent linkage of the moieties may be achieved by at least one linker region.

[0029] The term "guide RNA" may refer to any RNA comprising a Cas protein binding region and a targeting region, and as long as the target nucleotide sequence is positioned adjacent to a protospacer adjacent motif (PAM) suitable for each Cas protein, the Cas protein can be directed to a target nucleotide sequence that is sufficiently complementary to the targeting region of the guide RNA. "Suitable guide RNA", as used herein, refers to a guide RNA suitable for use as part of a DBE, i.e., a suitable guide RNA can bind to the CRISPR-Cas moiety utilized through the Cas-protein binding region, and the targeting region has complementarity to the nucleotide sequence immediately upstream of the PAM sequence recognized by the CRISPR-Cas moiety utilized. As is well known in the art, the Cas12a system typically relies on a single crRNA as the guide RNA, and the Cas9 system typically uses a crRNA::tracrRNA duplex, which can be mimicked by a synthetic single guide RNA molecule. Those skilled in the art are well aware of designing, expressing / synthesizing, and adapting guide RNAs for the required purposes.

[0030] "Identity", when used in reference to the comparison of two or more nucleic acid or amino acid molecules, means that the sequences of said molecules share a certain degree of sequence similarity and that the sequences are partially identical.

[0031] Enzyme variants can be defined by their sequence identity when compared to the parent enzyme. Sequence identity is usually expressed as “% sequence identity” or “% identity”. To determine the percent identity between two amino acid sequences, in a first step a pairwise sequence alignment is generated between the two sequences, in which they are aligned over their full lengths (i.e., a pairwise global alignment). The alignment is generated with a program implementing the Needleman and Wunsch algorithm (J. Mol. Biol. (1970) 48, p. 443 - 453), preferably the program “NEEDLE” (European Molecular Biology Open Software Suite (EMBOSS)) with the program default parameters (gap open = 10.0, gap extension = 0.5, and matrix = EBLOSUM62). The preferred alignment for the purposes of the present invention is the alignment from which the highest sequence identity can be determined.

[0032] The following example is intended to illustrate two nucleotide sequences, but the same calculation applies to protein sequences as well.

Seq A: AAGATACTG (length: 9 bases)
Seq B: GATCTGA (length: 7 bases)

[0033] Therefore, the shorter sequence is Sequence B.

[0034] The generation of a pairwise global alignment showing both sequences over their full lengths results in the following.

Seq A: AAGATACTG-
         ||| |||
Seq B: --GAT-CTGA

[0035] The symbol “|” in the alignment indicates identical residues (meaning bases in the case of DNA or amino acids in the case of proteins). The number of identical residues is 6.

[0036] The symbol "-" within the alignment indicates a gap. The number of gaps introduced by the alignment within Seq B is 1. The number of gaps introduced by the alignment at the edge of Seq B is 2, and at the edge of Seq A is 1.

[0037] The alignment length showing the aligned sequences over their full lengths is 10.

[0038] In accordance with the present invention, generating a pairwise alignment that shows the shorter sequence over its full length gives the following result.

Seq A: GATACTG-
       ||| |||
Seq B: GAT-CTGA

[0039] In accordance with the present invention, generating a pairwise alignment that shows Seq A over its full length gives the following result.

Seq A: AAGATACTG
         ||| |||
Seq B: --GAT-CTG

[0040] In accordance with the present invention, generating a pairwise alignment that shows Seq B over its full length gives the following result.

Seq A: GATACTG-
       ||| |||
Seq B: GAT-CTGA

[0041] The alignment length showing the shorter sequence over its full length is 8 (one gap is included in the alignment length of the shorter sequence).

[0042] Accordingly, the alignment length showing Seq A over its full length is 9 (where Seq A is the sequence of the invention).

[0043] Accordingly, the alignment length showing Seq B over its full length is 8 (where Seq B is the sequence of the invention).

[0044] After aligning the two sequences, in a second step, the identity value is determined from the generated alignment. For the purposes of this description, percent identity is calculated by %-identity = (identical residues / length of the alignment region showing each sequence of the invention over its full length) × 100. Thus, the sequence identity related to the comparison of two amino acid sequences according to this embodiment is calculated by dividing the number of identical residues by the length of the alignment region showing each sequence of the invention over its full length. Multiplying this value by 100 gives the "percent identity". According to the example shown above, the percent identity is (6 / 9)*100 = 66.7% when sequence A is the sequence of the invention, and (6 / 8)*100 = 75% when sequence B is the sequence of the invention.
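The identity calculation described above can be reproduced with a few lines of Python. This is a minimal sketch rather than the EMBOSS implementation: it takes an already computed global alignment as two gapped strings and does not perform the alignment itself. The function name `percent_identity` and its `invention` parameter are hypothetical, and the gapped strings used in the example are reconstructed from the residue and gap counts stated in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str, invention: str = "a") -> float:
    """Percent identity over the alignment region that shows the sequence of
    the invention over its full length.

    aligned_a / aligned_b: rows of a pairwise global alignment, '-' marks a gap.
    invention: "a" or "b", selecting which row is the sequence of the invention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("both alignment rows must have the same length")
    if invention not in ("a", "b"):
        raise ValueError("invention must be 'a' or 'b'")
    reference = aligned_a if invention == "a" else aligned_b

    # The region runs from the first to the last residue of the reference row,
    # so internal gaps count towards its length but edge gaps do not.
    residue_columns = [i for i, symbol in enumerate(reference) if symbol != "-"]
    if not residue_columns:
        raise ValueError("the sequence of the invention contains no residues")
    start, end = residue_columns[0], residue_columns[-1]

    identical = sum(
        1
        for i in range(start, end + 1)
        if aligned_a[i] == aligned_b[i] and aligned_a[i] != "-"
    )
    return 100.0 * identical / (end - start + 1)


if __name__ == "__main__":
    row_a = "AAGATACTG-"  # Seq A, 9 bases, one gap at the edge
    row_b = "--GAT-CTGA"  # Seq B, 7 bases, two edge gaps and one internal gap
    print(round(percent_identity(row_a, row_b, "a"), 1))  # Seq A as the sequence of the invention
    print(round(percent_identity(row_a, row_b, "b"), 1))  # Seq B as the sequence of the invention
```

With Seq A as the sequence of the invention the region spans 9 columns and contains 6 identical residues; with Seq B it spans 8 columns, matching the 66.7% and 75% values given in the text.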

[0045] "Indel" is a term for random insertions or deletions of bases in the genome of an organism associated with the repair of DSBs by NHEJ. It is classified among small genetic changes and is measured from 1 to 10,000 base pairs in length. As used herein, it refers to random insertions or deletions of bases in or near the target site (e.g., less than 1000 bp, less than 900 bp, less than 800 bp, less than 700 bp, less than 600 bp, less than 500 bp, less than 400 bp, less than 300 bp, less than 250 bp, less than 200 bp, less than 150 bp, less than 100 bp, less than 50 bp, less than 40 bp, less than 30 bp, less than 25 bp, less than 20 bp, less than 15 bp, less than 10 bp or less than 5 bp upstream and / or downstream).

[0046] As used herein, when the term "material" refers to a material obtainable or obtained through the methods of the present disclosure, it refers to any material that can contain at least one target nucleic acid segment. "Material" can refer to a material directly obtained or obtainable from an organism or group of organisms, or can refer to cellular material obtained or obtainable by lysis, solubilization, and / or other preparation means. Further, the material may be self-propagating, such as the reproductive system and / or seeds, or may be non-propagating. Further, the material may refer to a purified or synthetic material, such as a plasmid, linear poly- or oligonucleotide.

[0047] As used herein, the term "moiety" refers to a functional unit of a diversified base editor. A moiety may be a single domain having one or more functionalities, such as enzymatic activity or binding activity, and a moiety may also consist of two or more domains having one or more functionalities synergistically. A moiety may include, or consist of, the complete protein sequence of a given protein, such as the complete Cas12a protein sequence or the complete adenosine or cytidine deaminase protein sequence, or may include, or consist of, a portion of the sequence of the polypeptide from which the moiety is derived, if it is known that a portion of the sequence is sufficient to have the desired one or more functionalities. A moiety generally may include, or consist of, a variant amino acid sequence compared to the wild-type protein sequence from which it is derived.

[0048] As used herein, the term "plant" encompasses the whole plant, ancestors and progeny of the plant, and parts of the plant, including seeds, shoots, stems, leaves, roots (including tubers), flowers, tissues and organs. The term "plant" also encompasses plant cells, suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen, and microspores.

[0049] As used herein, the term "targeted diversified base editing" refers to the state, quality, process, and / or result in which a diversified base editor is guided to a target nucleic acid sequence by a suitable guide RNA via hybridization of the guide RNA and the target nucleic acid sequence, resulting in a base substitution in the editing window at the target site. Targeted base editing does not exclude the presence of off-target base substitutions, i.e., base substitutions that do not occur at the target site. Those skilled in the art are well aware of the various factors that affect off-target base substitutions.

[0050] As used herein, "target nucleic acid segment" refers to a continuous DNA (single-stranded or double-stranded) or RNA such as genomic DNA for in vivo applications and / or applications targeting cells in cell culture, or isolated DNA for in vitro applications outside of living cells where DBE-induced base substitutions occur, e.g., plasmid DNA. The target nucleic acid segment is either within the target sequence (in applications where the editing window is smaller than the target sequence) or the target sequence is within the target nucleic acid segment (in applications where the editing window extends beyond the target sequence). Generally, "target nucleic acid segment" refers to a continuous DNA within the target site that can extend up to 10 bp, 20 bp, 30 bp, or 40 bp adjacent to the target site in both directions depending on the editing window.

[0051] As used herein, "target site" refers to both strands of double-stranded DNA, i.e., the target strand to which the guide RNA anneals and the complementary non-target strand. The target site is a continuous DNA to which the guide RNA has suitable complementarity to at least one DNA strand.

[0052] "Total base editing efficiency", as used herein, refers to the rate of introducing at least one nucleic acid base substitution within a target nucleic acid segment, i.e., at least one nucleic acid base in the target nucleic acid segment is substituted with a different nucleic acid base, regardless of the type of substitution for another naturally occurring nucleic acid base. For example, a total base editing efficiency of 10% means that, as determined before and after, or in the presence or absence of, a DBE as disclosed herein, 10 out of 100 nucleic acid molecules retain at least one nucleic acid base substitution in the target nucleic acid segment. Usually, the total base editing efficiency is determined by sequencing, i.e., the percentage of reads showing nucleic acid base substitution in the target nucleic acid segment relative to the total number of reads encompassing the target nucleic acid segment can be assumed to represent the total base editing efficiency within a reasonable error range for a given sequencing application.
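As a rough illustration of the read-based definition above, the following Python sketch counts the fraction of sequencing reads that carry at least one base substitution in the target nucleic acid segment. It rests on assumptions that are not part of the patent: reads are pre-aligned to the reference segment and trimmed to the same length, gap characters and ambiguous bases are ignored rather than counted as substitutions, and the function name `total_base_editing_efficiency` is hypothetical.

```python
def total_base_editing_efficiency(reference: str, reads: list[str]) -> float:
    """Percentage of reads with at least one base substitution in the segment.

    Assumption (not from the patent): each read is already aligned to the
    reference segment and has the same length; '-' and 'N' are not treated
    as substitutions.
    """
    if not reads:
        raise ValueError("at least one read is required")
    bases = set("ACGT")
    edited_reads = 0
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be pre-aligned to the reference segment")
        # A read counts once, no matter how many substitutions it carries.
        if any(
            ref != obs and ref in bases and obs in bases
            for ref, obs in zip(reference, read)
        ):
            edited_reads += 1
    return 100.0 * edited_reads / len(reads)


if __name__ == "__main__":
    # Toy data: 9 unedited reads and 1 read with a single C>T substitution.
    reads = ["ACGT"] * 9 + ["ATGT"]
    print(total_base_editing_efficiency("ACGT", reads))
```

In this toy data set one read out of ten carries a substitution, which corresponds to the 10% case used as the example in the definition.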

Brief Description of the Drawings

[0053] Figures 1A and 1B; Figure 2; Figure 3; Figures 4A and 4B; Figures 5A and 5B; Figure 6; Figure 7; Figure 8; Figures 9A and 9B.

Mode for Carrying Out the Invention

[0054] To achieve the object of the present invention, a new class of Cas9-based diversified base editors that enable a broad mutation spectrum has been developed. Furthermore, Cas12a-based diversified base editors have been newly designed and optimized.

[0055] In a first aspect, there may be provided a method for targeted diversified base editing of at least one target nucleic acid segment, comprising: (a) providing at least one cell or construct comprising at least one target nucleic acid segment; (b) introducing into the target cell, or contacting with the target construct, (i) at least one diversified base editor (DBE), or at least one nucleic acid molecule encoding the same, and (ii) at least one suitable guide RNA, or at least one nucleic acid molecule encoding the same; (c) enabling complex formation of (i) the at least one diversified base editor and (ii) the at least one suitable guide RNA; and (d) obtaining at least one cell or construct comprising at least the modified target nucleic acid segment; wherein one of the following criteria (A) to (D), or one of the combinations of these criteria listed below, is met:

(A) the total base editing efficiency of introducing at least one substitution of any kind into at least the on-target nucleic acid segment is at least 0.2%, 0.5%, 1%, 5%, 10%, 15%, 20%, or at least 25%, and the upper limit is 100% or less;

(B) the proportion of C to G substitutions is at least 0.1%, 0.5%, 1%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or at least 90% of the proportion of C to T substitutions, and optionally, the upper limit is 100% or less, 110% or less, 120% or less, 130% or less, 140% or less, 150% or less, 160% or less, 170% or less, 180% or less, 190% or less, or 200% or less;

(C) the proportion of C to A substitutions is at least 0.1%, 0.5%, 1%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or at least 90% of the proportion of C to T substitutions, and optionally, the upper limit is 100% or less, 110% or less, 120% or less, 130% or less, 140% or less, 150% or less, 160% or less, 170% or less, 180% or less, 190% or less, or 200% or less;

(D) at least one modification of the target nucleic acid segment occurs in an extended base editing window.

The criteria are met individually, i.e., (A), (B), (C), or (D) alone, or in one of the following combinations: (A) and (B); (A), (B), and (C); (A), (B), and (D); (A) and (C); (A), (C), and (D); (A) and (D); (B) and (C); (B) and (D); (B), (C), and (D); or (C) and (D).

Preferably, the diversified base editor comprises a CRISPR-Cas moiety derived from a class 2 type V CRISPR-Cas endonuclease, and the class 2 type V CRISPR-Cas endonuclease may be a Cas12a endonuclease, or a portion thereof. The method does not include a surgical or therapeutic treatment of the human or animal body, and / or a diagnostic method on the human or animal body, and / or a process for modifying the genetic identity of the human germline.
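
The substitution-proportion criteria of paragraph [0055] relate the frequency of C to G and C to A substitutions to that of C to T substitutions. The following minimal sketch shows one way such relative proportions could be tallied from sequencing reads. It assumes pre-aligned reads of the same length as the reference, no insertions or deletions, and substitutions read on the strand shown; the function names and the toy data are hypothetical.

```python
from collections import Counter


def c_substitution_counts(reference, reads):
    """Count C-to-T, C-to-G and C-to-A substitutions over all reads
    (reads assumed pre-aligned to the reference segment, no indels)."""
    counts = Counter()
    for read in reads:
        for ref_base, read_base in zip(reference, read):
            if ref_base == "C" and read_base in ("T", "G", "A"):
                counts[read_base] += 1
    return counts


def relative_to_c_to_t(counts, base):
    """C-to-`base` substitutions as a percentage of C-to-T substitutions."""
    if counts["T"] == 0:
        raise ValueError("no C-to-T substitutions observed")
    return 100.0 * counts[base] / counts["T"]


# Toy data: 10 C-to-T, 4 C-to-G and 1 C-to-A substitution among 100 reads.
reference = "ACCGT"
reads = ["ACCGT"] * 85 + ["ATCGT"] * 10 + ["AGCGT"] * 4 + ["AACGT"]
counts = c_substitution_counts(reference, reads)
print(relative_to_c_to_t(counts, "G"))  # 40.0
print(relative_to_c_to_t(counts, "A"))  # 10.0
```

In this toy example, the proportion of C to G substitutions is 40% of the proportion of C to T substitutions, which would satisfy, for instance, the "at least 30%" threshold for C to G substitutions.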

[0056] In some embodiments, the method is performed outside of a living cell, and at least one target nucleic acid segment is included in at least one construct, such as a linear DNA molecule, e.g., a PCR product or a restriction digest product, or a DNA vector, such as a plasmid vector. In such embodiments, at least one DBE is typically used in a purified form. Those skilled in the art are well aware of various standard procedures for protein expression and purification. At least one guide RNA can be purified, for example, from in vitro transcription or de novo synthesis.

[0057] In other embodiments, the method is performed in living cells, i.e., at least one DBE and at least one suitable guide RNA, or at least one nucleic acid molecule encoding the same, are introduced into at least one cell. The at least one DBE and the at least one suitable guide RNA can be introduced separately, together, and / or as an RNP complex. In embodiments regarding the introduction of at least one nucleic acid molecule encoding the at least one DBE and the at least one suitable guide RNA, the DBE can be encoded on the same nucleic acid molecule as the at least one suitable guide RNA, or it can be encoded on a different nucleic acid molecule. The nucleic acid molecule can be RNA, typically an mRNA molecule, or DNA, including a DNA expression vector such as an expression plasmid vector. The at least one guide RNA is usually provided directly as a guide RNA molecule or as DNA encoding the same. Those skilled in the art are well aware of the design and preparation of different nucleic acid molecules and the various methods for introducing proteins, nucleic acids, and RNPs into living cells.

[0058] In certain embodiments, the total base editing efficiency of introducing at least one substitution of any kind into at least the on-target nucleic acid segment is 30% to 100%, 35% to 100%, 40% to 100%, 45% to 100%, or 50% to 100%.

[0059] "Modified target nucleic acid segment", as used herein, refers to the presence of at least one nucleic acid base substitution of any kind within the target nucleic acid segment, and unless otherwise specified, any kind of substitution refers to a substitution of any of the four natural nucleic acid bases A, C, G, or T to any different one of the four natural nucleic acid bases.

[0060] At least one nucleic acid molecule encoding at least one DBE according to various embodiments and aspects of this specification may be codon-optimized and may further include a nucleic acid sequence encoding at least one compatible guide RNA. In any of the embodiments described herein, the nucleic acid sequence or molecule may be operably linked to various promoters and other regulatory elements for expression in a cell and / or organism of interest.

[0061] The methods according to the embodiments and aspects may include an additional step of regenerating at least one population of edited cells, tissues, organs, materials or whole organisms from at least one edited cell or construct.

[0062] In one embodiment of the first aspect, the diversified base editor includes a CRISPR-Cas moiety derived from a naturally occurring and subsequently artificially modified class 2 type II CRISPR-Cas endonuclease, preferably a Cas9 endonuclease, or class 2 type V CRISPR-Cas endonuclease; preferably, the diversified base editor includes a CRISPR-Cas moiety derived from a Cas12a endonuclease.

[0063] The CRISPR-Cas moiety may include or consist of a mutant Cas9 or Cas12a amino acid sequence. Typically, the CRISPR-Cas moiety includes at least one mutation such that it does not cut both strands of double-stranded DNA, making it either a nickase (cuts one strand of double-stranded DNA) or an inactive (does not cut DNA) CRISPR-Cas moiety. The CRISPR-Cas moiety may further include mutations that alter PAM specificity, thermotolerance, and / or other characteristics.

[0064] In a preferred embodiment using the CRISPR-Cas9 moiety, the CRISPR-Cas moiety comprises or consists of SpCas9 having the mutations D10A, K848A, K1003A, and R1060A, also referred to herein as "enCas9" or "enCas9 nickase". The K848A, K1003A, and R1060A mutations have been shown to weaken non-target strand binding by neutralizing positively charged residues in the non-target strand groove, thereby promoting dissociation of nCas9 from the DNA after introducing a break at the target locus (Slaymaker et al., 2016).

[0065] Preferred CRISPR-Cas12a moieties comprise or consist of LbCas12a having the mutations D156R and D832A, optionally further having the double mutation G532R / K538R, and / or the mutation E795L. In one embodiment, the CRISPR-Cas moiety comprises or consists of LbCas12a-D156R / G532R. In another embodiment, the CRISPR-Cas moiety comprises or consists of LbCas12a-D156R / G532R / K538R / D832A. In a further embodiment, the CRISPR-Cas moiety comprises or consists of LbCas12a-D156R / D832A / E795L. In yet another embodiment, the CRISPR-Cas moiety comprises or consists of LbCas12a-D156R / G532R / K538R / D832A / E795L.
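
The mutation notation used above (e.g., D156R: aspartate at position 156 replaced by arginine) can be applied programmatically to a protein sequence. The following is a minimal sketch under stated assumptions: positions are 1-based, the function name is hypothetical, and the five-residue sequence is a toy example, not the LbCas12a sequence, which is not reproduced here.

```python
import re


def apply_substitutions(sequence, mutations):
    """Apply point mutations written as <wild type><1-based position><new residue>,
    checking that the wild-type residue is actually present at that position."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation format: {mutation}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position out of range: {mutation}")
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at position {index + 1}")
        residues[index] = replacement
    return "".join(residues)


# Toy sequence and toy mutations (not LbCas12a and not its positions).
print(apply_substitutions("MDKGA", ["D2R", "K3A"]))  # MRAGA
```

The wild-type check guards against applying a mutation list to a sequence with a different numbering, e.g., one lacking the N-terminal methionine.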

[0066] In one embodiment of the first aspect, the at least one target cell is a prokaryotic cell comprising a bacterial cell or an archaeal cell, or a eukaryotic cell comprising an insect cell, a mammalian cell, or a plant cell.

[0067] In another embodiment according to the first aspect, the at least one target cell is a plant cell, including a plant protoplast, and optionally, the plant cell, including the plant protoplast, is a cell of, or derived from, a plant belonging to the superfamily Viridiplantae, in particular a monocotyledonous or dicotyledonous plant, including a fodder or forage legume, an ornamental plant, a food crop, a tree, or a shrub, selected from the list comprising: Acer spp., Actinidia spp., Abelmoschus spp., Agave sisalana, Agropyron spp., Agrostis stolonifera, Allium spp., Amaranthus spp., Ammophila arenaria, Ananas comosus, Annona spp., Apium graveolens, Arachis spp., Artocarpus spp., Asparagus officinalis, Avena spp. (e.g., Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, Bambusa spp., Benincasa hispida, Bertholletia excelsa, Beta vulgaris, Brassica spp. (e.g., Brassica napus, Brassica rapa ssp. [canola, rapeseed, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g., Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., Eriobotrya japonica, Eucalyptus sp., Eugenia uniflora, Fagopyrum spp., Fagus spp., Festuca arundinacea, Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp. (e.g., Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g., Helianthus annuus), Hemerocallis fulva, Hibiscus spp., Hordeum spp. (e.g., Hordeum vulgare), Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Lycopersicon spp. (e.g., Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Miscanthus sinensis, Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp. (e.g., Oryza sativa, Oryza latifolia), Panicum miliaceum, Panicum virgatum, Passiflora edulis, Pastinaca sativa, Pennisetum sp., Persea spp., Petroselinum crispum, Phalaris arundinacea, Phaseolus spp., Phleum pratense, Phoenix spp., Phragmites australis, Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Ricinus communis, Rubus spp., Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g., potato (Solanum tuberosum), African eggplant (Solanum integrifolium), or tomato (Solanum lycopersicum)), sorghum (Sorghum bicolor), spinach (Spinacia spp.), Syzygium spp., Tagetes spp., tamarind (Tamarindus indica), cocoa (Theobroma cacao), Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g., bread wheat (Triticum aestivum), durum wheat (Triticum durum), macaroni wheat (Triticum turgidum), Triticum hybernum, Triticum macha, common wheat (Triticum sativum), einkorn wheat (Triticum monococcum), or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., sweet violet (Viola odorata), Vitis spp., maize (Zea mays), wild rice (Zizania palustris), and Ziziphus spp.

[0068] Preferred plants are Abelmoschus spp., Allium spp., Apium graveolens, Asparagus officinalis, Avena spp. (e.g., Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Beta vulgaris, Brassica spp. (e.g., Brassica napus, Brassica rapa ssp. [canola, rapeseed, turnip rape]), Capsicum spp., Citrullus lanatus, Cucumis spp., Cynara spp., Daucus carota, Glycine spp. (e.g., Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g., Helianthus annuus), Hordeum spp. (e.g., Hordeum vulgare), Lactuca sativa, Medicago sativa, Oryza spp. (e.g., Oryza sativa, Oryza latifolia), Pennisetum sp., Saccharum spp., Secale cereale, Solanum spp. (e.g., potato (Solanum tuberosum), Solanum integrifolium, or tomato (Solanum lycopersicum)), sorghum (Sorghum bicolor), spinach (Spinacia spp.), Triticum spp. (e.g., wheat (Triticum aestivum), durum wheat (Triticum durum), rivet wheat (Triticum turgidum), Triticum hybernum, Triticum macha, Triticum sativum, einkorn wheat (Triticum monococcum), or Triticum vulgare), or corn (Zea mays).

[0069] Preferred plants may also, in certain embodiments, be selected from Brassica spp. (e.g., Brassica napus, Brassica rapa ssp. [canola, rapeseed, turnip rape]), Capsicum spp., Glycine spp. (e.g., Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g., Helianthus annuus), Oryza spp. (e.g., Oryza sativa, Oryza latifolia), Solanum spp. (e.g., Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Triticum spp. (e.g., Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), or Zea mays.

[0070] In another embodiment of the first aspect, the at least one diversified base editor of step (b)(i) comprises: (i) one or more cytosine deaminase moieties; (ii) one or more adenine deaminase moieties; (iii) one or more CRISPR-Cas moieties (preferably, the CRISPR-Cas moiety does not cleave both strands of double-stranded DNA); (iv) one, two, three or more nuclear localization sequences; and (v) at least one linker region, preferably one or more linker regions between (i) and (ii), and optionally one or more linker regions between (ii) and (iii).

[0071] Various adenine and cytosine deaminases are known to those skilled in the art (e.g., Fan et al., 2021; Jeong et al., 2020; Yan et al., 2021). Any adenine deaminase and / or cytosine deaminase, including variants of known deaminases, can be used in the diversified base editors of the present invention when combined in a suitable manner with other components according to the construction details as disclosed herein.

[0072] In some embodiments, the cytosine deaminase can be an apolipoprotein B mRNA editing complex (APOBEC) family deaminase. In some embodiments, the cytosine deaminase can be APOBEC1 deaminase, APOBEC2 deaminase, APOBEC3A deaminase, APOBEC3B deaminase, APOBEC3C deaminase, APOBEC3D deaminase, APOBEC3F deaminase, APOBEC3G deaminase, APOBEC3H deaminase, APOBEC4 deaminase, activation-induced deaminase (AID), e.g., hAID or AICDA, rAPOBEC1, PpAPOBEC1, AmAPOBEC1, SsAPOBEC3B, RrA3F, FERNY, a cytidine deaminase (CDA), e.g., CDA1, CDA2, pmCDA1, or atCDA1, or a cytosine deaminase acting on tRNA (CDAT), or variants thereof.

[0073] In a preferred embodiment, one or more cytosine deaminase moieties comprise or consist of human apolipoprotein B mRNA editing enzyme catalytic polypeptide-like 3A (hA3A). In one embodiment, one or more cytosine deaminase moieties comprise or consist of hA3A having the mutation R128A, or the mutation Y130F or the double mutation W104A / P134Y.

[0074] The adenosine deaminase moiety may comprise or consist of a monomeric adenosine deaminase or a dimeric adenosine deaminase, and the monomers of the dimeric adenosine deaminase are preferably linked via at least one linker region, preferably via a 32aa XTEN linker.

[0075] In some embodiments, the adenine deaminase moiety may be a tRNA-specific adenosine deaminase such as TadA (Gaudelli et al., 2017); adenosine deaminase 1 (ADA1) or ADA2; adenosine deaminase 1 acting on RNA (ADAR1), ADAR2, or ADAR3 (e.g., Savva et al., 2012); or adenosine deaminase 1 acting on tRNA (ADAT1), ADAT2, or ADAT3; or variants thereof.

[0076] In some embodiments, TadA can be derived from E. coli (ecTadA). In some embodiments, TadA can be modified and / or truncated. In certain embodiments, TadA does not contain an N-terminal methionine. The TadA deaminase that can be used as part of a base editor or a base editor complex according to the present invention can be, for example, TadA8, TadA8e, TadA8s, TadA7.9, TadA7.10, TadA7.10d, TadA8.17, TadA8.20, TadA9, or variants thereof.

[0077] In a preferred embodiment, one or more adenosine deaminase moieties comprise or consist of the dimer ecTadA / ecTadA7.10, or the dimer ecTadA / TadA8e-V106W, or the monomer TadA8e or monomer TadA9, preferably one or more adenosine deaminase moieties comprise or consist of the monomer TadA9.

[0078] In one embodiment, the DBE comprises at least one monopartite or bipartite nuclear localization signal (NLS), preferably at least one NLS comprising or consisting of the sequence of SEQ ID NO: 49. Any other NLS, or combination of NLSs, specifically tested with respect to the DBE core structure as disclosed herein may also be used. Suitable NLSs are disclosed, for example, in Lange et al., 2010.

[0079] The terms "nuclear localization signal", "nuclear localization sequence" and "NLS" are used interchangeably herein.

[0080] In certain embodiments, at least two or at least three, e.g., three repeats of the SV40 NLS may be used.

[0081] In certain embodiments, a dual-part NLS (dpNLS) may be used, with one part at the C- or N-terminus of the DBE and the other part at another position within the DBE, preferably with the two parts at the C- and N-termini. At least one, or both, of the parts of the dpNLS may be a bipartite NLS, e.g., the sequence of SEQ ID NO: 49, or a sequence having at least 99% identity thereto. In certain embodiments, only one of the parts of the dpNLS is a bipartite NLS comprising the sequence of SEQ ID NO: 49, or a sequence having at least 99% identity thereto, and the second part is, e.g., a triple SV40 NLS as disclosed herein.

[0082] In a preferred embodiment, the DBE comprises an NLS at the N-terminus and / or the C-terminus, preferably an NLS comprising or consisting of the sequence of SEQ ID NO: 49, or a sequence having at least 99% identity thereto, at both the N-terminus and the C-terminus.

[0083] In all embodiments using non-covalent linkage of the portions forming the DBE, each polypeptide forming a part of the DBE preferably includes at least one NLS. More preferably, each polypeptide forming a part of the DBE includes the SV40 NLS, preferably three repeats of the SV40 NLS, or more preferably a dual-part NLS (dpNLS) located at the C-terminus and / or N-terminus and at a second position within the DBE, preferably at the C-terminus and the N-terminus. Preferably, at least one, or both, of the dpNLS sequences is the sequence of SEQ ID NO: 49, or a sequence having at least 99% identity thereto.

[0084] The non-covalent bond can be achieved by an affinity tag, a biotin-streptavidin interaction, or any binding pair enabling a specific interaction, such as FRB-FKBP (Inobe and Nukina, 2016). The non-covalent bond can be achieved by non-covalent protein-protein interactions and / or non-covalent protein-RNA interactions with a guide RNA. When the DBE is formed by non-covalent association with a guide RNA, the binding pair can consist of an RNA-binding moiety fused to a part of the DBE and a modification of the guide RNA, such as the inclusion of a stem-loop and / or binding sequence enabling specific interaction with said RNA-binding moiety. Thereby, any part or group of parts can be non-covalently linked via the guide RNA to a CRISPR-Cas part or a group of parts including a CRISPR-Cas part.

[0085] In certain embodiments using non-covalent linkages, the DBE comprises or consists of a first group of moieties that are covalently linked to each other, where one moiety may be fused to another moiety via at least one linker region, and a second group of moieties that are covalently linked to each other, where one moiety may likewise be fused to another moiety via at least one linker region. The first and second groups of moieties each comprise a moiety that enables non-covalent linkage of the first group of moieties to the second group of moieties.

[0086] In some embodiments of non-covalent linkages, the first group of moieties comprises a CRISPR-Cas moiety, and the second group of moieties comprises an ssRNA and / or dsRNA binding moiety, and a suitable guide RNA is modified to enable binding to the ssRNA and / or dsRNA binding moiety.

[0087] In certain embodiments of non-covalent linkages, the first group of moieties comprises or consists of one or more CRISPR-Cas moieties, optionally a uracil glycosylase inhibitor moiety, a uracil glycosylase moiety and / or an ssDNA binding moiety, and one or more additional moieties such as one, two, three or more nuclear localization signals at the C and / or N terminus of the first group of moieties, and one moiety may be fused to another moiety via at least one linker region; the second group of moieties comprises or consists of one or more adenosine deaminase moieties, and / or one or more cytosine deaminase moieties and one or more ssRNA and / or dsRNA binding moieties, preferably an MS2 protein moiety, and one, two, three or more nuclear localization signals at the C and / or N terminus of the second group of moieties, and one moiety may be fused to another moiety via at least one linker region.

[0088] The linker region, as used herein, refers to a polypeptide linker, where a first part is fused to the N-terminus of the polypeptide linker and a second part is fused to the C-terminus of the polypeptide linker. There are various available polypeptide linker regions recognized and used in the art.

[0089] The polypeptide linker may be a GS linker such as a polypeptide linker comprising or consisting of the amino acid sequence (GGS)n, S(GGS)n, or SGGS, where n is a number from 1 to 20 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20). The polypeptide linker may also comprise or consist of the amino acid sequence: SEQ ID NO: 45. The polypeptide linker may also comprise or consist of the amino acid sequence: SEQ ID NO: 46, also referred to as an XTEN linker. Further, the polypeptide linker may comprise or consist of the amino acid sequence: SEQ ID NO: 47, which is also called a GS-XTEN-GS linker and is referred to herein as the "32aa XTEN linker". Further, the polypeptide linker may comprise or consist of SEQ ID NO: 48, the amino acid sequence referred to herein as the "48aa XTEN linker".
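
The repeat-based GS linker formulas named above, (GGS)n and S(GGS)n with n from 1 to 20, can be expanded into explicit amino acid sequences as in the following minimal sketch. The function name and its parameters are hypothetical; the sequences of SEQ ID NOs: 45 to 48 are not reproduced here.

```python
def ggs_linker(n, leading_serine=False):
    """Return the amino acid sequence (GGS)n, or S(GGS)n if leading_serine is True."""
    if not 1 <= n <= 20:
        raise ValueError("n must be between 1 and 20")
    return ("S" if leading_serine else "") + "GGS" * n


print(ggs_linker(3))                       # GGSGGSGGS
print(ggs_linker(1, leading_serine=True))  # SGGS
```

Note that SGGS corresponds to S(GGS)n with n = 1.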

[0090] In one embodiment, one or more linker regions between parts (i) and (ii) as defined above comprise or consist of the 48aa XTEN linker.

[0091] In one embodiment, one or more linker regions between parts (ii) and (iii) as defined above comprise or consist of a 32aa XTEN linker, or preferably a GS linker consisting of 3, 5 or 6 repeats of the amino acid sequence GGGGS (see SEQ ID NO: 51), namely (GGGGS)3, (GGGGS)5 and (GGGGS)6 respectively. In another embodiment, the linker between part (ii) and part (iii) is replaced by a non-sequence-specific ssDNA-binding moiety, preferably the Rad51 ssDNA-binding domain (Rad51ssDBD), or a non-sequence-specific ssDNA-binding moiety, preferably the Rad51ssDBD, is added to the linker region between part (ii) and part (iii), preferably the (GGGGS)5 linker region. In yet another embodiment, the linker between part (ii) and part (iii) is replaced by non-covalent linkage as described above.

[0092] In a preferred embodiment, the linker between parts (ii) and (iii) as defined above comprises or consists of a (GGGGS)5 linker.

[0093] In certain embodiments, the different parts can be linked via bioconjugation, for example, using the SNAP-tag system (Hussain et al., 2013); the Halo-tag system (Los et al., 2008); the CLIP-tag system (Gautier et al., 2008) or any conjugation of specific biomolecules collectively referred to as "click chemistry" in the art.

[0094] In another embodiment of the first aspect, the at least one diversified base editor of step (b)(i) is at least one diversified base editor in the form of a fusion protein. Preferably, the moieties (i), (ii), and (iii) as defined above are arranged in the N-terminal to C-terminal direction in the order (i)-(ii)-(iii), with one or more linker regions between the moieties. More preferably, one, two, three, or more nuclear localization sequences (iv) are located at the C-terminus of the diversified base editor, or one or more nuclear localization sequences (iv) are located at the N-terminus and one or more nuclear localization sequences (iv) are located at the C-terminus of the diversified base editor.
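
The N-terminal to C-terminal arrangement (i)-(ii)-(iii) with intervening linker regions and terminal NLSs can be written down as a simple architecture string. The following is a minimal sketch: the function name, the " :: " separator, and the particular combination of parts are illustrative assumptions; the part names are labels taken from the embodiments described herein, not amino acid sequences, and the combination shown is not asserted to be a specific construct of this specification.

```python
def describe_fusion(cytosine_deaminase, adenine_deaminase, crispr_cas,
                    linker_i_ii, linker_ii_iii, n_nls=None, c_nls=None):
    """Return an N- to C-terminal architecture string for a fusion protein
    arranged as (i) cytosine deaminase, (ii) adenine deaminase, (iii) CRISPR-Cas."""
    parts = []
    if n_nls:
        parts.append(n_nls)
    parts += [
        cytosine_deaminase,
        f"[{linker_i_ii}]",    # linker region between (i) and (ii)
        adenine_deaminase,
        f"[{linker_ii_iii}]",  # linker region between (ii) and (iii)
        crispr_cas,
    ]
    if c_nls:
        parts.append(c_nls)
    return " :: ".join(parts)


print(describe_fusion("hA3A", "TadA9", "LbCas12a(D156R/D832A)",
                      "48aa XTEN", "(GGGGS)5", c_nls="3xSV40 NLS"))
# hA3A :: [48aa XTEN] :: TadA9 :: [(GGGGS)5] :: LbCas12a(D156R/D832A) :: 3xSV40 NLS
```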

[0095] In one embodiment, three repeats of the SV40 NLS are located at the C-terminus of the DBE.

[0096] In another embodiment, a bipartite NLS (dpNLS) is fused to the N-terminus or C-terminus, preferably to both the N-terminus and the C-terminus, of the DBE as disclosed herein. Preferably, at least one of the dpNLS sequences is SEQ ID NO: 49, or a sequence having at least 99% identity thereto. In other embodiments, both dpNLS sequences have the sequence of SEQ ID NO: 49, or a sequence having at least 99% identity thereto.

[0097] In another embodiment of the first aspect, the diversified base editor includes at least one additional moiety. Preferably, the at least one additional moiety is selected from an MS2 protein moiety, an affinity tag-binding protein, a uracil glycosylase inhibitor moiety, a uracil glycosylase moiety, an ssDNA-, ssRNA-, or dsRNA-binding protein moiety, or any combination thereof.

[0098] In one embodiment, the at least one additional moiety comprises or consists of a uracil glycosylase inhibitor (UGI).

[0099] In one embodiment, the DBE comprises a uracil DNA glycosylase (UDG), including uracil-N-glycosylase (UNG).

[0100] In certain embodiments, the DBE does not contain a uracil glycosylase inhibitor moiety and / or does not contain a uracil glycosylase moiety.

[0101] In certain embodiments, the DBE comprises a non-specific ssDNA-binding moiety, preferably the ssDNA-binding domain of Rad51 (Rad51ssDBD). For Cas9 cytosine base editors, it has previously been shown that inserting the Rad51ssDBD between Cas9 and the cytosine deaminase can increase base-editing efficiency and expand the base-editing window in cell lines and mouse embryos (Zhang et al., 2020). In certain embodiments, Rad51ssDBD is used in place of the linker region between part (ii) and part (iii), or Rad51ssDBD is added to the linker region between part (ii) and part (iii), preferably the (GGGGS)5 linker region.

[0102] In another embodiment of the first aspect, one or more adenine deaminase moieties and / or one or more cytosine deaminase moieties are linked to at least one ssRNA- or dsRNA-binding protein moiety, preferably at least one MS2 protein moiety, and at least one suitable guide RNA is adapted to enable interaction with the at least one ssRNA- or dsRNA-binding protein moiety. Preferably, one or more adenine base-editor moieties and / or one or more cytosine base-editor moieties are linked to at least one MS2 protein moiety, and the suitable guide RNA is adapted to contain two MS2 stem-loops.

[0103] The MS2 tagging strategy relies on the binding of the MS2 bacteriophage coat protein (referred to herein as the "MS2 protein" or, in the context of the DBE, as the "MS2 protein moiety") to a hairpin structure derived from the phage genome, referred to herein as the "MS2 (stem) loop".

[0104] In one embodiment, the one or more CRISPR-Cas moieties are one or more Cas12a moieties, the one or more adenine deaminase moieties and / or the one or more cytosine deaminase moieties are linked to one or more MS2 protein moieties, the guide RNA comprises two MS2 loops, and optionally, the guide RNA comprises the sequence of SEQ ID NO: 38, or SEQ ID NO: 39, or SEQ ID NO: 40, or SEQ ID NO: 41, or a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto.

[0105] In certain embodiments, (ii) the one or more adenine deaminase moieties are fused to the one or more MS2 protein moieties and one or more NLS moieties as a fusion protein optionally having the amino acid sequence of SEQ ID NO: 42 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto; (i) the one or more cytosine deaminase moieties are preferably fused to parts (iii) and (iv) as a second fusion protein with one or more linker regions between (i) and (iii), optionally a 32aa-XTEN-linker, a 48aa-XTEN-linker, a (GGGGS)5 or a (GGGGS)6 linker.

[0106] In certain embodiments, (i) one or more cytosine deaminase moieties are fused to one or more MS2 protein moieties and one or more NLS moieties as a fusion protein optionally having the amino acid sequence of SEQ ID NO: 43 or SEQ ID NO: 44, or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto; (ii) one or more adenosine deaminase moieties are fused to part (iii) and (iv), preferably together with one or more linker regions between (ii) and (iii).

[0107] In certain embodiments, (i) one or more cytosine deaminase moieties are fused to one or more adenosine deaminase moieties, one or more MS2 protein moieties, and one or more NLS moieties via one or more linker regions; (iii) one or more CRISPR-Cas moieties are fused to part (iv) as a second fusion protein.

[0108] In certain embodiments, (i) one or more cytosine deaminase moieties are fused to one or more MS2 protein moieties and one or more NLS moieties as a fusion protein optionally having the amino acid sequence of SEQ ID NO: 43 or SEQ ID NO: 44, or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto, (ii) one or more adenosine deaminase moieties are fused to one or more MS2 protein moieties and one or more NLS moieties as a fusion protein optionally having the amino acid sequence of SEQ ID NO: 42 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto; (iii) one or more CRISPR-Cas moieties are fused to part (iv) as a third fusion protein.

[0109] In another embodiment of the first aspect, the diversified base editor comprises an amino acid sequence selected from any one of SEQ ID NOs: 1-27 and 52, or a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto.
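To illustrate how a sequence identity threshold such as "at least 75%" might be checked, the following Python sketch computes percent identity over a pairwise alignment that has already been made. This is an assumption for illustration only: the description may define sequence identity and the alignment method differently elsewhere, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptides only; real use would align a candidate against, e.g., SEQ ID NO: 23.
print(percent_identity("GGGGS", "GGGAS"))
print(meets_identity("GGGGS", "GGGAS", 75.0))
```

Here identity is taken over the full alignment length, so gap columns count as mismatches; other conventions (for example, dividing by the length of the shorter sequence) give different values.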

[0110] SEQ ID NO: 1 is a Cas9-based DBE comprising hA3A, dimeric ecTadA / ecTadA7.10, enCas9 with an additional D10A nickase mutation, and three repeats of the SV40 NLS: hA3A-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA7.10-32aa-XTEN-linker-enCas9(D10A)-SV40 NLS(3x).

[0111] SEQ ID NO: 2 has the same structure as SEQ ID NO: 1, but contains LbCas12a(D156R / D832A) instead of Cas9: hA3A-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA7.10-32aa-XTEN-linker-LbCas12a(D156R / D832A)-SV40 NLS(3x).

[0112] SEQ ID NO: 3 has an additional E795L mutation: hA3A-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA7.10-32aa-XTEN-linker-LbCas12a(D156R / E795L / D832A)-SV40 NLS(3x).

[0113] SEQ ID NO: 4 has the same structure as SEQ ID NO: 2, but contains the hA3A(R128A) cytosine deaminase mutant: hA3A(R128A)-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA7.10-32aa-XTEN-linker-LbCas12a(D156R / D832A)-SV40 NLS(3x).

[0114] SEQ ID NO: 5 has the same structure as SEQ ID NO: 2, but contains the dimeric TadA8e(V106W) adenine deaminase mutant: hA3A-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA8e(V106W)-32aa-XTEN-linker-LbCas12a(D156R / D832A)-SV40 NLS(3x).

[0115] SEQ ID NO: 6 contains both hA3A(R128A) and the dimeric TadA8e(V106W): hA3A(R128A)-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA8e(V106W)-32aa-XTEN-linker-LbCas12a(D156R / D832A)-SV40 NLS(3x).

[0116] SEQ ID NO: 7 contains dpNLS at the N-terminus and C-terminus and the monomeric TadA8e adenine deaminase: dpNLS-hA3A-48aa-XTEN-linker-TadA8e-32aa-XTEN-linker-LbCas12a(D156R / D832A)-dpNLS.

[0117] SEQ ID NO: 8 has the same structure as SEQ ID NO: 7, but contains the additional K932G / N933G mutations in LbCas12a: dpNLS-hA3A-48aa-XTEN-linker-TadA8e-32aa-XTEN-linker-LbCas12a(D156R / D832A / K932G / N933G)-dpNLS.

[0118] SEQ ID NO: 9 has the same structure as SEQ ID NO: 7, but contains an additional E795L mutation in LbCas12a: dpNLS-hA3A-48aa-XTEN-linker-TadA8e-32aa-XTEN-linker-LbCas12a(D156R / E795L / D832A)-dpNLS.

[0119] SEQ ID NO: 10 has the same structure as SEQ ID NO: 7, but the linker region between part (ii) and part (iii) is a (GGGGS)5 linker: dpNLS-hA3A-48aa-XTEN-linker-TadA8e-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS.

[0120] SEQ ID NO: 11 has the same structure as SEQ ID NO: 7, but contains the monomeric TadA9 adenine deaminase: dpNLS-hA3A-48aa-XTEN-linker-TadA9-32aa-XTEN-linker-LbCas12a(D156R / D832A)-dpNLS.

[0121] SEQ ID NO: 12 has the same structure as SEQ ID NO: 11, but the linker region between part (ii) and part (iii) is a (GGGGS)6 linker: dpNLS-hA3A-48aa-XTEN-linker-TadA9-(GGGGS)6-LbCas12a(D156R / D832A)-dpNLS.

[0122] SEQ ID NO: 13 has the same structure as SEQ ID NO: 12, but contains the hA3A(W104A / P134Y) mutant: dpNLS-hA3A(W104A / P134Y)-48aa-XTEN-linker-TadA9-(GGGGS)6-LbCas12a(D156R / D832A)-dpNLS.

[0123] SEQ ID NO: 14 has the same structure as SEQ ID NO: 12, but contains the hA3A(Y130F) mutant: dpNLS-hA3A(Y130F)-48aa-XTEN-linker-TadA9-(GGGGS)6-LbCas12a(D156R / D832A)-dpNLS.

[0124] SEQ ID NO: 15 has the same structure as SEQ ID NO: 10, but contains a uracil glycosylase inhibitor moiety and a (GGGGS)6 linker: dpNLS-hA3A-48aa-XTEN-linker-TadA8e-(GGGGS)6-LbCas12a(D156R / D832A)-UGI-dpNLS.

[0125] SEQ ID NO: 16 has the same structure as SEQ ID NO: 10, but the linker region between part (ii) and part (iii) is a (GGGGS)6 linker, and the construct contains an E. coli uracil-N-glycosylase moiety: dpNLS-eUNG-hA3A-48aa-XTEN-linker-TadA8e-(GGGGS)6-LbCas12a(D156R / D832A)-dpNLS.

[0126] SEQ ID NO: 17 has the same structure as SEQ ID NO: 2, but contains the LbCas12a(D832A / D156R / G532R / K538R) variant (enCas12a(D832)): hA3A-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA7.10-32aa-XTEN-linker-enCas12a(D832)-SV40 NLS(3x).

[0127] SEQ ID NO: 18 has the same structure as SEQ ID NO: 17, but contains the dimeric TadA8e(V106W): hA3A-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-TadA8e(V106W)-32aa-XTEN-linker-enCas12a(D832)-SV40 NLS(3x).

[0128] SEQ ID NO: 19 has the same structure as SEQ ID NO: 17, but contains hA3A(R128A): hA3A(R128A)-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA7.10-32aa-XTEN-linker-enCas12a(D832)-SV40 NLS(3x).

[0129] SEQ ID NO: 20 contains both hA3A(R128A) and the dimeric TadA8e(V106W): hA3A(R128A)-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-TadA8e(V106W)-32aa-XTEN-linker-enCas12a(D832)-SV40 NLS(3x).

[0130] SEQ ID NO: 21 has the same structure as SEQ ID NO: 17, but contains an additional E795L mutation in enLbCas12a: hA3A-48aa-XTEN-linker-ecTadA-32aa-XTEN-linker-ecTadA7.10-32aa-XTEN-linker-enCas12a(E795L / D832)-SV40 NLS(3x).

[0131] SEQ ID NO: 22 contains dpNLS, monomeric TadA9, a (GGGGS)5 linker region, and Rad51ssDBD: dpNLS-hA3A-48aa-XTEN-linker-TadA9-(GGGGS)5-Rad51ssDBD-LbCas12a(D156R / D832A)-dpNLS.

[0132] SEQ ID NO: 23 contains dpNLS, monomeric TadA9, and a (GGGGS)5 linker region: dpNLS-hA3A-48aa-XTEN-linker-TadA9-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS.

[0133] SEQ ID NO: 24 contains dpNLS, hA3A(W104A / P134Y), monomeric TadA9, and a (GGGGS)5 linker region: dpNLS-hA3A(W104A / P134Y)-48aa-XTEN-linker-TadA9-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS.

[0134] SEQ ID NO: 25 contains dpNLS, hA3A(Y130F), monomeric TadA9, and a (GGGGS)5 linker region: dpNLS-hA3A(Y130F)-48aa-XTEN-linker-TadA9-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS.

[0135] SEQ ID NO: 26 has the same structure as SEQ ID NO: 10, but contains a uracil glycosylase inhibitor moiety: dpNLS-hA3A-48aa-XTEN-linker-TadA8e-(GGGGS)5-LbCas12a(D156R / D832A)-UGI-dpNLS.

[0136] SEQ ID NO: 27 has the same structure as SEQ ID NO: 10, but contains the E. coli uracil-N-glycosylase moiety: dpNLS-eUNG-hA3A-48aa-XTEN-linker-TadA8e-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS.

[0137] SEQ ID NO: 52 contains dpNLS, hA3A, monomeric TadA9, and a (GGGGS)3 linker region: dpNLS-hA3A-48aa-XTEN-linker-monomeric TadA9-(GGGGS)3-LbCas12a(D156R / D832A)-dpNLS.
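The construct architectures listed above differ from one another in individual modules. As a minimal sketch of how such architectures could be represented and compared, the following Python example stores two of them as ordered module lists. The helper names `architecture` and `differing_modules` are hypothetical; the module lists are transcribed from the architectures given for SEQ ID NO: 23 and SEQ ID NO: 52.

```python
def architecture(modules: list[str]) -> str:
    """Join modules N-terminus to C-terminus in the hyphenated notation used above."""
    return "-".join(modules)


def differing_modules(a: list[str], b: list[str]) -> list[tuple[int, str, str]]:
    """Return (index, module_a, module_b) for positions where two architectures differ."""
    if len(a) != len(b):
        raise ValueError("this simple comparison needs the same number of modules")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


SEQ_ID_23 = ["dpNLS", "hA3A", "48aa-XTEN-linker", "TadA9", "(GGGGS)5",
             "LbCas12a(D156R / D832A)", "dpNLS"]
SEQ_ID_52 = ["dpNLS", "hA3A", "48aa-XTEN-linker", "TadA9", "(GGGGS)3",
             "LbCas12a(D156R / D832A)", "dpNLS"]

print(architecture(SEQ_ID_23))
print(differing_modules(SEQ_ID_23, SEQ_ID_52))
```

Storing modules as a list avoids having to split the hyphenated string, since module names such as "48aa-XTEN-linker" themselves contain hyphens.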

[0138] In a second aspect, edited cells, tissues, organs, materials or whole organisms obtainable or obtained by the method according to the first aspect may be provided.

[0139] In a third aspect, a diversified base editor, or a diversified base editor complex further comprising at least one suitable guide RNA, or at least one nucleic acid molecule encoding the same, may be provided, wherein the diversified base editor is as defined in the first aspect.

[0140] As is known to those skilled in the art, there are guide RNA scaffolds for different types of CRISPR nucleases, which can be individually designed to interact with the PAM motif at / near the target base to be edited / exchanged.

[0141] In a fourth aspect, a vector or expression construct, or two or more vectors and expression constructs, may be provided, each vector and / or expression construct comprising at least one nucleic acid molecule of the third aspect, different portions of the diversified base editor being encoded on the same vector or expression construct or on different vectors or expression constructs, and / or the diversified base editor, or a portion thereof, and at least one suitable guide RNA being encoded on the same vector or expression construct or on different vectors or expression constructs.

[0142] In a fifth aspect, there may be provided a cell comprising at least one diversified base editor or at least one diversified base editor complex of the third aspect, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct of the fourth aspect. The cell is a prokaryotic cell, including a bacterial cell or an archaeal cell, or a eukaryotic cell, including an insect cell, a mammalian cell, or a plant cell including a plant protoplast. Preferably, the cell is a plant cell including a plant protoplast. Optionally, the plant cell including a plant protoplast is a cell of a plant belonging to the superfamily Viridiplantae, in particular a monocotyledonous or dicotyledonous plant, including leguminous plants, ornamental plants, edible crops, and trees or shrubs for feed or forage, selected from species of the genus Acer, species of the genus Actinidia, species of the genus Abelmoschus, Agave sisalana, species of the genus Agropyron, Agrostis stolonifera, species of the genus Allium, species of the genus Amaranthus, Ammophila arenaria, Ananas comosus, species of the genus Annona, Apium graveolens, species of the genus Arachis, species of the genus Artocarpus, Asparagus officinalis, species of the genus Avena (e.g., Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Averrhoa carambola, species of the genus Bambusa, Benincasa hispida, Bertholletia excelsa, Beta vulgaris, species of the genus Brassica (e.g., Brassica napus, Brassica rapa ssp. [canola, rapeseed, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Ceiba pentandra, Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Corchorus sp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Elaeis (e.g., Elaeis guineensis, Elaeis oleifera), Eleusine coracana, Eragrostis tef, Erianthus sp., loquat (Eriobotrya japonica), eucalyptus species (Eucalyptus sp.), Surinam cherry (Eugenia uniflora), buckwheat species (Fagopyrum spp.), beech species (Fagus spp.), tall fescue (Festuca arundinacea), fig (Ficus carica), kumquat species (Fortunella spp.), strawberry species (Fragaria spp.), ginkgo (Ginkgo biloba), soybean species (Glycine spp.) (e.g., soybean (Glycine max), Soja hispida or Soja max), cotton (Gossypium hirsutum), sunflower species (Helianthus spp.) (e.g., sunflower (Helianthus annuus)), daylily (Hemerocallis fulva), hibiscus species (Hibiscus spp.), barley species (Hordeum spp.) (e.g., barley (Hordeum vulgare)), sweet potato (Ipomoea batatas), walnut species (Juglans spp.), lettuce (Lactuca sativa), vetch species (Lathyrus spp.), lentil (Lens culinaris), flax (Linum usitatissimum), litchi (Litchi chinensis), lotus species (Lotus spp.), angled luffa (Luffa acutangula), lupinus species (Lupinus spp.), woodrush (Luzula sylvatica), tomato species (Lycopersicon spp.) (e.g., tomato (Lycopersicon esculentum), Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma species (Macrotyloma spp.), apple species (Malus spp.), acerola (Malpighia emarginata), mammee apple (Mammea americana), mango (Mangifera indica), cassava species (Manihot spp.), sapodilla (Manilkara zapota), alfalfa (Medicago sativa), sweet clover species (Melilotus spp.), mint species (Mentha spp.), Chinese silvergrass (Miscanthus sinensis), bitter gourd species (Momordica spp.), black mulberry (Morus nigra), banana species (Musa spp.), tobacco species (Nicotiana spp.), olive species (Olea spp.), prickly pear species (Opuntia spp.), bird's-foot trefoil species (Ornithopus spp.), rice species (Oryza spp.) (e.g., rice (Oryza sativa), broadleaf rice (Oryza latifolia)), proso millet (Panicum miliaceum), switchgrass (Panicum virgatum), passion fruit (Passiflora edulis), parsnip (Pastinaca sativa), pearl millet species (Pennisetum sp.), avocado species (Persea spp.), parsley (Petroselinum crispum), reed canarygrass (Phalaris arundinacea), common bean species (Phaseolus spp.), timothy (Phleum pratense), date palm species (Phoenix spp.), common reed (Phragmites australis), ground cherry species (Physalis spp.), pine species (Pinus spp.), pistachio (Pistacia vera), pea species (Pisum spp.), bluegrass species (Poa spp.), poplar species (Populus spp.), mesquite species (Prosopis spp.), cherry species (Prunus spp.), guava species (Psidium spp.), pomegranate (Punica granatum), European pear (Pyrus communis), oak species (Quercus spp.), radish (Raphanus sativus), rhubarb (Rheum rhabarbarum), currant species (Ribes spp.), castor bean (Ricinus communis), raspberry species (Rubus spp.), Saccharum spp., Salix sp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp. (e.g., Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Syzygium spp., Tagetes spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Tripsacum dactyloides, Triticosecale rimpaui, Triticum spp. (e.g., Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, or Ziziphus spp., or is a cell derived therefrom.

[0143] Preferred plants are Abelmoschus spp., Allium spp., Apium graveolens, Asparagus officinalis, Avena spp. (e.g., Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida), Beta vulgaris, Brassica spp. (e.g., Brassica napus, Brassica rapa ssp. [canola, rapeseed, turnip rape]), Capsicum spp., Citrullus lanatus, Cucumis spp., Cynara spp., Daucus carota, Glycine spp. (e.g., Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g., Helianthus annuus), Hordeum spp. (e.g., Hordeum vulgare), Lactuca sativa, Medicago sativa, Oryza spp. (e.g., Oryza sativa, Oryza latifolia), Pennisetum sp., Saccharum spp., Secale cereale, Solanum spp. (e.g., Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Sorghum bicolor, Spinacia spp., Triticum spp. (e.g., Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), or Zea mays.

[0144] Preferred plants may also, in certain embodiments, be selected from Brassica spp. (e.g., Brassica napus, Brassica rapa ssp. [canola, rapeseed, turnip rape]), Capsicum spp., Glycine spp. (e.g., Glycine max, Soja hispida or Soja max), Gossypium hirsutum, Helianthus spp. (e.g., Helianthus annuus), Oryza spp. (e.g., Oryza sativa, Oryza latifolia), Solanum spp. (e.g., Solanum tuberosum, Solanum integrifolium or Solanum lycopersicum), Triticum spp. (e.g., Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum, Triticum monococcum or Triticum vulgare), or Zea mays.

[0145] In a sixth aspect, there may be provided a kit comprising at least one diversified base editor or at least one diversified base editor complex of the third aspect, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct of the fourth aspect; or at least one cell of the fifth aspect, and optionally instructions for use and the necessary buffers, instruments and reagents.

[0146] A diversified base editor, a diversified base editor complex comprising a guide RNA, a nucleic acid molecule encoding the same, and / or a vector or expression construct are provided in a functional form that includes, for example, a stabilizer, a cofactor, and means for introducing the same into a target cell or tissue, etc.

[0147] In a seventh aspect, there may be provided use of at least one diversified base editor or at least one diversified base editor complex of the third aspect, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct of the fourth aspect; or at least one cell of the fifth aspect; or at least one kit of the sixth aspect, for targeted directed evolution of at least one target nucleic acid segment, preferably for targeted directed evolution of at least one target nucleic acid segment in a plant. This includes use for optimizing or modifying traits in a plant, including yield-related traits; disease or pathogen resistance-related traits, wherein the disease is caused by, or the pathogen is selected from, a virus, bacterium, fungus, nematode, or insect; or abiotic stress-related traits, including herbicide resistance-related traits or salt or drought stress-related traits. It further includes use for identification of a genomic locus associated with at least one gene and / or at least one trait of interest.

[0148] Targeted directed evolution is any strategy of genotypic and / or phenotypic screening and / or selection following diversification of a target nucleic acid segment, optionally including the application of a selection pressure. It is typically carried out as iterative rounds of mutagenesis, wherein each round of mutagenesis may include regenerating an organism, such as a plant, from a cell, tissue, and / or material (including a plant protoplast or callus) used for mutagenesis, and / or regenerating plant material, for example via callus culture or by direct rooting / rapid growth, and / or crossing, including backcrossing.

[0149] In an eighth aspect, there is provided a Brassica napus acetolactate synthase (ALS) 3 protein comprising the D358N and R359H mutations or an Arabidopsis thaliana acetohydroxyacid synthase (AHAS) protein comprising the D376N and R377H mutations.
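Mutation names such as D358N follow the usual pattern of wild-type residue, 1-based position, and replacement residue. The following Python sketch, offered only as an illustration, applies such mutations to a protein sequence and checks the expected wild-type residue first. The function name is hypothetical, and the toy sequence is a placeholder, not the ALS3 sequence of SEQ ID NO: 77; positions are assumed to refer to the numbering of the respective protein.

```python
import re


def apply_point_mutations(sequence: str, mutations: list[str]) -> str:
    """Apply substitutions written as e.g. 'D358N' (wild type, 1-based position, new residue)."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence with D at position 358 and R at position 359 (not a real protein).
toy_sequence = "A" * 357 + "DR" + "A" * 5
mutant = apply_point_mutations(toy_sequence, ["D358N", "R359H"])
print(mutant[357], mutant[358])
```

The wild-type check guards against applying a mutation to a sequence with a different numbering, for example the Arabidopsis thaliana AHAS protein, where the corresponding positions are 376 and 377.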

[0150] In one embodiment, the Brassica napus ALS3 protein comprises or consists of the amino acid sequence of SEQ ID NO: 77 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto.

[0151] In a ninth aspect, there is provided a nucleic acid molecule encoding the ALS3 or AHAS protein of the eighth aspect.

[0152] In one embodiment, the nucleic acid molecule comprises or consists of the sequence of SEQ ID NO: 76 or a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity thereto.

[0153] In a tenth aspect, there is provided a plant or plant cell comprising and / or encoding the ALS3 protein or AHAS protein of the eighth aspect or the nucleic acid molecule of the ninth aspect.

[0154] In one embodiment, the plant or plant cell is a Brassica napus or Arabidopsis thaliana plant or plant cell.

Examples

Example 1: Cloning method and plasmid construction

[0155] Unless otherwise indicated, cloning procedures carried out for the purposes of the present invention, including restriction digestion, agarose gel electrophoresis, nucleic acid purification and ligation, transformation, selection and culturing of bacterial cells, were carried out as described in the literature available to those skilled in the art (see Sambrook, Fritsch and Maniatis, 1989). Sequence analysis of recombinant DNA was carried out by LGC Genomics (Berlin, Germany) using the Sanger technique. Restriction endonucleases and Gibson assembly reagents used to construct the various expression vectors were from New England Biolabs (Ipswich, MA, USA). Oligonucleotides were synthesized by Integrated DNA Technologies (Coralville, IA, USA). Codon-optimized genes were from Genewiz (South Plainfield, NJ, USA).

[0156] All base editors were codon-optimized for expression in plant cells, based on the codon usage of highly expressed genes in wheat.

[0157] All expression vectors contain the maize polyubiquitin (Ubi) promoter (SEQ ID NO: 28) for constitutive expression, located upstream of the coding sequence, and a fragment of the 3' untranslated region of the octopine-type Ti plasmid gene 7 of Agrobacterium tumefaciens (SEQ ID NO: 29) or the 3' end of the 35S gene of cauliflower mosaic virus (SEQ ID NO: 30). The gRNA expression cassette, containing the truncated glycine-tRNA (SEQ ID NO: 31), a 21-bp direct repeat sequence (SEQ ID NO: 32), a 23-bp protospacer site targeting the rice OsAAT gene (LOC_Os01g55540.1) (SEQ ID NO: 33), and the rice polymerase III terminator sequence (containing 8 consecutive "Ts"), was ordered as a synthetic fragment and cloned into a standard E. coli vector (pUC derivative) via EcoRV blunt-end ligation. Expression of the gRNA is driven by the polymerase III-type promoter of the rice U6 snRNA gene (SEQ ID NO: 35).
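The element order and lengths stated for the gRNA expression cassette can be captured as a simple consistency check. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, the sequences are runs of N used purely as placeholders for the sequences of SEQ ID NOs: 31 to 33, and the promoter is left out.

```python
def assemble_grna_cassette(trna: str, direct_repeat: str, spacer: str, terminator: str) -> str:
    """Join the cassette elements in the order listed in [0157] after basic length checks."""
    if len(direct_repeat) != 21:
        raise ValueError("the direct repeat is described as 21 bp")
    if len(spacer) != 23:
        raise ValueError("the protospacer is described as 23 bp")
    if "T" * 8 not in terminator.upper():
        raise ValueError("the terminator is described as containing 8 consecutive Ts")
    return trna + direct_repeat + spacer + terminator


# Placeholder sequences only; substitute the actual sequences before any real use.
cassette = assemble_grna_cassette(
    trna="N" * 10,           # stands in for SEQ ID NO: 31 (length chosen arbitrarily)
    direct_repeat="N" * 21,  # stands in for SEQ ID NO: 32
    spacer="N" * 23,         # stands in for SEQ ID NO: 33
    terminator="T" * 8,      # minimal stand-in for the rice polymerase III terminator
)
print(len(cassette))
```

A check of this kind catches a spacer of the wrong length before a synthetic fragment is ordered.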

[0158] All plasmids were transformed into E. coli for propagation and isolated using the ZymoPure II Plasmid Gigaprep Kit (Zymo Research, Irvine, CA, USA) for DNA purification.

Example 2: Design of Cas9 Diversified Base Editors

[0159] To design a Cas9-based diversified base editor (Cas9 DBE), an existing Cas9 dual base editor, STEME-1 (Li et al., 2020), was optimized to allow for greater sequence diversification. To build construct DBE-1, the Cas9(D10A) nickase in STEME-1 was replaced with enCas9, a variant of Cas9(D10A) with enhanced DNA dissociation (Slaymaker et al., 2016), the UGI domain was removed, and nucleoplasmin and a single SV40 NLS were replaced with three C-terminal repeats of the SV40 NLS (see Figure 1A). STEME-1 and DBE-1 were optimized for expression in monocot plants, and their corresponding activities were determined in protoplasts by measuring the total number of base edits at the AAT target site.

[0160] Transformation of protoplast cells was performed as described by Shan et al., 2014 with some modifications. Protoplasts were isolated from the leaf sheaths of 3-week-old aseptically grown rice seedlings. Healthy stems and leaf sheaths were bundled in groups of 20 and cut into strips with a sharp razor blade. The strips were then infiltrated with a cell wall-lysing enzyme solution (1.5% cellulase R10 and 0.75% macerozyme R10 in 10 mM KCl and 0.6 M mannitol, pH 7.5) and incubated overnight in the dark with gentle shaking (40 rpm) at 24°C. After enzymatic digestion, the released protoplasts were recovered by filtering the mixture through a 40-μm nylon mesh and resuspended in W5 solution. The resuspended protoplasts were washed with W5 solution, and the cell pellet was then suspended in MMG solution at a density of 2.5 million cells / ml. For transformation, 200 μl of cells (5 x 10^5) were mixed with 20 μg of plasmid DNA and 220 μl of freshly prepared polyethylene glycol (PEG) solution. The mixture was incubated in the dark for 15-20 minutes. After removing the PEG solution, the protoplasts were resuspended in 2 ml of WI solution, transferred to a 6-well plate, and incubated at 24°C for at least 48 hours.
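The cell number per transformation follows directly from the stated density and volume. The short Python sketch below reproduces that arithmetic; the function names are hypothetical, and the DNA-per-cell figure is a derived value, not one stated in the text.

```python
def cells_in_aliquot(density_per_ml: float, volume_ul: float) -> float:
    """Number of cells in an aliquot, given cells per ml and the aliquot volume in microliters."""
    return density_per_ml * volume_ul / 1000.0


def dna_ug_per_1e5_cells(dna_ug: float, cells: float) -> float:
    """Micrograms of plasmid DNA per 100,000 cells."""
    return dna_ug / (cells / 1e5)


cells = cells_in_aliquot(density_per_ml=2.5e6, volume_ul=200)  # values from [0160]
print(cells)                              # 500000.0, i.e. 5 x 10^5 cells
print(dna_ug_per_1e5_cells(20, cells))    # 4.0 ug of plasmid DNA per 10^5 cells
```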

[0161] Both STEME-1 and DBE-1 were co-transfected with an AAT-targeted Cas9 guide RNA construct containing, in the 5'-to-3' direction, a truncated tRNA, a first mature direct repeat sequence, a spacer RNA, a second mature direct repeat sequence, and a poly-T tail (T-stretch terminator). Three days after transfection, protoplasts were collected by centrifugation and genomic DNA was extracted using either Phire Tissue Direct PCR Extraction Buffer (Thermo Fisher Scientific) or the Qiagen DNeasy Plant Kit. The AAT target region was amplified by PCR using the primers of SEQ ID NOs: 36 and 37 and subjected to amplicon deep sequencing.

[0162] While the overall base editing efficiencies of STEME-1 and DBE-1 were very similar (49.43% vs. 50.19%), DBE-1 showed an overall broader mutation spectrum with highly increased C-to-A and C-to-G substitutions (see FIGS. 1B and 2). Furthermore, DBE-1 showed a slightly expanded C-to-T base editing window spanning positions C1 to C16 as opposed to positions C7 to C16 for STEME-1 (counting the distal end relative to the PAM as position 1; data not shown). Consistent with the broader mutation spectrum and expanded editing window of DBE-1, the inventors also found a greater number of alleles identified around the AAT target site compared to STEME-1 transfected cells. Collectively, these results indicate that DBE-1 can increase the diversity of mutations at the target site using a single guide RNA.
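To make the distinction between overall editing efficiency and mutation spectrum concrete, the following Python sketch tallies substitution types from amplicon reads. It is a simplified illustration, not the analysis pipeline used for the experiments: reads are assumed to be already aligned to the reference, of equal length, and free of indels, and the function name and toy reads are hypothetical.

```python
from collections import Counter


def substitution_spectrum(reference: str, reads: list[str]) -> tuple[Counter, float]:
    """Count substitution types (e.g. 'C-to-T') and the percentage of reads with any substitution."""
    spectrum: Counter = Counter()
    edited_reads = 0
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be aligned to the reference without indels")
        substitutions = [(ref, obs) for ref, obs in zip(reference, read) if ref != obs]
        if substitutions:
            edited_reads += 1
        for ref, obs in substitutions:
            spectrum[f"{ref}-to-{obs}"] += 1
    efficiency = 100.0 * edited_reads / len(reads) if reads else 0.0
    return spectrum, efficiency


# Toy data: a 4-nt reference and four reads, three of which carry substitutions.
spectrum, efficiency = substitution_spectrum("ACCA", ["ACCA", "ATCA", "AGCA", "ATTA"])
print(dict(spectrum), efficiency)
```

Two editors can share the same efficiency value while differing in the spectrum counter, which is the situation described for STEME-1 and DBE-1.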

Example 3: Development and Optimization of Cas12a Diversified Base Editors

[0163] Based on the Cas9 DBE-1 structure, a series of expression vectors encoding different Cas12a diversified base editors (Cas12a DBE) were constructed. Each of these expression constructs contained different modifications regarding the NLS configuration, the adenosine deaminase part, the cytosine deaminase part, the Cas12a part, and / or the protein linker that links the adenosine deaminase part to Cas12a (see Figure 3).

[0164] All Cas12a DBEs were optimized for expression in monocotyledonous plants and transcribed from the constitutive maize Ubi promoter. To test their base editing activities, each of the Cas12a DBE constructs was transfected into rice protoplasts together with a guide RNA expression construct containing the truncated glycine-tRNA and the spacer flanked by two mature direct repeats at its 5' and 3' ends. Some base editor constructs were tested in combination with the LbCas12a R1138A mutation, which was expected to disrupt base editing either through non-target strand nicking or through residual DSB nuclease activity (Yamano et al., 2016). The total base editing efficiency of the selected Cas12a DBE constructs as measured by amplicon deep sequencing is shown in Figure 3 and Table 1.

[0165]

Table 1

[0166] Table 1 shows the results of different LbCas12a DBE constructs at the OsAAT target site in rice protoplasts. The editing efficiencies of the different constructs are presented relative to that of LbCas12a-DBE-10 (see Figure 3, SEQ ID NO: 10). The LbCas12a-DBE constructs used are as follows:
- DBE-7: dpNLS (bipartite nuclear localization signal)-hA3A-48aa-XTEN-linker-monomeric TadA8e-32aa-XTEN-linker-LbCas12a(D156R / D832A)-dpNLS (SEQ ID NO: 7)
- DBE-10: dpNLS-hA3A-48aa-XTEN-linker-monomeric TadA8e-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS (SEQ ID NO: 10)
- DBE-11: dpNLS-hA3A-48aa-XTEN-linker-monomeric TadA9-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS (SEQ ID NO: 23)
- DBE-12: dpNLS-hA3A(Y130F)-48aa-XTEN-linker-monomeric TadA9-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS (SEQ ID NO: 25)
- DBE-13: dpNLS-hA3A-48aa-XTEN-linker-monomeric TadA9-(GGGGS)3-LbCas12a(D156R / D832A)-dpNLS (SEQ ID NO: 52)
- DBE-14: dpNLS-eUNG-hA3A-48aa-XTEN-linker-monomeric TadA8e-(GGGGS)5-LbCas12a(D156R / D832A)-dpNLS (SEQ ID NO: 27)
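Expressing editing efficiencies relative to a reference construct amounts to dividing each value by the reference value. The Python sketch below shows this normalization. The function name is hypothetical, the value for DBE-10 is the 16.4% average reported in [0169], and the second entry is a made-up placeholder, not a value from Table 1.

```python
def relative_to_reference(efficiencies: dict[str, float], reference: str) -> dict[str, float]:
    """Divide each editing efficiency by that of the reference construct."""
    reference_value = efficiencies[reference]
    if reference_value <= 0:
        raise ValueError("reference efficiency must be positive")
    return {name: value / reference_value for name, value in efficiencies.items()}


measured = {
    "DBE-10": 16.4,       # average editing efficiency (%) reported for construct 10
    "placeholder": 8.2,   # hypothetical value for illustration only
}
relative = relative_to_reference(measured, reference="DBE-10")
print(relative)
```

A relative value above 1 indicates a construct that edits more efficiently than DBE-10 at the same target site, and a value below 1 a less efficient one.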

[0167] Multiple rounds of optimization demonstrated that the following modifications could increase base diversification:
- Use of monomeric TadA adenine deaminase instead of dimeric TadA adenine deaminase
- Use of monomeric TadA9 deaminase instead of TadA8e deaminase
- Use of (GGGGS)5 as the linker between part (ii) and part (iii)
- Use of the bipartite NLS SEQ ID NO: 49 instead of three repeats of the SV40 NLS
- Use of dpNLS at both the N-terminus and C-terminus

[0168] The highest level of base editing was determined for construct 11 (SEQ ID NO: 23; see Table 1) containing a (GGGGS)5 linker that links TadA9 to catalytically inactive LbCas12a with a dual-nuclear localization signal (NLS) (SEQ ID NO: 49) at both the 5' and 3' termini, an hA3A cytosine deaminase domain, a monomeric TadA9 as an adenosine deaminase domain, and a D156R mutation internally.

[0169] The second highest level of base editing (average 16.4%) was determined for construct 10 (SEQ ID NO: 10; see Table 3) containing a (GGGGS)5 linker that links TadA8e to catalytically inactive LbCas12a with a dual NLS (SEQ ID NO: 49) at both the 5' and 3' termini, an hA3A cytosine deaminase domain, a monomeric TadA8e as an adenosine deaminase domain, and a D156R mutation internally.

[0170] It was previously hypothesized that base editing could be enhanced by introducing a nick into the target strand (Paul et al., 2021). However, introduction of the K932G / N933G mutations in the Cas12a domain decreased the efficiency of base substitution by strongly increasing indel formation (Construct 8; see Figure 3). Likewise, an amino acid change that was found to enhance Cas12a activity in mammalian cells, the substitution of glutamic acid at position 795 in Cas12a with leucine (E795L) (International Publication No. WO 2020 / 172502 A1), did not substantially increase the base substitution rate in rice protoplasts (Constructs 3, 6, and 9; see Figure 3), while introduction of the Y130F mutation in the hA3A domain or use of a tri-GGGGS linker between the adenosine deaminase domain and Cas12a decreased the editing rate (see Table 1). Interestingly, the efficiency of DBE-10 can be further enhanced through co-delivery with uracil DNA N-glycosylase (eUNG) derived from Escherichia coli expressed in trans from a strong 35S promoter (see Table 1), which suggests that the generation of abasic sites and subsequent base excision DNA repair promote target diversification by DBE. Further, in contrast to findings regarding Cas9 (Kurt et al., 2021), addition of eUNG to the N-terminus of Cas12a DBE had a strong negative impact on editing activity (see Table 1).

[0171] Further modifications are currently being tested, including the effect of fusing the C-terminal UGI and UNG domains to Cas12a DBE, the mutagenesis efficiency of the (GGGGS)6 linker between part (ii) and part (iii), the introduction of the W104A / P134Y mutation in the hA3A domain, and the effect of a non-sequence-specific ssDNA binding domain between part (ii) and part (iii) on the base substitution rate.

[0172] Example 4: LbCas12a-DBE Activity in Soybean and Rapeseed

To determine the activity of LbCas12a-DBE in dicotyledonous plants, additional experiments were performed using rapeseed (Brassica napus) and soybean (Glycine max) protoplasts. Rapeseed protoplasts were isolated from leaves of 4- to 7-week-old plants grown aseptically. Healthy leaves were cut into fine pieces with a sharp scalpel blade. The pieces were infiltrated with a cell wall-lysing enzyme solution (0.25% cellulase R10 and 0.25% macerozyme R10) and incubated overnight in the dark at 24 °C with gentle shaking (40 rpm). After enzymatic digestion, the released protoplasts were recovered by filtering the mixture through a 40-μm nylon mesh and resuspended in W5 solution. The resuspended protoplasts were placed on ice and allowed to sediment by gravity, after which the cell pellet was resuspended in MMG. For transformation, 200 μL of cells (2.5 × 10^5) were mixed with 20 μg of plasmid DNA and 220 μL of freshly prepared polyethylene glycol (PEG) solution. The mixture was incubated in the dark for 15-20 minutes. After removing the PEG solution, the protoplasts were resuspended in 2 mL of W5 solution and incubated at 24 °C. Soybean protoplasts were isolated from single cotyledon leaves of 6-day-old seedlings and transfected essentially as described for rapeseed, except that after removing the PEG solution the protoplasts were resuspended in 2 mL of WI solution.

[0173] Cas12a-DBE activity was first evaluated using two different reporter systems. The first reporter is activated upon C-to-T editing that requires changing codon 66 from CAC (histidine) to TAC (tyrosine; see SEQ ID NO: 53) to convert blue fluorescent protein (BFP) to green fluorescent protein (GFP). The second assay detects A-to-G editing of an inactivated GFP reporter that has an early stop codon internally resulting from the change of CGA (arginine) at codon 110 to TAG (see SEQ ID NO: 54). Editing of either the BFP or the inactivated GFP reporter will restore the GFP coding sequence and result in GFP fluorescence.
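The codon logic of the BFP reporter can be sketched in a few lines of Python. This is only an illustration, not part of the described method: the helper name `single_c_to_t_products` and the partial codon table are assumptions introduced here, and only the standard genetic code for the three codons involved is used.

```python
# Partial codon table (standard genetic code); only the codons needed here.
CODON_TO_AA = {"CAC": "His", "CAT": "His", "TAC": "Tyr"}


def single_c_to_t_products(codon):
    """Return {edited_codon: amino_acid} for every single C-to-T conversion.

    Hypothetical helper for illustration; each C in the codon is converted
    to T one at a time, mimicking a single cytosine deamination event.
    """
    products = {}
    for i, base in enumerate(codon):
        if base == "C":
            edited = codon[:i] + "T" + codon[i + 1:]
            products[edited] = CODON_TO_AA.get(edited, "not in table")
    return products


bfp_codon_66 = "CAC"  # histidine at codon 66 of the BFP reporter
products = single_c_to_t_products(bfp_codon_66)
for edited, amino_acid in products.items():
    print(f"{bfp_codon_66} ({CODON_TO_AA[bfp_codon_66]}) -> {edited} ({amino_acid})")
```

Of the two possible single C-to-T conversions in CAC, only the edit at the first position yields TAC (tyrosine); the edit at the third position gives CAT, which still encodes histidine and would therefore leave the reporter unchanged at the protein level.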

[0174] The rapeseed protoplasts were co-transfected with three vectors: (1) a vector encoding either BFP (SEQ ID NO: 53) or inactivated GFP (SEQ ID NO: 54), both containing engineered TTTC Cas12a PAM sites (due to the T62S substitution in BFP and the silent AAG to AAA mutation at K114 in GFP), (2) a Cas12a-DBE expression construct containing a penta-GGGGS linker (i.e., DBE-10; SEQ ID NO: 10) that links hA3A as the cytosine deaminase domain and TadA9 and TadA8e as the adenosine deaminase domains to the dLbCas12a (D156R) module located 3' of TadA8e, and (3) a vector encoding a Cas12a gRNA targeting either the BFP or the GFP reporter and containing two mature direct repeats (SEQ ID NO: 55; SEQ ID NO: 56) at the 5' and 3' of the spacer. The DBE-10 vector contained the Arabidopsis ubiquitin 10 promoter (SEQ ID NO: 57) for constitutive expression, while the expression of the gRNA was driven by the polymerase III-type promoter of the Arabidopsis U6 snRNA gene (SEQ ID NO: 58). As a positive control, the protoplasts were transfected with a construct expressing wild-type eGFP under the control of the strong cauliflower mosaic virus (CaMV) 35S promoter (SEQ ID NO: 59). As a negative control, the Cas12a-DBE-10 fusion protein was tested without the gRNA. Fluorescence imaging on the second day after transfection revealed approximately 35% GFP-fluorescent cells in the positive control, and 3.5% and 2.1% GFP-fluorescent cells with the BFP and dGFP reporters, respectively, in the presence of dCas12a-DBE-10 (see Figure 4A). Importantly, no GFP-positive cells were observed in the absence of the gRNA (data not shown).

[0175] To confirm Cas12a-DBE activity at endogenous sites, the pAtUbi10>DBE-10 expression construct was co-transfected into rapeseed or soybean protoplasts together with a Cas12a gRNA targeting the BnFAD2 (gRNA: SEQ ID NO: 60), BnALS3 (gRNA: SEQ ID NO: 61), or GmFAD2 (gRNA: SEQ ID NO: 62) gene. Transfected rapeseed protoplasts were cultured in alginate, and the editing efficiency was determined by deep amplicon sequencing 14 days after transfection. Soybean protoplasts, in contrast, were incubated in WI solution for 72 hours and analyzed by droplet digital PCR. As shown in Figure 4B, transfection of DBE-10 resulted in efficient editing of all three genes tested, with up to 4.5% of the NGS or ddPCR reads showing base changes from C to T and / or from A to G (1.51%, 2.66%, and 1.92% on average for BnFAD2, BnALS3, and GmFAD2, respectively). Combined with the data in rice protoplasts, these results indicate that Cas12a-DBE is active in both monocotyledonous and dicotyledonous plants.

[0176] Example 5: MS2 tagging for diversified base editors

To develop an MS2 tagging strategy for Cas12a-DBE, four different Cas12a guide RNAs with two MS2 stem-loops internally at the 5' end of the guide were designed (see Figures 5A and 5B; SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41). To test the effect of the additional MS2 stem-loop on the activity of the Cas12a-crRNA complex, in vitro digestion with purified OsAAT PCR products targeted by different guide RNA designs was performed. The sequence of the gRNA target site is listed as SEQ ID NO: 33.

[0177] A 25 μl reaction mixture was prepared by mixing 500 ng of purified OsAAT PCR substrate with 2 μl of pre-assembled Cas12a RNP (containing 29 picomoles of crRNA and 22 picomoles of protein) and 2.5 μl of 10x NEB buffer 2.1. The reaction mixture was incubated at 37 °C for 60 minutes, heat-inactivated at 85 °C for 2 minutes, and separated on a 1% agarose gel containing 1 / 100 (v / v) SYBR-Safe (Invitrogen). A shift in the position of the OsAAT PCR product indicates successful cleavage. As shown in Figure 6, all four MS2-modified guide RNA designs resulted in bands showing substrate cleavage similar to that seen in the positive control sample (i.e., unmodified guide RNA). Equivalent levels of indel formation were also found in protoplasts co-transfected with LbCas12a and either the untagged gRNA or one of the four MS2-modified variants (see Table 2).
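For orientation, the final concentrations implied by the stated amounts can be worked out as follows. This is a rough sketch that assumes the 29 pmol of crRNA and 22 pmol of protein are the total amounts present in the 25 μl reaction; the function and variable names are introduced here for illustration only.

```python
REACTION_VOLUME_UL = 25.0  # total reaction volume stated in the text


def micromolar(amount_pmol, volume_ul):
    """Convert pmol in a volume given in µl to µM (pmol/µl equals µmol/l)."""
    return amount_pmol / volume_ul


crrna_um = micromolar(29, REACTION_VOLUME_UL)    # crRNA, assumed total amount
protein_um = micromolar(22, REACTION_VOLUME_UL)  # Cas12a protein, assumed total amount
substrate_ng_per_ul = 500 / REACTION_VOLUME_UL   # OsAAT PCR substrate
buffer_fold = 10 * 2.5 / REACTION_VOLUME_UL      # 10x buffer diluted into the reaction
crrna_to_protein = 29 / 22                       # molar ratio of crRNA to protein

print(f"crRNA: {crrna_um:.2f} µM, protein: {protein_um:.2f} µM")
print(f"substrate: {substrate_ng_per_ul:.0f} ng/µl, buffer: {buffer_fold:.0f}x")
print(f"crRNA:protein molar ratio: {crrna_to_protein:.2f}")
```

Under that assumption the crRNA is in modest molar excess over the protein (about 1.3 to 1), and the buffer ends up at 1x.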

[0178]

Table 2

[0179] Table 2 shows the indel frequencies in protoplasts for the OsAAT target site induced by four different Cas12a-MS2 guide RNAs shown in Figure 5, compared to those induced by the untagged crRNA control.

[0180] After confirming that the addition of the MS2 stem-loop did not affect the cleavage activity of the Cas12a gRNA, the inventors evaluated the effect of MS2 tagging on the level of base editing. For this purpose, rice protoplasts were co-transfected with an expression construct containing catalytically inactive LbCas12a (D832A; SEQ ID NO: 63), one of four gRNAs having two MS2 hairpin binding sites, and a third vector encoding a fusion of bacteriophage MS2 coat protein (MCP) and the hA3A cytidine deaminase domain (SEQ ID NO: 43). The MCP coding sequence contained the N55K mutation that increases protein affinity for the MS2 stem-loop (Peabody, 1993). The base editing activities of different dCas12a-specific MS2-hA3A fusions were determined 3 days after transfection by amplicon deep sequencing and compared to that of Cas12a-DBE-10. Different MS2-gRNA designs showed various mutation efficiencies depending on the target gene, while the recruitment of hA3A via dCas12a generally improved the editing activity compared to that of the DBE-10 fusion protein (see Figure 7 and Table 3). The largest increase in editing was observed for the OsDEP1 target, where MS2-gRNA designs 3 and 4 (see Figure 5) resulted in average increases in editing efficiency of 8.25-fold and 7.42-fold, respectively, compared to DBE-10. Collectively, these results demonstrate that targeted recruitment of deaminases via MS2-modified gRNAs and catalytically inactive Cas12a can be utilized to enhance the level of targeted random mutagenesis in plants.
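The two quantities used in this comparison, the percentage of reads carrying base changes and the fold increase over a reference construct, can be computed as sketched below. The read counts are invented placeholders chosen only so that the arithmetic reproduces an 8.25-fold ratio; they are not data from the experiments, and the function names are assumptions made for this illustration.

```python
def editing_efficiency(edited_reads, total_reads):
    """Percentage of sequencing reads that carry at least one base change."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * edited_reads / total_reads


def fold_change(test_pct, reference_pct):
    """Ratio of a test efficiency to a reference efficiency."""
    if reference_pct <= 0:
        raise ValueError("reference efficiency must be positive")
    return test_pct / reference_pct


# Placeholder read counts (hypothetical, for illustration only).
reference_pct = editing_efficiency(edited_reads=200, total_reads=10_000)
ms2_design_pct = editing_efficiency(edited_reads=1_650, total_reads=10_000)

print(f"reference: {reference_pct:.2f}% of reads edited")
print(f"MS2 design: {ms2_design_pct:.2f}% of reads edited")
print(f"fold change: {fold_change(ms2_design_pct, reference_pct):.2f}")
```

Reporting a fold change next to the absolute percentages is useful because a large ratio can arise from a small reference value, so both numbers are needed to judge the practical gain.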

[0181]

Table 3

[0182] Table 3 shows the total base editing efficiency in rice protoplasts at the OsDEP1 and OsACC target sites of the dLbCas12a-specific MS2-hA3A fusion with four different Cas12a-MS2 guide RNA structures shown in Figure 5. Cas12a-DBE-10 refers to construct 10 (SEQ ID NO: 10) in Figure 3. The mutation efficiency is represented as the percentage of NGS reads with base changes.

[0183] Example 6: Use of Cas12a-DBE for the directed evolution of novel herbicide tolerance in rapeseed

Diversified base editors hold great promise for rapidly improving agronomic traits via directed protein evolution. To test the potential of Cas12a-DBE for evolving novel herbicide resistance, DBE-10 (SEQ ID NO: 10) was used for the directed evolution of acetohydroxyacid synthase (AHAS, EC 2.2.1.6) in rapeseed plants. AHAS, also known as acetolactate synthase (ALS), is the first enzyme in the pathway for the biosynthesis of the branched-chain essential amino acids valine, leucine, and isoleucine. AHAS inhibitor herbicides have been widely used since their first introduction in the early 1980s due to their broad-spectrum weed control at very low rates, low toxicity to mammals, and wide crop selectivity. Twelve Cas12a gRNAs targeting the ALS3 gene of rapeseed (Brassica napus) were designed, including six gRNAs with TTTV-3’ PAM sites and six gRNAs with TYTC-3’ PAM sites (see Table 4). To test the activity of the designed gRNAs, individual guides combined with LbCas12a (for targeting TTTV PAM) or LbCas12a-G532R / K595R (for targeting TYTC PAM) were transfected into rapeseed protoplasts. Amplicon deep sequencing showed high indel frequencies for most target sites (see Figure 8), so a proof-of-concept experiment was initiated in which rapeseed protoplasts were transformed with multiple ALS3-targeting gRNAs together with LbCas12a-DBE-10 or LbCas12a(G532R / K595R)-DBE-10. The transfected protoplasts were embedded in a 1% alginate layer and cultured at 24 °C for at least two weeks before being plated on modified MS medium containing a selective concentration of the AHAS inhibitor bispyribac sodium salt. Approximately 3-4 weeks after plating, the developed structures were transferred to MS regeneration medium, and individual shoots were sequenced to search for mutations conferring resistance.
Screening of shoots derived from protoplasts transformed with a gRNA library pooled with DBE-10 identified one resistant rapeseed strain that survived 1 nM bispyribac treatment (see Table 5). Sanger sequencing of this strain revealed two single-allele mutations, D358N and R359H. The former resulted from a single C-to-T change, and the latter was caused by both C-to-T and A-to-G conversions, showing simultaneous deaminase activity from DBE-10 (BnALS3_D358N / R359H coding sequence: SEQ ID NO: 76; BnALS3_D376N / R377H amino acid sequence: SEQ ID NO: 77). The D358N mutation (corresponding to D376N in Arabidopsis thaliana AHAS, i.e., AtAHAS) is a known artificially created resistance-conferring amino acid substitution, and R359H (corresponding to R377H in AtAHAS) has been previously demonstrated in resistant weeds (Yu and Powles, 2014). Both amino acid substitutions are predicted to cause protein structural changes that reduce the binding affinity of AHAS for bispyribac. As shown in Figure 9, bispyribac has three aromatic rings and was found to adopt a curved "S" shape conformation when bound to AtAHAS, with the pyrimidinyl group inserted deepest into the herbicide binding site (Garcia et al., 2017). One of the bispyribac methoxy groups contacts D376 of AtAHAS, while the carboxylic acid group of bispyribac forms a salt bridge with the side chain of R377. In summary, these results indicate the ability of our DBE to evolve novel herbicide-resistant alleles under selection pressure.
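Candidate target sites for the two PAM classes named above can be located with a simple pattern search. The sketch below expands the IUPAC codes V (A, C, or G) and Y (C or T) into regular expressions and scans one strand only. The 23-nt spacer length, the function names, and the toy sequence are assumptions made for illustration; they are not taken from Table 4 or from the BnALS3 sequence.

```python
import re

# IUPAC nucleotide codes needed for the two PAM classes in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "[ACG]", "Y": "[CT]"}


def pam_regex(pam):
    """Translate a PAM written with IUPAC codes into a regex fragment."""
    return "".join(IUPAC[base] for base in pam)


def find_protospacers(sequence, pam, spacer_len=23):
    """Return (pam_start, pam, protospacer) for each PAM followed by a spacer.

    Scans the given strand only. A lookahead is used so that overlapping
    sites are reported. spacer_len=23 is an assumed value.
    """
    pattern = re.compile(f"(?=({pam_regex(pam)})([ACGT]{{{spacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(sequence.upper())]


# Toy sequence (hypothetical): one TTTA site and one TCTC site.
spacer_1 = "ACGTACGTACGTACGTACGTACG"
spacer_2 = "GGCATGCATGCATGCATGCATGC"
toy = "GA" + "TTTA" + spacer_1 + "CC" + "TCTC" + spacer_2 + "AA"

tttv_sites = find_protospacers(toy, "TTTV")
tytc_sites = find_protospacers(toy, "TYTC")
print(tttv_sites)
print(tytc_sites)
```

A site such as TTTC satisfies both patterns, since C is allowed by V and T is allowed by Y, so the two lists can overlap on real sequences. Scanning the reverse complement as well would be needed to find sites on the opposite strand.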

[0184]

Table 4

[0185] Table 4 shows an overview of the different Cas12a gRNAs used for the targeted evolution of the BnALS3 gene in Brassica napus.

[0186]

Table 5

[0187] Table 5 shows the results of Cas12a DBE-mediated directed evolution experiments in rapeseed (Brassica napus) aimed at developing resistance to the AHAS-inhibiting herbicide bispyribac. Screening of protoplasts transformed with the pooled gRNA library (G1 to G6 in Table 4) and Cas12a DBE-10 (SEQ ID NO: 10) identified a herbicide-resistant strain with two amino acid substitutions in the BnALS3 gene.

[0188] References

Eid A, Alshareef S, Mahfouz MM. CRISPR base editors: genome editing without double-stranded breaks. Biochem J. 2018 Jun 11;475(11):1955-1964. doi:10.1042/BCJ20170793.
Fan J, Ding Y, Ren C, Song Z, Yuan J, Chen Q, Du C, Li C, Wang X, Shu W. Cytosine and adenine deaminase base-editors induce broad and nonspecific changes in gene expression and splicing. Commun Biol. 2021 Jul 16;4(1):882. doi:10.1038/s42003-021-02406-5.
Garcia MD, et al. Comprehensive understanding of acetohydroxyacid synthase inhibition by different herbicide families. Proceedings of the National Academy of Sciences of the United States of America vol. 114, 7 (2017): E1091-E1100. doi:10.1073/pnas.1616142114.
Gaudelli NM, Komor AC, Rees HA, Packer MS, Badran AH, Bryson DI, Liu DR. Programmable base editing of A·T to G·C in genomic DNA without DNA cleavage. Nature. 2017 Nov 23;551(7681):464-471. doi:10.1038/nature24644.
Gautier A, Juillerat A, Heinis C, Correa IR Jr, Kindermann M, Beaufils F, Johnsson K. An engineered protein tag for multiprotein labeling in living cells. Chem Biol. 2008 Feb;15(2):128-36. doi:10.1016/j.chembiol.2008.01.007.
Hussain AF, Amoury M, Barth S. SNAP-tag technology: a powerful tool for site specific conjugation of therapeutic and imaging agents. Curr Pharm Des. 2013;19(30):5437-42. doi:10.2174/1381612811319300014.
Inobe T, Nukina N. Rapamycin-induced oligomer formation system of FRB-FKBP fusion proteins. J Biosci Bioeng. 2016 Jul;122(1):40-6. doi:10.1016/j.jbiosc.2015.12.004.
Jeong YK, Song B, Bae S. Current Status and Challenges of DNA Base Editing Tools. Mol Ther. 2020 Sep 2;28(9):1938-1952. doi:10.1016/j.ymthe.2020.07.021.
Komor A, Kim Y, Packer M, et al. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424 (2016). https://doi.org/10.1038/nature17946.
Komor AC, Zhao KT, Packer MS, Gaudelli NM, Waterbury AL, Koblan LW, Kim YB, Badran AH, Liu DR. Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci Adv. 2017 Aug 30;3(8):eaao4774. doi:10.1126/sciadv.aao4774.
Kurt IC, et al. CRISPR C-to-G base editors for inducing targeted DNA transversions in human cells. Nature biotechnology vol. 39, 1 (2021): 41-46. doi:10.1038/s41587-020-0609-x.
Lange A, McLane LM, Mills RE, Devine SE, Corbett AH. Expanding the definition of the classical bipartite nuclear localization signal. Traffic. 2010 Mar;11(3):311-23. doi:10.1111/j.1600-0854.2009.01028.x.
Li C, Zhang R, Meng X, et al. Targeted, random mutagenesis of plant genes with dual cytosine and adenine base editors. Nat Biotechnol 38, 875-882 (2020). https://doi.org/10.1038/s41587-019-0393-7.
Los GV, Encell LP, McDougall MG, Hartzell DD, Karassina N, Zimprich C, Wood MG, Learish R, Ohana RF, Urh M, Simpson D, Mendez J, Zimmerman K, Otto P, Vidugiris G, Zhu J, Darzins A, Klaubert DH, Bulleit RF, Wood KV. HaloTag: a novel protein labeling technology for cell imaging and protein analysis. ACS Chem Biol. 2008 Jun 20;3(6):373-82. doi:10.1021/cb800025k.
Paul B, Chaubet L, Verver DE, Montoya G. Mechanics of CRISPR-Cas12a and engineered variants on λ-DNA. Nucleic Acids Res. 2021 Dec 24: gkab1272. doi:10.1093/nar/gkab1272.
Peabody DS. The RNA binding site of bacteriophage MS2 coat protein. The EMBO journal vol. 12, 2 (1993): 595-600. doi:10.1002/j.1460-2075.1993.tb05691.x.
Rees HA, Liu DR. Base editing: precision chemistry on the genome and transcriptome of living cells. Nat Rev Genet. 2018 Dec;19(12):770-788. doi:10.1038/s41576-018-0059-1. Erratum in: Nat Rev Genet. 2018 Oct 19.
Sambrook J, Fritsch EF, Maniatis T. (1989). Molecular Cloning: A Laboratory Manual (2nd ed.). Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Savva YA, Rieder LE, Reenan RA. The ADAR protein family. Genome Biol. 2012 Dec 28;13(12):252. doi:10.1186/gb-2012-13-12-252.
Shan Q, Wang Y, Li J, Gao C. Genome editing in rice and wheat using the CRISPR/Cas system. Nat Protoc. 2014 Oct;9(10):2395-410. doi:10.1038/nprot.2014.157. Epub 2014 Sep 18. PMID:25232936.
Slaymaker IM, Gao L, Zetsche B, Scott DA, Yan WX, Zhang F. Rationally engineered Cas9 nucleases with improved specificity. Science. 2016 Jan 1;351(6268):84-8. doi:10.1126/science.aad5227.
Yamano T, Nishimasu H, Zetsche B, Hirano H, Slaymaker IM, Li Y, Fedorova I, Nakane T, Makarova KS, Koonin EV, Ishitani R, Zhang F, Nureki O. Crystal Structure of Cpf1 in Complex with Guide RNA and Target DNA. Cell. 2016 May 5;165(4):949-62. doi:10.1016/j.cell.2016.04.003.
Yan D, Ren B, Liu L, Yan F, Li S, Wang G, Sun W, Zhou X, Zhou H. High-efficiency and multiplex adenine base editing in plants using new TadA variants. Mol Plant. 2021 May 3;14(5):722-731. doi:10.1016/j.molp.2021.02.007.
Yu Q, Powles SB. Resistance to AHAS inhibitor herbicides: current understanding. Pest management science vol. 70, 9 (2014): 1340-50. doi:10.1002/ps.3710.
Zhang X, Chen L, Zhu B, Wang L, Chen C, Hong M, Huang Y, Li H, Han H, Cai B, Yu W, Yin S, Yang L, Yang Z, Liu M, Zhang Y, Mao Z, Wu Y, Liu M, Li D. Increasing the efficiency and targeting range of cytidine base editors through fusion of a single-stranded DNA-binding protein domain. Nat Cell Biol. 2020 Jun;22(6):740-750. doi:10.1038/s41556-020-0518-8.

Claims

1. A method for targeted diversified base editing of at least one target nucleic acid segment, the method including:
(a) providing at least one cell or construct comprising at least one target nucleic acid segment;
(b) introducing into the target cell, or contacting the target construct with, (i) at least one diversified base editor (DBE), or at least one nucleic acid molecule encoding it, and (ii) at least one suitable guide RNA, or at least one nucleic acid molecule encoding it;
(c) enabling the formation of a complex of (i) the at least one diversified base editor and (ii) the at least one suitable guide RNA; and
(d) obtaining at least one cell or construct containing at least one modified target nucleic acid segment;
wherein the total base editing efficiency of introducing at least one substitution of any type into at least an on-target nucleic acid segment is at least 0.2%, 0.5%, 1%, 5%, 10%, 15%, 20%, or at least 25%, with an upper limit of 100% or less; and / or at least one modification of the target nucleic acid segment occurs within an extended base editing window;
wherein the method does not include a method for treating and / or diagnosing the human or animal body by surgical or therapeutic procedures performed on the human or animal body, and / or a process for altering the genetic identity of the human germline; and
wherein the diversified base editor comprises a CRISPR-Cas moiety derived from a class 2 type V CRISPR-Cas endonuclease.

2. The method according to claim 1, wherein the diversified base editor comprises a CRISPR-Cas moiety derived from Cas12a endonuclease.

3. The method according to claim 1 or 2, wherein at least one of the target cells is a prokaryotic cell including a bacterial cell or an archaeal cell, or a eukaryotic cell including an insect cell, a mammalian cell or a plant cell.

4. The method according to claim 1, wherein at least one of the target cells is a plant cell, including a plant protoplast.

5. The method according to claim 1, wherein the at least one diversified base editor includes: (i) one or more cytosine deaminase moieties, (ii) one or more adenine deaminase moieties, (iii) one or more CRISPR-Cas moieties (preferably, the CRISPR-Cas domain does not cleave both strands of the double-stranded DNA), (iv) 1, 2, 3 or more nuclear localization sequences; and (v) at least one linker region, preferably one or more linker regions between (i) and (ii), and optionally one or more linker regions between (ii) and (iii).

6. The method according to claim 5, wherein the at least one diversified base editor of step (b)(i) is in the form of a fusion protein, preferably having portions (i), (ii), and (iii) as defined in claim 5 arranged in the order (i)-(ii)-(iii) from N-terminus to C-terminus, together with one or more linker regions between each portion, and more preferably having one, two, three or more nuclear localization sequences (iv) located at the C-terminus of the diversified base editor, or having one or more nuclear localization sequences (iv) located at the N-terminus and one or more nuclear localization sequences (iv) located at the C-terminus of the diversified base editor.

7. The method according to claim 1, wherein the diversified base editor comprises at least one further portion, preferably wherein the at least one further portion is selected from ssDNA-, ssRNA-, or dsRNA-binding protein portions, including an MS2 protein portion, an affinity tag-binding protein, a uracil glycosylase inhibitor portion and / or a uracil glycosylase portion, or any combination thereof.

8. The method according to claim 1, wherein the one or more adenine deaminase moieties and / or the one or more cytosine deaminase moieties are linked to at least one ssRNA or dsRNA binding protein moiety, preferably at least one MS2 protein moiety, and the at least one suitable guide RNA is adapted to enable interaction with the at least one ssRNA or dsRNA binding protein moiety, preferably wherein the one or more adenine base editor moieties and / or the one or more cytosine base editor moieties are linked to at least one MS2 protein moiety, preferably two MS2 protein moieties, and the suitable guide RNA is adapted to include two MS2 stem-loops, and optionally the suitable guide RNA includes a sequence selected from SEQ ID NOs: 38 to 41, or a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% sequence identity thereto.

9. The method according to claim 1, wherein the diversified base editor includes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% sequence identity with an amino acid molecule selected from any one of SEQ ID NOs: 1 to 27 or their respective reference sequences.

10. Edited cells, tissues, organs, materials, or whole organisms obtained or obtainable by the method described in claim 1.

11. A diversified base editor, or a diversified base editor complex further comprising at least one suitable guide RNA, or at least one nucleic acid molecule encoding the same, wherein the diversified base editor is as defined in claim 5.

12. A vector or expression construct, or two or more vectors or expression constructs, wherein each vector and / or expression construct comprises at least one nucleic acid molecule as described in claim 11, and different portions of the diversified base editor are encoded on the same vector or expression construct or on different vectors or expression constructs, and / or the diversified base editor, or a portion thereof, and the at least one suitable guide RNA are encoded on the same vector or expression construct or on different vectors or expression constructs.

13. A cell comprising at least one diversified base editor or at least one diversified base editor complex as described in claim 11, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct as described in claim 12; wherein the cell is a prokaryotic cell including a bacterial cell or an archaeal cell, or a eukaryotic cell including an insect cell, a mammalian cell including a human cell, or a plant cell including a plant protoplast, preferably the cell is a plant cell including a plant protoplast, and optionally the plant cell including the plant protoplast is of a plant belonging to the superfamily Viridiplantae (green plants), particularly Acer sp., Actinidia sp., Abelmoschus sp., Agave sisalana, Agropyron, creeping bentgrass (Agrostis stolonifera), Allium spp., Amaranthus spp., Ammophila arenaria, pineapple (Ananas comosus), Annona spp., celery (Apium graveolens), Arachis spp., Artocarpus spp., asparagus (Asparagus officinalis), Avena spp. (for example, oats (Avena sativa), wild oat (Avena fatua), Byzantine oat (Avena byzantina), Avena fatua var. sativa, Avena hybrida), star fruit (Averrhoa carambola), Bambusa sp., winter melon (Benincasa hispida), Brazil nut (Bertholletia excelsa), sugar beet (Beta vulgaris), Brassica sp. (for example, rapeseed (Brassica napus), Brassica rapa ssp. [canola, oilseed rape, turnip rape]), Cadaba farinosa, Camellia sinensis, Canna indica, Cannabis sativa, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea, kapok (Ceiba pentandra), endive (Cichorium endivia), cinnamon species (Cinnamomum spp.), watermelon (Citrullus lanatus), mandarin species (Citrus spp.), coconut species (Cocos spp.), coffee species (Coffea spp.), taro (Colocasia esculenta), cola species (Cola spp.), jute species (Corchorus sp.), coriander (Coriandrum sativum), hazelnut species (Corylus spp.), hawthorn species (Crataegus), saffron (Crocus sativus), pumpkin species (Cucurbita spp.), cucumber species (Cucumis spp.), artichoke species (Cynara spp.), carrot (Daucus carota), desmodium species (Desmodium spp.), longan (Dimocarpus longan), yam species (Dioscorea spp.), persimmon species (Diospyros spp.), barnyard millet species (Echinochloa spp.), oil palm (Elaeis) (for example, Guinea oil palm (Elaeis guineensis), American oil palm (Elaeis oleifera)), finger millet (Eleusine coracana), teff (Eragrostis tef), Erianthus species (Erianthus sp.), loquat (Eriobotrya), soybean, cotton (Gossypium hirsutum), sunflower species (Helianthus sp.) (e.g., sunflower (Helianthus annuus)), daylily (Hemerocallis fulva), hibiscus species (Hibiscus sp.), barley species (Hordeum sp.) (e.g., barley (Hordeum vulgare)), sweet potato (Ipomoea batatas), walnut species (Juglans sp.), lettuce (Lactuca sativa), Lathyrus species (Lathyrus sp.), lentil (Lens culinaris), flax (Linum usitatissimum), Litchi chinensis, Lotus species, Luffa acutangula, Lupinus species, Luzula sylvatica, Lycopersicon species (e.g., Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme), Macrotyloma species, Malus sp., acerola (Malpighia emarginata), Mammea americana, mango (Mangifera indica), Manihot sp., sapodilla (Manilkara zapota), Medicago sativa, Melilotus sp., Mentha sp., Miscanthus sinensis, Momordica sp., Morus nigra, Musa sp., Nicotiana sp., Olea sp., Opuntia sp., Ornithopus sp., Oryza sp. (for example, rice (Oryza sativa), Oryza latifolia), millet (Panicum miliaceum), switchgrass (Panicum virgatum), passion fruit (Passiflora edulis), parsnip (Pastinaca sativa), Pennisetum sp., Persea sp., parsley (Petroselinum crispum), Phalaris arundinacea, bean (Phaseolus sp.), timothy grass (Phleum pratense), date palm (Phoenix), reed (Phragmites australis), Physalis spp., pine (Pinus spp.), pistachio (Pistacia vera), pea (Pisum spp.), bluegrass (Poa spp.), poplar (Populus spp.), Prosopis spp., cherry (Prunus spp.), guava (Psidium spp.), pomegranate (Punica granatum), European pear (Pyrus communis), oak (Quercus), radish (Raphanus sativus), rhubarb (Rheum rhabarbarum), gooseberry species (Ribes spp.), castor bean (Ricinus communis), raspberry species (Rubus spp.), sugarcane species (Saccharum), willow species (Salix sp.), elderberry species (Sambucus sp.), rye (Secale cereale), sesame species (Sesamum sp.), white mustard species (Sinapis sp.), eggplant species (Solanum sp.) (e.g., potato (Solanum tuberosum), red eggplant (Solanum integrifolium) or tomato (Solanum lycopersicum)), sorghum (Sorghum bicolor), spinach species (Spinacia sp.), myrtle species (Syzygium sp.), marigold species (Tagetes), tamarind (Tamarindus indica), cacao (Theobroma cacao), Trifolium species (Trifolium spp.), Tripsacum dactyloides, Triticosecale rimpaui, Triticum species (Triticum spp.) (for example, bread wheat (Triticum aestivum), durum wheat (Triticum durum), macaroni wheat (Triticum turgidum), Triticum hybernum, Triticum macha, Triticum sativum, einkorn wheat (Triticum monococcum) or Triticum vulgare), Tropaeolum minus, nasturtium (Tropaeolum majus), Vaccinium species, Vicia species, Vigna species, sweet violet (Viola odorata), Vitis species, corn (Zea mays), wild rice (Zizania palustris), or Ziziphus species, including plant cells of monocotyledonous and dicotyledonous plants, such as leguminous plants for feed or fodder, ornamental plants, food crops, trees or shrubs, or a cell derived therefrom.

14. A kit comprising at least one diversified base editor or at least one diversified base editor complex according to claim 11, or at least one nucleic acid molecule encoding the same; or at least one vector or expression construct according to claim 12.

15. Use of at least one vector or expression construct according to claim 11 for targeted directed evolution of at least one target nucleic acid segment, preferably for targeted directed evolution of at least one target nucleic acid segment in a plant, including use for optimizing or modifying traits in a plant, including optimization or modification of yield-related traits, of disease or pathogen resistance-related traits, wherein the disease is caused by, or the pathogen is selected from, a virus, bacterium, fungus, nematode, or insect, or of abiotic stress-related traits including herbicide resistance-related traits or salinity or drought stress-related traits, further including use for identification of at least one lead gene.