Multiple vector system and its use

CN107466325B · Active · Publication Date: 2026-06-30 · FONDAZIONE TELETON

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
FONDAZIONE TELETON
Filing Date
2016-03-03
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing adeno-associated virus (AAV) vectors have a limited packaging capacity, which makes it difficult to deliver genes larger than 5 kb effectively. In addition, binary AAV vectors generate truncated protein products, which may compromise their safety and efficiency.

Method used

By employing a multiple vector system, degradation signals such as ubiquitination signals, microRNA target sequences, and artificial stop codons are introduced into the vectors to reduce the expression of truncated proteins and increase the generation of full-length proteins.

Benefits of technology

The approach significantly reduces the expression of truncated proteins, increases the yield of full-length proteins, and enhances the effectiveness and safety of gene therapy.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN107466325B_ABST
Patent Text Reader

Abstract

This invention relates to constructs, vectors, corresponding host cells, and pharmaceutical compositions that enable effective gene therapy, particularly for genes larger than 5 kb.

Description

Technical Field

[0001] This invention relates to constructs, vectors, corresponding host cells, and pharmaceutical compositions that enable effective gene therapy, particularly for genes larger than 5 kb.

Background of the Invention

[0003] Restorative treatment for many inherited retinal degenerations (IRDs) remains a major unmet medical need. To date, gene therapy using adeno-associated virus (AAV) vectors represents the most promising approach for treating many IRDs. Indeed, years of preclinical studies and numerous clinical trials for various IRDs have established the ability of AAVs to efficiently deliver therapeutic genes to the diseased layers of the retina [photoreceptors (PR) and retinal pigment epithelium (RPE)], 1,2 and have highlighted their excellent safety and efficacy profile in humans. 3-7 However, one obstacle to extending this success to other blinding conditions is the packaging capacity of AAV vectors (approximately 5 kb). This has become a limiting factor in the development of gene replacement therapies for common IRDs caused by mutations in genes with coding sequences (CDS) larger than 5 kb (hereinafter also referred to as large genes).

[0004] Therefore, in recent years, there has been considerable interest in identifying strategies to increase AAV carrying capacity. Binary AAV vectors, which rely on the ability of AAV genomes to concatemerize through intermolecular recombination, have been successfully used to address this issue. 14-16 Binary AAV vectors are generated by splitting a large transgene expression cassette into two halves, each packaged separately in a single normal-size (NS; <5 kb) AAV vector. The full-length expression cassette is reconstituted when both binary AAV vectors co-infect the same cell, through one of the following: i) inverted terminal repeat (ITR)-mediated tail-to-head concatemerization of the two vector genomes, followed by splicing (binary AAV trans-splicing, TS); 15 ii) homologous recombination between overlapping regions contained in the two vector genomes (binary AAV overlapping, OV); 15 or iii) a combination of the two (binary AAV hybrid). 16 The inventors and others have recently demonstrated the potential of binary AAV vectors in the retina. 14,17-19 The most commonly used recombination-initiating region for binary AAV hybrid vectors is an 872 bp sequence derived from the middle third of the human alkaline phosphatase cDNA, which has been shown to provide high levels of binary AAV hybrid vector reconstitution. 16 The inventors demonstrated that binary AAV hybrid vectors including the AK sequence outperform those including the alkaline phosphatase head region sequence; 14 those vectors were generated based on the description by Ghosh et al. 22 Additional studies have shown that the level of transgene reconstitution provided by the head or the tail of this alkaline phosphatase region is comparable to that achieved using the full-length middle third of the alkaline phosphatase cDNA. 22 Similarly, the inventors found that binary AAV trans-splicing and hybrid AK vectors (which contain the short AK recombination-initiating sequence from the F1 phage) efficiently transduce mouse and pig retinas and rescue mouse models of Stargardt disease (STGD) and Usher 1B (USH1B). 14,19 The level of PR transduction achieved using binary AAV TS and hybrid AK vectors led to a significant improvement in the retinal phenotype of IRD mouse models and may be effective in treating inherited blindness. Furthermore, vectors with heterologous ITRs from serotypes 2 and 5 (ITR2 and ITR5, respectively), which are highly divergent (58% homology 23), show both a reduced ability to form circular monomers and an increased tendency for directional tail-to-head concatemerization compared with vectors with homologous ITRs. 24 Based on this, Yan et al. have shown that binary AAV vectors with heterologous ITR2 and ITR5 reconstitute transgene expression more efficiently than binary AAV vectors with homologous ITRs. 24,25

[0005] While these studies highlight the potential of binary AAV vectors for large gene reconstruction in tissues of interest, such as the retina, key issues have also been identified that need to be addressed before considering further clinical translation of this strategy.

[0006] A major problem associated with the use of binary AAV vectors is the generation of truncated protein products, which derive from the 5' half-vector (which contains the promoter sequence) and / or from the 3' half-vector, owing to the low promoter activity of the ITRs. 14,17,20,21 To date, no formal toxicity studies have been conducted to evaluate the potential adverse effects of these truncated products in vivo, which raises safety concerns. There is therefore a strong desire to reduce or eliminate their formation, and the object of this invention is to solve this major problem associated with the use of binary vector systems.

Summary of the Invention

[0007] This invention relates to constructs, vectors, corresponding host cells, and pharmaceutical compositions that enable effective gene therapy, particularly for genes larger than 5 kb.

[0008] Large genes include, for example:

[0009]

[0010]

[0011] Stargardt disease (STGD1; MIM#248200) is the most common form of inherited macular degeneration. It is caused by mutations in ABCA4 (CDS: 6822 bp), a gene that encodes the photoreceptor-specific all-trans retinal transporter. 8,9 Cone-rod dystrophy type 3, macular degeneration, age-related macular degeneration type 2, early-onset severe retinal dystrophy, and retinitis pigmentosa type 19 are also associated with ABCA4 mutations (ABCA4-related diseases). Usher syndrome type IB (USH1B; MIM#276900) is the most severe combination of deafness and retinitis pigmentosa. It is caused by mutations in MYO7A (CDS: 6648 bp), 10 a gene that encodes an actin-based motor expressed in both the PR and the RPE of the retina. 11-13
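The size mismatch described above can be checked with simple arithmetic. The following is only a rough sketch: it treats the approximately 5 kb AAV limit as a hard cap of 5000 bp and assumes, unrealistically, that the whole capacity is available for coding sequence (ITRs, promoter, and polyadenylation signal also take up space). The function name `min_vectors` is illustrative and does not come from the patent.

```python
import math

# Approximate AAV packaging limit quoted in the text (~5 kb),
# treated here as a hard cap for illustration only.
AAV_CAPACITY_BP = 5000

# CDS lengths quoted in the text.
CDS_LENGTH_BP = {"ABCA4": 6822, "MYO7A": 6648}


def min_vectors(cds_bp: int, capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Lower bound on the number of vectors needed for a CDS.

    Assumes the full capacity could hold coding sequence, which
    overestimates the space really available in each vector.
    """
    if cds_bp <= 0 or capacity_bp <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(cds_bp / capacity_bp)


for gene, length in CDS_LENGTH_BP.items():
    print(gene, length, "bp, lower bound on vectors:", min_vectors(length))
```

Both genes exceed the single-vector limit under this simplification, which is why the text turns to systems of two or more vectors.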

[0012] In addition, many other genetic diseases (not necessarily causing retinal symptoms) are also attributed to mutations in large genes. These include, for example, Duchenne muscular dystrophy caused by mutations in the DMD gene, cystic fibrosis caused by mutations in the CFTR gene, hemophilia A caused by mutations in the F8 gene, and Dysferlin myopathy caused by mutations in the DYSF gene.

[0013] Specifically, the objective of this invention is to reduce the expression of truncated protein products associated with multiplex vector systems (preferably multiplex viral vector systems) by utilizing signals that mediate protein degradation or prevent their translation (hereinafter referred to as degradation signals). Degradation signals have never been used in multiplex viral vectors before. This invention surprisingly discovers that when a degradation signal is present in at least one vector of a multiplex vector system, the expression of truncated protein forms is significantly reduced, resulting in higher yields of full-length protein.

[0014] Therefore, a first aspect of the present invention provides a vector system for expressing the coding sequence of a gene of interest in cells, the coding sequence comprising a first portion and a second portion, the vector system comprising:

[0015] a) A first vector, which comprises:

[0016] - the first portion (CDS1) of the coding sequence,

[0017] - a first reconstruction sequence; and

[0018] b) A second vector, which comprises:

[0019] - the second portion (CDS2) of the coding sequence,

[0020] - a second reconstruction sequence,

[0021] wherein the first and second reconstruction sequences are selected from the following group:

[0022] i) The first reconstruction sequence consists of the 3' end of the first portion of the coding sequence, and the second reconstruction sequence consists of the 5' end of the second portion of the coding sequence, wherein the first and second reconstruction sequences are overlapping sequences; or

[0023] ii) The first reconstruction sequence contains a splice donor signal (SD), and the second reconstruction sequence contains a splice acceptor signal (SA); optionally, each of the first and second reconstruction sequences further contains a recombination-initiating sequence;

[0024] characterized in that one or both of the first and second vectors further comprise a nucleotide sequence of a degradation signal, said sequence being located at the 3' end of CDS1 and / or the 5' end of CDS2 in case i), and at the 3' position relative to SD and / or the 5' position relative to SA in case ii).

[0025] Preferably, both the first and second vectors further contain the nucleotide sequence of the degradation signal, wherein the nucleotide sequence of the degradation signal in the first vector is the same as or different from that in the second vector.

[0026] Preferably, the first reconstructed sequence contains a splice donor signal (SD) and a recombination initiating region at the 3' position relative to the SD, and the second reconstructed sequence contains a splice acceptor signal (SA) and a recombination initiating sequence at the 5' position relative to the SA; wherein the nucleotide sequence of the degradation signal is located at the 5' end and / or 3' end of the nucleotide sequence of the recombination initiating region of one or both of the first and second vectors.

[0027] Preferably, the nucleotide sequence of the degradation signal is selected from: one or more protein ubiquitination signals, one or more microRNA target sequences, and / or one or more artificial stop codons.

[0028] Preferably, the nucleotide sequence of the degradation signal comprises or consists of a sequence encoding a sequence selected from CL1 SEQ ID No.1, CL2 SEQ ID No.2, CL6 SEQ ID No.3, CL9 SEQ ID No.4, CL10 SEQ ID No.5, CL11 SEQ ID No.6, CL12 SEQ ID No.7, CL15 SEQ ID No.8, CL16 SEQ ID No.9, SL17 SEQ ID No.10, or PB29 (SEQ ID No.14 or SEQ ID No.15); or, the nucleotide sequence of the degradation signal comprises or consists of a sequence selected from miR-204 SEQ ID No.11, miR-124 SEQ ID No.12, or miR-26a SEQ ID No.13.

[0029] Preferably, the nucleotide sequence of the degradation signal of the first vector comprises or consists of: a sequence encoding CL1 SEQ ID No. 1; or SEQ ID No. 16; or miR-204 SEQ ID No. 11 and miR-124 SEQ ID No. 12, preferably three copies of miR-204 SEQ ID No. 11 and three copies of miR-124 SEQ ID No. 12; or miR-26a SEQ ID No. 13, preferably four copies of miR-26a SEQ ID No. 13.

[0030] Preferably, the nucleotide sequence of the degradation signal of the second vector comprises or consists of a sequence encoding PB29 (SEQ ID No. 14 or SEQ ID No. 15), or comprises or consists of SEQ ID No. 19 or SEQ ID No. 20. Preferably, the degradation signal of the second vector comprises or consists of a sequence encoding three copies of PB29 of SEQ ID No. 14 or SEQ ID No. 15.

[0031] Preferably, the first vector further includes a promoter sequence operatively linked to the 5' end portion of the first portion (CDS1) of the coding sequence.

[0032] Preferably, both the first and second vectors further comprise a 5' terminal repeat (5'-TR) nucleotide sequence and a 3' terminal repeat (3'-TR) nucleotide sequence, preferably the 5'-TR is a 5'-inverted terminal repeat (5'-ITR) nucleotide sequence and the 3'-TR is a 3'-inverted terminal repeat (3'-ITR) nucleotide sequence, preferably the ITR is derived from the same viral serotype or from different viral serotypes, preferably the virus is AAV.

[0033] Preferably, the recombination initiating sequence is selected from the following group: AK, namely GGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAAT (SEQ ID No. 22) or

[0034] GGGATTTTTCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAAT (SEQ ID NO. 23), AP1 (SEQ ID NO. 24), AP2 (SEQ ID NO. 25) and AP (SEQ ID NO. 26).

[0035] Preferably, the coding sequence is divided into a first part and a second part at the natural exon-exon junction.

[0036] Preferably, the splice donor signal comprises or consists substantially of a sequence that is at least 70%, 75%, 80%, 85%, 90%, 95%, or 100% identical to GTAAGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACTCTTGCGTTTCT (SEQ ID No. 27).

[0037] Preferably, the splice acceptor signal comprises or is substantially composed of a sequence that is at least 70%, 75%, 80%, 85%, 90%, 95%, or 100% identical to GATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCACAG (SEQ ID No. 28).
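The identity thresholds in the two paragraphs above (at least 70% up to 100%) can be illustrated with a position-by-position comparison against SEQ ID No. 27 and SEQ ID No. 28. This is a simplified sketch: it only handles candidates of the same length as the reference and does no alignment, whereas sequence identity is normally computed from an alignment that allows gaps. The function names are illustrative and are not taken from the patent.

```python
# Reference sequences as given in the text.
SD_REF = (
    "GTAAGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACTCTTGCGTTTCT"
)  # SEQ ID No. 27, splice donor signal
SA_REF = "GATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCACAG"  # SEQ ID No. 28, splice acceptor signal


def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped percent identity between two sequences of equal length."""
    if not reference:
        raise ValueError("reference must not be empty")
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_threshold(candidate: str, reference: str, threshold: float = 70.0) -> bool:
    """True if the candidate reaches the given identity threshold (in percent)."""
    return percent_identity(candidate, reference) >= threshold


# A variant of the donor signal with one substituted base still passes at 95%.
variant = "C" + SD_REF[1:]
print(round(percent_identity(variant, SD_REF), 2), meets_threshold(variant, SD_REF, 95.0))
```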

[0038] Preferably, the first vector further comprises at least one enhancer nucleotide sequence operatively linked to the coding sequence.

[0039] Preferably, the coding sequence encodes a protein capable of correcting retinal degeneration.

[0040] Preferably, the coding sequence encodes a protein that can correct Duchenne muscular dystrophy, cystic fibrosis, hemophilia A, or Dysferlin myopathy.

[0041] In the case of retinal degeneration, the coding sequence is preferably a coding sequence of a gene selected from the group consisting of: ABCA4, MYO7A, CEP290, CDH23, EYS, PCDH15, CACNA1, SNRNP200, RP1, PRPF8, RP1L1, ALMS1, USH2A, GPR98, and HMCN1.

[0042] In the case of Duchenne muscular dystrophy, cystic fibrosis, hemophilia A, or Dysferlin myopathy, the coding sequence is preferably a coding sequence of a gene selected from the group consisting of DMD, CFTR, F8, and DYSF.

[0043] Preferably, the first vector does not contain a polyadenylation signal nucleotide sequence.

[0044] Preferably, the vector system comprises:

[0045] a) A first vector, which comprises, in the 5'-3' direction:

[0046] - a 5' inverted terminal repeat (5'-ITR) sequence;

[0047] - a promoter sequence;

[0048] - the 5' end portion (CDS1) of the coding sequence of the gene of interest, which is operatively linked to and controlled by the promoter;

[0049] - the nucleotide sequence of the splice donor signal;

[0050] - the nucleotide sequence of the recombination-initiating region; and

[0051] - a 3' inverted terminal repeat (3'-ITR) sequence; and

[0052] b) A second vector, which comprises, in the 5'-3' direction:

[0053] - a 5' inverted terminal repeat (5'-ITR) sequence;

[0054] - the nucleotide sequence of the recombination-initiating region;

[0055] - the nucleotide sequence of the splice acceptor signal;

[0056] - the 3' end portion (CDS2) of the coding sequence;

[0057] - a polyadenylation signal nucleotide sequence; and

[0058] - a 3' inverted terminal repeat (3'-ITR) sequence,

[0059] characterized in that the vector system further comprises a nucleotide sequence of a degradation signal, said sequence being located at the 5' or 3' end of the nucleotide sequence of the recombination-initiating region of one or both of the first and second vectors.
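The element order listed above, and the placement rule for the degradation signal, can be written down as ordered lists. This is a minimal sketch that models each vector as a list of element labels only; the labels and the helper `add_degradation_signal` are illustrative names chosen here, not terminology fixed by the patent, and no real sequences are involved.

```python
REGION = "recombination-initiating region"
SIGNAL = "degradation signal"

# Element order in the 5'-3' direction, following the list in the text.
FIRST_VECTOR = ["5'-ITR", "promoter", "CDS1", "splice donor", REGION, "3'-ITR"]
SECOND_VECTOR = ["5'-ITR", REGION, "splice acceptor", "CDS2", "polyA signal", "3'-ITR"]


def add_degradation_signal(layout: list[str], side: str) -> list[str]:
    """Return a new layout with the degradation signal placed directly
    5' or 3' of the recombination-initiating region."""
    if side not in ("5'", "3'"):
        raise ValueError("side must be \"5'\" or \"3'\"")
    idx = layout.index(REGION)  # raises ValueError if the region is absent
    pos = idx if side == "5'" else idx + 1
    return layout[:pos] + [SIGNAL] + layout[pos:]


# Signal 3' of the region in the first vector, 5' of it in the second.
print(add_degradation_signal(FIRST_VECTOR, "3'"))
print(add_degradation_signal(SECOND_VECTOR, "5'"))
```

Returning a new list rather than modifying the input keeps the base layouts reusable for comparing the 5' and 3' placements side by side.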

[0060] Preferably, in the vector of the present invention, the first and second vectors are independently viral vectors, preferably adenovirus vectors or adeno-associated virus (AAV) vectors, preferably the first and second adeno-associated virus (AAV) vectors are selected from the same or different AAV serotypes, and preferably the adeno-associated virus is selected from serotype 2, serotype 8, serotype 5, serotype 7 or serotype 9.

[0061] Preferably, the vector system of the present invention further includes a third vector, which includes a third portion (CDS3) of the coding sequence and a reconstruction sequence, wherein the second vector includes two reconstruction sequences, one located at each end of CDS2.

[0062] Preferably, the reconstruction sequence of the first vector consists of the 3' end of CDS1, the two reconstruction sequences of the second vector consist of the 5' end and the 3' end of CDS2, respectively, and the reconstruction sequence of the third vector consists of the 5' end of CDS3.

[0063] The reconstruction sequence of the second vector that consists of the 5' end of CDS2 and the reconstruction sequence of the first vector are overlapping sequences.

[0064] The reconstruction sequence of the second vector that consists of the 3' end of CDS2 and the reconstruction sequence of the third vector are overlapping sequences.

[0065] The second vector further includes a degradation signal located at the 5' end and / or 3' end of CDS2.

[0066] Preferably, the third vector further comprises at least one nucleotide sequence of a degradation signal.

[0067] Preferably, the second vector further comprises a polyadenylation signal nucleotide sequence linked to the 3' end portion (CDS2) of the coding sequence.

[0068] This invention provides host cells that are transformed using a vector system as defined above.

[0069] Preferably, the vector system or host cell of the present invention is used for medical applications. It is particularly preferred for gene therapy. It is also preferred for treating and / or preventing pathological conditions or diseases characterized by retinal degeneration, or for preventing and / or treating Duchenne muscular dystrophy, cystic fibrosis, hemophilia A, and Dysferlin myopathy.

[0070] Preferably, the retinal degeneration is hereditary.

[0071] Preferably, the pathological condition or disease is selected from the group consisting of: retinitis pigmentosa (RP), Leber congenital amaurosis (LCA), Stargardt disease (STGD), Usher syndrome (USH), Alström syndrome, congenital stationary night blindness (CSNB), macular dystrophy, occult macular dystrophy, and diseases caused by mutations in the ABCA4 gene.

[0072] The present invention provides pharmaceutical compositions comprising a vector system or host cell as defined above, and a pharmaceutically acceptable carrier.

[0073] The present invention provides a method for treating and / or preventing a pathological condition or disease characterized by retinal degeneration, comprising administering to a subject in need an effective amount of a vector system, host cell, or pharmaceutical composition as defined above.

[0074] The present invention provides a method for treating and / or preventing Duchenne muscular dystrophy, cystic fibrosis, hemophilia A, or Dysferlin myopathy, comprising administering to a subject in need an effective amount of a vector system, host cell, or pharmaceutical composition as defined above.

[0075] This invention provides the application of nucleotide sequences of degradation signals in vector systems to reduce the expression of truncated protein forms.

[0076] The present invention provides a method for reducing the expression of truncated forms of proteins, comprising inserting a nucleotide sequence of a degradation signal into one or more vectors of a vector system.

[0077] According to a preferred embodiment of the present invention, a vector system for expressing the coding sequence of a gene of interest in a cell comprises two vectors, each vector comprising a different portion of the coding sequence and a reconstructed sequence; preferably, the reconstructed sequence of the first vector is a sequence containing a splice donor, while the reconstructed sequence of the second vector is a sequence containing a splice acceptor.

[0078] According to a more preferred embodiment of the present invention, a vector system for expressing the coding sequence of a gene of interest in a cell comprises three vectors, each vector comprising a different portion of the coding sequence and at least one reconstructed sequence; preferably, the first vector comprises a reconstructed sequence of a splice donor at a 3' position relative to a first portion of the coding sequence, the second vector comprises a reconstructed sequence of a splice acceptor at a 5' position relative to a second portion of the coding sequence and a reconstructed sequence of a splice donor at a 3' position relative to the second portion of the coding sequence, and the third vector comprises a reconstructed sequence of a splice acceptor at a 5' position relative to a third portion of the coding sequence.

[0079] Preferably, the reconstruction sequences of the first and second vectors or the reconstruction sequences of the first, second and third vectors further include recombination initiation regions, preferably located at the 3' position relative to the splice donor and the 5' position relative to the splice acceptor.

[0080] One, two, or all of the vectors in the vector system of the present invention further contain a nucleotide sequence of a degradation signal.

[0081] Preferably, the first vector contains a degradation signal. Preferably, the second vector contains a degradation signal.

[0082] According to a preferred embodiment of the present invention, the vector comprises a reconstruction sequence, the reconstruction sequence comprises a recombination-initiating region, and a degradation signal is located at the 5' or 3' end of the sequence of the recombination-initiating region.

[0083] According to a preferred embodiment of the present invention, a vector system for expressing the coding sequence of a gene of interest in cells comprises two vectors; the first vector of the vector system comprises, in a 5'-3' orientation:

[0084] - the 5' end portion of the coding sequence of the gene of interest,

[0085] - the nucleic acid sequence of the splice donor signal,

[0086] - the nucleic acid sequence of the recombination-initiating region, and

[0087] - the nucleic acid sequence of the degradation signal.

[0088] According to a preferred embodiment of the present invention, a vector system for expressing the coding sequence of a gene of interest in cells comprises two vectors, wherein the second vector of the vector system comprises, in a 5'-3' orientation:

[0089] -The nucleic acid sequence of the recombination-initiating region,

[0090] -The nucleic acid sequence of the degradation signal,

[0091] -The nucleic acid sequence of the splice acceptor signal, and

[0092] - The 3' end portion of the coding sequence of the gene of interest.

[0093] Preferably, the first vector of the vector system of the present invention further comprises a promoter sequence, more preferably the promoter sequence is operatively linked to the 5' end of a first portion of the coding sequence of the gene of interest.

[0094] Preferably, the second vector of the vector system composed of two vectors further comprises a polyadenylation signal nucleic acid sequence, more preferably the polyadenylation signal nucleic acid sequence is linked to the 3' end of the second portion of the coding sequence of the gene of interest. Preferably, the first vector of the vector system of the present invention does not contain a polyadenylation signal nucleic acid sequence.

[0095] Preferably, the third vector of the vector system consisting of three vectors further comprises a polyadenylation signal nucleic acid sequence, more preferably the polyadenylation signal nucleic acid sequence is linked to the 3' end of the third part of the coding sequence of the gene of interest.

[0096] Preferably, at least one of the vectors of the vector system of the present invention, more preferably the first vector, comprises a degradation signal whose sequence comprises or consists of a sequence encoding CL1 SEQ ID No. 1; preferably, the sequence encoding CL1 SEQ ID No. 1 comprises or consists of SEQ ID No. 16.

[0097] Preferably, at least one of the vectors of the vector system of the present invention, more preferably the first vector, comprises a degradation signal whose sequence comprises miR-204 SEQ ID No. 11 and miR-124 SEQ ID No. 12, more preferably three copies of miR-204 SEQ ID No. 11 and three copies of miR-124 SEQ ID No. 12; preferably, the miR-204 sequence and the miR-124 sequence, and / or each copy of the miR-204 sequence and of the miR-124 sequence, are linked by a linker sequence of at least 1, at least 2, at least 3, or at least 4 nucleotides.

[0098] Preferably, at least one of the vectors of the vector system of the present invention, more preferably the first vector, comprises a degradation signal that comprises or consists of miR-26a SEQ ID No. 13, more preferably four copies of miR-26a SEQ ID No. 13.

[0099] Preferably, at least one of the vectors of the vector system of the present invention, more preferably the second vector, comprises a degradation signal whose sequence comprises or consists of a sequence encoding PB29 (SEQ ID No. 14 or SEQ ID No. 15); preferably, said sequence encoding PB29 comprises or consists of SEQ ID No. 19 or SEQ ID No. 20; more preferably, said degradation signal comprises or consists of a sequence encoding three copies of PB29 of SEQ ID No. 14 or SEQ ID No. 15.

[0100] According to a preferred embodiment of the present invention, the vector system comprises:

[0101] a) A first vector, which comprises, in the 5'-3' direction:

[0102] - a 5' inverted terminal repeat (5'-ITR) sequence;

[0103] - a promoter sequence;

[0104] - the first portion of the coding sequence of the gene of interest, preferably the 5' end portion of the coding sequence, preferably the first portion is operatively linked to and controlled by the promoter;

[0105] - the nucleic acid sequence of the splice donor signal;

[0106] - the nucleic acid sequence of the recombination-initiating region; and

[0107] - a 3' inverted terminal repeat (3'-ITR) sequence; and

[0108] b) A second vector, which comprises, in the 5'-3' direction:

[0109] - a 5' inverted terminal repeat (5'-ITR) sequence;

[0110] - the nucleic acid sequence of the recombination-initiating region;

[0111] - the nucleic acid sequence of the splice acceptor signal;

[0112] - the second portion of the coding sequence of the gene of interest, preferably the 3' end portion of the coding sequence;

[0113] - a polyadenylation signal nucleic acid sequence; and

[0114] - a 3' inverted terminal repeat (3'-ITR) sequence,

[0115] wherein the first and / or second vector further comprises a nucleic acid sequence of a degradation signal, the sequence being located at the 5' or 3' end of the nucleic acid sequence of the recombination-initiating region.

[0116] According to a more preferred embodiment of the present invention, the vector system comprises:

[0117] a) A first vector, which comprises, in the 5'-3' direction:

[0118] - a 5' inverted terminal repeat (5'-ITR) sequence;

[0119] - a promoter sequence;

[0120] - the first portion of the coding sequence of the gene of interest, which is preferably operatively linked to and controlled by the promoter;

[0121] - the nucleic acid sequence of the splice donor signal;

[0122] - the nucleic acid sequence of the recombination-initiating region; and

[0123] - a 3' inverted terminal repeat (3'-ITR) sequence;

[0124] b) A second vector, which comprises, in the 5'-3' direction:

[0125] - a 5' inverted terminal repeat (5'-ITR) sequence;

[0126] - the nucleic acid sequence of the recombination-initiating region;

[0127] - the nucleic acid sequence of the splice acceptor signal;

[0128] - the second portion of the coding sequence of the gene of interest;

[0129] - the nucleic acid sequence of the splice donor signal;

[0130] - the nucleic acid sequence of the recombination-initiating region;

[0131] - a 3' inverted terminal repeat (3'-ITR) sequence; and

[0132] c) A third vector, which comprises, in the 5'-3' direction:

[0133] - a 5' inverted terminal repeat (5'-ITR) sequence;

[0134] - the nucleic acid sequence of the recombination-initiating region;

[0135] - the nucleic acid sequence of the splice acceptor signal;

[0136] - the third portion of the coding sequence of the gene of interest;

[0137] - a polyadenylation signal nucleic acid sequence; and

[0138] - a 3' inverted terminal repeat (3'-ITR) sequence,

[0139] wherein the first and / or second and / or third vector further comprises a nucleic acid sequence of a degradation signal, said sequence being located at the 5' or 3' end of the nucleic acid sequence of one or more of the recombination-initiating regions.

[0140] Preferably, the pathological condition or disease is selected from: Usher 1F (USH1F), congenital stationary night blindness (CSNB2), autosomal dominant (ad) and / or autosomal recessive (ar) retinitis pigmentosa (RP), USH1B, STGD1, Leber congenital amaurosis type 10 (LCA10), RP, Usher 1D (USH1D), Usher 2A (USH2A), autosomal dominant macular dystrophy, Usher 2C (USH2C), occult macular dystrophy, and Alström syndrome.

[0141] In this invention, the term vector system may refer to a construct system, a plasmid system, or viral particles.

[0142] In this invention, the construct or vector system may include more than two vectors.

[0143] Specifically, the construct system may include a third vector containing a third portion of the sequence of interest.

[0144] In this invention, when the different (2, 3 or more) vectors are introduced into cells, the full-length coding sequence is reconstructed or obtained.

[0145] The coding sequence can be divided into two parts. These parts can be of equal or different lengths. When the vectors of the vector system are introduced into cells, a full-length coding sequence is obtained. The first part can be the 5' end of the coding sequence. The second part can be the 3' end of the coding sequence. Alternatively, the coding sequence can be divided into three parts. These parts can be of equal or different lengths. When the vectors of the vector system are introduced into cells, a full-length coding sequence is obtained. The first part is the 5' end of the coding sequence, the second part is the middle part of the coding sequence, and the third part is the 3' part of the coding sequence.
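The division of a coding sequence into two or three parts of equal or different lengths can be sketched as string slicing. This is only an illustration on a toy sequence: the helper name `split_cds` and the cut positions are assumptions made here, and the sketch ignores the biological process by which the parts are rejoined in the cell.

```python
def split_cds(cds: str, cut_points: list[int]) -> list[str]:
    """Split a coding sequence at the given 0-based cut positions.

    One cut point gives two parts (5' and 3'); two cut points give
    three parts (5', middle, 3'). Parts may differ in length.
    """
    bounds = [0, *cut_points, len(cds)]
    if any(a >= b for a, b in zip(bounds, bounds[1:])):
        raise ValueError("cut points must be strictly increasing and inside the CDS")
    return [cds[a:b] for a, b in zip(bounds, bounds[1:])]


toy_cds = "ATGAAACCCGGGTTTTAA"  # toy 18-nt sequence, not a real gene

two_parts = split_cds(toy_cds, [9])
three_parts = split_cds(toy_cds, [6, 12])

# Joining the parts in order gives back the full-length sequence.
print(two_parts, "".join(two_parts) == toy_cds)
print(three_parts, "".join(three_parts) == toy_cds)
```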

[0146] In this invention, the cells are preferably mammalian cells, and more preferably human cells.

[0147] In this invention, the presence of a degradation signal in any of the vectors is sufficient to reduce the generation of truncated protein forms.

[0148] The term degradation signal refers to a sequence (nucleotide or amino acid) that mediates the degradation of the mRNA / protein containing it.

[0149] The term "truncated protein" refers to a protein that is not produced in its full-length form because it contains deletions ranging from a single amino acid to multiple amino acids (e.g., 1-10, 1-20, 1-50, 100, 200, etc.).

[0150] In this invention, the "reconstructed sequence" is a sequence that allows the reconstruction of a full-length coding sequence with the correct frame, thus allowing the expression of the functional protein.

[0151] The term "splice donor / acceptor signal" refers to the nucleotide sequence involved in the splicing of mRNA.

[0152] In this invention, any splice donor or acceptor signal sequence from any intron can be used. Those skilled in the art know how to identify and select suitable splice donor or acceptor signal sequences through conventional experiments.

[0153] In this invention, two sequences are considered to overlap if at least a portion of each sequence is homologous to the other. The sequences may overlap by at least 1, 2, 5, 10, 20, 50, 100, or 200 nucleotides.
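As a rough illustration of the overlap definition above, the sketch below measures how many nucleotides at the 3' end of one sequence are repeated at the 5' end of another. It assumes exact identity in the shared region, which is stricter than the homology the text requires; the function names and toy sequences are hypothetical.

```python
def overlap_length(first: str, second: str) -> int:
    """Length of the longest suffix of `first` that is also a prefix of `second`."""
    max_len = min(len(first), len(second))
    for k in range(max_len, 0, -1):
        if first.endswith(second[:k]):
            return k
    return 0


def merge_overlapping(first: str, second: str, min_overlap: int = 1) -> str:
    """Join two sequences through their shared region, keeping one copy of it."""
    k = overlap_length(first, second)
    if k < min_overlap:
        raise ValueError("sequences do not overlap by the required length")
    return first + second[k:]


five_prime = "ATGGCCAAGTTT"   # hypothetical 5' portion
three_prime = "AAGTTTGGCTAA"  # hypothetical 3' portion sharing AAGTTT
shared = overlap_length(five_prime, three_prime)
merged = merge_overlapping(five_prime, three_prime)
```

The `min_overlap` parameter corresponds to the minimum overlap lengths listed in the text (1, 2, 5, 10, 20, 50, 100, or 200 nucleotides).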

[0154] The term "recombination-initiating region or sequence" refers to a sequence that mediates recombination between two different sequences. "Recombination-initiating region or sequence" and "homologous region" are used interchangeably in this document.

[0155] The term "terminal repeat" refers to a sequence that repeats at both ends of a nucleotide sequence.

[0156] The term "inverted terminal repeat" refers to a sequence that repeats in opposite directions (reverse complementarity) at both ends of a nucleotide sequence.
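The difference between a repeat that recurs in the same direction and one that recurs in opposite directions (reverse complementarity) can be sketched as follows. This is a simplified check on a single DNA strand written 5' to 3', assuming perfect repeats; real ITRs are longer and need not be perfect. All names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_direct_terminal_repeat(seq: str, repeat_len: int) -> bool:
    """True if the first and last repeat_len bases are identical."""
    if repeat_len <= 0 or 2 * repeat_len > len(seq):
        return False
    seq = seq.upper()
    return seq[:repeat_len] == seq[-repeat_len:]


def has_inverted_terminal_repeat(seq: str, repeat_len: int) -> bool:
    """True if the last repeat_len bases are the reverse complement of the first."""
    if repeat_len <= 0 or 2 * repeat_len > len(seq):
        return False
    seq = seq.upper()
    return seq[-repeat_len:] == reverse_complement(seq[:repeat_len])


inverted = "AACCG" + "TTTTT" + reverse_complement("AACCG")  # hypothetical toy sequence
direct = "AACCG" + "TTTTT" + "AACCG"                        # hypothetical toy sequence
```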

[0157] Protein ubiquitination signals are signals that mediate protein degradation via the proteasome.

[0158] In this invention, if the degradation signal contains a repetitive sequence (either the same or different sequences), the repetitive sequence is preferably linked by a linker of at least one nucleotide.

[0159] Artificial stop codons are nucleotide sequences that are intentionally included in transcripts to induce premature termination of protein translation.
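The effect of an in-frame stop codon on translation can be shown with a small sketch that reads codons from the start of a reading frame and halts at the first stop. It assumes the three standard stop codons (TAA, TAG, TGA in DNA notation) and a hypothetical toy sequence; it does not model the actual constructs of the invention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA notation


def translated_codons(orf: str) -> list[str]:
    """Codons read from position 0 up to, but not including, the first in-frame stop."""
    codons = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3].upper()
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


without_stop = "ATGGCCAAGTTTGGC"             # hypothetical frame, five codons
with_stop = "ATGGCC" + "TAA" + "AAGTTTGGC"   # stop codon inserted in frame
out_of_frame = "ATGGTAAGCC"                  # contains TAA, but not in frame
```

Only a stop codon in the same frame as the upstream coding sequence ends translation, which is why the third example is read through.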

[0160] Enhancer sequences are sequences that increase gene transcription.

[0161] According to the present invention, suitable degradation signals include: (i) the short degradation determinant CL1, a C-terminal destabilizing peptide that shares structural similarity with misfolded proteins and is therefore recognized by the ubiquitination system; 31,32 (ii) ubiquitin, whose fusion at the N-terminus of a donor protein mediates either direct protein degradation or degradation via the N-end rule pathway; 33,34 (iii) the N-terminal PB29 degradation determinant, a 9-amino acid-long peptide which, like the CL1 degradation determinant, is expected to fold into structures recognized by enzymes of the ubiquitination pathway. 35 The inventors discovered that incorporating degradation sequences or signals into a multiple vector system can reduce the expression of truncated proteins. In one instance, the inventors found that including a CL1 degradation signal can lead to the selective degradation of truncated proteins from the 5' half without affecting the generation of full-length proteins in vitro and in the large pig retina.

[0162] In addition, artificial stop codons can be inserted to cause early termination of translation of the mRNA.

[0163] MicroRNA (miR) target sequences, artificial stop codons, or protein ubiquitination signals can be used to mediate the degradation of truncated protein products. In this invention, the degradation signal sequence may contain repetitive sequences, such as more than one microRNA (miR) target sequence, artificial stop codon, or protein ubiquitination signal, wherein the repetitive sequences are the same sequence repeated at least twice or different sequences; preferably, the repetitive sequences are linked by a linker of at least one nucleotide.
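A degradation signal made of repeated elements joined by a linker, as described above, can be assembled as in the sketch below. The element and linker sequences are placeholders invented for illustration and are not real miR target sites; the function name is hypothetical.

```python
def build_degradation_signal(elements: list[str], linker: str) -> str:
    """Join repeated signal elements with a linker of at least one nucleotide."""
    if not elements:
        raise ValueError("at least one element is required")
    if len(elements) > 1 and len(linker) < 1:
        raise ValueError("repeats must be separated by a linker of >= 1 nucleotide")
    return linker.join(elements)


placeholder_site = "ACGTTGCA"  # hypothetical placeholder, NOT a real miR target site
placeholder_linker = "GATC"    # hypothetical linker

# Same sequence repeated four times.
same_repeats = build_degradation_signal([placeholder_site] * 4, placeholder_linker)

# Different sequences combined in one signal.
mixed_repeats = build_degradation_signal([placeholder_site, "TTGGCCAA"], placeholder_linker)
```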

[0164] Among miRs expressed in the retina, miR-let7b and miR-26a are expressed at high levels. 26-29 miR-204 and miR-124 have been shown to restrict AAV-mediated transgene expression to RPE or photoreceptors. 30 Karali et al. 30 tested the efficacy of miR target sites in regulating, in specific cell types, the expression of genes included in a single AAV vector. In Karali et al., the miR target sites were included in a typical expression cassette (encoding the entire reporter gene), downstream of the coding sequence and upstream of the polyadenylation signal (polyA). Karali et al. used target sites for miR-204 or miR-124, and employed four tandem copies of each target site.

[0165] In this invention, miRs can also be miR mimics (Xiao et al. J Cell Physiol 212:285-292, 2007; Wang Z Methods Mol Biol 676:211-223, 2011). The inventors are the first to apply these strategies to multiple vector constructs and are able to silence the expression of truncated proteins generated from said vectors.

[0166] Over the past decade, gene therapy has been applied to treat diseases in hundreds of clinical trials. Various tools have been developed for delivering genes into human cells. In this invention, the delivery vector can be administered to a patient. Those skilled in the art can determine the appropriate dosage. The term "administer" includes delivery via viral or non-viral techniques. Non-viral delivery mechanisms include, but are not limited to, lipid-mediated transfection, liposomes, immunoliposomes, liposome transfection reagents, cationic facial amphiphiles (CFAs), and combinations thereof. In viral delivery, genetically engineered viruses, including adeno-associated viruses, are among the most widely used tools for gene delivery. The concept of virus-based gene delivery involves engineering viruses to express genes of interest or regulatory sequences such as promoters and introns. Depending on the specific application and virus type, most viral vectors contain mutations that prevent them from replicating freely in the host as wild-type viruses do. Viruses from several different families have been modified to generate viral vectors for gene delivery. These viruses include retroviruses, lentiviruses, adenoviruses, adeno-associated viruses, herpesviruses, baculoviruses, picornaviruses, and alphaviruses. This invention preferably uses adeno-associated viruses (AAVs). Most systems contain vectors capable of accommodating the gene of interest and helper cells that provide the viral structural proteins and enzymes needed to generate infectious viral particles containing the vector. AAVs are a family of viruses that vary in nucleotide and amino acid sequences, genome structure, pathogenicity, and host range. This diversity provides opportunities to develop different therapeutic applications using viruses with different biological properties. 
As with any delivery tool, efficiency, the ability to target specific tissue or cell types, expression of the gene of interest, and the safety of AAV-based systems are crucial for the successful application of gene therapy. Numerous attempts have been made in these research areas in recent years. AAV-based vectors and helper cells have been modified in various ways to alter gene expression, target delivery, improve viral titer, and enhance safety. This invention represents an improvement in this design process, in which the gene of interest is efficiently delivered to such viral vectors.

[0167] Ideal adeno-associated virus (AAV)-based vectors for gene delivery must be highly efficient, cell-specific, regulated, and safe. Delivery efficiency is crucial because it determines therapeutic efficacy. Current efforts aim to achieve cell-type-specific infection and gene expression using AAV vectors. Furthermore, AAV vectors are being developed for regulating the expression of genes of interest, as treatment may require long-term or regulated expression. Safety is a significant concern in viral gene delivery because most viruses are pathogens or have pathogenic potential. Importantly, patients should not inadvertently receive a pathogenic virus with full replication potential during gene delivery.

[0168] Adeno-associated virus (AAV) is a small virus that infects humans and some other primates. AAV is not currently known to cause disease, and it consequently evokes only a very mild immune response. Gene therapy vectors using AAV can infect dividing and quiescent cells and remain in an extrachromosomal state without integrating into the host cell's genome. These characteristics make AAV a very attractive candidate for establishing viral vectors for gene therapy and for creating isogenic human disease models.

[0169] Wild-type AAV has attracted considerable interest from gene therapy researchers due to several key characteristics. Most notably, it is remarkably lacking in pathogenicity. It can infect non-dividing cells and stably integrate into the host cell genome at a specific site on human chromosome 19 (named AAVS1). This characteristic makes it more predictable compared to retroviruses, which represent the threat of random insertion and mutagenesis, sometimes leading to cancer. The AAV genome integrates most frequently into this site, while random integration into the genome occurs at a negligible frequency. However, the development of AAV as a gene therapy vector has eliminated this integration capability by removing the rep and cap from the vector DNA. The desired gene and the promoter driving its expression are inserted between the ITRs, which help form concatamers in the nucleus after the single-stranded vector DNA is converted to double-stranded DNA by the host cell DNA polymerase complex. AAV-based gene therapy vectors form concatamers in the host cell nucleus. In non-dividing cells, these concatamers remain intact throughout the host cell's lifespan. In dividing cells, AAV DNA is lost during cell division because the additional DNA does not replicate with the host cell DNA. Random integration of AAV DNA into the host genome is detectable, but occurs at a very low frequency. AAV also exhibits very low immunogenicity, appearing to be limited to generating neutralizing antibodies, and it does not induce a well-defined cytotoxic response. This characteristic, along with its ability to infect quiescent cells, makes it superior to adenoviruses as a vector for human gene therapy.

[0170] AAV genomics, transcriptomics, and proteomics

[0171] The AAV genome consists of approximately 4.7 kilobases of single-stranded deoxyribonucleic acid (ssDNA), of either positive or negative sense. The genome contains inverted terminal repeats (ITRs) at both ends of the DNA strand and two open reading frames (ORFs): rep and cap. The former consists of four overlapping genes encoding the Rep proteins required for the AAV life cycle, while the latter contains the overlapping nucleotide sequences of the capsid proteins VP1, VP2, and VP3, which interact to form a capsid with icosahedral symmetry.

[0172] ITR sequence

[0173] The inverted terminal repeat (ITR) sequences are named for their symmetry, which is required for efficient multiplication of the AAV genome. Another property of these sequences is their ability to form hairpin structures, which enables so-called self-priming and thus primase-independent synthesis of the second DNA strand. The ITRs have also been shown to be essential for the integration of AAV DNA into the host cell genome (human chromosome 19) and its rescue from it, as well as for the efficient encapsidation of AAV DNA and the generation of fully assembled, deoxyribonuclease-resistant AAV particles.

[0174] For gene therapy, the ITRs appear to be the only sequences required in cis next to the therapeutic gene: the structural (cap) and packaging (rep) genes can be delivered in trans. On this basis, numerous methods have been developed to efficiently generate recombinant AAV (rAAV) vectors containing reporter or therapeutic genes. However, it has also been shown that the ITRs are not the only elements required in cis for efficient replication and encapsidation. Some research groups have identified a sequence within the coding sequence of the rep gene, named the cis-acting Rep-dependent element (CARE). CARE has been shown to increase replication and encapsidation when present in cis.

[0175] Eleven AAV serotypes had been described by 2006, with the eleventh described in 2004. All known serotypes can infect cells from a variety of different tissue types. Tissue specificity is determined by the capsid serotype, and pseudotyping AAV vectors to alter their tropism range may be important for their therapeutic applications.

[0176] The inverted terminal repeat (ITR) sequence used in the AAV vector system of the present invention can be any AAV ITR. The ITRs used in the AAV vector can be the same or different. For example, the vector may contain an ITR of AAV serotype 2 and an ITR of AAV serotype 5. In one embodiment of the vector of the present invention, the ITR is derived from AAV serotypes 2, 4, 5, or 8. In the present invention, ITRs of AAV serotypes 2 and 5 are preferred. AAV ITR sequences are well known in the art (e.g., for ITR2, see GenBank accession number AF043303.1; NC_001401.2; J01901.1; JN898962.1; for ITR5, see GenBank accession number NC_006152.1).

[0177] Serotype 2

[0178] To date, serotype 2 (AAV2) has been the most thoroughly studied. AAV2 exhibits a natural affinity for skeletal muscle, neurons, vascular smooth muscle cells, and hepatocytes.

[0179] Three cellular receptors for AAV2 have been described: heparan sulfate proteoglycan (HSPG), αVβ5 integrin, and fibroblast growth factor receptor 1 (FGFR-1). The first functions as the primary receptor, while the latter two have co-receptor activity and enable AAV to enter cells via receptor-mediated endocytosis. These findings have been questioned by Qiu, Handa, and others. HSPG functions as the primary receptor, but its abundance in the extracellular matrix can sequester AAV particles and impair infection efficiency.

[0180] Serotype 2 and cancer

[0181] Research has shown that serotype 2 (AAV-2) of the virus significantly kills cancer cells without harming healthy cells. "Our results indicate that adeno-associated virus type 2, which infects most populations but has no known disease-causing effects, kills multiple types of cancer cells but has no effect on healthy cells," said Craig Meyers, professor of immunology and microbiology at Penn State University School of Medicine in Pennsylvania. This could lead to new anticancer agents.

[0182] Other serotypes

[0183] While AAV2 is the most commonly used serotype in AAV-based studies, other serotypes have demonstrated greater effectiveness as gene delivery vectors. For example, AAV6 appears to be better at infecting airway epithelial cells, AAV7 exhibits very high transduction rates in mouse skeletal muscle cells (similar to AAV1 and AAV5), AAV8 is best suited for transducing hepatocytes and photoreceptors, and AAV1 and AAV5 have shown great effectiveness in gene delivery to vascular endothelial cells. In the brain, most AAV serotypes exhibit neuronal tropism, while AAV5 can also transduce astrocytes. AAV6, a hybrid of AAV1 and AAV2, also shows lower immunogenicity than AAV2.

[0184] The differences between serotypes may lie in the receptors they bind to. For example, AAV4 and AAV5 transduction can be inhibited by soluble sialic acids (of a different form for each of these serotypes), and AAV5 has been shown to enter cells via the platelet-derived growth factor receptor.

[0185] This invention also relates to viral vector systems comprising the polynucleotides, expression constructs, or vector constructs of the present invention. In one embodiment, the viral vector system is an AAV system. Methods for preparing viruses and viral particles or constructs comprising heterologous polynucleotides are known in the art. For AAV, cells can be co-infected or transfected with adenovirus or a polynucleotide construct containing adenovirus genes suitable for AAV helper functions. Examples of materials and methods are described, for example, in U.S. Patent Nos. 8,137,962 and 6,967,018. The AAV virus or AAV vector of the present invention can be any AAV serotype, including but not limited to serotypes AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, and AAV11. In specific embodiments, serotypes AAV2, AAV5, AAV7, AAV8, or AAV9 are used. In one embodiment, the AAV serotype carries one or more tyrosine-to-phenylalanine (Y-F) mutations on the capsid surface. In a specific embodiment, the AAV is of the AAV8 serotype and has a tyrosine-to-phenylalanine mutation at position 733 (Y733F).

[0186] The vector system described in this invention delivers one or more therapeutic genes or regulatory sequences, such as promoters or introns, and can be used alone or in combination with other treatments or components of a treatment.

[0187] This invention also relates to host cells comprising the construct system or viral vector system of this invention. The host cell can be a cultured cell or a primary cell, i.e., directly isolated from an organism (such as a human). The host cell can be an adherent cell or a suspension cell, i.e., a cell grown in suspension. Suitable host cells are known in the art and include, for example, DH5α E. coli cells, Chinese hamster ovary cells, monkey VERO cells, COS cells, HEK293 cells, etc. The cell can be a human cell or derived from other animals. In one embodiment, the cell is a photoreceptor cell or RPE cell. In a specific embodiment, the cell is a cone cell. The cell can also be a muscle cell (specifically a skeletal muscle cell), a lung cell, a pancreatic cell, a hepatocyte, a kidney cell, an intestinal cell, or a blood cell. In a specific embodiment, the cell is a human cone cell or rod cell. Those skilled in the art can select a suitable host based on the teachings herein. Preferably, the host cell is an animal cell, and more preferably a human cell. The cell can express the nucleotide sequence provided in the viral vector system of this invention.

[0188] Those skilled in the art will understand standard methods for incorporating polynucleotides or vectors into host cells, such as transfection, lipofection, electroporation, microinjection, viral infection, heat shock, and transformation following chemical permeabilization of the cell membrane or cell fusion. The constructs or vector systems of the present invention can also be introduced into the body in the form of naked DNA, wherein the introduction is performed using methods known in the art, such as transfection, microinjection, electroporation, calcium phosphate precipitation, and gene gun methods.

[0189] In this document, the term "host cell or genetically engineered host cell" refers to a host cell transduced, transformed, or transfected using the construct system of the present invention or the viral vector system of the present invention.

[0190] As used herein, the terms "nucleic acid," "polynucleotide sequence," and "construct" refer to deoxyribonucleotides or ribonucleotide polymers in single-stranded or double-stranded form, and unless otherwise limited, will encompass known analogs of natural nucleotides capable of functioning in a manner similar to naturally occurring nucleotides. The polynucleotide sequences include full-length sequences as well as shorter sequences derived from full-length sequences. It should be understood that specific polynucleotide sequences include degenerate codons of one or more original sequences that may be introduced to provide codon preference in specific host cells. Polynucleotide sequences falling within the scope of this invention also include sequences that specifically hybridize with sequences encoding the peptides of this invention. The polynucleotides include sense and antisense strands, in the form of individual strands or in double strands.
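The degenerate codons mentioned above follow from the fact that most amino acids are encoded by more than one codon. The sketch below uses the standard genetic code to count how many distinct nucleotide sequences encode a given peptide. It is an illustration under the assumption of the standard code only; the function name and the example peptides are hypothetical and do not come from the patent.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
CODONS_PER_AMINO_ACID = Counter(CODON_TABLE.values())


def count_encoding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode `peptide`."""
    peptide = peptide.upper()
    unknown = set(peptide) - set(CODONS_PER_AMINO_ACID)
    if unknown:
        raise ValueError(f"unknown residue(s): {sorted(unknown)}")
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)


n_met_trp = count_encoding_sequences("MW")       # both residues have a single codon
n_met_leu_ser = count_encoding_sequences("MLS")  # Leu and Ser have six codons each
```

Choosing among these alternatives according to the codon usage of the host cell changes the nucleotide sequence without changing the encoded polypeptide.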

[0191] The present invention also envisions polynucleotide molecules having sequences that are sufficiently homologous to the polynucleotide sequences of the present invention, thereby allowing hybridization with the sequences under standard, stringent conditions using standard methods (Maniatis, T. et al., 1982).

[0192] The present invention also relates to construct systems that may include regulatory elements that function in a target host cell in which the construct is expressed. Those skilled in the art can select regulatory elements for suitable host cells (e.g., mammalian or human host cells). Regulatory elements include, for example, promoters, transcription termination sequences, translation termination sequences, enhancers, signal peptides, degradation signals, and polyadenylation elements. The constructs of the present invention may include a promoter sequence operatively linked to a nucleotide sequence encoding a desired polypeptide.

[0193] Promoters envisioned for use in this invention include, but are not limited to, the native gene promoter, the cytomegalovirus (CMV) promoter (KF853603.1, bp 149-735), the chimeric CMV / chicken β-actin promoter (CBA) and the truncated form of the CBA promoter (smCBA) (US8298818 and "Light-Driven Cone Arrestin Translocation in Cones of Postnatal Guanylate Cyclase-1 Knockout Mouse Retina Treated with AAVGC1"), the rhodopsin promoter (NG_009115, bp 4205-5010), the interphotoreceptor retinoid-binding protein promoter (NG_029718.1, bp 4777-5011), the vitelliform macular dystrophy 2 promoter (NG_009033.1, bp 4870-5470), and the PR-specific human G protein-coupled receptor kinase 1 promoter (hGRK1; AY327580.1, bp 1793-2087 or bp 1793-1991) (Haire et al. 2006; US Patent No. 8,298,818). However, any suitable promoter known in the art may be used. In a specific embodiment, the promoter is a CMV or hGRK1 promoter. In one embodiment, the promoter is a tissue-specific promoter that exhibits selective activity in one or a group of tissues but low or no activity in other tissues. In one embodiment, the promoter is a photoreceptor-specific promoter. In another embodiment, the promoter is a cone-specific and / or rod-specific promoter.

[0194] Preferred promoters are the CMV, GRK1, CBA, and IRBP promoters. More preferred promoters are hybrid promoters that combine regulatory elements from different promoters (e.g., a chimeric CBA promoter that combines an enhancer from the CMV promoter, the CBA promoter, and an SV40 chimeric intron, referred to herein as a CBA hybrid promoter).

[0195] Promoters can be incorporated into constructs using standard techniques known in the art. Multiple copies of the promoter or multiple promoters can be used in the vectors of this invention. In one embodiment, the distance between the promoter and the transcription start site can be approximately the same as its distance from the transcription start site in its natural genetic environment. Some variation in this distance is allowed without significantly reducing promoter activity. In the systems of this invention, the transcription start site is typically included in the 5' construct but not in the 3' construct. In another embodiment, the transcription start site may be included in the 3' construct upstream of the degradation signal.

[0196] The constructs of this invention may optionally include a transcription termination sequence, a translation termination sequence, a signal peptide sequence, an internal ribosome entry site (IRES), an enhancer element, and / or a post-transcriptional regulatory element such as the woodchuck hepatitis virus (WHV) post-transcriptional regulatory element (WPRE). The transcription termination region is typically derived from the 3' untranslated region of a eukaryotic or viral gene sequence. The transcription termination sequence may be located downstream of the coding sequence to provide efficient termination. In the systems of this invention, the transcription termination site is typically included in the 3' construct but not in the 5' construct.

[0197] A signal peptide sequence is an N-terminal sequence that encodes the information responsible for relocating an operatively linked polypeptide to a wide range of post-translational cellular destinations, from a specific organelle compartment to the site of protein action and the extracellular environment. Enhancers are cis-acting elements that increase gene transcription and may also be included in the vector. Enhancer elements are known in the art and include, but are not limited to, the CaMV 35S enhancer element, the cytomegalovirus (CMV) early promoter enhancer element, and the SV40 enhancer element. DNA sequences that direct polyadenylation of the mRNA encoded by the structural gene may also be included in the vector.

[0198] Preferably, in this invention, the coding sequence is divided into first and second segments or portions (5' end portion and 3' end portion) at the natural exon-exon junction. Preferably, the size of each segment or portion of the coding sequence should not exceed 60 kb, and more preferably, the size of each segment or portion of the coding sequence should not exceed 50 kb, 40 kb, 30 kb, 20 kb, or 10 kb. Preferably, the size of each segment or portion of the coding sequence is approximately 2 kb, 2.5 kb, 3 kb, 3.5 kb, 4 kb, 4.5 kb, 5 kb, 5.5 kb, 6 kb, 6.5 kb, 7 kb, 7.5 kb, 8 kb, 8.5 kb, 9 kb, 9.5 kb, or smaller.

[0199] Spliceosomal introns are typically located within the sequences of protein-coding genes in eukaryotic cells. Within an intron, splicing requires a donor site (the 5' end of the intron), a branch site (near the 3' end of the intron), and an acceptor site (the 3' end of the intron). The splice donor site comprises an almost invariant GU sequence at the 5' end of the intron, within a larger, less conserved region. The splice acceptor site at the 3' end of the intron terminates the intron with an almost invariant AG sequence. Upstream of the AG (in the 5' direction), there is a pyrimidine-rich (C and U) region, the polypyrimidine tract. Upstream of the polypyrimidine tract is the branch point, which contains an adenine nucleotide. The splice acceptor and splice donor signals can also be selected by those skilled in the art from known sequences.
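The sequence features described above (GU at the 5' end of the intron, AG at the 3' end, and a pyrimidine-rich region upstream of the AG) can be checked with a simple sketch. It is a coarse filter only and does not predict whether a sequence is actually spliced; the function names, the window size, and the example sequence are hypothetical.

```python
def to_rna(seq: str) -> str:
    """Normalize DNA or RNA input to upper-case RNA notation."""
    return seq.upper().replace("T", "U")


def has_gu_ag_boundaries(intron: str) -> bool:
    """True if the intron starts with GU (donor) and ends with AG (acceptor)."""
    rna = to_rna(intron)
    return len(rna) >= 4 and rna.startswith("GU") and rna.endswith("AG")


def pyrimidine_fraction(intron: str, window: int = 10) -> float:
    """Fraction of C/U in the `window` bases immediately upstream of the final AG."""
    rna = to_rna(intron)
    tract = rna[-(window + 2):-2]
    if not tract:
        return 0.0
    return sum(base in "CU" for base in tract) / len(tract)


example_intron = "GUAAGUACUAACUUUCUUUCCUAG"  # hypothetical toy intron
boundaries_ok = has_gu_ag_boundaries(example_intron)
tract_score = pyrimidine_fraction(example_intron)
```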

[0200] Signals that mediate degradation and have not previously been used in multiple viral vector systems include, but are not limited to: short degradation determinants such as CL1, CL2, CL6, CL9, CL10, CL11, CL12, CL15, CL16, and SL17, which are C-terminal destabilizing peptides that share structural similarity with misfolded proteins and are therefore recognized by the ubiquitination system; ubiquitin, whose fusion at the N-terminus of a donor protein mediates either direct protein degradation or degradation via the N-end rule pathway; the N-terminal PB29 degradation determinant, a 9-amino acid-long peptide which, like the CL1 degradation determinant, is expected to fold into structures recognized by enzymes of the ubiquitination pathway; artificial stop codons that cause early termination of translation of the mRNA; and microRNA (miR) target sequences.

[0201] Those skilled in the art will readily understand that, in addition to those variants that can be artificially generated in the laboratory, there can be many variant sequences of naturally occurring proteins. The polynucleotides and polypeptides of this invention cover those specifically exemplified herein, as well as any naturally occurring variants thereof, and any variants that can be artificially generated, provided that those variants retain the desired functional activity. The scope of this invention also covers polypeptides that have the same amino acid sequence as the polypeptides exemplified herein, differing only in the presence of amino acid substitutions, additions, or deletions in the sequence of the polypeptide, provided that these variant polypeptides substantially retain the same relevant functional activity as the polypeptides exemplified herein. For example, conservative amino acid substitutions in a polypeptide that do not affect the function of the polypeptide will fall within the scope of this invention. Therefore, it should be understood that the polypeptides described herein include variants and fragments of the sequences of the specific examples, as described above. This invention also includes nucleotide sequences encoding the polypeptides described herein. These nucleotide sequences can be readily constructed by those skilled in the art who are familiar with the protein and amino acid sequences described herein. Those skilled in the art will understand that the degeneracy of the genetic code makes it possible to construct a variety of nucleotide sequences encoding a specific polypeptide or protein. The choice of specific nucleotide sequences may depend, for example, on the codon usage of the specific expression system or host cell. Peptides having amino acid substitutions different from those in the specific examples of the subject peptides are also contemplated within the scope of this invention. 
For example, the amino acids of the peptides of this invention may be substituted with non-natural amino acids, as long as the peptide having the substituted amino acids substantially retains the same activity as the peptide in which the amino acids have not been substituted. Examples of non-natural amino acids include, but are not limited to, ornithine, citrulline, hydroxyproline, homoserine, phenylglycine, taurine, iodotyrosine, 2,4-diaminobutyric acid, α-aminoisobutyric acid, 4-aminobutyric acid, 2-aminobutyric acid, γ-aminobutyric acid, ε-aminohexanoic acid, 6-aminohexanoic acid, 2-aminoisobutyric acid, 3-aminopropionic acid, norleucine, norvaline, sarcosine, homocitrulline, cysteic acid, τ-butylglycine, τ-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, fluoro-amino acids, designer amino acids such as β-methyl amino acids, C-methyl amino acids, N-methyl amino acids, and amino acid analogs in general. Non-natural amino acids also include amino acids having derivatized side groups. Furthermore, any amino acid in the protein may be in the D (dextrorotary) or L (levorotary) form. Amino acids can generally be classified into the following categories: non-polar, uncharged polar, basic, and acidic. Conservative substitutions (where an amino acid of one class in the polypeptide is replaced by another amino acid of the same class) fall within the scope of this invention, as long as the polypeptide having the substitution retains substantially the same biological activity as the polypeptide without the substitution. Table 1 provides examples of amino acids belonging to each category.

[0202]

[0203] The scope of this invention also includes polynucleotides that have the same nucleotide sequence as the polynucleotides exemplified herein, differing only in that the polynucleotides have nucleotide substitutions, additions, or deletions in their sequence, provided that these variant polynucleotides substantially retain the same relevant functional activity as the polynucleotides exemplified herein (e.g., they encode proteins with the same amino acid sequence or the same functional activity as the substances encoded by the polynucleotides exemplified herein). Therefore, it should be understood that the polynucleotides described herein include variants and fragments of the sequences of the specific examples, as stated above.

[0204] The present invention also envisions polynucleotide molecules having sequences sufficiently homologous to the polynucleotide sequences of the present invention to allow hybridization with these sequences under standard stringent conditions using standard methods (Maniatis, T. et al., 1982). The polynucleotides described herein may also be defined by more specific ranges of identity and / or similarity to those exemplified herein. Sequence identity will generally be greater than 60%, preferably greater than 75%, more preferably greater than 80%, even more preferably greater than 90%, and may be greater than 95%. The identity and / or similarity of a sequence can be 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% or greater as compared to a sequence exemplified herein. Unless otherwise stated, the percentage of sequence identity and / or similarity between two sequences as used herein can be determined using the algorithm of Karlin and Altschul (1990), modified as in Karlin and Altschul (1993). This algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al. (1990). BLAST searches can be performed with the NBLAST program, score = 100, word length = 12, to obtain sequences with the desired percentage of sequence identity. To obtain gapped alignments for comparison purposes, Gapped BLAST can be used as described by Altschul et al. (1997). When using the BLAST and Gapped BLAST programs, the default parameters of the respective programs (NBLAST and XBLAST) can be used. See the NCBI / NIH website.
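To make the percentage thresholds above concrete, the sketch below computes percent identity between two sequences that have already been aligned to the same length. It is a deliberately simple stand-in and is not the Karlin and Altschul algorithm or BLAST named in the text; the function name, the gap convention ("-"), and the example sequences are assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical, non-gap residues."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(
        x == y and x != "-"
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


reference = "ACGTACGTAC"  # hypothetical exemplified sequence
variant = "ACGTACGTTT"    # hypothetical variant differing at the last two positions
identity = percent_identity(reference, variant)
meets_75_percent = identity > 75.0  # one of the thresholds listed in the text
```

Dividing by the full alignment length counts gap columns as mismatches; other conventions exist and give different percentages for gapped alignments.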

[0205] This invention also relates to pharmaceutical compositions comprising the vector system or viral vector system of the present invention or a host cell, optionally in combination with a pharmaceutically acceptable carrier, diluent, excipient, or adjuvant. The selection of the pharmaceutical carrier, excipient, or diluent may be based on the intended route of administration and standard pharmaceutical practice. In addition to the carrier, excipient, or diluent, the pharmaceutical composition may also contain any suitable binder, lubricant, suspending agent, coating agent, solubilizer, and other carrier agents (e.g., lipid delivery systems) that assist or enhance viral entry into the target site. The construct or vector may be administered in vivo or ex vivo.

[0206] Pharmaceutical compositions containing a certain amount of the compound, suitable for local or parenteral administration, constitute preferred embodiments of the present invention. For parenteral administration, these compositions are preferably in the form of sterile aqueous solutions and may contain other substances such as sufficient amounts of salt or monosaccharides to make the solution isotonic with blood. The pharmaceutical compositions of the present invention can preferably be delivered to the retina by subretinal injection, or may also be prepared in the form of injectable suspensions, ophthalmic washes, or ophthalmic ointments, which can be delivered to the retina in a non-invasive manner.

[0207] In this invention, the dosage administered to a patient (specifically a human) should be sufficient to achieve a therapeutic response within a reasonable timeframe without causing fatal toxicity, and preferably without causing side effects or morbidity exceeding acceptable levels. Those skilled in the art will understand that the dosage will depend on a variety of factors, including the subject's condition (health), subject weight, type of concurrent treatment (if present), treatment frequency, treatment ratio, and the severity and stage of the pathological condition.

[0208] The methods of this invention can be used in humans and other animals. The terms "patient" and "subject" as used herein are used interchangeably and are intended to include, for example, human and non-human species. Similarly, the in vitro methods of this invention can be performed on cells of said human and non-human species.

[0209] This invention also relates to kits comprising the construct system, viral vector system, or host cell of the present invention in one or more containers. The kits of the present invention may optionally include pharmaceutically acceptable carriers and / or diluents. In one embodiment, the kit of the present invention includes one or more other components, adjuncts, or adjuvants, as described herein. In one embodiment, the kit of the present invention includes instructions or packaging materials describing how to administer the vector system of the kit. The containers of the kit may be made of any suitable material, such as glass, plastic, metal, etc., and have any suitable size, shape, or configuration. In one embodiment, the construct system, viral vector system, or host cell of the present invention is provided in solid form in the kit. In another embodiment, the construct system, viral vector system, or host cell of the present invention is provided in liquid or solution form in the kit. In one embodiment, the kit comprises ampoules or syringes containing the construct system, viral vector system, or host cell of the present invention in liquid or solution form.

[0210] The present invention also provides a pharmaceutical composition for treating an individual via gene therapy, the composition comprising a therapeutically effective amount of the vector system or viral vector system or host cell of the present invention, comprising one or more deliverable therapeutic and / or diagnostic transgenes or viral particles generated or obtained from said transgenes. This pharmaceutical composition may be for human or animal use. Typically, a clinician of ordinary skill can determine the most suitable actual dose for an individual patient, and this dose will vary with the individual's age, weight, and response, and with the route of administration. For humans, a dose range of 1x10e10 to 1x10e15 genome copies / kg for each vector, preferably 1x10e11 to 1x10e13 genome copies / kg for each vector, is expected to be effective. For ocular administration, a dose range of 1x10e10 to 1x10e15 genome copies / eye for each vector, preferably 1x10e10 to 1x10e13, is expected to be effective.
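The arithmetic behind the per-kilogram ranges above can be sketched as follows. This is an illustration of the unit conversion only, under the assumption that a per-kg dose scales linearly with body weight; the constant and function names are hypothetical, and the numeric bounds are copied from the paragraph above.

```python
# Minimal sketch of the dose arithmetic; names are hypothetical.
# Bounds are the ranges stated in the paragraph above (per vector).
SYSTEMIC_RANGE_GC_PER_KG = (1e10, 1e15)
SYSTEMIC_PREFERRED_GC_PER_KG = (1e11, 1e13)
OCULAR_RANGE_GC_PER_EYE = (1e10, 1e15)
OCULAR_PREFERRED_GC_PER_EYE = (1e10, 1e13)


def total_genome_copies(dose_gc_per_kg: float, weight_kg: float) -> float:
    """Total genome copies of one vector for a given per-kg dose and weight."""
    if dose_gc_per_kg <= 0 or weight_kg <= 0:
        raise ValueError("dose and weight must be positive")
    return dose_gc_per_kg * weight_kg


def classify_dose(dose: float, stated: tuple, preferred: tuple) -> str:
    """Say whether a dose lies in the preferred range, the stated range, or neither."""
    if preferred[0] <= dose <= preferred[1]:
        return "preferred"
    if stated[0] <= dose <= stated[1]:
        return "within stated range"
    return "outside stated range"
```

For instance, `total_genome_copies(1e11, 70)` gives the total genome copies of one vector for a hypothetical 70 kg subject at the lower end of the preferred per-kg range.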

[0211] The dosage regimen and effective dose can be determined by a clinician of ordinary skill. Administration may be in the form of a single dose or multiple doses. General methods for performing gene therapy using polynucleotides, expression constructs, and vectors are known in the art (see, for example, Gene Therapy: Principles and Applications, Springer Verlag 1999; and U.S. Patents 6,461,606; 6,204,251 and 6,106,826). The invention also relates to methods for expressing a selected polypeptide in cells. In one embodiment, the method includes: introducing a vector system of the invention comprising a polynucleotide sequence encoding the selected polypeptide into the cells, and expressing the polynucleotide sequence in the cells. The selected polypeptide may be heterologous to the cells. In one embodiment, the cells are mammalian cells. In one embodiment, the cells are human cells. In one embodiment, the cells are photoreceptor cells or RPE cells. The cells may also be muscle cells (specifically skeletal muscle cells), lung cells, pancreatic cells, hepatocytes, kidney cells, intestinal cells, or blood cells. In one particular embodiment, the cells are cone cells or rod cells. In a specific embodiment, the cells are human cone cells or rod cells.

[0212] Sequence listing

[0213] AP1 (SEQ ID No. 24)

[0214] AP2 (SEQ ID No. 25)

[0215] AK seqA (SEQ ID No. 22)

[0216] AK seqB (SEQ ID No. 23)

[0217] AP (SEQ ID No. 26)

[0218] Left ITR2 (SEQ ID No. 29)

[0219] Right ITR2 (SEQ ID No. 30)

[0220] Left ITR5 (SEQ ID No. 31)

[0221] Right ITR5 (SEQ ID No. 32)

[0222] CMV

[0223] CMV enhancer (SEQ ID No. 33)

[0224] CMV promoter (SEQ ID No. 34)

[0225] Chimeric intron (SV40 intron) (SEQ ID No. 35)

[0226] hGRK1 promoter (SEQ ID No. 36)

[0227] CBA Hybrid Promoter

[0228] CMV enhancer (SEQ ID No. 37)

[0229] CBA promoter (SEQ ID No. 38)

[0230] IRBP (SEQ ID No. 39)

[0231] Splicing donor signal (SEQ ID No. 27)

[0232] miR-let 7b degradation signal (SEQ ID No. 40)

[0233] 4xmiR-let 7b degradation signal (SEQ ID No. 41)

[0234] miR-26a degradation signal (SEQ ID No. 13)

[0235] 4xmiR-26a degradation signal (SEQ ID No. 18)

[0236] miR-204 degradation signal (SEQ ID No. 11)

[0237] miR-124 degradation signal (SEQ ID No. 12)

[0238] 3xmiR-204+3xmiR-124 degradation signal (SEQ ID No. 17)

[0239] CL1 degradation signal (degradation determinant)

[0240] Nucleotide sequence: (SEQ ID No. 16)

[0241] Amino acid sequence: (SEQ ID No. 1)

[0242] CL2 degradation signal (degradation determinant)

[0243] Nucleotide sequence: (SEQ ID No. 42)

[0244] Amino acid sequence: (SEQ ID No. 2)

[0245] CL6 degradation signal (degradation determinant)

[0246] Nucleotide sequence: (SEQ ID No. 43)

[0247] Amino acid sequence: (SEQ ID No. 3)

[0248] CL9 degradation signal (degradation determinant)

[0249] Nucleotide sequence: (SEQ ID No. 44)

[0250] Amino acid sequence: (SEQ ID No. 4)

[0251] CL10 degradation signal (degradation determinant)

[0252] Nucleotide sequence: (SEQ ID No. 45)

[0253] Amino acid sequence: (SEQ ID No. 5)

[0254] CL11 degradation signal (degradation determinant)

[0255] Nucleotide sequence: (SEQ ID No. 46)

[0256] Amino acid sequence: (SEQ ID No. 6)

[0257] CL12 degradation signal (degradation determinant)

[0258] Nucleotide sequence: (SEQ ID No. 47)

[0259] Amino acid sequence: (SEQ ID No. 7)

[0260] CL15 degradation signal (degradation determinant)

[0261] Nucleotide sequence: (SEQ ID No. 48)

[0262] Amino acid sequence: (SEQ ID No. 8)

[0263] CL16 degradation signal (degradation determinant)

[0264] Nucleotide sequence: (SEQ ID No. 49)

[0265] Amino acid sequence: (SEQ ID No. 9)

[0266] SL17 degradation signal (degradation determinant)

[0267] Nucleotide sequence: (SEQ ID No. 50)

[0268] Amino acid sequence: (SEQ ID No. 10)

[0269] PB29 degradation signal (degradation determinant)

[0270] Nucleotide sequence: (SEQ ID No. 19)

[0271] Amino acid sequence: (SEQ ID No. 15)

[0272] Short PB29 degradation signal (degradation determinant)

[0273] Nucleotide sequence: (SEQ ID No. 20)

[0274] Amino acid sequence: (SEQ ID No. 14)

[0275] 3x PB29 degradation signal (degradation determinant) (SEQ ID No. 21)

[0276] Artificial stop codon (SEQ ID No. 51)

[0277] Splice acceptor signal (SEQ ID No. 28)

[0278] SV40 Poly A (SEQ ID No. 52)

[0279] ABCA4 5' (SEQ ID No. 53)

[0280] The full-length sequence hGRK1-5'ABCA4+AK+CL1 (SEQ ID No. 54)

[0281]

[0282]

[0283] Note:

[0284] ITR: Bold uppercase letters

[0285] hGRK promoter: lowercase letters in bold italics

[0286] ABCA4 5': lowercase letters, underlined

[0287] SDS: lowercase bold

[0288] AK: Uppercase letters

[0289] CL1: Lowercase letters in italics and underlined

[0290] ABCA4 3' (SEQ ID No. 55)

[0291] The full-length sequence ABCA4 3'+AK_SV40 (SEQ ID No. 56)

[0292]

[0293]

[0294]

[0295] Note:

[0296] ITR: uppercase letters, bold, underlined

[0297] AK: Uppercase letters

[0298] SAS: lowercase bold

[0299] ABCA4 3': lowercase letters, underlined

[0300] SV40 Poly A: Lowercase letters, bold, italic

[0301] The full-length CMV 5'ABCA4-SD-AK sequence (SEQ ID No. 57)

[0302] The full-length sequence AK-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 58).

[0303] The full-length sequence of CMV 5'ABCA4-SD-AP1 (SEQ ID No. 59)

[0304] The full-length sequence AP1-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 60)

[0305] The full-length sequence of CMV 5'ABCA4-SD-AP2 (SEQ ID No. 61)

[0306] The full-length sequence AP2-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 62)

[0307] The full-length CMV 5'ABCA4-SD-AP sequence (SEQ ID No. 63)

[0308] The full-length sequence of AP-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 64)

[0309] The full-length sequence of hGRK1 5'ABCA4-SD-AP1 (SEQ ID No. 65)

[0310] The full-length sequence of GRK1 5'ABCA4-SD-AP2 (SEQ ID No. 66)

[0311] The full-length sequence of ITR5-CMV 5'ABCA4-SD-AK-ITR2 (SEQ ID No. 67)

[0312] The full-length sequence of ITR2-AK-SA-3'ABCA4-SV40-ITR5 (SEQ ID No. 68)

[0313] Full-length sequence of ITR5-CBA 5'MYO7A-SD-AK-ITR2 (SEQ ID No. 69)

[0314] The full-length sequence of ITR2-AK-SA-3'MYO7A-HA-BGH-ITR5 (SEQ ID No. 70)

[0315] The full-length sequence of CMV 5'ABCA4-3XFLAG-SD-AK-4xmiR26a (SEQ ID No. 71)

[0316] The full-length sequence of CMV 5'ABCA4-3XFLAG-SD-AK-3xmiR204+3xmir124 (SEQ ID No. 72)

[0317] The full-length sequence of CMV 5'ABCA4-3XFLAG-SD-AK-CL1 (SEQ ID No. 73)

[0318] The full-length sequence AK-STOP-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 74)

[0319] The full-length sequence AK-PB29-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 75)

[0320] The full-length sequence of AK-3XPB29-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 76)

[0321] The full-length sequence of AK-ubiquitin-SA-3'ABCA4-3XFLAG-SV40 (SEQ ID No. 77).

[0322] The invention is illustrated by referring to the following non-limiting embodiments.

[0323] Figure 1 A schematic diagram of the multiple-vector strategy of this invention. ITR: inverted terminal repeat; Prom: promoter; CDS: coding sequence; SD: splice donor signal; RR: recombination initiation region, AK or from alkaline phosphatase (AP1, AP2, and AP); Deg Sig: degradation signal (see Table 2); SA: splice acceptor signal; pA: polyadenylation signal. A and C: (binary or ternary) hybrid vector strategy, including trans-splicing and a recombination initiation region; B and D: (binary or ternary) overlapping vector strategy, according to preferred embodiments of the invention. For other embodiments, see Figures 12-14.

[0324] Figure 2 High-efficiency ABCA4 protein expression using homologous AK, AP1, and AP2 regions.

[0325] (a, c) Representative Western blot analysis of (a) HEK293 cells infected with binary AAV2 / 2 vectors (AAV serotype 2, with homologous ITRs from AAV2) (50 μg / lane), or (c) C57BL / 6 retinas (whole retinal lysates) injected with binary AAV2 / 8 vectors (AAV serotype 8, with homologous ITRs from AAV2) encoding ABCA4. Arrows indicate the full-length protein, and the molecular weight ladder is shown on the left. (b) Quantification of the ABCA4 protein bands from the Western blot analysis in (a). The intensity of the ABCA4 band in (a) was divided by the intensity of the Filamin A band. The bars show protein expression as a percentage relative to the binary AAV hybrid AK vectors, with the mean value shown above the corresponding bar. Values are expressed as mean ± s.e.m. (standard error of the mean). *p ANOVA < 0.05; the asterisk indicates a significant difference from AK, AP1, and AP2. (a-c) AK: cells infected, or eyes injected, with the binary AAV hybrid AK vectors; AP1: cells infected, or eyes injected, with the binary AAV hybrid AP1 vectors; AP2: cells infected, or eyes injected, with the binary AAV hybrid AP2 vectors; AP: cells infected with the binary AAV hybrid AP vectors; neg: cells infected, or eyes injected, with the 3' half vector or an EGFP expression vector as a negative control. α-3xflag: Western blot using anti-3xflag antibody; α-Filamin A: Western blot using anti-Filamin A antibody, used as a loading control; α-Dysferlin: Western blot using anti-Dysferlin antibody, used as a loading control.
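The quantification described for panel (b) can be sketched roughly as follows: each ABCA4 band intensity is divided by the loading-control band intensity of the same lane, the result is expressed as a percentage of the mean of the reference (AK) group, and groups are summarized as mean ± s.e.m. The function names are hypothetical, any input numbers are invented for illustration, and the use of the sample standard deviation for the s.e.m. is an assumption rather than something the legend states.

```python
# Minimal sketch of band normalization; names and inputs are hypothetical.
from math import sqrt
from statistics import mean, stdev


def normalized_intensity(target_band: float, loading_control_band: float) -> float:
    """Divide the target band intensity by the loading-control band intensity."""
    if loading_control_band <= 0:
        raise ValueError("loading-control intensity must be positive")
    return target_band / loading_control_band


def percent_of_reference(sample_values, reference_values):
    """Express normalized values as a percentage of the reference-group mean."""
    reference_mean = mean(reference_values)
    return [100.0 * value / reference_mean for value in sample_values]


def mean_and_sem(values):
    """Return (mean, standard error of the mean), using the sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate the s.e.m.")
    return mean(values), stdev(values) / sqrt(n)
```

Normalizing within each lane before comparing groups keeps differences in the amount of protein loaded from being read as differences in expression.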

[0326] Figure 3 Genomic and transduction efficacy of vectors using heterologous ITR2 and ITR5.

[0327] (a) Alkaline Southern blot analysis of DNA extracted from 3 × 10¹⁰ GC of 5′- and 3′-ABCA4 half-vectors with homologous (2:2) or heterologous (5:2 or 2:5) ITRs, and of a control AAV preparation (CTRL) with homologous ITR2. The expected genome size is shown below each lane. Molecular weight markers (kb) are shown on the left. 5′: 5′ half-vector; 3′: 3′ half-vector. (b–d) Representative Western blot analysis and quantification of HEK293 cells infected with binary AAV2 / 2 hybrid ABCA4 vectors with heterologous ITR2 and ITR5 or homologous ITR2, at an m.o.i. based on the ITR2 (b and c) or the transgene (b and d) titer. The Western blot image (b) is representative of, and the quantifications (c and d) are from, n = 3 independent experiments. (b) The upper arrow indicates the full-length ABCA4 protein, and the lower arrow indicates the truncated protein; the molecular weight ladder is shown on the left. The micrograms of protein loaded are shown below the image. α-3×flag: Western blot using anti-3×flag antibody; α-Filamin A: Western blot using anti-Filamin A antibody (used as a loading control). (c and d) Quantification of the full-length and truncated ABCA4 protein bands from Western blot analysis of cells infected with a given dose of the vectors (based on the ITR2 (c) or the transgene (d) titer). The bar charts show the intensity of the full-length and truncated protein bands divided by the intensity of the Filamin A band, or the intensity of the full-length protein band divided by the intensity of the truncated protein band in the corresponding lane. (e and f) Representative Western blot analysis and quantification of HEK293 cells infected with binary AAV2 (AAV serotype 2) hybrid MYO7A vectors with heterologous ITR2 and ITR5 or homologous ITR2. The Western blot image (e) is representative of, and the quantification (f) is from, n = 3 independent experiments. (e) The upper arrow indicates the full-length protein, and the lower arrow indicates the truncated protein. The molecular weight ladder is shown on the left. The micrograms of protein loaded are shown below the image. (f) Quantification of the MYO7A protein band from Western blot analysis.

[0328] The mean value is shown above the corresponding bar. Values are expressed as mean ± s.e.m. *p Student's t-test ≤ 0.05.

[0329] 2:2: Cells infected with a binary AAV hybrid vector containing a homologous ITR from AAV2; 5:2: Cells infected with a binary AAV hybrid vector containing heterologous ITRs from AAV2 and AAV5; neg: Cells infected with an EGFP-expression vector as a negative control.

[0330] Figure 4 The inclusion of the miR target site in the 5' half of the vector did not result in a significant reduction in the truncated protein product.

[0331] Representative Western blot analysis of HEK293 cells infected with binary AAV2 / 2 (AAV serotype 2) hybrid vectors encoding ABCA4 and containing miR target sites for miR-let7b (left panel), miR-204+124 (middle panel), or miR-26a (right panel). The upper arrow indicates the full-length ABCA4 protein, and the lower arrow indicates the truncated protein; the molecular weight ladder is shown on the left. The micrograms of protein loaded are shown below the images. 5'+3': cells co-infected with 5' and 3' half-vectors without miR target sites; 5'+3'+Scrambled sequence: cells co-infected with 5' and 3' half-vectors without miR target sites in the presence of a scrambled miR mimic; 5'mir+3': cells co-infected with a 5' half-vector containing miR target sites and a 3' half-vector; 5'mir+3'+Scrambled sequence: cells co-infected with a 5' half-vector containing miR target sites and a 3' half-vector in the presence of a scrambled miR mimic; 5'mir+3'+mimic let7b: cells co-infected with a 5' half-vector containing miR target sites and a 3' half-vector in the presence of the miR-let7b mimic; 5': cells infected with a 5' half-vector without miR target sites; 5'mir: cells infected with a 5' half-vector containing miR target sites; 5'mir+Scrambled sequence: cells infected with a 5' half-vector containing miR target sites in the presence of a scrambled miR mimic; 5'mir+mimic let7b: cells infected with a 5' half-vector containing miR target sites in the presence of the miR-let7b mimic; neg: control cells infected with a 3' half-vector or an EGFP expression vector; 5'mir+3'+mimic 204+124: cells co-infected with a 5' half-vector containing miR target sites and a 3' half-vector in the presence of the miR-204 and miR-124 mimics; 5'mir+mimic 204+124: cells infected with a 5' half-vector containing miR target sites in the presence of the miR-204 and miR-124 mimics; 5'mir+3'+mimic 26a: cells co-infected with a 5' half-vector containing miR target sites and a 3' half-vector in the presence of the miR-26a mimic; 5'mir+mimic 26a: cells infected with a 5' half-vector containing miR target sites in the presence of the miR-26a mimic. α-3xflag: Western blot using anti-3xflag antibody; α-Filamin A: Western blot using anti-Filamin A antibody, used as a loading control.

[0332] The scrambled sequences correspond to different miRNA sequences. For example, in experiments using the mir-let7b mimic, the scrambled sequence is the miR26a sequence.

[0333] Figure 5 The inclusion of the CL1 degradation signal in the 5' half vector resulted in a significant reduction in the truncated protein product.

[0334] Representative Western blot analysis of (a) HEK293 cells infected with binary AAV2 / 2 (AAV serotype 2, with homologous ITRs from AAV2) hybrid vectors, or (b) pig eyes (RPE + retina) one month after injection of binary AAV2 / 8 (AAV serotype 8, with homologous ITRs from AAV2) hybrid vectors, encoding ABCA4 with or without the CL1 degradation signal. The upper arrow indicates the full-length ABCA4 protein, and the lower arrow indicates the truncated protein from the 5' half vector; the molecular weight ladder is shown on the left. The micrograms of protein loaded are shown below each image. 5'+3': cells co-infected, or eyes co-injected, with a 5' half-vector without CL1 and a 3' half-vector; 5'-CL1+3': cells co-infected, or eyes co-injected, with a 5' half-vector containing CL1 and a 3' half-vector; 5': cells infected with a 5' half-vector without CL1; 5'-CL1: cells infected with a 5' half-vector containing CL1; neg: control cells infected, or control eyes injected, with a 3' half-vector or an EGFP expression vector as a negative control; α-3xflag: Western blot using anti-3xflag antibody; α-Filamin A: Western blot using anti-Filamin A antibody, used as a loading control; α-Dysferlin: Western blot using anti-Dysferlin antibody, used as a loading control. (a) The Western blot images are representative of n = 3 independent experiments. (b) The Western blot images are representative of n = 5 eyes (injected with 5'+3' vectors), n = 2 eyes (injected with 5'-CL1+3' vectors), and n = 5 eyes (injected with a 3' half vector or an EGFP expression vector as a negative control).

[0335] Figure 6 The inclusion of degradation signals in the 3' half of the vector resulted in a slight reduction in the truncated protein product.

[0336] Representative Western blot analysis of HEK293 cells infected with binary AAV2 / 2 hybrid vectors encoding ABCA4 and containing different degradation signals. The upper arrow indicates the full-length ABCA4 protein, and the lower arrow indicates the truncated protein product; the molecular weight ladder is shown on the left. The micrograms of protein loaded are shown below each image. 5'+3': cells co-infected with the 5' half-vector and the 3' half-vector (without degradation signal); 5': cells infected with the 5' half-vector; 3' (unlabeled): cells infected with the 3' half-vector (without degradation signal); STOP: cells infected with the 3' half-vector containing a stop codon; PB29: cells infected with the 3' half-vector containing the PB29 degradation signal; 3xPB29: cells infected with the 3' half-vector containing three tandem copies of the PB29 degradation signal; Ubiquitin: cells infected with the 3' half-vector containing the ubiquitin degradation signal. α-3xflag: Western blot using anti-3xflag antibody; α-Filamin A: Western blot using anti-Filamin A antibody, used as a loading control.

[0337] Figure 7 Schematic diagram of the homologous AP, AP1, and AP2 regions derived from ALPP (placental alkaline phosphatase) used in the preferred embodiment of the present invention. CDS: Coding sequence

[0338] Figure 8 Subretinal delivery of the modified binary AAV vectors resulted in ABCA4 expression in mouse photoreceptors and in a significant reduction in lipofuscin accumulation in the retina of Abca4- / - mice. (a) Representative Western blot analysis of C57BL / 6 retinas (whole retinal lysates) injected with the binary AAV2 / 8 hybrid ABCA4 vectors (5'+3') or with a negative control (neg). The arrow indicates the full-length protein, and the molecular weight ladder is shown on the left. α-3×flag: Western blot using anti-3×flag antibody; α-Dysferlin: Western blot using anti-Dysferlin antibody, used as a loading control. (b and c) Representative images (b) and quantification (c) of lipofuscin autofluorescence (red signal) in the retinas (RPE or RPE+OS) of pigmented Abca4+ / - mice that were uninjected or injected with AAV as a control (Abca4+ / -), and of Abca4- / - mice that were uninjected (Abca4- / -) or injected with the binary AAV hybrid ABCA4 vectors (Abca4- / - AAV5′+3′). (b) The scale bar (75 μm) is shown in the figure. RPE: retinal pigment epithelium; ONL: outer nuclear layer; INL: inner nuclear layer; GCL: ganglion cell layer. Arrows indicate the lipofuscin signal. (c) Mean lipofuscin autofluorescence on the temporal side of three sections for each sample. The mean autofluorescence in each section is normalized to the length of the underlying RPE. Mean values are shown above the corresponding bars. Values are expressed as mean ± s.e.m. ***p ANOVA < 0.0001. n = 4 eyes per group. (d) Mean number of RPE lipofuscin granules counted in at least 40 fields of view (25 μm²) in the retinas of albino Abca4+ / + mice that were not injected (Abca4+ / + not injected) or were injected with PBS (Abca4+ / + PBS), and of albino Abca4- / - mice injected with PBS (Abca4- / - PBS) or with the binary AAV hybrid ABCA4 vectors (Abca4- / - AAV5′+3′). Mean values are shown above the corresponding bars. Values are expressed as mean ± s.e.m. *p ANOVA ≤ 0.05; **p ANOVA ≤ 0.01. n = 4 eyes from Abca4+ / + not injected; n = 4 eyes from Abca4+ / + PBS; n = 3 eyes from Abca4- / - PBS; n = 3 eyes from Abca4- / - AAV5′+3′.

[0339] Figure 9 Similar electrical activity of the eyes in mice and pigs treated with negative controls or modified binary AAV vectors. (a) Mean a-wave (left panel) and b-wave (right panel) amplitudes in C57BL / 6 mice 1 month after injection of the binary AAV hybrid ABCA4 vectors (AAV5'+3') or a negative control (i.e., negative control AAV vectors or PBS; neg). Data are expressed as mean ± s.e.m.; n indicates the number of eyes analyzed.

[0340] (b) Mean b-wave amplitude (μV) in scotopic, maximal response, photopic, and flicker ERG tests in pigs 1 month after injection of the binary AAV hybrid ABCA4 vectors (AAV5′+3′) or PBS. n = 5 eyes injected with the binary AAV hybrid ABCA4 vectors; n = 4 eyes injected with PBS; *: n = 2.

[0341] Figure 10 EGFP protein expression from the IRBP and GRK1 promoters in porcine rod and cone photoreceptors. Three-month-old Large White pigs received a subretinal injection of 1 × 10¹¹ GC / eye of each of the AAV2 / 8-IRBP-EGFP or AAV2 / 8-GRK1-EGFP vectors. Retinal cryosections were obtained 4 weeks post-injection and EGFP was analyzed by fluorescence microscopy. (a, b) Representative images (a) and quantification (b) of fluorescence intensity in the PR layer. Fluorescence intensity was quantified on cryosections from each group of animals (six different fields of view / eye; 20x magnification). (c, d) Representative images (c) and quantification (d) of cone transduction efficacy. Cone transduction efficacy was evaluated on cryosections (six different fields of view / eye; 63x magnification) immunostained with the anti-LUMIf-hCAR antibody, and is expressed as the number of cones expressing EGFP (EGFP+ / CAR+) over the total number of cones (CAR+) in each field of view. (a, c) Scale bars are shown in the figure. (b, d) n = 3 eyes (injected with AAV2 / 8-IRBP-EGFP vector); n = 3 eyes (injected with AAV2 / 8-GRK1-EGFP vector). Values are expressed as mean ± s.e.m. Student's t-test revealed no significant differences. OS: outer segments; ONL: outer nuclear layer; EGFP: native EGFP fluorescence; CAR: anti-cone arrestin staining; DAPI: 4',6-diamidino-2-phenylindole staining. Arrows point to transduced cones.
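A possible way to compute the cone transduction efficacy of panel (d) is sketched below, under the assumption that efficacy is read as the fraction of CAR+ cones in a field of view that also express EGFP, and that per-field values are averaged for each eye. The function names and the averaging step are hypothetical illustrations, not a description of the analysis actually performed.

```python
# Minimal sketch; names are hypothetical and the per-eye averaging is an assumption.

def cone_transduction_efficiency(egfp_positive_cones: int, total_cones: int) -> float:
    """Percentage of CAR+ cones in one field that are also EGFP+."""
    if total_cones <= 0:
        raise ValueError("a field must contain at least one CAR+ cone")
    if not 0 <= egfp_positive_cones <= total_cones:
        raise ValueError("EGFP+/CAR+ count must lie between 0 and the CAR+ count")
    return 100.0 * egfp_positive_cones / total_cones


def mean_efficiency_per_eye(fields) -> float:
    """Average the per-field percentages over the fields counted for one eye.

    `fields` is a sequence of (EGFP+/CAR+ count, CAR+ count) pairs.
    """
    if not fields:
        raise ValueError("at least one field of view is required")
    per_field = [cone_transduction_efficiency(e, t) for e, t in fields]
    return sum(per_field) / len(per_field)
```

With six fields of view per eye, as in the legend, `fields` would hold six such pairs.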

[0342] Figure 11 Subretinal delivery of the modified binary AAV vectors resulted in a significant reduction in lipofuscin accumulation in the retina of Abca4- / - mice. A montage of images from the temporal (injected) side of retinal cross-sections shows lipofuscin autofluorescence (red signal) in the retinas (RPE or RPE+OS) of pigmented Abca4+ / - mice that were uninjected or injected with AAV as a control (Abca4+ / -), and of Abca4- / - mice that were uninjected (Abca4- / -) or injected with the binary AAV hybrid ABCA4 vectors (Abca4- / - AAV5'+3'). For each group, n = 4 eyes. T: temporal side; N: nasal side.

[0343] Figure 12 Similar electrical activity of the eyes in mice and pigs treated with negative controls or modified binary AAV vectors. (a) Representative ERG traces from C57BL / 6 mice one month after injection of the binary AAV hybrid ABCA4 vectors (AAV5'+3') or a negative control (i.e., negative control AAV vectors or PBS; neg). (b) Representative traces from scotopic, maximal response, photopic, and flicker ERG tests in pigs one month after injection of the binary AAV hybrid ABCA4 vectors (AAV5'+3') or PBS.

[0344] Figure 13. Schematic diagram of a vector system strategy according to an embodiment of the present invention. (A) Schematic diagram of a vector system consisting of two vectors according to a preferred embodiment of the present invention: a first vector contains a first portion (CDS1 portion) of a coding sequence, and a second vector contains a second portion (CDS2 portion) of a coding sequence. (A1) The reconstructed sequence of the vector system is located at the overlapping ends of the coding sequence portions. (A2) The reconstructed sequences of the first and second vectors are respectively located at splice donor and splice acceptor sequences. (A3) Each reconstructed sequence contains a splice donor / acceptor, as arranged in A2, and also contains a recombination initiation region. Degradation signals are contained in at least one vector. This figure shows all possible locations of one or more degradation signals of the vector system for each vector according to a preferred non-limiting embodiment of the present invention.

[0345] (B) A schematic diagram of a vector system comprising three vectors according to a preferred embodiment of the present invention: a first vector comprising a first portion of a coding sequence (CDS1 portion), a second vector comprising a second portion of a coding sequence (CDS2 portion), and a third vector comprising a third portion of a coding sequence (CDS3 portion). (B1) The reconstructed sequence of the vector system is located at the overlapping ends of the coding sequence portions (the 3' end of CDS1 overlaps with the 5' end of CDS2; the 3' end of CDS2 overlaps with the 5' end of CDS3). (B2) The reconstructed sequence of the first vector is a splice donor; the second vector contains a first reconstructed sequence at the 5' end of CDS2 and a second reconstructed sequence at the 3' end of CDS2, the first being a splice acceptor and the second a splice donor; the reconstructed sequence of the third vector is a splice acceptor. (B3) Each reconstructed sequence contains a splice donor / acceptor, as arranged in B2, and also contains a recombination initiation region. A degradation signal is contained in at least one vector. This figure shows all possible locations of one or more degradation signals of the vector system for each vector, according to a preferred, non-limiting embodiment of the invention.

[0346] CDS, coding sequence; SD, splice donor signal; RR, recombination initiation region; Deg Sig, degradation signal (see Table 2); SA, splice acceptor signal.

[0347] Figure 14 A schematic diagram of existing technologies for large gene transduction based on multiple vector strategies. CDS: coding sequence; pA: polyadenylation signal; SD: splice donor signal; SA: splice acceptor signal; AP: alkaline phosphatase recombination initiation region; AK: F1 phage recombination initiation region. The dashed line shows splicing between SD and SA, and the dotted line shows the overlapping region available for homologous recombination. Normal-sized and oversized AAV vector plasmids contain a full-length expression cassette comprising a promoter, the full-length transgene CDS, and a polyadenylation signal (pA). The two separate AAV vector plasmids (5' and 3') required to generate binary AAV vectors contain either a promoter followed by the N-terminal portion of the transgene CDS (5' plasmid) or the C-terminal portion of the transgene CDS followed by the pA signal (3' plasmid).

Detailed Implementation

[0348] Materials and methods

[0349] Plasmid generation

[0350] All plasmids used for AAV vector generation were derived from the binary hybrid AK vector plasmids encoding human ABCA4, human MYO7A, or the EGFP reporter protein, and containing the inverted terminal repeats (ITRs) of AAV serotype 2. 14

[0351] The AK recombination initiation sequence contained in the vector plasmids encoding ABCA4 14 was replaced with three different recombination initiation sequences derived from the alkaline phosphatase gene: AP (NM_001632, bp 823-1100; ref. 14); AP1 (XM_005246439.2, bp 1802-1516; ref. 20); AP2 (XM_005246439.2, bp 1225-938; ref. 20).

[0352] Binary AAV vector plasmids carrying heterologous ITRs from AAV serotype 2 (ITR2) and AAV serotype 5 (ITR5) in a 5:2-2:5 configuration were generated by replacing the left ITR2 in the 5' half vector plasmid and the right ITR2 in the 3' half vector plasmid, respectively, with ITR5 (NC_006152.1, bp 1-175). Binary AAV vector plasmids carrying heterologous ITR2 and ITR5 in a 2:5 or 5:2 configuration were generated by replacing the right or left ITR2 with ITR5, respectively. The pAAV5 / 2 packaging plasmid containing the Rep5 (NC_006152.1, bp 171-2206) and AAV2 Cap (AF043303, bp 2203-2208) genes (Rep5Cap2) was obtained from the pAAV2 / 2 packaging plasmid containing the Rep (AF043303, bp 321-1993) and Cap (AF043303, bp 2203-2208) genes from AAV2 (Rep2Cap2), by replacing the Rep2 gene with the Rep5 open reading frame from AAV5 (NC_006152.1, bp 171-2206).

[0353] The pZac5:5-CMV-EGFP plasmid containing an EGFP expression cassette with ITR5 is derived from the pAAV2.1-CMV-EGFP plasmid containing an ITR2 side-linked EGFP expression cassette. 45 get.

[0354] The degradation signals were cloned as follows in the binary AAV heterozygous AK vector encoding ABCA4: in the 5' half of the vector plasmid between the AK sequence and the right ITR2; and in the 3' half of the vector plasmid between the AK sequence and the splice acceptor signal. Detailed information on the degradation signal sequences can be found in Table 2.

[0355] Table 2. Degradation signals used in this study

[0356]

[0357]

[0358] Underlined sequences correspond to the degradation signals; for degradation signals that include repeated sequences, the nucleotides that are not underlined were inserted between the repeats for cloning purposes.

[0359] The ABCA4 protein expressed from the binary AAV vectors is tagged with 3xflag at both the N-terminal portion (amino acid position 590) and the C-terminus for the experiments shown in Figures 3, 4 and 6, and at the C-terminus only for the experiments shown in Figures 2 and 8a.

[0360] The binary AAV hybrid vector sets encoding ABCA4 used in this study include either the ubiquitous CMV promoter (46) or the PR-specific human G protein-coupled receptor kinase 1 (GRK1) promoter (47), and the binary AAV hybrid vectors encoding MYO7A include the ubiquitous CB promoter (39).

[0361] AAV vector generation and characterization

[0362] Large-scale AAV vector preparations were produced by the TIGEM AAV vector core by triple transfection of HEK293 cells followed by two rounds of CsCl2 purification. AAV vectors carrying homologous ITR2 were obtained as previously described (48).

[0363] To obtain AAV vectors carrying heterologous ITR2 and ITR5, a suspension of 1.1 x 10⁹ low-passage HEK293 cells was quadruple-transfected by the calcium phosphate method with 500 μg of pDeltaF6 helper plasmid (which contains the Ad helper genes) (49), 260 μg of pAAV cis plasmid, and different amounts of the Rep2Cap2 and Rep5 packaging constructs. The amounts of the Rep2Cap2 and Rep5 packaging constructs were as follows:

[0364] (i) Option A: 130 μg of each Rep5 and Rep2Cap2 (ratio 1:1)

[0365] (ii) Option B: 90 μg of Rep5 and 260 μg of Rep2Cap2 (ratio 1:3)

[0366] (iii) Option C: 26 μg of Rep5 and 260 μg of Rep2Cap2 (ratio 1:10)
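The three plasmid mixes listed above can be checked with a short calculation. The sketch below is only an illustration in Python; the dictionary and function names are hypothetical, and the only values taken from the text are the microgram amounts of each option. It shows that the 1:3 ratio stated for option B is a rounded value (260/90 is about 2.9).

```python
from fractions import Fraction

# Plasmid masses in micrograms, copied from options A, B and C above.
# The dictionary layout and the names below are illustrative assumptions.
OPTIONS = {
    "A": {"Rep5": 130, "Rep2Cap2": 130},
    "B": {"Rep5": 90, "Rep2Cap2": 260},
    "C": {"Rep5": 26, "Rep2Cap2": 260},
}


def rep2cap2_per_rep5(option):
    """Return the exact Rep2Cap2 mass per unit mass of Rep5 for one option."""
    masses = OPTIONS[option]
    return Fraction(masses["Rep2Cap2"], masses["Rep5"])


if __name__ == "__main__":
    for name in OPTIONS:
        ratio = rep2cap2_per_rep5(name)
        print(f"Option {name}: Rep5:Rep2Cap2 = 1:{float(ratio):.1f}")
```

Working with exact fractions keeps the rounding explicit: the nominal ratios quoted in the text are labels for the mixes, not exact mass ratios.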

[0367] Each AAV preparation was then purified according to the published protocol (48).

[0368] The Rep competition experiments were performed as follows:

[0369] 1- To evaluate the competition between Rep5 and Rep2 in the generation of AAV vectors with ITR2, HEK293 cells were quadruple-transfected by the calcium phosphate method with the pDeltaF6, pAAV2.1-CMV-EGFP cis, Rep2Cap2, and Rep5Cap2 constructs at a weight ratio of 2:1:1.5:1.5. As a control, cells were quadruple-transfected with the pDeltaF6, pAAV2.1-CMV-EGFP, and Rep2Cap2 packaging constructs and an unrelated control plasmid at a weight ratio of 2:1:1.5:1.5.

[0370] 2- To evaluate the competition between Rep2 and Rep5 in the generation of AAV vectors with ITR5, HEK293 cells were quadruple-transfected by the calcium phosphate method with the pDeltaF6, pZac5:5-CMV-EGFP, Rep5Cap2, and Rep2Cap2 constructs at a weight ratio of 2:1:1.5:1.5. As a control, cells were quadruple-transfected with the pDeltaF6, pZac5:5-CMV-EGFP, and Rep5 constructs and an unrelated control plasmid at a weight ratio of 2:1:1.5:1.5.

[0371] For large-scale AAV vector preparations, the physical titer [genome copies (GC)/mL] was determined by averaging the titer obtained by TaqMan (Applied Biosystems, Carlsbad, CA) PCR quantification (48) with a probe annealing on ITR2 and the titer obtained by dot blot analysis (50) with probes annealing within 1 kb of ITR2. For the large-scale AAV vector preparations produced with different Rep5:Rep2Cap2 weight ratios, the physical titer (GC/mL) was determined by TaqMan PCR quantification with a probe annealing on ITR2. For the AAV vector preparations used in the competition experiments, the physical titer (GC) was determined by TaqMan PCR quantification with a probe annealing on the bovine growth hormone (BGH) polyadenylation signal, which is included in the EGFP expression cassette packaged in the AAV vectors.

[0372] AAV infection of HEK293 cells

[0373] AAV infection of HEK293 cells was performed as previously described (14). AAV2 vectors carrying heterologous ITR2 and ITR5, generated according to option C, were used to infect HEK293 cells at a multiplicity of infection (moi) of 1 x 10⁴ GC/cell of each vector (2 x 10⁴ total GC/cell when the inventors used binary AAV vectors at a 1:1 ratio), calculated on the lowest titer obtained for each viral preparation. Infections with AAV2/2 vectors carrying recombination initiation regions and degradation signals were performed at an moi of 5 x 10⁴ GC/cell of each vector (1 x 10⁵ total GC/cell in the case of binary AAV vectors at a 1:1 ratio), calculated on the average of the titers obtained by the TaqMan and dot blot methods.

[0374] For experiments with 5' half vectors containing miR target sites, cells were transfected by the calcium phosphate method with the corresponding miR mimics (50 nM; miRIDIAN microRNA mimics hsa-let-7b-5p, hsa-miR-204-5p, hsa-miR-124-3p, and hsa-miR-26a-5p; Dharmacon, Lafayette, Colorado, USA) 4 hours before infection.
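The moi arithmetic in the preceding paragraph can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, and the cell number and titer used in the usage lines are placeholder values that do not come from the text. Only the per-vector moi values (1 x 10⁴ and 5 x 10⁴ GC/cell) and the 1:1 pairing of the two half vectors are taken from the description.

```python
def total_gc_per_cell(moi_per_vector, n_vectors=2):
    """Total GC/cell when n_vectors are co-delivered at the same moi (1:1 ratio)."""
    return moi_per_vector * n_vectors


def vector_volume_ul(moi_per_vector, n_cells, titer_gc_per_ml):
    """Volume in microliters of one vector preparation needed to reach the moi."""
    gc_needed = moi_per_vector * n_cells      # genome copies required
    return gc_needed / titer_gc_per_ml * 1000.0  # mL converted to microliters


if __name__ == "__main__":
    # Per-vector moi values from the text; two half vectors at a 1:1 ratio.
    print(total_gc_per_cell(1e4))
    print(total_gc_per_cell(5e4))
    # Placeholder assumptions: 1e6 cells per well and a titer of 1e12 GC/mL.
    print(vector_volume_ul(5e4, n_cells=1e6, titer_gc_per_ml=1e12))
```

Because the moi is defined per vector, the total genome copies per cell scale with the number of co-delivered vectors, which is why the text quotes both a per-vector and a total figure.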

[0375] Subretinal injection of AAV vector in mice and pigs

[0376] Mice were housed in the animal facility of the Institute of Genetics and Biophysics (Naples, Italy) and maintained under a 12-hour light/dark cycle (10–50 lux exposure during the light phase). C57BL/6 mice were purchased from Harlan Italy SRL (Udine, Italy). Pigmented Abca4-/- mice were generated by successive crosses of albino Abca4-/- mice (14) with Sv129 mice and were maintained inbred; breeding was performed by crossing heterozygous with homozygous mice. Albino Abca4-/- mice were generated by successive crosses and backcrosses with BALB/c mice (homozygous for Rpe65 Leu450) and were maintained inbred; breeding was performed by crossing heterozygous with homozygous mice. C57BL/6 (5 weeks old), pigmented Abca4-/- (5.5 months old), and albino Abca4-/- (2.5-3 months old) mice were anesthetized as previously described (61), and 1 μl of PBS or AAV2/8 vectors was then delivered subretinally to the temporal side of the retina through a trans-scleral, trans-choroidal approach, as described by Liang et al. (62). AAV2/5-VMD2-human tyrosinase (63) (dose: 2 x 10⁸ GC/eye) was added to the AAV2/8 vector solution that was delivered subretinally to albino Abca4-/- mice (Figure 8d). This allowed the inventors to mark the RPE within the transduced part of the eyecup, which was subsequently dissected and analyzed.

[0377] The Large White female pigs used in this study were registered as purebred in the LW herd book of the Italian National Pig Breeders' Association. The pigs were housed in the animal facility of Cardarelli Hospital (Naples, Italy) and kept under a 12-hour light/dark cycle (10–50 lux exposure during the light phase). The study was conducted in accordance with the Association for Research in Vision and Ophthalmology Statement for the Use of Animals in Ophthalmic and Vision Research and with the Italian Ministry of Health's regulations on animal procedures. All procedures were submitted to the Italian Ministry of Health; the Ministry of Public Health, Animal Health, Nutrition and Food Safety. Surgeries were performed under anesthesia, and every effort was made to minimize suffering. Animals were euthanized as previously described (39). AAV vectors were delivered subretinally to 3-month-old pigs as previously described (39). All eyes were treated with 100 μl of PBS or AAV2/8 vector solution. The AAV2/8 dose was 1 x 10¹¹ GC of each vector per eye; co-injection of binary AAV vectors at a 1:1 ratio therefore resulted in a total dose of 2 x 10¹¹ GC/eye.

[0378] In the animal studies included in Figures 2c, 5b, 8, 9, 10, 11 and 12, right and left eyes were randomly assigned to the different experimental groups, and the researchers who performed and quantified the experiments were blinded to the treatment the animals had received.

[0379] Western blot analysis

[0380] For Western blot analysis, HEK293 cells and mouse and porcine retinas were lysed in RIPA buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 1% NP40, 0.5% sodium deoxycholate, 1 mM EDTA pH 8.0, 0.1% SDS). Protease inhibitors (Complete Protease Inhibitor Cocktail Tablets, Roche) and 1 mM phenylmethylsulfonyl fluoride were added to the lysis buffer. After lysis, MYO7A samples were denatured at 99°C for 5 min in 1X Laemmli sample buffer; ABCA4 samples were denatured at 37°C for 15 min in 1X Laemmli sample buffer supplemented with 4 M urea. Lysates were separated by 6-7% (ABCA4 and MYO7A samples, respectively) or 8% (Figure 5b) SDS-polyacrylamide gel electrophoresis. The antibodies used for immunoblotting were: anti-3xflag (1:1000, A8592; Sigma-Aldrich); anti-MYO7A (1:500, polyclonal; Primm Srl, Milan, Italy), generated using a peptide corresponding to amino acids 941–1070 of the human MYO7A protein; anti-Filamin A (1:1000, catalog number #4762; Cell Signaling Technology, Danvers, MA, USA); and anti-Dysferlin (1:500, Dysferlin, clone Ham1/7B6, MONX10795; Tebu-bio, Le Perray-en-Yvelines, France). The ABCA4 and MYO7A bands detected by Western blot were quantified using ImageJ software (available for free download from http://rsbweb.nih.gov/ij/). For the in vitro experiments using AAV vectors carrying heterologous ITR2 and ITR5, the intensity of the full-length ABCA4 and MYO7A bands was normalized either to the intensity of the truncated protein bands in the corresponding lane or to that of the Filamin A bands, while the intensity of the shorter ABCA4 and MYO7A protein bands was normalized to that of the Filamin A bands. The intensity of the ABCA4 bands obtained with AAV vectors carrying degradation signals or homology regions was normalized to that of the Filamin A bands (in vitro experiments) or the Dysferlin bands (in vivo experiments).

[0381] Western blot experiments were quantified as follows:

[0382] - Figure 2a, b: the intensity of the ABCA4 band was normalized to that of the Filamin A band in the corresponding lane. Normalized ABCA4 expression was then expressed as a percentage of that obtained with the binary AAV hybrid AK vectors;

[0383] - Figure 2c: the intensity (a.u.) of the ABCA4 band was calculated as the fold increase over the average intensity measured at the same level in the negative control lanes of each gel (the negative control sample in lane 7 of the lower left panel was excluded from the analysis because of an abnormally high background signal). Values for each group are expressed as mean ± standard error of the mean (s.e.m.).

[0384] - Figure 3b, d: the intensity of the full-length ABCA4 and truncated protein bands was divided by that of the Filamin A band, or the intensity of the full-length ABCA4 protein band was divided by that of the truncated protein band in the corresponding lane. Values are expressed as mean ± s.e.m.

[0385] Table 5: intensities of the full-length ABCA4 and truncated protein bands detected in cells co-infected with 5' and 3' half vectors. The ratios between the intensities of the full-length ABCA4 and truncated protein bands in the presence of the corresponding mimics or of scrambled mimics were calculated. Values represent the mean ± s.e.m. of the ratios from three independent experiments.

[0386] Table 6: intensities of the full-length ABCA4 and truncated protein bands detected in cells co-infected with 5' and 3' half vectors. The ratios between the intensities of the full-length ABCA4 and truncated bands from vectors with and without degradation signals were calculated. Values represent the mean ± s.e.m. of the ratios from three independent experiments.

[0387] - Figure 8a: the intensity (a.u.) of the ABCA4 band was calculated as the fold increase over the average background intensity measured in the negative control lanes of the corresponding gel. Values are expressed as mean ± s.e.m.
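The normalization schemes listed above reduce to a few ratios. The following Python sketch is an illustration under stated assumptions, not the inventors' analysis pipeline (band intensities were obtained with ImageJ): all function names are hypothetical, and the numbers in the usage lines are made-up placeholders rather than measured intensities.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(band, loading_control):
    """Band intensity divided by the loading-control band of the same lane."""
    return band / loading_control


def percent_of_reference(normalized, reference_normalized):
    """Normalized expression as a percentage of a reference vector (e.g. AK)."""
    return 100.0 * normalized / reference_normalized


def fold_over_background(band, negative_control_bands):
    """Fold increase over the mean intensity of the negative control lanes."""
    return band / mean(negative_control_bands)


def mean_sem(values):
    """Mean and standard error of the mean of replicate values."""
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Placeholder intensities (arbitrary units), for illustration only.
    ak = normalize(80.0, 100.0)
    ap1 = normalize(90.0, 100.0)
    print(percent_of_reference(ap1, ak))
    print(fold_over_background(30.0, [10.0, 20.0]))
    print(mean_sem([1.0, 2.0, 3.0]))
```

Keeping each normalization as a separate function mirrors the text, where different figures use different denominators (loading control, reference vector, or background).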

[0388] Southern blot analysis

[0389] Viral DNA was extracted from 3 x 10¹⁰ GC of AAV particles. To digest unpackaged genomes, the vector solution was resuspended in 240 μl of PBS pH 7.4 (GIBCO; Invitrogen SRL, Milan, Italy) and then incubated at 37°C for 2 h with 1 U/μl of DNase I (Roche) in a total volume of 300 μl containing 40 mM Tris-HCl, 10 mM NaCl, 6 mM MgCl2, and 1 mM CaCl2 (pH 7.9). DNase I was then inactivated with 50 mM EDTA, followed by incubation at 50°C for 45 min with proteinase K and 2.5% N-lauroylsarcosine solution to lyse the capsids. The DNA was extracted twice with phenol-chloroform and precipitated with two volumes of 100% ethanol and 10% sodium acetate (3 M, pH 7). Alkaline agarose gel electrophoresis and blotting were performed as previously described (Sambrook and Russell, 2001, *Molecular Cloning*). Ten μl of 1 kb DNA ladder (N3232L; New England Biolabs, Ipswich, MA, USA) were loaded as a molecular weight marker. Two different double-stranded DNA fragments were labeled with digoxigenin-dUTP using the DIG High Prime DNA Labeling and Detection Starter Kit (Roche) and used as probes. The 5′ probe (768 bp) was generated by double digestion of the pZac2.1-CMV-ABCA4_5′ plasmid with SpeI and NotI; the 3′ probe (974 bp) was generated by double digestion of the pZac2.1-ABCA4_3′_3xflag_SV40 plasmid with ClaI and MfeI. Prehybridization and hybridization were performed at 65°C in Church buffer (Sambrook and Russell, 2001, *Molecular Cloning*) for 1 hour and overnight, respectively. The membrane (Whatman Nytran N, charged nylon membrane; Sigma-Aldrich, Milan, Italy) was then washed first in 2x SSC–0.1% SDS for 30 min, then in 0.5x SSC–0.1% SDS at 65°C for 30 min, and then in 0.1x SSC–0.1% SDS at 37°C for 30 min. The membrane was then analyzed by chemiluminescence detection by enzyme immunoassay using the DIG DNA Labeling and Detection Kit (Roche).

[0390] Histological analysis

[0391] Mice were euthanized, and their eyeballs were then harvested and fixed overnight by immersion in 4% paraformaldehyde (PFA). Before harvesting the eyeballs, the temporal side of the sclera was marked by cautery so that the eye could be oriented with respect to the injection site at the moment of embedding. The eyeballs were cut so that the lens and vitreous could be removed while leaving the eyecups intact. Mouse eyecups were infiltrated with 30% sucrose for cryopreservation and embedded in tissue freezing medium (OCT matrix; Kaltek, Padua, Italy). For each eye, 150–200 serial sections (10 μm thick) were cut along the horizontal plane and progressively distributed on 10 slides, so that each slide contained 15–20 sections, each representative of a different level of the whole eye. The sections were stained with 4',6'-diamidino-2-phenylindole (Vector Lab, Peterborough, UK) and examined at different magnifications with a Zeiss Axiocam (Carl Zeiss, Oberkochen, Germany).

[0392] Pigs were euthanized, and their eyeballs were harvested and fixed overnight by immersion in 4% PFA. The eyeballs were cut so that the lens and vitreous could be removed, leaving the eyecups in place. The eyecups were cryoprotected by progressive infiltration with 10%, 20%, and 30% sucrose. Embedding was performed in tissue freezing medium (OCT matrix; Kaltek). Before embedding, the pig eyecups were examined with a fluorescence stereomicroscope (Leica Microsystems GmbH, Wetzlar, Germany) to locate the transduced region whenever an EGFP-encoding vector had been administered. For each eye, 200–300 serial sections (12 μm thick) were cut along the horizontal meridian and progressively distributed on slides, so that each slide contained 6–10 sections. Section staining and image acquisition were performed as described for mice.

[0393] Cone immunofluorescence staining

[0394] Frozen retinal sections were washed once with PBS and then permeabilized for 1 hour in PBS containing 0.1% Triton X-100. They were then incubated for 1 hour in blocking solution containing 10% normal goat serum (Sigma-Aldrich). The primary antibody [anti-human CAR (66, 67), which also recognizes porcine CAR ("Luminaire founders" hCAR, 1:10,000; kindly provided by Dr. Cheryl M. Craft, Doheny Eye Institute, Los Angeles, California)] was diluted in PBS and incubated overnight at 4°C. The secondary antibody (Alexa Fluor 594, anti-rabbit, 1:1,000; Molecular Probes, Invitrogen, Carlsbad, CA) was incubated for 45 minutes. Sections stained with the anti-CAR antibody were analyzed at 63x magnification using a Leica laser confocal microscope system (Leica Microsystems), as previously described (64). Briefly, for each eye, six different z-stacks from six different transduced regions were acquired. For each z-stack, images from single planes were used to count CAR+/EGFP+ cells. While counting, the inventors moved carefully along the z-axis to distinguish one cell from another and thus avoid counting the same cell twice. For each retina, the inventors counted the CAR-positive (CAR+)/EGFP-positive (EGFP+) cells over the total number of CAR+ cells. The inventors then calculated the average number of CAR+/EGFP+ cells in the three eyes of each experimental group.

[0395] EGFP quantification

[0396] The fluorescence intensity in PR was quantified rigorously and reproducibly in an unbiased manner, as previously described (64). Individual color channel images were acquired using a Leica microscope (Leica Microsystems). TIFF images were converted to grayscale using image analysis software (LAS AF lite; Leica Microsystems). Six images from each eye were analyzed at 20x magnification by a masked observer. The PR (outer nuclear layer + OS) were selectively outlined in each image, and the total fluorescence within the enclosed region was calculated in an unbiased manner using the same image analysis software. The fluorescence in PR was then averaged over the six images, which were collected from different retinal sections of each eye. The inventors then calculated the average fluorescence of the three eyes in each experimental group.
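The two-level averaging described above (images within an eye, then eyes within a group) can be sketched in a few lines. This is a minimal illustration, assuming that the per-image total fluorescence values have already been exported from the image analysis software; the function names and the numbers in the usage lines are hypothetical.

```python
from statistics import mean


def eye_fluorescence(image_values):
    """Average PR fluorescence over the images acquired from one eye."""
    return mean(image_values)


def group_fluorescence(eyes):
    """Average of the per-eye means across the eyes of one experimental group."""
    return mean([eye_fluorescence(values) for values in eyes])


if __name__ == "__main__":
    # Placeholder values: three eyes, six images per eye (arbitrary units).
    group = [
        [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        [2.0] * 6,
        [5.0] * 6,
    ]
    print(group_fluorescence(group))
```

Averaging per eye first makes the eye, not the image, the unit of analysis, so an eye with unusually bright sections does not dominate the group value.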

[0397] Quantification of lipofuscin autofluorescence

[0398] For lipofuscin fluorescence analysis, eyes were harvested from pigmented Abca4+/- and Abca4-/- mice 3 months after AAV injection. Mice were dark-adapted overnight and sacrificed under dim red light. For each eye, four overlapping images of the temporal side of three sections from different regions of the eye were acquired with a TX2 filter (excitation: 560±40 nm; emission: 645±75 nm) (71-75) using a Leica DM5000B microscope with a 20X objective. The four images from each section were then combined into a single montage for further fluorescence analysis. The intensity of lipofuscin fluorescence (red signal) in each section was calculated automatically using ImageJ software and then normalized to the length of the RPE underlying the fluorescent region.

[0399] Transmission electron microscopy

[0400] For electron microscopy analysis, eyes were harvested from albino Abca4-/- and Abca4+/+ mice 3 months after AAV injection. Eyes were fixed overnight in 0.2% glutaraldehyde-2% paraformaldehyde in 0.1 M PHEM buffer (pH 6.9) (240 mM PIPES, 100 mM HEPES, 8 mM MgCl2, 40 mM EGTA) and then rinsed in 0.1 M PHEM buffer. The eyes were then dissected under a light microscope to select the tyrosinase-positive portions of the eyecups. The transduced portions of the eyecups were subsequently embedded in 12% gelatin, infused with 2.3 M sucrose, and frozen in liquid nitrogen. Cryosections (50 nm) were cut using a Leica Ultramicrotome EM FC7 (Leica Microsystems), taking great care to align the PR connecting cilia longitudinally. To avoid bias in the attribution of morphological data to the different experimental groups, lipofuscin granules were counted by a masked observer (Dr. Roman Polishchuk) using iTEM software (Olympus SYS, Hamburg, Germany). The "Touch count" module of the iTEM software was used to count the number of lipofuscin granules in 25 μm² areas (at least 40) distributed randomly across the RPE layer. Granule density was expressed as the number of granules per 25 μm².

[0401] Electroretinography (ERG) recording

[0402] Electrophysiological recordings in mice and pigs were performed as detailed in (68) and (69), respectively.

[0403] Statistical analysis

[0404] A p-value ≤ 0.05 was considered statistically significant. One-way ANOVA (using R statistical software) with a post-hoc multiple comparison procedure was used to compare the data shown in Figure 2b (pANOVA = 1.2 x 10⁻⁶), 2c (pANOVA = 0.326), 8c (pANOVA = 1.5 x 10⁻¹⁰), 8d (pANOVA = 0.034), 9a (pANOVA a-wave: 0.5; pANOVA b-wave: 0.8), and Table 6 (pANOVA = 0.0135). Because the counts of lipofuscin granules (Figure 8d) are expressed as discrete numbers, they were analyzed by deviance from a negative binomial generalized linear model (65). The p-values of the between-group comparisons determined with the post-hoc multiple comparison procedure are as follows. Figure 2b: AP vs. AK: 1.08 x 10⁻⁵; AP1 vs. AK: 0.05; AP2 vs. AK: 0.17; AP1 vs. AP: 1.8 x 10⁻⁶; AP2 vs. AP: 2.8 x 10⁻⁶; AP2 vs. AP1: 0.82. Figure 8c: Abca4+/- not injected vs. Abca4-/- not injected: 0.00; Abca4-/- not injected vs. Abca4-/- AAV5'+3': 9.3 x 10⁻⁵; Abca4+/- not injected vs. Abca4-/- AAV5'+3': 4 x 10⁻⁶. Figure 8d: Abca4-/- PBS vs. Abca4-/- AAV5'+3': 0.01; Abca4+/+ PBS vs. Abca4-/- AAV5'+3': 0.37; Abca4+/+ not injected vs. Abca4-/- AAV5'+3': 0.53; Abca4+/+ PBS vs. Abca4-/- PBS: 0.05; Abca4+/+ not injected vs. Abca4-/- PBS: 0.03; Abca4+/+ not injected vs. Abca4+/+ PBS: 0.76. Table 6: 3xSTOP vs. no degradation signal: 0.97; 3xSTOP vs. PB29: 1.0; 3xSTOP vs. 3xPB29: 0.15; 3xSTOP vs. ubiquitin: 0.10; PB29 vs. no degradation signal: 1.0; PB29 vs. 3xPB29: 0.1; PB29 vs. ubiquitin: 0.07; 3xPB29 vs. no degradation signal: 0.06; 3xPB29 vs. ubiquitin: 1.0; ubiquitin vs. no degradation signal: 0.04.

[0405] Student's t-test was used to compare the data shown in Figure 3c, d, and f.
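For readers unfamiliar with the test named above, the following sketch shows how a one-way ANOVA F statistic is computed from grouped measurements. It is an illustration only: the analysis described in the text was run in R, the function name is hypothetical, and the input values are placeholders rather than data from the study. The sketch stops at the F statistic and its degrees of freedom; obtaining the p-value additionally requires the F distribution, which statistical packages provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Placeholder measurements for three hypothetical experimental groups.
    f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
    print(f_stat, df_b, df_w)
```

A large F means the group means differ by more than the scatter within groups would explain; the post-hoc comparisons listed in the text then identify which pairs of groups differ.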

[0406] Results

[0407] Binary AAV hybrid vectors that include the AP1, AP2, or AK recombination initiation regions show efficient transduction.

[0408] The inventors evaluated several of the multiple vector strategies shown in Figures 1 and 13.

[0409] Specifically, they evaluated in parallel the transduction efficacy of binary AAV hybrid vectors with different homology regions. For this purpose, the inventors generated binary AAV2/2 hybrid vectors comprising the ABCA4-3xflag coding sequence under the control of the ubiquitous CMV promoter and the AK (14), AP (14), AP1 or AP2 (20) homology regions (Figure 7). The inventors used these vectors to infect HEK293 cells [multiplicity of infection, moi: 5 x 10⁴ genome copies (GC)/cell of each vector]. Cell lysates were analyzed by Western blot with an anti-3xflag antibody to detect ABCA4-3xflag (Figure 2a). Each binary AAV hybrid vector resulted in expression of a full-length protein of the expected size, which was not detected in the lanes loaded with negative controls (Figure 2a). Quantification of ABCA4 expression (Figure 2b) showed that infection with the binary AAV hybrid AP1 and AP2 vectors resulted in slightly higher levels of transgene expression than the binary AAV hybrid AK vectors, and that all three significantly outperformed the binary AAV hybrid AP vectors (14). The inventors previously found that the efficacy of binary AAV vectors that depend on homologous recombination was lower in terminally differentiated cells such as PR than in cell culture (14). Therefore, the inventors evaluated PR-specific transduction levels in C57BL/6 mice after subretinal administration of binary AAV AK, AP1, and AP2 vectors comprising the PR-specific human G protein-coupled receptor kinase 1 (GRK1) promoter (dose of each vector/eye: 1.9 x 10⁹ GC; Figure 2c). One month after vector administration, the inventors detected more consistent ABCA4 protein expression in retinas treated with the binary AAV hybrid AK vectors than in those treated with the AP1 or AP2 vectors (Figure 2c).

[0410] The inclusion of heterologous ITRs in AAV vectors affected their production yields but did not reduce the level of truncated protein products.

[0411] To test whether the use of heterologous ITRs improves the productive directional concatemerization of binary AAV vectors, the inventors generated binary AAV2/2 hybrid AK vectors comprising the ABCA4-3xflag or MYO7A-HA coding sequence and carrying heterologous ITR2 and ITR5 in either the 5:2 (left ITR from AAV5 and right ITR from AAV2) or the 2:5 (left ITR from AAV2 and right ITR from AAV5) configuration (Figure 1). The generation of binary AAV vectors carrying heterologous ITR2 and ITR5 requires the simultaneous expression of the Rep proteins from AAV serotypes 2 and 5, which cannot cross-complement each other in viral replication (23). Indeed, it has been shown that Rep2 and Rep5 can bind interchangeably to ITR2 or ITR5, although less efficiently than to the homologous ITR; however, they cannot cleave the terminal resolution site of the ITR of the other serotype (36). Therefore, before generating binary AAV hybrid AK vectors with heterologous ITR2 and ITR5, the inventors evaluated (i) the potential competition between Rep5 and Rep2 in the generation of AAV2/2-CMV-EGFP vectors (i.e., vectors with homologous ITR2), and (ii) the potential competition between Rep2 and Rep5 in the generation of AAV5/2-CMV-EGFP vectors (i.e., vectors with homologous ITR5), using equal amounts of the Rep5Cap2 and Rep2Cap2 packaging constructs (1:1 ratio). Indeed, providing the Rep5Cap2 packaging construct in addition to Rep2Cap2 reduced the total yield of AAV2/2-CMV-EGFP vectors to 42% of that obtained with Rep2Cap2 as the only packaging construct (average of four independent preparations of each type; Student's t-test p < 0.05). Conversely, no significant difference was found in the total yield of AAV5/2-CMV-EGFP preparations obtained by adding Rep2Cap2 to Rep5Cap2, which was 83% of that obtained when Rep5Cap2 was the only packaging construct transfected (average of four independent preparations of each type; no significant difference by Student's t-test).
Given the competition between Rep5 and Rep2 in the generation of vectors with ITR2, the inventors tested three different ratios of the Rep5 and Rep2Cap2 packaging constructs for the production of AAV vectors with heterologous ITR2 and ITR5 (the Rep5:Rep2Cap2 ratio was 1:1 in option A, 1:3 in option B, and 1:10 in option C). As shown in Table 3, the viral titer, determined by PCR quantification with a probe annealing on ITR2, increased progressively as the amount of Rep5 decreased, with option C giving the best titer.

[0412] Table 3. Yields of AAV5:2/2 vectors in the presence of different ratios of the Rep5 and Rep2 packaging constructs.

[0413]

[0414] ID: identification number of the AAV5:2/2 vector preparation; GC: genome copies.

[0415] These results confirm the competition between Rep5 and Rep2 in the generation of vectors containing ITR2 and led the inventors to follow option C to generate AAV vectors with heterologous ITR2 and ITR5. However, several AAV preparations obtained with this strategy showed that: (i) the titer measured on ITR2 was up to 6-fold lower than the titer measured on the transgene sequence between the ITRs (Table 4), suggesting impaired ITR2 integrity; and (ii) the total yield of AAV vectors with heterologous ITR2 and ITR5 was on average about 6-fold lower than that of vectors with homologous ITR2 (Table 4).

[0416] Table 4. Low yields and differences between the ITR2 and transgene titers of AAV2 vectors with heterologous ITR2 and ITR5.

[0417]

[0418]

[0419] ID: identification number of the AAV vector preparation; GC: genome copies. a Values represent the mean ± SEM.

[0420] However, Southern blot analysis of the AAV preparations with heterologous ITRs revealed no significant alteration in genome integrity (Figure 3a).

[0421] To test whether the inclusion of heterologous ITRs in binary AAV hybrid AK vectors could enhance the formation of productive tail-to-head concatemers and full-length protein transduction while reducing the generation of truncated proteins, the inventors infected HEK293 cells with binary AAV hybrid AK vectors encoding ABCA4 or MYO7A and carrying either heterologous ITR2 and ITR5 (in the 5:2/2:5 configuration) or homologous ITR2 (Figure 3b, e).

[0422] Given the difference between the ITR2 and transgene titers of vectors with heterologous, but not homologous, ITRs (Table 4), the inventors infected cells at 10⁴ genome copies (GC)/cell based on either the ITR2 or the transgene titer. Western blot analysis of HEK293 cells infected with the binary AAV vectors on the basis of the ITR2 titer, using anti-3xflag (to detect ABCA4-3xflag; Figure 3b) or anti-Myo7a (Figure 3e) antibodies, showed that the inclusion of heterologous ITR2 and ITR5 resulted in higher levels of both full-length and truncated proteins than homologous ITR2 (Figure 3b, c, d, f). However, this was not observed when HEK293 cells were infected with the same binary AAV vector preparations on the basis of the transgene titer (Figure 3b, d). Overall, the ratio between full-length and truncated protein expression was similar regardless of the ITRs included in the vectors (Figure 3c, d, f) and of the vector titer used to dose the cells (Figure 3b, c, d).

[0423] The CL1 degradation determinant in the 5' half-vector reduces the generation of truncated protein products.

[0424] To selectively reduce the truncated protein products generated by binary AAV hybrid vectors,14 the inventors placed putative degradation sequences in the 5' half-vector after the splice donor signal, between the AK region and the right ITR, and in the 3' half-vector between the AK region and the splice acceptor signal (Figure 1). The degradation signal is therefore included in the truncated protein but not in the full-length protein, which is produced from the spliced mRNA. As degradation signals in the 5' half-vector, the inventors included: (i) the CL1 degradation determinant (CL1), (ii) four copies of the miR-let7b target site (4xLet7b), (iii) four copies of the miR-26a target site (4x26a), or (iv) a combination of three copies each of the miR-204 and miR-124 target sites (3x204+3x124) (Table 2). As degradation signals in the 3' half-vector, the inventors included: (i) a 3-stop codon sequence (STOP), (ii) a single copy (PB29) or three tandem copies (3xPB29) of PB29, or (iii) ubiquitin (Table 2). The inventors generated binary AAV2/2 hybrid AK vectors encoding ABCA4 and incorporating the different degradation signals, and evaluated their efficacy after infection of HEK293 cells [multiplicity of infection (m.o.i.): 5 × 10⁴ genome copies (GC)/cell of each vector]. Because miR-let7b, miR-26a, miR-204, and miR-124 are expressed at low levels or are completely absent in HEK293 cells (Ambion miRNA Research Guide and...),37 the inventors tested silencing of the constructs containing these miR target sites using miR mimics (i.e., chemically modified small double-stranded RNAs that mimic endogenous miRs).38 Cells were transfected with the mimics and then infected with the AAV2/2 vectors containing the corresponding target sites. To determine the miR mimic concentration required to silence a gene containing the corresponding miR target sites, the inventors used a plasmid encoding the reporter EGFP protein and containing the miR target sites upstream of the polyadenylation signal (data not shown). The same experimental setup was then used to evaluate the miR target sites in the context of the binary AAV hybrid AK vectors. The inventors found that including the miR-204+124 and miR-26a target sequences in the 5' half of the binary AAV hybrid AK vectors reduced, although did not eliminate, expression of the truncated protein product without affecting expression of the full-length protein (Figure 4). In contrast, inclusion of the miR-let7b target sites failed to effectively reduce expression of the truncated protein (Figure 4).
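The tandem miR target cassettes described above can be illustrated with a short sketch. It assumes that a target site is the perfect reverse complement of the mature miRNA, takes the mature let-7b sequence from public miRNA annotation (an assumption, since it is not given in this text), and uses spacer sequences read off the 102-nt entry SEQ ID NO: 41 of the sequence listing. The function names are illustrative only.

```python
# Sketch: derive a perfectly complementary miR target site and join four
# copies into a tandem cassette. Helper names are hypothetical.

# Mature let-7b sequence (RNA, 5'->3'); assumed from public annotation,
# not stated in this document.
LET7B_MATURE = "UGAGGUAGUAGGUUGUGUGGUU"

# Spacers between copies, as read from SEQ ID NO: 41 in the sequence listing.
SPACERS = ["cgat", "aagctt", "tcac"]


def target_site(mature_rna):
    """Return the DNA reverse complement of a mature miRNA (lowercase)."""
    complement = {"A": "t", "U": "a", "G": "c", "C": "g"}
    return "".join(complement[base] for base in reversed(mature_rna.upper()))


def tandem_cassette(site, spacers):
    """Join len(spacers) + 1 copies of `site`, separated by the spacers."""
    parts = [site]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(site)
    return "".join(parts)


site = target_site(LET7B_MATURE)
cassette = tandem_cassette(site, SPACERS)
print(site)           # aaccacacaacctactacctca (matches SEQ ID NO: 40)
print(len(cassette))  # 102 (the length given for SEQ ID NO: 41)
```

The same two helpers would apply to any other mature miRNA sequence; only the input sequence and the spacers change.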

[0425] Notably, as shown in Figure 5a, the inventors found that including the CL1 degradation signal in the 5' half-vector reduced expression of the truncated protein to undetectable levels without affecting expression of the full-length protein (Figure 5a). Because the enzymes of the ubiquitination pathway that mediates CL1 degradation differ in their expression between tissues,31 the efficacy of CL1 may vary. The inventors therefore further evaluated the efficacy of the CL1 degradation determinant in the porcine retina, which is similar in size and structure to the human retina,19,30,39,40 and is therefore an excellent preclinical large animal model for evaluating vector safety and efficacy. For this purpose, the inventors subretinally injected Large White pigs with AAV2/8 binary AAV hybrid AK vectors encoding ABCA4, with or without the CL1 sequence in the 5' half-vector (dose/eye: 1 × 10¹¹ GC). Notably, the inventors found that including the CL1 degradation signal in the 5' half-vector significantly reduced truncated protein expression, to below the detection limit of Western blot analysis, without affecting full-length protein expression (Figure 5b). Among the degradation signals tested in the 3' half-vector, the inventors found that the stop codons did not affect generation of the truncated protein. In contrast, both PB29 (as a single copy or as three tandem copies) and ubiquitin were effective in reducing truncated protein expression. However, whereas ubiquitin also abolished full-length protein expression, PB29 had a lesser effect on full-length protein generation (Figure 6).
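To make concrete what appending CL1 to the truncated product means at the sequence level, the following sketch translates a 48-nt coding sequence into a 16-residue peptide using the standard genetic code. It assumes that SEQ ID NO: 16 in the sequence listing is the CL1 coding sequence and that SEQ ID NO: 1 is the encoded peptide. That mapping is not stated in this section and is an assumption, and the helper names are illustrative.

```python
# Sketch: translate a DNA coding sequence with the standard genetic code.
from itertools import product

# Standard codon table built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate `dna` (spaces ignored) starting at offset `frame`."""
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# SEQ ID NO: 16 as printed in the sequence listing (assumed to encode CL1).
CL1_DNA = "gcctgcaaga actggttcag cagcctgagc cacttcgtga tccacctg"
# SEQ ID NO: 1 in one-letter code (assumed to be the CL1 peptide).
CL1_PEPTIDE = "ACKNWFSSLSHFVIHL"

print(translate(CL1_DNA))                 # ACKNWFSSLSHFVIHL
print(translate(CL1_DNA) == CL1_PEPTIDE)  # True
```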

[0426] Among the degradation signals tested in the 3' half-vector, the inventors identified three (PB29, 3xPB29, and ubiquitin) that reduced the levels of both the truncated protein product and the full-length protein (Figure 6 and Tables 5 and 6).
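For the STOP element tested in the 3' half-vector, a quick way to see how a short "3-stop codon" sequence can present a stop codon in every reading frame is to scan each frame separately. The sketch below does this for the 11-nt sequence listed as SEQ ID NO: 51. Treating that entry as the STOP signal is an assumption based on its composition, not something stated in this section, and the function name is illustrative.

```python
# Sketch: list the stop codons found in each of the three forward frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stops_by_frame(dna):
    """Map each frame (0, 1, 2) to the stop codons it contains, in order."""
    dna = dna.replace(" ", "").upper()
    result = {}
    for frame in range(3):
        codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
        result[frame] = [codon for codon in codons if codon in STOP_CODONS]
    return result


# SEQ ID NO: 51 as printed in the sequence listing (assumed STOP element).
STOP_SIGNAL = "tgaatgaatg a"

print(stops_by_frame(STOP_SIGNAL))
# {0: ['TGA'], 1: ['TGA'], 2: ['TGA']}
```

The check only shows that a stop codon is present in each frame of the element. It says nothing about how effective the element is in a vector, which is what the Western blot data address.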

[0427] Table 5. Quantification of full-length ABCA4 relative to truncated protein expression, from Western blot analysis of HEK293 cells infected with a binary AAV hybrid vector containing miR target sites in the 5' half vector.

[0428]

[0429] Values represent the mean ± s.e.m. of the ratio between the intensities of the full-length ABCA4 and truncated protein bands (from three independent experiments) in the presence of the corresponding mimic or the scrambled mimic. Student's t-test was used to compare the ratios for each vector pair in the presence of the scrambled or the corresponding mimic; no significant differences were found.
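The summary statistics named in this legend can be sketched as follows. The band-intensity ratios are hypothetical placeholders (the table values are not reproduced in this text), and the unpaired, equal-variance form of Student's t-test is an assumption, because this section does not specify which variant was used.

```python
# Sketch: mean ± s.e.m. and a two-sample Student's t statistic.
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of values."""
    return mean(values), stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Return (t statistic, degrees of freedom), pooled-variance form."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical full-length/truncated band ratios from three experiments each.
scrambled_mimic = [1.0, 1.2, 0.8]
matching_mimic = [1.5, 1.9, 1.4]

print(mean_sem(scrambled_mimic))
print(mean_sem(matching_mimic))
print(student_t(matching_mimic, scrambled_mimic))
```

Converting the t statistic to a p-value requires the t distribution with the returned degrees of freedom, for which a statistics package would normally be used.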

[0430] Table 6. Quantification of full-length ABCA4 and truncated protein expression from Western blot analysis of HEK293 cells infected with a binary AAV hybrid vector containing degradation signals in the 3' half vector.

[0431]

[0432] Values are expressed as the mean ± s.e.m. of the ratio between the intensities of the full-length ABCA4 and truncated protein bands from vectors with or without degradation signals (from three independent experiments). Further details of the statistical analyses, including specific statistical values, can be found in the Statistical Analysis section of Materials and Methods.

[0433] Subretinal administration of the improved binary AAV vectors reduced lipofuscin accumulation in the Abca4-/- retina.

[0434] Based on these findings, the improved binary AAV hybrid ABCA4 vectors should include homologous ITR2, the AK region of homology, and CL1. Because ABCA4 is expressed in both rod and cone photoreceptors (PR) in humans,70 the inventors identified a suitable promoter for ABCA4 delivery by comparing the PR transduction properties of single AAV2/8 vectors encoding EGFP under the human GRK1 (G protein-coupled receptor kinase 1) or IRBP (interphotoreceptor retinoid-binding protein) promoter, both of which have been described as driving high levels of combined rod and cone PR transduction in different species.53-55 The inventors took advantage of the architecture of the porcine retina, which includes a streak-like region with a cone:rod ratio of 1:3, similar to the human macula,56 and subretinally injected 3-month-old Large White pigs with 1 × 10¹¹ GC/eye of AAV2/8-GRK1-EGFP or AAV2/8-IRBP-EGFP vector. Four weeks after injection, the inventors analyzed the corresponding frozen retinal sections under a fluorescence microscope. Quantification of EGFP fluorescence in the PR cell layer (Figure 10a, b) showed that both promoters produced comparable levels of PR transduction (primarily rods in this region). However, when the inventors counted the cones that were labeled with an antibody against cone arrestin (CAR)57 and were also EGFP positive, the GRK1 promoter was found to produce a higher, although not statistically significantly higher, level of cone PR transduction (Figure 10). The inventors therefore incorporated the GRK1 promoter into the improved binary AAV hybrid ABCA4 vectors and investigated their ability to express ABCA4 and to reduce the abnormal levels of A2E-containing autofluorescent lipofuscin in the RPE of Abca4-/- mice. The inventors initially injected the improved binary AAV vectors subretinally into one-month-old C57/BL6 mice (dose/eye: 2 × 10⁹ GC); Western blot analysis revealed detectable, albeit variable, levels of full-length ABCA4 protein in 12 of 24 (50%) injected eyes [Figure 8a; ABCA4 protein level in ABCA4-positive eyes: 2.8 ± 0.7 a.u. (mean ± standard error of the mean)]. This is similar to the inventors' previous finding that different versions of the binary AAV platform yield 50% ABCA4-expressing eyes.14 The inventors then subretinally injected the improved binary AAV vectors into the temporal region of the eye of 5.5-month-old pigmented Abca4-/- mice (vector dose/eye: 1.8 × 10⁹ GC). Three months later, the inventors harvested the eyes and measured lipofuscin fluorescence (excitation: 560 ± 40 nm; emission: 645 ± 75 nm) in the temporal region of the eye on frozen retinal sections [in the RPE alone or in the RPE + outer segments (OS)] (Figure 8b, c and Figure 11). The inventors found that lipofuscin fluorescence intensity in this region of the eye was significantly higher in untreated Abca4-/- mice than in Abca4+/- and Abca4-/- mice injected with the therapeutic binary AAV hybrid ABCA4 vectors (Figure 8b, c and Figure 11). The inventors then used transmission electron microscopy to count RPE lipofuscin granules. These were increased in 5.5–6-month-old albino Abca4-/- mice injected with PBS compared with age-matched Abca4+/+ controls (Figure 8d). The increased levels were similar to those the inventors had independently measured in Abca4-/- mice that were either uninjected or injected with control AAV vectors (data not shown). The number of lipofuscin granules in the Abca4-/- RPE was normalized 3 months after subretinal injection of the improved binary AAV hybrid ABCA4 vectors (vector dose/eye: 1 × 10⁹ GC; Figure 8d).

[0435] The improved binary AAV vector was safe after subretinal administration to the retinas of mice and pigs.

[0436] To investigate the safety of the improved binary AAV2/8 hybrid ABCA4 vectors, the inventors injected them subretinally into wild-type C57BL/6 mice and Large White pigs (dose per eye of each vector: 3 × 10⁹ and 1 × 10¹¹ GC, respectively). One month after injection, the inventors measured retinal electrical activity by Ganzfeld electroretinography (ERG) and found no significant difference in a- and b-wave amplitudes between mouse eyes injected with the binary AAV hybrid ABCA4 vectors and eyes injected with negative control AAV vectors or PBS (Figure 9a and Figure 12a). Similarly, b-wave amplitudes in the scotopic, photopic, maximal response, and flicker ERG tests were comparable between pig eyes injected with the binary AAV hybrid ABCA4 vectors and control eyes injected with PBS (Figure 9b and Figure 12b).

[0437] Discussion

[0438] The limited packaging capacity of AAV is one of the major obstacles to expanding its application in IRD gene therapy. Recently, however, several research groups have independently reported that binary AAV vectors can effectively expand the transfer capacity of AAV in the mouse and porcine retina.14,17,19,41 This extends the applicability of AAV to IRDs caused by mutations in genes that do not fit into a single, canonical AAV vector. The inventors set out to overcome some of the limitations associated with the use of binary AAV vectors, namely their relatively lower efficacy compared with single vectors and the generation of truncated proteins that may raise safety concerns.

[0439] To increase tail-to-head concatemerization of binary AAV genomes, which in theory should increase the level of full-length protein and reduce the level of truncated proteins from free single half-vectors, the inventors set out to improve tail-to-head concatemerization of binary AAV hybrid genomes by including either optimal regions of homology or heterologous ITRs. In a side-by-side evaluation of the regions of homology, the inventors found that binary AAV hybrid vectors containing the AP1 and AP2 sequences recently published by Lostal et al.,20 or the AK sequence from the F1 phage,14 drive similar levels of protein expression in vitro, while the AK vectors drive more consistent ABCA4 expression in the mouse retina. Independently of this, the availability of different regions of homology is useful for directing the proper concatemerization of triple AAV vectors to further expand AAV transfer capacity.20,42 Heterologous ITR2 and ITR5 have been successfully incorporated into binary AAV vectors,24,25 and into triple AAV vectors.42 The inventors found that AAV vectors with heterologous ITR2 and ITR5 had lower yields than those with homologous ITR2. The inventors also detected fewer genomes of vectors with heterologous ITRs when probing their ITR2 than when probing a different region of their genome. Since the inventors showed that Rep5 interferes with the production of vectors with ITR2, this indicates an abnormal level of ITR2 incorporation in AAV vectors with heterologous ITRs, which are produced in the presence of Rep5, but not in AAV vectors with homologous ITR2, which are produced in the presence of Rep2 only and show similar titers regardless of whether ITR2 or a different region of the genome is probed. These results differ in part from those previously reported, in which binary AAV vectors with heterologous ITR2 and ITR5 had higher transduction efficiency than vectors with homologous ITRs and no apparent production problems.24,25 In addition to the different packaging constructs and production protocols, in this study the inventors used binary AAV hybrid vectors, which include a region of homology between the two half-vectors, unlike the previously reported trans-splicing system, which relies solely on the ITRs for concatemerization.24,25 Because in binary AAV hybrid vectors reconstitution of the full-length gene is mainly mediated by the region of homology included in the vectors,16 which directs concatemer formation, this may have led the inventors to observe a smaller increase in transgene expression with heterologous-ITR vectors than was seen in previous studies using trans-splicing vectors.24,25 Furthermore, the inventors may have overestimated the potency of the vectors with heterologous ITRs, because these were dosed on the basis of their ITR2 titers, which were 3-6 times lower than the titers calculated on the transgene sequence for the MYO7A- and ABCA4-expressing vectors, respectively. Since the titers calculated on ITR2 and on the transgene sequence were similar for the corresponding binary AAV vectors with homologous ITR2, these were used at volumes 3-6 times lower than those of the vectors with heterologous ITR2 and ITR5. This could explain the significantly higher levels of both full-length and truncated protein products from binary AAV vectors with heterologous ITRs compared with those with homologous ITRs.
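The dosing argument above can be made concrete with a small calculation. The sketch uses hypothetical titers and a hypothetical target dose (none of these values are given in this section) and shows that, when a vector is dosed by an ITR2-based titer that is k-fold lower than its transgene-based titer, the injected volume carries k-fold more transgene-containing genomes than intended. The function names are illustrative.

```python
# Sketch: effect of dosing by an ITR2-based titer that underestimates the
# number of transgene-containing genomes. All numbers are hypothetical.


def volume_for_dose(target_gc, titer_gc_per_ul):
    """Volume (µl) needed to reach `target_gc` at the given titer (GC/µl)."""
    return target_gc / titer_gc_per_ul


def delivered_transgene_gc(target_gc, itr2_titer, transgene_titer):
    """Transgene genomes actually delivered when dosing by the ITR2 titer."""
    volume = volume_for_dose(target_gc, itr2_titer)
    return volume * transgene_titer


TARGET_GC = 1e9        # hypothetical intended dose per eye
TRANSGENE_TITER = 6e9  # hypothetical titer on the transgene sequence, GC/µl

for fold in (1, 3, 6):
    itr2_titer = TRANSGENE_TITER / fold
    delivered = delivered_transgene_gc(TARGET_GC, itr2_titer, TRANSGENE_TITER)
    print(f"ITR2 titer {fold}x lower: {delivered / TARGET_GC:.1f}x intended dose")
```

With a fold difference of 1 (the homologous-ITR2 case, where both titers agree) the delivered dose equals the intended dose; with 3 or 6 it is proportionally higher.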

[0440] In the inventors' previous studies, no signs of local toxicity were observed up to eight months after subretinal administration of binary AAV vectors.14 However, the generation of truncated protein products from the single half-vectors of binary AAV may raise safety concerns. Inclusion of miR target sites in the transcript of a gene has been shown to be an effective strategy for restricting transgene expression in various tissues, including the retina.30 However, the inventors achieved only a partial reduction in truncated protein production in vitro, and only when the target sites for miR-204+124 and miR-26a were included. Indeed, features of the mRNA other than the miR target sites may affect the efficacy of silencing.43,44 In this regard, because the truncated protein product from the 5' half is produced from a vector that lacks a canonical polyadenylation signal, the resulting mRNA may not undergo efficient miR-mediated silencing. Importantly, the inventors achieved complete degradation of the truncated protein product from the 5' half-vector by including the CL1 degradation determinant. The inventors showed that this signal is effective both in vitro and in the porcine retina, indicating that the enzymes of the degradation pathway required for CL1 activity are expressed in different cell types. Since the truncated protein product from the 3' half-vector is less abundant than that from the 5' half-vector (Figure 6), the safety concerns raised by its presence should be minor. The data from the mouse and porcine retina presented herein support the safety of the improved binary AAV vectors.

[0441] The inventors found that subretinal administration of the improved binary AAV vectors under the control of the GRK1 promoter, which provides high levels of combined rod and cone transduction, resulted in effective ABCA4 delivery in mice, albeit at variable levels. This may be attributed both to the inherent variability of subretinal injection in the mouse eye and to the overall lower efficacy of the binary AAV system compared with a single AAV vector.14 Regardless of this variability, the inventors found that binary AAV-mediated ABCA4 delivery resulted in a significant reduction of lipofuscin in the Abca4-/- retina, suggesting that a wide range of transgene expression levels can similarly contribute to therapeutic efficacy. This was observed using two independent techniques; however, a more pronounced phenotypic improvement was observed when the inventors dissected and analyzed the AAV-transduced region of the retina, which indeed showed normalized numbers of lipofuscin granules. In summary, the present invention provides multiple vectors with improved features suitable for clinical application, specifically for the treatment of retinal diseases. Furthermore, the present invention improves the safety and efficacy of multiple vectors with further expanded transfer capacity.20,42

[0442] References

[0443] 1. Trapani, I et al. (2014). Progress in retinal and eye research 43:108-128.

[0444] 2. Boye, SE, Boye, SL, Lewin, AS and Hauswirth, WW (2013). Molecular therapy: the journal of the American Society of Gene Therapy 21:509-519.

[0445] 3. Bainbridge, JW et al. (2008). The New England journal of medicine 358:2231-2239.

[0446] 4. Maguire, AM et al. (2009). Lancet 374:1597-1605.

[0447] 5. Maguire, AM et al. (2008). The New England journal of medicine 358:2240-2248.

[0448] 6. Cideciyan, AV et al. (2009). Human gene therapy 20:999-1004.

[0449] 7. Simonelli, F et al. (2010). Molecular therapy: the journal of the American Society of Gene Therapy 18:643-650.

[0450] 8. Allikmets, R et al. (1997). Nature genetics 15:236-246.

[0451] 9. Molday, RS and Zhang, K (2010). Progress in lipid research 49:476-492.

[0452] 10. Millan, JM et al. (2011). Journal of ophthalmology 2011:417217.

[0453] 11. Hasson, T, et al. (1995). PNAS 92:9815-9819.

[0454] 12. Liu, X, Ondek, B and Williams, DS (1998). Nature genetics 19:117-118.

[0455] 13. Gibbs, D et al. (2010). Investigative ophthalmology & visual science 51:1130-1135.

[0456] 14. Trapani, I, Colella, P, Sommella, A, Iodice, C, Cesi, G, de Simone, S, et al. (2014). Effective delivery of large genes to the retina by dual AAV vectors. EMBO Molecular Medicine 6:194-211.

[0457] 15. Duan, D, Yue, Y and Engelhardt, JF (2001). Molecular therapy: the journal of the American Society of Gene Therapy 4:383-391.

[0458] 16. Ghosh, A, Yue, Y, Lai, Y and Duan, D (2008). Molecular therapy: the journal of the American Society of Gene Therapy 16:124-130.

[0459] 17. Dyka, FM, et al., (2014). Human gene therapy methods 25:166 - 177.

[0460] 18. Lopes, VS, et al. (2013). Gene Ther.

[0461] 19. Colella, P, et al. (2014). Gene Ther 21:450 - 456.

[0462] 20. Lostal, W, Kodippili, K, Yue, Y and Duan, D (2014). Human gene therapy 25:552 - 562.

[0463] 21. Flotte, TR, et al. (1993). The Journal of biological chemistry 268:3781 - 3790.

[0464] 22. Ghosh, A, Yue, Y and Duan, D (2011). Human gene therapy 22:77 - 83.

[0465] 23. Chiorini, JA, et al., (1999). Journal of virology 73:1309 - 1319.

[0466] 24. Yan, Z, Zak, R, Zhang, Y and Engelhardt, JF (2005). Journal of virology 79:364 - 379.

[0467] 25. Yan, Z, et al. (2007). Human gene therapy 18:81 - 87.

[0468] 26. Karali, et al. (2010). BMC genomics 11:715.

[0469] 27. Kutty, RK, et al. (2010). Molecular vision 16:1475 - 1486.

[0470] 28. Ragusa, M, et al. (2013). Molecular vision 19:430 - 440.

[0471] 29. Sundermeier, T.R. and Palczewski, K. (2012). Cellular and Molecular Life Sciences: CMLS 69: 2739 - 2750.

[0472] 30. Karali, M. et al. (2011). PLoS ONE 6: e22166.

[0473] 31. Gilon, T., Chomsky, O. and Kulka, R.G. (1998). The EMBO Journal 17: 2759 - 2766.

[0474] 32. Bence, N.F., Sampat, R.M. and Kopito, R.R. (2001). Science 292: 1552 - 1555.

[0475] 33. Bachmair, A., Finley, D. and Varshavsky, A. (1986). Science 234: 179 - 186.

[0476] 34. Johnson, E.S. et al., (1992). The EMBO Journal 11: 497 - 505.

[0477] 35. Sadis, S. et al., (1995). Molecular and Cellular Biology 15: 4086 - 4094.

[0478] 36. Chiorini, J.A., Afione, S. and Kotin, R.M. (1999). Journal of Virology 73: 4293 - 4298.

[0479] 37. Tian, W. et al. (2012). PLoS ONE 7: e29551.

[0480] 38. Wang, Z. (2011). Methods in Molecular Biology 676: 211 - 223.

[0481] 39. Mussolino, C. et al. (2011). Gene Ther 18: 637 - 645.

[0482] 40. Hendrickson, A. and Hicks, D. (2002). Experimental Eye Research 74: 435 - 444.

[0483] 41. Reich, SJ et al. (2003). Human gene therapy 14:37 - 44.

[0484] 42. Koo, T et al., (2014). Human gene therapy 25:98 - 108.

[0485] 43. Walters, RW, Bradrick, SS and Gromeier, M (2010). Rna 16:239 - 250.

[0486] 44. Ricci, EP et al. (2011). Nucleic acids research 39:5215 - 5231.

[0487] 45. Auricchio et al. (2001). Human molecular genetics 10:3075 - 3081.

[0488] 46. Gao, G et al. (2000). Human gene therapy 11:2079 - 2091.

[0489] 47. Young, JE et al., (2003). Investigative ophthalmology & visual science 44:4076 - 4085.

[0490] 48. Doria, M et al., (2013). Human gene therapy methods 24:392 - 398.

[0491] 49. Zhang, Y et al., (2000). Journal of virology 74:8003 - 8010.

[0492] 50. Drittanti, L et al., (2000). Gene Ther 7:924 - 929.

[0493] 51. Gargiulo, S et al. (2012). ILAR journal / National Research Council, Institute of Laboratory Animal Resources 53:E70 - 81.

[0494] 52. Liang, FQ, et al., (2001). Methods in molecular medicine 47:125 - 139.

[0495] 53. Beltran, et al. (2012) Proc. Natl. Acad. Sci. U.S.A., 109, 2132 - 2137.

[0496] 54. Boye, SE, et al. (2012) Hum. Gene Ther., 23, 1101 - 1115.

[0497] 55. Khani, SC, et al., (2007) Invest. Ophthalmol. Vis. Sci., 48, 3954 - 3961.

[0498] 56. Chandler, MJ, et al., (1999) Vet. Ophthalmol., 2, 179 - 184.

[0499] 57. Li, A., Zhu, X. and Craft, CM. (2002) Invest. Ophthalmol. Vis. Sci., 43, 1375 - 1383.

[0500] 58. Allocca, M., et al. (2008) J. Clin. Invest., 118, 1955 - 1964.

[0501] 59. Parish, CA, et al., (1998) Proc. Natl. Acad. Sci. U.S.A., 95, 14609 - 14613.

[0502] 60. Ben - Shabat, S., et al., (2002) J. Biol. Chem., 277, 7183 - 7190.

[0503] 61. Gargiulo, S., et al., (2012) ILAR J, 53, E70 - 81.

[0504] 62. Liang, F.Q., et al., (2001) Methods Mol. Med., 47, 125 - 139.

[0505] 63. Gargiulo, A., et al. (2009) Mol. Ther., 17, 1347 - 1354.

[0506] 64. Manfredi, A. et al. (2013) Hum. Gene Ther., 24, 982-992.

[0507] 65. Venables, WN and Ripley, BD (2002) Modern Applied Statistics with S. Springer Science + Business Media, New York, USA.

[0508] 66. Li, A., Zhu, X., Brown, B. and Craft, CM (2003) Adv. Exp. Med. Biol., 533, 361-368.

[0509] 67. Li, A. et al. (2003) Invest. Ophthalmol. Vis. Sci., 44, 996-1007.

[0510] 68. Allocca, M. et al. (2011) Invest. Ophthalmol. Vis. Sci., 52, 5713-5719.

[0511] 69. Testa, F. et al. (2011) Invest. Ophthalmol. Vis. Sci., 52, 5618-5624.

[0512] 70. Molday, LL, Rabin, AR and Molday, RS (2000) Nat. Genet., 25, 257-258.

[0513] 71. Sparrow, JR, Wu, Y., Nagasaki, T., Yoon, KD, Yamamoto, K. and Zhou, J. (2010) Photochem Photobiol Sci, 9, 1480-1489.

[0514] 72. Sparrow, JR and Duncker, T. (2014) J Clin Med, 3, 1302-1321.

[0515] 73. Finnemann, SC, Leung, LW and Rodriguez-Boulan, E. (2002) Proc. Natl. Acad. Sci. USA, 99, 3842-3847.

[0516] 74. Secondi, R., Kong, J., Blonska, AM, Staurenghi, G. and Sparrow, JR (2012) Invest. Ophthalmol. Vis. Sci., 53, 5190-5197.

[0517] 75. Delori, FC, Dorey, CK, Staurenghi, G., Arend, O., Goger, DG and Weiter, JJ (1995) Invest. Ophthalmol. Vis. Sci., 36, 718-729. sequence list <110> Fondazione Telethon Foundation <120> Multi-carrier systems and their applications <130> PCT 129062 <150> US62 / 127,463 <151> 2015-03-03 <160> 78 <170> PatentIn version 3.5 <210> 1 <211> 16 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 1 Ala Cys Lys Asn Trp Phe Ser Ser Leu Ser His Phe Val Ile His Leu 1 5 10 15 <210> 2 <211> 35 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 2 Ser Leu Ile Ser Leu Pro Leu Pro Thr Arg Val Lys Phe Ser Ser Leu 1 5 10 15 Leu Leu Ile Arg Ile Met Lys Ile Ile Thr Met Thr Phe Pro Lys Lys 20 25 30 Leu Arg Ser 35 <210> 3 <211> 16 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 3 Phe Tyr Tyr Pro Ile Trp Phe Ala Arg Val Leu Leu Val His Tyr Gln 1 5 10 15 <210> 4 <211> 46 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 4 Ser Asn Pro Phe Ser Ser Leu Phe Gly Ala Ser Leu Leu Ile Asp Ser 1 5 10 15 Val Ser Leu Lys Ser Asn Trp Asp Thr Ser Ser Ser Ser Cys Leu Ile 20 25 30 Ser Phe Phe Ser Ser Val Met Phe Ser Ser Thr Thr Arg Ser 35 40 45 <210> 5 <211> 39 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 5 Cys Arg Gln Arg Phe Ser Cys His Leu Thr Ala Ser Tyr Pro Gln Ser 1 5 10 15 Thr Val Thr Pro Phe Leu Ala Phe Leu Arg Arg Asp Phe Phe Phe Leu 20 25 30 Arg His Asn Ser Ser Ala Asp 35 <210> 6 <211> 46 <212> PRT <213> artificial sequence <220> <223> synthesis <400> 6 Gly Ala Pro His Val Val Leu Phe Asp Phe Glu Leu Arg Ile Thr Asn 1 5 10 15 Pro Leu Ser His Ile Gln Ser Val Ser Leu Gln Ile Thr Leu Ile Phe 20 25 30 Cys Ser Leu Pro Ser Leu Ile Leu Ser Lys Phe Leu Gln Val 35 40 45 <210> 7 <211> 39 <212> PRT <213> artificial sequence <220> <223> synthesis <400> 7 Asn Thr Pro Leu Phe Ser Lys Ser Phe Ser Thr Thr Cys Gly Val Ala 1 5 10 15 Lys Lys Thr Leu Leu Leu Ala Gln Ile Ser Ser Leu Phe Phe Leu Leu 20 25 30 Leu Ser Ser Asn Ile Ala Val 
35 <210> 8 <211> 45 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 8 Pro Thr Val Lys Asn Ser Pro Lys Ile Phe Cys Leu Ser Ser Ser Pro 1 5 10 15 Tyr Leu Ala Phe Asn Leu Glu Tyr Leu Ser Leu Arg Ile Phe Ser Thr 20 25 30 Leu Ser Lys Cys Ser Asn Thr Leu Leu Thr Ser Leu Ser 35 40 45 <210> 9 <211> 30 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 9 Ser Asn Gln Leu Lys Arg Leu Trp Leu Trp Leu Leu Glu Val Arg Ser 1 5 10 15 Phe Asp Arg Thr Leu Arg Arg Pro Trp Ile His Leu Pro Ser 20 25 30 <210> 10 <211> 50 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 10 Ser Ile Ser Phe Val Ile Arg Ser His Ala Ser Ile Arg Met Gly Ala 1 5 10 15 Ser Asn Asp Phe Phe His Lys Leu Tyr Phe Thr Lys Cys Leu Thr Ser 20 25 30 Val Ile Leu Ser Lys Phe Leu Ile His Leu Leu Leu Arg Ser Thr Pro 35 40 45 Arg Val 50 <210> 11 <211> twenty two <212> DNA <213> Artificial sequence <220> <223> synthesis <400> 11 aggcatagga tgacaaaggg aa 22 <210> 12 <211> 20 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> 12 ggcattcacc gcgtgcctta 20 <210> 13 <211> twenty two <212> DNA <213> Artificial sequence <220> <223> synthesis <400> 13 agcctatcct ggattacttg aa 22 <210> 14 <211> 9 <212> PRT <213> Artificial sequence <220> <223> synthesis <400> 14 Ser Trp Asn Phe Lys Leu Tyr Val Met 1 5 <210> 15 <211> 14 <212> PRT <213> Artificial sequence <220> <223> Synthetic <400> 15 Met His Ser Trp Asn Phe Lys Leu Tyr Val Met Gly Ser Gly 1 5 10 <210> 16 <211> 48 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 16 gcctgcaaga actggttcag cagcctgagc cacttcgtga tccacctg 48 <210> 17 <211> 158 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 17 aggcatagga tgacaaaggg aacgataggc ataggatgac aaagggaaaa gcttaggcat 60 aggatgacaa agggaaggta ccagatctgg cattcaccgc gtgccttacg atggcattca 120 ccgcgtgcct taaagcttgg cattcaccgc gtgcctta 158 <210> 18 <211> 102 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> 18 agcctatcct ggattacttg aacgatagcc tatcctggat 
tacttgaaaa gcttagccta 60 tcctggatta cttgaatcac agcctatcct ggattacttg aa 102 <210> 19 <211> 42 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> 19 atgcacagct ggaacttcaa gctgtacgtc atgggcagcg gc 42 <210> 20 <211> 27 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> 20 agctggaact tcaagctgta cgtcatg 27 <210> twenty one <211> 136 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> twenty one atgcacagct ggaacttcaa gctgtacgtc atgggcagcg gcggggtacc atgcacagct 60 ggaacttcaa gctgtacgtc atgggcagcg gcggatgcac agctggaact tcaagctgta 120 cgtcatgggc agcggc 136 <210> twenty two <211> 77 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> twenty two gggatttttc cgatttcggc ctattggtta aaaaatgagc tgatttaaca aaaatttaac 60 gcgaatttta acaaaat 77 <210> twenty three <211> 77 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> twenty three gggattttgc cgatttcggc ctattggtta aaaaatgagc tgatttaaca aaaatttaac 60 gcgaatttta acaaaat 77 <210> twenty four <211> 287 <212> DNA <213> Artificial sequence <220> <223> synthesis <400> twenty four ccccgggtgc gcggcgtcgg tggtgccggc gggggcgcc aggtcgcagg cggtgtaggg 60 ctccaggcag gcggcgaagg ccatgacgtg cgctatgaag gtctgctcct gcacgccgtg 120 aaccaggtgc gcctgcgggc cgcgcgcgaa caccgccacg tcctcgcctg cgtgggtctc 180 ttcgtccagg ggcactgctg actgctgccg atactcgggg ctcccgctct cgctctcggt 240 aacatccggc cgggcgccgt ccttgagcac atagcctgga ccgtttc 287 <210> 25 <211> 288 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 25 cgcagggcag cctctgtcat ctccatcagg gaggggtcca gtgtggagtc tcggtggatc 60 tcgtatttca tgtctccagg ctcaaagaga cccatgagat gggtcacaga cgggtccagg 120 gaagcctgca tgagctcagt gcggttccac acataccggg caccctggcg cttcgccagc 180 cattcctgca ccagattctt cccgtccagc ctggtcccac cttggctgta gtcatctggg 240 tactcagggt ctggggttcc catgcgaaac atgtactttc ggcctcca 288 <210> 26 <211> 278 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 26 gtgatcctag gtggaggccg aaagtacatg tttcgcatgg gaaccccaga ccctgagtac 60 
ccagatgact acagccaagg tgggaccagg ctggacggga agaatctggt gcaggaatgg 120 ctggcgaagc gccagggtgc ccggtacgtg tggaaccgca ctgagctcat gcaggcttcc 180 ctggacccgt ctgtgaccca tctcatgggt ctctttgagc ctggagacat gaaatacgag 240 atccaccgag actccacact ggacccctcc ctgatgga 278 <210> 27 <211> 82 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 27 gtaagtatca aggttacaag acaggtttaa ggagaccaat agaaactggg cttgtcgaga 60 cagagaagac tcttgcgttt ct 82 <210> 28 <211> 51 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 28 gataggcacc tattggtctt actgacatcc actttgcctt tctctccaca g 51 <210> 29 <211> 130 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 29 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct 130 <210> 30 <211> 130 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 30 aggaacccct agtgatggag ttggccactc cctctctgcg cgctcgctcg ctcactgagg 60 ccgggcgacc aaaggtcgcc cgacgcccgg gctttgcccg ggcggcctca gtgagcgagc 120 gagcgcgcag 130 <210> 31 <211> 175 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 31 ctctcccccc tgtcgcgttc gctcgctcgc tggctcgttt gggggggtgg cagctcaaag 60 agctgccaga cgacggccct ctggccgtcg cccccccaaa cgagccagcg agcgagcgaa 120 cgcgacaggg gggagagtgc cacactctca agcaaggggg ttttgtaagc agtga 175 <210> 32 <211> 175 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 32 tcactgctta caaaaccccc ttgcttgaga gtgtggcact ctcccccctg tcgcgttcgc 60 tcgctcgctg gctcgtttgg gggggcgacg gccagagggc cgtcgtctgg cagctctttg 120 agctgccacc cccccaaacg agccagcgag cgagcgaacg cgacaggggg gagag 175 <210> 33 <211> 153 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 33 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gac 153 <210> 34 <211> 583 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 34 
tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg 60 cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt 120 gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 180 atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc 240 aagtccgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta 300 catgacctta cgggactttc ctacttggca gtacatctac gtattagtca tcgctattac 360 catggtgatg cggttttggc agtacaccaa tgggcgtgga tagcggtttg actcacgggg 420 atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg 480 ggactttcca aaatgtcgta ataaccccgc cccgttgacg caaatgggcg gtaggcgtgt 540 acggtgggag gtctatataa gcagagctcg tttagtgaac cgt 583 <210> 35 <211> 133 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 35 gtaagtatca aggttacaag acaggtttaa ggagaccaat agaaactggg cttgtcgaga 60 cagagaagac tcttgcgttt ctgataggca cctattggtc ttactgacat ccactttgcc 120 tttctctcca cag 133 <210> 36 <211> 299 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 36 ctagtgggcc ccagaagcct ggtggttgtt tgtccttctc aggggaaaag tgaggcggcc 60 ccttggagga aggggccggg cagaatgatc taatcggatt ccaagcagct caggggattg 120 tctttttcta gcaccttctt gccactccta agcgtcctcc gtgaccccgg ctgggattta 180 gcctggtgct gtgtcagccc cgggctccca ggggcttccc agtggtcccc aggaaccctc 240 gacagggcca gggcgtctct ctcgtccagc aagggcaggg acgggccaca ggcaagggc 299 <210> 37 <211> 365 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 37 ctagttatta atagtaatca attacggggt cattagttca tagcccatat atggagttcc 60 gcgttacata acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat 120 tgacgtcaat aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc 180 aatgggtgga gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc 240 caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt 300 acatgacctt
atgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta 360 ccatg 365 <210> 38 <211> 229 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 38 tcgaggtgag ccccacgttc tgcttcactc tccccatctc ccccccctcc ccacccccaa 60 ttttgtattt atttattttt taattatttt gtgcagcgat gggggcgggg cggggcgagg 120 cggagaggtg cggcggcagc caatcggagc ggcgcgctcc gaaagtttcc ttttatggcg 180 aggcggcggc ggcggcggct ctataaaaag cgaagcgcgc ggcgggcgg 229 <210> 39 <211> 235 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 39 agcacagtgt ctggcatgta gcaggaacta aaataatggc agtgattaat gttatgatat 60 gcagacacaa cacagcaaga taagatgcaa tgtaccttct gggtcaaacc accctggcca 120 ctcctccccg atacccaggg ttgatgtgct tgaattagac aggattaaag gcttactgga 180 gctggaagcc ttgccccaac tcaggagttt agccccagac cttctgtcca ccagc 235 <210> 40 <211> 22 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 40 aaccacacaa cctactacct ca 22 <210> 41 <211> 102 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 41 aaccacacaa cctactacct cacgataacc acacaaccta ctacctcaaa gcttaaccac 60 acaacctact acctcatcac aaccacacaa cctactacct ca 102 <210> 42 <211> 105 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 42 agcctgatca gcctgcccct gcccacccgg gtgaagttca gcagcctgct gctgatccgg 60 atcatgaaga tcatcaccat gaccttcccc aagaagctgc ggagc 105 <210> 43 <211> 48 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 43 ttctactacc ccatctggtt cgcccgggtg ctgctggtgc actaccag 48 <210> 44 <211> 138 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 44 agcaacccct tcagcagcct gttcggcgcc agcctgctga tcgacagcgt gagcctgaag 60 agcaactggg acaccagcag cagcagctgc ctgatcagct tcttcagcag cgtgatgttc 120 agcagcacca cccggagc 138 <210> 45 <211> 117 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 45 tgccggcagc ggttcagctg ccacctgacc gccagctacc cccagagcac cgtgaccccc 60 ttcctggcct tcctgcggcg ggacttcttc ttcctgcggc acaacagcag cgccgac 117 <210> 46 <211> 138 <212> DNA <213> Artificial Sequence <220> <223> Synthetic
<400> 46 ggcgcccccc acgtggtgct gttcgacttc gagctgcgga tcaccaaccc cctgagccac 60 <210> 47 <211> 117 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 47 aacaccccccc tgttcagcaa gagcttcagc accacctgcg gcgtggccaa gaagaccctg 60 ctgctggccc agatcagcag cctgttcttc ctgctgctga gcagcaacat cgccgtg 117 <210> 48 <211> 135 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 48 cccaccgtga agaacagccc caagatcttc tgcctgagca gcagccccta cctggccttc 60 aacctggagt acctgagcct gcggatcttc agcaccctga gcaagtgcag caacaccctg 120 ctgaccagcc tgagc 135 <210> 49 <211> 90 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 49 agcaaccagc tgaagcggct gtggctgtgg ctgctggagg tgcggagctt cgaccggacc 60 ctgcggcggc cctggatcca cctgcccagc 90 <210> 50 <211> 150 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 50 agcatcagct tcgtgatccg gagccacgcc agcatccgga tgggcgccag caacgacttc 60 ttccacaagc tgtacttcac caagtgcctg accagcgtga tcctgagcaa gttcctgatc 120 cacctgctgc tgcggagcac cccccgggtg 150 <210> 51 <211> 11 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 51 tgaatgaatg a 11 <210> 52 <211> 243 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 52 ttcgagcaga catgataaga tacattgatg agtttggaca aaccacaact agaatgcagt 60 gaaaaaaatg ctttatttgt gaaatttgtg atgctattgc tttatttgta accattataa 120 gctgcaataa acaagttaac aacaacaatt gcattcattt tatgtttcag gttcaggggg 180 agatgtggga ggttttttaa agcaagtaaa acctctacaa atgtggtaaa atcgataagg 240 atc 243 <210> 53 <211> 2918 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 53 atgggcttcg tgagacagat acagcttttg ctctggaga actggaccct gcggaaaagg caaaagattc gctttgtggt ggaactcgtg tggcctttat ctttatttct ggtcttgatc 120 tggttaagga atgccaaccc gctctacagc catcatgaat gccatttccc caacaaggcg atgccctcag caggaatgct gccgtggctc caggggatct tctgcaatgt gaacaatccc tgttttcaaa gccccacccc aggagaatct cctggaattg tgtcaaacta taacaactcc atcttggcaa gggtatatcg agttttcaa gaactcctca tgaatgcacc agagagccag caccttggcc gtatttggac agagctacac atcttgtccc aattcatgga caccctccgg 420 actcacccgg
agagaattgc aggagagga attcgaata gggatatctt gaagatga gaaacactga cactatttct cattaaaaac atcggcctgt ctgactcagt ggtctacctt ctgatcaact ctcaagtccg tccagagcag ttcgctcatg gagtcccgga cctggcgctg 600 aaggacatcg cctgcagcga ggccctcctg gagcgcttca tcatcttcag ccagagacgc 660 ggggcaaaga cggtgcgcta tgccctgtgc tccctctccc agggcaccct acagtggata 720 gaagacactc tgtatgccaa cgtggacttc ttcaagctct tccgtgtgct tcccacactc 780 ctagacagcc gttctcaagg tatcaatctg agatcttggg gaggaatatt atctgatatg 840 tcaccaagaa ttcaagagtt tatccatcgg ccgagtatgc aggacttgct gtgggtgacc 900 aggcccctca tgcagaatgg tggtccagag acctttacaa agctgatggg catcctgtct 960 gacctcctgt gtggctaccc cgagggaggt ggctctcggg tgctctcctt caactggtat 1020 gaacaata actataaggc ctttctgggg attgactcca caaggaagga tcctatctat 1080 tcttatgaca gaagaacaac atccttttgt aatgcattga tccagagcct ggagtcaaat 1140 cctttaacca aaatcgcttg gagggcggca aagcctttgc tgatgggaaa aatcctgtac 1200 1260 ctggaacacg ttaggaagtt ggtcaaagcc tgggaagaag tagggcccca gatctggtac 1320 ttctttgaca acagcacaca gatgaacatg atcagagata ccctggggaa cccaacagta 1380 aaagactttt tgaataggca gcttggtgaa gaaggtatta ctgctgaagc catcctaaac 1440 ttcctctaca agggccctcg ggaaagccag gctgacgaca tggccaactt cgactggagg 1500 gacatattta acatcactga tcgcaccctc cgccttgtca atcaatacct ggagtgcttg 1560 gtcctggata agtttgaaag ctacaatgat gaaactcagc tcacccaacg tgccctctct 1620 ctactggagg aaaacatgtt ctgggccgga gtggtattcc ctgacatgta tccctggacc 1680 agctctctac caccccacgt gaagtataag atccgaatgg acatagacgt ggtggagaaa 1740 accaataaga ttaaagacag gtattgggat tctggtccca gagctgatcc cgtggaagat 1800 ttccggtaca tctggggcgg gtttgcctat ctgcaggaca tggttgaaca ggggatcaca 1860 aggagccagg tgcaggcgga ggctccagtt ggaatctacc tccagcagat gccctacccc 1920 tgcttcgtgg acgattcttt catgatcatc ctgaaccgct gtttccctat cttcatggtg 1980 ctggcatgga tctactctgt ctccatgact gtgaagagca tcgtcttgga gaaggagttg 2040 cgactgaagg agaccttgaa aaatcagggt gtctccaatg cagtgatttg gtgtacctgg 2100 ttcctggaca gcttctccat catgtcgatg agcatcttcc tcctgacgat attcatcatg 2160 catggaagaa tcctacatta cagcgaccca 
ttcatcctct tcctgttctt gttggctttc 2220 tccactgcca ccatcatgct gtgctttctg ctcagcacct tcttctccaa ggccagtctg 2280 gcagcagcct gtagtggtgt catctatttc accctctacc tgccacacat cctgtgcttc 2340 gcctggcagg accgcatgac cgctgagctg aagaaggctg tgagcttact gtctccggtg 2400 gcatttggat ttggcactga gtacctggtt cgctttgaag agcaaggcct ggggctgcag 2460 tggagcaaca tcgggaacag tcccacggaa ggggacgaat tcagcttcct gctgtccatg 2520 cagatgatgc tccttgatgc tgctgtctat ggcttactcg cttggtacct tgatcaggtg 2580 tttccaggag actatggaac cccacttcct tggtactttc ttctacaaga gtcgtattgg 2640 cttggcggtg aagggtgttc aaccagagaa gaaagagccc tggaaaagac cgagccccta 2700 acagaggaaa cggaggatcc agagcaccca gaaggaatac acgactcctt ctttgaacgt 2760 gagcatccag ggtgggttcc tggggtatgc gtgaagaatc tggtaaagat ttttgagccc 2820 tgtggccggc cagctgtgga ccgtctgaac atcaccttct acgagaacca gatcaccgca 2880 ttcctgggcc acaatggagc tgggaaaacc accacctt 2918 <210> 54 <211> 3945 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 54 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca taatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagtggg ccccagaagc 360 ctggtggttg tttgtccttc tcaggggaaa agtgaggcgg ccccttggag gaaggggccg 420 ggcagaatga tctaatcgga ttccaagcag ctcaggggat tgtctttttc tagcaccttc 480 ttgccactcc taagcgtcct ccgtgacccc ggctgggatt tagcctggtg ctgtgtcagc 540 cccgggctcc caggggcttc ccagtggtcc ccaggaaccc tcgacagggc cagggcgtct 600 ctctcgtcca gcaagggcag ggacgggcca caggcaaggg cgcggccgcc atgggcttcg 660 tgagacagat acagcttttg ctctggaaga actggaccct gcggaaaagg caaaagattc 720 gctttgtggt ggaactcgtg tggcctttat ctttatttct ggtcttgatc tggttaagga 780 atgccaaccc gctctacagc catcatgaat gccatttccc caacaaggcg atgccctcag 840 caggaatgct gccgtggctc caggggatct tctgcaatgt gaacaatccc tgttttcaaa 
900 gccccacccc aggagaatct cctggaattg tgtcaaacta taacaactcc atcttggcaa 960 gggtatatcg agattttcaa gaactcctca tgaatgcacc agagagccag caccttggcc 1020 gtatttggac agagctacac atcttgtccc aattcatgga caccctccgg actcacccgg 1080 agagaattgc aggaagagga attcgaataa gggatatctt gaaagatgaa gaaacactga 1140 cactatttct cattaaaaac atcggcctgt ctgactcagt ggtctacctt ctgatcaact 1200 ctcaagtccg tccagagcag ttcgctcatg gagtcccgga cctggcgctg areacatcg 1260 cctgcagcga ggccctcctg gagcgcttca tcatcttcag ccagagacgc ggggcaaaga 1320 cggtgcgcta tgccctgtgc tccctctccc agggcaccct acagtggata gaagacactc 1380 tgtatgccaa cgtggacttc ttcaagctct tccgtgtgct tcccacactc ctagacagcc 1440 gttctcaagg tatcaatctg agatcttggg gaggaatatt atctgatatg tcaccaagaa 1500 ttcaagagtt tatccatcgg ccgagtatgc aggacttgct gtgggtgacc aggcccctca 1560 tgcagaatgg tggtccagag acctttacaa agctgatggg catcctgtct gacctcctgt 1620 gtggctaccc cgagggaggt ggctctcggg tgctctcctt caactggtat gaagacaata 1680 actataaggc ctttctgggg attgactcca caaggaagga tcctatctat tcttatgaca 1740 gaaacaac atccttttgt aatgcattga tccagagcct ggagtcaaat cctttaacca 1800 aaatcgcttg gagggcggca aagcctttgc tgatgggaaa aatcctgtac actcctgatt 1860 cacctgcagc acgaaggata ctgaagaatg ccaactcaac ttttgaagaa ctggaacagg 1920 ttaggaagtt ggtcaaagcc tgggaagaag tagggcccca gatctggtac ttctttgaca 1980 acagcacaca gatgaacatg atcagagata ccctggggaa cccaacagta aaagactttt 2040 tgaataggca gcttggtgaa gaaggtatta ctgctgaagc catcctaaac ttcctctaca 2100 agggccctcg ggaaagccag gctgacgaca tggccaactt cgactggagg gacatattta 2160 acatcactga tcgcaccctc cgccttgtca atcaatacct ggagtgcttg gtcctggata 2220 agtttgaaag ctacaatgat gaaactcagc tcacccaacg tgccctctct ctactggagg 2280 aaaacatgtt ctgggccgga gtggtattcc ctgacatgta tccctggacc agctctctac 2340 caccccacgt gaagtataag atccgaatgg acatagacgt ggtggagaaa accaataaga 2400 ttaaagacag gtattgggat tctggtccca gagctgatcc cgtggaagat ttccggtaca 2460 tctggggcgg gtttgcctat ctgcaggaca tggttgaaca ggggatcaca aggagccagg 2520 tgcaggcgga ggctccagtt ggaatctacc tccagcagat gccctacccc tgcttcgtgg 2580 acgattcttt 
catgatcatc ctgaaccgct gtttccctat cttcatggtg ctggcatgga 2640 tctactctgt ctccatgact gtgaagagca tcgtcttgga gaaggagttg cgactgaagg 2700 agaccttgaa aaatcagggt gtctccaatg cagtgatttg gtgtacctgg ttcctggaca 2760 gcttctccat catgtcgatg agcatcttcc tcctgacgat attcatcatg catggaagaa 2820 tcctacatta cagcgaccca ttcatcctct tcctgttctt gttggctttc tccactgcca 2880 ccatcatgct gtgctttctg ctcagcacct tcttctccaa ggccagtctg gcagcagcct 2940 gtagtggtgt catctatttc accctctacc tgccacacat cctgtgcttc gcctggcagg 3000 accgcatgac cgctgagctg aagaaggctg tgagcttact gtctccggtg gcatttggat 3060 ttggcactga gtacctggtt cgctttgaag agcaaggcct ggggctgcag tggagcaaca 3120 tcgggaacag tcccacggaa ggggacgaat tcagcttcct gctgtccatg cagatgatgc 3180 tccttgatgc tgctgtctat ggcttactcg cttggtacct tgatcaggtg tttccaggag 3240 actatggaac cccacttcct tggtactttc ttctacaaga gtcgtattgg cttggcggtg 3300 aagggtgtc aaccagagaa gaaagagccc tggaaaagac cgagcccta acagaggaaa 3360 cggaggatcc agagcaccca gaaggaatac acgactcctt ctttgaacgt gagcatccag 3420 ggtgggttcc tggggtatgc gtgaagaatc tggtaaagat ttttgagccc tgtggccggc 3480 cagctgtgga ccgtctgaac atcaccttct acgagaacca gatcaccgca ttcctgggcc 3540 acaatggagc tgggaaaacc accaccttgt aagtatcaag gttacaagac aggtttaagg 3600 agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct gggatttttc 3660 cgatttcggc ctattggtta aaaaatgagc tgatttaaca aaaatttaac gcgaatttta 3720 acaaaatatt aacgtttata atttcaggtg gcatctttcc cgcctgcaag aactggttca 3780 gcagcctgag ccacttcgtg atccacctgc aattgaggaa cccctagtga tggagttggc 3840 cactccctct ctgcgcgctc gctcgctcac tgaggccggg cgaccaaagg tcgcccgacg 3900 cccgggcttt gcccgggcgg cctcagtgag cgagcgagcg cgcag 3945 <210> 55 <211> 3904 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 55 gtccatcctg acgggtctgt tgccaccaac ctctgggact gtgctcgttg ggggaggga cattgaacc agcctggatg cagtccggca gagccttggc atgtgtccac agcacaacat cctgttccac cacctcacgg tggctgagca catgctgttc tatgcccagc tgaaggaa gtcccaggag gaggcccagc tggagatgga agccatgttg gaggacacag gcctccacca 240 caagcggat gaagaggctc aggacctatc
aggtggcatg cagagaagc tgtcggttgc cattgccttt gtgggagatg ccaaggtggt gattctggac gaacccacct ctggggtgga 360 cccttactcg agacgctcaa tctgggatct gctcctgag tatcgctcag gcagaaccat 420 catcatgtcc actcaccaca tggacgaggc cgacctcctt ggggaccgca ttgccatcat 480 540. tgcccaggga aggctctact gctcaggcac cccactcttc ctgaagaact gctttggcac aggcttgtac ttaaccttgg tgcgcaagat gaaaaacatc cagagccaaa ggaaaggcag tgaggggacc tgcagctgct cgtctaaggg tttctccacc acgtgtccag cccacgtcga 660 tgacctaact ccagaacaag tcctggatgg ggatgtaaat gagctgatgg atgtagttct ccaccatgtt ccagaggcaa agctggtgga gtgcattggt caagaactta tcttcttctct 780 tccaataag aacttcaagc acagagcata tgccagcctt ttcagagagc tggagagac 840 gctggctgac cttggtctca gcagttttgg aatttctgac actcccctgg aagagatttt 900 tctgaaggtc acggaggatt ctgattcagg acctctgttt gcgggtggcg ctcagcagaa 960 aagaaaac gtcaaccccc gacacccctg cttgggtccc agagagaagg ctggacagac 1020 accccaggac tccaatgtct gctccccagg ggcgccggct gctcacccag agggccagcc 1080 tcccccagag ccagagtgcc caggcccgca gctcaacacg gggacacagc tggtcctcca 1140 gcatgtgcag gcgctgctgg tcaagagatt ccaacacacc atccgcagcc acaaggactt 1200 cctggcgcag atcgtgctcc cggctacctt tgtgtttttg gctctgatgc tttctattgt 1260 tatccctcct tttggcgaat accccgcttt gacccttcac ccctggatat atgggcagca 1320 gtacaccttc ttcagcatgg atgaaccagg cagtgagcag ttcacggtac ttgcagacgt 1380 cctcctgaat aagccaggct ttggcaaccg ctgcctgaag gaagggtggc ttccggagta 1440 cccctgtggc aactcaacac cctggaagac tccttctgtg tccccaaaca tcacccagct 1500 gttccagaag cagaaatgga cacaggtcaa cccttcacca tcctgcaggt gcagcaccag 1560 ggagaagctc accatgctgc cagagtgccc cgagggtgcc gggggcctcc cgccccccca 1620 gagaacacag cgcagcacgg aaattctaca agacctgacg gacaggaaca tctccgactt 1680 cttggtaaaa acgtatcctg ctcttataag aagcagctta aagagcaaat tctgggtcaa 1740 tgaacagagg tatggaggaa tttccattgg aggaaagctc ccagtcgtcc ccatcacggg 1800 ggaagcactt gttgggtttt taagcgacct tggccggatc atgaatgtga gcgggggccc 1860 tatcactaga gaggcctcta aagaaatacc tgatttcctt aaacatctag aaactgaaga 1920 caacattaag gtgtggttta ataacaaagg ctggcatgcc ctggtcagct ttctcaatgt 1980 
ggcccacaac gccatcttac gggccagcct gcctaaggac agaagccccg aggagtatgg 2040 aatcaccgtc attagccaac ccctgaacct gaccaaggag cagctctcag agattacagt 2100 gctgaccact tcagtggatg ctgtggttgc catctgcgtg attttctcca tgtccttcgt 2160 cccagccagc tttgtccttt atttgatcca ggagcgggtg aacaaatcca agcacctcca 2220 gtttatcagt ggagtgagcc ccaccaccta ctgggtaacc aacttcctct gggacatcat 2280 gaattattcc gtgagtgctg ggctggtggt gggcatcttc atcgggtttc agaagaaagc 2340 ctacacttct ccagaaaacc ttcctgccct tgtggcactg ctcctgctgt atggatgggc 2400 ggtcattccc atgatgtacc cagcatcctt cctgtttgat gtccccagca cagcctatgt 2460 ggctttatct tgtgctaatc tgttcatcgg catcaacagc agtgctatta ccttcatctt 2520 ggaattattt gagaataacc ggacgctgct caggttcaac gccgtgctga ggaagctgct 2580 cattgtcttc ccccacttct gcctgggccg gggcctcatt gaccttgcac tgagccaggc 2640 tgtgacagat gtctatgccc ggtttggtga ggagcactct gcaaatccgt tccactggga 2700 cctgattggg aagaacctgt ttgccatggt ggtggaaggg gtggtgtact tcctcctgac 2760 cctgctggtc cagcgccact tcttcctctc ccaatggatt gccgagccca ctaaggagcc 2820 cattgttgat gaagatgatg atgtggctga agaaagacaa agaattatta ctggtggaaa 2880 taaaactgac atcttaaggc tacatgaact aaccaagatt tatccaggca cctccagccc 2940 agcagtggac aggctgtgtg tcggagttcg ccctggagag tgctttggcc tcctgggagt 3000 gaatggtgcc ggcaaaacaa ccacattcaa gatgctcact ggggacacca cagtgacctc 3060 aggggatgcc accgtagcag gcaagagtat tttaaccaat atttctgaag tccatcaaaa 3120 tatgggctac tgtcctcagt ttgatgcaat cgatgagctg ctcacaggac gagaacatct 3180 ttacctttat gcccggcttc gaggtgtacc agcagaagaa atcgaaaagg ttgcaaactg 3240 gagtattaag agcctgggcc tgactgtcta cgccgactgc ctggctggca cgtacagtgg 3300 gggcaacaag cggaaactct ccacagccat cgcactcatt ggctgcccac cgctggtgct 3360 gctggatgag cccaccacag ggatggaccc ccaggcacgc cgcatgctgt ggaacgtcat 3420 cgtgagcatc atcagagaag ggagggctgt ggtcctcaca tcccacagca tggaagaatg 3480 tgaggcactg tgtacccggc tggccatcat ggtaaagggc gcctttcgat gtatgggcac 3540 cattcagcat ctcaagtcca aatttggaga tggctatatc gtcacaatga agatcaaatc 3600 cccgaaggac gacctgcttc ctgacctgaa ccctgtggag cagttcttcc aggggaactt 3660 cccaggcagt 
gtgcagaggg agaggcacta caacatgctc cagttccagg tctcctcctc 3720 ctccctggcg aggatcttcc agctcctcct ctcccacaag gacagcctgc tcatcgagga 3780 gtactcagtc acacagacca cactggacca ggtgtttgta aattttgcta aacagcagac 3840 tgaaagtcat gacctccctc tgcaccctcg agctgctgga gccagtcgac aagcccagga 3900 ctga 3904 <210> 56 <211> 4636 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 56 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccggga tttttccgat ttcggcctat tggttaaaaa atgagctgat 180 ttaacaaaaa tttaacgcga attttaacaa aatattaacg tttataattt caggtggcat 240 ctttcgatag gcacctattg gtcttactga catccacttt gcctttctct ccacaggtcc 300 atcctgacgg gtctgttgcc accaacctct gggactgtgc tcgttggggg aagggacatt 360 gaaaccagcc tggatgcagt ccggcagagc cttggcatgt gtccacagca caacatcctg 420 ttccaccacc tcacggtggc tgagcacatg ctgttctatg cccagctgaa aggaaagtcc 480 caggaggagg cccagctgga gatggaagcc atgttggagg acacaggcct ccaccacaag 540 cggaatgaag aggctcagga cctatcaggt ggcatgcaga gaaagctgtc ggttgccatt 600 gcctttgtgg gagatgccaa ggtggtgatt ctggacgaac ccacctctgg ggtggaccct 660 tactcgagac gctcaatctg ggatctgctc ctgaagtatc gctcaggcag aaccatcatc 720 atgtccactc accacatgga cgaggccgac ctccttgggg accgcattgc catcattgcc 780 cagggaaggc tctactgctc aggcacccca ctcttcctga agaactgctt tggcacaggc 840 ttgtacttaa ccttggtgcg caagatgaaa aacatccaga gccaaaggaa aggcagtgag 900 gggacctgca gctgctcgtc taagggtttc tccaccacgt gtccagccca cgtcgatgac 960 ctaactccag aacaagtcct ggatggggat gtaaatgagc tgatggatgt agttctccac 1020 catgttccag aggcaaagct ggtggagtgc attggtcaag aacttatctt ccttcttcca 1080 aataagaact tcaagcacag agcatatgcc agccttttca gagagctgga ggagacgctg 1140 gctgaccttg gtctcagcag ttttggaatt tctgacactc ccctggaaga gatttttctg 1200 aaggtcacgg aggattctga ttcaggacct ctgtttgcgg gtggcgctca gcagaaaaga 1260 gaaaacgtca acccccgaca cccctgcttg ggtcccagag agaaggctgg acagacaccc 1320 caggactcca atgtctgctc cccaggggcg ccggctgctc acccagaggg ccagcctccc 1380 ccagagccag 
agtgcccagg cccgcagctc aacacgggga cacagctggt cctccagcat 1440 gtgcaggcgc tgctggtcaa gagattccaa cacaccatcc gcagccacaa ggacttcctg 1500 gcgcagatcg tgctcccggc tacctttgtg tttttggctc tgatgctttc tattgttatc 1560 cctccttttg gcgaataccc cgctttgacc cttcacccct ggatatatgg gcagcagtac 1620 accttcttca gcatggatga accaggcagt gagcagttca cggtacttgc agacgtcctc 1680 ctgaataagc caggctttgg caaccgctgc ctgaaggaag ggtggcttcc ggagtacccc 1740 tgtggcaact caacaccctg gaagactcct tctgtgtccc caaacatcac ccagctgttc 1800 cagaagcaga aatggacaca ggtcaaccct tcaccatcct gcaggtgcag caccagggag 1860 1920 acacagcgca gcacggaaat tctacaagac ctgacggaca ggaacatctc cgacttcttg 1980 gtaaaaacgt atcctgctct tataagaagc agctaaaga gcaaattctg ggtcaatgaa 2040 cagaggtatg gaggaatttc cattggagga aagctcccag tcgtccccat cacgggggaa 2100 gcacttgttg ggtttttaag cgaccttggc cggatcatga atgtgagcgg gggccctatc 2160 2220 attaaggtgt ggtttaataa caaaggctgg catgccctgg tcagctttct caatgtggcc 2280 cacaacgcca tcttacgggc cagcctgcct aagcagaa gccccgaga gtatggaatc 2340 accgtcatta gccaacccct gaacctgacc aaggagcagc tctcagagat tacagtgctg 2400 accacttcag tggatgctgt ggttgccatc tgcgtgattt tctccatgtc cttcgtccca 2460 gccagctttg tcctttattt gatccaggag cgggtgaaca aatccaagca cctccagttt 2520 atcagtggag tgagccccac cacctactgg gtaaccaact tcctctggga catcatgaat 2580 tattccgtga gtgctgggct ggtggtgggc atcttcatcg ggtttcagaa gaaagcctac 2640 acttctccag aaaaccttcc tgcccttgtg gcactgctcc tgctgtatgg atgggcggtc 2700 attcccatga tgtacccagc atccttcctg tttgatgtcc ccagcacagc ctatgtggct 2760 ttatcttgtg ctaatctgtt catcggcatc aacagcagtg ctattacctt catcttggaa 2820 ttatttgaga ataaccggac gctgctcagg ttcaacgccg tgctgaggaa gctgctcatt 2880 gtcttccccc acttctgcct gggccggggc ctcattgacc ttgcactgag ccaggctgtg 2940 acagatgtct atgcccggtt tggtgaggag cactctgcaa atccgttcca ctgggacctg 3000 attgggaaga acctgtttgc catggtggtg gaaggggtgg tgtacttcct cctgaccctg 3060 ctggtccagc gccacttctt cctctcccaa tggattgccg agcccactaa ggagcccatt 3120 gttgatgaag atgatgatgt ggctgaagaa agacaaagaa ttattactgg tggaaataaa 3180 actgacatct 
taaggctaca tgaactaacc aagatttatc caggcacctc cagcccagca 3240 gtggacaggc tgtgtgtcgg agttcgccct ggagagtgct ttggcctcct gggagtgaat 3300 ggtgccggca aaacaaccac attcaagatg ctcactgggg acaccacagt gacctcaggg 3360 gatgccaccg tagcaggcaa gagtatttta accaatattt ctgaagtcca tcaaaatatg 3420 ggctactgtc ctcagtttga tgcaatcgat gagctctctca caggacgaga acatctttac 3480 ctttatgccc ggcttcgagg tgtaccagca gaagaaatcg aaaaggttgc aaactggagt 3540 attaagagcc tgggcctgac tgtctacgcc gactgcctgg ctggcacgta cagtgggggc 3600 aaaagcgga aactctccac agccatcgca ctcattggct gcccaccgct ggtgctgctg 3660 gatgagccca ccacagggat ggacccccag gcacgccgca tgctgtggaa cgtcatcgtg 3720 agcatcatca gagagggag ggctgtggtc ctcacatccc acagcatgga agaatgtgag 3780 gcactgtgta cccggctggc catcatggta aagggcgcct ttcgatgtat gggcaccatt 3840 cagcatctca agtccaaatt tggagatggc tatatcgtca caatgaagat caaatccccg 3900 aaggacgacc tgcttcctga cctgaaccct gtggagcagt tcttccaggg gaacttccca 3960 ggcagtgtgc agagggagag gcactacaac atgctccagt tccaggtctc ctcctcctcc 4020 ctggcgagga tcttccagct cctcctctcc cacaaggaca gcctgctcat cgaggagtac 4080 tcagtcacac agaccacact ggaccaggtg tttgtaaatt ttgctaaaca gcagactgaa 4140 agtcatgacc tccctctgca ccctcgagct gctggagcca gtcgacaagc ccaggactga 4200 gcggccgctt cgagcagaca tgataagata cattgatgag tttggacaaa ccacaactag 4260 aatgcagtga aaaaaatgct ttatttgtga aatttgtgat gctattgctt tatttgtaac 4320 cattataagc tgcaataaac aagttaacaa caacaattgc attcatttta tgtttcaggt 4380 tcagggggag atgtgggagg ttttttaaag caagtaaaac ctctacaaat gtggtaaaat 4440 cgataaggat cttcctagag catggctacg tagataagta gcatggcggg ttaatcatta 4500 actacaagga accctagtg atggagttgg ccactccctc tctgcgcgct cgctcgctca 4560 ctgaggccgg gcgaccaaag gtcgcccgac gcccgggctt tgcccgggcg gcctcagtga 4620 gcgagcgagc gcgcag 4636 <210> 57 <211> 4540 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 57 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta 
gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca taatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat 360 caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 420 taaatggccc gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt 480 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 540 ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg 600 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact 660 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 720 ggcagtacac caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 780 ccattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 840 gtaataaccc cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 900 taagcagagc tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc 960 acagttaaat tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca 1020 gaagttggtc gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag 1080 accaatagaa actgggcttg tcgagacaga gaagactctt gcgtttctga taggcaccta 1140 ttggtcttac tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt 1200 acagctctta aggctagagt acttaatacg actcactata ggctagcctc gagaattcac 1260 gcgtggtacc tctagagtcg acccgggcgg ccgccatggg cttcgtgaga cagatacagc 1320 ttttgctctg gaagaactgg accctgcgga aaaggcaaaa gattcgcttt gtggtggaac 1380 tcgtgtggcc tttatcttta tttctggtct tgatctggtt aaaggaatgcc aacccgctct 1440 acagccatca tgaatgccat ttccccaaca aggcgatgcc ctcagcagga atgctgccgt 1500 ggctccaggg gatcttctgc aatgtgaaca atccctgttt tcaaagcccc acccccaggag 1560 aatctcctgg aattgtgtca aactataaca actccatctt ggcaagggta tatcgagatt 1620 ttcaagaact cctcatgaat gcaccagaga gccagcacct tggccgtatt tggacagagc 1680 tacacatctt gtcccaattc atggacaccc tccggactca cccggagaga attgcaggaa 1740 gagaattcg ataagggat atcttgaaag atgaagaaac actgacacta tttctcatta 1800 aaaacatcgg cctgtctgac tcagtggtct accttctgat caactctcaa gtccgtccag 1860 
agcagttcgc tcatggagtc ccggacctgg cgctgaagga catcgcctgc agcgaggccc 1920 tcctggagcg cttcatcatc ttcagccaga gacgcggggc aaagacggtg cgctatgccc 1980 tgtgctccct ctcccagggc accctacagt ggataaga cactctgtat gccaacgtgg 2040 acttcttcaa gctcttccgt gtgcttccca cactcctaga cagccgttct caaggtatca 2100 atctgagatc ttggggagga atattatctg atatgtcacc aagaattcaa gagtttatcc 2160 atcggccgag tatgcaggac ttgctgtggg tgaccaggcc cctcatgcag aatggtggtc 2220 cagagacctt tacaaagctg atgggcatcc tgtctgacct cctgtgtggc taccccgagg 2280 gaggtggctc tcgggtgctc tccttcaact ggtatgaaga caataactat aaggcctttc 2340 tggggattga ctccacaagg aggatccta tctattctta tgacagaaga acaacatcct 2400 tttgtaatgc attgatccag agcctggagt caaatccttt aaccaaaatc gcttggaggg 2460 cggcaaagcc tttgctgatg ggaaaaatcc tgtacactcc tgattcacct gcagcacgaa 2520 ggatactgaa gaatgccaac tcaacttttg aagaactgga acacgttagg aagttggtca 2580 aagcctggga agaagtaggg ccccagatct ggtacttctt tgacaacagc acacagatga 2640 acatgatcag agataccctg gggaacccaa footaaaga ctttttgaat aggcagcttg 2700 gtgaagaagg tattactgct gaagccatcc taaacttcct ctacaagggc cctcgggaaa 2760 gccaggctga cgacatggcc aacttcgact ggagggacat atttaacatc actgatcgca 2820 ccctccgcct tgtcaatcaa tacctggagt gcttggtcct ggataagtttt gaaagctaca 2880 atgatgaaac tcagctcacc caacgtgccc tctctctact ggaggaaaac atgttctggg 2940 ccggagtggt attccctgac atgtatccct ggaccagctc tctaccaccc cacgtgaagt 3000 ataagatccg aatggacata gacgtggtgg agaaaaccaa taagattaaa gacaggtatt 3060 gggattctgg tcccagagct gatcccgtgg aagatttccg gtacatctgg ggcgggtttg 3120 cctatctgca ggacatggtt gaacagggga tcacaaggag ccaggtgcag gcggaggctc 3180 cagttggaat ctacctccag cagatgccct acccctgctt cgtggacgat tctttcatga 3240 tcatcctgaa ccgctgtttc cctatcttca tggtgctggc atggatctac tctgtctcca 3300 tgactgtgaa gagcatcgtc ttggagaagg agttgcgact gaaggagacc ttgaaaaatc 3360 agggtgtctc caatgcagtg atttggtgta cctggttcct ggacagcttc tccatcatgt 3420 cgatgagcat cttcctcctg acgatattca tcatgcatgg aagaatccta cattacagcg 3480 acccattcat cctcttcctg ttcttgttgg ctttctccac tgccaccatc atgctgtgct 3540 ttctgctcag 
caccttcttc tccaaggcca gtctggcagc agcctgtagt ggtgtcatct 3600 atttcaccct ctacctgcca cacatcctgt gcttcgcctg gcaggaccgc atgaccgctg 3660 agctgaagaa ggctgtgagc ttactgtctc cggtggcatt tggatttggc actgagtacc 3720 tggttcgctt tgaagagcaa ggcctggggc tgcagtggag caacatcggg aacagtccca 3780 cggaagggga cgaattcagc ttcctgctgt ccatgcagat gatgctcctt gatgctgctg 3840 tctatggctt actcgcttgg taccttgatc aggtgtttcc aggagactat ggaaccccac 3900 ttccttggta ctttcttcta caagagtcgt attggcttgg cggtgaaggg tgttcaacca 3960 gagaagaaag agccctggaa aagaccgagc ccctaacaga ggaaacggag gatccagagc 4020 acccagaagg aatacacgac tccttctttg aacgtgagca tccagggtgg gttcctgggg 4080 tatgcgtgaa gaatctggta aagatttttg agccctgtgg ccggccagct gtggaccgtc 4140 tgaacatcac cttctacgag aaccagatca ccgcattcct gggccacaat ggagctggga 4200 aaaccaccac cttgtaagta tcaaggttac aagacaggtt taaggagacc aatagaaact 4260 gggcttgtcg agacagagaa gactcttgcg tttctgggat ttttccgatt tcggcctatt 4320 ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgt 4380 ttataatttc aggtggcatc tttccaattg aggaacccct agtgatggag ttggccactc 4440 cctctctgcg cgctcgctcg ctcactgagg ccgggcgacc aaaggtcgcc cgacgcccgg 4500 gctttgcccg ggcggcctca gtgagcgagc gagcgcgcag 4540 <210> 58 <211> 4702 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 58 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccggga tttttccgat ttcggcctat tggttaaaaa atgagctgat 180 ttaacaaaaa tttaacgcga attttaacaa aatattaacg tttataattt caggtggcat 240 ctttcgatag gcacctattg gtcttactga catccacttt gcctttctct ccacaggtcc 300 atcctgacgg gtctgttgcc accaacctct gggactgtgc tcgttggggg aagggacatt 360 gaaaccagcc tggatgcagt ccggcagagc cttggcatgt gtccacagca caacatcctg 420 ttccaccacc tcacggtggc tgagcacatg ctgttctatg cccagctgaa aggaaagtcc 480 caggaggagg cccagctgga gatggaagcc atgttggagg acacaggcct ccaccacaag 540 cggaatgaag aggctcagga cctatcaggt ggcatgcaga gaaagctgtc ggttgccatt 600
gcctttgtgg gagatgccaa ggtggtgatt ctggacgaac ccacctctgg ggtggaccct 660 tactcgagac gctcaatctg ggatctgctc ctgaagtatc gctcaggcag aaccatcatc 720 atgtccactc accacatgga cgaggccgac ctccttgggg accgcattgc catcattgcc 780 cagggaaggc tctactgctc aggcacccca ctcttcctga agaactgctt tggcacaggc 840 ttgtacttaa ccttggtgcg caagatgaaa aacatccaga gccaaaggaa aggcagtgag 900 gggacctgca gctgctcgtc taagggtttc tccaccacgt gtccagccca cgtcgatgac 960 ctaactccag aacaagtcct ggatggggat gtaaatgagc tgatggatgt agttctccac 1020 catgttccag aggcaaagct ggtggagtgc attggtcaag aacttatctt ccttcttcca 1080 aataagaact tcaagcacag agcatatgcc agccttttca gagagctgga ggagacgctg 1140 gctgaccttg gtctcagcag ttttggaatt tctgacactc ccctggaaga gatttttctg 1200 aaggtcacgg aggattctga ttcaggacct ctgtttgcgg gtggcgctca gcagaaaaga 1260 gaaaacgtca acccccgaca cccctgcttg ggtcccagag agaaggctgg acagacaccc 1320 caggactcca atgtctgctc cccaggggcg ccggctgctc acccagaggg ccagcctccc 1380 ccagagccag agtgcccagg cccgcagctc aacacgggga cacagctggt cctccagcat 1440 gtgcaggcgc tgctggtcaa gagattccaa cacaccatcc gcagccacaa ggacttcctg 1500 gcgcagatcg tgctcccggc tacctttgtg tttttggctc tgatgctttc tattgttatc 1560 cctccttttg gcgaataccc cgctttgacc cttcacccct ggatatatgg gcagcagtac 1620 accttcttca gcatggatga accaggcagt gagcagttca cggtacttgc agacgtcctc 1680 ctgaataagc caggctttgg caaccgctgc ctgaaggaag ggtggcttcc ggagtacccc 1740 tgtggcaact caacaccctg gaagactcct tctgtgtccc caaacatcac ccagctgttc 1800 cagaagcaga aatggacaca ggtcaaccct tcaccatcct gcaggtgcag caccagggag 1860 1920 acacagcgca gcacggaaat tctacaagac ctgacggaca ggaacatctc cgacttcttg 1980 gtaaaaacgt atcctgctct tataagaagc agctaaaga gcaaattctg ggtcaatgaa 2040 cagaggtatg gaggaatttc cattggagga aagctcccag tcgtccccat cacgggggaa 2100 gcacttgttg ggtttttaag cgaccttggc cggatcatga atgtgagcgg gggccctatc 2160 2220 attaaggtgt ggtttaataa caaaggctgg catgccctgg tcagctttct caatgtggcc 2280 cacaacgcca tcttacgggc cagcctgcct aagcagaa gccccgaga gtatggaatc 2340 accgtcatta gccaacccct gaacctgacc aaggagcagc tctcagagat tacagtgctg 2400 accacttcag 
tggatgctgt ggttgccatc tgcgtgattt tctccatgtc cttcgtccca 2460 gccagctttg tcctttattt gatccaggag cgggtgaaca aatccaagca cctccagttt 2520 atcagtggag tgagccccac cacctactgg gtaaccaact tcctctggga catcatgaat 2580 tattccgtga gtgctgggct ggtggtgggc atcttcatcg ggtttcagaa gaaagcctac 2640 acttctccag aaaaccttcc tgcccttgtg gcactgctcc tgctgtatgg atgggcggtc 2700 attcccatga tgtacccagc atccttcctg tttgatgtcc ccagcacagc ctatgtggct 2760 ttatcttgtg ctaatctgtt catcggcatc aacagcagtg ctattacctt catcttggaa 2820 ttatttgaga ataaccggac gctgctcagg ttcaacgccg tgctgaggaa gctgctcatt 2880 gtcttccccc acttctgcct gggccggggc ctcattgacc ttgcactgag ccaggctgtg 2940 acagatgtct atgcccggtt tggtgaggag cactctgcaa atccgttcca ctgggacctg 3000 attgggaaga acctgtttgc catggtggtg gaaggggtgg tgtacttcct cctgaccctg 3060 ctggtccagc gccacttctt cctctcccaa tggattgccg agcccactaa ggagcccatt 3120 gttgatgaag atgatgatgt ggctgaagaa agacaaagaa ttatactgg tggaaataa 3180 actgacatct taaggctaca tgaactaacc aagatttatc caggcacctc cagcccagca 3240 gtggacaggc tgtgtgtcgg agttcgccct ggagagtgct ttggcctcct gggagtgaat 3300 ggtgccggca aaacaaccac attcaagatg ctcactgggg acaccacagt gacctcaggg 3360 gatgccaccg tagcaggcaa gagtatttta accaatattt ctgaagtcca tcaaaatatg 3420 ggctactgtc ctcagtttga tgcaatcgat gagctctctca caggacgaga acatctttac 3480 ctttatgccc ggcttcgagg tgtaccagca gaagaaatcg aaaaggttgc aaactggagt 3540 attaagagcc tgggcctgac tgtctacgcc gactgcctgg ctggcacgta cagtgggggc 3600 aaaagcgga aactctccac agccatcgca ctcattggct gcccaccgct ggtgctgctg 3660 gatgagccca ccacagggat ggacccccag gcacgccgca tgctgtggaa cgtcatcgtg 3720 agcatcatca gagagggag ggctgtggtc ctcacatccc acagcatgga agaatgtgag 3780 gcactgtgta cccggctggc catcatggta aagggcgcct ttcgatgtat gggcaccatt 3840 cagcatctca agtccaatt tggagatggc tatatcgtca caatgaat caatccccg 3900 aaggacgacc tgcttcctga cctgaaccct gtggagcagt tctccaggg gaacttccca 3960 ggcagtgtgc agagggag gcactacaac atgctccagt tccaggtctc ctcctcctcc 4020 ctggcgagga tcttccagct cctcctcc cacaaggaca gcctgctcat cgaggagtac 4080 tcagtcacac agaccacact ggaccaggtg 
ttgtaaatt ttgctaaca gcagactgaa 4140 agtcatgacc tccctctgca cccgagct gctggagcca gtcgacaagc ccaggacgac 4200 tacaaagacc atgacggtga tataaagat catgacatcg actahaagga tgacgatgac 4260 aagtgagcgg ccgcttcgag cagacatgat aagatacatt gatgagtttg gaaaaccac 4320 aactagaatg cagtgaaaaa aatgctttat ttgtgaaatt tgtgatgcta ttgctttatt 4380 tgtaaccatt ataagctgca aaaacaagt ataacaac aattgcattc attttatgtt 4440 tcaggttcag ggggagatgt gggaggtttt ttaaagcaag taaaacctct acaaatgtgg 4500 taaaatcgat aaggatctc ctagagcatg gctacgtaga taagtagcat ggcgggttaa 4560 tcattaacta caaggaaccc ctagtgatgg agttggccac tccctctctg cgcgctcgct 4620 cgctcactga ggccgggcga ccaaaggtcg cccgacgccc gggctttgcc cgggcggcct 4680 cagtgagcga gcgagcgcgc ag 4702 <210> 59 <211> 4718 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 59 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca taatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat 360 caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 420 taaatggccc gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt 480 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 540 ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg 600 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact 660 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 720 ggcagtacac caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 780 ccattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 840 gtaataaccc cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 900 taagcagagc tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc 960 acagttaaat tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca 1020 gaagttggtc gtgaggcact 
gggcaggtaa gtatcaaggt tacaagacag gtttaaggag 1080 accaatagaa actgggcttg tcgagacaga gaagactctt gcgtttctga taggcaccta 1140 ttggtcttac tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt 1200 acagctctta aggctagagt acttaatacg actcactata ggctagcctc gagaattcac 1260 gcgtggtacc tctagagtcg acccgggcgg ccgccatggg cttcgtgaga cagatacagc 1320 ttttgctctg gaagaactgg accctgcgga aaaggcaaaa gattcgcttt gtggtggaac 1380 tcgtgtggcc tttatcttta tttctggtct tgatctggtt aaggaatgcc aacccgctct 1440 acagccatca tgaatgccat ttccccaaca aggcgatgcc ctcagcagga atgctgccgt 1500 ggctccaggg gatcttctgc aatgtgaaca atccctgttt tcaaagcccc accccaggag 1560 aatctcctgg aattgtgtca aactataaca actccatctt ggcaagggta tatcgagatt 1620 ttcaagaact cctcatgaat gcaccagaga gccagcacct tggccgtatt tggacagagc 1680 tacacatctt gtcccaattc atggacaccc tccggactca cccggagaga attgcaggaa 1740 gaggaattcg aataagggat atcttgaaag atgaagaaac actgacacta tttctcatta 1800 aaaacatcgg cctgtctgac tcagtggtct accttctgat caactctcaa gtccgtccag 1860 agcagttcgc tcatggagtc ccggacctgg cgctgaagga catcgcctgc agcgaggccc 1920 tcctggagcg cttcatcatc ttcagccaga gacgcggggc aaagacggtg cgctatgccc 1980 tgtgctccct ctcccagggc accctacagt ggatagaaga cactctgtat gccaacgtgg 2040 acttcttcaa gctcttccgt gtgcttccca cactcctaga cagccgttct caaggtatca 2100 atctgagatc ttggggagga atattatctg atatgtcacc aagaattcaa gagtttatcc 2160 atcggccgag tatgcaggac ttgctgtggg tgaccaggcc cctcatgcag aatggtggtc 2220 cagagacctt tacaaagctg atgggcatcc tgtctgacct cctgtgtggc taccccgagg 2280 gaggtggctc tcgggtgctc tccttcaact ggtatgaaga caataactat aaggcctttc 2340 tggggattga ctccacaagg aggatccta tctattctta tgacagaaga acaacatcct 2400 tttgtaatgc attgatccag agcctggagt caaatccttt aaccaaaatc gcttggaggg 2460 cggcaaagcc tttgctgatg ggaaaaatcc tgtacactcc tgattcacct gcagcacgaa 2520 ggatactgaa gaatgccaac tcaacttttg aagaactgga acacgttagg aagttggtca 2580 aagcctggga agaagtaggg ccccagatct ggtacttctt tgacaacagc acacagatga 2640 acatgatcag agataccctg gggaacccaa cagtaaaga cttttgaat aggcagcttg 2700 gtgaagaagg tattactgct gaagccatcc 
taaacttcct ctacaagggc cctcgggaaa 2760 gccaggctga cgacatggcc aacttcgact gggaggacat atttacatc actgatcgca 2820 cccctccgcct tgtcaatca tacctggagt gcttggtcct ggataagttt gaagctaca 2880 atgatgaaac tcagctcacc caacgtgccc tctctctact ggaggaaac atgttctggg 2940 ccggagtggt atccctgac atgtatccct ggaccagctc tctaccaccc cacgtgaagt 3000 aatagatccg atggacata gacgtggtgg agaaaaccaa agattaa gaggaggtatt 3060 gggattctgg tcccagagct gatcccgtgg aagatttccg gtacatctgg ggcggttg 3120 cctatctgca ggacatggtt gaacagggga tcacaggag ccaggtgcag gcggaggctc 3180 cagttggaat ctacctccag cagatgccct acccctgct cgtggacgat tctttcatga 3240 tcatcctgaa ccgctgttc cctatcttca tgtgctgc atggatc tctgtctcca 3300 tgactgtgaa gagcatcgtc ttggagagg agttgcgact gaaggagacc ttgaaaaatc 3360 agggtgtctc caatgcagtg atttggtgta cctggttcct ggacagcttc tccatcatgt 3420 cgatgagcat cttcctcctg acgatattca tcatgcatgg aagaatccta cattacagcg 3480 acccattcat cctcttcctg ttcttgttgg ctttctccac tgccaccatc atgctgtgct 3540 ttctgctcag caccttcttc tccaaggcca gtctggcagc agcctgtagt ggtgtcatct 3600 atttcaccct ctacctgcca cacatcctgt gcttcgcctg gcaggaccgc atgaccgctg 3660 agctgaagaa ggctgtgagc ttactgtctc cggtggcatt tggatttggc actgagtacc 3720 tggttcgctt tgaagagcaa ggcctggggc tgcagtggag caacatcggg aacagtccca 3780 cggaagggga cgaattcagc ttcctgctgt ccatgcagat gatgctcctt gatgctgctg 3840 tctatggctt actcgcttgg taccttgatc aggtgtttcc aggagactat ggaaccccac 3900 ttccttggta ctttcttcta caagagtcgt attggcttgg cggtgaaggg tgttcaacca 3960 gagaagaaag agccctggaa aagaccgagc ccctaacaga ggaaacggag gatccagagc 4020 acccagaagg aatacacgac tccttctttg aacgtgagca tccagggtgg gttcctgggg 4080 tatgcgtgaa gaatctggta aagatttttg agccctgtgg ccggccagct gtggaccgtc tgaacatcac cttctacgag aaccagatca ccgcattcct gggccacaat ggagctggga aaaccaccac cttgtaagta tcaaggttac aagacaggtt taaggagacc aatagaaact gggcttgtcg agacagagaa gactcttgcg tttctccccg ggtgcgcggc gtcggtggtg ccggcggggg gcgccaggtc gcaggcggtg tagggctcca ggcaggcggc gaaggccatg 4380 acgtgcgcta tgaaggtctg ctcctgcacg ccgtgaacca ggtgcgcctg cgggccgcgc 4440 gcgaacaccg 
ccacgtcctc gcctgcgtgg gtctcttcgt ccaggggcac tgctgactgc 4500 tgccgatact cggggctccc gctctcgctc tcggtaacat ccggccgggc gccgtccttg 4560 agcacatagc ctggaccgtt tccaattgag gaacccctag tgatggagtt ggccactccc tctctgcgcg ctcgctcgct cactgaggcc gggcgaccaa aggtccccg acgcccgggc 4680 tttgcccggg cggcctcagt gagcgagcga gcgcgcag 4718 <210> 60 <211> 4880 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 60 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatcccccc gggtgcgcgg cgtcggtggt gccggcgggg ggcgccaggt 180 cgcaggcggt gtagggctcc aggcaggcgg cgaaggccat gacgtgcgct atgaaggtct 240 gctcctgcac gccgtgaacc aggtgcgcct gcgggccgcg cgcgaacacc gccacgtcct 300 cgcctgcgtg ggtctcttcg tccaggggca ctgctgactg ctgccgatac tcggggctcc 360 cgctctcgct ctcggtaaca tccggccggg cgccgtcctt gagcacatag cctggaccgt 420 ttcgataggc acctattggt cttactgaca tccactttgc ctttctctcc acaggtccat 480 cctgacgggt ctgttgccac caacctctgg gactgtgctc gttgggggaa gggacattga 540 aaccagcctg gatgcagtcc ggcagagcct tggcatgtgt ccacagcaca acatcctgtt 600 ccaccacctc acggtggctg agcacatgct gttctatgcc cagctgaaag gaaagtccca 660 
ggaggaggcc cagctggaga tggaagccat gttggaggac acaggcctcc accacaagcg 720 gatgaagag gctcaggacc tatcaggtgg catgcagaga aagctgtcgg ttgccattgc 780 ctttgtggga gatgccaagg tggtctct gacgaaccc acctctgggg tggaccctta 840 ctcgagacgc tcaatctggg atctgctct gaagtatcgc tcaggcagaa ccatcatcat 900 gtccactcac cacatggacg aggccgacct ccttggggac cgcattgcca tcattgccca 960 gggaaggctc tactgctcag gcacccact cttcctgaag aactgctttg gcacaggctt 1020 gtacttaacc ttggtgcgca agatgaaaaa catccagagc caaggaag gcagtgaggg 1080 gacctgcagc tgctcgtcta agggtttctc caccacgtgt ccagcccacg tcgatgacct 1140 aactccagaa caagtcctgg atgggatgt aaatgagctg atggatgtag ttctccacca 1200 tgttccagag gcaaagctgg tggagtgcat tggtcaagaa cttatctcc ttctccaaa 1260 taagaacttc aagcacagag catatgccag cctttcaga gagctggagg agacgctggc 1320 tgaccttggt ctcagcagtt tggaatttc tgacactccc ctggaagaga ttttctga 1380 ggtcacggag gattctgatt caggacctct gtttgcgggt ggcgctcagc agaaaagaga 1440 aaacgtcaac ccccgacacc cctgcttggg tcccagagag aaggctggac agacacccca 1500 ggactccaat gtctgctccc caggggcgcc ggctgctcac ccagagggcc agcctccccc 1560 agagccagag tgcccaggcc cgcagctcaa cacggggaca cagctggtcc tccagcatgt 1620 gcaggcgctg ctggtcaaga gattccaaca caccatccgc agccacaagg acttcctggc 1680 gcagatcgtg ctcccggcta cctttgtgtt tttggctctg atgctttcta ttgttatccc 1740 tccttttggc gaataccccg ctttgaccct tcacccctgg atatatgggc agcagtacac 1800 cttcttcagc atggatgaac caggcagtga gcagttcacg gtacttgcag acgtcctcct 1860 gaataagcca ggctttggca accgctgcct gaaggaaggg tggcttccgg agtacccctg 1920 tggcaactca acaccctgga agactccttc tgtgtcccca aacatcaccc agctgttcca 1980 gaagcagaaa tggacacagg tcaacccttc accatcctgc aggtgcagca ccagggagaa 2040 gctcaccatg ctgccagagt gccccgaggg tgccgggggc ctcccgcccc cccagagaac 2100 acagcgcagc acggaaattc tacaagacct gacggacagg aacatctccg acttcttggt 2160 aaaaacgtat cctgctctta taagaagcag cttaaagagc aaattctggg tcaatgaaca 2220 gaggtatgga ggaatttcca ttggaggaaa gctcccagtc gtccccatca cgggggaagc 2280 acttgttggg tttttaagcg accttggccg gatcatgaat 
gtgagcgggg gccctatcac 2340 tagagaggcc tctaaagaaa tacctgattt ccttaaacat ctagaaactg aagaacaacat 2400 taaggtgtgg tttaataaca aaggctggca tgccctggtc agctttctca atgtggccca 2460 caacgccatc ttacgggcca gcctgcctaa ggacagaagc cccgaggagt atggaatcac 2520 cgtcattagc caacccctga acctgaccaa ggagcagctc tcagagatta cagtgctgac 2580 cacttcagtg gatgctgtgg ttgccatctg cgtgattttc tccatgtcct tcgtcccagc 2640 cagctttgtc ctttatttga tccaggagcg ggtgaacaaa tccaagcacc tccagtttat 2700 cagtggagtg agccccacca cctactgggt aaccaacttc ctctgggaca tcatgaatta 2760 ttccgtgagt gctgggctgg tggtgggcat cttcatcggg tttcaagaaga aagcctacac 2820 ttctccagaa aaccttcctg cccttgtggc actgctcctg ctgtatggat gggcggtcat 2880 tcccatgatg tacccagcat ccttcctgtt tgatgtcccc agcacagcct atgtggcttt 2940 atcttgtgct aatctgttca tcggcatca cagcagtgct attack tcttggaatt atttgagaat aaccggacgc tgctcaggtt caacgccgtg ctgaggaagc tgctcattgt cttcccccac ttctgcctgg gccggggcct cattgacctt gcactgagcc aggctgtgac 3120 agatgtctat gcccggtttg gtgaggagca ctctgcaaat ccgttccact gggacctgat 3180. 3240. tggggagac ctgtttgcca tggtggtgga aggggtggtg tacttcctcc tgaccctgct ggtccagcgc cacttcttcc tctcccaatg gattgccgag cccactaagg agcccattgt 3300 tgatgaagat gatgatgtgg ctgaagaaag acaaagaatt attactggtg gaaataaaac 3420. 
tgacatctta aggctacatg aactacca gatttatcca ggcacctcca gcccagcagt ggacaggctg tgtgtcggag ttcgccctgg agagtgcttt ggcctcctgg gagtgaatgg 3480 tgccggcaaa acaaccacat tcaagatgct cactggggac accacagtga cctcagggga tgccaccgta gcaggcaaga gtattttaac caatatttct gaagtccatc aaaatatggg 3600 ctactgtcct cagtttgatg caatcgatga gctgctcaca ggacgagaac atctttacct 3660 ttatgcccgg cttcgaggtg taccagcaga agaaatcgaa aaggttgcaa actggagtat 3720 tagagagcctg ggcctgactg tctacgccga ctgcctggct ggcacgtaca gtgggggcaa 3780 caggggaaa ctctccacag ccatcgcact cattggctgc ccaccgctgg tgctgctgga 3840 tgagcccacc acagggatgg acccccaggc acgccgcatg ctgtggaacg tcatcgtgag 3900 catcatcaga gaaggggggg ctgtggtcct cacatcccac agcatggaag aatgtgaggc 3960 actgtgtacc cggctggcca tcatggtaaa gggcgccttt cgatgtatgg gcaccattca 4020 gcatctcaag tccaaatttg gagatggcta tatcgtcaca atgaagatca aatccccgaa 4080 ggacgacctg cttcctgacc tgaaccctgt ggagcagttc ttccaggga acttcccagg 4140 cagtgtgcag agggagaggc actacaacat gctccagttc caggtctcct cctcctccct 4200 ggcgaggatc ttccagctcc tcctctccca caaggacagc ctgctcatcg aggagtactcg 4260 agtcacacag accacactgg accaggtgtt tgtaaatttt gctaaacagc agactgaaag 4320 tcatgacctc cctctgcacc ctcgagctgc tggagccagt cgacaagccc aggacgacta 4380 caaagaccat gacggtgatt ataaagatca tgacatcgac tacaaggatg acgatgacaa 4440 gtgagcggcc gcttcgagca gacatgataa gatacattga tgagtttgga caaaccacaa 4500 ctagaatgca gtgaaaaaaa tgctttattt gtgaaatttg tgatgctatt gctttatttg 4560 taaccattat aagctgcaat aaacaagtta acaacaacaa ttgcattcat tttatgtttc 4620 aggttcaggg ggagatgtgg gaggtttttt aaagcaagta aaacctctac aaatgtggta 4680 aaatcgataa ggatcttcct agagcatggc tacgtagata agtagcatgg cgggttaatc 4740 attaactaca aggaacccct agtgatggag ttggccactc cctctctgcg cgctcgctcg 4800 ctcactgagg ccgggcgacc aaaggtcgcc cgacgcccgg gctttgcccg ggcggcctca 4860 gtgagcgagc gagcgcgcag 4880 <210> 61 <211> 4719 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 61 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc cgggcgtcg ggcgacctt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa 
ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca tatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagttat tatagtaat 360 caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 420 taaatggcc gcctggctga ccgcccaacg acccccgcc attgacgtca ataatgacgt 480 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 540 ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg 600 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact 660 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 720 ggcagtcac caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 780 ccattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 840 gtaataaccc cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 900 taagcagagc tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc 960 acagttaaat tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca 1020 gaagttggtc gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag 1080 1140 ttggtcttac tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt 1200 acagctctta aggctagagt acttaatacg actcactata ggctagcctc gagaattcac 1260 gcgtggtacc tctagagtcg acccgggcgg ccgccatggg cttcgtgaga cagatacagc 1320 ttttgctctg gaagaactgg accctgcgga aaaggcaaaa gattcgcttt gtggtggaac 1380 tcgtgtggcc tttatcttta tttctggtct tgatctggtt aaaggaatgcc aacccgctct 1440 acagccatca tgaatgccat ttccccaaca aggcgatgcc ctcagcagga atgctgccgt 1500 ggctccaggg gatcttctgc aatgtgaaca atccctgttt tcaaagcccc accccaggag 1560 aatctcctgg aattgtgtca aactataaca actccatctt ggcaagggta tatcgagatt 1620 ttcaagaact cctcatgaat gcaccagaga gccagcacct tggccgtatt tggacagagc 1680 tacacatctt gtcccaattc atggacaccc tccggactca cccggagaga attgcaggaa 1740 gaggaattcg aataagggat atcttgaaag atgaagaaac actgacacta tttctcatta 1800 aaaacatcgg cctgtctgac tcagtggtct accttctgat caactctcaa gtccgtccag 1860 
agcagttcgc tcatggagtc ccggacctgg cgctgaagga catcgcctgc agcgaggccc 1920 tcctggagcg cttcatcatc ttcagccaga gacgcggggc aaagacggtg cgctatgccc 1980 tgtgctccct ctcccagggc accctacagt ggatagaaga cactctgtat gccaacgtgg 2040 acttcttcaa gctcttccgt gtgcttccca cactcctaga cagccgttct caaggtatca 2100 atctgagatc ttggggagga atattatctg atatgtcacc aagaattcaa gagtttatcc 2160 atcggccgag tatgcaggac ttgctgtggg tgaccaggcc cctcatgcag aatggtggtc 2220 cagagacctt tacaaagctg atgggcatcc tgtctgacct cctgtgtggc taccccgagg 2280 gaggtggctc tcgggtgctc tccttcaact ggtatgaaga caataactat aaggcctttc 2340 2400 tttgtaatgc attgatccag agcctggagt caaatccttt aaccaaaatc gcttggaggg 2460 cggcaaagcc tttgctgatg ggaaaaatcc tgtacactcc tgattcacct gcagcacgaa 2520 ggaatgaa gaatgccaac tcaacttttg aagaactgga acacgttagg aagttggtca 2580 aagcctggga agagtggg ccccagatct ggtacttctt tgacaacagc acacagatga 2640 acatgatcag agataccctg gggaacccaa cagtaaaaga cttttgaat aggcagcttg 2700 gtgaagaagg tattactgct gaagccatcc taaacttcct ctacaagggc cctcgggaaa 2760 gccaggctga cgacatggcc aacttcgact ggagggacat atttaacatc actgatcgca 2820 ccctccgcct tgtcaatcaa tacctggagt gcttggtcct ggataagttt gaaagctaca 2880 atgatgaaac tcagctcacc caacgtgccc tctctctact ggaggaaaac atgttctggg 2940 ccggagtggt attccctgac atgtatccct ggaccagctc tctaccaccc cacgtgaagt 3000 ataagatccg aatggacata gacgtggtgg agaaaaccaa taagattaaa gacaggtatt 3060 gggattctgg tcccagagct gatcccgtgg aagatttccg gtacatctgg ggcgggtttg 3120 cctatctgca ggacatggtt gaacagggga tcacaaggag ccaggtgcag gcggaggctc 3180 cagttggaat ctacctccag cagatgccct acccctgctt cgtggacgat tctttcatga 3240 tcatcctgaa ccgctgtttc cctatcttca tggtgctggc atggatctac tctgtctcca 3300 tgactgtgaa gagcatcgtc ttggagaagg agttgcgact gaaggagacc ttgaaaaatc 3360 agggtgtctc caatgcagtg atttggtgta cctggttcct ggacagcttc tccatcatgt 3420 cgatgagcat cttcctcctg acgatattca tcatgcatgg aagaatccta cattacagcg 3480 acccattcat cctcttcctg ttcttgttg cttctccac tgccaccatc atgctgtgct 3540 ttctgctcag caccttcttc tccaaggcca gtctggcagc agcctgtagt ggtgtcatct 3600 atttcaccct 
ctacctgcca cacatcctgt gcttcgcctg gcaggaccgc atgaccgctg 3660 agctgaagaa ggctgtgagc ttactgtctc cggtggcatt tggatttggc actgagtacc 3720 tggttcgctt tgaagagcaa ggcctggggc tgcagtggag caacatcggg aacagtccca 3780 cggaagggga cgaattcagc ttcctgctgt ccatgcagat gatgctcctt gatgctgctg 3840 tctatggctt actcgcttgg taccttgatc aggtgtttcc aggagactat ggaaccccac 3900 ttccttggta ctttcttcta caagagtcgt attggcttgg cggtgaaggg tgttcaacca 3960 gagaagaaag agccctggaa aagaccgagc ccctaacaga ggaaacggag gatccagagc 4020 acccagaagg aatacacgac tccttctttg aacgtgagca tccagggtgg gttcctgggg 4080 tatgcgtgaa gaatctggta aagatttttg agccctgtgg ccggccagct gtggaccgtc 4140 tgaacatcac cttctacgag aaccagatca ccgcattcct gggccacaat ggagctggga 4200 aaaccaccac cttgtaagta tcaaggttac aagacaggtt taaggagacc aatagaaact 4260 gggcttgtcg agacagagaa gactcttgcg tttctcgcag ggcagcctct gtcatctcca 4320 tcagggaggg gtccagtgtg gagtctcggt ggatctcgta tttcatgtct ccaggctcaa 4380 agagacccat gagatgggtc acagacgggt ccagggaagc ctgcatgagc tcagtgcggt 4440 tccacacata ccgggcaccc tggcgcttcg ccagccattc ctgcaccaga ttcttcccgt 4500 ccagcctggt cccaccttgg ctgtagtcat ctgggtactc agggtctggg gttcccatgc 4560 gaaacatgta ctttcggcct ccacaattga ggaaccccta gtgatggagt tggccactcc 4620 ctctctgcgc gctcgctcgc tcactgaggc cgggcgacca aaggtcgccc gacgcccggg 4680 ctttgcccgg gcggcctcag tgagcgagcg agcgcgcag 4719 <210> 62 <211> 4881 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 62 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatcccgca gggcagcctc tgtcatctcc atcagggagg ggtccagtgt 180 ggagtctcgg tggatctcgt atttcatgtc tccaggctca aagagaccca tgagatgggt 240 cacagacggg tccagggag cctgcatgag ctcagtgcgg ttccacacat accgggcacc 300 ctggcgcttc gccagccatt cctgcaccag attcttcccg tccagcctgg tcccaccttg 360 gctgtagtca tctgggtact cagggtctgg ggttcccatg cgaaacatgt actttcggcc 420 480. 
tccagatagg cacctattgg tcttactgac atccactttg cctttctctc cacaggtcca tcctgacggg tctgttgcca ccaacctctg ggactgtgct cgttgggga agggacattg 540 aaaccagcct ggatgcagtc cggcagagcc ttggcatgtg tccacagcac aacatcctgt 600 tccaccacct cacggtggct gagcacatgc tgttctatgc ccagctgaaa ggaagtccc aggaggaggc ccagctggag atggaagcca tgttggagga cacaggcctc caccacaagc ggaatgaaga ggctcaggac ctatcaggtg gcatgcagag aaagctgtcg gttgccattg cctttgtggg agatgccaag gtggtgattc tggacgaacc cacctctggg gtggaccctt 840 actcgagacg ctcaatctgg gatctgctcc tgaagtatcg ctcaggcaga accatcatca tgtccactca ccacatggac gaggccgacc tccttgggga ccgcattgcc atcattgccc 960 agggaaggct ctactgctca ggcaccccac tcttcctgaa gaactgcttt ggcacaggct 1020 tgtacttaac cttggtgcgc aagatgaaaa acatccagag ccaaaggaaa ggcagtgagg 1080 ggacctgcag ctgctcgtct aagggtttct ccaccacgtg tccagcccac gtcgatgacc 1140 taactccaga acaagtcctg gatggggatg taaatgagct gatggatgta gttctccacc 1200 atgttccaga ggcaaagctg gtggagtgca ttggtcaaga acttatcttc cttcttccaa 1260 ataagaactt caagcacaga gcatatgcca gccttttcag agagctggag gagacgctgg 1320 ctgaccttgg tctcagcagt tttggaattt ctgacactcc cctggaagag atttttctga 1380 aggtcacgga ggattctgat tcaggacctc tgtttgcggg tggcgctcag cagaaaagag 1440 aaaacgtcaa cccccgacac ccctgcttgg gtcccagaga gaaggctgga cagacacccc 1500 aggactccaa tgtctgctcc ccaggggcgc cggctgctca cccagagggc cagcctcccc 1560 cagagccaga gtgcccaggc ccgcagctca acacggggac acagctggtc ctccagcatg 1620 tgcaggcgct gctggtcaag agattccaac acaccatccg cagccacaag gacttcctgg 1680 cgcagatcgt gctcccggct acctttgtgt ttttggctct gatgctttct attgttatcc 1740 ctccttttgg cgaatacccc gctttgaccc ttcacccctg gatatatggg cagcagtaca 1800 ccttcttcag catggatgaa ccaggcagtg agcagttcac ggtacttgca gacgtcctcc 1860 tgaataagcc aggctttggc aaccgctgcc tgaaggaagg gtggcttccg gagtacccct 1920 gtggcaactc aacaccctgg aagactcctt ctgtgtcccc aaacatcacc cagctgttcc 1980 agaagcagaa atggacacag gtcaaccctt caccatcctg caggtgcagc accagggaga 2040 agctcaccat gctgccagag tgccccgagg gtgccggggg cctcccgccc ccccagagaa 2100 cacagcgcag cacggaaatt ctacaagacc 
tgacggacag gaacatctcc gacttcttgg 2160 taaaaacgta tcctgctctt ataagaagca gcttaaagag caaattctgg gtcaatgaac 2220 agaggtatgg aggaatttcc attggaggaa agctcccagt cgtccccatc acgggggaag 2280 cacttgttgg gtttttaagc gaccttggcc ggatcatgaa tgtgagcggg ggccctatca 2340 ctagagaggc ctctaaagaa atacctgatt tccttaaaca tctagaaact gaagacaaca 2400 ttaaggtgtg gtttaataac aaaggctggc atgccctggt cagctttctc aatgtggccc 2460 acaacgccat cttacgggcc agcctgccta aggacagaag ccccgaggag tatggaatca 2520 ccgtcattag ccaacccctg aacctgacca aggagcagct ctcagagatt acagtgctga 2580 ccacttcagt ggatgctgtg gttgccatct gcgtgatttt ctccatgtcc ttcgtcccag 2640 ccagctttgt cctttatttg atccaggagc gggtgaacaa atccaagcac ctccagttta 2700 tcagtggagt gagccccacc acctactggg taaccaactt cctctgggac atcatgaatt 2760 attccgtgag tgctgggctg gtggtgggca tcttcatcgg gtttcagaag aaagcctaca 2820 cttctccaga aaaccttcct gcccttgtgg cactgctcct gctgtatgga tgggcggtca 2880 ttcccatgat gtacccagca tccttcctgt ttgatgtccc cagcacagcc tatgtggctt 2940 tatcttgtgc taatctgttc atcggcatca acagcagtgc tattaccttc atcttggaat 3000 tatttgagaa taaccggacg ctgctcaggt tcaacgccgt gctgaggaag ctgctcattg 3060 tcttccccca cttctgcctg ggccggggcc tcattgacct tgcactgagc caggctgtga 3120 cagatgtcta tgcccggttt ggtgaggagc actctgcaaa tccgttccac tgggacctga 3180 ttgggaagaa cctgtttgcc atggtggtgg aaggggtggt gtacttcctc ctgaccctgc 3240 tggtccagcg ccacttcttc ctctcccaat ggattgccga gcccactaag gagcccattg 3300 ttgatgaaga tgatgatgtg gctgaagaaa gacaaagaat tattactggt ggaaataaaa 3360 ctgacatctt aaggctacat gaactaacca agatttatcc aggcacctcc agcccagcag 3420 tggacaggct gtgtgtcgga gttcgccctg gagagtgctt tggcctcctg ggagtgaatg 3480 gtgccggcaa aacaaccaca ttcaagatgc tcactgggga caccacagtg acctcagggg 3540 atgccaccgt agcaggcaag agtattttaa ccaatatttc tgaagtccat caaaatatgg 3600 gctactgtcc tcagtttgat gcaatcgatg agctgctcac aggacgagaa catctttacc 3660 tttatgcccg gcttcgaggt gtaccagcag aagaaatcga aaaggttgca aactggagta 3720 ttaagagcct gggcctgact gtctacgccg actgcctggc tggcacgtac agtgggggca 3780 acaagcggaa actctccaca gccatcgcac tcattggctg 
cccaccgctg gtgctgctgg 3840 atgagcccac cacagggatg gacccccagg cacgccgcat gctgtggaac gtcatcgtga 3900 gcatcatcag agaagggagg gctgtggtcc tcacatccca cagcatggaa gaatgtgagg 3960 cactgtgtac ccggctggcc atcatggtaa agggcgcctt tcgatgtatg ggcaccattc 4020 agcatctcaa gtccaaattt ggagatggct atatcgtcac aatgaagatc aaatccccga 4080 aggacgacct gcttcctgac ctgaaccctg tggagcagtt cttccagggg aacttcccag 4140 gcagtgtgca gagggagagg cactacaaca tgctccagtt ccaggtctcc tcctcctccc 4200 tggcgaggat cttccagctc ctcctctccc acaaggacag cctgctcatc gaggagtact 4260 cagtcacaca gaccacactg gaccaggtgt ttgtaaattt tgctaaacag cagactgaaa 4320 gtcatgacct ccctctgcac cctcgagctg ctggagccag tcgacaagcc caggacgact 4380 acaaagacca tgacggtgat tataaagatc atgacatcga ctacaaggat gacgatgaca 4440 agtgagcggc cgcttcgagc agacatgata agatacattg atgagtttgg acaaaccaca 4500 actagaatgc agtgaaaaaa atgctttatt tgtgaaattt gtgatgctat tgctttattt 4560 gtaaccatta taagctgcaa taaacaagtt aacaacaaca attgcattca ttttatgttt 4620 caggttcagg gggagatgtg ggaggttttt taaagcaagt aaaacctcta caaatgtggt 4680 aaaatcgata aggatcttcc tagagcatgg ctacgtagat aagtagcatg gcgggttaat 4740 cattaactac aaggaacccc tagtgatgga gttggccact ccctctctgc gcgctcgctc 4800 gctcactgag gccgggcgac caaaggtcgc ccgacgcccg ggctttgccc gggcggcctc 4860 agtgagcgag cgagcgcgca g 4881 <210> 63 <211> 4709 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 63 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca tatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagttat tatagtaat 360 caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 420 taaatggcc gcctggctga ccgcccaacg acccccgcc attgacgtca ataatgacgt 480 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 540 ggtaaactgc ccacttggca 
gtacatcaag tgtatcatat gccaagtccg ccccctattg 600 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact 660 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 720 ggcagtacac caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 780 caattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 840 gtaataaccc cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 900 taagcagagc tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc 960 acagttaaat tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca 1020 gaagttggtc gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag 1080 1140 ttggtcttac tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt 1200 acagctctta aggctagagt acttaatacg actcactata ggctagcctc gagaattcac 1260 gcgtggtacc tctagagtcg acccgggcgg ccgccatggg cttcgtgaga cagatacagc 1320 ttttgctctg gaagaactgg accctgcgga aaaggcaaaa gattcgcttt gtggtggaac 1380 tcgtgtggcc tttatcttta tttctggtct tgatctggtt aaaggaatgcc aacccgctct 1440 acagccatca tgaatgccat ttccccaaca aggcgatgcc ctcagcagga atgctgccgt 1500 ggctccaggg gatcttctgc aatgtgaaca atccctgttt tcaaagcccc acccccaggag 1560 aatctcctgg aattgtgtca aactataaca actccatctt ggcaagggta tatcgagatt 1620 ttcaagaact cctcatgaat gcaccagaga gccagcacct tggccgtatt tggacagagc 1680 tacacatctt gtcccaattc atggacaccc tccggactca cccggagaga attgcaggaa 1740 gaggaattcg aataagggat atcttgaaag atgaagaaac actgacacta tttctcatta 1800 aaaacatcgg cctgtctgac tcagtggtct accttctgat caactctcaa gtccgtccag 1860 agcagttcgc tcatggagtc ccggacctgg cgctgaagga catcgcctgc agcgaggccc 1920 tcctggagcg cttcatcatc ttcagccaga gacgcggggc aaagacggtg cgctatgccc 1980 tgtgctccct ctcccagggc accctacagt ggatagaaga cactctgtat gccaacgtgg 2040 acttcttcaa gctcttccgt gtgcttccca cactcctaga cagccgttct caaggtatca 2100 atctgagatc ttggggagga atattatctg atatgtcacc aagaattcaa gagtttatcc 2160 atcggccgag tatgcaggac ttgctgtggg tgaccaggcc cctcatgcag aatggtggtc 2220 cagagacctt tacaaagctg atgggcatcc tgtctgacct cctgtgtggc taccccgagg 2280 gaggtggctc tcgggtgctc tccttcaact 
ggtatgaaga caataactat aaggcctttc 2340 tggggattga ctccacaagg aaggatccta tctattctta tgacagaga acacatcct 2400 ttgtaatgc attgatccag agcctggagt caatcctttt aaccaaaatc gcttggagg 2460 cggcaaagcc ttgctgatg ggaaaaatcc tgtacactcc tgatcacct gcagcacgaa 2520 ggatactgaa gatgccaac tcaacttg aagaactgga acacgttagg aagttggtca 2580 aagcctggga agaagtagggg cccagatct ggtactctt tgacacagc accagatga 2640 acatgatcag agataccctg gggaacccaa cagtaaaga cttttgaat aggcagcttg 2700 gtgaagaagg tattactgct gaagccatcc taaacttcct ctacaagggc cctcgggaaa 2760 gccaggctga cgacatggcc aacttcgact gggaggacat atttacatc actgatcgca 2820 cccctccgcct tgtcaatca tacctggagt gcttggtcct ggataagttt gaagctaca 2880 atgatgaaac tcagctcacc caacgtgccc tctctctact ggaggaaac atgttctggg 2940 ccggagtggt atccctgac atgtatccct ggaccagctc tctaccaccc cacgtgaagt 3000 aatagatccg atggacata gacgtggtgg agaaaaccaa agattaa gaggaggtatt 3060 gggattctgg tcccagagct gatcccgtgg aagatttccg gtacatctgg ggcgggtttg 3120 cctatctgca ggacatggtt gaacagggga tcacaaggag ccaggtgcag gcggaggctc 3180 cagttggaat ctacctccag cagatgccct acccctgctt cgtggacgat tctttcatga 3240 tcatcctgaa ccgctgtttc cctatcttca tggtgctggc atggatctac tctgtctcca 3300 tgactgtgaa gagcatcgtc ttggagaagg agttgcgact gaaggagacc ttgaaaaatc 3360 agggtgtctc caatgcagtg atttggtgta cctggttcct ggacagcttc tccatcatgt 3420 cgatgagcat cttcctcctg acgatattca tcatgcatgg aagaatccta cattacagcg 3480 acccattcat cctcttcctg ttcttgttgg ctttctccac tgccaccatc atgctgtgct 3540 ttctgctcag caccttcttc tccaaggcca gtctggcagc agcctgtagt ggtgtcatct 3600 atttcaccct ctacctgcca cacatcctgt gcttcgcctg gcaggaccgc atgaccgctg 3660 agctgaagaa ggctgtgagc ttactgtctc cggtggcatt tggatttggc actgagtacc 3720 tggttcgctt tgaagagcaa ggcctggggc tgcagtggag caacatcggg aacagtccca 3780 cggaagggga cgaattcagc ttcctgctgt ccatgcagat gatgctcctt gatgctgctg 3840 tctatggctt actcgcttgg taccttgatc aggtgttcc aggagactt ggaacccac 3900 ttccttgta ctttcttcta caagagtcgt attggctttgg cggtgaaggg tgttcaacca 3960 gagagaaag agccctggaa aagaccgagc ccctaacaga ggaaacggag gatccagagc 
4020 acccagaagg atacacgac tccttctttg aacgtgagca tccaggttgg gttcctgggg 4080 tatgcgtgaa gatctgta aagatttg agccctgtgg ccggccagct gtggaccgtc 4140 tgaacatcac cttctacgag aaccagatca ccgcattcct gggccacaat ggagctggga 4200 aaaccaccac cttgtaagta tcaggttac aagacaggtt taaggacc atagaaact 4260 gggcttgtcg agacagagaa gactctgcg tttctgtgat cctaggtgga ggccgaaagt 4320 acatgtttcg catgggacc ccagaccctg agtacccaga tgactacagc caggtggga 4380 ccaggctgga cgggaagaat ctggtgcagg aatggctggc gaagcgccag ggtgcccggt 4440 acgtgtggaa ccgcactgag ctcatgcagg cttccctgga cccgctgtg acccatctca 4500 tgggtctctt tgagcctgga gacatgaaat acgagatcca ccgagactcc acactggacc 4560 cctccctgat ggacaattga ggaaccccta gtgatggagt tggccactcc ctctctgcgc 4620 gctcgctcgc tcactgaggc cgggcgacca aaggtcgccc gacgcccggg ctttgcccgg 4680 gcggcctcag tgagcgagcg agcgcgcag 4709 <210> 64 <211> 4871 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 64 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccgtga tcctaggtgg aggccgaaag tacatgtttc gcatgggaac 180 cccagaccct gagtacccag atgactacag ccaaggtggg accaggctgg acgggaagaa 240 tctggtgcag gaatggctgg cgaagcgcca gggtgcccgg tacgtgtgga accgcactga 300 gctcatgcag gcttccctgg acccgtctgt gacccatctc atgggtctct ttgagcctgg 360 agacatgaaa tacgagatcc accgagactc cacactggac ccctccctga tggagatagg 480. cacctattgg tcttactgac atccactttg cctttctctc cacaggtcca tcctgacggg tctgttgcca ccaacctctg ggactgtgct cgttgggggga agggacattg aaaccagcct ggatgcagtc cggcagagcc ttggcatgtg tccacagcac aacatcctgt tccaccacct 600 cacggtggct gagcacatgc tgttctatgc ccagctgaaa ggaaagtccc aggaggaggc 660 ccagctggag atggaagcca tgttggagga cacaggcctc caccacaagc ggaatgaaga ggctcaggac ctatcaggtg gcatgcagag aaagctgtcg gttgccattg cctttgtggg 780 agatgccaag gtggtgattc tggacgaacc cacctctggg gtggaccctt actcgagacg 840 900. 
ctcaatctgg gatctgctcc tgaagtatcg ctcaggcaga accatcatca tgtccactca ccacatggac gaggccgacc tccttgggga ccgcattgcc atcattgccc agggaaggct ctactgctca ggcaccccac tcttcctgaa gaactgcttt ggcacaggct tgtacttaac cttggtgcgc aagatgaaaa acatccagag ccaaaggaa ggcagtgagg ggacctgcag ctgctcgtct aagggtttct ccaccacgtg tccagcccac gtcgatgacc taactccaga 1140 acaagtcctg gatggggatg taaatgagct gatggatgta gttctccacc atgttccaga 1200 ggcaaagctg gtggagtgca ttggtcaaga acttatcttc cttcttccaa ataagaactt 1260 caagcacaga gcatatgcca gccttttcag agagctggag gagacgctgg ctgaccttgg 1320 tctcagcagt tttggaattt ctgacactcc cctggaagag atttttctga aggtcacgga 1380 ggattctgat tcaggacctc tgtttgcggg tggcgctcag cagaaaagag aaaacgtcaa 1440 cccccgacac ccctgcttgg gtcccagaga gaaggctgga cagacacccc aggactccaa 1500 tgtctgctcc ccaggggcgc cggctgctca cccagagggc cagcctcccc cagagccaga 1560 gtgcccaggc ccgcagctca acacggggac acagctggtc ctccagcatg tgcaggcgct 1620 gctggtcaag agattccaac acaccatccg cagccacaag gacttcctgg cgcagatcgt 1680 gctcccggct acctttgtgt ttttggctct gatgctttct attgttatcc ctccttttgg 1740 cgaatacccc gctttgaccc ttcacccctg gatatatggg cagcagtaca ccttcttcag 1800 catggatgaa ccaggcagtg agcagttcac ggtacttgca gacgtcctcc tgaataagcc 1860 aggctttggc aaccgctgcc tgaaggaagg gtggcttccg gagtacccct gtggcaactc 1920 aacaccctgg aagactcctt ctgtgtcccc aaacatcacc cagctgttcc agaagcagaa 1980 atggacacag gtcaaccctt caccatcctg caggtgcagc accagggaga agctcaccat 2040 gctgccagag tgccccgagg gtgccgggg cctcccgccc ccccagagaa cacagcgcag 2100 cacggaaatt ctacaagacc tgacggacag gaacatctcc gacttcttgg taaaaacgta 2160 tcctgctctt ataagaagca gcttaaagag caaattctgg gtcaatgaac agaggtatgg 2220 aggaatttcc attggaggaa agctcccagt cgtccccatc acgggggaag cacttgttgg 2280 gtttttaagc gaccttggcc ggatcatgaa tgtgagcggg ggccctatca ctagagaggc 2340 ctctaaagaa atacctgatt tccttaaaca tctagaaact gaagacaaca ttaaggtgtg 2400 gtttaataac aaaggctggc atgccctggt cagctttctc aatgtggccc acaacgccat 2460 cttacgggcc agcctgccta aggacagaag ccccgaggag tatggaatca ccgtcattag 2520 ccaacccctg aacctgacca aggagcagct 
ctcagagatt acagtgctga ccacttcagt 2580 ggatgctgtg gttgccatct gcgtgatttt ctccatgtcc ttcgtcccag ccagctttgt 2640 cctttatttg atccaggagc gggtgaacaa atccaagcac ctccagttta tcagtggagt 2700 gagccccacc acctactggg taaccaactt cctctgggac atcatgaatt attccgtgag 2760 tgctgggctg gtggtgggca tcttcatcgg gtttcagaag aaagcctaca cttctccaga 2820 aaaccttcct gcccttgtgg cactgctcct gctgtatgga tgggcggtca ttcccatgat 2880 gtacccagca tccttcctgt ttgatgtccc cagcacagcc tatgtggctt tatcttgtgc 2940 taatctgttc atcggcatca acagcagtgc tattaccttc atcttggaat tatttgagaa 3000 taaccggacg ctgctcaggt tcaacgccgt gctgaggaag ctgctcattg tcttccccca 3060 cttctgcctg ggccggggcc tcattgacct tgcactgagc caggctgtga cagatgtcta 3120 tgcccggttt ggtgaggagc actctgcaaa tccgttccac tgggacctga ttgggaagaa 3180 cctgtttgcc atggtggtgg aaggggtggt gtacttcctc ctgaccctgc tggtccagcg 3240 ccacttcttc ctctcccaat ggattgccga gcccactaag gagcccattg ttgatgaaga 3300 tgatgatgtg gctgaagaaa gacaaagaat tattactggt ggaaataaaa ctgacatctt 3360 aaggctacat gaactaacca agatttatcc aggcacctcc agcccagcag tggacaggct 3420 gtgtgtcgga gttcgccctg gagagtgctt tggcctcctg ggagtgaatg gtgccggcaa 3480 aacaaccaca ttcaagatgc tcactgggga caccacagtg acctcagggg atgccaccgt 3540 agcaggcaag agtattttaa ccaatatttc tgaagtccat caaaatatgg gctactgtcc 3600 tcagtttgat gcaatcgatg agctgctcac aggacgagaa catctttacc tttatgcccg 3660 gcttcgaggt gtaccagcag aagaaatcga aaaggttgca aactggagta ttaagagcct 3720 gggcctgact gtctacgccg actgcctggc tggcacgtac agtgggggca acaagcggaa 3780 actctccaca gccatcgcac tcattggctg cccaccgctg gtgctgctgg atgagcccac 3840 cacagggatg gacccccagg cacgccgcat gctgtggaac gtcatcgtga gcatcatcag 3900 agaagggagg gctgtggtcc tcacatccca cagcatggaa gaatgtgagg cactgtgtac 3960 ccggctggcc atcatggtaa agggcgcctt tcgatgtatg ggcaccattc agcatctcaa 4020 gtccaaattt ggagatggct atatcgtcac aatgaagatc aaatccccga aggacgacct 4080 gcttcctgac ctgaaccctg tggagcagtt cttccagggg aacttcccag gcagtgtgca 4140 gagggagagg cactacaaca tgctccagtt ccaggtctcc tcctcctccc tggcgaggat 4200 cttccagctc ctcctctccc acaaggacag cctgctcatc 
gaggagtact cagtcacaca 4260 gaccacactg gaccaggtgt ttgtaaattt tgctaaacag cagactgaaa gtcatgacct 4320 ccctctgcac cctcgagctg ctggagccag tcgacaagcc caggacgact acaaagacca 4380 tgacggtgat tataaagatc atgacatcga ctacaaggat gacgatgaca agtgagcggc 4440 cgcttcgagc agacatgata agatacattg atgagtttgg acaaaccaca actagaatgc 4500 agtgaaaaaa atgctttatt tgtgaaattt gtgatgctat tgctttattt gtaaccatta 4560 taagctgcaa taaacaagtt aacaacaaca attgcattca ttttatgttt caggttcagg 4620 gggagatgtg ggaggttttt taaagcaagt aaaacctcta caaatgtggt aaaatcgata 4680 aggatcttcc tagagcatgg ctacgtagat aagtagcatg gcgggttaat cattaactac 4740 aaggaacccc tagtgatgga gttggccact ccctctctgc gcgctcgctc gctcactgag 4800 gccgggcgac caaaggtcgc ccgacgcccg ggctttgccc gggcggcctc agtgagcgag 4860 cgagcgcgca g 4871 <210> 65 <211> 4073 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 65 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca taatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagtggg ccccagaagc 360 ctggtggttg tttgtccttc tcaggggaaa agtgaggcgg ccccttggag gaaggggccg 420 ggcagaatga tctaatcgga ttccaagcag ctcaggggat tgtctttttc tagcaccttc ttgccactcc tagcgtcct ccgtgacccc ggctgggatt tagcctggtg ctgtgtcagc 540 cccggggctcc caggggcttc ccagtggtcc ccaggaccc tcgacagggc cagggcgtct 600 ctctcgtcca gcaagggcag ggacgggcca caggcaaggg cgcggccgcc atgggcttcg 660 tgagacagat acagcttttg ctctggaaga actggaccct gcggaaaagg caaaagattc gctttgtggt ggaactcgtg tggcctttat ctttatttct ggtcttgatc tggttaagga 780 atgccaaccc gctctacagc catcatgaat gccatttccc caacaaggcg atgccctcag caggatgct gccgtggctc caggggatct tctgcaatgt gacaatccc tgttttcaaa gccccacccc aggagaatct cctggaattg tgtcaaacta taacaactcc atcttggcaa gggtatatcg agttttcaa gaactcctca tgaatgcacc agagagccag 
caccttggcc gtatttggac agagctacac atcttgtccc aattcatgga caccctccgg actcacccgg agagaattgc aggaagga attcgaata gggatatctt gaaagatgaa gaaacactga cactatttct cattaaaac atcggcctgt ctgactcagt ggtctacctt ctgatcaact 1200 ctcaagtccg tccagagcag ttcgctcatg gagtcccgga cctggcgctg areacatcg 1260 cctgcagcga ggccctcctg gagcgcttca tcatcttcag ccagagacgc ggggcaaaga 1320 cggtgcgcta tgccctgtgc tccctctccc agggcaccct acagtggata gaagacactc 1380 tgtatgccaa cgtggacttc ttcaagctct tccgtgtgct tcccacactc ctagacagcc 1440 gttctcaagg tatcaatctg agatcttggg gaggaatatt atctgatatg tcaccaagaa 1500 ttcaagagtt tatccatcgg ccgagtatgc aggacttgct gtgggtgacc aggcccctca 1560 tgcagaatgg tggtccagag acctttacaa agctgatggg catcctgtct gacctcctgt 1620 gtggctaccc cgagggaggt ggctctcggg tgctctcctt caactggtat gaagacaata 1680 actataaggc ctttctgggg attgactcca caaggaagga tcctatctat tcttatgaca 1740 gaaacaac atccttttgt aatgcattga tccagagcct ggagtcaaat cctttaacca 1800 aaatcgcttg gagggcggca aagcctttgc tgatgggaaa aatcctgtac actcctgatt 1860 cacctgcagc acgaaggata ctgaagaatg ccaactcaac ttttgaagaa ctggaacacg 1920 ttaggaagtt ggtcaaagcc tgggaagaag tagggcccca gatctggtac ttctttgaca 1980 acagcacaca gatgaacatg atcagagata ccctggggaa cccaacagta aaagactttt 2040 tgaataggca gcttggtgaa gaaggtatta ctgctgaagc catcctaaac ttcctctaca 2100 agggccctcg ggaaagccag gctgacgaca tggccaactt cgactggagg gacatattta 2160 acatcactga tcgcaccctc cgccttgtca atcaatacct ggagtgcttg gtcctggata 2220 agtttgaaag ctacaatgat gaaactcagc tcacccaacg tgccctctct ctactggagg 2280 aaaacatgtt ctgggccgga gtggtattcc ctgacatgta tccctggacc agctctctac 2340 caccccacgt gaagtataag atccgaatgg acatagacgt ggtggagaaa accaataaga 2400 ttaaagacag gtattgggat tctggtccca gagctgatcc cgtggaagat ttccggtaca 2460 tctggggcgg gtttgcctat ctgcaggaca tggttgaaca ggggatcaca aggagccagg 2520 tgcaggcgga ggctccagtt ggaatctacc tccagcagat gccctacccc tgcttcgtgg 2580 acgattcttt catgatcatc ctgaaccgct gtttccctat cttcatggtg ctggcatgga 2640 tctactctgt ctccatgact gtgaagagca tcgtcttgga gaaggagttg cgactgaagg 2700 agaccttgaa 
aaatcagggt gtctccaatg cagtgatttg gtgtacctgg ttcctggaca 2760 gcttctccat catgtcgatg agcatcttcc tcctgacgat attcatcatg catggaagaa 2820 tcctacatta cagcgaccca ttcatcctct tcctgttctt gttggctttc tccactgcca 2880 ccatcatgct gtgctttctg ctcagcacct tcttctccaa ggccagtctg gcagcagcct 2940 gtagtggtgt catctatttc accctctacc tgccacacat cctgtgcttc gcctggcagg 3000 accgcatgac cgctgagctg aagaaggctg tgagcttact gtctccggtg gcatttggat 3060 ttggcactga gtacctggtt cgctttgaag agcaaggcct ggggctgcag tggagcaaca 3120 tcgggaacag tcccacggaa ggggacgaat tcagcttcct gctgtccatg cagatgatgc 3180 tccttgatgc tgctgtctat ggcttactcg cttggtacct tgatcaggtg tttccaggag 3240 actatggaac cccacttcct tggtactttc ttctacaaga gtcgtattgg cttggcggtg 3300 aagggtgttc aaccagagaa gaaagagccc tggaaaagac cgagccccta acagaggaaa 3360 cggaggatcc agagcaccca gaaggaatac acgactcctt ctttgaacgt gagcatccag 3420 ggtgggttcc tggggtatgc gtgaagaatc tggtaaagat ttttgagccc tgtggccggc 3480 cagctgtgga ccgtctgaac atcaccttct acgagaacca gatcaccgca ttcctgggcc 3540 acaatggagc tgggaaaacc accaccttgt aagtatcaag gttacaagac aggtttaagg 3600 agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct ccccgggtgc 3660 gcggcgtcgg tggtgccggc ggggggcgcc aggtcgcagg cggtgtaggg ctccaggcag 3720 gcggcgaagg ccatgacgtg cgctatgaag gtctgctcct gcacgccgtg aaccaggtgc 3780 gcctgcgggc cgcgcgcgaa caccgccacg tcctcgcctg cgtgggtctc ttcgtccagg 3840 ggcactgctg actgctgccg atactcgggg ctcccgctct cgctctcggt aacatccggc 3900 cgggcgccgt ccttgagcac atagcctgga ccgtttccaa ttgaggaacc cctagtgatg 3960 gagttggcca ctccctctct gcgcgctcgc tcgctcactg aggccgggcg accaaaggtc 4020 gcccgacgcc cgggctttgc ccgggcggcc tcagtgagcg agcgagcgcg cag 4073 <210> 66 <211> 4074 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 66 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta 
ttggccattg catacgttgt atctatatca taatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagtggg ccccagaagc 360 ctggtggttg tttgtccttc tcaggggaaa agtgaggcgg ccccttggag gaaggggccg 420 ggcagaatga tctaatcgga ttccaagcag ctcaggggat tgtctttttc tagcaccttc 480 ttgccactcc taagcgtcct ccgtgacccc ggctgggatt tagcctggtg ctgtgtcagc 540 cccggggctcc caggggcttc ccagtggtcc ccaggaccc tcgacagggc cagggcgtct 600 ctctcgtcca gcaagggcag ggacgggcca caggcaaggg cgcggccgcc atgggcttcg 660 tgagacagat acagcttttg ctctggaaga actggaccct gcggaaaagg caaaagattc gctttgtggt ggaactcgtg tggcctttat ctttatttct ggtcttgatc tggttaagga 780 atgccaaccc gctctacagc catcatgaat gccatttccc caacaaggcg atgccctcag caggatgct gccgtggctc caggggatct tctgcaatgt gacaatccc tgttttcaaa gccccacccc aggagaatct cctggaattg tgtcaaacta taacaactcc atcttggcaa gggtatatcg agttttcaa gaactcctca tgaatgcacc agagagccag caccttggcc gtatttggac agagctacac atcttgtccc aattcatgga caccctccgg actcacccgg agagaattgc aggaagga attcgaata gggatatctt gaaagatgaa gaaacactga cactatttct cattaaaaac atcggcctgt ctgactcagt ggtctacctt ctgatcaact ctcaagtccg tccagagcag ttcgctcatg gagtcccgga cctggcgctg aaggacatcg cctgcagcga ggccctcctg gagcgcttca tcatcttcag ccagagacgc ggggcaaaga 1320 cggtgcgcta tgccctgtgc tccctctccc agggcaccct acagtggata gaagacactc 1380 tgtatgccaa cgtggacttc ttcaagctct tccgtgtgct tcccacactc ctagacagcc 1440 gttctcaagg tatcaatctg agatcttggg gaggaatatt atctgatatg tcaccaagaa 1500 ttcaagagtt tatccatcgg ccgagtatgc aggacttgct gtgggtgacc aggcccctca 1560 tgcagaatgg tggtccagag acctttacaa agctgatggg catcctgtct gacctcctgt 1620 gtggctaccc cgagggaggt ggctctcggg tgctctcctt caactggtat gaagacaata 1680 actataaggc ctttctgggg attgactcca caaggaagga tcctatctat tcttatgaca 1740 gaaacaac atccttttgt aatgcattga tccagagcct ggagtcaaat cctttaacca 1800 aaatcgcttg gagggcggca aagcctttgc tgatgggaaa aatcctgtac actcctgatt 1860 cacctgcagc acgaaggata ctgaagaatg ccaactcaac ttttgaagaa ctggaacagg 1920 ttaggaagtt ggtcaaagcc tgggaagaag tagggcccca gatctggtac ttctttgaca 1980 acagcacaca 
gatgaacatg atcagagata ccctggggaa cccaacagta aaagactttt 2040 tgaataggca gcttggtgaa gaaggtatta ctgctgaagc catcctaaac ttcctctaca 2100 agggccctcg ggaaagccag gctgacgaca tggccaactt cgactggagg gacatattta 2160 acatcactga tcgcaccctc cgccttgtca atcaatacct ggagtgcttg gtcctggata 2220 agtttgaaag ctacaatgat gaaactcagc tcacccaacg tgccctctct ctactggagg 2280 aaaacatgtt ctgggccgga gtggtattcc ctgacatgta tccctggacc agctctctac 2340 caccccacgt gaagtataag atccgaatgg acatagacgt ggtggagaaa accaataaga 2400 ttaaagacag gtattgggat tctggtccca gagctgatcc cgtggaagat ttccggtaca 2460 tctggggcgg gtttgcctat ctgcaggaca tggttgaaca ggggatcaca aggagccagg 2520 tgcaggcgga ggctccagtt ggaatctacc tccagcagat gccctacccc tgcttcgtgg 2580 acgattcttt catgatcatc ctgaaccgct gtttccctat cttcatggtg ctggcatgga 2640 tctactctgt ctccatgact gtgaagagca tcgtcttgga gaaggagttg cgactgaagg 2700 agaccttgaa aaatcagggt gtctccaatg cagtgatttg gtgtacctgg ttcctggaca 2760 gcttctccat catgtcgatg agcatcttcc tcctgacgat attcatcatg catggaagaa 2820 tcctacatta cagcgaccca ttcatcctct tcctgttctt gttggctttc tccactgcca 2880 ccatcatgct gtgctttctg ctcagcacct tcttctccaa ggccagtctg gcagcagcct 2940 gtagtggtgt catctatttc accctctacc tgccacacat cctgtgcttc gcctggcagg 3000 accgcatgac cgctgagctg aagaaggctg tgagcttact gtctccggtg gcatttggat 3060 ttggcactga gtacctggtt cgctttgaag agcaaggcct ggggctgcag tggagcaaca 3120 tcgggaacag tcccacggaa ggggacgaat tcagcttcct gctgtccatg cagatgatgc 3180 tccttgatgc tgctgtctat ggcttactcg cttggtacct tgatcaggtg tttccaggag 3240 actatggaac cccacttcct tggtactttc ttctacaaga gtcgtattgg cttggcggtg 3300 aagggtgtc aaccagagaa gaaagagccc tggaaaagac cgagcccta acagaggaaa 3360 cggaggatcc agagcaccca gaaggaatac acgactcctt ctttgaacgt gagcatccag 3420 ggtgggttcc tggggtatgc gtgaagaatc tggtaaagat ttttgagccc tgtggccggc 3480 cagctgtgga ccgtctgaac atcaccttct acgagaacca gatcaccgca ttcctgggcc 3540 acaatggagc tgggaaaacc accaccttgt aagtatcaag gttacaagac aggtttaagg 3600 agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct cgcagggcag 3660 cctctgtcat ctccatcagg 
gaggggtcca gtgtggagtc tcggtggatc tcgtatttca 3720 tgtctccagg ctcaaagaga cccatgagat gggtcacaga cgggtccagg gaagcctgca 3780 tgagctcagt gcggttccac acataccggg caccctggcg cttcgccagc cattcctgca 3840 ccagattctt cccgtccagc ctggtcccac cttggctgta gtcatctggg tactcagggt 3900 ctggggttcc catgcgaaac atgtactttc ggcctccaca attgaggaac ccctagtgat 3960 ggagttggcc actccctctc tgcgcgctcg ctcgctcact gaggccgggc gaccaaaggt 4020 cgcccgacgc ccgggctttg cccgggcggc ctcagtgagc gagcgagcgc gcag 4074 <210> 67 <211> 4636 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 67 ctctcccccc tgtcgcgttc gctcgctcgc tggctcgttt gggggggtgg cagctcaaag 60 agctgccaga cgacggccct ctggccgtcg cccccccaaa cgagccagcg agcgagcgaa 120 cgcgacaggg gggagagtgc cacactctca agcaaggggg ttttgtaagc agtgagctag 180 cctgaattcc agcacactgg cggccgttac tagtggatct tcaatattgg ccattagcca 240 tattattcat tggttatata gcataaatca atattggcta ttggccattg catacgttgt 300 atctatatca taatatgtac atttatattg gctcatgtcc aatatgaccg ccatgttggc 360 attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 420 atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 480 acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 540 tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 600 tgtatcatat gccaagtccg cccccctattg acgtcaatga cggtaaatgg cccgcctggc 660 attatgccca gtacatgacc ttacgggact ttcctacttg gcagtacatc tacgtattag 720 tcatcgctat taccatggtg atgcggtttt ggcagtacac caatgggcgt ggatagcggt 780 ttgactcacg gggatttcca agtctccacc ccattgacgt caatgggagt ttgttttggc 840 accaaaatca acgggacttt ccaaaatgtc gtaataaccc cgccccgttg acgcaaatgg 900 gcggtaggcg tgtacggtgg gaggtctata taagcagagc tcgtttagtg aaccgtcaga 960 tcactagaag ctttattgcg gtagtttatc acagttaaat tgctaacgca gtcagtgctt 1020 ctgacacaac agtctcgaac ttaagctgca gaagttggtc gtgaggcact gggcaggtaa 1080 gtatcaaggt tacaagacag gtttaaggag accaatagaa actgggcttg tcgagacaga 1140 gaagactctt gcgtttctga taggcaccta ttggtcttac tgacatccac tttgcctttc 1200 tctccacagg tgtccactcc cagttcaatt
acagctctta aggctagagt acttaatacg 1260 actcactata ggctagcctc gagaattcac gcgtggtacc tctagagtcg acccgggcgg 1320 1380 aaaggcaaaa gattcgcttt gtggtggaac tcgtgtggcc tttatcttta tttctggtct 1440 tgatctggtt aaaggaatgcc aacccgctct acagccatca tgaatgccat ttccccaaca 1500 aggcgatgcc ctcagcagga atgctgccgt ggctccaggg gatcttctgc aatgtgaaca 1560 atccctgttt tcaaagcccc accccaggag aatctcctgg aattgtgtca aactataaca 1620 actccatctt ggcaagggta tatcgagatt ttcaagaact cctcatgaat gcaccagaga 1680 gccagcacct tggccgtatt tggacagagc tacacatctt gtcccaattc atggacaccc 1740 1800 atgaagaac actgacacta tttctcatta aaaacatcgg cctgtctgac tcagtggtct 1860 accttctgat caactctcaa gtccgtccag agcagttcgc tcatggagtc ccggacctgg 1920 cgctgaagga catcgcctgc agcgaggccc tcctggagcg cttcatcatc ttcagccaga 1980 gacgcggggc aaagacggtg cgctatgccc tgtgctccct ctcccagggc accctacagt 2040 ggataaga cactctgtat gccaacgtgg acttcttcaa gctcttccgt gtgcttccca 2100 cactcctaga cagccgttct caaggtatca atctgagatc ttggggagga atattatctg 2160 atatgtcacc aagaattcaa gagtttatcc atcggccgag tatgcaggac ttgctgtggg 2220 tgaccaggcc cctcatgcag aatggtggtc cagagacctt tacaaagctg atgggcatcc 2280 tgtctgacct cctgtgtggc taccccgagg gaggtggctc tcgggtgctc tccttcaact 2340 ggtatgaaga caataactat aaggcctttc tggggattga ctccacaagg aaggatccta 2400 tctattctta tgacagaaga acaacatcct tttgtaatgc attgatccag agcctggagt 2460 caaatccttt aaccaaaatc gcttggaggg cggcaaagcc tttgctgatg ggaaaaatcc 2520 tgtacactcc tgattcacct gcagcacgaa ggatactgaa gaatgccaac tcaacttttg 2580 aagaactgga acacgttagg aagttggtca aagcctggga agaagtaggg ccccagatct 2640 ggtacttctt tgacaacagc acacagatga acatgatcag agataccctg gggaacccaa 2700 ftaaaga ctttttgaat aggcagcttg gtgaagaagg tattactgct gaagccatcc 2760 taaacttcct ctacaagggc cctcgggaaa gccaggctga cgacatggcc aacttcgact 2820 ggagggacat atttaacatc actgatcgca ccctccgcct tgtcaatcaa tacctggagt 2880 gcttggtcct ggataagttt gaaagctaca atgatgaaac tcagctcacc caacgtgccc 2940 tctctctact ggaggaaaac atgttctggg ccggagtggt attccctgac atgtatccct 3000 ggaccagctc tctaccaccc cacgtgaagt 
ataagatccg aatggacata gacgtggtgg 3060 agaaaaccaa taagattaaa gacaggtatt ggggactacaa agaccatgac ggtgattata 3120 aagatcatga catcgactac aaggatgacg atgacaagga ttctggtccc agagctgatc 3180 ccgtggaaga tttccggtac atctggggcg ggtttgccta tctgcaggac atggttgaac 3240 aggggatcac aagggagccag gtgcaggcgg aggctccagt tggaatctac ctccagcaga 3300 tgccctaccc ctgcttcgtg gacgattctt tcatgatcat cctgaaccgc tgtttcccta 3360 tcttcatggt gctggcatgg atctactctg tctccatgac tgtgaagagc atcgtcttgg 3420 agaaggagtt gcgactgaag gagaccttga aaaatcaggg tgtctccaat gcagtgattt 3480 ggtgtacctg gttcctggac agcttctcca tcatgtcgat gagcatcttc ctcctgacga 3540 tattcatcat gcatggaaga atcctacatt acagcgaccc attcatcctc ttcctgttct 3600 tgttggcttt ctccactgcc accatcatgc tgtgctttct gctcagcacc ttcttctcca 3660 aggccagtct ggcagcagcc tgtagtggtg tcatctattt caccctctac ctgccacaca 3720 tcctgtgctt cgcctggcag gaccgcatga ccgctgagct gaagaggct gtgagcttac 3780 tgtctccggt ggcatttgga tttggcactg agtacctggt tcgctttgaa gagcaaggcc 3840 tggggctgca gtggagcaac atcgggaaca gtcccacgga aggggacgaa ttcagcttcc 3900 tgctgtccat gcagatgatg ctccttgatg ctgctgtcta tggcttactc gcttggtacc 3960 ttgatcaggt gtttccagga gactatggaa ccccacttcc ttggtacttt cttctacaag 4020 agtcgtattg gcttggcggt gaagggtgtt caaccagaga agaaagagcc ctggaaaaga 4080 4140 tctttgaacg tgagcatcca gggtgggttc ctggggtatg cgtgaagaat ctggtaaaga 4200 tttttgagcc ctgtggccgg ccagctgtgg accgtctgaa catcaccttc tacgagaacc 4260 agatcaccgc attcctgggc cacaatggag ctgggaaaac caccaccttg taagtatcaa 4320 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 4380 cttgcgtttc tgggattttt ccgatttcgg cctattggtt aaaaaatgag ctgatttaac 4440 aaaaatttaa cgcgaatttt aacaaaatat taacgtttat aatttcaggt ggcatctttc 4500 caattgagga acccctagtg atggagttgg ccactccctc tctgcgcgct cgctcgctca 4560 ctgaggccgg gcgaccaaag gtcgcccgac gcccgggctt tgcccgggcg gcctcagtga 4620 gcgagcgagc gcgcag 4636 <210> 68 <211> 4731 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 68 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 
ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccggga tttttccgat ttcggcctat tggttaaaaa atgagctgat 180 ttaacaaaaa tttaacgcga attttaacaa aatattaacg tttataattt caggtggcat 240 ctttcgatag gcacctattg gtcttactga catccacttt gcctttctct ccacaggtcc 300 atcctgacgg gtctgttgcc accaacctct gggactgtgc tcgttggggg aagggacatt 360 gaaaccagcc tggatgcagt ccggcagagc cttggcatgt gtccacagca caacatcctg 420 ttccaccacc tcacggtggc tgagcacatg ctgttctatg cccagctgaa aggaaagtcc 480 caggaggagg cccagctgga gatggaagcc atgttggagg acacaggcct ccaccacaag 540 cggaatgaag aggctcagga cctatcaggt ggcatgcaga gaaagctgtc ggttgccatt 600 gcctttgtgg gagatgccaa ggtggtgatt ctggacgaac ccacctctgg ggtggaccct 660 tactcgagac gctcaatctg ggatctgctc ctgaagtatc gctcaggcag aaccatcatc 720 atgtccactc accacatgga cgaggccgac ctccttgggg accgcattgc catcattgcc 780 cagggaaggc tctactgctc aggcacccca ctcttcctga agaactgctt tggcacaggc 840 ttgtacttaa ccttggtgcg caagatgaaa aacatccaga gccaaaggaa aggcagtgag 900 gggacctgca gctgctcgtc taagggtttc tccaccacgt gtccagccca cgtcgatgac 960 ctaactccag aacaagtcct ggatggggat gtaaatgagc tgatggatgt agttctccac 1020 catgttccag aggcaaagct ggtggagtgc attggtcaag aacttatctt ccttcttcca 1080 aataagaact tcaagcacag agcatatgcc agccttttca gagagctgga ggagacgctg 1140 gctgaccttg gtctcagcag ttttggaatt tctgacactc ccctggaaga gatttttctg 1200 aaggtcacgg aggattctga ttcaggacct ctgtttgcgg gtggcgctca gcagaaaaga 1260 gaaaacgtca acccccgaca cccctgcttg ggtcccagag agaaggctgg acagacaccc 1320 caggactcca atgtctgctc cccaggggcg ccggctgctc acccagaggg ccagcctccc 1380 ccagagccag agtgcccagg cccgcagctc aacacgggga cacagctggt cctccagcat 1440 gtgcaggcgc tgctggtcaa gagattccaa cacaccatcc gcagccacaa ggacttcctg 1500 gcgcagatcg tgctcccggc tacctttgtg tttttggctc tgatgctttc tattgttatc 1560 cctccttttg gcgaataccc cgctttgacc cttcacccct ggatatatgg gcagcagtac 1620 accttcttca gcatggatga accaggcagt gagcagttca cggtacttgc agacgtcctc 1680 ctgaataagc caggctttgg caaccgctgc ctgaaggaag ggtggcttcc ggagtacccc 1740 tgtggcaact caacaccctg 
gaagactcct tctgtgtccc caaacatcac ccagctgttc 1800 cagaagcaga aatggacaca ggtcaaccct tcaccatcct gcaggtgcag caccagggag 1860 1920 acacagcgca gcacggaaat tctacaagac ctgacggaca ggaacatctc cgacttcttg 1980 gtaaaaacgt atcctgctct tataagaagc agctaaaga gcaaattctg ggtcaatgaa 2040 cagaggtatg gaggaatttc cattggagga aagctcccag tcgtccccat cacgggggaa 2100 gcacttgttg ggtttttaag cgaccttggc cggatcatga atgtgagcgg gggccctatc 2160 2220 attaaggtgt ggtttaataa caaaggctgg catgccctgg tcagctttct caatgtggcc 2280 cacaacgcca tcttacgggc cagcctgcct aagcagaa gccccgaga gtatggaatc 2340 accgtcatta gccaacccct gaacctgacc aaggagcagc tctcagagat tacagtgctg 2400 accacttcag tggatgctgt ggttgccatc tgcgtgattt tctccatgtc cttcgtccca 2460 gccagctttg tcctttattt gatccaggag cgggtgaaca aatccaagca cctccagttt 2520 atcagtggag tgagccccac cacctactgg gtaaccaact tcctctggga catcatgaat 2580 tattccgtga gtgctgggct ggtggtgggc atcttcatcg ggtttcagaa gaaagcctac 2640 acttctccag aaaaccttcc tgcccttgtg gcactgctcc tgctgtatgg atgggcggtc 2700 attcccatga tgtacccagc atccttcctg tttgatgtcc ccagcacagc ctatgtggct 2760 ttatcttgtg ctaatctgtt catcggcatc aacagcagtg ctattacctt catcttggaa 2820 ttatttgaga ataaccggac gctgctcagg ttcaacgccg tgctgaggaa gctgctcatt 2880 gtcttccccc acttctgcct gggccggggc ctcattgacc ttgcactgag ccaggctgtg 2940 acagatgtct atgcccggtt tggtgaggag cactctgcaa atccgttcca ctgggacctg 3000 attgggaaga acctgtttgc catggtggtg gaaggggtgg tgtacttcct cctgaccctg 3060 ctggtccagc gccacttctt cctctcccaa tggattgccg agcccactaa ggagcccatt 3120 gttgatgaag atgatgatgt ggctgaagaa agacaaagaa ttatactgg tggaaataa 3180 actgacatct taaggctaca tgaactaacc aagatttatc caggcacctc cagcccagca 3240 gtggacaggc tgtgtgtcgg agttcgccct ggagagtgct ttggcctcct gggagtgaat 3300 ggtgccggca aaacaaccac attcaagatg ctcactgggg acaccacagt gacctcaggg 3360 gatgccaccg tagcaggcaa gagtatttta accaatattt ctgaagtcca tcaaaatatg 3420 ggctactgtc ctcagtttga tgcaatcgat gagctctctca caggacgaga acatctttac 3480 ctttatgccc ggcttcgagg tgtaccagca gaagaaatcg aaaaggttgc aaactggagt 3540 attaagagcc tgggcctgac 
tgtctacgcc gactgcctgg ctggcacgta cagtgggggc 3600 aaaagcgga aactctccac agccatcgca ctcattggct gcccaccgct ggtgctgctg 3660 gatgagccca ccacagggat ggacccccag gcacgccgca tgctgtggaa cgtcatcgtg 3720 agcatcatca gagagggag ggctgtggtc ctcacatccc acagcatgga agaatgtgag 3780 gcactgtgta cccggctggc catcatggta aagggcgcct ttcgatgtat gggcaccatt 3840 cagcatctca agtccaaatt tggagatggc tatatcgtca caatgaagat caaatccccg 3900 aggacgacc tgcttcctga cctgaaccct gtggagcagt tcttccaggg gaacttccca 3960 ggcagtgtgc agagggagag gcactacaac atgctccagt tccaggtctc ctcctctcc 4020 ctggcgagga tcttccagct cctctctcc cacaaggaca gcctgctcat cgaggagtac 4080 tcagtcacac agaccacact ggaccaggtg tttgtaaatt ttgctaaaca gcagactgaa 4140 agtcatgacc tccctctgca ccctcgagct gctggagcca gtcgacaagc ccaggacgac 4200 tacaaagacc atgacggtga ttaataaagat catgacatcg actacaagga tgacgatgac 4260 aagtgagcgg ccgcttcgag cagacatgat aagatacatt gatgagtttg gacaaaccac 4320 aactagaatg cagtgaaaaa aatgctttat ttgtgaaatt tgtgatgcta ttgctttatt 4380 tgtaaccatt ataagctgca ataaacaagt taacaacaac aattgcattc attttatgtt 4440 tcaggttcag ggggagatgt gggaggtttt ttaaagcaag taaaacctct acaaatgtgg 4500 taaaatcgat aggatcttc ctagagcatg gctacatctg cagaattcag gctagctcac 4560 tgcttacaaa acccccttgc ttgagagtgt ggcactctcc cccctgtcgc gttcgctcgc 4620 tcgctggctc gtttgggggg gcgacggcca gagggccgtc gtctggcagc tctttgagct 4680 gccacccccc caaacgagcc agcgagcgag cgaacgcgac aggggggaga g 4731 <210> 69 <211> 4420 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 69 ctctcccccc tgtcgcgttc gctcgctcgc tggctcgttt gggggggtgg cagctcaaag 60 agctgccaga cgacggccct ctggccgtcg cccccccaaa cgagccagcg agcgagcgaa 120 cgcgacaggg gggagagtgc cacactctca agcaaggggg ttttgtaagc agtgagctag 180 cgtgccacct ggtcgacatt gattattgac tagttattaa tagtaatcaa ttacggggtc 240 attagttcat agcccatata tggagttccg cgttacataa cttacggtaa atggcccgcc 300 tggctgaccg cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt 360 aacgccaata gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca 420 cttggcagta catcaagtgt atcatatgcc 
aagtacgccc cctattgacg tcaatgacgg 480 taaatggccc gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca 540 gtacatctac gtattagtca tcgctattac catggtcgag gtgagcccca cgttctgctt 600 cactctcccc atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt 660 attttgtgca gcgatggggg cggggcgggg cgaggcggag aggtgcggcg gcagccaatc 720 ggagcggcgc gctccgaaag tttcctttta tggcgaggcg gcggcggcgg cggctctata 780 aaaagcgaag cgcgcggcgg gcggctgcag aagttggtcg tgaggcactg ggcaggtaag 840 tatcaaggtt acaagacagg tttaaggaga ccaatagaaa ctgggcttgt cgagacagag 900 aagactcttg cgtttctgat aggcacctat tggtcttact gacatccact ttgcctttct 960 ctccacaggt gtccaggcgg ccgccatggt gattcttcag cagggggacc atgtgtggat 1020 ggacctgaga ttggggcagg agttcgacgt gcccatcggg gcggtggtga agctctgcga 1080 ctctgggcag gtccaggtgg tggatgatga agacaatgaa cactggatct ctccgcagaa 1140 cgcaacgcac atcaagccta tgcaccccac gtcggtccac ggcgtggagg acatgatccg 1200 cctgggggac ctcaacgagg cgggcatctt gcgcaacctg cttatccgct accgggacca 1260 cctcatctac acgtatacgg gctccatcct ggtggctgtg aacccctacc agctgctctc 1320 catctactcg ccagagcaca tccgccagta taccaacaag aagattgggg agatgccccc 1380 ccacatcttt gccattgctg acaactgcta cttcaacatg aaacgcaaca gccgagacca 1440 gtgctgcatc atcagtgggg aatctggggc cgggaagacg gagagcacaa agctgatcct 1500 gcagttcctg gcagccatca gtgggcagca ctcgtggatt gagcagcagg tcttggaggc 1560 cacccccatt ctggaagcat ttgggaatgc caagaccatc cgcaatgaca actcaagccg 1620 tttcggaaag tacatcgaca tccacttcaa caagcggggc gccatcgagg gcgcgaagat 1680 tgagcagtac ctgctggaaa agtcacgtgt ctgtcgccag gccctggatg aaaggaacta 1740 ccacgtgttc tactgcatgc tggagggcat gagtgaggat cagaagaaga agctgggctt 1800 gggccaggcc tctgactaca actacttggc catgggtaac tgcataacct gtgagggccg 1860 ggtggacagc caggagtacg ccaacatccg ctccgccatg aaggtgctca tgttcactga 1920 caccgagaac tgggagatct cgaagctcct ggctgccatc ctgcacctgg gcaacctgca 1980 gtatgaggca cgcacatttg aaaacctgga tgcctgtgag gttctcttct ccccatcgct 2040 ggccacagct gcatccctgc ttgaggtgaa ccccccagac ctgatgagct gcctgactag 2100 ccgcaccctc atcacccgcg gggagacggt gtccacccca 
ctgagcaggg aacaggcact 2160 ggacgtgcgc gacgccttcg taaaggggat ctacgggcgg ctgttcgtgt ggattgtgga 2220 caagatcaac gcagcaattt acaagcctcc ctcccaggat gtgaagaact ctcgcaggtc 2280 catcggcctc ctggacatct ttgggtttga gaactttgct gtgaacagct ttgagcagct 2340 ctgcatcaac ttcgccaatg agcacctgca gcagttcttt gtgcggcacg tgttcaagct 2400 ggagcaggag gaatatgacc tggagagcat tgactggctg cacatcgagt tcactgacaa 2460 ccaggatgcc ctggacatga ttgccaacaa gcccatgaac atcatctccc tcatcgatga 2520 ggagagcaag ttccccaagg gcacagacac caccatgtta cacaagctga actcccagca 2580 caagctcaac gccaactaca tcccccccaa gaacaaccat gagacccagt ttggcatcaa 2640 ccattttgca ggcatcgtct actatgagac ccaaggcttc ctggagaaga accgagacac 2700 cctgcatggg gacattatcc agctggtcca ctcctccagg aacaagttca tcaagcagat 2760 cttccaggcc gatgtcgcca tgggcgccga gaccaggaag cgctcgccca cacttagcag 2820 ccagttcaag cggtcactgg agctgctgat gcgcacgctg ggtgcctgcc agcccttctt 2880 tgtgcgatgc atcaagccca atgagttcaa gaagcccatg ctgttcgacc ggcacctgtg 2940 cgtgcgccag ctgcggtact caggaatgat ggagaccatc cgaatccgcc gagctggcta 3000 ccccatccgc tacagcttcg tagagtttgt ggagcggtac cgtgtgctgc tgccaggtgt 3060 gaagccggcc tacaagcagg gcgacctccg cgggacttgc cagcgcatgg ctgaggctgt 3120 gctgggcacc cacgatgact ggcagatagg caaaaccaag atctttctga aggaccacca 3180 tgacatgctg ctggaagtgg agcgggacaa agccatcacc gacagagtca tcctccttca 3240 gaaagtcatc cggggattca aagacaggtc taactttctg aagctgaaga acgctgccac 3300 actgatccag aggcactggc ggggtcacaa ctgtaggaag aactacgggc tgatgcgtct 3360 gggcttcctg cggctgcagg ccctgcaccg ctcccggaag ctgcaccagc agtaccgcct 3420 ggcccgccag cgcatcatcc agttccaggc ccgctgccgc gcctatctgg tgcgcaaggc 3480 cttccgccac cgcctctggg ctgtgctcac cgtgcaggcc tatgcccggg gcatgatcgc 3540 ccgcaggctg caccaacgcc tcagggctga gtatctgtgg cgcctcgagg ctgagaaaat 3600 gcggctggcg gaggaagaga agcttcggaa ggagatgagc gccaagaagg ccaaggagga 3660 ggccgagcgc aagcatcagg agcgcctggc ccagctggct cgtgaggacg ctgagcggga 3720 gctgaaggag aaggaggccg ctcggcggaa gaaggagctc ctggagcaga tggaaagggc 3780 ccgccatgag cctgtcaatc actcagacat ggtggacaag atgtttggct 
tcctggggac 3840 ttcaggtggc ctgccaggcc aggagggcca ggcacctagt ggctttgagg acctggagcg 3900 agggcggagg gagatggtgg aggaggacct ggatgcagcc ctgcccctgc ctgacgagga 3960 tgaggaggac ctctctgagt ataaatttgc caagttcgcg gccacctact tccaggggac 4020 aactacgcac tcctacaccc ggcggccact caaacagcca ctgctctacc atgacgacga 4080 gggtgaccag ctggtaagta tcaaggttac aagacaggtt taaggagacc aatagaaact 4140 gggcttgtcg agacagagaa gactcttgcg tttctgggat ttttccgatt tcggcctatt 4200 ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgt 4260 ttataatttc aggtggcatc tttccaattg aggaacccct agtgatggag ttggccactc 4320 cctctctgcg cgctcgctcg ctcactgagg ccgggcgacc aaaggtcgcc cgacgcccgg 4380 gctttgcccg ggcggcctca gtgagcgagc gagcgcgcag 4420 <210> 70 <211> 4367 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 70 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccggga ttttgccgat ttcggcctat tggttaaaaa atgagctgat 180 ttaacaaaaa tttaacgcga attttaacaa aatattaacg tttataattt caggtggcat 240 ctttcgatag gcacctattg gtcttactga catccacttt gcctttctct ccacaggcag 300 ccctggcggt ctggatcacc atcctccgct tcatggggga cctccctgag cccaagtacc 360 acacagccat gagtgatggc agtgagaaga tccctgtgat gaccaagatt tatgagaccc 420 tgggcaagaa gacgtacaag agggagctgc aggccctgca gggcgagggc gaggcccagc 480 tccccgaggg ccagaagaag agcagtgtga ggcacaagct ggtgcatttg actctgaaaa 540 agaagtccaa gctcacagag gaggtgacca agaggctgca tgacggggag tccacagtgc 600 agggcaacag catgctggag gaccggccca cctccaacct ggagaagctg cacttcatca 660 tcggcaatgg catcctgcgg ccagcactcc gggacgagat ctactgccag atcagcaagc 720 agctgaccca caacccctcc aagagcagct atgcccgggg ctggattctc gtgtctctct 780 gcgtgggctg tttcgccccc tccgagaagt ttgtcaagta cctgcggaac ttcatccacg 840 ggggcccgcc cggctacgcc ccgtactgtg aggagcgcct gagaaggacc tttgtcaatg 900 ggacacggac acagccgccc agctggctgg agctgcaggc caccaagtcc aagaagccaa 960 tcatgttgcc cgtgacattc atggatggga ccaccaagac cctgctgacg gactcggcaa 1020 ccacggccaa 
ggagctctgc aacgcgctgg ccgacaagat ctctctcaag gaccggttcg 1080 ggttctccct ctacattgcc ctgtttgaca aggtgtcctc cctgggcagc ggcagtgacc 1140 acgtcatgga cgccatctcc cagtgcgagc agtacgccaa ggagcagggc gcccaggagc 1200 gcaacgcccc ctggaggctc ttcttccgca aagaggtctt cacgccctgg cacagcccct 1260 ccgaggacaa cgtggccacc aacctcatct accagcaggt ggtgcgagga gtcaagttttg 1320 gggagtacag gtgtgagaag gaggacgacc tggctgagct ggcctcccag cagtactttg 1380 tagactatgg ctctgagatg atcctggagc gcctcctgaa cctcgtgcc acctcatcc 1440 ccgaccgcga gatcacgccc ctgaagacgc tggagaagtg ggcccagctg gccatcgccg 1500 cccacaagaa ggggatttat gcccagagga gaactgatgc ccagaaggtc aaagaggatg 1560 tggtcagtta tgcccgcttc aagtggccct tgctcttctc caggttttat gaagcctaca 1620 aattctcagg ccccagtctc cccaagaacg acgtcatcgt ggccgtcaac tggacgggtg 1680 tgtactttgt ggatgagcag gagcaggtac ttctggagct gtccttccca gagatcatgg 1740 ccgtgtccag cagcagggag tgccgtgtct ggctctcact gggctgctct gatcttggct 1800 gtgctgcgcc tcactcaggc tgggcaggac tgaccccggc ggggccctgt tctccgtgtt 1860 ggtcctgcag gggagcgaaa acgacggccc ccagcttcac gctggccacc atcaaggggg 1920 acgaatacac cttcacctcc agtaatgctg aggacattcg tgacctggtg gtcaccttcc 1980 tagaggggct ccggaagaga tctaagtatg ttgtggccct gcaggataac cccaaccccg 2040 caggcgagga gtcaggcttc ctcagctttg ccaagggaga cctcatcatc ctggaccatg 2100 acacgggcga gcaggtcatg aactcgggct gggccaacgg catcaatgag aggaccaagc 2160 agcgtgggga cttccccacc gactgtgtgt acgtcatgcc cactgtcacc atgccacctc 2220 gtgagattgt ggccctggtc accatgactc ccgatcagag gcaggacgtt gtccggctct 2280 tgcagctgcg aacggcggag cccgaggtgc gtgccaagcc ctacacgctg gaggagtttt 2340 cctatgacta cttcaggccc ccacccaagc acacgctgag ccgtgtcatg gtgtccaagg cccgaggcaa ggaccggctg tggagccaca cgcgggaacc gctcaagcag gcgctgctca agagctcct gggcagtgag gagctctcgc aggaggcctg cctggccttc attgctgtgc 2520 tcaagtacat gggcgactac ccgtccaaga ggacacgctc cgtcaatgag ctcaccgacc agatctttga gggtcccctg aaagccgagc ccctgaagga cgaggcatat gtgcagatcc 2640
tgaagcagct gaccgacaac cacatcaggt acagcgagga gcggggttgg gagctgctct ggctgtgcac gggccttttc ccacccagca acatcctcct gccccacgtg cagcgcttcc 2760 tgcagtcccg aaagcactgc ccactcgcca tcgactgcct gcaacggctc cagaaagccc tgagaaacgg gtcccggaag taccctccgc acctggtgga ggtggaggcc atccagcaca agaccaccca gattttccac aaggtctact tccctgatga cactgacgag gccttcgaag tggagtccag caccaaggcc aaggacttct gccagacat cgccaccagg ctgctcctca agtcctcaga gggattcagc ctctttgtca aaattgcaga caaggtcatc agcgttcctg agaatgactt cttctttgac tttgttcgac acttgacaga ctggataaag aaagctcggc 3120 ccatcaagga cggaattgtg ccctcactca cctaccaggt gttcttcatg aagaagctgt 3180 ggaccaccac ggtgccaggg aaggatccca tggccgattc catcttccac tattaccagg 3240 agttgcccaa gtatctccga ggctaccaca agtgcacgcg ggaggaggtg ctgcagctgg 3300 gggcgctgat ctacagggtc aagttcgagg aggacaagtc ctacttcccc agcatcccca 3360 agctgctgcg ggagctggtg ccccaggacc ttatccggca ggtctcacct gatgactgga 3420 agcggtccat cgtcgcctac ttcaacaagc acgcagggaa gtccaaggag gaggccaagc 3480 tggccttcct gaagctcatc ttcaagtggc ccacctttgg ctcagccttc ttcgaggtga 3540 agcaaactac ggagccaaac ttccctgaga tcctcctaat tgccatcaac aagtatgggg 3600 tcagcctcat cgatcccaaa acgaaggata tcctcaccac tcatcccttc accaagatct 3660 ccaactggag cagcggcaac acctacttcc acatcaccat tgggaacttg gtgcgcggga 3720 gcaaactgct ctgcgagacg tcactgggct acaagatgga tgacctcctg acttcctaca 3780 ttagccagat gctcacagcc atgagcaaac agcggggctc caggagcggc aagatgtatg 3840 atgttcctga ttatgctagc ctctgaccgc ggcctgctgc cggctctgcg gcctcttccg 3900 cgtcttcgag atctgcctcg actgtgcctt ctagttgcca gccatctgtt gtttgcccct 3960 cccccgtgcc ttccttgacc ctggaaggtg ccactcccac tgtcctttcc taataaaatg 4020 aggaaattgc atcgcattgt ctgagtaggt gtcattctat tctggggggt ggggtggggc 4080 aggacagcaa gggggaggat tgggaagaca atagcaggca tgctggggac tcgagcaatt 4140 cccgataagg atcttcctag agcatggcta catctgcaga attcaggcta gctcactgct 4200 tacaaaaccc ccttgcttga gagtgtggca ctctcccccc tgtcgcgttc gctcgctcgc 4260 tggctcgttt gggggggcga cggccagagg gccgtcgtct ggcagctctt tgagctgcca 4320 cccccccaaa cgagccagcg agcgagcgaa 
cgcgacaggg gggagag 4367 <210> 71 <211> 4738 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 71 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc cgggcgtcg ggcgacctt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca tatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagttat tatagtaat 360 caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 420 taaatggcc gcctggctga ccgcccaacg acccccgcc attgacgtca ataatgacgt 480 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 540 ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg 600 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact 660 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 720 ggcagtcac caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 780 ccattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 840 gtaataaccc cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 900 taagcagagc tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc 960 acagttaaat tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca 1020 gaagttggtc gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag 1080 1140 ttggtcttac tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt 1200 acagctctta aggctagagt acttaatacg actcactata ggctagcctc gagaattcac 1260 gcgtggtacc tctagagtcg acccgggcgg ccgccatggg cttcgtgaga cagatacagc 1320 ttttgctctg gaagaactgg accctgcgga aaaggcaaaa gattcgcttt gtggtggaac 1380 tcgtgtggcc tttatcttta tttctggtct tgatctggtt aaaggaatgcc aacccgctct 1440 acagccatca tgaatgccat ttccccaaca aggcgatgcc ctcagcagga atgctgccgt 1500 ggctccaggg gatcttctgc aatgtgaaca atccctgttt tcaaagcccc accccaggag 1560 aatctcctgg aattgtgtca aactataaca actccatctt ggcaagggta tatcgagatt 1620 ttcaagaact cctcatgaat gcaccagaga gccagcacct tggccgtatt 
tggacagagc 1680 tacacatctt gtcccaattc atggacaccc tccggactca cccggagaga attgcaggaa 1740 gaggaattcg aataagggat atcttgaaag atgaagaaac actgacacta tttctcatta 1800 aaaacatcgg cctgtctgac tcagtggtct accttctgat caactctcaa gtccgtccag 1860 agcagttcgc tcatggagtc ccggacctgg cgctgaagga catcgcctgc agcgaggccc 1920 tcctggagcg cttcatcatc ttcagccaga gacgcggggc aaagacggtg cgctatgccc 1980 tgtgctccct ctcccagggc accctacagt ggatagaaga cactctgtat gccaacgtgg 2040 acttcttcaa gctcttccgt gtgcttccca cactcctaga cagccgttct caaggtatca 2100 atctgagatc ttggggagga atattatctg atatgtcacc aagaattcaa gagtttatcc 2160 atcggccgag tatgcaggac ttgctgtggg tgaccaggcc cctcatgcag aatggtggtc 2220 cagagacctt tacaaagctg atgggcatcc tgtctgacct cctgtgtggc taccccgagg 2280 gaggtggctc tcgggtgctc tccttcaact ggtatgaaga caataactat aaggcctttc 2340 2400 tttgtaatgc attgatccag agcctggagt caaatccttt aaccaaaatc gcttggaggg 2460 cggcaaagcc tttgctgatg ggaaaaatcc tgtacactcc tgattcacct gcagcacgaa 2520 ggaatgaa gaatgccaac tcaacttttg aagaactgga acacgttagg aagttggtca 2580 aagcctggga agagtggg ccccagatct ggtacttctt tgacaacagc acacagatga 2640 acatgatcag agataccctg gggaacccaa cagtaaaaga cttttgaat aggcagcttg 2700 gtgaagaagg tattactgct gaagccatcc taaacttcct ctacaagggc cctcgggaaa 2760 gccaggctga cgacatggcc aacttcgact ggagggacat atttaacatc actgatcgca 2820 ccctccgcct tgtcaatcaa tacctggagt gcttggtcct ggataagttt gaaagctaca 2880 atgatgaaac tcagctcacc caacgtgccc tctctctact ggaggaaaac atgttctggg 2940 ccggagtggt attccctgac atgtatccct ggaccagctc tctaccaccc cacgtgaagt 3000 ataagatccg aatggacata gacgtggtgg agaaaaccaa taagattaaa gacaggtatt 3060 gggactacaa agaccatgac ggtgattata aagatcatga catcgactac aaggatgacg 3120 atgacaagga ttctggtccc agagctgatc ccgtggaaga tttccggtac atctggggcg 3180 ggtttgccta tctgcaggac atggttgaac aggggatcac aagggagccag gtgcaggcgg 3240 aggctccagt tggaatctac ctccagcaga tgccctaccc ctgcttcgtg gacgattctt 3300 tcatgatcat cctgaaccgc tgtttcccta tcttcatggt gctggcatgg atctactctg 3360 tctccatgac tgtgaagagc atcgtcttgg agaaggagtt gcgactgaag gagaccttga 
3420 aaaatcagg tgtctccaat gcagtgattt ggtgtacctg gttcctggac agcttctcca 3480 tcatgtcgat gagcatcttc ctcctgacga tattcatcat gcatggaaga atcctacatt 3540 acagcgaccc attcatcctc ttcctgttct tgttggcttt ctccactgcc accatcatgc 3600 tgtgctttct gctcagcacc ttcttctcca aggccagtct ggcagcagcc tgtagtggtg 3660 tcatctattt caccctctac ctgccacaca tcctgtgctt cgcctggcag gaccgcatga 3720 ccgctgagct gaaaggct gtgagcttac tgtctccggt ggcatttgga tttggcactg 3780 agtacctggt tcgctttgaa gagcaaggcc tggggctgca gtggagcaac atcgggaaca 3840 gtcccacgga aggggacgaa ttcagcttcc tgctgtccat gcagatgatg ctccttgatg 3900 ctgctgtcta tggcttactc gcttggtacc ttgatcaggt gtttccagga gactatggaa 3960 ccccacttcc ttggtacttt cttctacaag agtcgtattg gcttggcggt gaagggtgtt 4020 caaccagaga agaaagagcc ctggaaaaga ccgagcccct aacagagaa acggaggatc 4080 4140 ctggggtatg cgtgaagaat ctggtaaaga tttttgagcc ctgtggccgg ccagctgtgg 4200 accgtctgaa catcaccttc tacgagaacc agatcaccgc attcctgggc caaatggag 4260 ctgggaaaac caccaccttg tagtatcaa ggttacaaga caggtttaag gagaccaata 4320 gaaactgggc ttgtcgagac agagaagact cttgcgtttc tgggattttt ccgatttcgg 4380 cctattggtt aaaaaatgag ctgatttaac aaaaatttaa cgcgaatttt aacaaaatat 4440 taacgtttat aatttcaggt ggcatctttc caattcgccc ttagatctag cctatcctgg 4500 attacttgaa cgatagccta tcctggatta cttgaaaagc ttagcctatc ctggattact 4560 tgaatcacag cctatcctgg attacttgaa agatctaagg gcgaattgag gaacccctag 4620 tgatggagtt ggccactccc tctctgcgcg ctcgctcgct cactgaggcc gggcgaccaa 4680 [[ID=##ID=12]]aggtcgcccg acgcccgggc tttgcccggg cggcctcagt gagcgagcga gcgcgcag 4738 <210> 72 <211> 4770 [[ID=1##ID=18]]<212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 72 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 It should be noted that there may be some errors in the original text. For example, in the translation of "人工序列", it is translated as "Artificial Sequence" here. 
You can adjust it according to the actual situation. And in the original text, there may be some incorrect "##ID=" and "##ID=1" in the tags which are just presented as they are in the translation for the purpose of following the rules.aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca tatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagttat tatagtaat 360 caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 420 taaatggcc gcctggctga ccgcccaacg acccccgcc attgacgtca ataatgacgt 480 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 540 ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg 600 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact 660 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 720 ggcagtacac caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 780 caattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 840 gtaataaccc cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 900 taagcagagc tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc 960 acagttaaat tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca 1020 gaagttggtc gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag 1080 1140 ttggtcttac tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt 1200 acagctctta aggctagagt acttaatacg actcactata ggctagcctc gagaattcac 1260 gcgtggtacc tctagagtcg acccgggcgg ccgccatggg cttcgtgaga cagatacagc 1320 ttttgctctg gaagaactgg accctgcgga aaaggcaaaa gattcgcttt gtggtggaac 1380 tcgtgtggcc tttatcttta tttctggtct tgatctggtt aaaggaatgcc aacccgctct 1440 acagccatca tgaatgccat ttccccaaca aggcgatgcc ctcagcagga atgctgccgt 1500 ggctccaggg gatcttctgc aatgtgaaca atccctgttt tcaaagcccc acccccaggag 1560 aatctcctgg aattgtgtca aactataaca actccatctt ggcaagggta tatcgagatt 1620 ttcaagaact cctcatgaat gcaccagaga gccagcacct tggccgtatt tggacagagc 1680 tacacatctt gtcccaattc atggacaccc tccggactca cccggagaga attgcaggaa 1740 
gaggaattcg aataagggat atcttgaaag atgaagaaac actgacacta tttctcatta 1800 aaaacatcgg cctgtctgac tcagtggtct accttctgat caactctcaa gtccgtccag 1860 agcagttcgc tcatggagtc ccggacctgg cgctgaagga catcgcctgc agcgaggccc 1920 tcctggagcg cttcatcatc ttcagccaga gacgcggggc aaagacggtg cgctatgccc 1980 tgtgctccct ctcccagggc accctacagt ggatagaaga cactctgtat gccaacgtgg 2040 acttcttcaa gctcttccgt gtgcttccca cactcctaga cagccgttct caaggtatca 2100 atctgagatc ttggggagga atattatctg atatgtcacc aagaattcaa gagtttatcc 2160 atcggccgag tatgcaggac ttgctgtggg tgaccaggcc cctcatgcag aatggtggtc 2220 cagagacctt tacaaagctg atgggcatcc tgtctgacct cctgtgtggc taccccgagg 2280 gaggtggctc tcgggtgctc tccttcaact ggtatgaaga caataactat aaggcctttc 2340 tggggattga ctccacaagg aaggatccta tctattctta tgacagaga acacatcct 2400 ttgtaatgc attgatccag agcctggagt caatcctttt aaccaaaatc gcttggagg 2460 cggcaaagcc ttgctgatg ggaaaaatcc tgtacactcc tgatcacct gcagcacgaa 2520 ggatactgaa gatgccaac tcaacttg aagaactgga acacgttagg aagttggtca 2580 aagcctggga agaagtagggg cccagatct ggtactctt tgacacagc accagatga 2640 acatgatcag agataccctg gggaacccaa cagtaaaga cttttgaat aggcagcttg 2700 gtgaagaagg tattactgct gaagccatcc taaacttcct ctacaagggc cctcgggaaa 2760 gccaggctga cgacatggcc aacttcgact gggaggacat atttacatc actgatcgca 2820 cccctccgcct tgtcaatca tacctggagt gcttggtcct ggataagttt gaagctaca 2880 atgatgaaac tcagctcacc caacgtgccc tctctctact ggaggaaac atgttctggg 2940 ccggagtggt atccctgac atgtatccct ggaccagctc tctaccaccc cacgtgaagt 3000 aatagatccg atggacata gacgtggtgg agaaaaccaa agattaa gaggaggtatt 3060 gggactacaa agaccatgac ggtgattata aagatcatga catcgactac areatgacg 3120 atgacaagga ttctggtccc agagctgaatc ccgtggaaga tttccggtac atctggggcg 3180 ggtttgccta tctgcaggac atggttgaac aggggatcac aagggccag gtgcaggcgg 3240 aggctccagt tggaatctac ctccagcaga tgccctaccc ctgcttcgtg gacgattctt 3300 tcatgatcat cctgaaccgc tgtttcccta tcttcatggt gctggcatgg atctactctg 3360 tctccatgac tgtgaagagc atcgtcttgg agaaggagtt gcgactgaag gagaccttga 3420 aaaatcaggg tgtctccaat gcagtgattt 
ggtgtacctg gttcctggac agcttctcca 3480 tcatgtcgat gagcatcttc ctcctgacga tattcatcat gcatggaga atcctacatt 3540 acagcgaccc attcatcctc ttcctgttct tgttggcttt ctccactgcc accatcatgc 3600 tgtgctttct gctcagcacc ttcttctcca aggccagtct ggcagcagcc tgtagtggtg 3660 tcatctattt caccctctac ctgccacaca tcctgtgctt cgcctggcag gaccgcatga 3720 ccgctgagct gaaaggct gtgagcttac tgtctccggt ggcatttgga tttggcactg 3780 agtacctggt tcgctttgaa gagcaaggcc tggggctgca gtggagcaac atcgggaaca 3840 gtcccacgga aggggacgaa ttcagcttcc tgctgtccat gcagatgatg ctccttgatg 3900 ctgctgtcta tggcttactc gcttggtacc ttgatcaggt gtttccagga gactatggaa 3960 ccccacttcc ttggtacttt cttctacaag agtcgtattg gcttggcggt gaagggtgtt 4020 caaccagaga agaaagagcc ctggaaaaga ccgagcccct aacagagaa acggaggatc 4080 4140 ctggggtatg cgtgaagaat ctggtaaaga tttttgagcc ctgtggccgg ccagctgtgg 4200 accgtctgaa catcaccttc tacgagaacc agatcaccgc attcctgggc caaatggag 4260 ctgggaaaac caccaccttg tagtatcaa ggttacaaga caggtttaag gagaccaata 4320 gaaactgggc ttgtcgagac agaaagact cttgcgtttc tgggattttt ccgatttcgg 4380 cctattggtt aaaaaatgag ctgatttaac aaaaatttaa cgcgaatttt aacaaaatat 4440 taacgtttat aatttcaggt ggcatctttc caattgaggc ataggatgac aagggaacg 4500 ataggcatag gatgacaaag ggaaaagctt aggcatagga tgacaaaggg aaggtaccag 4560 atctggcatt caccgcgtgc cttacgatgg cattcaccgc gtgccttaaa gcttggcatt 4620 caccgcgtgc cttacaattg aggaacccct agtgatggag ttggccactc cctctctgcg 4680 cgctcgctcg ctcactgagg ccgggcgacc aaaggtcgcc cgacgcccgg gctttgcccg 4740 ggcggcctca gtgagcgagc gagcgcgcag 4770 <210> 73 <211> 4656 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 73 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct 180 aggaagatct tcaatattgg ccattagcca tattattcat tggttatata gcataaatca 240 atattggcta ttggccattg catacgttgt atctatatca taatatgtac atttatattg 300 gctcatgtcc aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat 360 
caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 420 taaatggccc gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt 480 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 540 ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg 600 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact 660 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 720 ggcagtacac caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 780 ccattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 840 gtaataaccc cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 900 taagcagagc tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc 960 acagttaaat tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca 1020 gaagttggtc gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag 1080 1140 ttggtcttac tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt 1200 acagctctta aggctagagt acttaatacg actcactata ggctagcctc gagaattcac 1260 gcgtggtacc tctagagtcg acccgggcgg ccgccatggg cttcgtgaga cagatacagc 1320 ttttgctctg gaagaactgg accctgcgga aaaggcaaaa gattcgcttt gtggtggaac 1380 tcgtgtggcc tttatcttta tttctggtct tgatctggtt aaaggaatgcc aacccgctct 1440 acagccatca tgaatgccat ttccccaaca aggcgatgcc ctcagcagga atgctgccgt 1500 ggctccaggg gatcttctgc aatgtgaaca atccctgttt tcaaagcccc acccccaggag 1560 aatctcctgg aattgtgtca aactataaca actccatctt ggcaagggta tatcgagatt 1620 ttcaagaact cctcatgaat gcaccagaga gccagcacct tggccgtatt tggacagagc 1680 tacacatctt gtcccaattc atggacaccc tccggactca cccggagaga attgcaggaa 1740 gaggaattcg aataagggat atcttgaaag atgaagaaac actgacacta tttctcatta 1800 aaaacatcgg cctgtctgac tcagtggtct accttctgat caactctcaa gtccgtccag 1860 agcagttcgc tcatggagtc ccggacctgg cgctgaagga catcgcctgc agcgaggccc 1920 tcctggagcg cttcatcatc ttcagccaga gacgcggggc aaagacggtg cgctatgccc 1980 tgtgctccct ctcccagggc accctacagt ggatagaaga cactctgtat gccaacgtgg 2040 acttcttcaa gctcttccgt gtgcttccca cactcctaga cagccgttct caaggtatca 2100 atctgagatc 
ttggggagga atattatctg atatgtcacc aagaattcaa gagtttatcc 2160 atcggccgag tatgcaggac ttgctgtggg tgaccaggcc cctcatgcag aatggtggtc 2220 cagagacctt tacaaagctg atgggcatcc tgtctgacct cctgtgtggc taccccgagg 2280 gaggtggctc tcgggtgctc tccttcaact ggtatgaaga caataactat aaggcctttc 2340 tggggattga ctccacaagg aaggatccta tctattctta tgacagaaga acaacatcct 2400 tttgtaatgc attgatccag agcctggagt caaatccttt aaccaaaatc gcttggaggg 2460 cggcaaagcc ttgctgatg ggaaaaatcc tgtacactcc tgatcacct gcagcacgaa 2520 ggatactgaa gatgccaac tcaacttg aagaactgga acacgttagg aagttggtca 2580 aagcctggga agaagtagggg cccagatct ggtactctt tgacacagc accagatga 2640 acatgatcag agataccctg gggaacccaa cagtaaaga cttttgaat aggcagcttg 2700 gtgaagaagg tattactgct gaagccatcc taaacttcct ctacaagggc cctcgggaaa 2760 gccaggctga cgacatggcc aacttcgact gggaggacat atttacatc actgatcgca 2820 cccctccgcct tgtcaatca tacctggagt gcttggtcct ggataagttt gaagctaca 2880 atgatgaaac tcagctcacc caacgtgccc tctctctact ggaggaaac atgttctggg 2940 ccggagtggt atccctgac atgtatccct ggaccagctc tctaccaccc cacgtgaagt 3000 aatagatccg atggacata gacgtggtgg agaaaaccaa agattaa gaggaggtatt 3060 gggactacaa agaccatgac ggtgattata aagatcatga catcgactac aaggatgacg 3120 atgacaagga ttctggtccc agagctgatc ccgtggaaga ttccggtac atctggggcg 3180 ggtttgccta tctgcaggac atggttgaac aggggatcac aagggccag gtgcaggcgg 3240 aggctccagt tggaatctac ctccagcaga tgccctaccc ctgcttcgtg gacgattctt 3300 tcatgatcat cctgaaccgc tgtttcccta tcttcatggt gctggcatgg atctactctg 3360 tctccatgac tgtgaagagc atcgtcttgg agaaggagtt gcgactgaag gagaccttga 3420 aaaatcaggg tgtctccaat gcagtgattt ggtgtacctg gttcctggac agcttctcca 3480 tcatgtcgat gagcatcttc ctcctgacga tattcatcat gcatggaga atcctacatt 3540 acagcgaccc attcatcctc ttcctgttct tgttggcttt ctccactgcc accatcatgc 3600 tgtgctttct gctcagcacc ttcttctcca aggccagtct ggcagcagcc tgtagtggtg 3660 tcatctattt caccctctac ctgccacaca tcctgtgctt cgcctggcag gaccgcatga 3720 ccgctgagct gaaaggct gtgagcttac tgtctccggt ggcatttgga tttggcactg 3780 agtacctggt tcgctttgaa gagcaaggcc tggggctgca 
gtggagcaac atcgggaaca 3840 gtcccacgga aggggacgaa ttcagcttcc tgctgtccat gcagatgatg ctccttgatg 3900 ctgctgtcta tggcttactc gcttggtacc ttgatcaggt gtttccagga gactatggaa 3960 ccccacttcc ttggtacttt cttctacaag agtcgtattg gcttggcggt gaagggtgtt 4020 caaccagaga agaaagagcc ctggaaaaga ccgagcccct aacagagaa acggaggatc 4080 4140 ctggggtatg cgtgaagaat ctggtaaaga tttttgagcc ctgtggccgg ccagctgtgg 4200 accgtctgaa catcaccttc tacgagaacc agatcaccgc attcctgggc caaatggag 4260 ctgggaaaac caccaccttg tagtatcaa ggttacaaga caggtttaag gagaccaata 4320 gaaactgggc ttgtcgagac agaaagact cttgcgtttc tgggattttt ccgatttcgg 4380 cctattggtt aaaaaatgag ctgatttaac aaaaatttaa cgcgaatttt aacaaaatat 4440 taacgtttat aatttcaggt ggcatctttc ccgcctgcaa gaactggttc agcagcctga 4500 gccacttcgt gatccacctg caattgagga acccctagtg atggagttgg ccactccctc 4560 tctgcgcgct cgctcgctca ctgaggccgg gcgaccaaag gtcgccccgac gcccgggctt 4620 tgcccgggcg gcctcagtga gcgagcgagc gcgcag 4656 <210> 74 <211> 4719 <212> DNA <213> Artificial Sequence <220> <223> Synthetic <400> 74 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccggga tttttccgat ttcggcctat tggttaaaaa atgagctgat 180 ttaacaaaaa tttaacgcga attttaacaa aatattaacg tttataattt caggtggcat 240 ctttcaagct ttgaatgaat gagataggca cctattggtc ttactgacat ccactttgcc 300 tttctctcca caggtccatc ctgacgggtc tgttgccacc aacctctggg actgtgctcg 360 ttgggggaag ggacattgaa accagcctgg atgcagtccg gcagagcctt ggcatgtgtc 420 cacagcacaa catcctgttc caccacctca cggtggctga gcacatgctg ttctatgccc 480 agctgaaagg aaagtcccag gaggaggccc agctggagat ggaagccatg ttggaggaca 540 caggcctcca ccacaagcgg aatgaagagg ctcaggacct atcaggtggc atgcagagaa agctgtcggt tgccattgcc tttgtgggag atgccaaggt ggtgattctg gacgaaccca 660 cctctggggt ggacccttac tcgagacgct caatctggga tctgctcctg aagtatcgct 720 caggcagaac catcatcatg tccactcacc acatggacga ggccgacctc cttggggacc gcattgccat cattgcccag ggaaggctct actgctcagg caccccactc ttcctgaaga 840 actgctttgg 
cacaggcttg tacttaacct tggtgcgcaa gatgaaaaac atccagagcc aaaggaaagg cagtgagggg acctgcagct gctcgtctaa gggtttctcc accacgtgtc cagcccacgt cgatgaccta actccagaac aagtcctgga tggggatgta aatgagctga tggatgtagt tctccaccat gttccagagg caaagctggt ggagtgcatt ggtcaagaac ttatcttcct tcttccaaat aagaacttca agcacagagc atatgccagc cttttcagag agctgagga gacgctggct gaccttggtc tcagcagttt tggaattct gacactcccc tggaagagat ttttctgaag gtcacggagg attctgattc aggacctctg tttgcgggtg gcgctcagca gaaagagaa aacgtcaacc cccgacaccc ctgcttgggt cccagagaga 1320 aggctggaca gacaccccag gactccaatg tctgctcccc aggggcgccg gctgctcacc 1380 cagaggcca gcctccccca gagccagagt gcccaggccc gcagctcaac acggggacac 1440 agctggtcct ccagcatgtg caggcgctgc tggtcaagag attccaacac accatccgca 1500 gccacaagga cttcctggcg cagatcgtgc tcccggctac ctttgtgttt ttggctctga 1560 tgctttctat tgttatccct ccttttggcg aataccccgc tttgaccctt cacccctgga 1620 tatatgggca gcagtacacc ttcttcagca tggatgaacc aggcagtgag cagttcacgg 1680 tacttgcaga cgtcctcctg aatagccag gctttggcaa ccgctgcctg aaagagggt 1740 ggcttccgga gtacccctgt ggcaactcaa caccctggaa gactccttct gtgtccccaa 1800 acatcaccca gctgttccag aagcagaaat ggacacaggt caacccttca ccatcctgca 1860 ggtgcagcac cagggagaag ctcaccatgc tgccagagtg ccccgagggt gccggggggcc 1920 tcccgcccc ccagagaaca cagcgcagca cggaaattct acaagacctg acggacagga 1980 acatctccga cttcttggta aaaacgtatc ctgctcttat aagaagcagc ttaaagagca 2040 aattctgggt caatgaacag aggtatggag gaatttccat tggaggaaag ctcccagtcg 2100 tccccatcac gggggaagca cttgttgggt ttttaagcga ccttggccgg atcatgaatg 2160 tgagcggggg ccctatcact agagaggcct ctaaagaaat acctgatttc cttaaacatc 2220 tagaaactga agacaacatt aaggtgtggt ttaataacaa aggctggcat gccctggtca 2280 gctttctcaa tgtggcccac aacgccatct tacgggccag cctgcctaag gacagaagcc 2340 ccgaggagta tggaatcacc gtcattagcc aacccctgaa cctgaccaag gagcagctct 2400 cagagattac agtgctgacc acttcagtgg atgctgtggt tgccatctgc gtgattttct 2460 ccatgtcctt cgtcccagcc agctttgtcc tttatttgat ccaggagcgg gtgaacaaat 2520 ccaagcacct ccagttatc agtggagtga gccccaccac ctactgggta 
accaacttcc 2580 tctgggacat catgaattat tccgtgagtg ctgggctggt ggtgggcatc ttcatcgggt 2640 ttcagaagaa agcctacact tctccagaaa accttcctgc ccttgtggca ctgctcctgc 2700 tgtatggatg ggcggtcatt cccatgatgt acccagcatc cttcctgttt gatgtcccca 2760 gcacagccta tgtggcttta tcttgtgcta atctgttcat cggcatcaac agcagtgcta 2820 ttaccttcat cttggaatta tttgagaata accggacgct gctcaggttc aacgccgtgc 2880 tgaggaagct gctcattgtc ttcccccact tctgcctggg ccggggcctc attgaccttg 2940 cactgagcca ggctgtgaca gatgtctatg cccggtttgg tgaggagcac tctgcaaatc 3000 cgttccactg ggacctgatt gggaagaacc tgtttgccat ggtggtggaa ggggtggtgt 3060 acttcctcct gaccctgctg gtccagcgcc acttcttcct ctcccaatgg attgccgagc 3120 ccactaagga gcccattgtt gatgaagatg atgatgtggc tgaagaaaga caaagaatta 3180 ttactggtgg aaataaaact gacatcttaa ggctacatga actaaccaag atttatccag 3240 gcacctccag cccagcagtg gacaggctgt gtgtcggagt tcgccctgga gagtgctttg 3300 gcctcctggg agtgaatggt gccggcaaaa caaccacatt caagatgctc actggggaca 3360 ccacagtgac ctcaggggat gccaccgtag caggcaagag tattttaacc aatatttctg 3420 aagtccatca aaatatgggc tactgtcctc agtttgatgc aatcgatgag ctgctcacag 3480 gacgagaaca tctttacctt tatgcccggc ttcgaggtgt accagcagaa gaaatcgaaa 3540 aggttgcaaa ctggagtatt aagagcctgg gcctgactgt ctacgccgac tgcctggctg 3600 gcacgtacag tgggggcaac aagcggaaac tctccacagc catcgcactc attggctgcc 3660 caccgctggt gctgctggat gagcccacca cagggatgga cccccaggca cgccgcatgc 3720 tgtggaacgt catcgtgagc atcatcagag aagggagggc tgtggtcctc acatcccaca 3780 gcatggaaga atgtgaggca ctgtgtaccc ggctggccat catggtaaag ggcgcctttc 3840 gatgtatggg caccattcag catctcaagt ccaaatttgg agatggctat atcgtcacaa 3900 tgaagatcaa atccccgaag gacgacctgc ttcctgacct gaaccctgtg gagcagttct 3960 tccaggggaa cttcccaggc agtgtgcaga gggagaggca ctacaacatg ctccagttcc 4020 aggtctcctc ctcctccctg gcgaggatct tccagctcct cctctcccac aaggacagcc 4080 tgctcatcga ggagtactca gtcacacaga ccacactgga ccaggtgttt gtaaattttg 4140 ctaaacagca gactgaaagt catgacctcc ctctgcaccc tcgagctgct ggagccagtc 4200 ctaaacagca gactgaaagt catgacctcc ctctgcaccc tcgagctgct ggagccagtc 
4200 gacaagccca ggacgactac aaagaccatg acggtgatta taaagatcat gacatcgact 4260 gacaagccca ggacgactac aaagaccatg acggtgatta taaagatcat gacatcgact 4260 acaaggatga cgatgacaag tgagcggccg cttcgagcag acatgataag atacattgat 4320 acaaggatga cgatgacaag tgagcggccg cttcgagcag acatgataag atacattgat 4320 gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 4380 gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 4380 gatgctattg ctttatttgt aaccattata agctgcaata aacaagttaa caacaacaat 4440 gatgctattg ctttatttgt aaccattata agctgcaata aacaagttaa caacaacaat 4440 tgcattcatt ttatgtttca ggttcagggg gagatgtggg aggtttttta aagcaagtaa 4500 tgcattcatt ttatgtttca ggttcagggg gagatgtggg aggtttttta aagcaagtaa 4500 aacctctaca aatgtggtaa aatcgataag gatcttccta gagcatggct acgtagataa 4560 aacctctaca aatgtggtaa aatcgataag gatcttccta gagcatggct acgtagataa 4560 gtagcatggc gggttaatca ttaactacaa ggaaccccta gtgatggagt tggccactcc 4620 gtagcatggc gggttaatca ttaactacaa ggaaccccta gtgatggagt tggccactcc 4620 ctctctgcgc gctcgctcgc tcactgaggc cgggcgacca aaggtcgccc gacgcccggg 4680 ctctctgcgc gctcgctcgc tcactgaggc cgggcgacca aaggtcgccc gacgcccggg 4680 ctttgcccgg gcggcctcag tgagcgagcg agcgcgcag 4719 ctttgcccgg gcggcctcag tgagcgagcg agcgcgcag 4719 <210> 75<210> 75 <211> 4758 <211> 4758 <212> DNA <212> DNA [[ID=ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccggggcgtcg ggcgaccttt ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccggga tttttccgat ttcggcctat tggttaaaaa atgagctgat 180 ttaacaaaaa tttaacgcga attttaaca aatattaacg tttataattt caggtggcat ctttcaagct tatgcacagc tggaacttca agctgtacgt catgggcagc ggcggggtac cgataggcac ctattggtct tactgacatc cactttgcct ttctctccac aggtccatcc 360 420. 
tgacgggtct gttgccacca acctctggga ctgtgctcgt tgggggaagg gacattgaaa ccagcctgga tgcagtccgg cagagccttg gcatgtgtcc acagcacaac atcctgttcc 480 accacctcac ggtggctgag cacatgctgt tctatgccca gctgaaagga aagtcccagg 540 aggaggccca gctggagatg gaagccatgt tggaggacac aggcctccac cacaagcgga atgaagaggc tcaggaccta tcaggtggca tgcagagaaa gctgtcggtt gccattgcct 660 ttgtggggaga tgccaaggtg gtgattctgg acgaacccac ctctggggtg gacccttact 720 cgagacgctc aatctgggat ctgctcctga agtatcgctc aggcagaacc atcatcatgt 780 ccactcacca catggacgag gccgacctcc ttgggggaccg cattgccatc attgcccagg 840 gaaggctcta ctgctcaggc accccactct tcctgaagaa ctgctttggc acaggcttgt 900 acttaacctt ggtgcgcaag atgaaaaaca tccagagcca aaggaaggc agtgaggga 960 cctgcagctg ctcgtctaag ggtttctcca ccacgtgtcc agcccacgtc gatgacctaa 1020 ctccagaaca agtcctggat ggggatgtaa atgagctgat ggatgtagtt ctccaccatg 1080 ttccagaggc aaagctgggtg gagtgcattg gtcaagaact tatcttcctt cttccaaata 1140 agaacttcaa gcacagagca tatgccagcc ttttcagaga gctgggagg acgctggctg 1200 accttggtct cagcagtttt ggaatttctg acactcccct ggaagagatt tttctgaagg 1260 tcacggagga ttctgattca ggacctctgt ttgcgggtgg cgctcagcag aaaagagaaa 1320 acgtcaaccc ccgacacccc tgcttgggtc ccagagaagaa ggctggacag acaccccagg 1380 actccaatgt ctgctcccca ggggcgccgg ctgctcacc agagggccag cctcccccag 1440 agccagagtg cccaggcccg cagctcaca cggggacaca gctggtccctc cagcatgtgc 1500 aggcgctgct gtcaagaga ttccacaca ccacacag ccacaggac ttcctggcgc 1560 agatcgtgct cccggctacc ttgtgtttt tggctgat gcttctatt gttatccctc 1620 cttttggcga ataccccgct ttgaccctc acccctggat atatgggcag cagtacacct 1680 tcttcagcat ggatgaacca ggcagtgagc agttcacggt acttgcagac gtcctcctga 1740 ataagccagg ctttggcac cgctgcctga aggaagggtg gcttccggag taccctgtg 1800 gcaactcaac accctggaag actccttctg tgtccccaa catcacccag ctgttccaga 1860 agcagaaatg gacacaggtc aacccttcac catcctgcag gtgcagcacc agggagaagc 1920 tcaccatgct gccagagtgc cccgagggtg ccggggcct cccgcccccc cagagacacc 1980 agcgcagcac ggaaattcta caacctga cggacaggaa catctccgac ttctgtaa 2040 aaacgtatcc tgctcttata agaagcagct taagagcaa 
attctgggtc aatgaacaga 2100 ggtatggagg aatttccatt ggaggaagc tcccagtcgt cccatcacg ggggaagcac 2160 ttgttgggtt tttaagcgac cttggccgga tcatgaatgt gagcgggggc cctatcacta 2220 gagaggcctc taaagaaata cctgatttcc ttaaacatct agaaactgaa gacaacatta 2280 aggtgtggtt taataacaaa ggctggcatg ccctggtcag ctttctcaat gtggcccaca 2340 acgccatctt acgggccagc ctgcctaagg acagaagccc cgaggagtat ggaatcaccg 2400 tcattagcca acccctgaac ctgaccaagg agcagctctc agagattaca gtgctgacca 2460 cttcagtgga tgctgtggtt gccatctgcg tgattttctc catgtccttc gtcccagcca 2520 gctttgtcct ttatttgatc caggagcggg tgaacaaatc caagcacctc cagtttatca 2580 gtggagtgag ccccaccacc tactgggtaa ccaacttcct ctgggacatc atgaattatt 2640 ccgtgagtgc tgggctggtg gtgggcatct tcatcgggtt tcagaagaaa gcctacactt 2700 ctccagaaaa ccttcctgcc cttgtggcac tgctcctgct gtatggatgg gcggtcattc 2760 ccatgatgta cccagcatcc ttcctgtttg atgtccccag cacagcctat gtggctttat 2820 cttgtgctaa tctgttcatc ggcatcaaca gcagtgctat taccttcatc ttggaattat 2880 ttgagaataa ccggacgctg ctcaggttca acgccgtgct gaggaagctg ctcattgtct 2940 tcccccactt ctgcctgggc cggggcctca ttgaccttgc actgagccag gctgtgacag 3000 atgtctatgc ccggtttggt gaggagcact ctgcaaatcc gttccactgg gacctgattg 3060 ggaagaacct gtttgccatg gtggtggaag gggtggtgta cttcctcctg accctgctgg 3120 tccagcgcca cttcttcctc tcccaatgga ttgccgagcc cactaaggag cccattgttg 3180 atgaagatga tgatgtggct gaagaaagac aaagaattat tactggtgga aataaaactg 3240 acatcttaag gctacatgaa ctaaccaaga tttatccagg cacctccagc ccagcagtgg 3300 acaggctgtg tgtcggagtt cgccctggag agtgctttgg cctcctggga gtgaatggtg 3360 ccggcaaaac aaccacattc aagatgctca ctggggacac cacagtgacc tcaggggatg 3420 ccaccgtagc aggcaagagt attttaacca atatttctga agtccatcaa aatatgggct 3480 actgtcctca gtttgatgca atcgatgagc tgctcacagg acgagaacat ctttaccttt 3540 atgcccggct tcgaggtgta ccagcagaag aaatcgaaaa ggttgcaaac tggagtatta 3600 agagcctggg cctgactgtc tacgccgact gcctggctgg cacgtacagt gggggcaaca 3660 agcggaaact ctccacagcc atcgcactca ttggctgccc accgctggtg ctgctggatg 3720 agcccaccac agggatggac ccccaggcac gccgcatgct gtggaacgtc 
atcgtgagca 3780 tcatcagaga agggaggct gtggtcctca catcccacag catggaagaa tgtgaggcac 3840 tgtgtacccg gctggccatc atggtaaagg gcgcctttcg atgtatgggc accattcagc 3900 atctcaagtc caaatttgga gatggctata tcgtcacaat gaagatcaaa tccccgaagg 3960 acgacctgct tcctgacctg aaccctgtgg agcagttctt ccagggggaac ttcccaggca 4020 gtgtgcagag ggagaggcac tacaacatgc tccagttcca ggtctcctcc tcctccctgg 4080 cgaggatctt ccagctcctc ctctcccaca aggacagcct gctcatcgag gagtactcag 4140 tcacacagac caactggac caggtgtttg taaattttgc taaacagcag actgaaagtc 4200 atgacctccc tctgcaccct cgagctgctg gagccagtcg aaagcccag gacgactaca 4260 aagaccatga cggtgattat aaagatcatg acatcgaacta caaggatgac gatgacaagt 4320 gagcggccgc ttcgagcaga catgataaga tacattgatg agtttggaca aaccacaact 4380 agaatgcagt gaaaaaaatg ctttatttgt gaaatttgtg atgctattgc tttatttgta 4440 accattataa gctgcaataa acaagttaac aacaacaatt gcattcattt tatgtttcag 4500 gttcaggggg agatgtggga ggttttttaa agcaagtaaa acctctacaa atgtggtaaa 4560 atcgataagg atcttcctag agcatggcta cgtagataag tagcatggcg ggttaatcat 4620 taactacaag gaacccctag tgatggagtt ggccactccc tctctgcgcg ctcgctcgct 4680 cactgaggcc gggcgaccaa aggtcgcccg acgcccgggc tttgcccggg cggcctcagt 4740 gagcgagcga gcgcgcag 4758 <210> 76 <211> 4844 <212> DNA <213> Artificial sequence <220> <223> Synthetic <400> 76 ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt 60 ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact 120 aggggttcct ggatccggga tttttccgat ttcggcctat tggttaaaaa atgagctgat 180 ttaacaaaaa tttaacgcga attttaacaa aatattaacg tttataattt caggtggcat 240 ctttcaagct tatgcacagc tggaacttca agctgtacgt catgggcagc ggcggggtac 300 catgcacagc tggaacttca agctgtacgt catgggcagc ggcggatgca cagctggaac 360 ttcaagctgt acgtcatggg cagcggcgat aggcacctat tggtcttact gacatccact 420 ttgcctttct ctccacaggt ccatcctgac gggtctgttg ccaccaacct ctgggactgt 480 gctcgttggg ggaagggaca ttgaaaccag cctggatgca gtccggcaga gccttggcat 540 gtgtccacag cacaacatcc tgttccacca cctcacggtg gctgagcaca tgctgttcta 600 tgcccagctg aaaggaaagt cccaggagga 
ggcccagctg gagatggaag ccatgttgga 660 ggacacaggc ctccaccaca agcggaatga agaggctcag gacctatcag gtggcatgca 720 gagaaagctg tcggttgcca ttgcctttgt gggagatgcc aaggtggtga ttctggacga 780 acccacctct ggggtggacc cttactcgag acgctcaatc tgggatctgc tcctgaagta 840 tcgctcaggc agaaccatca tcatgtccac tcaccacatg gacgaggccg acctccttgg 900 ggaccgcatt gccatcattg cccagggaag gctctactgc tcaggcaccc cactcttcct 960 gaagaactgc tttggcacag gcttgtactt aaccttggtg cgcaagatga aaaacatcca 1020 gagccaaagg aaaggcagtg aggggacctg cagctgctcg tctaagggtt tctcaccac 1080 gtgtccagcc cacgtcgatg acctaactcc agaacaagtc ctggatgggg atgtaaatga 1140 gctgatggat gtagttctcc accatgttcc agaggcaaag ctggtggagt gcattggtca 1200 agaacttatc ttccttcttc aaataagaa cttcaagcac agagcatatg ccagcctttt 1260 cagagagctg gaggagacgc tggctgacct tggtctcagc agttttggaa tttctgacac 1320 tcccctggaa gagatttttc tgaaggtcac ggaggattct gattcaggac ctctgtttgc 1380 gggtggcgct cagcagaaaa gagaaacgt caacccccga cacccctgct tgggtcccag 1440 agaaggct ggacagacac cccaggactc caatgtctgc tccccagggg cgccggctgc 1500 tcacccagag ggccagcctc ccccagagcc agagtgccca ggcccgcagc tcaacacggg 1560 gacacagctg gtcctccagc atgtgcaggc gctgctggtc aagagattcc aacacaccat 1620 ccgcagccac aaggacttcc tggcgcagat cgtgctcccg gctacctttg tgtttttggc 1680 tctgatgctt tctattgtta tccctccttt tggcgaatac cccgctttga cccttcaccc 1740 ctggatatat gggcagcagt acaccttctt cagcatggat gaaccaggca gtgagcagtt 1800 cacggtactt gcagacgtcc tcctgaataa gccaggcttt ggcaaccgct gcctgaagga 1860 agggtggctt ccggagtacc cctgtggcaa ctcaacaccc tggaagactc cttctgtgtc 1920 cccaaacatc acccagctgt tccagaagca gaaatggaca caggtcaacc cttcaccatc 1980 ctgcaggtgc agcaccaggg agaagctcac catgctgcca gagtgccccg agggtgccgg 2040 gggcctcccg cccccccaga gaacacagcg cagcacggaa attctacaag acctgacgga 2100 caggaacatc tccgacttct tggtaaaaac gtatcctgct cttataagaa gcagcttaaa 2160 gagcaaattc tgggtcaatg aacagaggta tggaggaatt tccattggag gaaagctccc 2220 agtcgtcccc atcacggggg aagcacttgt tgggttttta agcgaccttg gccggatcat 2280 gaatgtgagc gggggcccta tcactagaga ggcctctaaa gaaatacctg 
atttccttaa 2340 acatctagaa actgaagaca acattaaggt gtggtttaat aacaaaggct ggcatgccct 2400 ggtcagcttt ctcaatgtgg cccacaacgc catcttacgg gccagcctgc ctaaggacag 2460 aagccccgag gagtatggaa tcaccgtcat tagccaaccc ctgaacctga ccaaggagca 2520 gctctcagag attacagtgc tgaccacttc agtggatgct gtggttgcca tctgcgtgat 2580 tttctccatg tccttcgtcc cagccagctt tgtcctttat ttgatccagg agcgggtgaa 2640 caaatccaag cacctccagt ttatcagtg...
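
The listing above appears to follow the usual sequence-listing layout, in which groups of ten bases are followed by a running residue count at the end of each row. Several rows in this extracted copy show short groups or missing counts, so a quick consistency check can help locate damaged rows before a sequence is reused. The following is a minimal sketch under that layout assumption; the function name `find_count_mismatches` is hypothetical and not part of the patent.

```python
import re

# Lower-case nucleotide groups as they appear in the listing (assumed alphabet).
BASES = re.compile(r"[acgtn]+")


def find_count_mismatches(listing: str):
    """Return (declared_count, counted_bases) pairs where the running
    residue count printed in the listing disagrees with the bases seen."""
    total = 0
    mismatches = []
    for token in listing.split():
        if token.isdigit():
            declared = int(token)
            if declared != total:
                mismatches.append((declared, total))
                # Resynchronise on the printed count so that one damaged
                # row does not flag every later row as well.
                total = declared
        elif BASES.fullmatch(token):
            total += len(token)
        # Any other token (tags, notes) is ignored in this sketch.
    return mismatches


# First row of SEQ ID No. 72 as printed in the listing.
row = ("ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc "
       "ccgggcgtcg ggcgaccttt 60")
print(find_count_mismatches(row))
```

A row whose groups add up to fewer bases than the printed count is reported with both numbers, which points to the place where characters were probably lost during extraction.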

Claims

1. An adeno-associated virus (AAV) vector system for expressing a coding sequence of a gene of interest in a cell, said coding sequence comprising a first portion and a second portion, said vector system comprising:
a) a first vector comprising, in the 5' to 3' direction:
- a 5'-inverted terminal repeat (5'-ITR) sequence,
- the first portion (CDS1) of the coding sequence,
- a first reconstitution sequence comprising a splice donor signal (SD) and a recombinogenic region, and
- a 3'-inverted terminal repeat (3'-ITR) sequence; and
b) a second vector comprising, in the 5' to 3' direction:
- a 5'-inverted terminal repeat (5'-ITR) sequence,
- a second reconstitution sequence comprising a recombinogenic region and a splice acceptor signal (SA),
- the second portion (CDS2) of the coding sequence, and
- a 3'-inverted terminal repeat (3'-ITR) sequence,
characterized in that
both the first and second vectors further comprise a nucleotide sequence of a degradation signal, said sequence being located 3' of the SD in the first vector and 5' of the SA in the second vector; and wherein
the first and second vectors are independently viral vectors; and
the nucleotide sequence of the degradation signal in the first vector consists of a sequence encoding the CL1 amino acid sequence of SEQ ID No. 1; and
the nucleotide sequence of the degradation signal in the second vector consists of a nucleotide sequence encoding PB29, wherein the amino acid sequence of PB29 is SEQ ID No. 15 or SEQ ID No. 14; or consists of the nucleotide sequence of SEQ ID No. 19 or SEQ ID No. 20; or consists of a sequence encoding three copies of PB29, wherein the amino acid sequence of a single copy of PB29 is SEQ ID No. 15 or SEQ ID No. 14; and
the nucleotide sequence of the recombinogenic region is selected from the following group: AK, GGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAAT (SEQ ID No. 22) or GGGATTTTTCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAAT (SEQ ID No. 23); AP1 (SEQ ID No. 24); AP2 (SEQ ID No. 25); and AP (SEQ ID No. 26).

2. The AAV vector system of claim 1, wherein the first vector further comprises a promoter sequence operably linked to the 5' end portion of the first part (CDS1) of the coding sequence.

3. The AAV vector system of claim 1, wherein the ITRs are derived from the same viral serotype or from different viral serotypes.

4. The AAV vector system of claim 1, wherein the coding sequence is divided into the first part and the second part at a natural exon-exon junction.

5. The AAV vector system of claim 1, wherein the splice donor signal consists of a sequence identical to GTAAGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACTCTTGCGTTTCT (SEQ ID No. 27).

6. The AAV vector system of claim 1, wherein the splice acceptor signal consists of a sequence identical to GATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCACAG (SEQ ID No. 28).

7. The AAV vector system of claim 1, wherein the first vector further comprises at least one enhancer nucleotide sequence operably linked to the coding sequence.

8. The AAV vector system of claim 1, wherein the coding sequence encodes a protein capable of correcting retinal degeneration.

9. The AAV vector system of claim 1, wherein the coding sequence encodes a protein capable of correcting Duchenne muscular dystrophy, cystic fibrosis, hemophilia A, or dysferlin myopathy.

10. The AAV vector system of claim 1, wherein the coding sequence is selected from the coding sequences of the genes in the following group: ABCA4, MYO7A, CEP290, CDH23, EYS, PCDH15, CACNA1, SNRNP200, RP1, PRPF8, RP1L1, ALMS1, USH2A, GPR98, and HMCN1.

11. The AAV vector system of claim 1, wherein the coding sequence is selected from the coding sequences of the genes in the following group: DMD, CFTR, F8, and DYSF.

12. The AAV vector system of claim 1, wherein the first vector does not comprise a polyadenylation signal nucleotide sequence.

13. The AAV vector system of claim 1, comprising:
a) a first vector comprising, in the 5' to 3' direction:
- a 5'-inverted terminal repeat (5'-ITR) sequence;
- a promoter sequence;
- the 5' end portion (CDS1) of the coding sequence of the gene of interest, operably linked to and under the control of the promoter;
- the nucleotide sequence of the splice donor signal;
- the nucleotide sequence of the recombinogenic region; and
- a 3'-inverted terminal repeat (3'-ITR) sequence; and
b) a second vector comprising, in the 5' to 3' direction:
- a 5'-inverted terminal repeat (5'-ITR) sequence;
- the nucleotide sequence of the recombinogenic region;
- the nucleotide sequence of the splice acceptor signal;
- the 3' end portion (CDS2) of the coding sequence;
- a polyadenylation signal nucleotide sequence; and
- a 3'-inverted terminal repeat (3'-ITR) sequence,
characterized in that the vector system further comprises nucleotide sequences of a degradation signal, wherein the nucleotide sequence of the degradation signal is located 3' of CDS1, and the nucleotide sequence of the degradation signal is located 5' of CDS2.

14. The AAV vector system of claim 1, wherein the first and second adeno-associated virus (AAV) vectors are selected from the same AAV serotype or from different AAV serotypes.

15. The AAV vector system of claim 1, wherein the adeno-associated virus is selected from serotype 2, serotype 8, serotype 5, serotype 7, or serotype 9.

16. The AAV vector system of claim 1, wherein the second vector further comprises a polyadenylation signal nucleotide sequence linked to the 3' end portion (CDS2) of the coding sequence.

17. A host cell transduced with the AAV vector system of claim 1.

18. The AAV vector system of claim 1 or the host cell of claim 17, for medical use.

19. The AAV vector system of claim 1 or the host cell of claim 17, for use in gene therapy.

20. The AAV vector system of claim 1 or the host cell of claim 17, for use in the treatment and/or prevention of a pathological condition or disease characterized by retinal degeneration.

21. The AAV vector system or host cell of claim 20, wherein the retinal degeneration is hereditary.

22. The AAV vector system or host cell of claim 20, wherein the pathological condition or disease is selected from the following group: retinitis pigmentosa (RP), Leber congenital amaurosis (LCA), Stargardt disease (STGD), Usher syndrome (USH), Alström syndrome, congenital stationary night blindness (CSNB), macular dystrophy, occult macular dystrophy, and diseases caused by mutations in the ABCA4 gene.

23. The AAV vector system of claim 1 or the host cell of claim 17, for use in the prevention and/or treatment of Duchenne muscular dystrophy, cystic fibrosis, hemophilia A, or dysferlin myopathy.

24. A pharmaceutical composition comprising the AAV vector system of claim 1 or the host cell of claim 17, and a pharmaceutically acceptable carrier.

25. Use of the AAV vector system of claim 1 or the host cell of claim 17 in the preparation of a product for the treatment and/or prevention of a pathological condition or disease characterized by retinal degeneration.

26. Use of the AAV vector system of claim 1 or the host cell of claim 17 in the preparation of a product for the treatment and/or prevention of Duchenne muscular dystrophy, cystic fibrosis, hemophilia A, or dysferlin myopathy.
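
For illustration only, and not as part of the claims: the element order recited in claims 1 and 13 can be sketched as two ordered lists, with a check that the degradation signal lies 3' of the splice donor in the first vector and 5' of the splice acceptor in the second. The element labels, the function names, and the exact placement of the degradation signal relative to the recombinogenic region are assumptions made for this sketch; the claims fix only its position relative to SD, SA, CDS1, and CDS2. The second check relies on the general rule that canonical introns begin with GT and end with AG, applied to SEQ ID No. 27 and SEQ ID No. 28 as quoted in claims 5 and 6.

```python
from typing import List

# Hypothetical labels; each list is read in the 5' to 3' direction.
FIRST_VECTOR: List[str] = [
    "5'-ITR", "promoter", "CDS1", "SD",
    "recombinogenic region",
    "degradation signal",  # assumed position; claims only require 3' of SD
    "3'-ITR",
]
SECOND_VECTOR: List[str] = [
    "5'-ITR",
    "degradation signal",  # assumed position; claims only require 5' of SA
    "recombinogenic region", "SA", "CDS2", "polyA", "3'-ITR",
]

# Sequences as quoted in claims 5 and 6.
SD_SEQ = ("GTAAGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAG"
          "AAGACTCTTGCGTTTCT")  # SEQ ID No. 27
SA_SEQ = "GATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCACAG"  # SEQ ID No. 28


def degradation_signals_placed_as_claimed(first: List[str], second: List[str]) -> bool:
    """True if the signal is 3' of SD in the first vector and 5' of SA in the second."""
    after_sd = first.index("degradation signal") > first.index("SD")
    before_sa = second.index("degradation signal") < second.index("SA")
    return after_sd and before_sa


def has_canonical_intron_ends(sd: str, sa: str) -> bool:
    """True if the donor starts with GT and the acceptor ends with AG."""
    return sd.upper().startswith("GT") and sa.upper().endswith("AG")


if __name__ == "__main__":
    print(degradation_signals_placed_as_claimed(FIRST_VECTOR, SECOND_VECTOR))
    print(has_canonical_intron_ends(SD_SEQ, SA_SEQ))
```

Moving the degradation signal of the first vector to a position 5' of SD (for example, between CDS1 and SD) would make the placement check fail. This mirrors the design rationale stated in the summary: the signal is meant to sit in the portion that is removed when the full-length transcript is spliced, so that it ends up only on truncated products.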