PPR editor protein having improved translation efficiency and use for same
Targeted amino acid substitutions in the C-terminal domain of a PPR editor protein facilitate its detachment from edited RNA, enhancing RNA editing and translation efficiency and addressing the low translation efficiency of existing PPR editor proteins.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- EDITFORCE INC
- Filing Date
- 2025-12-26
- Publication Date
- 2026-07-02
AI Technical Summary
RNA edited with a PPR editor protein exhibits low translation efficiency due to improper detachment of the protein from the edited RNA, leading to poor protein synthesis.
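Modify the PPR editor protein by substituting specific amino acids in the C-terminal domain: for a C-terminal domain ending in a PG domain, position 3 of the E1 motif, position 3 of the E2 motif, or position 1 of the PG domain; for a C-terminal domain ending in a WW domain, position 3 of the P2 motif, position 10 of the L2 motif, position 1 of the E1 motif, or position 13 of the E2 motif. The substitutions enhance the protein's ability to dissociate from edited RNA, thereby improving translation efficiency.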
The modified PPR editor protein demonstrates enhanced RNA editing efficiency and translation efficiency, as evidenced by increased accumulation of β-catenin protein and improved expression of downstream genes.
Abstract
Description
PPR editor proteins with improved translation efficiency and their applications
[0001] This invention relates to a PPR protein that can bind to and edit target RNA. This invention is useful in fields such as medicine (drug discovery support, treatment), agriculture (agricultural, fishery, and livestock product production, breeding), and chemistry (biological substance production).
[0002] Pentatricopeptide repeat (PPR) proteins are involved in eukaryotic gene regulation at the RNA level (Non-Patent Literature 1). PPR proteins are RNA-binding proteins with a tandem repeat structure of PPR motifs, each consisting of 31 to 36 amino acids. The first of the two helices constituting each motif interacts with the RNA molecule. Two amino acids within the motif (the 5th and last positions) recognize RNA in a base-specific manner according to the nucleic acid recognition code. PPR proteins are classified into two subclasses based on the composition of the PPR motifs: the P class, which consists only of the standard P motif, and the PLS class, which has L1 and S1 motifs in addition to P1 (P) (Non-Patent Literature 2). Most members of the PLS class have PPR-like motifs (P2, L2, S2, E1, E2) and part or all of a cytidine deaminase-like domain (DYW) at the C-terminus. The DYW domain has three subclasses: the PG and WW subclasses are involved in cytidine-to-uridine (C-to-U) editing of RNA, and the KP subclass is involved in uridine-to-cytidine (U-to-C) editing (Non-Patent Literature 3, 4). The inventors previously created a "designer PPR editor" by ligating a PLS domain, composed of artificially designed PLS repeats, with a C-terminal RNA editing domain consisting of PPR-like motifs (P2, L2, S2, E1, E2) and a DYW domain. This protein edited target bases on foreign mRNA introduced into the cytoplasm of cultured human cells (Non-Patent Literature 5, Patent Literature 1).
[0003] Patent Literature 1: International Publication WO2021/201198
[0004] Barkan A, Small I. Pentatricopeptide repeat proteins in plants. Annu Rev Plant Biol. 2014;65:415-42.
Cheng S, Gutmann B, Zhong X, Ye Y, Fisher MF, Bai F, Castleden I, Song Y, Song B, Huang J, Liu X, Xu X, Lim BL, Bond CS, Yiu SM, Small I. Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants. Plant J. 2016 Feb;85(4):532-547.
Gerke P, Szovenyi P, Neubauer A, Lenz H, Gutmann B, McDowell R, Small I, Schallenberg-Rudinger M, Knoop V. Towards a plant model for enigmatic U-to-C RNA editing: the organelle genomes, transcriptomes, editomes and candidate RNA editing factors in the hornwort Anthoceros agrestis. New Phytol. 2020 Mar;225(5):1974-1992.
Gutmann B, Royan S, Schallenberg-Rudinger M, Lenz H, Castleden IR, McDowell R, Vacher MA, Tonti-Filippini J, Bond CS, Knoop V, Small ID. The Expansion and Diversification of Pentatricopeptide Repeat RNA-Editing Factors in Plants. Mol Plant. 2020 Feb 3;13(2):215-230.
Ichinose M.J Biol Chem. 2003 Aug 22;278(34):31781-9.
[0005] According to our research, RNA edited with a PPR editor having C-to-U activity has low translation efficiency. RNA editing and translation by a PPR editor protein first involves the PPR editor protein binding to the target RNA and performing the predetermined editing (base substitution). Next, the PPR editor protein detaches from the edited RNA, and ribosomes bind to the detached RNA, leading to the translation of the RNA into a protein. If the detachment of the PPR editor protein from the edited RNA is not performed properly, the translation efficiency will be poor.
[0006] The inventors first constructed a mechanism to monitor whether edited RNA is being translated properly. Specifically, they focused on the CTNNB1 gene, which encodes β-catenin involved in the Wnt / β-catenin pathway. In the absence of Wnt protein, β-catenin is phosphorylated and degraded via ubiquitination (Non-Patent Literature 6) (Figure 1). On the other hand, in the presence of Wnt protein, Wnt activates proteins that inhibit the phosphorylation of β-catenin, so β-catenin is not degraded and accumulates in the cytoplasm. The accumulated β-catenin then translocates to the nucleus and forms a complex with TCF / LEF family transcription factors to promote downstream gene expression. Threonine, the 41st amino acid of β-catenin, is one of the amino acids that is phosphorylated, and substitution with isoleucine inhibits phosphorylation, leading to the accumulation of β-catenin. The amount of β-catenin in the nucleus can be monitored by the expression level of a reporter gene under the control of the TCF / LEF promoter.
[0007] Using this mechanism, the inventors confirmed that protein translation from the target RNA molecule was inefficient after RNA editing by the PPR editor protein. Reasoning that this was because the PPR editor protein did not readily detach from the edited RNA, they focused on the C-terminal domain of the PPR editor protein, which includes the DYW domain. They confirmed that substituting amino acids thought to be involved in the interaction between the C-terminal domain and RNA improved both the editing activity and the protein translation efficiency, thus completing the present invention.
[0008] The present invention provides the following:
[1] A method for modifying a PPR editor protein having a C-terminal domain, comprising the following step a or b:
a. Substituting at least one selected from position 3 of the E1 motif, position 3 of the E2 motif, and position 1 of the PG domain in a C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a PG domain; or
b. Substituting at least one selected from position 3 of the P2 motif, position 10 of the L2 motif, position 1 of the E1 motif, and position 13 of the E2 motif in a C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a WW domain.
[2] The method according to 1, wherein the substitution made in step a is at least one substitution selected from 3T of the E1 motif, 3A of the E2 motif, and 1R of the PG domain, and the substitution made in step b is at least one substitution selected from 3S of the P2 motif, 10A of the L2 motif, 1A of the E1 motif, and 13A of the E2 motif.
[3] The method according to 1 or 2, wherein the substitution made in step a includes at least a substitution of P3A of the E2 motif, and the substitution made in step b includes at least a substitution of D13A of the E2 motif.
[4] A method for promoting the translation of edited RNA, comprising: carrying out the modification method described in any one of items 1 to 3; binding the obtained modified PPR editor protein to a target RNA to perform RNA editing; and translating a protein from the edited RNA, wherein the translation efficiency of the protein is higher than that of an unmodified PPR editor protein.
[5] A PPR editor protein having a portion consisting of 2 to 20 PPR motifs and a C-terminal domain consisting of a polypeptide of the following a or b:
a. A polypeptide consisting of a sequence in which at least one selected from V3 of the E1 motif, P3 of the E2 motif, and E1 of the PG domain is substituted in the amino acid sequence of the C-terminal domain consisting of a P2 motif, L2 motif, S2 motif, E1 motif, E2 motif, and PG domain; or
b. A polypeptide consisting of a sequence in which at least one selected from T3 of the P2 motif, V10 of the L2 motif, L1 of the E1 motif, and D13 of the E2 motif is substituted in the amino acid sequence of the C-terminal domain consisting of a P2 motif, L2 motif, S2 motif, E1 motif, E2 motif, and WW domain,
wherein each PPR motif, including the P2 motif, L2 motif, S2 motif, E1 motif, and E2 motif, consists of a polypeptide with a total length of 31 to 36 amino acids represented by the following Formula 1, each amino acid is numbered in order as A1, A2, A3, A4, and so on, and the amino acid combination of A5 and L functions for selective binding to RNA bases. More specifically, the bases that can bind to the two amino acid combinations of A5 and L satisfy one of the conditions in the table below, depending on the subclass to which the motif is classified:
(Helix A)-X1-(Helix B)-X2 (Formula 1)
(wherein Helix A is a portion consisting of 13 or 14 amino acids in length capable of forming an α-helix structure, X1 consists of 1 to 9 amino acids in length, preferably 1 to 3 amino acids, Helix B is a portion consisting of 10 to 14 amino acids in length capable of forming an α-helix structure, X2 consists of 1 to 9 amino acids in length, preferably 4 to 9 amino acids, and in X2, the C-terminal amino acid is represented by L).
[6] The PPR editor protein according to 5, wherein polypeptide a is subjected to at least one substitution selected from V3T of the E1 motif, P3A of the E2 motif, and E1R of the PG domain, and polypeptide b is subjected to at least one substitution selected from T3S of the P2 motif, V10A of the L2 motif, L1A of the E1 motif, and D13A of the E2 motif.
[7] The PPR editor protein according to 5 or 6, wherein polypeptide a is a polypeptide comprising a sequence in the amino acid sequence of SEQ ID NO: 1 that is subjected to at least one substitution selected from V106T, P140A, and E172R; and polypeptide b is a polypeptide comprising a sequence in the amino acid sequence of SEQ ID NO: 2 that is subjected to at least one substitution selected from T3S, V45A, L104A, and D150A.
[8] A nucleic acid encoding the PPR editor protein according to any one of items 5 to 7.
[9] A vector comprising the nucleic acid according to 8.
[10] A cell comprising the vector according to 9.
[0009] This invention modifies the interaction between the PPR editor protein and the target RNA or the flexibility of the C-terminal domain, thereby altering the binding affinity of the PPR editor protein to RNA, promoting the dissociation of the edited PPR editor protein from the RNA, improving the editing efficiency of the PPR editor protein, and enabling efficient protein synthesis from the edited RNA.
[0010] β-catenin activity measurement reporter system. The CTNNB1 gene encodes β-catenin. A small fraction of β-catenin translocates to the nucleus, but most is phosphorylated and subsequently degraded via ubiquitination. When threonine, the 41st amino acid of β-catenin, is replaced with isoleucine, phosphorylation is inhibited. This T41I mutant β-catenin accumulates in the cytoplasm and then translocates to the nucleus, where it interacts with TCF / LEF family transcription factors, activating the expression of downstream genes. To measure β-catenin activity, a promoter sequence recognized by the β-catenin / TCF complex is inserted upstream of the Nluc reporter gene. β-catenin activity can be measured by simultaneously introducing into cells the Nluc reporter and a plasmid carrying the Fluc reporter gene under the EF1α promoter (which is unaffected by β-catenin), and comparing the ratio of their respective luminescence levels (Nluc / Fluc).
Mutations within the C-terminal domain promote RNA dissociation of the 14P-DYW protein. (a) Comparison of the consensus sequences obtained from PG and WW proteins in early land plant populations with the amino acid sequences of the C-terminal WW1 and PG1 domains. Amino acids matching the consensus sequence are shown as black dots. Each motif is shown in a different grayscale. The amino acids at position 5 and the last position (L) within each motif are enclosed in boxes. The first helix region is shown with a black underline. Single amino acid mutants of 14P-WW (b, d, f) and 14P-PG (c, e, g) targeting the CTNNB1-T41I site were designed to fit the consensus sequence. The editing efficiency (b, c), Nluc / Fluc ratio (d, e), and β-catenin protein accumulation (f, g) of the mutants were compared with 14P-WW or 14P-PG (WT) without mutation. Bar graphs show the mean values of editing efficiency or Nluc / Fluc ratio. Data are based on three independent experiments, and each measurement is shown as a dot. Significant differences between WT and mutants are indicated by asterisks: * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001 (Student's t-test). Mutations showing favorable results are boxed.
Mutations to the C-terminal domain of the 14P-PG protein promote RNA dissociation. Editing efficiency (a) and Nluc / Fluc ratio (b) were compared between unmutated 14P-PG (WT) and mutants. Bar graphs show the mean values for editing efficiency and Nluc / Fluc ratio. Data are based on three independent experiments, with each measurement indicated by a dot. Significant differences between WT and mutants are indicated by asterisks: ** P < 0.01 (Student's t-test).
Frequency of amino acid occurrence in each selected DYW domain lineage. Visualized with sequence logos created using WebLogo (see Patent Document 1). The same applies to Figure 4-2. (a) DYW:PG (b) DYW:WW
The E2 motif used with DYW:PG, visualized as a sequence logo in the same way as in Figures 4-1 and 4-2.
The E2 motif used with DYW:WW, visualized as a sequence logo in the same way as in Figures 4-1 and 4-2.
Effect of combinations of mutations in the C-terminal domain of the 14P-PG protein. One to three mutations selected from the E1 motif V3T, the E2 motif P3A, and the DYW domain E1R were introduced into the C-terminal domain of the 14P-PG protein. The Nluc / Fluc ratio did not differ significantly between the fully substituted mutant and the single amino acid substitution (E2:P3A). The data are based on the results of three experiments, and each measured value is indicated by a symbol (○, △, □).
RNA editing efficiency and the amount of protein (β-catenin protein) from the edited mRNA were measured when 1 to 3 mutations selected from the E1 motif V3T, the E2 motif P3A, and the DYW domain E1R were introduced into the C-terminal domain of the 14P-PG protein (same as Figure 5), or when 1 to 2 mutations selected from the P2 motif T3S and the E2 motif D13A were introduced into the C-terminal domain of the 14P-WW protein. Different letters (a, b) indicate a statistically significant difference (one-way ANOVA, Tukey's multiple comparison test, p < 0.05). When multiple letters are present (ab), the value does not differ significantly from either a or b, indicating an intermediate position.
Sequences of SEQ ID NOs: 6 to 17.
[0011] I. Method for Modifying a PPR Editor Protein (Embodiment 1) This embodiment relates to a method for modifying a PPR editor protein. The PPR editor protein according to this embodiment has C-to-U editing activity and is modified to promote translation, as described later. The PPR editor protein includes a binding region consisting of an array of PPR motifs that can bind to a target RNA, and a C-terminal domain. The binding region may consist of a P array, i.e., a simple repeat of a standard 35-amino acid PPR motif (P), or a PLS array, i.e., in addition to P, it may include two similar motifs called L and S, and may be configured as a repeating unit of PLS, or more specifically, as a repeating unit of three PPR motifs: P1 (approximately 35 amino acids), L1 (approximately 35 amino acids), and S1 (approximately 31 amino acids). Unless otherwise specified, the C-terminal domain consists of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a DYW domain.
[0012] (PPR motif) A PPR motif refers to a polypeptide consisting of 30 to 38 amino acids whose amino acid sequence, when analyzed using a web-based protein domain search program, gives an E value for PF01535 in Pfam and PS51375 in Prosite that is below a predetermined value (preferably E-03). In this application, the position numbers of the amino acids constituting the PPR motif are nearly the same as those of PF01535, but correspond to the PF01535 amino acid position plus 1 (e.g., position 5 in this invention corresponds to position 4 in PF01535). For Pfam, see http://pfam.sanger.ac.uk/, and for Prosite, see http://www.expasy.org/prosite/.
[0013] In relation to the present invention, the position of an amino acid on the sequence of a PPR motif is represented by position x.
[0014] In relation to the present invention, when simply referring to a PPR motif, unless otherwise specified, it includes all subclasses of PPR motifs, specifically the P, P1, L1, S1, SS, P2, L2, S2, E1, and E2 motifs. The P2, L2, S2, E1, and E2 motifs may also be referred to as PPR-like motifs.
[0015] More specifically, the PPR motif consists of a polypeptide with a total length of 31 to 36 amino acids, represented by the following formula 1, where each amino acid is numbered sequentially as A1, A2, A3, A4, and so on.
[0016] (Helix A)-X1-(Helix B)-X2 (Formula 1)
[0017] (In Formula 1: Helix A is a portion capable of forming an α-helix structure, consisting of 13 or 14 amino acids in length; X1 consists of 1 to 9 amino acids in length, preferably 1 to 3 amino acids; Helix B is a portion capable of forming an α-helix structure, consisting of 10 to 14 amino acids in length; X2 consists of 1 to 9 amino acids in length, preferably 4 to 9 amino acids, where the C-terminal amino acid in X2 is represented by L.) The amino acid combination of A5 and L functions for selective binding to RNA bases.
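The length constraints of Formula 1 can be illustrated with a minimal Python sketch. It assumes the caller already knows where Helix A, X1, Helix B, and X2 begin and end; the code does not predict helices. The names `MotifSegments`, `check_formula_1`, and `key_residues` are hypothetical and are not part of the specification, and the example segments are placeholder strings rather than real motif sequences.

```python
from dataclasses import dataclass

# Length ranges (inclusive) taken from the description of Formula 1.
HELIX_A_RANGE = (13, 14)
X1_RANGE = (1, 9)
HELIX_B_RANGE = (10, 14)
X2_RANGE = (1, 9)
TOTAL_RANGE = (31, 36)


@dataclass(frozen=True)
class MotifSegments:
    """A PPR motif split into the four parts of Formula 1 (hypothetical container)."""
    helix_a: str
    x1: str
    helix_b: str
    x2: str

    @property
    def sequence(self) -> str:
        # (Helix A)-X1-(Helix B)-X2, read N-terminus to C-terminus.
        return self.helix_a + self.x1 + self.helix_b + self.x2


def check_formula_1(motif: MotifSegments) -> list[str]:
    """Return a list of violated length constraints (empty if the motif fits)."""
    checks = [
        ("Helix A", len(motif.helix_a), HELIX_A_RANGE),
        ("X1", len(motif.x1), X1_RANGE),
        ("Helix B", len(motif.helix_b), HELIX_B_RANGE),
        ("X2", len(motif.x2), X2_RANGE),
        ("total", len(motif.sequence), TOTAL_RANGE),
    ]
    return [
        f"{name} length {n} outside {lo}-{hi}"
        for name, n, (lo, hi) in checks
        if not lo <= n <= hi
    ]


def key_residues(motif: MotifSegments) -> tuple[str, str]:
    """Return (A5, L): the 5th amino acid and the C-terminal amino acid of X2."""
    seq = motif.sequence
    return seq[4], seq[-1]


# Placeholder segments (not a real motif): 13 + 2 + 12 + 8 = 35 amino acids.
example = MotifSegments("A" * 13, "GG", "L" * 12, "K" * 7 + "D")
print(check_formula_1(example))
print(key_residues(example))
```

Keeping the four segments separate, rather than passing one string, makes each length rule checkable on its own.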
[0018] In one embodiment, the Helix A, X1, Helix B, X2, and total length of the PPR motif in each subclass are as follows:
[0019]
[0020] The PPR motif relies on the combination of amino acids A5 and L for selective binding to RNA bases. The relationship between the two amino acid combinations A5 and L and the bindable bases is known as the PPR code. The table below shows the bindable bases and the two amino acid combinations A5 and L that constitute the PPR motif.
[0021]
[0022] In one embodiment, the PPR motif may have the following combinations of amino acids: A2, which is involved in the stability of binding to an RNA molecule, with the A2 residues of adjacent motifs sandwiching a nucleic acid base by van der Waals forces (Non-Patent Literature 3), and A5 and L, which are involved in the specific recognition of RNA bases. In each item, the combination of the three amino acids A2, A5, and L is given in that order.
(3-1) In a PPR motif that selectively binds to U: valine, asparagine, and aspartic acid;
(3-2) In a PPR motif that selectively binds to A: valine, threonine, and asparagine;
(3-3) In a PPR motif that selectively binds to C: valine, asparagine, and asparagine;
(3-4) In a PPR motif that selectively binds to G: glutamic acid, glycine, and aspartic acid;
(3-5) In a PPR motif that selectively binds to C or U: isoleucine, asparagine, and asparagine;
(3-6) In a PPR motif that selectively binds to G: valine, threonine, and aspartic acid;
(3-7) In a PPR motif that selectively binds to G: lysine, threonine, and aspartic acid;
(3-8) In a PPR motif that selectively binds to A: phenylalanine, serine, and asparagine;
(3-9) In a PPR motif that selectively binds to C: valine, asparagine, and serine;
(3-10) In a PPR motif that selectively binds to A: phenylalanine, threonine, and asparagine;
(3-11) In a PPR motif that selectively binds to U or A: isoleucine, asparagine, and aspartic acid;
(3-12) In a PPR motif that selectively binds to A: threonine, threonine, and asparagine;
(3-13) In a PPR motif that selectively binds to U or C: isoleucine, methionine, and aspartic acid;
(3-14) In a PPR motif that selectively binds to U: phenylalanine, proline, and aspartic acid;
(3-15) In a PPR motif that selectively binds to U: tyrosine, proline, and aspartic acid;
(3-16) In a PPR motif that selectively binds to G: leucine, threonine, and aspartic acid.
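The combinations (3-1) to (3-16) amount to a lookup from the three amino acids (A2, A5, L) to the bases a motif selectively binds. The following Python sketch encodes only those sixteen combinations, using one-letter amino acid codes. The names `PPR_CODE` and `bases_for_motif` are hypothetical, the example motif is a placeholder string rather than a real sequence, and combinations not listed simply return `None`.

```python
# (A2, A5, L) -> bases bound, transcribed from items (3-1) to (3-16).
PPR_CODE = {
    ("V", "N", "D"): frozenset("U"),    # (3-1)
    ("V", "T", "N"): frozenset("A"),    # (3-2)
    ("V", "N", "N"): frozenset("C"),    # (3-3)
    ("E", "G", "D"): frozenset("G"),    # (3-4)
    ("I", "N", "N"): frozenset("CU"),   # (3-5)
    ("V", "T", "D"): frozenset("G"),    # (3-6)
    ("K", "T", "D"): frozenset("G"),    # (3-7)
    ("F", "S", "N"): frozenset("A"),    # (3-8)
    ("V", "N", "S"): frozenset("C"),    # (3-9)
    ("F", "T", "N"): frozenset("A"),    # (3-10)
    ("I", "N", "D"): frozenset("UA"),   # (3-11)
    ("T", "T", "N"): frozenset("A"),    # (3-12)
    ("I", "M", "D"): frozenset("UC"),   # (3-13)
    ("F", "P", "D"): frozenset("U"),    # (3-14)
    ("Y", "P", "D"): frozenset("U"),    # (3-15)
    ("L", "T", "D"): frozenset("G"),    # (3-16)
}


def bases_for_motif(motif: str):
    """Look up the bases bound by a motif from its A2, A5, and last (L) residues.

    Positions are 1-based in the text, so A2 is index 1 and A5 is index 4.
    Returns None when the combination is not among the sixteen listed.
    """
    if len(motif) < 5:
        raise ValueError("motif too short to contain A5")
    key = (motif[1], motif[4], motif[-1])
    return PPR_CODE.get(key)


# Placeholder 35-residue motif with A2 = V, A5 = N, L = D.
example_motif = "AV" + "AA" + "N" + "A" * 29 + "D"
print(bases_for_motif(example_motif))
```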
[0023] (Modification of the C-terminal domain) This embodiment relates to a method for modifying a PPR editor protein having a C-terminal domain, comprising the following steps a or b: a. Substituting at least one selected from position 3 of the E1 motif, position 3 of the E2 motif, and position 1 of the PG domain in a C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a PG domain; or b. Substituting at least one selected from position 3 of the P2 motif, position 10 of the L2 motif, position 1 of the E1 motif, and position 13 of the E2 motif in a C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a WW domain.
[0024] Based on the model structure of the DYW protein available from the AlphaFold server (Non-Patent Document 7, cited below), the amino acid at position 3 of the PPR motif interacts with the first helix of the upstream PPR motif. Therefore, mutating the amino acid at position 3 may alter the positions of the first helix of two adjacent PPR motifs, potentially altering the positions of amino acids (A5 and L) involved in specific nucleic acid recognition. This could indirectly affect the specificity of RNA recognition and promote RNA dissociation.
[0025] On the other hand, mutations in amino acids at positions 1, 10, and 13 are not thought to have a direct effect on RNA recognition. The amino acid at position 1 of the E1 motif and the amino acid at position 10 of the L2 motif may alter the structure of the PPR / RNA complex in the C-terminal domain by binding to the upstream PPR-like motif. In addition, the amino acid at position 13 of the E2 motif may interact with a region called the gating domain, which controls catalysis within the DYW domain.
[0026] In one embodiment, the substitution made in step a is at least one substitution selected from the following: substitution of the amino acid at position 3 of the E1 motif with T or an amino acid similar in nature to T; substitution of the amino acid at position 3 of the E2 motif with A or an amino acid similar in nature to A; and substitution of the amino acid at position 1 of the PG domain with R or an amino acid similar in nature to R. The substitution made in step b is at least one substitution selected from the following: substitution of the amino acid at position 3 of the P2 motif with S or an amino acid similar in nature to S; substitution of the amino acid at position 10 of the L2 motif with A or an amino acid similar in nature to A; substitution of the amino acid at position 1 of the E1 motif with A or an amino acid similar in nature to A; and substitution of the amino acid at position 13 of the E2 motif with A or an amino acid similar in nature to A.
[0027] Amino acids with properties similar to T (threonine) include, from the perspective that threonine is a nonhydrophobic amino acid, arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, serine, cysteine, histidine, and methionine; from the perspective that threonine is a hydrophilic amino acid, arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, and serine; and from the perspective that threonine is a neutral amino acid, alanine, asparagine, cysteine, glutamine, glycine, isoleucine, leucine, methionine, phenylalanine, proline, serine, tryptophan, tyrosine, and valine.
[0028] Amino acids with properties similar to A (alanine) include, from the perspective that alanine is a hydrophobic (nonpolar) amino acid, valine, glycine, isoleucine, leucine, phenylalanine, proline, tryptophan, and tyrosine; and from the perspective that alanine is a neutral amino acid, asparagine, cysteine, glutamine, glycine, isoleucine, leucine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine; and sulfur-containing amino acids: methionine and cysteine.
[0029] From the perspective that arginine is a non-hydrophobic amino acid, the amino acids with similar properties to R (arginine) are asparagine, aspartic acid, glutamic acid, glutamine, lysine, serine, threonine, cysteine, histidine, and methionine; from the perspective that arginine is a hydrophilic amino acid, the amino acids with similar properties are asparagine, aspartic acid, glutamic acid, glutamine, lysine, serine, and threonine; and from the perspective that arginine is a basic amino acid, the amino acids with similar properties are lysine and histidine.
[0030] Amino acids with properties similar to serine (S) include, from the perspective that serine is a nonhydrophobic amino acid, arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, threonine, cysteine, histidine, and methionine; from the perspective that serine is a hydrophilic amino acid, arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, and threonine; and from the perspective that serine is a neutral amino acid, alanine, asparagine, cysteine, glutamine, glycine, isoleucine, leucine, methionine, phenylalanine, proline, threonine, tryptophan, tyrosine, and valine.
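The lists of "similar" amino acids for T, A, R, and S follow from a small set of property groups. The Python sketch below encodes those groups with one-letter codes, as inferred from the lists in this description, and derives the similar residues for a given amino acid. The names `PROPERTY_GROUPS` and `similar_residues` are hypothetical. The function returns only groups the residue itself belongs to, so the sulfur-containing amino acids that the text mentions alongside alanine are not reproduced here.

```python
# Property groups inferred from the lists given for T, A, R, and S.
PROPERTY_GROUPS = {
    "nonhydrophobic": set("RNDEQKSTCHM"),
    "hydrophilic": set("RNDEQKST"),
    "neutral": set("ANCQGILMFPSTWYV"),
    "hydrophobic": set("AVGILFPWY"),
    "basic": set("RKH"),
}


def similar_residues(aa: str) -> dict[str, set[str]]:
    """For each property group containing `aa`, return the other members."""
    return {
        name: members - {aa}
        for name, members in PROPERTY_GROUPS.items()
        if aa in members
    }


# Residues similar to arginine, grouped by the property they share with it.
for group, residues in similar_residues("R").items():
    print(group, "".join(sorted(residues)))
```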
[0031] In one embodiment, the substitution performed in step a is at least one substitution selected from 3T of the E1 motif, 3A of the E2 motif, and 1R of the PG domain, and the substitution performed in step b is at least one substitution selected from 3S of the P2 motif, 10A of the L2 motif, 1A of the E1 motif, and 13A of the E2 motif.
[0032] In one embodiment, the substitution performed in step a is at least one substitution selected from V3T of the E1 motif, P3A of the E2 motif, and E1R of the PG domain, and the substitution performed in step b is at least one substitution selected from T3S of the P2 motif, V10A of the L2 motif, L1A of the E1 motif, and D13A of the E2 motif. A particularly preferred embodiment, in that it provides a particularly high desired effect, is one in which the substitution performed in step a is a combination of P3A of the E2 motif with V3T of the E1 motif or E1R of the PG domain, and the substitution performed in step b is at least one selected from T3S of the P2 motif and D13A of the E2 motif.
[0033] In a preferred embodiment, the substitution performed in step a includes at least a substitution of P3A in the E2 motif. One particularly preferred embodiment of the C-terminal domain is the sequence of SEQ ID NO: 6. SEQ ID NO: 6 shows a sequence in which the P3A substitution has been made in the E2 motif of a C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a PG domain.
[0034] In a preferred embodiment, the substitution performed in step b includes at least a substitution of D13A in the E2 motif. One particularly preferred embodiment of the C-terminal domain is the sequence of SEQ ID NO: 10. SEQ ID NO: 10 shows a sequence in which the D13A substitution has been made in the E2 motif of a C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a WW domain.
[0035] (P2, L2, S2, E1, E2) The total length of P2 is not particularly limited as long as it can bind to the target base, but is for example 33 to 37 amino acids long, preferably 34 to 36 amino acids long, and more preferably 35 amino acids long.
[0036] The total length of L2 is not particularly limited as long as it can bind to the target base, but is, for example, 34 to 38 amino acids long, preferably 35 to 37 amino acids long, and more preferably 36 amino acids long.
[0037] The total length of S2 is not particularly limited as long as it can bind to the target base, but is, for example, 30 to 34 amino acids long, preferably 31 to 33 amino acids long, and more preferably 32 amino acids long.
[0038] The S2 motif correlates with the nucleotide corresponding to L (Takenaka, M. et al. (2013). PLoS One 8:e65343.). Furthermore, it can be incorporated into PLS-type PPR proteins by noting that, in the target sequence of the PPR editor protein, the C located four positions downstream of the base targeted by the S2 motif is the base edited by the DYW domain.
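The positional rule for the editing site can be written as a small index calculation. This Python sketch assumes that "four positions downstream" means an offset of +4 nucleotides in the 5' to 3' direction from the base aligned with the S2 motif, and that positions are 0-based string indices. The function name `editing_site_index` and the example RNA string are hypothetical.

```python
def editing_site_index(rna: str, s2_base_index: int) -> int:
    """Return the 0-based index of the C expected to be edited by the DYW domain.

    Assumes the edited base lies 4 nucleotides downstream (3' side) of the
    base aligned with the S2 motif; raises ValueError if that base is not C.
    """
    site = s2_base_index + 4
    if not 0 <= s2_base_index < len(rna) or site >= len(rna):
        raise ValueError("editing site falls outside the RNA sequence")
    if rna[site] != "C":
        raise ValueError(f"expected C at index {site}, found {rna[site]}")
    return site


# Made-up RNA fragment: the base aligned with S2 is index 4, so the site is index 8.
rna = "AUGGCAUACGU"
print(editing_site_index(rna, 4))
```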
[0039] The total length of E1 is not particularly limited as long as it can bind to the target base, but is, for example, 32 to 36 amino acids long, preferably 33 to 35 amino acids long, and more preferably 34 amino acids long.
[0040] In the E1 motif, correlation with nucleotides is observed only with the A5 amino acid (Ruwe et al. (2019) New Phytol. 222 218-229).
[0041] The total length of E2 is not particularly limited as long as it can bind to the target base, but is, for example, 32 to 36 amino acids long, preferably 33 to 35 amino acids long, and more preferably 34 amino acids long.
[0042] The A5 and last amino acid in the E2 motif are highly conserved and are not involved in the recognition of specific PPR-RNAs (Non-Patent Literature 2).
[0043] The frequency of amino acids in the E2 motif selected from those used with DYW:PG, and the frequency of amino acids in the E2 motif selected from those used with DYW:WW, are visualized using sequence logos created with WebLogo and are shown in Figures 4-3 and 4-4, respectively. From Figures 4-3 and 4-4, the conserved and non-conserved positions in the E2 sequence can be understood.
[0044] Generally, the amino acid sequence of the E2 motif used with DYW:PG is as follows:
[0045] AAxYVLLSNIYAAAGRWDExAKVRKLMKERGVKK (SEQ ID NO:18)
[0046] The amino acid sequence of the E2 motif commonly used with DYW:WW is as follows:
[0047] AAAYVLMSNIYADAHMWEERDKIQAMRKNARAWK (SEQ ID NO:19)
[0048] In the above sequence, x independently represents any amino acid. Specifically, x can be selected from alanine, valine, glycine, isoleucine, leucine, phenylalanine, proline, tryptophan, tyrosine, arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, serine, threonine, cysteine, histidine, and methionine.
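Substitutions in this description are written as wild-type residue, 1-based position within the motif, and new residue (for example, D13A of the E2 motif). The Python sketch below applies such a substitution to a motif sequence and checks that the stated wild-type residue is actually present. It uses the general E2 sequence given for DYW:WW (SEQ ID NO: 19) purely as an illustration; that sequence is not necessarily the E2 motif of the 14P-WW protein. The function name `apply_substitution` is hypothetical.

```python
import re

# General E2 motif sequence used with DYW:WW, as given for SEQ ID NO: 19.
E2_WW_GENERAL = "AAAYVLMSNIYADAHMWEERDKIQAMRKNARAWK"


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply a substitution such as 'D13A' (wild type, 1-based position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse substitution {mutation!r}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} outside sequence of length {len(seq)}")
    if seq[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[position - 1]}"
        )
    return seq[: position - 1] + new + seq[position:]


mutant = apply_substitution(E2_WW_GENERAL, "D13A")
print(mutant)
```

Checking the wild-type residue before substituting guards against applying a motif-relative position to the wrong motif or to a full-length sequence with different numbering.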
[0049] (DYW domain) The DYW domain used in this embodiment has a more detailed structure specified and provided by the method described in Patent Document 1, and consists of any of the following amino acid sequences. x a1 PGx a2 SWIEx a3 -x a16 HP - First linking portion - Hx aa E - Second linking portion - Cx a17 x a18 CH - Third linking portion - DYW x b1 PGx b2 SWWTDx b3 -x b16 HP - First linking portion - Hx bb E - Second linking portion - Cx b17 x b18 CH - Third linking portion - DYW
[0050] In these sequences, each x independently represents any amino acid, and the first linking portion, the second linking portion, and the third linking portion each independently represent a polypeptide fragment consisting of an amino acid sequence of any length. The DYW domain consisting of x_a1 PG x_a2 SWIE x_a3-x_a16 HP - first linking portion - H x_aa E - second linking portion - C x_a17 x_a18 CH - third linking portion - DYW is referred to as DYW:PG (sometimes simply PG, or the PG domain). The DYW domain consisting of x_b1 PG x_b2 SWWTD x_b3-x_b16 HP - first linking portion - H x_bb E - second linking portion - C x_b17 x_b18 CH - third linking portion - DYW is referred to as DYW:WW (sometimes simply WW, or the WW domain).
[0051] The DYW domain has three regions: a region containing a PG box consisting of approximately 15 amino acids at the N-terminus, a zinc-binding domain in the central part (HxEx_nCxxCH, where x_n is a sequence of any number n of amino acids), and DYW at the C-terminus. The zinc-binding domain can be further divided into the HxE region and the CxxCH region. These regions of each DYW domain can be represented as shown in the table below. Since x is any amino acid, it can be selected from alanine, valine, glycine, isoleucine, leucine, phenylalanine, proline, tryptophan, tyrosine, arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, serine, threonine, cysteine, histidine, and methionine.
[0052]
[0053] (DYW:PG) DYW:PG is a polypeptide consisting of x_a1 PG x_a2 SWIE x_a3-x_a16 HP - first linking portion - H x_aa E - second linking portion - C x_a17 x_a18 CH - third linking portion - DYW. Preferably, this polypeptide has sequence identity (detailed in the [Terminology] section) with the sequence at positions 172-307 of SEQ ID NO: 1, and has C-to-U editing activity when used in the C-terminal domain. In this case, regardless of the sequence identity, it is preferable that at least one selected from position 3 of the E1 motif, position 3 of the E2 motif, and position 1 of the PG domain is substituted in the sequence of SEQ ID NO: 1; it is more preferable that at least one substitution selected from T at position 3 of the E1 motif, A at position 3 of the E2 motif, and R at position 1 of the PG domain is made; and it is even more preferable that at least the P3A substitution of the E2 motif is made. One particularly preferred embodiment of the DYW:PG domain used in this embodiment is that which consists of the sequence at positions 172-307 of SEQ ID NO: 6.
[0054] The total length of DYW:PG is not particularly limited as long as it exhibits C-to-U editing activity, but is for example 110 to 160 amino acids long, preferably 124 to 148 amino acids long, more preferably 128 to 144 amino acids long, and even more preferably 132 to 140 amino acids long, for example 136 amino acids long.
[0055] In the region containing the PG box of DYW:PG (x_a1 PG x_a2 SWIE x_a3-x_a16 HP): x_a1 is not particularly limited as long as C-to-U editing activity as DYW:PG is exhibited, but is preferably E (glutamic acid) or an amino acid with similar properties, and more preferably E. x_a2 is not particularly limited as long as C-to-U editing activity as DYW:PG is exhibited, but is preferably C (cysteine) or an amino acid with similar properties, and more preferably C. Each of x_a3 to x_a16 is not particularly limited as long as C-to-U editing activity as DYW:PG is exhibited, but is preferably the same as or similar in properties to the corresponding amino acid at positions 180-193 in the sequence of SEQ ID NO: 6, and more preferably is the same as the corresponding amino acid at positions 180-193 in the sequence of SEQ ID NO: 6.
[0056] In one preferred embodiment, the HxE region of DYW:PG is HSE regardless of the other regions.
[0057] In the CxxCH region of DYW:PG, i.e., C x_a17 x_a18 CH: x_a17 is not particularly limited as long as C-to-U editing activity as DYW:PG is exhibited, but is preferably G (glycine) or an amino acid with similar properties, and more preferably G. x_a18 is not particularly limited as long as C-to-U editing activity as DYW:PG is exhibited, but is preferably D (aspartic acid) or an amino acid with similar properties, and more preferably D.
[0058] In DYW:PG, the portion connecting the region containing the PG box and the H x_aa E region, the portion connecting the H x_aa E region and the CxxCH region, and the portion connecting the CxxCH region and DYW are referred to as the first linking portion, the second linking portion, and the third linking portion, respectively (the same applies to other DYW domains).
[0059] The total length of the first linking portion of DYW:PG is not particularly limited as long as it exhibits C-to-U editing activity as DYW:PG, but is, for example, 39 to 47 amino acids long, preferably 40 to 46 amino acids long, more preferably 41 to 45 amino acids long, and even more preferably 42 to 44 amino acids long. The amino acid sequence of the first linking portion is not particularly limited as long as it exhibits C-to-U editing activity as DYW:PG, but is preferably the same as the portion at positions 196 to 238 of the sequence of SEQ ID NO: 6, or a sequence in which 1 to 22 amino acids are substituted, deleted, or added in that subsequence, or a sequence having sequence identity with that subsequence, and more preferably the same as that subsequence.
[0060] One preferred embodiment of the first linking portion of DYW:PG is a polypeptide represented by the following formula, which is 43 amino acids long, regardless of the sequence of the other parts of the DYW domain.
[0061] N_a25-N_a26-N_a27- … -N_a65-N_a66-N_a67
[0062] The polypeptide described above is preferably a sequence that is the same as the portion of the sequence of SEQ ID NO: 6 from positions 196 to 238, or a sequence in which multiple amino acids are substituted in that subsequence, and which can exhibit C-to-U editing activity as DYW:PG. In this case, it is preferable that the amino acids with large bits values at the corresponding positions in Figure 4-1 (for example, N_a29, N_a30, N_a32, N_a33, N_a35, N_a36, N_a40, N_a44, N_a45, N_a47, N_a48, N_a52, N_a53, N_a54, N_a55, N_a58, N_a61, N_a65, N_a67) are kept as shown in that figure, and that the substitutions are made at the other amino acids.
[0063] The total length of the second linking portion of DYW:PG is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:PG. For example, it is 21 to 29 amino acids long, preferably 22 to 28 amino acids long, more preferably 23 to 27 amino acids long, and still more preferably 24 to 26 amino acids long. The amino acid sequence of the second linking portion is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:PG. Preferably, it is the same as the portion at positions 242 to 266 of the sequence of SEQ ID NO: 6, or a sequence in which 1 to 13 amino acids are substituted, deleted, or added in that subsequence, or a sequence having sequence identity with that subsequence. More preferably, it is the same as that subsequence.
[0064] One preferred embodiment of the second linking portion of DYW:PG is a polypeptide represented by the following formula, which is 25 amino acids long, regardless of the sequence of the other parts of the DYW domain.
[0065] N_a71-N_a72-N_a73- … -N_a93-N_a94-N_a95
[0066] The polypeptide described above is preferably the same as the portion at positions 242 to 266 of the sequence of SEQ ID NO: 6, or a sequence in which multiple amino acids are substituted in that subsequence, and which can exhibit C-to-U editing activity as DYW:PG. In this case, it is preferable that the amino acids with large bits values at the corresponding positions in Figure 4-1 (for example, N_a71, N_a72, N_a73, N_a76, N_a77, N_a78, N_a79, N_a81, N_a82, N_a86, N_a88, N_a89, N_a91, N_a92, N_a93, N_a94) are kept as shown in that figure, and that the substitutions are made at the other amino acids.
[0067] The total length of the third linking portion of DYW:PG is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:PG, but is, for example, 29 to 37 amino acids long, preferably 30 to 36 amino acids long, more preferably 31 to 35 amino acids long, and even more preferably 32 to 34 amino acids long. The amino acid sequence of the third linking portion is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:PG, but is preferably the same as the portion of the sequence of SEQ ID NO: 6 at positions 272 to 304, or a sequence in which 1 to 17 amino acids are substituted, deleted, or added in that subsequence, or a sequence having sequence identity with that subsequence, and more preferably the same as that subsequence.
[0068] One preferred embodiment of the third linking portion of DYW:PG is a polypeptide represented by the following formula, which is 33 amino acids long, regardless of the sequence of the other parts of the DYW domain.
[0069] N_a101-N_a102-N_a103- … -N_a131-N_a132-N_a133
[0070] The polypeptide described above is preferably a sequence that is the same as the portion of the sequence of SEQ ID NO: 6 at positions 272-304, or a sequence in which multiple amino acids are substituted in that subsequence, and which can exhibit C-to-U editing activity as DYW:PG. In this case, it is preferable that the amino acids with large bits values at the corresponding positions in Figure 4 (for example, N_a102, N_a104, N_a107, N_a112, N_a114, N_a117, N_a118, N_a121, N_a122, N_a123, N_a124, N_a125, N_a128, N_a130, N_a131, N_a132) are kept as shown in that figure, and that the substitutions are made at the other amino acids.
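The positional ranges quoted above for DYW:PG can be cross-checked with a short calculation. The sketch below assumes that the segments of the DYW:PG formula are contiguous, that the domain starts at position 172, and that each linking portion has the length of its preferred embodiment (43, 25, and 33 amino acids). The names `PG_SEGMENTS` and `layout` are introduced only for this illustration.

```python
# Segment lengths of DYW:PG in N- to C-terminal order. The fixed residues come
# from the formula in the text; the linking-portion lengths (43, 25, 33) are
# the preferred embodiments stated above.
PG_SEGMENTS = [
    ("x_a1", 1),
    ("PG", 2),
    ("x_a2", 1),
    ("SWIE", 4),
    ("x_a3-x_a16", 14),
    ("HP", 2),
    ("first linking portion", 43),
    ("HxE", 3),
    ("second linking portion", 25),
    ("CxxCH", 5),
    ("third linking portion", 33),
    ("DYW", 3),
]


def layout(segments, start):
    """Map each segment name to its inclusive (first, last) 1-based positions."""
    coords = {}
    pos = start
    for name, length in segments:
        coords[name] = (pos, pos + length - 1)
        pos += length
    return coords


coords = layout(PG_SEGMENTS, start=172)  # DYW:PG begins at position 172
for name, (first, last) in coords.items():
    print(f"{name:24s} {first}-{last}")

total = sum(length for _, length in PG_SEGMENTS)
print("total length:", total)
```

Under these assumptions the computed ranges reproduce the positions given in the text (180-193, 196-238, 242-266, 272-304), and the total of 136 amino acids matches the example length of DYW:PG.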
[0071] (DYW:WW) DYW:WW is a polypeptide consisting of x_b1 PG x_b2 SWWTD x_b3-x_b16 HP - first linking portion - H x_bb E - second linking portion - C x_b17 x_b18 CH - third linking portion - DYW. Preferably, this polypeptide has sequence identity (detailed in the [Terminology] section) with the sequence at positions 172-308 of SEQ ID NO: 2, and has C-to-U editing activity when used in the C-terminal domain. In this case, regardless of the sequence identity, it is preferable that at least one selected from position 3 of the P2 motif, position 10 of the L2 motif, position 1 of the E1 motif, and position 13 of the E2 motif is substituted in the sequence of SEQ ID NO: 2; it is more preferable that at least one substitution selected from S at position 3 of the P2 motif, A at position 10 of the L2 motif, A at position 1 of the E1 motif, and A at position 13 of the E2 motif is made; and it is even more preferable that at least the D13A substitution of the E2 motif is made. One particularly preferred embodiment of the DYW:WW domain used in this embodiment is that which consists of the sequence at positions 172-308 of SEQ ID NO: 10.
[0072] The total length of DYW:WW is not particularly limited as long as it exhibits C-to-U editing activity, but is, for example, 110 to 160 amino acids long, preferably 125 to 149 amino acids long, more preferably 129 to 145 amino acids long, and even more preferably 133 to 141 amino acids long, for example 137 amino acids long.
[0073] In the region containing the PG box of DYW:WW, i.e., x_b1 PG x_b2 SWWTD x_b3-x_b16 HP, the portion consisting of WTD may also be WSD.
[0074] In the region containing the PG box of DYW:WW: x_b1 is not particularly limited as long as C-to-U editing activity as DYW:WW is exhibited, but is preferably K (lysine) or an amino acid with similar properties, and more preferably K. x_b2 is not particularly limited as long as C-to-U editing activity as DYW:WW is exhibited, but is preferably Q (glutamine) or an amino acid with similar properties, and more preferably Q. Each of x_b3 to x_b16 is not particularly limited as long as C-to-U editing activity as DYW:WW is exhibited, but is preferably the same as or similar in properties to the corresponding amino acid at positions 181-194 in the sequence of SEQ ID NO: 10, and more preferably is the same as the corresponding amino acid at positions 181-194 in the sequence of SEQ ID NO: 10.
[0075] In one preferred embodiment, the HxE region of DYW:WW is HSE regardless of the arrangement of the other parts.
[0076] In the CxxCH region of DYW:WW, i.e., C x_b17 x_b18 CH: x_b17 is not particularly limited as long as C-to-U editing activity as DYW:WW is exhibited, but is preferably D (aspartic acid) or an amino acid with similar properties, and more preferably D. x_b18 is not particularly limited as long as C-to-U editing activity as DYW:WW is exhibited, but is preferably D or an amino acid with similar properties, and more preferably D.
[0077] The total length of the first linking portion of DYW:WW is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:WW, but is, for example, 39 to 47 amino acids long, preferably 40 to 46 amino acids long, more preferably 41 to 45 amino acids long, and even more preferably 42 to 44 amino acids long. The amino acid sequence of the first linking portion is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:WW, but is preferably the same as the portion of the sequence of SEQ ID NO: 10 at positions 197 to 239, or a sequence in which 1 to 22 amino acids are substituted, deleted, or added in that subsequence, or a sequence having sequence identity with that subsequence, and more preferably the same as that subsequence.
[0078] One preferred embodiment of the first linking portion of DYW:WW is a polypeptide represented by the following formula, which is 43 amino acids long, regardless of the sequence of the other parts of the DYW domain.
[0079] N_b26-N_b27-N_b28- … -N_b66-N_b67-N_b68
[0080] The polypeptide described above is preferably a sequence that is the same as the portion of the sequence of SEQ ID NO: 10 from positions 197 to 239, or a sequence in which multiple amino acids are substituted in that subsequence, and which can exhibit C-to-U editing activity as DYW:WW. In this case, it is preferable that the amino acids with large bits values at the corresponding positions in Figure 4-2 (for example, N_b26, N_b30, N_b33, N_b34, N_b37, N_b41, N_b45, N_b46, N_b48, N_b49, N_b51, N_b52, N_b53, N_b55, N_b56, N_b57, N_b59, N_b61, N_b62, N_b63, N_b64, N_b66, N_b67, N_b68) are kept as shown in that figure, and that the substitutions are made at the other amino acids.
[0081] The total length of the second linking portion of DYW:WW is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:WW, but is, for example, 21 to 29 amino acids long, preferably 22 to 28 amino acids long, more preferably 23 to 27 amino acids long, and even more preferably 24 to 26 amino acids long. The amino acid sequence of the second linking portion is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:WW, but is preferably the same as the portion at positions 243 to 267 of the sequence of SEQ ID NO: 10, or a sequence in which 1 to 13 amino acids are substituted, deleted, or added in that subsequence, or a sequence having sequence identity with that subsequence, and more preferably the same as that subsequence.
[0082] One preferred embodiment of the second linking portion of DYW:WW is a polypeptide represented by the following formula, which is 25 amino acids long, regardless of the sequence of the other parts of the DYW domain.
[0083] N_b72-N_b73-N_b74- … -N_b94-N_b95-N_b96
[0084] The polypeptide described above is preferably the same as the portion at positions 243-267 of the sequence of SEQ ID NO: 10, or a sequence in which multiple amino acids are substituted in that subsequence, and which can exhibit C-to-U editing activity as DYW:WW. In this case, it is preferable that the amino acids with large bits values at the corresponding positions in Figure 4-2 (for example, N_b72, N_b73, N_b74, N_b75, N_b77, N_b78, N_b79, N_b81, N_b82, N_b84, N_b88, N_b89, N_b90, N_b91, N_b92, N_b93, N_b94, N_b95, N_b96) are kept as shown in that figure, and that the substitutions are made at the other amino acids.
[0085] The total length of the third linking portion of DYW:WW is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:WW, but is, for example, 29 to 37 amino acids long, preferably 30 to 36 amino acids long, more preferably 31 to 35 amino acids long, and even more preferably 32 to 34 amino acids long. The amino acid sequence of the third linking portion is not particularly limited as long as it can exhibit C-to-U editing activity as DYW:WW, but is preferably the same as the portion of the sequence of SEQ ID NO: 10 at positions 273 to 305, or a sequence in which 1 to 17 amino acids are substituted, deleted, or added in that subsequence, or a sequence having sequence identity with that subsequence, and more preferably the same as that subsequence.
[0086] One preferred embodiment of the third linking portion of DYW:WW is a polypeptide represented by the following formula, which is 33 amino acids long, regardless of the sequence of the other parts of the DYW domain.
[0087] N_b102-N_b103-N_b104- … -N_b132-N_b133-N_b134
[0088] The polypeptide described above is preferably the same as the portion at positions 273-305 of the sequence of SEQ ID NO: 10, or a sequence in which multiple amino acids are substituted in that subsequence, and which can exhibit C-to-U editing activity as DYW:WW. In this case, it is preferable that the amino acids with large bits values at the corresponding positions in Figure 4-2 (for example, N_b104, N_b105, N_b107, N_b108, N_b109, N_b110, N_b111, N_b113, N_b115, N_b116, N_b117, N_b118, N_b119, N_b121, N_b122, N_b123, N_b124, N_b126, N_b129, N_b131, N_b132, N_b133, N_b134) are kept as shown in that figure, and that the substitutions are made at the other amino acids.
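The N subscripts used in the linking-portion formulas and the sequence positions quoted alongside them appear to differ by a constant. The following sketch checks this for the six linking portions described above. It assumes that the N subscripts count residues from the first residue of the DYW domain, which the text does not state explicitly; `LINKING_PORTIONS`, `index_offsets`, and `position_of` are names invented for this example.

```python
# (first N index, last N index) and (first position, last position) pairs quoted
# from the text. PG positions refer to SEQ ID NO: 6, WW positions to SEQ ID NO: 10.
LINKING_PORTIONS = {
    "PG first": ((25, 67), (196, 238)),
    "PG second": ((71, 95), (242, 266)),
    "PG third": ((101, 133), (272, 304)),
    "WW first": ((26, 68), (197, 239)),
    "WW second": ((72, 96), (243, 267)),
    "WW third": ((102, 134), (273, 305)),
}


def index_offsets(portions):
    """Collect (position - N index) for both ends of every linking portion."""
    found = set()
    for (n_first, n_last), (p_first, p_last) in portions.values():
        found.add(p_first - n_first)
        found.add(p_last - n_last)
    return found


def position_of(n_index, offset=171):
    """Convert an N subscript to a sequence position, assuming a constant offset."""
    return n_index + offset


print(index_offsets(LINKING_PORTIONS))  # a single value means the offset is constant
print(position_of(29))                  # sequence position corresponding to N_a29
```

A single offset of 171 is consistent with both DYW domains starting at position 172, so that subscript 1 corresponds to the first residue of the domain.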
[0089] (Translation Promotion) This embodiment also relates to a method for promoting the translation of edited RNA, wherein the modification method described above is carried out; the obtained modified PPR editor protein is bound to a target RNA to perform RNA editing; and a protein is translated from the edited RNA, and at this time the translation efficiency of the protein is higher than the translation efficiency when an unmodified PPR editor protein is used.
[0090] The inventors have constructed a mechanism to monitor whether edited RNA is being translated correctly. Specifically, they focused on the CTNNB1 gene, which encodes β-catenin involved in the Wnt / β-catenin pathway. In the absence of Wnt protein, β-catenin is phosphorylated and degraded via ubiquitination (Non-Patent Literature 6) (Figure 1). On the other hand, in the presence of Wnt protein, Wnt activates proteins that inhibit the phosphorylation of β-catenin, so β-catenin is not degraded and accumulates in the cytoplasm. The accumulated β-catenin then translocates to the nucleus and forms a complex with TCF / LEF family transcription factors to promote downstream gene expression. Substitution of threonine to isoleucine at amino acid 41 of β-catenin inhibits phosphorylation and causes β-catenin accumulation. The amount of β-catenin in the nucleus can be monitored by the expression level of a reporter gene under the control of the TCF / LEF promoter.
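To illustrate one way a C-to-U edit could yield the threonine-to-isoleucine substitution mentioned above, the sketch below applies a C-to-U change to the second base of each threonine codon using the standard genetic code. The specification does not state which codon encodes threonine 41 in CTNNB1, so all four threonine codons are shown; `edit_c_to_u` and the partial `CODON_TABLE` are written only for this example.

```python
# Subset of the standard genetic code (RNA codons) needed for this illustration.
CODON_TABLE = {
    "ACU": "T", "ACC": "T", "ACA": "T", "ACG": "T",  # threonine
    "AUU": "I", "AUC": "I", "AUA": "I",              # isoleucine
    "AUG": "M",                                      # methionine
}


def edit_c_to_u(codon: str, index: int) -> str:
    """Return the codon after C-to-U editing at the given 0-based index."""
    if codon[index] != "C":
        raise ValueError(f"no C at index {index} of {codon}")
    return codon[:index] + "U" + codon[index + 1:]


for codon in ("ACU", "ACC", "ACA", "ACG"):
    edited = edit_c_to_u(codon, 1)  # edit the second base of the codon
    print(codon, CODON_TABLE[codon], "->", edited, CODON_TABLE[edited])
```

Three of the four threonine codons become isoleucine codons after this edit; ACG instead becomes AUG, which encodes methionine.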
[0091] Furthermore, the translation efficiency of RNA edited by the PPR editor protein can be confirmed by co-transfecting HEK293T cells with an effector plasmid encoding the PPR editor protein and a reporter plasmid expressing NanoLuc luciferase (Nluc) under the control of the TCF / LEF promoter. If RNA editing occurs but Nluc activity is absent, this suggests that the PPR editor protein does not dissociate from the edited RNA and that translation from that RNA is inhibited. On the other hand, if both RNA editing and Nluc activity are present, this suggests that the PPR editor protein dissociated from the edited RNA and β-catenin was translated. Firefly luciferase (Fluc) is a constitutively expressed reporter and is introduced to correct for experimental variation such as transfection efficiency. TCF / LEF transcriptional activation by β-catenin, the product of CTNNB1, can be evaluated by the value obtained by dividing the luminescence of Nluc by the luminescence of Fluc (Nluc / Fluc).
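The Nluc / Fluc normalization described above amounts to a simple ratio. The following sketch shows the arithmetic with made-up luminescence readings; the numbers, the condition labels, and the function names `normalized_activity` and `fold_change` are assumptions for illustration and are not data from the specification.

```python
def normalized_activity(nluc: float, fluc: float) -> float:
    """Return Nluc luminescence divided by Fluc luminescence."""
    if fluc <= 0:
        raise ValueError("Fluc luminescence must be positive")
    return nluc / fluc


def fold_change(sample_ratio: float, control_ratio: float) -> float:
    """Return the sample Nluc/Fluc ratio relative to the control ratio."""
    return sample_ratio / control_ratio


# Made-up readings (arbitrary units) as (Nluc, Fluc) pairs, for illustration only.
readings = {
    "unmodified editor": (10000.0, 40000.0),
    "modified editor": (45000.0, 30000.0),
}

ratios = {name: normalized_activity(n, f) for name, (n, f) in readings.items()}
for name, ratio in ratios.items():
    print(f"{name}: Nluc/Fluc = {ratio:.2f}")

print("fold change:", fold_change(ratios["modified editor"], ratios["unmodified editor"]))
```

Dividing by the Fluc signal means that a well with lower transfection efficiency is not mistaken for one with lower TCF / LEF activation.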
[0092] To more directly evaluate the translation efficiency of edited RNA, measuring the accumulation of β-catenin protein is effective. The accumulation can be measured by Western blotting, and quantitative comparison is possible by normalizing it using total protein amount.
[0093] The modification of this embodiment can alter the binding affinity to RNA, and can indirectly promote dissociation from the edited RNA by altering the interaction between the PPR editor protein and RNA or the flexibility of the C-terminal domain.
[0094] II. Modified PPR Editor Protein (Embodiment 2) This embodiment relates to a PPR editor protein having a portion consisting of 2 to 20 PPR motifs and a C-terminal domain consisting of a polypeptide of the following a or b:
a. A polypeptide consisting of a sequence in which at least one selected from V3 of the E1 motif, P3 of the E2 motif, and E1 of the PG domain is substituted in the amino acid sequence of the C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a PG domain; or
b. A polypeptide consisting of a sequence in which at least one selected from T3 of the P2 motif, V10 of the L2 motif, L1 of the E1 motif, and D13 of the E2 motif is substituted in the amino acid sequence of the C-terminal domain consisting of a P2 motif, an L2 motif, an S2 motif, an E1 motif, an E2 motif, and a WW domain.
[0095] The portion consisting of 2 to 20 PPR motifs is a binding region comprising an array of PPR motifs (P array or PLS array) capable of binding to the target RNA. The C-terminal domain is as described in Embodiment 1.
[0096] (P Array) P-type PPR proteins consist of a simple repeat (P array) of a standard 35-amino acid PPR motif (P). The P-type PPR motif is as described in Embodiment 1.
[0097] (PLS array) PLS-type PPR proteins are composed of repeating PPR motifs (PLS array) consisting of P1, L1, and S1.
[0098] The total length of P1 is not particularly limited as long as it can bind to the target base, but is for example 33 to 37 amino acids long, preferably 34 to 36 amino acids long, and more preferably 35 amino acids long. The total length of L1 is not particularly limited as long as it can bind to the target base, but is for example 33 to 37 amino acids long, preferably 34 to 36 amino acids long, and more preferably 35 amino acids long. The total length of S1 is not particularly limited as long as it can bind to the target base, but is for example 30 to 33 amino acids long, preferably 30 to 32 amino acids long, and more preferably 31 amino acids long.
[0099] In PLS-type PPR proteins, the P1L1S1 repeat portion and the portion up to P2 can be designed according to the PPR-code rules described above, depending on the sequence of the target RNA.
[0100] The number of P1L1S1 repeats is not particularly limited as long as it can bind to the target base sequence, but is for example 1 to 5, preferably 2 to 4, and more preferably 3. In principle, even one unit (3 repeats) can be used. MEF8 (L1-S1-P2-L2-S2-E-DYW), which consists of 5 PPR motifs, is known to be involved in approximately 60 editing sites.
[0101] In natural PPR proteins, the first and last P1L1S1 units show clear differences in the amino acid residues at specific positions, distinguishing them from the internal P1L1S1 units. From the perspective of designing an artificial PLS array that is as close as possible to naturally occurring ones, it is advisable to design three types of P1L1S1 units corresponding to the positions of the PPR motif: the first (N-terminal) P1L1S1 unit, the internal P1L1S1 unit, and the last (C-terminal) P1L1S1 unit located immediately before P2L2S2. In addition to those composed of repeating PLS units, natural PPR proteins also sometimes contain repeating SS units (31 amino acids), and these can also be used in this embodiment.
[0102] (C-terminal domain) In one embodiment, the C-terminal domain bound to the P array or PLS array described above has at least one substitution selected from V3T of the E1 motif, P3A of the E2 motif, and E1R of the PG domain. One particularly preferred embodiment in terms of achieving the desired effect is a combination of P3A of the E2 motif with V3T of the E1 motif or E1R of the PG domain. In another embodiment, the C-terminal domain bound to the P array or PLS array described above has at least one substitution selected from T3S of the P2 motif, V10A of the L2 motif, L1A of the E1 motif, and D13A of the E2 motif. One particularly preferred embodiment in terms of achieving the desired effect is at least one substitution selected from T3S of the P2 motif and D13A of the E2 motif.
[0103] In one preferred embodiment, the C-terminal domain is a polypeptide comprising a sequence in which at least one substitution selected from V106T, P140A, and E172R is made in the amino acid sequence of SEQ ID NO: 1. In one particularly preferred embodiment, in terms of achieving the desired effect, the C-terminal domain is a polypeptide in which P140A is combined with V106T or E172R in the amino acid sequence of SEQ ID NO: 1. In another preferred embodiment, the C-terminal domain is a polypeptide comprising a sequence in which at least one substitution selected from T3S, V45A, L104A, and D150A is made in the amino acid sequence of SEQ ID NO: 2. In one particularly preferred embodiment, in terms of achieving the desired effect, the C-terminal domain is a polypeptide comprising a sequence in which at least one substitution selected from T3S and D150A is made in the amino acid sequence of SEQ ID NO: 2.
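The motif-relative substitutions (for example, P3A of the E2 motif) and the absolute substitutions in SEQ ID NO: 1 and 2 (for example, P140A) can be related by simple arithmetic. The sketch below derives, for each motif, the start position implied by each pair. These start positions are inferred from the stated pairs and are not given in the specification; `PG_PAIRS`, `WW_PAIRS`, `position`, and `motif_start` are hypothetical names.

```python
import re

# (motif-relative substitution, absolute substitution) pairs quoted from the text.
PG_PAIRS = {"E1": ("V3T", "V106T"), "E2": ("P3A", "P140A"), "PG": ("E1R", "E172R")}  # SEQ ID NO: 1
WW_PAIRS = {
    "P2": ("T3S", "T3S"),
    "L2": ("V10A", "V45A"),
    "E1": ("L1A", "L104A"),
    "E2": ("D13A", "D150A"),
}  # SEQ ID NO: 2

PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def position(substitution: str) -> int:
    """Extract the position number from a substitution such as 'P140A'."""
    match = PATTERN.fullmatch(substitution)
    if match is None:
        raise ValueError(f"not a substitution: {substitution}")
    return int(match.group(2))


def motif_start(relative: str, absolute: str) -> int:
    """Return the absolute position of residue 1 of the motif implied by a pair."""
    # The residue letters before and after the number must agree between notations.
    if (relative[0], relative[-1]) != (absolute[0], absolute[-1]):
        raise ValueError(f"{relative} and {absolute} describe different changes")
    return position(absolute) - position(relative) + 1


for label, pairs in (("SEQ ID NO: 1", PG_PAIRS), ("SEQ ID NO: 2", WW_PAIRS)):
    for motif, (rel, abs_) in pairs.items():
        print(label, motif, "starts at", motif_start(rel, abs_))
```

The implied starts of E1 (104) and E2 (138) are the same in both sequences, and a domain start of 172 for PG agrees with the position ranges given for DYW:PG.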
[0104] III. Other Embodiments The present invention also provides:
• A nucleic acid encoding the PPR editor protein of Embodiment 2
• A vector containing the above nucleic acid
• A cell containing the above vector
[0105] Vectors include viral vectors. Vectors for amplification can use E. coli or yeast as hosts. In this specification, an expression vector means a vector that includes, for example, DNA having a promoter sequence, DNA encoding a desired protein, and DNA having a terminator sequence from upstream, but the sequences do not necessarily have to be in this order as long as the desired function is performed. In this embodiment, various vectors that are commonly used by those skilled in the art can be rearranged and used.
[0106] Specifically, this embodiment provides a nucleotide sequence encoding a PPR editor protein that comprises an RNA-binding domain (for example, an RNA-binding domain of a PLS-type PPR protein) which includes at least one PPR motif and can bind sequence-specifically to a target RNA (preferably an animal target RNA) according to the rules of the PPR-code, and a DYW domain which is either the aforementioned DYW:PG or DYW:WW.
[0107] Another form of the present invention provides a vector for editing target RNA, comprising a nucleotide sequence encoding a PPR editor protein that comprises an RNA-binding domain (preferably an RNA-binding domain of a PLS-type PPR protein) which includes at least one PPR motif and can bind sequence-specifically to a target RNA (preferably an animal target RNA) according to the rules of the PPR-code, and a DYW domain which is either the aforementioned DYW:PG or DYW:WW.
[0108] The PPR editor protein of Embodiment 2 can function in eukaryotic cells (e.g., animals, plants, microorganisms (yeast, etc.), protists). The DYW protein of this embodiment can function in animal cells in particular (in vitro or in vivo). Examples of animal cells into which the PPR editor protein, or a vector expressing the PPR editor protein, can be introduced include cells derived from humans, monkeys, pigs, cattle, horses, dogs, cats, mice, and rats. Examples of cultured cells into which the PPR editor protein, or a vector expressing the PPR editor protein of this embodiment, can be introduced include, but are not limited to, Chinese hamster ovary (CHO) cells, COS-1 cells, COS-7 cells, VERO (ATCC CCL-81) cells, BHK cells, canine kidney-derived MDCK cells, hamster AV-12-664 cells, HeLa cells, WI38 cells, HEK293 cells, HEK293T cells, and PER.C6 cells.
[0109] The PPR editor protein of Embodiment 2 can convert editing target C to U, or editing target U to C, within the target RNA. RNA-binding PPR proteins are involved in all RNA processing steps found in organelles: cleavage, RNA editing, translation, splicing, and RNA stabilization.
[0110] The PPR editor protein of Embodiment 2 enables single-base editing of mitochondrial RNA. Mitochondria have their own genomes and encode constituent proteins of important complexes involved in respiration and ATP production. Mutations in these proteins are known to cause various diseases. By using mutation repair or other mutation introduction with this embodiment, it is expected that various diseases can be treated.
[0111] The improved RNA base editing methods achieved by embodiments 1 and 2 described above are expected to have the following applications in various fields.
[0112] (1) Medical treatment
- Recognize and edit specific RNAs related to specific diseases. By using this embodiment, it is possible to treat genetic diseases caused by single nucleotide mutations.
- Diseases can be treated by using a single nucleotide mutation to create a single amino acid mutation in the encoded protein, thereby altering, activating, or inhibiting its function. Examples of such alterations include changes in post-translational modifications of proteins, such as phosphorylation, glycosylation, ubiquitination, nitrosylation, methylation, acetylation, lipidation, and proteolysis. Other examples include inhibition of binding by mutations in interaction sites between proteins (or between proteins and nucleic acids or various other molecules), inhibition of cleavage by mutations in protein cleavage sites, and control of intracellular and extracellular localization by mutations in intracellular signal sequences.
[0113] - Create cells with controlled RNA suppression and expression. Such cells include stem cells (e.g., iPS cells) whose differentiated and undifferentiated states are monitored, model cells for evaluating cosmetics, and cells in which the expression of functional RNA can be switched ON / OFF for the purpose of elucidating drug discovery mechanisms and conducting pharmacological tests.
[0114] (2) Agriculture, forestry and fisheries
- To improve yield and quality in crops, forest products, fishery products, etc.
- To improve disease resistance, improve environmental tolerance, and breed organisms with improved or new functionalities.
[0115] For example, with regard to first-generation hybrid (F1) crops, it may be possible to artificially create F1 crops by editing mitochondrial RNA with PPR editor proteins, thereby improving yield and quality. RNA editing with PPR editor proteins enables more accurate and rapid improvement of biological varieties and breeding (genetic improvement of organisms) than conventional techniques. Furthermore, since RNA editing with PPR editor proteins does not alter traits with foreign genes like genetic modification, it is closer to traditional breeding methods such as mutant selection and backcrossing. Therefore, it can reliably and quickly address global food and environmental problems.
[0116] (3) In the production of useful substances using chemicals, microorganisms, cultured cells, plants, and animals (e.g., insects), protein expression levels are controlled by manipulating RNA. This can improve the productivity of useful substances. Examples of useful substances include proteinaceous substances such as antibodies, vaccines, and enzymes, as well as relatively low-molecular-weight compounds such as pharmaceutical intermediates, fragrances, and dyes.
[0117] - Improve the efficiency of biofuel production by modifying the metabolic pathways of algae and microorganisms.
[0118] IV. Unless otherwise specified, the numerical range x to y includes the values x and y at both ends.
[0119] In this specification, claims, and drawings, bases or nucleosides in nucleic acids are represented by a single letter of the alphabet. Unless otherwise specified, A represents adenine or adenosine, C represents cytosine or cytidine, G represents guanine or guanosine, U represents uracil or uridine, T represents uracil or uridine in RNA sequences, and thymine or thymidine in DNA sequences. In this specification, claims, and drawings, unless otherwise specified, amino acids are represented by a single letter of the alphabet. Specifically, A represents alanine, L represents leucine, R represents arginine, K represents lysine, N represents asparagine, M represents methionine, D represents aspartic acid, F represents phenylalanine, C represents cysteine, P represents proline, Q represents glutamine, S represents serine, E represents glutamic acid, T represents threonine, G represents glycine, W represents tryptophan, H represents histidine, Y represents tyrosine, I represents isoleucine, and V represents valine.
[0120] Furthermore, when a variant of a protein or enzyme is represented by a string consisting of one letter of the alphabet, a number, and another letter of the alphabet following the number, the leftmost letter indicates the amino acid before the mutation, the middle number indicates the position of the amino acid, and the rightmost letter indicates the amino acid after the mutation, meaning that the leftmost amino acid has been replaced by the rightmost amino acid. For example, M10I indicates that methionine at position 10 in the amino acid sequence has been replaced with isoleucine.
[0121] In relation to proteins or enzymes, specific amino acids in the amino acid sequence may be represented by a single letter of the alphabet and a number. For example, M10 refers to the M at position 10 in the amino acid sequence. Similarly, 10I indicates that the amino acid at position 10 in the amino acid sequence has been substituted with I. The original amino acid is not relevant.
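The notation described above (M10I, M10, 10I) can be illustrated with a minimal Python sketch. The names `Substitution`, `parse_notation`, and `apply_substitution` are hypothetical helpers introduced only for this illustration and are not part of the specification.

```python
import re
from typing import NamedTuple, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 single-letter codes listed above


class Substitution(NamedTuple):
    original: Optional[str]     # amino acid before the mutation (None if not stated)
    position: int               # 1-based position in the amino acid sequence
    replacement: Optional[str]  # amino acid after the mutation (None if not stated)


_PATTERN = re.compile(rf"([{AMINO_ACIDS}])?(\d+)([{AMINO_ACIDS}])?")


def parse_notation(text: str) -> Substitution:
    """Parse 'M10I', 'M10', or '10I' into its components."""
    match = _PATTERN.fullmatch(text)
    if match is None or (match.group(1) is None and match.group(3) is None):
        raise ValueError(f"not a recognised notation: {text!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return the sequence with the substitution applied (positions are 1-based)."""
    if sub.replacement is None:
        raise ValueError("no replacement amino acid given")
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    # When the original amino acid is stated (M10I), check it against the sequence.
    if sub.original is not None and sequence[index] != sub.original:
        raise ValueError("stated original amino acid does not match the sequence")
    return sequence[:index] + sub.replacement + sequence[index + 1:]
```

For example, `parse_notation("10I")` leaves the original amino acid unset, which mirrors the statement that the original amino acid is not relevant in that form.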
[0122] In relation to the amino acid sequences of proteins and polypeptides, amino acid residues are sometimes simply referred to as amino acids.
[0123] With respect to base sequences (sometimes called nucleotide sequences) or amino acid sequences, "identity," unless otherwise specified, refers to the percentage of matching bases or amino acids shared between two sequences when the two sequences are optimally aligned. That is, identity can be calculated as (number of matching positions / total number of positions) × 100, and can be calculated using commercially available algorithms. Such algorithms are incorporated into the NBLAST and XBLAST programs described in Altschul et al., J. Mol. Biol. 215 (1990) 403-410. More specifically, the search and analysis of identity between base sequences or amino acid sequences can be performed using algorithms or programs well known to those skilled in the art (e.g., BLASTN, BLASTP, BLASTX, ClustalW). When using a program, the parameters can be appropriately set by those skilled in the art, or the default parameters of each program may be used. The specific procedures for these analyses are also well known to those skilled in the art.
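As a minimal sketch of the identity formula above, the following Python function computes (number of matching positions / total number of positions) × 100 for two sequences that have already been aligned. It assumes that gaps are written as "-", that the total number of positions is the alignment length, and that a gap never counts as a match; these conventions are assumptions of this sketch, and alignment programs such as BLAST may report identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A position matches when both sequences carry the same residue (not a gap).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# First ten residues of SEQ ID NO: 1 and SEQ ID NO: 2, which align without gaps.
print(percent_identity("VVSWNAMIAA", "VVTWNALIAG"))  # 7 of 10 positions match
```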
[0124] With respect to the base sequence or amino acid sequence, a high degree of sequence identity is preferred unless otherwise specified. Specifically, it is preferable to have 40% or more, more preferably 45% or more, even more preferably 50% or more, even more preferably 55% or more, even more preferably 60% or more, and even more preferably 65% or more. Furthermore, it is preferable to have 70% or more, more preferably 80% or more, even more preferably 85% or more, even more preferably 90% or more, even more preferably 95% or more, and even more preferably 97.5% or more.
[0125] With respect to a polypeptide or protein, the number of amino acids substituted, deleted, or added in a "substituted, deleted, or added sequence" is not particularly limited in any motif or protein, as long as the motif or protein consisting of that amino acid sequence has the desired function, unless otherwise specified. However, it is usually around 1 to 9 or 1 to 4 amino acids, or even more if the substitutions are with similar amino acids. Means for preparing polynucleotides or proteins relating to such amino acid sequences are well known to those skilled in the art.
[0126] Similar amino acids refer to amino acids with similar physical properties such as hydropathy, charge, pKa, and solubility. Examples include the following: Hydrophobic (nonpolar) amino acids: alanine, valine, glycine, isoleucine, leucine, phenylalanine, proline, tryptophan, tyrosine; Nonhydrophobic amino acids: arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, serine, threonine, cysteine, histidine, methionine; Hydrophilic amino acids: arginine, asparagine, aspartic acid, glutamic acid, glutamine, lysine, serine, threonine; Acidic amino acids: aspartic acid, glutamic acid; Basic amino acids: lysine, arginine, histidine; Neutral amino acids: alanine, asparagine, cysteine, glutamine, glycine, isoleucine, leucine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, valine; Sulfur-containing amino acids: methionine, cysteine; Aromatic ring amino acids: tyrosine, tryptophan, phenylalanine.
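The groups listed above can be held in a simple lookup table. The following Python sketch encodes them with the single-letter codes defined earlier and reports which groups two amino acids share; the function name `shared_groups` and the group keys are hypothetical, and treating two amino acids as similar when they share at least one group is an assumption of this sketch rather than a definition given in the text.

```python
# Groups exactly as listed in the paragraph above, using single-letter codes.
SIMILARITY_GROUPS = {
    "hydrophobic": set("AVGILFPWY"),
    "nonhydrophobic": set("RNDEQKSTCHM"),
    "hydrophilic": set("RNDEQKST"),
    "acidic": set("DE"),
    "basic": set("KRH"),
    "neutral": set("ANCQGILMFPSTWYV"),
    "sulfur-containing": set("MC"),
    "aromatic": set("YWF"),
}


def shared_groups(first: str, second: str) -> set:
    """Return the names of all groups that contain both amino acids."""
    return {
        name
        for name, members in SIMILARITY_GROUPS.items()
        if first in members and second in members
    }


print(shared_groups("T", "S"))  # threonine and serine share several groups
print(shared_groups("D", "A"))  # aspartic acid and alanine share none
```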
[0127] [Results] (RNA edited with the C-to-U PPR editor has low translation efficiency) To investigate the translation efficiency of RNA edited with the PPR editor, we designed PPR editors (14P-WW and 14P-PG) by ligating a C-terminal domain consisting of PPR-like motifs (P2, L2, S2, E1, E2) and a WW or PG DYW domain (Non-Patent Literature 7, Patent Literature 1) to 14 P motifs (SEQ ID NO: 17). These proteins edit the second base of the 41st codon of CTNNB1, substituting threonine with isoleucine. This inhibits the degradation of β-catenin newly produced from the edited RNA.
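The codon change described above can be checked against the standard genetic code. The text does not state which threonine codon occupies position 41 of CTNNB1, so the following Python sketch (with the hypothetical helper `edit_c_to_u`) simply enumerates all four threonine codons and applies a C-to-U edit at the second base.

```python
# Subset of the standard genetic code covering the codons needed here.
CODON_TABLE = {
    "ACU": "T", "ACC": "T", "ACA": "T", "ACG": "T",  # threonine
    "AUU": "I", "AUC": "I", "AUA": "I",              # isoleucine
    "AUG": "M",                                      # methionine
}


def edit_c_to_u(codon: str, position: int) -> str:
    """Apply a C-to-U edit at the given 1-based position of an RNA codon."""
    index = position - 1
    if codon[index] != "C":
        raise ValueError("the base at this position is not C")
    return codon[:index] + "U" + codon[index + 1:]


for codon in ("ACU", "ACC", "ACA", "ACG"):
    edited = edit_c_to_u(codon, 2)
    print(f"{codon} ({CODON_TABLE[codon]}) -> {edited} ({CODON_TABLE[edited]})")
```

Three of the four threonine codons become isoleucine codons after the edit, while ACG would become the methionine codon AUG, so a threonine-to-isoleucine change is consistent with the edited codon being ACU, ACC, or ACA.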
[0128] HEK293T cells were co-transfected with an effector plasmid encoding a PPR editor and a reporter plasmid expressing NanoLuc luciferase (Nluc) under the control of the TCF / LEF promoter. Because β-catenin activates TCF / LEF-dependent transcription, Nluc activity reports on the β-catenin translated from the edited RNA. If RNA editing occurs but Nluc activity is absent, this suggests that the PPR editor did not dissociate from the edited RNA and inhibited translation from that RNA. Conversely, if both RNA editing and Nluc activity are present, this suggests that the PPR editor dissociated from the edited RNA and β-catenin was translated. Firefly luciferase (Fluc), a constitutively expressed reporter, was also introduced to correct for experimental variation such as transfection efficiency. TCF / LEF transcriptional activation by β-catenin (the CTNNB1 gene product) was evaluated by dividing the luminescence of Nluc by the luminescence of Fluc (Nluc / Fluc). Furthermore, the accumulation of β-catenin protein was evaluated by Western blotting.
[0129] The 14P-WW and 14P-PG proteins showed approximately 20% editing efficiency at the endogenous CTNNB1-T41I editing site, but the Nluc / Fluc ratio increased only 23-fold and 14-fold, respectively, compared to the negative control (empty vector) (Figure 2d, e). On the other hand, the positive control, in which CTNNB1 carrying the T41I mutation was expressed in HEK293T cells, showed an increase of up to 388-fold (Figure 2d, e). These results suggest that the PPR editor did not dissociate from much of the edited RNA, resulting in low translation efficiency.
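The normalization and fold-change calculation described above can be sketched in Python as follows. The function names and all luminescence readings are hypothetical; the readings were invented so that the example reproduces a 23-fold increase and are not measured data.

```python
def normalized_ratio(nluc: float, fluc: float) -> float:
    """Nluc luminescence divided by Fluc luminescence (Nluc / Fluc)."""
    if fluc <= 0:
        raise ValueError("Fluc luminescence must be positive")
    return nluc / fluc


def fold_change(sample_ratio: float, control_ratio: float) -> float:
    """Nluc / Fluc ratio of a sample relative to the negative control."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return sample_ratio / control_ratio


# Hypothetical readings (arbitrary luminescence units), for illustration only.
control = normalized_ratio(nluc=200.0, fluc=100.0)   # empty vector
sample = normalized_ratio(nluc=4600.0, fluc=100.0)   # PPR editor
print(fold_change(sample, control))
```

Dividing by Fluc first means that wells with different transfection efficiency are compared on the same scale before the fold change over the empty vector is taken.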
[0130] (Introducing mutations into the first helix based on the consensus sequence promotes RNA dissociation) To investigate the effect of the C-terminal domain on RNA dissociation, mutations were introduced into the amino acids within the first helix (located in the RNA interaction region) of the P2, L2, S2, E1, and E2 motifs to match the consensus sequence obtained from alignments of the C-terminal WW and PG domains obtained from transcriptomes of hornworts, chlorophyllans, and ferns (Figure 2a). However, amino acids involved in specific RNA recognition were excluded (5 and L in Figure 2a).
[0131] Of the 10 mutations introduced into the 14P-WW protein (SEQ ID NO: 2), 2 abolished editing activity, whereas 4 significantly increased the Nluc / Fluc ratio 48 hours after transfection (Figure 2b, d). When the amino acid at position 10 of the L2 motif was substituted from valine to alanine (L2:V10A) (SEQ ID NO: 8) and the amino acid at position 1 of the E1 motif was substituted from leucine to alanine (E1:L1A) (SEQ ID NO: 9), the Nluc / Fluc ratio increased by 1.6 times and 3 times, respectively, compared to 14P-WW (WT) without mutations. Similarly, when the amino acid at position 3 of the P2 motif was substituted from threonine to serine (P2:T3S) (SEQ ID NO: 7) and when the amino acid at position 13 of the E2 motif was substituted from aspartic acid to alanine (E2:D13A) (SEQ ID NO: 10), the Nluc / Fluc ratio increased by 21 times and 31 times, respectively, compared to 14P-WW (WT) without mutations. Of these four mutations, only the E2:D13A mutation showed a significant increase in editing activity. Furthermore, the accumulation of β-catenin protein itself was also elevated, consistent with the results of the reporter assay (Figure 2f).
[0132] Similarly, when the C-terminal domain of the 14P-PG protein (SEQ ID NO: 1) was examined, only when the amino acid at position 3 of the E2 motif was substituted from proline to alanine (E2:P3A) (SEQ ID NO: 6) did the editing efficiency double and the Nluc / Fluc ratio increase 16-fold (Figure 2c, e). In addition, the accumulation of β-catenin protein itself increased, similar to the results of the reporter assay (Figure 2g).
[0133] (Mutations to PG based on the C-terminal domain of WW promote RNA dissociation) To further explore mutations that promote RNA dissociation in the C-terminal PG domain, we compared the amino acid sequences of the C-terminal PG and WW domains used in this study and found two important sites where amino acids with different properties were placed at the RNA interaction site (Figure 2a). One is the amino acid at position 3 of the E1 motif, and the other is the amino acid at position 1 of the DYW domain. To investigate the effect of these amino acid residues on RNA dissociation, the former was replaced with threonine (E1:V3T) (SEQ ID NO: 11) and the latter with arginine (DYW:E1R) (SEQ ID NO: 12) in the 14P-PG protein (Figure 3). The E1:V3T and DYW:E1R mutations did not affect the editing efficiency of 14P-PG, but increased the Nluc / Fluc ratio by 1.6 times and 2.9 times, respectively. However, these increases were smaller than that of the E2:P3A mutation, which showed a 6.2-fold increase in this experiment.
[0134] Furthermore, to investigate the effects of mutation combinations, we introduced 1 to 3 mutations selected from the E1 motif V3T, the E2 motif P3A, and the DYW domain E1R into the C-terminal domain of the 14P-PG protein. The Nluc / Fluc ratio did not differ significantly between the case where all three mutations were introduced and the case with a single amino acid substitution (E2:P3A) (Figure 5).
[0135] Furthermore, when 1 to 3 mutations selected from the E1 motif V3T, the E2 motif P3A, and the DYW domain E1R were introduced into the C-terminal domain of the 14P-PG protein, or when 1 to 2 mutations selected from the P2 motif T3S and the E2 motif D13A were introduced into the C-terminal domain of the 14P-WW protein, the RNA editing efficiency and the amount of protein (β-catenin protein) from the edited mRNA were measured (Figure 6).
[0136] These results suggest that substituting amino acids located in the RNA interaction region of the C-terminal domain can regulate the binding affinity between the PPR editor and the target RNA, and that this has a beneficial effect on the translation of the edited RNA.
[0137] [Discussion] (Evaluation of RNA dissociation efficiency using β-catenin activity) In this study, the efficiency with which the PPR editor dissociates from edited RNA was evaluated indirectly through β-catenin activity. The CTNNB1-T41I edit introduced by the 14P-WW and 14P-PG proteins promoted intracellular β-catenin accumulation and nuclear translocation, inducing Nluc reporter expression. However, this intracellular accumulation of β-catenin may be due not to an improvement in the RNA dissociation efficiency of the PPR editor, but rather to an increase in the amount of edited RNA. For some mutations in the WW protein, such as E2:D13A, which also increased editing activity, this possibility cannot be excluded. In contrast, mutations in the PG protein, such as DYW:E1R, did not affect editing efficiency, which rules out this explanation for those mutations.
[0138] (Effects of C-terminal mutations on RNA recognition) Three of the mutations examined in this study (E1:V3T and E2:P3A mutations in the PG protein and P2:T3S mutation in the WW protein) involve the substitution of an amino acid at position 3 within the PPR motif. Based on the model structure of the DYW protein available from the AlphaFold server (Non-Patent Literature 7), the amino acid at position 3 of the PPR motif interacts with the first helix of the upstream PPR motif. Therefore, mutating the amino acid at position 3 may change the position of the first helix of two adjacent PPR motifs, and this may also change the positions of amino acids involved in specific nucleic acid recognition (position 5 and the last). This could indirectly affect the specificity of RNA recognition and promote RNA dissociation.
[0139] On the other hand, mutations in amino acids at positions 1, 10, and 13 are not thought to have a direct effect on RNA recognition. The amino acid at position 1 of the E1 motif and the amino acid at position 10 of the L2 motif may alter the structure of the PPR / RNA complex in the C-terminal domain by binding to the upstream PPR-like motif. In addition, the amino acid at position 13 of the E2 motif may interact with a region called the gating domain, which controls catalysis within the DYW domain.
[0140] In summary, this study suggests that by indirectly modifying the interaction between PPR and RNA, or the flexibility of the C-terminal domain, it is possible to alter the binding affinity to RNA and indirectly promote RNA dissociation after editing.
[0141] [Methods] (Design of Artificial PPR Protein) To create the PPR editor, a C-terminal domain (Non-Patent Literature 7, Patent Literature 1) consisting of five PPR-like motifs (P2, L2, S2, E1, E2) and a DYW domain (PG or WW) was first cloned into the mammalian cell expression vector PM18033 using the Golden Gate method with the restriction enzyme Esp3I. This vector contains the CMV promoter, human β-globin chimeric intron, and SV40 polyadenylation signal. Next, a PPR-P domain (Non-Patent Literature 8) designed to recognize the upstream sequence of the target editing site CTNNB1-T41I was inserted upstream of the C-terminal domain using the Golden Gate method with the restriction enzyme BpiI. Mutations were introduced into the PPR editor PG and WW proteins using site-directed mutagenesis PCR.
[0142] The C-terminal domain sequences of the PPR editor PG protein (SEQ ID NO: 1) and the PPR editor WW protein (SEQ ID NO: 2) before the introduction of mutations are shown below. The number at the end of each row is the position of the last residue in that row.

SEQ ID NO: 1 (PPR editor PG protein, 307 amino acids)
VVSWNAMIAA YAQHGHGKEA LQLFQQMQQE GVKPSEVTFT 40
SILSACSHAG LVDEGHHYFE SMSPDYGITP RVEHYGCMVD 80
LLGRAGRLDE AEDLIKSMPF QPNVVVWGTL LGACRVHGDV 120
ERGERAAERI LELDPESAAP YVLLSNIYAA AGRWDEAAKV 160
RKLMKERGVK KEPGCSWIEV NNKVHEFVAG DKSHPQTKEI 200
YAELERLSKQ MKEAGYVPDT KFVLHDVEEE EKEQLLCYHS 240
EKLAIAFGLI STPPGTPLRI IKNLRVCGDC HTATKFISKI 280
VGREIVVRDA NRFHHFKDGV CSCGDYW 307

SEQ ID NO: 2 (PPR editor WW protein, 308 amino acids)
VVTWNALIAG YARQGESDLV FHLLERMRQE GIQPSGVTFT 40
SVLTVCSHAG LVDEGQKYFD AMSEDYGITP RIEHYGCMVD 80
LLGRAGQMDE AVAMVEKMPF QPNLVTWGTL LGACRKWNNV 120
EIGRHAFECA VRLDEKSAAA YVLMSNIYAD AHMWEERDKI 160
QAMRKNARAW KKPGQSWWTD TGGIVHTFVV GDTKHPQSQD 200
IYAKLKDLYV KMKEEGYVPH LDCVLWDISD DEKEDALCGH 240
SEKLAIACAL INTPPGTPIR IVKNLRVCDD CHKAIALISK 280
IEGRNIICRD ASRFHNYKDG KCSCGDYW 308
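As a consistency check, the substitutions given with absolute positions in claim 7 (V106T, P140A, E172R for SEQ ID NO: 1; T3S, V45A, L104A, D150A for SEQ ID NO: 2) can be compared with the two sequences above. The following Python sketch confirms that the stated original amino acid is present at each position. The helpers `wild_type_matches` and `motif_start` are hypothetical, and the motif start positions derived at the end are an inference from pairing the motif-relative names (for example E1:V3T) with the absolute names (V106T); they are not stated in the text.

```python
SEQ_ID_1 = (  # PPR editor PG protein, C-terminal domain
    "VVSWNAMIAAYAQHGHGKEALQLFQQMQQEGVKPSEVTFT"
    "SILSACSHAGLVDEGHHYFESMSPDYGITPRVEHYGCMVD"
    "LLGRAGRLDEAEDLIKSMPFQPNVVVWGTLLGACRVHGDV"
    "ERGERAAERILELDPESAAPYVLLSNIYAAAGRWDEAAKV"
    "RKLMKERGVKKEPGCSWIEVNNKVHEFVAGDKSHPQTKEI"
    "YAELERLSKQMKEAGYVPDTKFVLHDVEEEEKEQLLCYHS"
    "EKLAIAFGLISTPPGTPLRIIKNLRVCGDCHTATKFISKI"
    "VGREIVVRDANRFHHFKDGVCSCGDYW"
)
SEQ_ID_2 = (  # PPR editor WW protein, C-terminal domain
    "VVTWNALIAGYARQGESDLVFHLLERMRQEGIQPSGVTFT"
    "SVLTVCSHAGLVDEGQKYFDAMSEDYGITPRIEHYGCMVD"
    "LLGRAGQMDEAVAMVEKMPFQPNLVTWGTLLGACRKWNNV"
    "EIGRHAFECAVRLDEKSAAAYVLMSNIYADAHMWEERDKI"
    "QAMRKNARAWKKPGQSWWTDTGGIVHTFVVGDTKHPQSQD"
    "IYAKLKDLYVKMKEEGYVPHLDCVLWDISDDEKEDALCGH"
    "SEKLAIACALINTPPGTPIRIVKNLRVCDDCHKAIALISK"
    "IEGRNIICRDASRFHNYKDGKCSCGDYW"
)


def wild_type_matches(sequence: str, code: str) -> bool:
    """True if the original amino acid of a code such as 'V106T' is at that position."""
    original, position = code[0], int(code[1:-1])
    return sequence[position - 1] == original


def motif_start(absolute_position: int, motif_position: int) -> int:
    """Inferred 1-based start of a motif from one residue named in both ways."""
    return absolute_position - motif_position + 1


for code in ("V106T", "P140A", "E172R"):
    print("SEQ ID NO: 1", code, wild_type_matches(SEQ_ID_1, code))
for code in ("T3S", "V45A", "L104A", "D150A"):
    print("SEQ ID NO: 2", code, wild_type_matches(SEQ_ID_2, code))

# E1:V3T corresponds to V106T, so the E1 motif would start at position 104.
print("inferred E1 start:", motif_start(106, 3))
```

Under the same inference, E2:P3A (P140A) and E2:D13A (D150A) both place the start of the E2 motif at position 138, and DYW:E1R (E172R) places the first residue of the DYW domain at position 172.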
[0143] (Mammalian cell culture) HEK293T cells (RIKEN, RCB2202) were cultured at 37°C and 5% CO2 in Dulbecco's Modified Eagle Medium (DMEM) (Fujifilm Wako Pure Chemical Industries) containing high glucose, glutamine, phenol red, and sodium pyruvate, supplemented with 10% fetal bovine serum (Capricorn) and 1% penicillin-streptomycin solution (Fujifilm Wako Pure Chemical Industries). Cells were subcultured every 2-3 days when they reached 80-90% confluence.
[0144] (Transfection) To measure the RNA editing activity of the PPR protein, HEK293T cells were seeded 24 hours before transfection in a 24-well flat-bottom cell culture plate (ThermoFisher) at approximately 8.0 × 10⁴ cells / well. For transfection, 500 ng of plasmid DNA was mixed with 18.5 μL of Opti-MEM® I Reduced Serum Medium (ThermoFisher) and 1.5 μL of FuGENE® HD Transfection Reagent (Promega), for a total volume of 25 μL per well. The mixture was incubated at room temperature for 10 minutes before being added to the cells. After transfection, the cells were cultured at 37°C for 24 hours.
[0145] To measure the degree of RNA dissociation by luciferase assay, HEK293T cells were seeded 24 hours before transfection in a 384-well flat-bottom cell culture plate (ThermoFisher) at approximately 4 × 10³ cells / well. Plasmid DNA encoding the PPR editor protein, Nluc, and Fluc was mixed in quantities of 50 ng, 5 ng, and 2.5 ng, respectively. Opti-MEM (trademark) I Reduced Serum Medium (ThermoFisher) and 0.17 μL of FuGENE (trademark) HD Transfection Reagent (Promega) were added to prepare a total volume of 7.5 μL per well. The mixture was incubated at room temperature for 10 minutes, added to the cells, and cultured at 37°C for 48 hours.
[0146] (Luciferase Assay) To measure β-catenin activity, a construct in which the NanoLuc luciferase (Nluc) gene was ligated downstream of the TCF / LEF response element was synthesized and introduced into the pEX-A2J1 vector (Eurofins Genomics). Similarly, the Firefly luciferase (Fluc) gene was synthesized and cloned into the pEX-A2J2 vector (Eurofins Genomics) under the control of the EF1α promoter. Luciferase activity in HEK293T cells was measured 48 hours after transfection using the Nano-Glo® Dual-Luciferase® Reporter Assay System (Promega).
[0147] (RNA editing activity measurement) RNA was extracted from the recovered cells using the Maxwell® RSC simplyRNA Tissue Kit (Promega). cDNA was synthesized using 300 ng of RNA, ReverTra Ace® (Toyobo), and 1.25 μM random hexamer primers. The region containing the target editing site was amplified by PCR using 1 μL of cDNA, PrimeSTAR® Max DNA Polymerase (TaKaRa), and primers with sequences adjacent to the editing site on human mRNA. The PCR product was purified using ExoSAP-IT™ Express PCR Cleanup Reagent (ThermoFisher) and subjected to Sanger sequencing using gene-specific forward or reverse primers (Azenta). The resulting sequence chromatograms were analyzed using EditR software, and after trimming low-quality regions (P-value cutoff: 0.01), the ratio of the thymidine (T) peak area to the sum of the thymidine (T) and cytidine (C) peak areas was calculated to determine the editing efficiency. If the peak area was not significantly different from background noise, the value was set to 0 (Non-Patent Literature 9).
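The editing efficiency calculation described above, T / (T + C), can be sketched in Python as follows. The function name, the `significant` flag, and the peak areas are hypothetical: in the described workflow the significance call comes from the EditR analysis, which this sketch does not reproduce, and the result is expressed here as a percentage.

```python
def editing_efficiency(t_area: float, c_area: float, significant: bool = True) -> float:
    """Editing efficiency in percent from Sanger peak areas at the target base.

    t_area: peak area of thymidine (edited base, read as T in cDNA)
    c_area: peak area of cytidine (unedited base)
    significant: whether the T peak was judged distinguishable from background
    """
    if not significant:
        # Peaks not distinguishable from background noise are reported as 0.
        return 0.0
    total = t_area + c_area
    if total <= 0:
        raise ValueError("the summed peak area must be positive")
    return 100.0 * t_area / total


# Hypothetical peak areas, for illustration only.
print(editing_efficiency(t_area=200.0, c_area=800.0))
print(editing_efficiency(t_area=5.0, c_area=995.0, significant=False))
```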
[0148] (Method for Measuring β-catenin Protein Accumulation) Total protein was extracted with RIPA buffer (ATTO) and adjusted to 1 mg / mL. Western blotting was performed using the Abby Simple Western system (ProteinSimple) with a 12-230 kDa separation module, RePlex module, and anti-mouse or anti-rabbit detection module (ProteinSimple), according to the manufacturer's instructions. A primary antibody against β-catenin (sc-59737, Santa Cruz Biotechnology) was used. The antibody was diluted 250-fold with Can Get Signal Solution 1 (TOYOBO). Total protein was detected using Total Protein labeling reagent (ProteinSimple). Peak area values of target proteins were normalized to total protein values using Compass software (ProteinSimple).
[0149] Non-Patent Literature
7: Varadi M, Anyango S, Deshpande M, Nair S, Natassia C, Yordanova G, Yuan D, Stroe O, Wood G, Laydon A, Zidek A, Green T, Tunyasuvunakool K, Petersen S, Jumper J, Clancy E, Green R, Vora A, Lutfi M, Figurnov M, Cowie A, Hobbs N, Kohli P, Kleywegt G, Birney E, Hassabis D, Velankar S. The AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022 Jan 7;50(D1):D439-D444.
8: Yagi Y, Teramoto T, Kaieda S, Imai T, Sasaki T, Yagi M, Maekawa N, Nakamura T. Construction of a Versatile, Programmable RNA-Binding Protein Using Designer PPR Proteins and Its Application for Splicing Control in Mammalian Cells. Cells. 2022 Nov 8;11(22):3529.
9: Kluesner MG, et al. EditR: A method for quantifying base editing from Sanger sequencing. CRISPR J. 2018;1:239–250.
[0150] [Sequences listed in the sequence listing]
SEQ ID NO: 1 PPR editor PG protein
SEQ ID NO: 2 PPR editor WW protein
SEQ ID NO: 3 Consensus sequences obtained from the C-terminal domain containing the PG domain and the C-terminal domain containing the WW domain in early land plant populations
SEQ ID NO: 4 WW-C-terminal domain (WW1)
SEQ ID NO: 5 PG-C-terminal domain (PG1)
SEQ ID NO: 6 PG-E2-P3A
SEQ ID NO: 7 WW-P2-T3S
SEQ ID NO: 8 WW-L2-V10A
SEQ ID NO: 9 WW-E1-L1A
SEQ ID NO: 10 WW-E2-D13A
SEQ ID NO: 11 PG-E1-V3T
SEQ ID NO: 12 PG-DYW-E1R
SEQ ID NO: 13 PG-E1:V3T / E2:P3A
SEQ ID NO: 14 PG-E1:V3T / DYW:E1R
SEQ ID NO: 15 PG-E2:P3A / DYW:E1R
SEQ ID NO: 16 PG-E1:V3T / E2:P3A / DYW:E1R
SEQ ID NO: 17 14P
SEQ ID NO: 18 E2 motif for DYW:PG
SEQ ID NO: 19 E2 motif for DYW:WW
SEQ ID NO: 20 WW-P2:T3S / E2-D13A
Claims
1. A method for modifying a PPR editor protein having a C-terminal domain, comprising the following steps a or b: a. Substituting at least one selected from position 3 of the E1 motif, position 3 of the E2 motif, and position 1 of the PG domain in a C-terminal domain consisting of a P2 motif, L2 motif, S2 motif, E1 motif, E2 motif, and PG domain; or b. Substituting at least one selected from position 3 of the P2 motif, position 10 of the L2 motif, position 1 of the E1 motif, and position 13 of the E2 motif in a C-terminal domain consisting of a P2 motif, L2 motif, S2 motif, E1 motif, E2 motif, and WW domain.
2. The method according to claim 1, wherein the substitution performed in step a is at least one substitution selected from 3T of the E1 motif, 3A of the E2 motif, and 1R of the PG domain, and the substitution performed in step b is at least one substitution selected from 3S of the P2 motif, 10A of the L2 motif, 1A of the E1 motif, and 13A of the E2 motif.
3. The method according to claim 1, wherein the substitution performed in step a includes the substitution of at least P3A of the E2 motif, and the substitution performed in step b includes the substitution of at least D13A of the E2 motif.
4. A method for promoting the translation of edited RNA, comprising: carrying out the modification method described in any one of claims 1 to 3; binding the obtained modified PPR editor protein to a target RNA to perform RNA editing; and translating a protein from the edited RNA, wherein the translation efficiency of the protein is higher than the translation efficiency obtained when an unmodified PPR editor protein is used.
5. A PPR editor protein having a portion consisting of 2 to 20 PPR motifs and a C-terminal domain consisting of the following polypeptide a or b: a. A polypeptide consisting of a sequence in which at least one selected from V3 of the E1 motif, P3 of the E2 motif, and E1 of the PG domain is substituted in the amino acid sequence of the C-terminal domain consisting of the P2 motif, L2 motif, S2 motif, E1 motif, E2 motif, and PG domain; or b. A polypeptide consisting of a sequence in which at least one selected from T3 of the P2 motif, V10 of the L2 motif, L1 of the E1 motif, and D13 of the E2 motif is substituted in the amino acid sequence of the C-terminal domain consisting of the P2 motif, L2 motif, S2 motif, E1 motif, E2 motif, and WW domain, wherein the PPR motif containing the P2 motif, L2 motif, S2 motif, E1 motif, and E2 motif consists of a polypeptide having a full length of 31 to 36 amino acids represented by the following formula 1, and each amino acid is numbered in order as A1, A2, A3, A4... (In formula 1: Helix A is a portion capable of forming an α-helix structure consisting of 13 or 14 amino acids in length, X1 consists of 1 to 9 amino acids in length, preferably 1 to 3 amino acids in length, Helix B is a portion capable of forming an α-helix structure consisting of 10 to 14 amino acids in length, X2 consists of 1 to 9 amino acids in length, preferably 4 to 9 amino acids in length, and in X2, the C-terminal amino acid is represented by L.), and the combination of the amino acids of A5 and L functions for selective binding to RNA bases, a PPR editor protein.
6. The PPR editor protein according to claim 5, wherein polypeptide a is subjected to at least one substitution selected from V3T of the E1 motif, P3A of the E2 motif, and E1R of the PG domain, and polypeptide b is subjected to at least one substitution selected from T3S of the P2 motif, V10A of the L2 motif, L1A of the E1 motif, and D13A of the E2 motif.
7. The PPR editor protein according to claim 5, wherein polypeptide a is a polypeptide comprising a sequence in which at least one substitution selected from V106T, P140A, and E172R is made in the amino acid sequence of SEQ ID NO: 1; and polypeptide b is a polypeptide comprising a sequence in which at least one substitution selected from T3S, V45A, L104A, and D150A is made in the amino acid sequence of SEQ ID NO: 2.
8. A nucleic acid encoding the PPR editor protein according to any one of claims 5 to 7.
9. A vector comprising the nucleic acid described in claim 8.
10. A cell comprising the vector according to claim 9.