Method for preparing cells, cells and method for producing proteins
By using specific recombinases and donor vectors in host cells, multiple target genes can be efficiently integrated and expressed, solving the problem of low production efficiency of medical proteins in existing technologies and achieving efficient and low-cost production of medical proteins.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- FUJIFILM CORP
- Filing Date
- 2024-11-26
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, it is difficult to efficiently integrate and highly express multiple target genes in host cells to produce medical proteins, especially humanized monoclonal antibodies, and existing methods are costly and inefficient.
Using a recombinase and a donor vector, the target gene is integrated into a highly expressed region of the host cell. Through specific recombinase recognition sites and donor vector design, the stable expression of the target gene in the host cell genome is ensured. A serine recombinase such as Bxb1 integrase is used for efficient recombination reaction.
This technology enables the efficient integration and expression of multiple target genes in host cells, increasing the production volume and expression efficiency of medical proteins while reducing production costs.
Figure CN122249564A_ABST
Description
Technical Field
[0001] This invention relates to a method for preparing cells, to cells, and to a method for producing proteins.
Background Technology
[0002] Patent Document 1 discloses a site-specific integration host cell that contains an endogenous Fer1L4 gene and an exogenous nucleotide sequence integrated into the Fer1L4 gene. Patent Document 2 discloses a cell that has an exogenous nucleic acid integrated at a specific site within an expression enhancement locus, wherein the exogenous nucleic acid encodes a bispecific antigen-binding protein. Patent Document 3 discloses a cell having a first exogenous nucleic acid integrated into a first expression enhancement site and a second exogenous nucleic acid integrated into a second expression enhancement site, wherein both the first and second exogenous nucleic acids encode antigen-binding proteins. Patent Document 4 discloses a mammalian cell containing a first recombination target site (RTS) for chromosomal integration at a first high integration (HI) locus, wherein the first HI locus is located within an active genomic region of accessible chromatin and within approximately 30,000 base pairs of a TAD boundary, and the first HI locus overlaps with a region of the cellular genome that interacts with at least one enhancer element.
[0003] Non-Patent Document 1 discloses that the directionality of DNA integration by Bxb1 integrase depends only on the central dinucleotides of attP and attB. Non-Patent Document 2 discloses that, among 15 candidate serine recombinases for integrating DNA into the human genome, Bxb1 integrase has the best precision and efficiency. Non-Patent Document 3 discloses that, among four serine integrases (BT1, TG1, Rv1, and Bxb1), Bxb1 integrase has the highest efficiency.
Prior Art Documents
Patent Documents
[0004] Patent Document 1: European Patent Application Publication No. 2711428
Patent Document 2: International Publication No. 2017/184831
Patent Document 3: International Publication No. 2017/184832
Patent Document 4: International Publication No. 2020/072480
Non-Patent Documents
[0005] Non-Patent Document 1: Molecular Cell, 2003, Vol. 12, 1101-1111
Non-Patent Document 2: BMC Biotechnology, 2013, 13:87
Non-Patent Document 3: Acta Biochim Biophys Sin, 2017, 49(1), 44-50
Summary of the Invention
Technical Problem to Be Solved by the Invention
[0006] One known technique integrates a target gene into the genome of a host cell to prepare cells that stably produce medical proteins such as humanized monoclonal antibodies. From a cost perspective, it is preferable to use fewer types of donor vectors for the target gene and fewer types of enzymes for recombining the donor vector with the host genome. Furthermore, from the perspective of the production volume of the target protein, it is preferable to place multiple target genes in high-expression regions of the host genome. For example, when the target protein is an antibody, it is preferable to introduce into a host cell a donor vector carrying a target gene that contains both H-chain and L-chain coding sequences, and to insert multiple target genes into a highly expressed region of the host genome through a recombination reaction using a recombinase.
[0007] The present invention was made in view of the above circumstances. An object of the present invention is to provide a method for preparing cells that highly express a target gene. Another object of the present invention is to provide a cell that highly expresses a target gene. A further object of the present invention is to provide a method for producing proteins with excellent productivity.
Means for Solving the Technical Problem
[0008] Specific means for solving the problem include the following aspects.
<1> A method for preparing cells by integrating a target gene into the genome of a host cell using a recombinase and a donor vector, the method comprising the following steps:
introducing the donor vector carrying the target gene into the host cell;
allowing the recombinase to react in the host cell into which the donor vector has been introduced; and
selecting cells expressing the target gene from the host cells after the recombinase reaction,
wherein the genome of the host cell and the donor vector satisfy the following (1) to (4):
(1) the genome of the host cell has a region R that contains, in order, one each of RRS1, RRS2, RRS3, and RRS4 as recombinase recognition sites;
(2) the donor vector has RRS5 and RRS6 as recombinase recognition sites, and the target gene located between RRS5 and RRS6;
(3) RRS1 and RRS4 can recombine with RRS5 but cannot recombine with RRS6; and
(4) RRS2 and RRS3 can recombine with RRS6 but cannot recombine with RRS5.
<2> The method for preparing cells according to <1>, wherein the genome of the host cell further satisfies the following (5):
(5) RRS1 and RRS4 are the same sequence, and RRS2 and RRS3 are the same sequence.
<3> The method for preparing cells according to <1> or <2>, wherein the donor vector further satisfies the following (6):
(6) the transcription direction of the target gene located between RRS5 and RRS6 is from RRS6 to RRS5.
<4> The method for preparing cells according to any one of <1> to <3>, wherein the genome of the host cell further satisfies the following (7):
(7) in region R, a first selectable marker gene is located between RRS1 and RRS2, and a second selectable marker gene is located between RRS3 and RRS4.
<5> The method for preparing cells according to any one of <1> to <4>, wherein the donor vector further satisfies the following (8):
(8) the donor vector has a third selectable marker gene located between RRS5 and RRS6.
<6> The method for preparing cells according to any one of <1> to <5>, further comprising a step of introducing an expression vector for the recombinase into the host cell.
<7> The method for preparing cells according to any one of <1> to <6>, wherein the recombinase is a serine recombinase.
<8> The method for preparing cells according to any one of <1> to <7>, wherein the host cell is a mammalian cell.
<9> The method for preparing cells according to any one of <1> to <7>, wherein the host cell is a CHO cell.
<10> The method for preparing cells according to any one of <1> to <9>, wherein the target gene is a gene encoding at least one selected from the group consisting of enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, viral agents, vaccines, medical proteins, subunits thereof, and fragments thereof.
[0009] <11> A cell in which a target gene is integrated into the genome, the cell satisfying the following (A) to (C):
(A) the genome has a region G that contains, in order, one each of site 1, site 2, site 3, and site 4, which are sites formed by recombination between recombinase recognition sites;
(B) site 1 and site 4 are homologous in sequence, and site 2 and site 3 are homologous in sequence; and
(C) region G contains the target gene located between site 1 and site 2, and the target gene located between site 3 and site 4.
<12> The cell according to <11>, further satisfying the following (D):
(D) the transcription direction of the target gene located between site 1 and site 2 is from site 2 to site 1, and the transcription direction of the target gene located between site 3 and site 4 is from site 3 to site 4.
<13> The cell according to <11> or <12>, wherein the recombinase is a serine recombinase.
<14> The cell according to any one of <11> to <13>, wherein the cell is a mammalian cell.
<15> The cell according to any one of <11> to <13>, wherein the cell is a CHO cell.
<16> The cell according to any one of <11> to <15>, wherein the target gene is a gene encoding at least one selected from the group consisting of enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, viral agents, vaccines, medical proteins, subunits thereof, and fragments thereof.
<17> A method for producing a protein, the method comprising a step of culturing the cell according to any one of <11> to <16> and expressing the protein encoded by the target gene.
Effects of the Invention
[0010] According to the present invention, a method for preparing cells that highly express a target gene is provided. According to the present invention, a cell that highly expresses a target gene is provided. According to the present invention, a method for producing a protein with excellent productivity is provided.
Brief Description of the Drawings
[0011] Figure 1 is a conceptual diagram showing the recombination pattern between region R of the host genome and the donor vector.
Figure 2 is a schematic diagram of the vector used to construct the host genome in the examples.
Figure 3 is a schematic structural diagram of the donor vector used in the examples.
Figure 4 is a schematic diagram of the recombinase expression vector used in the examples.
Figure 5 is a schematic structural diagram of region G of the clone prepared in the examples.
Detailed Description of the Embodiments
[0012] The embodiments of the present invention will be described below. These descriptions and examples are illustrative and do not limit the scope of the embodiments. The mechanisms of action described in the present invention include speculation, and their correctness does not limit the scope of the embodiments.
[0013] When describing embodiments of the present invention with reference to the accompanying drawings, the structure of the embodiments of the present invention is not limited to the structure shown in the drawings. The sizes of the elements in the drawings are conceptual, and the relative sizes between the elements are not limited thereto.
[0014] In this invention, the term "step" covers not only an independent step but also a step that cannot be clearly distinguished from other steps, provided that the purpose of the step is achieved.
[0015] In this invention, the numerical range represented by “~” indicates the range included by taking the values recorded before and after “~” as the minimum and maximum values, respectively. In the numerical ranges described in stages in this invention, the upper or lower limit value described in one numerical range can be replaced with the upper or lower limit value of other numerical ranges described in stages. Furthermore, in the numerical ranges described in this invention, the upper or lower limit value of that numerical range can also be replaced with the values shown in the embodiments.
[0016] In this invention, each component may comprise multiple substances corresponding to that component. When the amount of a component in a composition is referred to and multiple substances corresponding to that component are present in the composition, the amount refers to the total amount of those substances unless otherwise specified.
[0017] In this invention, nucleic acid includes all nucleic acids (e.g., deoxyribonucleic acid (DNA), ribonucleic acid (RNA), analogs of these, natural products, and artificial products) and nucleic acids to which low-molecular-weight compounds, groups (e.g., methyl groups), molecules or structures other than nucleic acids are linked. Nucleic acids can be single-stranded or double-stranded.
[0018] In this invention, the donor vector is a substance that carries exogenous nucleic acids into cells and the cell's genome, and it is itself a nucleic acid. There are no restrictions on the source, method of delivery, or base sequence of the donor vector. The donor vector can be a circular nucleic acid or a linear nucleic acid. The donor vector can be a single-stranded nucleic acid or a double-stranded nucleic acid. Double-stranded DNA is preferred as the donor vector.
[0019] In this invention, there is no limitation on the number of amino acid residues in the protein. The protein includes proteins with post-translational modifications of amino acids. Examples of post-translational modifications of amino acids include phosphorylation, methylation, acetylation, glycosylation, and lipidation. In this invention, amino acids are denoted using the three-letter and one-letter symbols specified by the IUPAC-IUBMB JCBN (IUPAC-IUBMB Joint Commission on Biochemical Nomenclature). Unless otherwise specified, the amino acids mentioned in this invention are L-amino acids.
[0020] In this invention, the homology of base sequences and the homology of amino acid sequences are calculated using BLAST (Basic Local Alignment Search Tool) (https://blast.ncbi.nlm.nih.gov/Blast.cgi).
[0021] In this invention, recombinase is a general term for enzymes that produce recombinant nucleic acids, and includes integrase. RRS is an abbreviation for recombinase recognition site.
[0022] In this invention, when referring to the orientation or base sequence of an RRS (recombinase recognition site), of the two strands that constitute the double-stranded DNA, the strand that displays the recognition sequence of the recombinase is called the sense strand, and the strand complementary to the sense strand is called the antisense strand. In this invention, the homology of the base sequence of an RRS refers to the homology of the base sequence read in the 5'→3' direction of the sense strand (i.e., the DNA strand that displays the recognition sequence of the recombinase).
[0023] <Cell Preparation Methods> This invention provides a method for preparing cells that highly express a target gene. The cell preparation method of the present invention is a method of preparing cells by integrating the target gene into the genome of the host cell using a recombinase and a donor vector.
[0024] The cell preparation method of the present invention includes the following steps:
introducing the donor vector carrying the target gene into the host cell;
allowing the recombinase to react in the host cell into which the donor vector has been introduced; and
selecting cells expressing the target gene from the host cells after the recombinase reaction.
[0025] The source, size, and base sequence of the target gene are not limited. Examples of target genes include genes encoding at least one of the following groups: enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, viral agents, vaccines, medical proteins, their subunits and their fragments. That is, examples of the protein encoded by the target gene (referred to as the "target protein" in this invention) include at least one selected from the group consisting of enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, proteins constituting viral preparations, vaccines, medical proteins, their subunits and fragments.
[0026] In this invention, antibodies are not limited to immunoglobulins; any molecule that binds to an antigen is acceptable. In this invention, the term "antibody" encompasses both antibody fragments and antigen-binding molecules. In this invention, the heavy chain of an antibody is also referred to as the H chain, and the light chain of an antibody is also referred to as the L chain.
[0027] A target gene contains all the sequences required to express a target protein. That is, a target gene includes the coding sequence of the target protein and all the nucleic acids required for transcription and translation of that coding sequence within the host cell (e.g., promoters, transcription terminators, polyadenylation sequences). A target gene may contain one copy of the coding sequence of the target protein, or it may contain two or more copies. For example, to express all subunits of a heteromultimeric protein, a target gene may contain at least one copy of the coding sequence for each subunit. For example, a target gene may contain at least one copy each of the sequence encoding the H chain and the sequence encoding the L chain of an antibody.
[0028] The target gene may further include a sequence encoding at least one of the following groups: nucleic acids constituting viral agents, transcription control nucleic acids, and non-coding RNAs. Examples of non-coding RNAs (ncRNAs) include miRNA (microRNA), shRNA (short hairpin RNA), siRNA (small interfering RNA), snRNA (small nuclear RNA), rRNA (ribosomal RNA), and tRNA (transfer RNA).
[0029] The host cell can be either a prokaryotic cell or a eukaryotic cell. Examples of prokaryotic cells include bacterial cells. Examples of eukaryotic cells include fungi, yeast, insect cells, and mammalian cells.
[0030] Examples of bacterial cells include Gram-negative bacterial cells such as *Escherichia coli*, *Salmonella typhimurium*, *Serratia marcescens*, *Pseudomonas putida*, and *Pseudomonas aeruginosa*; and Gram-positive bacterial cells such as *Bacillus subtilis*. Preferred bacterial cells are those from the family Enterobacteriaceae, with *Escherichia coli*, especially strains B or K12, being more preferred.
[0031] As an example of fungi, Aspergillus oryzae can be cited.
[0032] Examples of yeasts include Saccharomyces cerevisiae, Pichia pastoris, and Hansenula polymorpha.
[0033] Examples of insect cells include BmN cells derived from the silkworm (Bombyx mori), Sf9 and Sf21 cells derived from the fall armyworm (Spodoptera frugiperda), S2 cells derived from the fruit fly (Drosophila melanogaster), and Pv11 cells derived from the sleeping chironomid (Polypedilum vanderplanki).
[0034] Examples of mammalian cells include Chinese hamster ovary cells (CHO cells), baby hamster kidney cells (BHK cells), human embryonic kidney cell lines (e.g., HEK293 cells), human retinoblastoma cell lines (e.g., PER.C6 cells), mouse myeloma cell lines (e.g., NS0 cells and SP2/0 cells), and cell lines derived from these cells.
[0035] Examples of CHO cells include CHO-DG44 cells, CHO-K1 cells, CHO-DXB11 cells, CHOpro3- cells, and cell lines derived from these cells.
[0036] Examples of mammalian cells include those capable of differentiating into other cell types. Examples include pluripotent stem cells such as ES cells (embryonic stem cells) and iPS cells (induced pluripotent stem cells); and multipotent stem cells such as mesenchymal stem cells, tissue stem cells, and somatic stem cells.
[0037] Examples of methods for introducing donor vectors into host cells include electroporation, lipofection, microinjection, and infection of cells with viral vectors. From the viewpoints of high safety, high delivery efficiency, and low cytotoxicity, electroporation is preferred.
[0038] The recombinase reaction is carried out in the host cells in which the donor vector has been introduced, for example by maintaining the culture environment of the host cells at the optimal temperature for the recombinase.
[0039] Recombinases can be enzymes inherent in host cells, enzymes introduced into host cells via expression vectors, or enzymes added to host cells as proteins or RNA. Recombinase expression vectors can integrate into the host genome or exist within host cells as extrachromosomal factors.
[0040] One embodiment of the cell preparation method of the present invention includes a step of introducing an expression vector for the recombinase into the host cell. So that the recombinase functions reliably within the host cell during the intended period, the recombinase is preferably introduced into the host cell via an expression vector.
[0041] There are no restrictions on the order in which the recombinase expression vector and the target gene donor vector are introduced into the host cell. They can be introduced together or separately. From the viewpoint of not increasing the steps and time required to prepare the target cells, it is preferable to introduce the recombinase expression vector and the target gene donor vector together into the host cells.
[0042] The nucleic acid and base sequence of the backbone for constructing the expression vector of recombinase are not limited. Examples of nucleic acids that can serve as the backbone include viral vectors, non-viral vectors, and artificial nucleic acids. The backbone nucleic acid can be circular or linear. Examples of viral vectors include nucleic acids derived from adenoviruses, adeno-associated viruses, retroviruses, vaccinia viruses, poxviruses, lentiviruses, herpesviruses, baculoviruses, or bacteriophages. Examples of non-viral vectors include artificial plasmids and bacterial vectors that have altered the genes of bacteria.
[0043] There are no restrictions on the source, type, or method of recombinase. Recombinases, widely used in genetic engineering, include phage-derived enzymes such as serine recombinases (types with serine residues at their active sites) and tyrosine recombinases (types with tyrosine residues at their active sites). These were discovered as enzymes that integrate the phage genome into the bacterial genome during phage infection. Some serine and tyrosine recombinases have been confirmed to function in mammalian cells.
[0044] Preferred characteristics of the recombinase used in the cell preparation method of the present invention include high specificity of the base sequence of the recognition site, no need for other factors besides the recombinase in the recombination reaction, and irreversibility of the recombination reaction.
[0045] From the viewpoint of having all the above-described characteristics, the recombinase used in the cell preparation method of the present invention is preferably a serine recombinase. From the viewpoint of being capable of recombining mammalian genomes, the serine recombinase is preferably one selected from the group consisting of Bxb1, C31, TP901, A118, SPβc, TG1, BT1, Rv1, 370.1, Wβ, PaO1, and PaO3. Among them, from the viewpoint of excellent precision and efficiency of the recombination reaction, Bxb1 recombinase (also known as Bxb1 integrase) is preferred.
[0046] When constructing an expression vector for a recombinase derived from a bacteriophage, the codons of the recombinase gene are optimized for expression in the host cell. Preferably, a coding sequence for a nuclear localization signal is appended to the recombinase gene.
[0047] Cells expressing the target gene are selected from the host cells after the recombinase reaction, for example, based on the concentration and/or purity of the target protein. This can be done by setting a baseline value for the concentration and/or purity of the target protein and selecting cells that meet that baseline value, or by selecting cells with a relatively high concentration and/or purity of the target protein. Specifically, the following steps (S1) to (S4) are performed.
[0048] (S1) Add a selection agent to the culture medium of the host cells.
(S2) Isolate the host cells as single cells (single-cell cloning).
(S3) Collect a portion of the culture medium of each single-cell-derived culture and determine the concentration and/or purity of the target protein.
(S4) Select cells for which the concentration and/or purity of the target protein is at or above the baseline value, or is relatively high.
[0049] The purity of a target protein refers to the proportion of the target protein in the total amount of the various proteins present (based on mass or quantity). When the target protein is a multimeric protein, proteins that are not in the native form may sometimes be produced (e.g., proteins with missing subunits or proteins in which one subunit has been replaced by another). A low proportion of proteins that are not in the native form, i.e., a high purity of the protein in its native form, is preferable.
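Steps (S3) and (S4) amount to a threshold or ranking filter over per-clone measurements. The following is a minimal Python sketch under stated assumptions: the clone names, concentration values (arbitrary units), and baseline values are hypothetical and are not taken from this document, and purity is computed on a simple ratio basis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloneAssay:
    """Measurements for one single-cell-derived clone (all values hypothetical)."""
    name: str
    target_conc: float  # concentration of correctly formed target protein
    total_conc: float   # concentration of all protein species measured

    @property
    def purity(self) -> float:
        # Purity as described in the text: target protein / total protein.
        return self.target_conc / self.total_conc if self.total_conc > 0 else 0.0


def select_by_baseline(clones, min_conc, min_purity):
    """(S4), absolute variant: keep clones that meet both baseline values."""
    return [c for c in clones if c.target_conc >= min_conc and c.purity >= min_purity]


def select_top(clones, n):
    """(S4), relative variant: keep the n clones with the highest concentration."""
    return sorted(clones, key=lambda c: c.target_conc, reverse=True)[:n]


clones = [
    CloneAssay("clone-A", target_conc=80.0, total_conc=100.0),
    CloneAssay("clone-B", target_conc=120.0, total_conc=125.0),
    CloneAssay("clone-C", target_conc=150.0, total_conc=300.0),
]

passed = select_by_baseline(clones, min_conc=100.0, min_purity=0.9)
print([c.name for c in passed])                 # ['clone-B']
print([c.name for c in select_top(clones, 2)])  # ['clone-C', 'clone-B']
```

In this toy data set, clone-C has the highest concentration but only 50% purity, which illustrates why the text allows concentration and purity to be used together.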
[0050] The host cell genome (also referred to as the "host genome" in this invention) and the donor vector used in the cell preparation method of the present invention satisfy the following modes (1) to (4).
[0051] (1) The host cell genome has a region R that contains, in order, one each of RRS1, RRS2, RRS3, and RRS4 as recombinase recognition sites.
(2) The donor vector has RRS5 and RRS6 as recombinase recognition sites, and the target gene located between RRS5 and RRS6.
(3) RRS1 and RRS4 can recombine with RRS5 but cannot recombine with RRS6.
(4) RRS2 and RRS3 can recombine with RRS6 but cannot recombine with RRS5.
[0052] Figure 1 shows how region R of the host genome recombines with the donor vector. Figure 1 shows a form in which region R has selectable marker genes, but region R may also lack selectable marker genes. The abbreviations in Figure 1 have the following meanings.
· GoI: target gene (gene of interest)
· 1st MG: first selectable marker gene
· 2nd MG: second selectable marker gene
[0053] By using a host genome and a donor vector that satisfy modes (1) to (4), the cell preparation method of the present invention integrates two target genes into region R of the host genome. By placing region R in a highly expressed region of the host genome, two target genes can be efficiently integrated into that highly expressed region.
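The pairing rules in conditions (1) to (4) explain why one donor design fills two positions in region R. The following is a minimal Python sketch of that logic, not a model of the enzymology: the compatibility groups "X" and "Y", the "spacer" segment between RRS2 and RRS3, and all class and function names are hypothetical. After a real recombination the flanking sites become recombined sites (sites 1 to 4 of region G); the original names are kept here only for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    group: str  # hypothetical compatibility group; same group = can recombine


@dataclass(frozen=True)
class Segment:
    label: str


# Region R per conditions (1), (3), (4): RRS1/RRS4 pair with RRS5, RRS2/RRS3 with RRS6.
REGION_R = [
    Site("RRS1", "X"), Segment("1st MG"),
    Site("RRS2", "Y"), Segment("spacer"),
    Site("RRS3", "Y"), Segment("2nd MG"),
    Site("RRS4", "X"),
]
# Donor per condition (2): RRS5 - GoI - RRS6.
RRS5, GOI, RRS6 = Site("RRS5", "X"), Segment("GoI"), Site("RRS6", "Y")


def exchange(region, rrs5, cargo, rrs6):
    """Swap in the cargo wherever a segment is flanked by one RRS5-compatible
    site and one RRS6-compatible site (in either left/right order)."""
    out = list(region)
    for i in range(1, len(region) - 1):
        left, mid, right = region[i - 1], region[i], region[i + 1]
        if not (isinstance(left, Site) and isinstance(mid, Segment) and isinstance(right, Site)):
            continue
        if {left.group, right.group} == {rrs5.group, rrs6.group}:
            out[i] = cargo
    return out


def labels(region):
    return [e.name if isinstance(e, Site) else e.label for e in region]


region_g = exchange(REGION_R, RRS5, GOI, RRS6)
print(labels(region_g))
# ['RRS1', 'GoI', 'RRS2', 'spacer', 'RRS3', 'GoI', 'RRS4']
```

The segment between RRS2 and RRS3 is left untouched because both of its flanks belong to the same compatibility group, so no single donor can pair with both.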
[0054] In one embodiment, the genome of the host cell further satisfies the following mode (5). (5) RRS1 and RRS4 are the same sequence, and RRS2 and RRS3 are the same sequence.
[0055] When the host genome satisfies mode (5), the two target genes are integrated into region R of the host genome more reliably.
[0056] In one embodiment, the donor vector further satisfies the following mode (6). (6) The transcription direction of the target gene located between RRS5 and RRS6 is from RRS6 to RRS5.
[0057] The transcription direction of the target gene here is the potential transcription direction within the donor vector. Mode (6) means that the elements constituting the target gene, namely the coding sequence of the target protein and all nucleic acids required for its transcription and translation in the host cell (e.g., promoter, transcription terminator, polyadenylation sequence), are arranged between RRS5 and RRS6 in an order and orientation that enable transcription and translation in the direction from RRS6 to RRS5.
[0058] When the donor vector satisfies mode (6), recombination places the two neighboring target genes on the host genome so that their transcription directions point away from each other (←→). Compared with an arrangement in which the transcription directions of two neighboring target genes point toward each other (→←), the arrangement in which they point away from each other (←→) can be expected to give higher expression of the target genes.
[0059] The donor vector shown in Figure 1 satisfies mode (6). In the recombination pattern shown in Figure 1, the transcription directions of the two neighboring target genes on the host genome point away from each other (←→).
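The link between the donor orientation in condition (6) and the ←→ arrangement on the host genome can be traced with a small Python sketch. It assumes, as in conditions (3) and (4), that RRS1 and RRS4 pair with RRS5 and that RRS2 and RRS3 pair with RRS6; the function names are hypothetical.

```python
DIVERGENT, CONVERGENT = "←→", "→←"


def arrow(left_partner: str, right_partner: str, transcribed_from: str) -> str:
    """Direction of an integrated target gene on the host genome.

    left_partner / right_partner: the donor site ("RRS5" or "RRS6") that the
    host site on each flank recombined with.
    transcribed_from: the donor site at which transcription starts.
    """
    if {left_partner, right_partner} != {"RRS5", "RRS6"}:
        raise ValueError("each flank must pair with a different donor site")
    # Transcription runs away from the flank that paired with transcribed_from.
    return "→" if left_partner == transcribed_from else "←"


def pair_orientation(transcribed_from: str) -> str:
    # Slot 1 is RRS1...RRS2 (partners RRS5, RRS6);
    # slot 2 is RRS3...RRS4 (partners RRS6, RRS5).
    slot1 = arrow("RRS5", "RRS6", transcribed_from)
    slot2 = arrow("RRS6", "RRS5", transcribed_from)
    return slot1 + slot2


print(pair_orientation("RRS6") == DIVERGENT)   # donor transcribed RRS6 -> RRS5
print(pair_orientation("RRS5") == CONVERGENT)  # opposite donor orientation
```

Because the two slots in region R are mirror images of each other, the same donor lands in opposite orientations, and the donor orientation alone decides whether the pair diverges or converges.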
[0060] In one embodiment, the genome of the host cell further satisfies the following mode (7). (7) In region R, a first selectable marker gene is located between RRS1 and RRS2, and a second selectable marker gene is located between RRS3 and RRS4.
[0061] When the host genome satisfies mode (7), host cells in which recombination has occurred within region R are easily selected and enriched after the recombinase reaction, and host cells having region R in the genome are easily selected and enriched before the recombinase reaction.
[0062] Region R of the host genome shown in Figure 1 satisfies mode (7). The transcription directions of the first and second selectable marker genes may be the same (→→ or ←←), may point toward each other (→←), or may point away from each other (←→).
[0063] In one embodiment, the donor vector further satisfies the following mode (8). (8) The donor vector has a third selectable marker gene located between RRS5 and RRS6.
[0064] When the donor vector satisfies mode (8), host cells in which the target gene has been integrated into the genome are easily selected and enriched.
[0065] The following provides a detailed description of RRS1 to RRS6, the host cell genome, and the donor vector.
[0066] [RRS1~RRS6] First, the characteristics of the RRS of serine recombinases will be explained. The recombinase recognition sites (RRS) of serine recombinases are usually associated with their phage origin and are referred to as attP (phage attachment site) and attB (bacterial attachment site). Serine recombinases recombine DNA between attP and attB. Sequences that have a base sequence similar to the native attP or attB and are recognized by serine recombinases are called pseudo attP and pseudo attB, respectively. The number of bases in attP and pseudo attP can range from 1 bp to 1000 bp, usually from 10 bp to 300 bp, and more often from 20 bp to 200 bp. The number of bases in attB and pseudo attB can range from 1 bp to 1000 bp, usually from 10 bp to 300 bp, and more often from 20 bp to 200 bp.
[0067] As examples of the RRS of a serine recombinase, the native attP and native attB, and a pseudo attP and a pseudo attB, of Bxb1 recombinase (also known as Bxb1 integrase) are shown below. For the RRS of serine recombinases, whether attP and attB can recombine is sometimes determined by whether two bases at or near the center of the sequence (referred to as the "central portion" in this invention) are the same or different. For Bxb1 recombinase, whether attP and attB can recombine is typically determined by whether the two bases in the central portion are the same or different. In the sequences below, the two bases in the central portion relevant to the recombination of attP and attB are set apart by spaces (underlined in the original).
[0068] Native attP (sequence number 1): 5'-GTCGTGGTTTGTCTGGTCAACCACCGCG GT CTCAGTGGTGTACGGTACAAACCCCGAC-3'
Native attB (sequence number 2): 5'-TCGGCCGGCTTGTCGACGACGGCG GT CTCCGTCGTCAGGATCATCCGGGC-3'
[0069] An example of pseudo attP (sequence number 3): 5'-GTCGTGGTTTGTCTGGTCAACCACCGCG CT CTCAGTGGTGTACGGTACAAACCCCGAC-3'
An example of pseudo attB (sequence number 4): 5'-TCGGCCGGCTTGTCGACGACGGCG CT CTCCGTCGTCAGGATCATCCGGGC-3'
[0070] Sequence number 3 is a sequence in which the central portion "GT" of sequence number 1 is changed to "CT" (that is, the first base of the central portion is changed from G to C). Sequence number 4 is a sequence in which the central portion "GT" of sequence number 2 is changed to "CT" in the same way.
[0071] The RRS of serine recombinases sometimes determines the feasibility of recombination between attP and attB based on the similarity or difference of two bases in the central region. For sequence numbers 1 through 4, the feasibility of recombination is as follows. Sequence numbers 1 and 2, which share two identical bases in the central region, can recombine. Sequence numbers 3 and 4, which share two identical bases in the central region, can also recombine. Sequence number 1 and sequence number 4, which have two different bases in the central region, cannot recombine. Sequence number 3 and sequence number 2, which have two different bases in the central region, cannot recombine. In this invention, "cannot recombine" includes both the inability to recombine and the probability of recombination occurring being lower than expected.
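The pairing rule for sequence numbers 1 to 4 can be checked mechanically. The following is a minimal Python sketch that applies only the rule stated in the text (identical central two bases means the pair can recombine); it relies on the space-delimited layout of the sequences as printed here, and the function names are hypothetical.

```python
# Bxb1 sites as listed in the text; the central two bases are set apart by spaces.
SEQUENCES = {
    1: "GTCGTGGTTTGTCTGGTCAACCACCGCG GT CTCAGTGGTGTACGGTACAAACCCCGAC",  # native attP
    2: "TCGGCCGGCTTGTCGACGACGGCG GT CTCCGTCGTCAGGATCATCCGGGC",          # native attB
    3: "GTCGTGGTTTGTCTGGTCAACCACCGCG CT CTCAGTGGTGTACGGTACAAACCCCGAC",  # pseudo attP
    4: "TCGGCCGGCTTGTCGACGACGGCG CT CTCCGTCGTCAGGATCATCCGGGC",          # pseudo attB
}


def central_dinucleotide(site: str) -> str:
    """Return the middle token of a 'left CENTER right' formatted site."""
    left, center, right = site.split()
    if len(center) != 2:
        raise ValueError("expected a two-base central portion")
    return center


def can_recombine(attp: str, attb: str) -> bool:
    """Rule from the text: recombination requires identical central bases."""
    return central_dinucleotide(attp) == central_dinucleotide(attb)


for p, b in [(1, 2), (3, 4), (1, 4), (3, 2)]:
    print(f"sequence {p} x sequence {b}: {can_recombine(SEQUENCES[p], SEQUENCES[b])}")
# sequence 1 x sequence 2: True
# sequence 3 x sequence 4: True
# sequence 1 x sequence 4: False
# sequence 3 x sequence 2: False
```

As the text notes, "cannot recombine" also covers recombination that is merely less likely than expected, so the boolean result here is a simplification.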
[0072] RRS1 to RRS6 are preferably designed, taking into account the two bases in the central portion, as base sequences that can be recognized by the same type of serine recombinase. Specifically, it is preferable that they have the following modes (a) to (h). Modes (3) and (4) can be readily realized by modes (a) to (h).
[0073] (a) RRS1 and RRS4 have two identical bases in the central portion, and the base sequences as a whole are homologous. The overall homology of the base sequences is preferably 80% or more, more preferably 90% or more, further preferably 95% or more, and most preferably 100%. (b) The two bases in the central portion of RRS2 and RRS3 are identical, and the base sequences as a whole are homologous. The overall homology of the base sequences is preferably 80% or more, more preferably 90% or more, further preferably 95% or more, and most preferably 100%. (c) One or both of the two bases in the central portion of RRS1 (and RRS4) are different from those in RRS2 (and RRS3), and the overall base sequence is homologous. The overall homology of the base sequence is preferably 80% or more, more preferably 90% or more, and even more preferably 95% or more. RRS1 (and RRS4) and RRS2 (and RRS3) are preferably identical sequences except for one or both of the two bases in the central portion. (d) The number of bases in RRS1 to RRS4 is preferably 1 bp to 1000 bp, more preferably 10 bp to 300 bp, and even more preferably 20 bp to 200 bp. The difference in the number of bases in RRS1 to RRS4 is preferably 30% or less, more preferably 20% or less, and even more preferably 15% or less. The number of bases in RRS1 to RRS4 is most preferably the same.
[0074] (e) The two bases in the central part of RRS5 are the same as the two bases in the central part of RRS1 and RRS4. (f) The two bases in the central part of RRS6 are the same as the two bases in the central part of RRS2 and RRS3. (g) One or both of the two bases in the central portion of RRS5 and RRS6 are different, and the overall base sequence is homologous. The overall homology of the base sequence is preferably 80% or more, more preferably 90% or more, and even more preferably 95% or more. RRS5 and RRS6 are preferably identical sequences except for one or both of the two bases in the central portion. (h) The number of bases in RRS5 and RRS6 is preferably 1 bp to 1000 bp, more preferably 10 bp to 300 bp, and even more preferably 20 bp to 200 bp. The difference in the number of bases between RRS5 and RRS6 is preferably 30% or less, more preferably 20% or less, and even more preferably 15% or less. The number of bases in RRS5 and RRS6 is most preferably the same.
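The homology and length criteria in (c), (d), (g) and (h) can be checked numerically. The sketch below assumes that "homology" is computed as position-by-position identity between sequences of equal length and that the length difference is taken relative to the longer sequence. The text does not specify either calculation, so both are assumptions, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Position-by-position identity (%) of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def length_difference_percent(a, b):
    """Difference in base count (%), relative to the longer sequence."""
    longer, shorter = max(len(a), len(b)), min(len(a), len(b))
    return 100.0 * (longer - shorter) / longer


# Sequence numbers 1 and 3 (native and pseudo attP of Bxb1).
seq1 = "GTCGTGGTTTGTCTGGTCAACCACCGCG" + "GT" + "CTCAGTGGTGTACGGTACAAACCCCGAC"
seq3 = "GTCGTGGTTTGTCTGGTCAACCACCGCG" + "CT" + "CTCAGTGGTGTACGGTACAAACCCCGAC"

identity = percent_identity(seq1, seq3)
print(f"identity: {identity:.1f}%")
print(f"length difference: {length_difference_percent(seq1, seq3):.1f}%")
print("meets the 95% or more level:", identity >= 95.0)
```

Sequence numbers 1 and 3 differ at a single position out of 58, so under these assumptions the pair satisfies the "95% or more" level of (c) and the "same number of bases" case of (d).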
[0075] When a serine recombinase is used in cell preparation, examples of implementations of RRS1 to RRS6 include the following modes (i) and (j).
[0076] (i) RRS1 to RRS4 are the native attP and pseudo attP of the serine recombinase, and RRS5 and RRS6 are the native attB and pseudo attB of the serine recombinase. RRS1, RRS4 and RRS5 (or RRS2, RRS3 and RRS6) are the native att sites of the serine recombinase.
[0077] (j) RRS1 to RRS4 are the native attB and pseudo attB of the serine recombinase, and RRS5 and RRS6 are the native attP and pseudo attP of the serine recombinase. RRS1, RRS4 and RRS5 (or RRS2, RRS3 and RRS6) are the native att sites of the serine recombinase.
[0078] The natural recognition sequences of serine recombinases can be obtained from academic papers, technical literature, and other sources. There are 16 possible sequences that can fit the two bases in the central portion of attP and attB. Specifically, these are “GT”, “CT”, “AT”, “TT”, “GA”, “CA”, “AA”, “TA”, “GC”, “CC”, “AC”, “TC”, “GG”, “CG”, “AG”, and “TG” in the 5’→3’ direction. The two bases in the central portion of pseudo-att are selected from these sequences.
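The 16 candidate central dinucleotides can be enumerated programmatically. The short sketch below reproduces the order in which they are listed above; the variable names are hypothetical.

```python
# First base cycles through G, C, A, T; second base through T, A, C, G,
# which reproduces the listing order used in the text.
FIRST_BASES = "GCAT"
SECOND_BASES = "TACG"

central_options = [first + second
                   for second in SECOND_BASES
                   for first in FIRST_BASES]

print(central_options)
print(len(central_options))
```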
[0079] When Bxb1 recombinase is used in cell preparation, one example of an implementation of RRS1 to RRS6 is the following, referred to as Ex(1). RRS1 and RRS4 are sequence number 1. RRS2 and RRS3 are sequence number 3. RRS5 is sequence number 2. RRS6 is sequence number 4.
[0080] Another example of an implementation of RRS1 to RRS6 using Bxb1 recombinase is the following, referred to as Ex(2). RRS1 and RRS4 are sequence number 3. RRS2 and RRS3 are sequence number 1. RRS5 is sequence number 4. RRS6 is sequence number 2.
[0081] Another example of an implementation of RRS1 to RRS6 using Bxb1 recombinase is the following, referred to as Ex(3). RRS1 and RRS4 are sequence number 2. RRS2 and RRS3 are sequence number 4. RRS5 is sequence number 1. RRS6 is sequence number 3.
[0082] Another example of an implementation of RRS1 to RRS6 using Bxb1 recombinase is the following, referred to as Ex(4). RRS1 and RRS4 are sequence number 4. RRS2 and RRS3 are sequence number 2. RRS5 is sequence number 3. RRS6 is sequence number 1.
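Ex(1) to Ex(4) can be written as a lookup table and checked against the design rules given earlier: RRS1, RRS4 and RRS5 share their central two bases, RRS2, RRS3 and RRS6 share theirs, the two groups differ, and the host-side sites are of the opposite att type to the donor-side sites. The following Python sketch performs that check; `EXAMPLES` and `is_consistent` are hypothetical names, and the rule set is a simplified reading of the text.

```python
# Central two bases and site type of sequence numbers 1 to 4.
CENTRAL = {1: "GT", 2: "GT", 3: "CT", 4: "CT"}
KIND = {1: "attP", 2: "attB", 3: "attP", 4: "attB"}

# Assignment of sequence numbers to RRS1 to RRS6 in each example.
EXAMPLES = {
    "Ex(1)": {"RRS1": 1, "RRS2": 3, "RRS3": 3, "RRS4": 1, "RRS5": 2, "RRS6": 4},
    "Ex(2)": {"RRS1": 3, "RRS2": 1, "RRS3": 1, "RRS4": 3, "RRS5": 4, "RRS6": 2},
    "Ex(3)": {"RRS1": 2, "RRS2": 4, "RRS3": 4, "RRS4": 2, "RRS5": 1, "RRS6": 3},
    "Ex(4)": {"RRS1": 4, "RRS2": 2, "RRS3": 2, "RRS4": 4, "RRS5": 3, "RRS6": 1},
}


def is_consistent(assignment):
    """Check the central-base and att-type rules for one assignment."""
    central = {rrs: CENTRAL[n] for rrs, n in assignment.items()}
    kind = {rrs: KIND[n] for rrs, n in assignment.items()}
    same_a = central["RRS1"] == central["RRS4"] == central["RRS5"]
    same_b = central["RRS2"] == central["RRS3"] == central["RRS6"]
    differ = central["RRS1"] != central["RRS2"]
    host_kinds = {kind[r] for r in ("RRS1", "RRS2", "RRS3", "RRS4")}
    donor_kinds = {kind[r] for r in ("RRS5", "RRS6")}
    opposite = (len(host_kinds) == 1 and len(donor_kinds) == 1
                and host_kinds != donor_kinds)
    return same_a and same_b and differ and opposite


for name, assignment in EXAMPLES.items():
    print(name, is_consistent(assignment))
```

Ex(1) and Ex(2) place attP-type sites in the host genome and attB-type sites in the donor vector; Ex(3) and Ex(4) do the reverse.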
[0083] As another example of the implementation of RRS1 to RRS6, a method can be given in which the two bases in the central part are changed based on Ex(1) to Ex(4). Two kinds are selected from 16 kinds of two-base sequences, one of which is set as the two bases in the central part of RRS1, RRS4 and RRS5, and the other is set as the two bases in the central part of RRS2, RRS3 and RRS6.
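A design of this kind can be generated by substituting the central two bases into the arms of the Bxb1 attP and attB sequences. The sketch below follows the Ex(1) layout (attP-type sites in the host, attB-type sites in the donor) and counts the ordered choices of two different dinucleotides. The helper names are hypothetical, RRS orientation is ignored, and the sketch makes no claim that every dinucleotide pair works equally well in practice.

```python
from itertools import permutations, product

# (left arm, right arm) of the Bxb1 attP and attB sequences.
ATTP_ARMS = ("GTCGTGGTTTGTCTGGTCAACCACCGCG", "CTCAGTGGTGTACGGTACAAACCCCGAC")
ATTB_ARMS = ("TCGGCCGGCTTGTCGACGACGGCG", "CTCCGTCGTCAGGATCATCCGGGC")


def with_central(arms, central):
    """Join left arm, central two bases and right arm into one sequence."""
    left, right = arms
    return left + central + right


def design(central_a, central_b):
    """RRS1 to RRS6 with attP-type sites in the host, as in Ex(1)."""
    if central_a == central_b:
        raise ValueError("the two central dinucleotides must differ")
    return {
        "RRS1": with_central(ATTP_ARMS, central_a),
        "RRS2": with_central(ATTP_ARMS, central_b),
        "RRS3": with_central(ATTP_ARMS, central_b),
        "RRS4": with_central(ATTP_ARMS, central_a),
        "RRS5": with_central(ATTB_ARMS, central_a),
        "RRS6": with_central(ATTB_ARMS, central_b),
    }


central_options = ["".join(p) for p in product("ACGT", repeat=2)]
ordered_choices = list(permutations(central_options, 2))
print(len(ordered_choices))  # ordered pairs of two different dinucleotides

ex1 = design("GT", "CT")  # reproduces the sequences used in Ex(1)
print(ex1["RRS1"])
print(ex1["RRS6"])
```

The choice is counted as ordered because one dinucleotide is assigned to RRS1, RRS4 and RRS5 and the other to RRS2, RRS3 and RRS6.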
[0084] An example of an implementation of Ex(1) is shown in Table 1. In Table 1, only the bases of the sense strand (the DNA strand showing the recognition sequence of the recombinase) are shown, and the bases of the antisense strand are omitted. The arrows are in the 5'→3' direction of the sense strand. In each sequence shown in Table 1, the two bases of the central portion that are related to the recombination between RRS are underlined.
[0085] [Table 1]
[0086] For Ex(1), orientations of each RRS other than those shown in Table 1 are also possible. The orientation of each RRS is not limited to that shown in Table 1, as long as it enables the target gene to move from the donor vector to the two locations within region R.
[0087] When using serine recombinase in cell preparation, the orientation of RRS1 to RRS4 in region R and the orientation of RRS5 and RRS6 in the donor vector are preferably as follows to effectively achieve the movement of the target gene from the donor vector to two locations within region R. In the following description, the orientation of the RRS is shown in the 5'→3' direction on the positive strand (the DNA strand that shows the recognition sequence of the recombinase).
[0088] • When the orientation of RRS5 and RRS6 is "→ target gene ←", the orientation of RRS1, RRS2, RRS3 and RRS4 is "→ ← → ←" (the orientation shown in Table 1).
• When the orientation of RRS5 and RRS6 is "← target gene →", the orientation of RRS1, RRS2, RRS3 and RRS4 is "← → ← →".
• When the orientation of RRS5 and RRS6 is "→ target gene →", the orientation of RRS1, RRS2, RRS3 and RRS4 is "→ → ← ←".
• When the orientation of RRS5 and RRS6 is "← target gene ←", the orientation of RRS1, RRS2, RRS3 and RRS4 is "← ← → →".
RRS1 and RRS4 are oriented in opposite directions. That is, the positive strand of RRS1 and the positive strand of RRS4 are different DNA strands. RRS2 and RRS3 are oriented in opposite directions. That is, the positive strand of RRS2 and the positive strand of RRS3 are different DNA strands.
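The four orientation cases above follow one pattern: RRS1 takes the orientation of RRS5, RRS2 takes that of RRS6, RRS3 is the reverse of RRS6, and RRS4 is the reverse of RRS5. This pattern is read off from the four listed cases and is not stated as a rule in the text. The Python sketch below, with hypothetical names, derives the host orientations from the donor orientations and compares them with the listed cases.

```python
FLIP = {"→": "←", "←": "→"}


def host_orientation(rrs5, rrs6):
    """Orientation of (RRS1, RRS2, RRS3, RRS4) for a given (RRS5, RRS6)."""
    return (rrs5, rrs6, FLIP[rrs6], FLIP[rrs5])


# The four cases listed in the text: (RRS5, RRS6) -> (RRS1, RRS2, RRS3, RRS4).
LISTED = {
    ("→", "←"): ("→", "←", "→", "←"),
    ("←", "→"): ("←", "→", "←", "→"),
    ("→", "→"): ("→", "→", "←", "←"),
    ("←", "←"): ("←", "←", "→", "→"),
}

for donor, host in LISTED.items():
    derived = host_orientation(*donor)
    print(donor, derived, derived == host)
```

By construction RRS1 and RRS4 always come out opposite to each other, as do RRS2 and RRS3, which matches the closing statement of the paragraph.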
[0089] [Host cell genome] The host cell's genome (also referred to as the "host genome" in this invention) has a region R. Region R is a region containing one each of RRS1, RRS2, RRS3 and RRS4, which serve as recognition sites for the recombinase. The RRSs in region R are arranged in the order RRS1, RRS2, RRS3, RRS4. Region R is a contiguous region. The host genome can have one region R, or two or more regions R, in the genome as a whole.
[0090] In region R, a target gene can be inserted between RRS1 and RRS2, and a target gene can also be inserted between RRS3 and RRS4, through recombination with the donor vector. Therefore, two target genes can be inserted into each region R.
[0091] One example of a host genome implementation has a first selectable marker gene configured between RRS1 and RRS2, and a second selectable marker gene configured between RRS3 and RRS4 in region R. The first and second selectable marker genes each contain all the nucleic acids required for gene expression. The size and base sequence of the first and second selectable marker genes are not limited.
[0092] The first and second selection marker genes can be the same gene or different genes. From the viewpoint of not increasing the steps and time required for cell selection and enrichment, it is preferable that the first and second selection marker genes are the same gene.
[0093] One example of the first and second selectable marker genes is a negative selection gene used to select and enrich host cells in which recombination has occurred within region R. Examples of negative selection genes include suicide genes that induce cell death in the presence of a specific drug. Examples of suicide genes include the thymidine kinase gene derived from herpes simplex virus (selective drug: ganciclovir), the inducible caspase 9 gene (selective drug: AP1903), and the cytosine deaminase gene (selective drug: 5-fluorocytosine).
[0094] One example of the implementation of the first and second selection marker genes is a gene that expresses a positive selection marker for selecting and enriching host cells containing region R in the genome. Fluorescent proteins can be cited as an example of positive selection markers. Any known fluorescent protein can be used as the fluorescent protein. Preferably, the fluorescent protein is a monomeric high-brightness fluorescent protein.
[0095] In one embodiment, one of a negative selection gene and a positive selection gene is disposed between RRS1 and RRS2. In another embodiment, both a negative selection gene and a positive selection gene are disposed between RRS1 and RRS2. In one embodiment, one of a negative selection gene and a positive selection gene is configured between RRS3 and RRS4. In another embodiment, both a negative selection gene and a positive selection gene are configured between RRS3 and RRS4.
[0096] In region R, the number of bases between the outer end of RRS1 and the outer end of RRS4, which is the recognition site furthest from RRS1, is, for example, less than 100 kbp, less than 70 kbp, less than 50 kbp, less than 30 kbp, or less than 10 kbp. The number of bases between the outer ends of RRS1 and RRS4 is, for example, more than 100 bp, more than 1 kbp, or more than 2 kbp.
[0097] In region R, the number of bases between the outer end of RRS2 (the end closest to RRS1) and the outer end of RRS3 (the end closest to RRS4) is preferably 50 bp or more, more preferably 100 bp or more, and even more preferably 200 bp or more. According to this method, the two target genes inserted into region R by recombination are arranged close together at an appropriate distance, and high expression of the target genes can be expected.
[0098] Region R can be a region that already exists in the host genome or a region that is newly formed in the host genome.
[0099] Region R is formed in the host genome, for example, by integrating region R into the host genome using a vector carrying region R (referred to in this invention as a "host genome construction vector").
[0100] The host genome construction vector has at least RRS1, RRS2, RRS3, and RRS4 in sequence. One embodiment of the host genome construction vector has a first selectable marker gene configured between RRS1 and RRS2, and a second selectable marker gene configured between RRS3 and RRS4.
[0101] The nucleic acid and base sequences of the backbone of the vector used to construct the host genome are not limited. Examples of nucleic acids that can serve as the backbone include viral vectors, non-viral vectors, and artificial nucleic acids. The backbone nucleic acids can be circular or linear. Examples of viral vectors include nucleic acids derived from adenoviruses, adeno-associated viruses, retroviruses, vaccinia viruses, poxviruses, lentiviruses, herpesviruses, baculoviruses, or bacteriophages. Examples of non-viral vectors include artificial plasmids and bacterial vectors that have altered the genes of bacteria.
[0102] One example of a host genome is a host genome in which region R has been inserted into at least one safe harbor within the host genome, so that the safe harbor contains region R.
[0103] A safe harbor within the genome refers to a region where an inserted gene is stably maintained in the host cell and is expressed. Safe harbors within the genome are identified by chromosome number or accession number and base number from a public base sequence database. Examples of public base sequence databases include INSD (the International Nucleotide Sequence Databases) and RefSeq (NCBI Reference Sequence Database). Safe harbors within the genome are sometimes referred to by the names of well-known genes located within or near the region.
[0104] The safe harbor into which region R is inserted can be a well-known safe harbor or a newly discovered safe harbor. Well-known safe harbors can be identified from publicly available databases, academic papers, technical literature, etc.
[0105] When multiple safe harbors exist, at least one can be selected as the insertion region for region R. Methods for selecting a safe harbor include, for example, selecting a safe harbor where the expression level (pg / cell / copy) of the protein encoded by the inserted gene is relatively high; or selecting a safe harbor where the expression level (pg / cell / copy) of the protein encoded by the inserted gene exceeds a pre-defined benchmark. The protein expression level of the safe harbor can be data obtained from publicly available databases, academic papers, technical literature, etc., or it can be data obtained by actually inserting the gene into the safe harbor and measuring the protein expression level.
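The two selection approaches described above (choosing the relatively highest-expressing safe harbor, or choosing every safe harbor above a pre-defined benchmark) can be sketched as follows. The locus names and expression values are invented placeholders, not measured data, and the function names are hypothetical.

```python
def select_highest(levels, top_n=1):
    """Return the top_n safe harbors ranked by expression level."""
    ranked = sorted(levels, key=levels.get, reverse=True)
    return ranked[:top_n]


def select_above_benchmark(levels, benchmark):
    """Return the safe harbors whose expression level exceeds the benchmark."""
    return [name for name, level in levels.items() if level > benchmark]


# Placeholder expression levels in pg/cell/copy (illustrative values only).
expression = {"locus_A": 12.0, "locus_B": 30.5, "locus_C": 8.2}

print(select_highest(expression))
print(select_above_benchmark(expression, benchmark=10.0))
```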
[0106] Using well-known genome editing technologies, region R can be inserted by targeting a safe harbor within the genome.
[0107] [Donor vector] The donor vector has RRS5 and RRS6 as recognition sites for the recombinase, and the target gene positioned between RRS5 and RRS6.
[0108] A target gene contains all the sequences required to express a target protein. That is, a target gene includes the coding sequence of the target protein and all the nucleic acids required for transcription and translation of that coding sequence within the host cell (e.g., promoters, transcription terminators, polyadenylation sequences). A target gene may contain one copy of the coding sequence of the target protein, or it may contain two or more copies of the coding sequence of the target protein. For example, to express all subunits of a heteropolymer protein, a target gene may contain at least one copy of the coding sequence for each subunit. For example, a target gene may contain at least one copy of the sequence encoding the H chain and the sequence encoding the L chain of an antibody.
[0109] Promoters that can be used in prokaryotic cells include those disclosed in J. Mol. Biol. 1986; 189(1): 113-30, phage polymerase promoters, and E. coli polymerase promoters. Specific examples include T7A1, T7A2, T7A3, λpL, λpR, lac, lacUV5, trp, tac, trc, phoA, and rrnB.
[0110] Examples of promoters that can be used in yeast cells include the gal promoter, AOX1 promoter, AOX2 promoter, GAP promoter, GAL1 promoter, and GAL10 promoter.
[0111] Examples of promoters that can be used in insect cells include the polyhedrin promoter, the P10 promoter, the immediate-early 1 (IE-1) promoter, the MT promoter, the COPIA promoter, the CMV promoter, the RSV promoter, the SV40 promoter, the heat shock protein promoter, the OPIE2 promoter, and the actin 5C promoter.
[0112] Examples of promoters that can be used in mammalian cells include viral promoters and promoters derived from housekeeping genes. Examples of viral promoters include the human CMV promoter, rat CMV promoter, SV40 promoter, RSV-LTR promoter, and HSV-TK promoter. Examples of housekeeping gene promoters include the hEF-1α promoter, the Chinese hamster EF-1α promoter, the β-actin promoter, and the mouse phosphoglycerate kinase (mPGK) promoter. A preferred promoter for use in mammalian cells is the EF-1α promoter, and the hEF-1α promoter is more preferred.
[0113] To facilitate the extracellular transport or secretion of a target protein, the target gene may contain a coding sequence for a secretory signal peptide. A secretory signal peptide is a type of signal peptide that induces the extracellular transport or secretion of a polypeptide.
[0114] When the target gene contains a coding sequence for a secretory signal peptide, the coding sequence for the secretory signal peptide and the coding sequence for the target protein are arranged in the same reading frame. Here, "arranged in the same reading frame" means that the two coding sequences are arranged so that they are expressed as a single polypeptide. In the target gene, a linker or spacer coding sequence may or may not be present between the coding sequence for the secretory signal peptide and the coding sequence for the target protein. In a preferred embodiment, the coding sequence of the target protein is arranged downstream of the coding sequence of the secretory signal peptide, in the same reading frame. The target gene of this embodiment expresses a fusion protein that has the secretory signal peptide at the N-terminus of the target protein. In a more preferred embodiment, the coding sequence of the target protein is arranged immediately downstream of, and contiguous with, the coding sequence of the secretory signal peptide, in the same reading frame. The target gene of this embodiment likewise expresses a fusion protein that has the secretory signal peptide at the N-terminus of the target protein. Here, "downstream" refers to the order in which the two coding sequences are arranged: when coding sequence A is transcribed and then coding sequence B is transcribed, coding sequence B is said to be arranged downstream of coding sequence A. The secretory signal peptide is usually cleaved from the fusion protein during transport or secretion.
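The "same reading frame" condition can be illustrated with a small check that a signal peptide coding sequence, an optional linker, and a target coding sequence read as one open reading frame. The Python sketch below is a simplified illustration: the function name is hypothetical, and the short DNA strings are invented placeholders, not real signal peptide or target protein sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def is_single_orf(signal_cds, target_cds, linker=""):
    """True if signal + linker + target reads as one open reading frame."""
    fused = signal_cds + linker + target_cds
    if len(fused) % 3 != 0 or not fused.startswith("ATG"):
        return False
    # The signal and linker must not shift the frame of the target.
    if len(signal_cds) % 3 != 0 or len(linker) % 3 != 0:
        return False
    triplets = codons(fused)
    # Exactly one stop codon, at the very end of the fused sequence.
    return (triplets[-1] in STOP_CODONS
            and not any(c in STOP_CODONS for c in triplets[:-1]))


signal = "ATGCTGCGT"      # placeholder signal peptide CDS without a stop codon
target = "GCTGAAAAATAA"   # placeholder target CDS ending in a stop codon

print(is_single_orf(signal, target))                # contiguous, in frame
print(is_single_orf(signal, target, linker="GGA"))  # in-frame linker
print(is_single_orf(signal, target, linker="GG"))   # frame-shifting linker
```

The first call corresponds to the more preferred embodiment (target coding sequence directly after the signal peptide coding sequence), and the second to an embodiment with a linker or spacer between them.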
[0115] Examples of secretory signal peptides include fibronectin secretory signal peptide, collagen secretory signal peptide, and albumin secretory signal peptide. From the viewpoint of high extracellular secretion rate of fused proteins, fibronectin secretory signal peptide is preferred.
[0116] Examples of fibronectin secretion signal peptides include those from amphibians and mammals. Specifically, the fibronectin secretion signal peptide from the African clawed frog (Xenopus laevis) is an example of an amphibian secretion signal peptide. Examples of mammalian fibronectin secretion signal peptides include various fibronectin secretion signal peptides from humans, rats, mice, cattle, pigs, dogs, cats, and Chinese hamsters, as well as their functional equivalents.
[0117] The source organism for the fibronectin secretion signal peptide is preferably selected based on the type of host cell. When the host cell is a human cell, the human fibronectin secretion signal peptide is preferably used for the target gene. When the host cell is a rat cell, the rat fibronectin secretion signal peptide is preferably used for the target gene. When the host cell is a CHO cell, the Chinese hamster fibronectin secretion signal peptide is preferably used for the target gene.
[0118] One example of the target gene includes, operably linked to one another, an hEF-1α promoter, a coding sequence for a fibronectin secretion signal peptide, a coding sequence for the target protein, and a polyA sequence.
[0119] The transcription direction of the target gene positioned between RRS5 and RRS6 is preferably from RRS6 towards RRS5. According to this mode, the transcription directions of the two target genes inserted into region R of the host genome point away from each other (←→).
[0120] One embodiment of the donor vector has a third selectable marker gene arranged between RRS5 and RRS6. The third selectable marker gene contains all the nucleic acids required for gene expression. The size and base sequence of the third selectable marker gene are not limited. The third selectable marker gene expresses a positive selection marker used to select and enrich host cells in which the target gene has been integrated into the genome.
[0121] Examples of the third selectable marker gene include genes that confer resistance to a selective agent. Examples of selective agents include antibiotics and enzyme inhibitors.
[0122] When the selective agent is an antibiotic, the selectable marker gene is an antibiotic resistance gene, that is, a gene for an enzyme that degrades the antibiotic. Examples include puromycin resistance genes, hygromycin resistance genes, neomycin resistance genes, chloramphenicol resistance genes, tetracycline resistance genes, erythromycin resistance genes, spectinomycin resistance genes, kanamycin resistance genes, G418 resistance genes, blasticidin resistance genes, zeocin resistance genes, phleomycin resistance genes, and ampicillin resistance genes.
[0123] One example in which the selective agent is an enzyme inhibitor is the DHFR-MTX system. In the DHFR-MTX system, the selective agent is methotrexate (MTX), and the selectable marker gene is the dihydrofolate reductase (DHFR) gene. The DHFR-MTX system is effective in host cells lacking the DHFR gene (e.g., CHO-DG44 cells).
[0124] Another example in which the selective agent is an enzyme inhibitor is the GS-MSX system. In the GS-MSX system, the selective agent is methionine sulfoximine (MSX), and the selectable marker gene is the glutamine synthetase (GS) gene. The GS-MSX system is effective in host cells lacking the GS gene (e.g., GS knockout CHO cells).
[0125] Another example of the third selectable marker gene is a fluorescent protein gene. Any known fluorescent protein can be used as the fluorescent protein. Preferably, the fluorescent protein is a monomeric high-brightness fluorescent protein. When the host genome also contains a fluorescent protein gene, it is preferable that the excitation and fluorescence wavelengths of the fluorescent proteins do not overlap.
[0126] Two or more of the genes mentioned above can be used in combination as the third selectable marker gene. For example, a drug resistance gene and a fluorescent protein gene can both be arranged between RRS5 and RRS6.
[0127] The nucleic acid and base sequence used to construct the backbone of the donor vector are not limited. Examples of nucleic acids that can serve as the backbone include viral vectors, non-viral vectors, and artificial nucleic acids. The nucleic acid in the backbone can be either circular or linear. Examples of viral vectors include nucleic acids derived from adenoviruses, adeno-associated viruses, retroviruses, vaccinia viruses, poxviruses, lentiviruses, herpesviruses, baculoviruses, or bacteriophages. Examples of non-viral vectors include artificial plasmids and bacterial vectors that have altered the genes of bacteria.
[0128] <Cell> This invention provides a cell that highly expresses a target gene. The cells of this invention are cells in which an exogenous target gene is integrated into the genome.
[0129] The source, size, and base sequence of the target gene are not limited. Examples of target genes include genes encoding at least one of the following groups: enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, viral agents, vaccines, medical proteins, their subunits and their fragments. That is, examples of target proteins may be selected from at least one of the following groups: enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, proteins constituting viral preparations, vaccines, medical proteins, their subunits and their fragments.
[0130] A target gene contains all the sequences required to express a target protein. That is, the target gene contains the coding sequence of the target protein and all the nucleic acids required for transcription and translation of that coding sequence within the cell (e.g., promoters, transcription terminators, polyadenylation sequences). A target gene may contain one copy of the coding sequence of the target protein, or it may contain two or more copies of the coding sequence of the target protein. For example, to express all subunits of a heteropolymer protein, the target gene may contain at least one copy of the coding sequence for each subunit. For example, a target gene may contain at least one copy of the sequence encoding the H chain and the sequence encoding the L chain of an antibody.
[0131] The target gene may further include a sequence encoding at least one of the following groups: nucleic acids constituting viral agents, transcription control nucleic acids, and non-coding RNAs. Examples of non-coding RNAs (ncRNAs) include miRNA (microRNA), shRNA (short hairpin RNA), siRNA (small interfering RNA), snRNA (small nuclear RNA), rRNA (ribosomal RNA), and tRNA (transfer RNA).
[0132] The cells of this invention can be prokaryotic cells or eukaryotic cells. Examples of prokaryotic cells include bacterial cells. Examples of eukaryotic cells include fungi, yeast, insect cells, and mammalian cells. Specific examples of bacterial cells, fungi, yeast, and insect cells are the same as those given in the description of the cell preparation method.
[0133] Examples of mammalian cells include Chinese hamster ovary cells (CHO cells), baby hamster kidney cells (BHK cells), human embryonic kidney cell lines (e.g., HEK293 cells), human retinoblast cell lines (e.g., PER.C6 cells), mouse myeloma cell lines (e.g., NS0 cells and SP2/0 cells), and cell lines derived from these cells.
[0134] Examples of CHO cells include CHO-DG44 cells, CHO-K1 cells, CHO-DXB11 cells, CHOpro3- cells, and cell lines derived from these cells.
[0135] Examples of mammalian cells include cells derived from mammalian cells capable of differentiation. For instance, these are cells obtained by introducing a target gene into pluripotent stem cells (ES cells, iPS cells, etc.) or multipotent stem cells (mesenchymal stem cells, tissue stem cells, adult stem cells, etc.) and causing them to differentiate.
[0136] The cells of the present invention have the following modes (A) to (C).
[0137] (A) The genome has a region G that contains, in order, one each of site 1, site 2, site 3 and site 4, which are sites formed by recombination between recognition sites of a recombinase.
(B) Site 1 and site 4 are homologous in sequence, and site 2 and site 3 are homologous in sequence.
(C) Region G contains a target gene located between site 1 and site 2, and a target gene located between site 3 and site 4.
[0138] In this invention, the homology of the base sequences at sites 1 to 4 refers to the homology of the base sequences read along the 5'→3' direction of the DNA strand facing the target gene adjacent to each site. The reading strand at site 1 and the reading strand at site 4 are different DNA strands, and the reading strand at site 2 and the reading strand at site 3 are different DNA strands.
[0139] The sequence homology between site 1 and site 4 is, for example, over 80%, over 90%, over 95%, or 100%. The sequence homology between site 2 and site 3 is, for example, over 80%, over 90%, over 95%, or 100%.
[0140] The cells of the present invention can be prepared using a recombinase together with a host genome and a donor vector having modes (1) to (4). Modes (A) to (C) of the cells of the present invention are realized by a host genome and a donor vector having modes (1) to (4). The example shown in Figure 1 is a method of preparing region G using a host genome and a donor vector having modes (1) to (4).
[0141] When the cells of the present invention are prepared using a recombinase together with a host genome and a donor vector having modes (1) to (4), site 1 is a site formed through recombination of RRS1 and RRS5, site 2 is a site formed through recombination of RRS2 and RRS6, site 3 is a site formed through recombination of RRS3 and RRS6, and site 4 is a site formed through recombination of RRS4 and RRS5. The number of bases at each of sites 1 to 4 can range from 1 bp to 1000 bp, typically from 10 bp to 300 bp, and more typically from 20 bp to 200 bp.
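The relationship between the RRSs, sites 1 to 4, and the transcription directions of the two inserted target genes can be traced with a small model. The Python sketch below assumes the preferred donor layout, in which the target gene is transcribed from RRS6 towards RRS5, and treats each recombination simply as a pairing that yields a named site. The names `PARTNER`, `SITE_FROM` and `insert_cassette` are hypothetical, and the model ignores sequence-level detail.

```python
# Donor RRS that pairs with each host RRS (same central two bases).
PARTNER = {"RRS1": "RRS5", "RRS4": "RRS5", "RRS2": "RRS6", "RRS3": "RRS6"}

# Site formed by each host/donor recombination, as described in the text.
SITE_FROM = {
    ("RRS1", "RRS5"): "site 1",
    ("RRS2", "RRS6"): "site 2",
    ("RRS3", "RRS6"): "site 3",
    ("RRS4", "RRS5"): "site 4",
}


def insert_cassette(left_rrs, right_rrs):
    """Replace the segment between two host RRSs with the donor target gene.

    Assumes the donor target gene is transcribed from RRS6 towards RRS5,
    so in the product it runs from the RRS6-derived site to the
    RRS5-derived site.
    """
    left_site = SITE_FROM[(left_rrs, PARTNER[left_rrs])]
    right_site = SITE_FROM[(right_rrs, PARTNER[right_rrs])]
    starts_on_left = PARTNER[left_rrs] == "RRS6"
    arrow = "→" if starts_on_left else "←"
    return [left_site, f"target gene {arrow}", right_site]


region_g = insert_cassette("RRS1", "RRS2") + insert_cassette("RRS3", "RRS4")
print(region_g)
```

Under these assumptions the gene between site 1 and site 2 is transcribed towards site 1 and the gene between site 3 and site 4 is transcribed towards site 4, so the two genes point away from each other (←→).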
[0142] One example of sites 1 to 4 is sites formed by recombination between recognition sites of a serine recombinase. Examples of the serine recombinase include one selected from the group consisting of Bxb1, φC31, TP901, A118, SPβc, TG1, φBT1, φRv1, φ370.1, Wβ, Pa01, and Pa03.
[0143] One example of sites 1 to 4 has the following modes (a) to (d).
[0144] (a) The two bases in the central portion of sites 1 and 4 are identical, and the overall base sequence is homologous. The overall homology of the base sequence is, for example, 80% or more, 90% or more, 95% or more, or 100%. (b) The two bases in the central portion of sites 2 and 3 are identical, and the overall base sequence is homologous. The overall homology of the base sequence is, for example, 80% or more, 90% or more, 95% or more, or 100%. (c) One or both of the two bases in the central portion of sites 1 (and 4) and 2 (and 3) are different, and the overall base sequence is homologous. The overall homology of the base sequence is, for example, 80% or more, 90% or more, or 95% or more. Sites 1 (and 4) and 2 (and 3) are sometimes identical sequences except for one or both of the two bases in the central portion. (d) The number of bases at sites 1 to 4 can be in the range of 1bp to 1000bp, usually in the range of 10bp to 300bp, and more usually in the range of 20bp to 200bp.
[0145] In one example of sites 1 to 4, sites 1 and 4 are identical sequences, and sites 2 and 3 are identical sequences. Cells having this mode can be prepared using a host genome and a donor vector having modes (1) to (4) and mode (5).
[0146] Another example of an embodiment of the cell of the present invention is the following manner (D). (D) The transcription direction of the target gene located between site 1 and site 2 is from site 2 to site 1, and the transcription direction of the target gene located between site 3 and site 4 is from site 3 to site 4.
[0147] Mode (D) is a mode in which the transcription directions of the two target genes arranged in region G point away from each other (←→). Compared with a mode in which the transcription directions of two adjacent target genes point toward each other (→←), the mode in which they point away from each other (←→) can be expected to give higher expression of the target genes.
[0148] Cells with mode (D) can be prepared using host genomes and donor vectors with modes (1) to (4) and mode (6).
[0149] The recombination pattern shown in Figure 1 is mode (D). The transcription directions of the two target genes arranged in region G point away from each other (←→).
[0150] One embodiment of the cell of the present invention has, in region G, a selectable marker gene (1) arranged between site 1 and site 2 and a selectable marker gene (2) arranged between site 3 and site 4. Selectable marker genes (1) and (2) express positive selection markers used to select and enrich the cells of the present invention. Cells having this mode can be prepared using a host genome and a donor vector having modes (1) to (4) and mode (8). Specific examples of selectable marker genes (1) and (2) are the same as those given for the third selectable marker gene in the description of the donor vector.
[0151] Region G is a contiguous region. The cells of this invention may have one region G, or two or more regions G, in the genome as a whole.
[0152] In region G, the number of bases between the outer end of site 1 and the outer end of site 4, which is the site farthest from site 1, is, for example, less than 100 kbp, less than 70 kbp, less than 50 kbp, less than 30 kbp, or less than 10 kbp. The number of bases between the outer end of site 1 and the outer end of site 4 is, for example, more than 100 bp, more than 1 kbp, or more than 2 kbp.
[0153] In region G, the number of bases between the outer end of site 2 (the end closest to site 1) and the outer end of site 3 (the end closest to site 4) is preferably 50 bp or more, more preferably 100 bp or more, and even more preferably 200 bp or more. According to this mode, the two target genes present in region G are arranged near each other while separated by an appropriate distance, which can be expected to lead to high expression of the target genes.
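The distance criteria for region G can be checked with a short sketch. It assumes each site is given as a half-open (start, end) coordinate pair along region G and uses the broadest example limits stated above (more than 100 bp and less than 100 kbp for the site 1 to site 4 span, 50 bp or more for the site 2 to site 3 span). The function name and the coordinates are hypothetical.

```python
def region_g_checks(sites):
    """Evaluate the distance criteria for region G.

    `sites` maps "site1".."site4" to (start, end) half-open coordinates.
    The outer span runs from the start of site 1 to the end of site 4.
    The inner span runs from the end of site 2 nearest site 1 (its start)
    to the end of site 3 nearest site 4 (its end).
    """
    outer_span = sites["site4"][1] - sites["site1"][0]
    inner_span = sites["site3"][1] - sites["site2"][0]
    return {
        "outer_span_bp": outer_span,
        "inner_span_bp": inner_span,
        "outer_below_100kbp": outer_span < 100_000,
        "outer_above_100bp": outer_span > 100,
        "inner_at_least_50bp": inner_span >= 50,
    }


# Hypothetical coordinates, for illustration only.
example = {
    "site1": (1000, 1050),
    "site2": (5000, 5050),
    "site3": (5250, 5300),
    "site4": (9250, 9300),
}
for name, value in region_g_checks(example).items():
    print(name, value)
```

Both spans include the lengths of the flanking sites themselves, following the "outer end" wording of the paragraphs above.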
[0154] <Methods for manufacturing proteins> The present invention provides a protein manufacturing method with high productivity. Because the method uses cells that highly express the target gene, the target protein is produced with high productivity.
[0155] In the protein manufacturing method of the present invention, the cells of the present invention are cultured, and the protein encoded by the target gene is expressed. Through cell culture, the target protein is produced within the cells, and the target protein accumulates in the culture medium and / or cells.
[0156] The cell culture method and culture medium composition can be selected based on the cell type. Culture conditions (e.g., culture scale, cell density, temperature, and CO2 concentration) can also be selected based on the cell type.
[0157] One embodiment of the protein manufacturing method of the present invention includes a step of recovering the target protein from a culture medium. Examples of methods for recovering the target protein from the culture medium include centrifugation, filtration, diafiltration, ion exchange chromatography, affinity chromatography, hydrophobic interaction chromatography, gel filtration chromatography, and high-performance liquid chromatography (HPLC). The recovered target protein can be used, for example, in the manufacture of pharmaceutical compositions.
[0158] One embodiment of the protein manufacturing method of the present invention includes a step of recovering cells containing a target protein from a culture medium. Examples of methods for recovering cells from the culture medium include centrifugation and filtration. The target protein accumulates inside or on the surface of the cells depending on its properties. The recovered cells can be used, for example, for drug administration, infusion, or transplantation into mammals.
Examples
[0159] The following specific examples will be used to describe the cell preparation method of the present invention in more detail. The materials, processing steps, etc., shown in the following specific examples can be appropriately modified without departing from the spirit of the present invention. The scope of the cell preparation method of the present invention should not be limited by the specific examples shown below.
[0160] The base sequences of RRS1 to RRS6 in the following examples are as follows. In each of the sequences below, the two bases in the central portion that are involved in recombination between RRSs are shown set apart by spaces.
[0161] RRS1 and RRS4 (SEQ ID NO: 1): 5'-GTCGTGGTTTGTCTGGTCAACCACCGCG GT CTCAGTGGTGTACGGTACAAACCCCGAC-3'
RRS5 (SEQ ID NO: 2): 5'-TCGGCCGGCTTGTCGACGACGGCG GT CTCCGTCGTCAGGATCATCCGGGC-3'
[0162] RRS2 and RRS3 (SEQ ID NO: 3): 5'-GTCGTGGTTTGTCTGGTCAACCACCGCG CT CTCAGTGGTGTACGGTACAAACCCCGAC-3'
RRS6 (SEQ ID NO: 4): 5'-TCGGCCGGCTTGTCGACGACGGCG CT CTCCGTCGTCAGGATCATCCGGGC-3'
[0163] RRS1 and RRS4 are the native attP of Bxb1 recombinase (also known as Bxb1 integrase). RRS5 is the native attB of Bxb1 recombinase. RRS2 and RRS3 are sequences in which the two bases "GT" in the central region of the native attP of Bxb1 recombinase are changed to "CT". RRS6 is a sequence in which the two bases "GT" in the central region of the native attB of Bxb1 recombinase are changed to "CT".
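The relationship between the four sequences and the pairing of recognition sites can be sketched in a few lines. The sketch treats a host site and a donor site as able to recombine only when their central two bases match. That is a simplification assumed here for illustration, and the helper names are hypothetical. The sequences are copied from the paragraphs above with the separating spaces removed.

```python
# attP-type and attB-type sequences from the paragraphs above; the central
# two bases are written with spaces and the spaces are then stripped.
ATTP_GT = "GTCGTGGTTTGTCTGGTCAACCACCGCG GT CTCAGTGGTGTACGGTACAAACCCCGAC".replace(" ", "")
ATTB_GT = "TCGGCCGGCTTGTCGACGACGGCG GT CTCCGTCGTCAGGATCATCCGGGC".replace(" ", "")
ATTP_CT = "GTCGTGGTTTGTCTGGTCAACCACCGCG CT CTCAGTGGTGTACGGTACAAACCCCGAC".replace(" ", "")
ATTB_CT = "TCGGCCGGCTTGTCGACGACGGCG CT CTCCGTCGTCAGGATCATCCGGGC".replace(" ", "")

RRS = {
    "RRS1": ATTP_GT, "RRS4": ATTP_GT,  # host sites with central GT
    "RRS2": ATTP_CT, "RRS3": ATTP_CT,  # host sites with central CT
    "RRS5": ATTB_GT,                   # donor site with central GT
    "RRS6": ATTB_CT,                   # donor site with central CT
}


def central_dinucleotide(site: str) -> str:
    """Return the two bases at the centre of an even-length site."""
    mid = len(site) // 2
    return site[mid - 1 : mid + 1]


def can_recombine(host_site: str, donor_site: str) -> bool:
    """Assumed rule: a host site pairs with a donor site only if the
    central two bases are identical."""
    return central_dinucleotide(RRS[host_site]) == central_dinucleotide(RRS[donor_site])


for host in ("RRS1", "RRS2", "RRS3", "RRS4"):
    partners = [donor for donor in ("RRS5", "RRS6") if can_recombine(host, donor)]
    print(host, central_dinucleotide(RRS[host]), "pairs with", partners)

# The GT and CT variants of attP differ at a single position.
mismatches = sum(a != b for a, b in zip(ATTP_GT, ATTP_CT))
print("attP GT vs CT mismatches:", mismatches)
```

Under this rule RRS1 and RRS4 pair with RRS5 only, and RRS2 and RRS3 pair with RRS6 only, which matches the pairing requirements placed on the host genome and the donor vector.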
[0164] <Construction of vectors for host genome construction> A vector for constructing the host genome was created using a commissioned artificial gene synthesis service. Hereinafter, this vector is referred to as "vector A". Vector A has RRS1 to RRS4, with a first negative selection gene between RRS1 and RRS2 and a second negative selection gene between RRS3 and RRS4. The first and second negative selection genes are thymidine kinase (TK) genes derived from herpes simplex virus. Vector A also contains a replication origin and an ampicillin resistance gene as a selection marker, for amplification in E. coli.
[0165] Figure 2 shows a schematic diagram of vector A. The gene arrangement and transcription directions are as shown in Figure 2. The orientations of RRS1 to RRS4 are shown in Table 1. The total length of vector A is approximately 8 kbp. The number of bases between the outer ends of RRS1 and RRS4 is approximately 4.5 kbp, and the number of bases between the outer end of RRS2 (the end closest to RRS1) and the outer end of RRS3 (the end closest to RRS4) is approximately 300 bp.
[0166] <Construction of donor vector> The following DNA fragments (1) and (2) were synthesized using a commissioned artificial gene synthesis service.
• DNA fragment (1): red fluorescent protein mCherry gene - puromycin resistance gene. The fragment contains all the nucleic acid elements required for gene expression, with a coding sequence for a 2A self-cleaving peptide between the two genes.
• DNA fragment (2): antibody L-chain gene - H-chain gene. Each chain gene has all the nucleic acid elements required for gene expression.
[0167] A vector in which DNA fragment (1) and DNA fragment (2) were ligated was constructed using the In-Fusion HD Cloning Kit (Takara Bio Inc., product code 639648). DNA fragment (2) was then further ligated into this vector to obtain the following DNA fragment (3).
• DNA fragment (3): red fluorescent protein mCherry gene - puromycin resistance gene - L-chain gene - H-chain gene - L-chain gene - H-chain gene
[0168] A DNA fragment with RRS5 appended to one end and RRS6 appended to the other end was prepared by PCR, and the DNA fragment was then linked to a backbone vector to create a donor vector. Hereinafter, this vector will be referred to as "donor vector B".
[0169] Donor vector B contains an antibody gene (L-chain gene - H-chain gene - L-chain gene - H-chain gene) between RRS5 and RRS6 as the target gene. Donor vector B also contains the red fluorescent protein mCherry gene - puromycin resistance gene between RRS5 and RRS6 as a selection marker gene. In addition, donor vector B contains a replication origin and an ampicillin resistance gene as a selection marker, for amplification in E. coli. Hereinafter, the gene cluster located between RRS5 and RRS6 is referred to as "GoI-MG".
[0170] Figure 3 shows a schematic diagram of donor vector B. The gene arrangement and transcription directions are as shown in Figure 3. The orientations of RRS5 and RRS6 are shown in Table 1.
[0171] <Construction of Bxb1 expression vector> An expression vector for the Bxb1 recombinase was created using a commissioned gene synthesis service. Hereinafter, this expression vector is referred to as "vector C". Vector C contains the Bxb1 gene, together with a replication origin and an ampicillin resistance gene as a selection marker, for amplification in E. coli. The Bxb1 gene is codon-optimized for expression in mammalian cells and has an SV40-derived nuclear localization signal sequence appended to its 5' side.
[0172] Figure 4 shows a schematic diagram of vector C. The gene arrangement and transcription directions are as shown in Figure 4.
[0173] <Cell Culture> CHO-DG44 cells were used as the host cells. For maintenance passaging of CHO-DG44 cells, a liquid medium consisting of serum-free basal medium (Thermo Fisher Scientific KK, CD OptiCHO Medium) supplemented with hypoxanthine/thymidine (Thermo Fisher Scientific KK, HT Supplement (100X)) was used. For single-cell cloning experiments, a liquid medium consisting of IMDM basal medium supplemented with 10% (v/v) fetal bovine serum was used.
[0174] <Establishment of Host Cells> Vector A was introduced into CHO-DG44 cells via electroporation. This treatment was performed using a 4D-Nucleofector device and the SF Cell Line 4D-Nucleofector X Kit L (Lonza; "Nucleofector" is a registered trademark). The amount of vector A used in the treatment was 11 μg.
[0175] After introducing vector A, the cells were maintained by passaging in subculture medium. On day 6 of culture, cells were seeded at one cell per well in a 96-well plate for monoclonalization. Genomic DNA was extracted from the 24 clones established, and one clone with a single copy of vector A inserted into the genome was identified using a digital PCR system (Bio-Rad Laboratories, ddPCR Supermix for Probes (No dUTP) #1863024). Hereinafter, this clone is referred to as "CHO-159B3 cell".
[0176] Region R, which is the region containing vector A within the genome of CHO-159B3 cells, was amplified by PCR and then subjected to Sanger sequencing analysis (using commissioned analysis services from Fasmac Co., Ltd.). Based on the sequence analysis results, RRS1–RRS4, the first TK gene, and the second TK gene were confirmed to exist in region R as designed. That is, the arrangement of RRS1–RRS4, the first TK gene, and the second TK gene in region R is as shown in the schematic diagram of Figure 1, and the orientations of RRS1 to RRS4 in region R are as shown in Table 1. The number of bases between the outer ends of RRS1 and RRS4 is approximately 4.5 kbp, and the number of bases between the outer end of RRS2 (the end closest to RRS1) and the outer end of RRS3 (the end closest to RRS4) is approximately 300 bp.
[0177] The copy numbers of the first and second TK genes (relative to the copy number of the Txnip gene) in the genome of CHO-159B3 cells were measured, and both the first and second TK genes had approximately one copy.
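A relative copy number of this kind is commonly derived from digital PCR as a ratio of measured concentrations. The sketch below assumes that calculation; this specification does not give the formula or the raw values. The function name, the concentrations, and the reference copy number of 2 per genome are hypothetical placeholders.

```python
def copies_per_genome(target_conc: float, reference_conc: float,
                      reference_copies: float) -> float:
    """Estimate target copies per genome from digital PCR concentrations.

    target_conc and reference_conc are measured concentrations in the same
    units (for example copies per microlitre). reference_copies is the
    assumed copy number of the reference gene per genome.
    """
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    return reference_copies * target_conc / reference_conc


# Hypothetical readings, for illustration only.
tk1 = copies_per_genome(target_conc=510.0, reference_conc=1000.0, reference_copies=2)
tk2 = copies_per_genome(target_conc=495.0, reference_conc=1000.0, reference_copies=2)
print(f"first TK gene:  {tk1:.2f} copies per genome")
print(f"second TK gene: {tk2:.2f} copies per genome")
```

With these placeholder readings both estimates fall close to one copy, which is the kind of result the paragraph above describes.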
[0178] <Integration of the target gene into the host genome> Donor vector B and vector C were introduced into CHO-159B3 cells via electroporation. This treatment was performed using a 4D-Nucleofector device and the SF Cell Line 4D-Nucleofector X Kit L (Lonza). The amount of donor vector B used was 12 μg, and the amount of vector C used was 6 μg.
[0179] After introducing donor vector B and vector C, the cells were maintained by passaging in subculture medium to allow expression of the Bxb1 recombinase and the recombination reaction to proceed. On day 11 of culture, cells were seeded at 30 cells per well in 96-well plates and subjected to drug selection with ganciclovir and puromycin, as well as visual selection based on red fluorescence.
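The logic of the combined selection can be sketched as a simple predicate. It assumes an all-or-nothing model in which any remaining HSV-TK gene makes a cell sensitive to ganciclovir, and in which carrying GoI-MG confers both puromycin resistance and red fluorescence. Real selection depends on expression level and drug dose. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    tk_genes_remaining: int  # HSV-TK genes still present in region R (0 to 2)
    has_goi_mg: bool         # carries the mCherry - puromycin resistance cassette


def passes_selection(clone: Clone) -> bool:
    """Return True if the clone would survive all three selection criteria."""
    ganciclovir_ok = clone.tk_genes_remaining == 0  # TK makes ganciclovir toxic
    puromycin_ok = clone.has_goi_mg                 # resistance gene in GoI-MG
    red_ok = clone.has_goi_mg                       # mCherry in GoI-MG
    return ganciclovir_ok and puromycin_ok and red_ok


candidates = {
    "both cassettes exchanged": Clone(tk_genes_remaining=0, has_goi_mg=True),
    "one cassette exchanged": Clone(tk_genes_remaining=1, has_goi_mg=True),
    "random integration only": Clone(tk_genes_remaining=2, has_goi_mg=True),
    "no integration": Clone(tk_genes_remaining=2, has_goi_mg=False),
}
for label, clone in candidates.items():
    print(label, passes_selection(clone))
```

In this model only clones in which both TK cassettes were replaced pass, which is why the negative and positive selections are applied together.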
[0180] Genomic DNA was extracted from the 21 clones established, and the region corresponding to region R was amplified by PCR and then subjected to Sanger sequencing analysis. Clones containing sites formed by recombination via Bxb1 recombinase at the RRS1 and RRS4 positions in region R were selected (first screening).
[0181] The clones selected in the first screening were subjected to sequence analysis over the full length of region G, which is formed by recombination of region R with donor vector B. One clone was selected that contains sites 1 to 4 and GoI-MG as designed. Sequence analysis was performed using the MinION Mk1C long-read sequencer (Oxford Nanopore Technologies).
[0182] The selected clone has a region G containing sites 1 to 4, one GoI-MG located between sites 1 and 2, and another GoI-MG located between sites 3 and 4. The transcription directions of the two GoI-MGs point away from each other (←→). The base sequence of region G has more than 99% homology to the designed base sequence.
[0183] Figure 5 shows a schematic diagram of region G of the selected clone. The gene arrangement and transcription directions are as shown in Figure 5. Of the two GoI-MGs, Figure 5 shows in detail the GoI-MG located between site 1 and site 2. The GoI-MG located between site 3 and site 4 contains the same gene cluster as the GoI-MG located between site 1 and site 2, but in the opposite orientation.
[0184] All documents, patent applications, and technical standards described in this specification are incorporated herein by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually indicated to be incorporated by reference.
[0185] The disclosure of Japanese Patent Application No. 2023-202229, filed on November 29, 2023, is incorporated herein by reference in its entirety.
Claims
1. A method for preparing cells, wherein the method uses a recombinase and a donor vector to integrate a target gene into the genome of a host cell to prepare cells, the method comprising the following steps:
introducing the donor vector having the target gene into the host cell;
allowing the recombinase to react in the host cell into which the donor vector has been introduced; and
selecting cells expressing the target gene from the host cells after the recombinase reaction,
wherein the genome of the host cell and the donor vector satisfy the following (1) to (4):
(1) the genome of the host cell has a region R that sequentially contains one RRS1, one RRS2, one RRS3, and one RRS4 as recognition sites for the recombinase;
(2) the donor vector has RRS5 and RRS6 as recognition sites for the recombinase, and the target gene disposed between RRS5 and RRS6;
(3) RRS1 and RRS4 can recombine with RRS5 but cannot recombine with RRS6; and
(4) RRS2 and RRS3 can recombine with RRS6 but cannot recombine with RRS5.
2. The method for preparing cells according to claim 1, wherein the genome of the host cell further satisfies the following (5): (5) RRS1 and RRS4 are the same sequence, and RRS2 and RRS3 are the same sequence.
3. The method for preparing cells according to claim 1, wherein the donor vector further satisfies the following (6): (6) the transcription direction of the target gene located between RRS5 and RRS6 is from RRS6 toward RRS5.
4. The method for preparing cells according to claim 1, wherein the genome of the host cell further satisfies the following (7): (7) the region R contains a first selectable marker gene configured between RRS1 and RRS2, and a second selectable marker gene configured between RRS3 and RRS4.
5. The method for preparing cells according to claim 4, wherein the donor vector further satisfies the following (8): (8) the donor vector has a third selectable marker gene configured between RRS5 and RRS6.
6. The method for preparing cells according to claim 1, further comprising the following step: introducing an expression vector for the recombinase into the host cell.
7. The method for preparing cells according to claim 1, wherein the recombinase is a serine recombinase.
8. The method for preparing cells according to claim 1, wherein the host cell is a mammalian cell.
9. The method for preparing cells according to claim 1, wherein the host cell is a CHO cell.
10. The method for preparing cells according to any one of claims 1 to 9, wherein the target gene is a gene encoding at least one selected from the group consisting of enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, viral agents, vaccines, medical proteins, subunits thereof, and fragments thereof.
11. A cell in which a target gene is integrated into the genome, the cell satisfying the following (A) to (C):
(A) the genome has a region G that sequentially contains one each of site 1, site 2, site 3, and site 4, which are sites formed by recombination between recombination recognition sites of a recombinase;
(B) site 1 and site 4 are homologous in sequence, and site 2 and site 3 are homologous in sequence; and
(C) the region G contains the target gene configured between site 1 and site 2, and the target gene configured between site 3 and site 4.
12. The cell according to claim 11, further satisfying the following (D): (D) the transcription direction of the target gene configured between site 1 and site 2 is from site 2 toward site 1, and the transcription direction of the target gene configured between site 3 and site 4 is from site 3 toward site 4.
13. The cell according to claim 11, wherein the recombinase is a serine recombinase.
14. The cell according to claim 11, wherein the cell is a mammalian cell.
15. The cell according to claim 11, wherein the cell is a CHO cell.
16. The cell according to claim 11, wherein the target gene is a gene encoding at least one selected from the group consisting of enzymes, antibodies, interleukins, cytokines, chemokines, hormones, growth factors, transcription factors, receptors, viral agents, vaccines, medical proteins, subunits thereof, and fragments thereof.
17. A method for manufacturing a protein, the method comprising the following step: culturing the cell according to any one of claims 11 to 16 to express the protein encoded by the target gene.