Identification and production of antigen-specific antibodies

The combination of mass spectrometry and next-generation sequencing addresses the inefficiencies of existing antibody production methods by enabling efficient identification and production of antigen-specific antibodies through sample concentration and sequence alignment, enhancing purification and specificity.

JP7883483B2 (Active), Publication Date: 2026-07-01, REGENERON PHARMACEUTICALS INC

Patent Information

Authority / Receiving Office: JP
Patent Type: Patents
Current Assignee / Owner: REGENERON PHARMACEUTICALS INC
Filing Date: 2021-09-10
Publication Date: 2026-07-01

AI Technical Summary

Technical Problem

Existing methods for producing monoclonal antibodies, such as hybridoma technology and DNA display, are inefficient and require high-quality purification due to limited throughput and impurities in antibody sources, making it difficult to achieve efficient production and isolation of antibodies with the necessary specificity and binding affinity to target antigens.

Method used

A method combining mass spectrometry (MS) and next-generation sequencing (NGS) is used to identify and select human immunoglobulin variable domain sequences and antibody complementarity-determining region (CDR) sequences from a host immunized with an antigen, involving concentration of antibody samples in vivo and ex vivo, and alignment of peptide sequences to obtain antigen-specific antibodies.

Benefits of technology

This approach enables efficient identification and production of high-quality antigen-specific antibodies by concentrating and aligning peptide sequences, overcoming the limitations of existing methods and improving antibody isolation and purification efficiency.

✦ Generated by Eureka AI based on patent content.


Abstract

Methods for obtaining nucleotide sequences encoding immunoglobulin variable domains of antibodies specific for a particular antigen from immunized genetically modified non-human mammals are disclosed. Also disclosed are methods for producing antibodies against a particular antigen.

Description

[Technical Field]

[0001] Cross-Reference to Related Applications: This application claims the benefit of U.S. Provisional Patent Application No. 63/077133 and U.S. Provisional Patent Application No. 63/077140, both filed on September 11, 2020, the contents of both of which are incorporated herein by reference in their entirety.

[0002] Methods are provided for obtaining nucleic acids encoding antigen-specific antibody amino acid sequences, such as variable domain amino acid sequences. Disclosed is a method comprising obtaining, from an immunized host, nucleic acid sequences encoding antibody sequences from a first sample and a plurality of antibodies targeting a target antigen from a second sample, to obtain nucleotide sequences encoding human immunoglobulin variable domains specific to the antigen or a portion thereof. A method for producing antibodies targeting a target antigen is also disclosed.

[Background Art]

[0003] Antibodies typically contain heavy chain components, with each heavy chain monomer associated with a light chain. The variable domains of these chains combine to form an antigen-binding site. Antibodies, especially monoclonal antibodies, have a wide range of applications in diagnosis and therapy.

[0004] Two conventional approaches have been used to prepare monoclonal antibodies: hybridoma technology and DNA display (e.g., in phage, yeast, or bacterial systems). Hybridoma technology typically involves fusing B cells from immunized animals with myeloma cell lines to produce hybridoma lines that secrete the antibody. The cells producing the desired monoclonal antibody are isolated, cultured and grown, and the resulting antibody is purified. High-quality purification is crucial for removing impurities. Moreover, antibody isolation using hybridoma technology is inefficient because the throughput of hybridoma culture is limited.

[0005] Display techniques involve the production of lead antibody candidates from phage, yeast, or mammalian libraries. DNA may be isolated directly from antibody-expressing B cells, but the resulting DNA libraries are expressed in cell expression systems such as phage, yeast, or bacterial lines and then subjected to "panning" or titration to select high-affinity antibodies. Display techniques can yield high-quality protein libraries, although the resulting diversity is limited. Consequently, affinity maturation based on in vitro mutagenesis is often the next step in generating high-affinity antibodies from such libraries.

[0006] Furthermore, antibodies are often expressed in and isolated from plasma, serum, ascites fluid, cell culture media, and bacterial cultures. All of these sources contain numerous impurities. Therefore, efficient purification of antibodies from such sources is required. Consequently, the efficient production and isolation of antibodies with the necessary specificity and binding affinity to target antigens remains a need in this field.

[Summary of the Invention]

[Means for Solving the Problem]

[0007] This disclosure describes, in particular, a method for obtaining antibodies using a combination of mass spectrometry ("MS") and next-generation sequencing ("NGS"). Methods for antibody production are also disclosed.

[0008] The provided method enables efficient identification and / or selection of human immunoglobulin variable domain sequences and / or antibody complementarity-determining region (CDR) sequences, particularly of antibodies from a host immunized with the antigen of interest (e.g., genetically modified non-human animals, e.g., rodents). In some embodiments, the provided method includes the step of comparing and / or matching multiple host antibody sequences (e.g., a library of antibody sequences generated by NGS) with the results of MS analysis of antibody peptides from the host. As used herein, a "database" may be an exemplary "library."

[0009] In some embodiments, the provided method includes obtaining and / or producing a plurality of immunoglobulin variable domains and / or CDR sequences (e.g., a library) from a host immunized with the antigen of interest (e.g., from a non-human animal, e.g., rodent B cells). In some embodiments, the antibody sequence library includes a plurality of nucleic acid sequences obtained by NGS. In some embodiments, the antibody sequence library includes a plurality of CDR3 sequences.
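
As an illustration of how a CDR3 library might be assembled from translated variable domain sequences, the following is a minimal sketch, not the method of the disclosure. It assumes a common heuristic for heavy chains (the CDR3 begins at a conserved cysteine and ends just before a Trp-Gly-X-Gly motif); the function names and the example sequences are hypothetical.

```python
import re

# Heuristic (an assumption, not taken from the disclosure): a heavy chain CDR3
# starts at a conserved Cys and ends immediately before a Trp-Gly-X-Gly motif.
CDR3_PATTERN = re.compile(r"(C[A-Z]{3,30}?)(?=WG[A-Z]G)")


def extract_cdr3(vdomain):
    """Return the CDR3-like region of a heavy chain variable domain, or None."""
    match = CDR3_PATTERN.search(vdomain)
    return match.group(1) if match else None


def build_cdr3_library(vdomains):
    """Group sequence IDs by their extracted CDR3 sequence."""
    library = {}
    for seq_id, seq in vdomains.items():
        cdr3 = extract_cdr3(seq)
        if cdr3 is not None:
            library.setdefault(cdr3, []).append(seq_id)
    return library


# Hypothetical translated NGS reads (C-terminal part of the variable domain).
example_vdomains = {
    "read_1": "AEDTAVYYCARDRGYSSGWYFDVWGQGTLVTVSS",
    "read_2": "AEDTAVYYCARDRGYSSGWYFDVWGQGTLVTVSS",
    "read_3": "AEDTAVYYCAKGGYDYWGQGTLVTVSS",
}
cdr3_library = build_cdr3_library(example_vdomains)
```

Grouping reads by CDR3 collapses identical clonotypes into one library entry, which keeps the later peptide matching step small. Whether the flanking Cys and Trp are counted as part of the CDR3 depends on the numbering convention in use.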

[0010] In some embodiments, the methods provided involve MS analysis of an antibody sample obtained from a host (e.g., a rodent) immunized with the antigen of interest. The disclosure encompasses the recognition that an antibody sample for MS analysis may be enriched in vivo and / or ex vivo for desired features. For example, an antibody sample may be enriched based on in vivo localization. Thus, in some embodiments, an antibody sample may be obtained from any desired source within a host, e.g., serum, plasma, lymphoid organs, intestine, cerebrospinal fluid, brain, spinal cord, placenta, or a combination thereof. In some embodiments, an antibody sample may be enriched ex vivo for one or more desired features (e.g., antigen binding, cell binding, etc.). The disclosure provides the insight that such enrichment, combined with the methods provided, enables the identification of antibodies that may be difficult to identify by other methods (e.g., due to being present at low titers). In some embodiments, the disclosure provides a method for obtaining a human immunoglobulin variable domain or complementarity-determining region (CDR) of an antigen-specific antibody. In some embodiments, the method described herein includes matching the amino acid sequences of multiple human immunoglobulin variable domains from a first sample with the peptide sequences of heavy chain and / or light chain variable domains of an antibody population from a second sample. In some cases, performing the matching step yields the human immunoglobulin variable domain or CDR sequence of an antigen-specific antibody. In some embodiments, the matching includes aligning the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population with each other and with the amino acid sequences of the multiple immunoglobulin variable domains.

[0011] In some embodiments, the human immunoglobulin variable domain or CDR (e.g., CDR3) of an antigen-specific antibody is obtained from a host immunized with the specific antigen. In some embodiments, the host is a genetically modified non-human mammal. In some embodiments, the host has in its genome, e.g., its germline genome, an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments (also referred to as human VH gene segments), one or more human D gene segments (also referred to as human DH gene segments), and one or more human heavy chain J gene segments (also referred to as human JH gene segments). In some embodiments, the heavy chain variable region is operably linked to a constant region (e.g., an immunoglobulin heavy chain constant region).

[0012] In some embodiments, the host has in its genome, for example, its germline genome, an immunoglobulin light chain variable region comprising one or more human light chain V gene segments (also referred to as human VL gene segments) and one or more human light chain J gene segments (also referred to as human JL gene segments). In some embodiments, the light chain variable region is operably linked to a constant region (e.g., an immunoglobulin light chain constant region).

[0013] In some embodiments, the method described herein includes obtaining a plurality of nucleic acids encoding a plurality of human immunoglobulin variable domains from a first sample from an immunized host and determining the amino acid sequences of the encoded plurality of immunoglobulin variable domains. In some embodiments, the method described herein includes obtaining a second sample from an immunized host containing an antibody population targeting an antigen and determining the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population therefrom.

[0014] In some embodiments, the host is a rodent, such as a rat or mouse.

[0015] In some embodiments, the present disclosure provides a method for identifying a human immunoglobulin heavy chain variable domain or CDR sequence (e.g., a CDR3 sequence) of an antigen-specific antibody, comprising: (i) obtaining or determining a plurality of peptide sequences of human immunoglobulin heavy chain and / or light chain variable domains obtained from a sample comprising an antibody population produced by rodents immunized with the antigen; and (ii) matching a library of human immunoglobulin heavy chain and / or light chain variable domain sequences with a plurality of peptide sequences determined by MS (wherein this library comprises a plurality of human immunoglobulin heavy chain and / or light chain variable domain sequences encoded by B cells of immunized rodents), thereby obtaining a human immunoglobulin variable domain or CDR sequence of an antigen-specific antibody.
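
The matching in step (ii) can be pictured with a minimal sketch. It assumes, purely for illustration, that matching means exact substring search of each MS-derived peptide within each library sequence; the disclosure does not specify the matching algorithm, and the function name and toy sequences below are hypothetical.

```python
from collections import defaultdict


def match_peptides_to_library(library, peptides):
    """Map each library sequence ID to the MS peptides found within it.

    Matching here is exact substring search, which is an assumption made
    for this sketch rather than the algorithm of the disclosure.
    """
    hits = defaultdict(list)
    for seq_id, vdomain in library.items():
        for peptide in peptides:
            if peptide and peptide in vdomain:
                hits[seq_id].append(peptide)
    return dict(hits)


# Hypothetical NGS-derived library (translated variable domain fragments).
library = {
    "clone_A": "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGK",
    "clone_B": "QVQLQESGPGLVKPSETLSLTCTVSGGSISSYYWSWIRQPPGK",
}
# Hypothetical peptide sequences determined by MS.
peptides = ["LSCAASGFTFSSYAMSWVR", "QAPGK", "AAAAAAA"]

matches = match_peptides_to_library(library, peptides)
```

Library entries supported by one or more peptides, here only `clone_A`, are candidates for antigen-specific variable domain sequences. Entries with no peptide support are simply absent from the result.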

[0016] In some embodiments, the Disclosure provides a method for identifying human immunoglobulin variable domains or CDR sequences (e.g., CDR3 sequences) of antigen-specific antibodies, comprising: (i) obtaining a library of human immunoglobulin heavy and / or light chain variable domain sequences comprising a plurality of human immunoglobulin heavy and / or light chain variable domain sequences encoded by B cells of rodents immunized with the antigen; and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin heavy and / or light chain variable domains obtained from a sample comprising an antibody population produced by rodents immunized with the antigen.

[0017] In some embodiments, the immunized rodent has in its germline genome an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a constant region.

[0018] In some embodiments, the immunized rodent has a limited immunoglobulin light chain repertoire in its germline genome. In some embodiments, the immunized rodent has a single rearranged human light chain V / J in its germline genome. In some embodiments, the immunized rodent has two human light chain V gene segments and one or more human light chain J segments in its germline genome.

[0019] In some embodiments, immunized rodents produce antibodies comprising two immunoglobulin heavy chains and two immunoglobulin light chains. In some embodiments, immunized rodents do not produce single-domain antibodies, heavy chain-only antibodies, and / or nanobodies. In some embodiments, immunized rodents have a limited immunoglobulin heavy chain repertoire in their germline genome, e.g., universal heavy chains.

[0020] In some embodiments, the immunized rodents have a CH1 deletion modification in their germline genome. In some embodiments, the immunized rodents produce single-domain antibodies, heavy-chain-only antibodies, and / or nanobodies.

[0021] In some embodiments, the first sample (i.e., the sample for sequence analysis) includes a population of B cells from primary or secondary lymphoid organs, such as B cells from bone marrow and / or spleen samples, B cells from lymph nodes, B cells from Peyer's patches, etc. In some embodiments, obtaining multiple nucleic acid sequences encoding multiple immunoglobulin variable domains from the first sample includes preparing cDNA from the nucleic acid sequences and sequencing the rearranged heavy-chain VDJ sequences and / or rearranged light-chain VJ sequences in the first sample. In some embodiments, obtaining multiple nucleic acid sequences encoding multiple immunoglobulin variable domains from the first sample includes using DNA sequencing techniques such as next-generation DNA sequencing.
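
Determining the encoded amino acid sequences from sequenced cDNA can be sketched as follows. This is a minimal illustration that assumes each read is already in the correct reading frame and uses the standard genetic code; the function names and example reads are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def build_library(reads):
    """Count distinct translated sequences, skipping reads with stops or unknowns."""
    library = {}
    for read in reads:
        protein = translate(read)
        if "*" in protein or "X" in protein:
            continue  # assumed non-productive or low-quality read
        library[protein] = library.get(protein, 0) + 1
    return library


# Hypothetical in-frame reads; the third contains a stop codon.
reads = ["GAGGTGCAGCTG", "gaggtgcagctg", "TGTGCGAGATAA"]
ngs_library = build_library(reads)
```

Dropping reads that contain stop codons is one simple way to keep only productive rearrangements in the library. Real pipelines would also need to determine the reading frame and handle sequencing errors, which this sketch does not attempt.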

[0022] In some embodiments, the second sample (i.e., the sample for peptide sequence analysis) is, or includes, any body fluid containing antibodies. In some embodiments, the second sample is, or includes, serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, placenta, or a combination thereof. In some embodiments, the peptide sequence of the second sample is obtained by mass spectrometry (MS) analysis (e.g., by a combination of liquid chromatography and mass spectrometry (LC-MS)) of the heavy chain and / or light chain variable domains of the antibody population in the second sample. Furthermore, in some embodiments, proteolytic digestion of the heavy chain and / or light chain variable domains of the antibody population may be performed prior to mass spectrometry analysis.
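
To show which peptides a proteolytic digestion step could be expected to produce, the following minimal sketch performs an in silico digest. It assumes trypsin as the protease (the disclosure does not name one) and the commonly used rule of cleaving after K or R unless the next residue is P; the function name and example sequence are hypothetical.

```python
def tryptic_digest(protein, min_len=1):
    """Split a protein after K or R, except when the next residue is P.

    Trypsin is an assumed protease for this sketch; missed cleavages
    are not modelled.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return [p for p in peptides if len(p) >= min_len]


# Hypothetical heavy chain variable domain fragment.
fragment = "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGK"
digest = tryptic_digest(fragment)
```

Digesting the library sequences in the same way gives the set of peptides that MS could in principle observe for each candidate antibody, which can then be compared with the peptides actually detected.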

[0023] In some embodiments, a sample of antibodies for peptide sequence analysis may be enriched ex vivo for one or more desired features (e.g., before MS analysis). In some embodiments, obtaining a second sample further includes depleting the second sample of antibodies that do not target a specific antigen. In some embodiments, obtaining a second sample further includes enriching the second sample for antibodies that target a specific antigen.

[0024] In some embodiments, matching the amino acid sequences of multiple immunoglobulin variable domains from a first sample with the peptide sequences of heavy chain and / or light chain variable domains of an antibody population from a second sample includes aligning the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population with the amino acid sequences of multiple immunoglobulin variable domains, and optionally with each other.
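
One simple way to picture aligning peptides with a library sequence and with each other is to compute how much of a candidate variable domain is covered by the observed peptides, with overlapping peptides merged. The sketch below assumes exact matching, and the coverage fraction is an illustrative metric of our own choosing rather than one specified in the disclosure.

```python
def sequence_coverage(vdomain, peptides):
    """Return the fraction of residues in vdomain covered by any peptide.

    Overlapping peptides are merged implicitly because each residue
    is counted at most once.
    """
    if not vdomain:
        return 0.0
    covered = [False] * len(vdomain)
    for peptide in peptides:
        if not peptide:
            continue
        start = vdomain.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = vdomain.find(peptide, start + 1)
    return sum(covered) / len(vdomain)


# Hypothetical ten-residue sequence and two overlapping peptides.
coverage = sequence_coverage("ACDEFGHIKL", ["ACDE", "DEFG"])
```

In this toy case the two peptides overlap at "DE" and together cover six of ten residues. Ranking library sequences by such a score is one possible way to prioritise candidates, with peptides spanning the CDR3 being the most informative.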

[0025] In some embodiments, the method described herein includes expressing the obtained nucleotide sequence encoding a human immunoglobulin variable domain in a second recombinant antibody. In some embodiments, the nucleotide sequence encoding the human variable domain may be operably linked to a human immunoglobulin constant region and expressed in a cell line. More specifically, in some embodiments, the human variable domain is a human heavy chain variable domain that is operably linked to a human immunoglobulin heavy chain constant region and expressed to generate a human immunoglobulin heavy chain. In some embodiments, the human immunoglobulin heavy chain is expressed in a cell line together with a human immunoglobulin light chain. In one embodiment, the human variable domain is a human light chain variable domain, which may be operably linked to a human immunoglobulin light chain constant region and expressed to generate a human immunoglobulin light chain. In some embodiments, the human immunoglobulin light chain is expressed in a cell line together with a human immunoglobulin heavy chain.

[0026] In some embodiments, the method described herein further comprises expressing the obtained nucleotide sequence encoding a human immunoglobulin variable domain in a recombinant antigen-binding protein.

[0027] In some embodiments, the recombinant antigen-binding protein is a human antibody, such as a human bispecific antibody.

[0028] In some embodiments, the recombinant antigen-binding protein is purified. In some embodiments, the affinity and / or specificity of the purified recombinant antigen-binding protein for a specific antigen is determined.

[0029] In some embodiments, the host is a genetically modified mouse whose genome (e.g., its germline genome) comprises an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to the murine constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain variable region is operably linked to the murine constant region. In some embodiments, the immunoglobulin heavy chain variable region is operably linked to the mouse heavy chain constant region, and / or the immunoglobulin light chain variable region is operably linked to the mouse light chain constant region. Furthermore, the immunoglobulin heavy chain variable region may be operably linked to the mouse heavy chain constant region located at the endogenous mouse heavy chain locus, and / or the immunoglobulin light chain variable region may be operably linked to the mouse light chain constant region located at the endogenous mouse light chain locus.

[0030] In some embodiments, the host is a genetically modified mouse whose genome (including its germline genome) comprises an immunoglobulin heavy chain variable region comprising a plurality of human heavy chain V gene segments, a plurality of human D gene segments, and a plurality of human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to the murine heavy chain constant region, and an immunoglobulin light chain variable region operably linked to the murine light chain constant region, comprising exactly two unrearranged human Vκ gene segments and five unrearranged human Jκ gene segments. In some embodiments, the exactly two unrearranged human Vκ gene segments are the human Vκ1-39 gene segment and the human Vκ3-20 gene segment.

[0031] In some embodiments, the host can be a genetically modified mouse, the genome of which (e.g., the germline genome) has, at the endogenous heavy chain locus:
(i) an immunoglobulin heavy chain variable region comprising a plurality of unrearranged human VH gene segments, a plurality of unrearranged human DH gene segments, and a plurality of unrearranged human JH gene segments, operably linked to a mouse heavy chain constant region;
(ii) a restricted unrearranged heavy chain variable region comprising a single human VH gene segment, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, operably linked to a mouse heavy chain constant region;
(iii) a universal heavy chain coding sequence comprising a single rearranged human heavy chain variable region operably linked to a mouse heavy chain constant region;
(iv) a histidine-modified unrearranged heavy chain variable region comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, and further comprising a substitution or insertion of at least one histidine for a non-histidine residue;
(v) a sequence encoding a heavy chain-only immunoglobulin, comprising an immunoglobulin heavy chain variable region comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, operably linked to a heavy chain constant region in which a non-IgM gene (e.g., an IgG gene) lacks a sequence encoding a functional CH1 domain; or
(vi) an engineered endogenous rodent immunoglobulin heavy chain locus comprising one or more unrearranged human VL gene segments and one or more unrearranged human JL gene segments.

In some embodiments, the host may be a genetically modified mouse whose genome (e.g., germline genome) includes, at an endogenous light chain locus:
(i) an immunoglobulin light chain variable region operably linked to the mouse light chain constant region and comprising multiple unrearranged human Vκ gene segments and multiple unrearranged human Jκ gene segments;
(ii) a universal light chain coding sequence comprising a single rearranged human light chain variable region operably linked to the mouse light chain constant region;
(iii) a restricted light chain variable region operably linked to the mouse light chain constant region and comprising two unrearranged human Vκ gene segments and one or more unrearranged human Jκ gene segments; or
(iv) a histidine-modified light chain variable region operably linked to the mouse light chain constant region and comprising one or more human light chain V gene segments and one or more human light chain J gene segments, further comprising a substitution or insertion of at least one histidine for a non-histidine residue.

[0032] In some embodiments, the host comprises a functional ADAM6 gene, optionally, the host is a genetically modified mouse, and the functional ADAM6 gene is the mouse ADAM6 gene. In some embodiments, the host may comprise and / or express an exogenous terminal deoxynucleotidyltransferase (TdT) gene.

[0033] The Disclosure also provides a method for obtaining an immunoglobulin variable domain or CDR of an antigen-specific antibody, comprising matching the peptide sequences of the heavy chain and / or light chain variable domains of an antibody population from a sample obtained from a host immunized with the antigen against a library of amino acid sequences containing multiple human immunoglobulin variable domains, thereby obtaining a human immunoglobulin variable domain or CDR sequence of an antibody specific to the antigen. In some embodiments, the method comprises obtaining a sample containing an antibody population targeting an antigen from a host immunized with the antigen. In some embodiments, the method comprises determining the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population.

[0034] The Disclosure also provides a method for identifying a human immunoglobulin variable domain or CDR of an antibody specific to a particular antigen, comprising: comparing a plurality of amino acid sequences encoded by a plurality of nucleic acid sequences encoding a plurality of human immunoglobulin variable domains produced by an animal immunized with the antigen with an amino acid sequence comprising peptide fragments from light chain and / or heavy chain variable domains produced from an antibody population targeting the antigen; and thereby identifying a human immunoglobulin variable domain or CDR sequence of an antibody specific to the antigen.

[0035] In some embodiments, the immunized host is a genetically modified non-human mammal whose germline genome includes an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the immunoglobulin light chain variable region is operably linked to a constant region.

[0036] In some embodiments, the Disclosure also provides a method for obtaining human immunoglobulin heavy chain variable domains or CDRs of antibodies specific to a particular antigen from a host immunized with that antigen, comprising: obtaining amino acid sequences of a plurality of human immunoglobulin variable domains encoded by a plurality of nucleic acid sequences obtained from the host; determining the peptide sequences of human heavy chain variable domains of an antibody population obtained from the immunized host; and matching the amino acid sequences of the encoded plurality of human immunoglobulin heavy chain variable domains with the peptide sequences of human heavy chain variable domains of the antibody population, thereby obtaining human immunoglobulin heavy chain variable domains or CDRs of antibodies specific to that antigen. In some embodiments, the host is a genetically modified mouse whose genome (including its germline genome) comprises an immunoglobulin heavy chain variable region comprising a plurality of human heavy chain V gene segments, a plurality of human heavy chain D gene segments, and a plurality of human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a murine constant region; and an immunoglobulin light chain variable region which is a single rearranged human light chain variable region comprising a single human light chain V gene segment and a single human light chain J gene segment, wherein the human immunoglobulin light chain variable region is operably linked to a murine light chain constant region.

[0037] In some embodiments, a single rearranged human light chain variable region is a single rearranged human kappa light chain variable region comprising a single human light chain Vκ gene segment and a single human light chain Jκ gene segment. In some embodiments, the single human light chain Vκ gene segment is a Vκ1-39 or Vκ3-20 gene segment, and the single human light chain Jκ gene segment is a Jκ1 or Jκ5 gene segment. In some embodiments, a single rearranged human kappa light chain variable region comprises a Vκ1-39 gene segment and a Jκ5 gene segment. In some embodiments, a single rearranged human kappa light chain variable region comprises a Vκ3-20 gene segment and a Jκ1 gene segment.

[0038] In some embodiments, the murine light chain constant region is the mouse kappa light chain constant region. In some embodiments, a single rearranged human light chain variable region is operably linked to the mouse kappa light chain constant region. In some embodiments, a single rearranged human light chain variable region is operably linked to the mouse kappa light chain constant region located at the endogenous mouse kappa light chain locus.

[0039] In some embodiments, the host comprises a functional ADAM6 gene or a fragment thereof, and optionally, the host is a genetically modified mouse, and the functional ADAM6 gene is the mouse ADAM6 gene.

[0040] In some embodiments, the first sample includes a population of B cells from primary or secondary lymphoid organs, such as B cells from bone marrow and / or spleen samples, B cells from lymph nodes, B cells from Peyer's patches, and so on.

[0041] In some embodiments, obtaining multiple nucleic acid sequences encoding multiple human immunoglobulin heavy chain variable domains from a first sample involves preparing cDNA from the nucleic acid sequences and sequencing the rearranged heavy chain VDJ sequences in the first sample.

[0042] In certain embodiments, multiple nucleic acid sequences encoding multiple immunoglobulin variable domains obtained from a first sample are determined using DNA sequencing techniques.

[0043] In some embodiments, the second sample is, or includes, any body fluid containing antibodies. In some embodiments, the second sample is, or includes, serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, or placenta. In some embodiments, determining the peptide sequence from the second sample involves analysis of the heavy chain variable domains of the antibody population in the second sample by mass spectrometry (including, for example, liquid chromatography and mass spectrometry (LC-MS)). The methods described herein may include proteolytic digestion of the heavy chain variable domains of the antibody population prior to mass spectrometry analysis.

[0044] In some embodiments, the method described herein includes depleting the second sample of antibodies that do not target a specific antigen. In some embodiments, the method described herein includes depleting the second sample of antibodies that target different antigens and / or different epitopes of the same antigen (e.g., the antigen used to immunize the host). In some embodiments, the method described herein includes enriching the second sample for antibodies that target the antigen of interest (e.g., the antigen used to immunize the host).

[0045] In some embodiments, matching the amino acid sequences of multiple human immunoglobulin heavy chain variable domains with the peptide sequences of human heavy chain variable domains of an antibody population includes aligning the peptide sequences of heavy chain and / or light chain variable domains of an antibody population with the amino acid sequences of multiple immunoglobulin variable domains, and optionally with each other.

[0046] In some embodiments, the present disclosure provides a method for identifying a human immunoglobulin heavy chain variable domain or CDR sequence (e.g., a CDR3 sequence) of an antigen-specific antibody, comprising: (i) obtaining a plurality of peptide sequences of human immunoglobulin heavy chain and / or light chain variable domains obtained from a sample comprising an antibody population produced by rodents immunized with the antigen; and (ii) matching a library of human immunoglobulin heavy chain and / or light chain variable domain sequences with the plurality of peptide sequences (wherein the library comprises a plurality of human immunoglobulin heavy chain and / or light chain variable domain sequences encoded by B cells of immunized rodents), thereby obtaining a human immunoglobulin variable domain or CDR sequence of an antigen-specific antibody.

[0047] In some embodiments, the Disclosure provides a method for identifying human immunoglobulin variable domains or CDR sequences (e.g., CDR3 sequences) of antigen-specific antibodies, comprising: (i) obtaining a library of human immunoglobulin heavy and / or light chain variable domain sequences comprising a plurality of human immunoglobulin heavy and / or light chain variable domain sequences encoded by B cells of rodents immunized with the antigen; and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin heavy and / or light chain variable domains obtained from a sample comprising an antibody population produced by rodents immunized with the antigen.

[0048] In some embodiments, the immunized rodent has, in its germline genome, an immunoglobulin heavy chain variable region comprising multiple human heavy chain V gene segments, multiple human D gene segments, and multiple human heavy chain J gene segments, and an immunoglobulin light chain variable region that is: (i) a universal light chain coding sequence comprising a single rearranged human light chain variable region, which includes a single human VL gene segment and a single human JL gene segment, operably linked to a mouse light chain constant region; (ii) a restricted light chain variable region comprising two unrearranged human VL gene segments and one or more unrearranged human JL gene segments, operably linked to a mouse light chain constant region; or (iii) a histidine-modified light chain variable region, operably linked to a mouse light chain constant region, comprising one or more human light chain V gene segments and one or more human light chain J gene segments and further comprising at least one substitution of a non-histidine residue with histidine or at least one histidine insertion. In some embodiments, the provided method comprises (i) obtaining a library of human immunoglobulin heavy chain variable domain sequences comprising a plurality of human immunoglobulin heavy chain variable domain sequences encoded by B cells of rodents immunized with an antigen, and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin heavy chain variable domains obtained from a sample comprising an antibody population produced by rodents immunized with that antigen.

[0049] In some embodiments, the immunized rodent has, in its germline genome, an immunoglobulin light chain variable region comprising multiple unrearranged human VL gene segments and multiple unrearranged human JL gene segments, operably linked to a mouse light chain constant region, and an immunoglobulin heavy chain variable region that is: (i) a restricted unrearranged heavy chain variable region comprising a single human VH gene segment, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, operably linked to a mouse heavy chain constant region; (ii) a universal heavy chain coding sequence comprising a single rearranged human heavy chain variable region, which includes a single human VH gene segment, a single human DH gene segment, and a single human JH gene segment, operably linked to a mouse heavy chain constant region; or (iii) a histidine-modified unrearranged heavy chain variable region, operably linked to a mouse heavy chain constant region, comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments and further comprising at least one substitution of a non-histidine residue with histidine or at least one histidine insertion. In some embodiments, the provided method comprises (i) obtaining a library of human immunoglobulin light chain variable domain sequences comprising a plurality of human immunoglobulin light chain variable domain sequences encoded by B cells of rodents immunized with an antigen, and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin light chain variable domains obtained from a sample comprising an antibody population produced by rodents immunized with the antigen.

[0050] In some embodiments, the method described herein may include obtaining a nucleotide sequence of the human heavy chain variable domain of an antigen-specific antibody and expressing the obtained nucleotide sequence encoding the human immunoglobulin heavy chain variable domain in an antigen-binding protein. In some embodiments, the antigen-binding protein is a second (e.g., recombinant) antibody.

[0051] In some embodiments, a nucleotide sequence encoding a human heavy chain variable domain is operably linked to a human immunoglobulin heavy chain constant region and expressed in a cell line to generate a human immunoglobulin heavy chain. In some embodiments, the human immunoglobulin heavy chain may be expressed in a cell line together with a human immunoglobulin light chain. In some embodiments, the human immunoglobulin light chain may be derived from the same single rearranged variable region sequence present in mice, or a somatic variant thereof.

[0052] In some embodiments, the method described herein involves expressing an obtained nucleotide sequence encoding a human immunoglobulin variable domain in a recombinant antigen-binding protein. In some embodiments, the recombinant antigen-binding protein is a second recombinant antibody. In some embodiments, the second antibody is a human antibody and may be a bispecific antibody. The second antibody can be purified, and the affinity and/or specificity of the purified second antibody can be determined for a particular antigen.

[0053] In some embodiments, the sample for determining the peptide sequences of the heavy chain and / or light chain variable domains is any body fluid containing an antibody, or includes such fluid. In some embodiments, the second sample is serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, or placenta, or a combination thereof, or includes such fluid. In some embodiments, determining the peptide sequences of the heavy chain and / or light chain variable domains involves MS analysis (e.g., LC / MS analysis). In some embodiments, determining the peptide sequences of the heavy chain and / or light chain variable domains involves MS analysis (e.g., LC / MS analysis) of a sample containing an antibody obtained from a host immunized with an antigen.

[0054] In some embodiments, the library of amino acid sequences containing multiple human immunoglobulin variable domains is encoded by multiple nucleic acids obtained from a host immunized with an antigen. In some embodiments, the library of amino acid sequences containing multiple human immunoglobulin variable domains is encoded by multiple nucleic acids obtained from a B cell sample, such as a bone marrow and / or spleen sample.

[0055] These and other features and advantages provided in this disclosure will be better understood from the following detailed description in conjunction with the attached claims. It should be noted that the claims are defined by the statements within the claims, and not by any specific discussion of the features and advantages described herein.

[Brief Description of the Drawings]

[0056] [Figure 1] This includes a schematic outline of an exemplary method for obtaining antibodies for an exemplary target antigen using LC-MS in parallel with next-generation sequencing.

[0057] [Figure 2A] This includes a graph showing the diversity of human heavy chain V gene use in IgG obtained from the spleen and bone marrow of CD22-immunized mouse donors (expressed as a percentage of the sequence on the Y-axis). [Figure 2B] This includes a graph showing the diversity of human heavy chain J gene use in IgG obtained from the spleen and bone marrow of CD22-immunized mouse donors (expressed as a percentage of the sequence on the Y-axis).

[0058] [Figure 3A] Shows next-generation sequencing analysis revealing overlap of HCDR3 sequences in spleens from different mice (approximately 2% overlap). [Figure 3B] Shows next-generation sequencing analysis revealing overlap of HCDR3 sequences (10-14% overlap) between the bone marrow and spleen of the same mice.
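
A percentage of shared HCDR3 sequences between two repertoires, of the kind reported for Figures 3A and 3B, could be computed as sketched below. The disclosure does not state how the percentage was defined; this sketch assumes shared unique sequences divided by all unique sequences in either repertoire. The function name and toy sequences are hypothetical.

```python
def percent_shared_cdr3(cdr3_a, cdr3_b):
    """Percentage of unique CDR3 sequences present in both repertoires.

    Assumed definition (not from the disclosure):
    100 * |A intersect B| / |A union B| over unique sequences.
    """
    a, b = set(cdr3_a), set(cdr3_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Toy repertoires, not taken from the disclosure.
spleen = ["ARDYW", "ARGGW", "ARSSW", "ARTTW"]
bone_marrow = ["ARDYW", "ARKKW"]

shared = percent_shared_cdr3(spleen, bone_marrow)
print(shared)
```

Other denominators (for example, the size of the smaller repertoire) are equally plausible and would give different percentages.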

[0059] [Figure 4] An example of selecting an anti-CD22 antibody based on mass spectral matching and NGS count from a group of antibodies containing homologous CDR3 sequences is shown. The top of Figure 4 shows the sequences of the variable domains of the heavy chain of the anti-CD22 antibody. The dashed boxes depict the CDR1, CDR2, and CDR3 sequences (from left to right, respectively). The underlines indicate sequence coverage by mass spectrometry analysis, with CDR1 having 100% coverage, CDR2 having 0% coverage, and CDR3 having 100% coverage.
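
The per-CDR coverage figures described for Figure 4 can be reproduced in principle with a short sketch. It assumes coverage is the fraction of residues in a CDR that fall within one or more MS peptides matched exactly to the variable domain sequence. The sequences, CDR definitions, and function names below are hypothetical and are not the anti-CD22 sequences of the figure.

```python
def covered_positions(sequence, peptides):
    """Mark each residue of `sequence` that lies inside any matching peptide."""
    covered = [False] * len(sequence)
    for pep in peptides:
        if not pep:
            continue
        start = sequence.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            # Continue searching for further (possibly overlapping) matches.
            start = sequence.find(pep, start + 1)
    return covered


def cdr_coverage(sequence, peptides, cdr):
    """Percent of residues in `cdr` (a substring of `sequence`) that are covered."""
    start = sequence.index(cdr)  # raises ValueError if the CDR is absent
    span = covered_positions(sequence, peptides)[start:start + len(cdr)]
    return 100.0 * sum(span) / len(span)


# Toy heavy chain variable domain fragment with hypothetical CDRs.
vh = "QVQLGYTFTSYWVRQINPSGGSRVTMARDGYFDYWGQ"
cdrs = {"CDR1": "GYTFTSY", "CDR2": "INPSGGS", "CDR3": "ARDGYFDY"}
ms_peptides = ["QLGYTFTSYWVR", "VTMARDGYFDYWGQ"]

coverage = {name: cdr_coverage(vh, ms_peptides, seq) for name, seq in cdrs.items()}
print(coverage)
```

In this toy case the peptides span CDR1 and CDR3 completely and miss CDR2, mirroring the 100% / 0% / 100% pattern described for the figure.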

[0060] [Figure 5] Shows the diversification of antibodies based on expressed CDR3 sequences obtained from universal light chain mice. Antibodies were classified based on differences in their CDR3 sequences, and a diverse repertoire was selected for further cloning and characterization.

[Modes for carrying out the invention]

[0061] This disclosure provides a method for obtaining antibodies having human variable domains using a combination of mass spectrometry and next-generation sequencing. This disclosure further provides a method for producing antibodies.

[0062] Specific definitions

As used in this disclosure, unless otherwise indicated, the following terms shall be understood to have the following meanings. Unless otherwise required by context, singular terms shall include plural forms, and plural terms shall include singular forms.

[0063] Furthermore, unless otherwise explicitly indicated by the context, the singular forms "a," "an," and "the" include plural referents. Thus, for example, a reference to "a method" includes one or more methods and/or steps of the type described herein and/or which will become apparent to those skilled in the art upon reading this disclosure.

[0064] The terms "approximately" or "about" imply that the value falls within a meaningful range. The permissible variation encompassed by the terms "approximately" or "about" depends on the specific system being tested and will be readily apparent to those skilled in the art.

[0065] The term "antigen" refers to any active substance (e.g., proteins, peptides, polysaccharides, lipids, glycoproteins, glycolipids, nucleotides, nucleic acids, polymers, and / or parts or combinations thereof) that, when introduced into an immunocompetent host, is recognized by the host's immune system and triggers an immune response by the host. In some embodiments, the antigen triggers a humoral response (e.g., the production of antigen-specific antibodies).

[0066] Terms such as “antibody,” “antigen-binding protein,” or “epitope-binding protein” refer to monoclonal antibodies, IgA antibodies, IgG antibodies, IgE antibodies, or IgM antibodies, multispecific antibodies, human antibodies, humanized antibodies, chimeric antibodies, reverse chimeric antibodies, antibodies containing a light chain variable gene segment in the heavy chain, antibodies containing a heavy chain variable gene segment in the light chain, as well as single-chain Fv(scFv), single-chain antibodies, Fab fragments, F(ab') fragments, disulfide-bonded Fv(sdFv), intrabodies, minibodies, diabodies, and anti-idiotype (anti-Id) antibodies (e.g., anti-Id antibodies against antigen-specific TCRs), as well as any of the epitope-binding fragments described above. Accordingly, the “antigen-binding fragment,” “antigen-binding moiety,” and “epitope-binding fragment” of antigen-binding molecules are also included herein, and these refer to fragments that retain the ability to bind to an antigen. The term “antigen-binding protein” also includes, for example, single-domain antibodies, heavy-chain-only antibodies, covalent diabodies such as those disclosed in U.S. Patent Application Publication 20070004909 (which is incorporated herein by reference in its entirety), and Ig-DARTS such as those disclosed in U.S. Patent Application Publication 20090060910 (which is incorporated herein by reference in its entirety). In some specific embodiments, the antibody is a canonical antibody comprising at least two heavy (H) chains and two light (L) chains (for example, interconnected by disulfide bonds).

[0067] Terms such as "specifically binding," "binding in a specific manner," and "antigen-specific" indicate that the molecules involved in specific binding are (1) able to stably bind to each other under physiological conditions (e.g., associate, e.g., form intermolecular non-covalent bonds), and (2) unable to stably bind to molecules other than the specific binding pair under physiological conditions. Specific binding can also be characterized by an equilibrium dissociation constant (KD) in the range from low micromolar to picomolar. High specificity can be found in the low nanomolar range, and very high specificity can be found in the picomolar range. Methods for determining whether two molecules specifically bind are well known in the art and include, for example, equilibrium dialysis and surface plasmon resonance.
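
For orientation, the equilibrium dissociation constant referred to above has the standard textbook form for a simple 1:1 interaction between an antibody A and an antigen B forming a complex AB. The symbols [A], [B], [AB], k_on, and k_off are introduced here for illustration and do not appear in the disclosure.

```latex
A + B \rightleftharpoons AB, \qquad
K_D = \frac{[A]\,[B]}{[AB]} = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}
```

Under this definition a smaller K_D corresponds to tighter binding, which is why the picomolar range is associated above with very high specificity.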

[0068] A "host" refers to an animal or non-human mammal that produces immune system proteins in response to a foreign molecule or antigen introduced into the host via injection or other appropriate route. The introduction of an antigen or other foreign substance into the host triggers antibody production and associated immune responses.

[0069] Terms such as “non-human animal” refer to any vertebrate that is not human. In some embodiments, non-human animals are cyclostomes, bony fish, cartilaginous fish (e.g., sharks or rays), amphibians, reptiles, mammals, and birds. In some embodiments, non-human animals are mammals. In some embodiments, non-human mammals are primates, goats, sheep, pigs, dogs, cattle, or rodents. Various non-human animals are described further later herein. Furthermore, as used herein, the term “genetically modified non-human animal” refers to a “non-human animal” as described above in which the genetic material of the animal has been modified using genetic engineering techniques, for example, to introduce, delete, enhance, suppress, or mutate gene sequences of the animal.

[0070] Terms such as "humanized," "chimeric," and "human/non-human" are commonly used to refer to antibodies (or antigen-binding proteins, or antibody components) containing a sequence (e.g., nucleic acids, proteins, etc.) in which at least a portion of the sequence is derived from humans, or at least a portion of the sequence is of non-human origin (e.g., from rodents, e.g., mice), and the modified (e.g., humanized, chimeric, human/non-human, etc.) molecule has been modified by replacing the corresponding portion of the corresponding human antibody (or antigen-binding protein, or antibody component) sequence so that the modified molecule retains its biological function and/or maintains the structure that performs the retained biological function. For example, a chimeric antibody includes VH and VL region sequences found in a first species (e.g., a human) and constant region sequences found in a second, different species (e.g., a non-human animal, e.g., a rodent, e.g., a mouse). In some embodiments, antibodies containing human VH and VL regions linked to a non-human constant region (e.g., a mouse constant region) are called "reverse chimeric antibodies." In contrast, "human" antibodies, for example, contain sequences that are of human origin only (e.g., human nucleotide and/or protein sequences).

[0071] The terms “genetically modified non-human animal” and “genetically engineered non-human animal” are used interchangeably herein and refer to any non-human animal that does not exist in nature (e.g., rodents, e.g., rats or mice) in which one or more cells of the non-human animal contain, in whole or in part, heterologous nucleic acids and/or heterologous genes encoding a polypeptide of interest. For example, in some embodiments, “genetically modified non-human animal” or “genetically engineered non-human animal” refers to a non-human animal containing a transgene or transgene construct described herein. In some embodiments, heterologous nucleic acids and/or heterologous genes are introduced into cells directly or indirectly by intentional gene manipulation, e.g., by introduction into progenitor cells by microinjection or infection with recombinant viruses. The term gene manipulation does not include classical breeding techniques and rather refers to the introduction of recombinant DNA molecules. These molecules may be incorporated into chromosomes. The terms "genetically modified non-human animal" and "genetically engineered non-human animal" include animals that are heterozygous or homozygous for the heterologous nucleic acids and/or heterologous genes, and/or animals that have one or more copies of the heterologous nucleic acids and/or heterologous genes.

[0072] As used herein, the term “germline configuration” refers to the arrangement of sequences (e.g., gene segments) found in the endogenous germline genome of wild-type animals (e.g., mice, rats, or humans). Examples of the germline configuration of immunoglobulin gene segments can be found, for example, in LeFranc, MP., The Immunoglobulin FactsBook, Academic Press, May 23, 2001 (referred to herein as “LeFranc 2001”). Exemplary configurations of human heavy chain variable region gene segments and human heavy chain constant region genes can be seen on page 47 of LeFranc 2001; exemplary configurations of human λ light chain variable region gene segments and human λ light chain constant region genes can be seen on page 61 of LeFranc 2001; exemplary configurations of human κ light chain variable region gene segments and human κ light chain constant region genes can be seen on page 53 of LeFranc 2001; exemplary configurations of mouse heavy chain variable region gene segments and mouse heavy chain constant region genes can be seen in Lucas, J. et al., Chapter 1: The Structure and Regulation of the Immunoglobulin Loci, Molecular Biology of B Cells, 2nd Edition, Academic Press, 2015 (Lucas); exemplary configurations of mouse λ light chain variable region gene segments and mouse λ light chain constant region genes can be seen in LeFranc, MP et al., Chapter 4: Immunoglobulin Lambda (IGL) Genes of Human and Mouse, Molecular Biology of B Cells, 1st Edition, Academic Press, 2004 (LeFranc 2004); and exemplary configurations of mouse κ light chain variable region gene segments and mouse κ light chain constant region genes can be seen in Christele, MJ, et al., Nomenclature and Overview of the Mouse (Mus musculus and Mus sp.) Immunoglobulin Kappa (IGK) Genes, Exp Clin Immunogenet 2001, 18:255-279 (Christele).

[0073] Each of the cited sections of LeFranc 2001, Lucas, LeFranc 2004, and Christele listed above is incorporated herein by reference.

[0074] As used herein, the term “germline genome” refers to the genome found in the germ cells (e.g., gametes, e.g., sperm or eggs) from which an animal is formed. The germline genome is the source of the genomic DNA of the animal's cells. Therefore, an animal with a modified germline genome (e.g., a mouse or rat) is considered to have the modification in the genomic DNA of all of its cells.

[0075] As used herein, the term “germline sequence” refers to a DNA sequence found in the endogenous germline genome of a wild-type animal (e.g., mouse, rat, or human), or an RNA or amino acid sequence encoded by a DNA sequence found in the endogenous germline genome of an animal (e.g., mouse, rat, or human). Representative germline sequences of immunoglobulin gene segments can be found, for example, in LeFranc 2001. Representative germline nucleotide sequences and representative germline amino acid sequences of human VH gene segments that may be used in some embodiments described herein can be found on pages 107–234 of LeFranc 2001; representative germline nucleotide sequences and representative germline amino acid sequences of human D gene segments that may be used in some embodiments described herein can be found on pages 98-100 of LeFranc 2001; representative germline nucleotide sequences and representative germline amino acid sequences of human JH gene segments that may be used in some embodiments described herein can be found on page 104 of LeFranc 2001; representative germline nucleotide sequences and representative germline amino acid sequences of human Vλ gene segments that may be used in some embodiments of non-human animals described herein can be found on pages 350-428 of LeFranc 2001; and representative germline nucleotide sequences and representative germline amino acid sequences of human Jλ gene segments that may be used in some embodiments of non-human animals described herein can be found on page 346 of LeFranc 2001.

[0076] Each of the cited sections of LeFranc 2001 listed above is incorporated herein by reference.

[0077] The term “complementarity-determining region” or “CDR” refers to an amino acid sequence encoded by the nucleic acid sequence of an immunoglobulin gene in an organism, which is typically (i.e., in wild-type animals) found between two framework (FR) regions of the light or heavy chain variable domain of an immunoglobulin molecule (e.g., an antibody). CDRs can be encoded, for example, by germline sequences or rearranged or unrearranged sequences, and by, for example, naive B cells or mature B cells. CDRs can be somatically mutated (unlike sequences encoded in, for example, animal germline sequences), humanized, and / or modified by amino acid substitutions, additions, or deletions. In some situations (e.g., CDR3), a CDR may be encoded by two or more sequences (e.g., germline sequences) which are not adjacent (e.g., in unrearranged nucleic acid sequences) but are adjacent in B cell nucleic acid sequences as a result of, for example, sequence linking (e.g., VDJ recombination to form heavy chain CDR3). Specific systems for defining CDR boundaries have been established in the art (e.g., Kabat, Chothia). A person skilled in the art can understand the differences between these systems and comprehend the CDR boundaries to the extent necessary to understand and implement the claimed invention.

[0078] The terms "gene segment" or "segment" include references to a variable (V) gene segment (e.g., an immunoglobulin light chain variable (VL) gene segment or an immunoglobulin heavy chain variable (VH) gene segment), an immunoglobulin heavy chain diversity (D) gene segment, or a joining (J) gene segment (e.g., an immunoglobulin light chain joining (JL) gene segment or an immunoglobulin heavy chain joining (JH) gene segment), each of which includes an unrearranged sequence at an immunoglobulin locus that can participate in rearrangement (e.g., mediated by an endogenous recombinase) to form a rearranged light chain VL/JL sequence or a rearranged heavy chain VH/D/JH sequence. Unless otherwise indicated, unrearranged V, D, and J segments contain recombination signal sequences (RSS) that enable VL/JL recombination or VH/DH/JH recombination according to the 12/23 rule.
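
The 12/23 rule mentioned above can be expressed as a one-line check: two gene segments can be joined only when one is flanked by an RSS with a 12-base-pair spacer and the other by an RSS with a 23-base-pair spacer. The sketch below is illustrative only. The function name is hypothetical, and the spacer assignments for the heavy chain locus (VH: 23, D: 12 on both sides, JH: 23) are the commonly described textbook arrangement rather than something stated in this disclosure.

```python
def can_recombine(spacer_a, spacer_b):
    """12/23 rule: joining requires one 12-bp-spacer RSS and one 23-bp-spacer RSS."""
    return {spacer_a, spacer_b} == {12, 23}


# Textbook heavy chain arrangement (assumption, see lead-in):
# VH carries a 23-bp-spacer RSS, D carries 12-bp-spacer RSSs on both sides,
# and JH carries a 23-bp-spacer RSS.
VH_RSS, D_RSS, JH_RSS = 23, 12, 23

d_to_j = can_recombine(D_RSS, JH_RSS)   # D-to-JH joining is permitted
v_to_d = can_recombine(VH_RSS, D_RSS)   # VH-to-D joining is permitted
v_to_j = can_recombine(VH_RSS, JH_RSS)  # direct VH-to-JH joining is not

print(d_to_j, v_to_d, v_to_j)
```

Under these assumed spacer lengths, the rule explains why a heavy chain rearrangement includes a D segment between VH and JH.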

[0079] As used herein, the term “rearranged” refers to a DNA sequence in which two or more immunoglobulin gene segments are joined together (directly or indirectly) so that the joined gene segments together encode an immunoglobulin variable region. The two or more immunoglobulin gene segments of a rearranged DNA sequence no longer have functional recombination signal sequences (RSS) and therefore cannot undergo further rearrangement. While the two or more immunoglobulin gene segments of a rearranged DNA sequence may not be able to undergo further rearrangement, those skilled in the art will recognize that this does not mean that other immunoglobulin gene segments within the same locus cannot undergo, for example, secondary rearrangement. Those skilled in the art will understand that rearranged gene segments (e.g., within a rearranged immunoglobulin variable region) can be joined together via the natural VDJ recombination process. Those skilled in the art will also understand that rearranged gene segments (e.g., within a rearranged immunoglobulin variable region) can be engineered to be joined together, for example, by joining gene segments using standard recombinant techniques. A rearranged immunoglobulin variable region typically contains two or more joined immunoglobulin gene segments. For example, a rearranged immunoglobulin λ light chain variable region may contain a Vλ gene segment joined to a Jλ gene segment. A rearranged immunoglobulin heavy chain variable region may contain a joined VH gene segment, D gene segment, and JH gene segment. A person skilled in the art will also understand that all or substantially all intergenic sequences are typically removed from between the immunoglobulin gene segments of a rearranged immunoglobulin variable region. A person skilled in the art will further understand that the rearranged sequences may include, in particular, introns in the gene segments.

[0080] As used herein, the term “unrearranged” refers to a DNA sequence comprising two or more immunoglobulin gene segments that have not undergone a recombination event or are otherwise not joined, and therefore contain intergenic sequences between them. Those skilled in the art will understand that unrearranged V and J gene segments may each be accompanied by an intact recombination signal sequence (RSS). An unrearranged D gene segment may be flanked by two intact recombination signal sequences (RSS). Those skilled in the art will further understand that unrearranged gene segments (e.g., unrearranged V gene segments) may, in particular, contain introns.

[0081] As used herein, the terms “protein” or interchangeably “polypeptide” encompass all types of naturally occurring and synthetic proteins, including but not limited to protein fragments, peptides, fusion proteins, and modified proteins of any length (including but not limited to glycoproteins), as well as all other types of modified proteins (including, but not limited to, proteins resulting from phosphorylation, acetylation, myristoylation, palmitoylation, glycosylation, oxidation, formylation, amidation, polyglutamylation, ADP-ribosylation, PEGylation, and biotinylation).

[0082] The terms “nucleic acid” and “nucleotide” encompass both DNA and RNA unless otherwise specified. In particular, the terms “nucleic acid” and “nucleotide sequence” are used interchangeably herein.

[0083] Terms such as “operably linked” refer to a juxtaposition in which the listed components are related in a way that enables them to function as intended. For example, an unrearranged variable region gene segment is “operably linked” to an adjacent constant region gene if that segment can be rearranged to form a rearranged variable region gene that is expressed in B cells or their progenitor cells together with the constant region gene as a polypeptide chain of an antigen-binding protein. Regulatory sequences “operably linked” to a coding sequence are arranged in such a way that the expression of the coding sequence is achieved under conditions that are compatible with the regulatory sequence. “Operably linked” sequences include both expression regulatory sequences adjacent to the target gene and expression regulatory sequences that act in trans or at a distance to regulate the target gene (or target sequence). The term “expression regulatory sequence” includes polynucleotide sequences that are necessary to result in the expression and processing of the coding sequence to which they are linked. Expression regulatory sequences include appropriate transcription start, termination, promoter, and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (i.e., Kozak consensus sequences); sequences that enhance polypeptide stability; and, if necessary, sequences that enhance polypeptide secretion. The nature of such regulatory sequences varies depending on the host organism. For example, in prokaryotes, such regulatory sequences generally include promoters, ribosome binding sites, and transcription termination sequences, while in eukaryotes, such regulatory sequences usually include promoters and transcription termination sequences. The term “regulatory sequence” is intended to include components whose presence is essential for expression and processing, and may also include additional components whose presence is advantageous, such as leader sequences.

[0084] The term “heterologous” refers to an agent or entity from a different source. For example, when used in relation to a polypeptide, gene, or gene product present in a particular cell or organism, the term clarifies that the polypeptide, gene, or gene product in question is 1) manipulated by human hands; 2) introduced into a cell or organism (or its precursor) through human hands (e.g., via genetic engineering); and/or 3) not produced in nature by, nor present in, the cell or organism in question (e.g., the cell type or organism type). “Heterologous” also includes polypeptides, genes, or gene products that are normally present in a particular natural cell or organism but have been altered or modified, for example, by mutation or by being placed under the control of elements not naturally associated with them (in some embodiments, non-endogenous regulatory elements, e.g., promoters).

[0085] An antibody "heavy chain" typically contains an immunoglobulin heavy chain variable domain and an immunoglobulin heavy chain constant domain. The variable domain can be further subdivided into highly variable regions called complementarity-determining regions (CDRs), which are interspersed with more conserved regions called framework regions (FRs). Unless otherwise specified, a heavy chain variable domain contains three heavy chain CDRs and four FR regions (e.g., FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4). Fragments of a heavy chain include CDRs, CDRs and FRs, and combinations thereof. Generally, a full-length heavy chain contains, from N-terminus to C-terminus, a heavy chain variable domain including FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4, a CH1 domain, a hinge, a CH2 domain, and a CH3 domain. In some embodiments, a full-length heavy chain also includes a CH4 domain (e.g., IgE and IgM isotype antibodies). A functional fragment of a heavy chain includes a fragment that can specifically recognize an epitope (e.g., recognize the epitope with a KD in the micromolar, nanomolar, or picomolar range), that can be expressed in and secreted from a cell, and that contains at least one CDR.

[0086] The term "light chain" includes immunoglobulin light chain sequences from any organism, and unless otherwise specified, includes human κ and λ light chains, as well as surrogate light chains (e.g., VpreB, λ5, etc.). Unless otherwise specified, light chain variable domains typically contain three light chain CDRs and four framework (FR) regions. Generally, a full-length light chain contains, from the amino terminus to the carboxyl terminus, a VL domain including FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4, and a light chain constant domain. Light chains include, for example, those that do not selectively bind to either a first or second epitope selectively bound by the epitope-binding protein in which the light chain appears. Light chains also include those that bind to and recognize one or more epitopes selectively bound by the epitope-binding protein in which the light chain appears, or those that assist the heavy chain in binding to and recognizing them. Examples of light chains include universal or common light chains, such as those derived from a single rearranged human light chain variable region, such as human Vκ1-39Jκ5 or human Vκ3-20Jκ1 as described herein, and their somatically mutated (e.g., affinity-matured) versions.

[0087] When used in reference to a rearranged variable region gene or variable domain that "derives" from a non-rearranged variable region and / or non-rearranged variable region gene segment, the term "derives from" means that the sequence of the rearranged variable region gene or variable domain can be traced back to a set of non-rearranged variable region gene segments that were rearranged to form the rearranged variable region gene expressing this variable domain (considering splicing differences and somatic mutations, where applicable). For example, a rearranged variable region gene that has undergone somatic hypermutation does not change the fact that it originates from a non-rearranged variable region gene segment. Furthermore, in the context of universal light chains, the term "derives from" may mean that the expressed antibody sequence can be traced back to a universal or single rearranged light chain present in the mouse genome. Such a light chain that originates from a single rearranged light chain sequence in the genome may differ from the single rearranged light chain sequence due to somatic hypermutation.

[0088] As used herein, the term “locus” refers to a region on a chromosome that contains a set of related genetic elements (e.g., genes, gene segments, or regulatory elements). For example, a non-rearranged immunoglobulin locus may include immunoglobulin variable region gene segments, one or more immunoglobulin constant region genes, and related regulatory elements (e.g., promoters, enhancers, switch elements, etc.) that direct V(D)J recombination and immunoglobulin expression. Loci can be endogenous or non-endogenous. The term “endogenous locus” refers to a location on a chromosome where a particular genetic element is found naturally.

[0089] Conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art can be used in accordance with the disclosures herein. Such techniques are well described in the literature. See, for example, Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, 1989 (herein referred to as "Sambrook et al., 1989"), DNA Cloning: A Practical Approach, Volumes I and II (D. N. Glover ed. 1985), Oligonucleotide Synthesis (M. J. Gait ed. 1984), Nucleic Acid Hybridization [B. D. Hames & S. J. Higgins eds. (1985)], Transcription And Translation [B. D. Hames & S. J. Higgins, eds. (1984)], Animal Cell Culture [R. I. Freshney, ed. (1986)], Immobilized Cells And Enzymes [IRL Press, (1986)], B. Perbal, A Practical Guide To Molecular Cloning (1984), and Ausubel, F. M. et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, Inc., 1994 (each of these publications is incorporated herein by reference in its entirety). These techniques include site-directed mutagenesis. See, for example, Kunkel, Proc. Natl. Acad. Sci. USA 82:488-492 (1985), U.S. Patent No. 5,071,743, Fukuoka et al., Biochem. Biophys. Res. Commun. 263:357-360 (1999), Kim and Maas, BioTech. 28:196-198 (2000), Parikh and Guengerich, BioTech. 24:428-431 (1998), Ray and Nickoloff, BioTech. 13:342-346 (1992), Wang et al., BioTech. 19:556-559 (1995), Wang and Malcolm, BioTech. 26:680-682 (1999), Xu and Gong, BioTech. 26:639-641 (1999), U.S. Patent Nos. 5,789,166 and 5,932,419, Hogrefe, Strategies 14.3:74-75 (2001), U.S. Patent Nos. 5,702,931, 5,780,270 and 6,242,222, Angag and Schutz, Biotech. 30:486-488 (2001), Wang and Wilkinson, Biotech. 29:976-978 (2000), Kang et al., Biotech. 20:44-46 (1996), Ogel and McPherson, Protein Engineer. 5:467-468 (1992), Kirsch and Joly, Nucl. Acids. Res. 26:1848-1850 (1998), Rhem and Hancock, J. Bacteriol. 178:3346-3349 (1996), Boles and Miogsa, Curr. Genet. 28:197-198 (1995), Barrenttino et al., Nucl. Acids. Res. 22:541-542 (1993), Tessier and Thomas, Meths. Molec. Biol. 57:229-237, and Pons et al., Meth. Molec. Biol. 67:209-218 (each of these publications is incorporated herein by reference in its entirety).

Method for identifying antigen-specific antibodies

[0090] This disclosure provides methods for identifying and / or selecting sequences of antigen-binding proteins (e.g., antibodies) having human variable domains. Various methods described herein utilize nucleic acid sequencing and mass spectrometry (MS) to select antibody sequences (e.g., variable domain sequences or CDR sequences) that bind to a specific antigen. In exemplary embodiments, LC-MS and next-generation sequencing (NGS) are used to select an antibody or variable domain sequence from a group of variable domain sequences. In some embodiments, LC-MS and NGS utilize information about human immunoglobulin variable domains to identify and acquire an antibody targeting a given antigen. In some embodiments, the complementarity-determining region 3 (CDR3) of the antibody of interest is identified and acquired.

[0091] In various embodiments, the methods described herein enable the identification of antigen-specific antibody sequences from genetically modified non-human animals that cannot be readily detected by conventional methods. Known methods for identifying antibodies from genetically modified animals generally rely on the presence of viable B cells and / or antibody expression on the surface of B cells (e.g., by hybridoma technology). The methods provided herein enable the identification / isolation of antibodies in the absence of viable cells (e.g., B cells). In some embodiments, the methods provided herein enable the identification / isolation of secreted antibodies (e.g., in serum). The methods provided herein also enable the identification of antibodies from antibody sources not typically used in conventional antibody identification methods.

[0092] In some embodiments, the methods provided herein can be used in conjunction with conventional antibody identification / isolation methods to enrich and / or increase the pool of antibodies obtained from genetically modified animals against a target antigen. For example, the methods described herein can be used in conjunction with hybridoma technology or with methods including direct isolation from antigen-positive B cells. See, for example, U.S. Patent No. 7,582,298 (which is incorporated herein by reference in its entirety).

[0093] Adaptive immune responses are highly specific and function as long-term immune defenses that retain memory for future antigen encounters. Adaptive immune responses are antigen-specific and are partially mediated by V(D)J recombination or rearrangement. Immunoglobulin V(D)J recombination occurs in developing B cells in the bone marrow, enabling broad antigen recognition. VDJ rearrangement is the rearrangement of the variable (V), diversity (D), and joining (J) gene segments in the heavy chain of immunoglobulins. This process is similar in the light chain, but only VJ rearrangement occurs because the light chain lacks the D gene segment.

[0094] Importantly, V(D)J recombination, as well as other antibody diversification processes such as junctional nucleotide addition / removal and somatic hypermutation, generate a large repertoire of antibodies from a limited number of genes. These processes enable the production of specific, high-affinity antibodies against various antigens. This ability to generate antibodies is utilized in genetically modified animals to produce therapeutic antibodies against human targets. Genetically modified mice containing human V(D)J gene segments (e.g., those described in U.S. Patent Nos. 5,633,425, 5,770,429, 5,814,318, 6,075,181, 6,114,598, 6,150,584, 6,998,514, 7,795,494, 7,910,798, 8,232,449, 8,703,485, 8,907,157, and 9,145,588 (each of which is incorporated herein by reference in its entirety), U.S. Patent Publications 2008 / 0098490, 2010 / 0146647, 2013 / 0145484, 2012 / 0167237, 2013 / 0167256, 2013 / 0219535, 2012 / 0207278, and 2015 / 0113668 (each of which is incorporated herein by reference in its entirety), and PCT Publications WO2007117410, WO2008151081, WO2009157771, WO2010039900, WO2011004192, WO2011123708, WO2014093908, WO2006008548, WO2010109165, WO2016062990, WO2018039180, WO2011158009, WO2013041844, WO2013041846, WO2013079953, WO2013061098, WO2013144567, WO2013144566, WO2013171505, WO2019008123, and WO2020169022 (each of which is incorporated herein by reference in its entirety)) are immunized with the target antigen, and antigen-specific antibodies are identified, purified, and subsequently screened for desired therapeutic properties. Other genetically modified mice containing human V(D)J gene segments (e.g., those described in U.S. Patent Nos. 6,596,541, 6,586,251, 8,642,835, 9,706,759, 10,238,093, 8,754,287, 10,143,186, 9,796,788, 10,130,081, 9,226,484, 9,012,717, 10,246,509, 9,204,624, and 9,686,970 (each of which is incorporated herein by reference in its entirety), U.S. Patent Publications 2013 / 0212719, 2015 / 0289489, 2017 / 0347633, 2019 / 0223418, 2018 / 0125043, 2019 / 0261612, and 2019 / 0380316 (each of which is incorporated herein by reference in its entirety), PCT Publications WO2013138680, WO2013138712, WO2013138681, WO2015042250, WO2012148873, WO2013134263, WO2013184761, WO2014160179, WO2017214089, WO2016149678, and WO2017123808, and Murphy, A., “VelocImmune: Immunoglobulin Variable Region Humanized Mouse,” in Recombinant Antibodies for Immunotherapy, New York, NY, Cambridge University Press, 101-107 (2009) (each of which is incorporated herein by reference in its entirety)) are immunized with the antigen of interest to identify and purify antigen-specific antibodies, which are then screened for desired therapeutic properties. Specific exemplary genetically modified non-human animals that can be used in the methods described herein, such as rodents (e.g., rats or mice), are described in more detail in a separate section below. Various embodiments of the present invention make it possible to obtain therapeutic antibodies with desired properties from secreted antibody molecules obtained directly from immunized animals. To obtain secreted antibody molecules, the presence of viable cells expressing antibodies on the cell surface is not required. As described herein, antibodies with desired properties can be obtained from an antibody population using mass spectrometry.

[0095] In the various embodiments described herein, the antibodies obtained / identified by this method may be of any isotype, such as IgM, IgD, IgG, IgA, and IgE. In some embodiments, the antibodies obtained / identified by this method are of the IgG isotype. In other embodiments, the antibodies obtained / identified by this method are of the IgM isotype.

[0096] In some embodiments, the antibodies or antigen-binding proteins obtained / identified by the methods provided herein are not single-domain antibodies, heavy-chain-only antibodies, and / or nanobodies.

[0097] In various embodiments, methods are provided herein for obtaining human immunoglobulin variable domains of antibodies specific to a particular antigen, the methods comprising: obtaining a plurality of nucleic acid sequences encoding a plurality of immunoglobulin variable domains obtained from a first sample from a host immunized with a particular antigen; determining the peptide sequences of the heavy chain and / or light chain variable domains of an antibody population obtained from a second sample from a host, which includes an antibody population targeting the antigen; and matching the amino acid sequences of the encoded plurality of immunoglobulin variable domains with the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population, thereby obtaining human immunoglobulin variable domains of antibodies specific to the antigen. In some embodiments, the matching includes aligning the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population with each other and with the amino acid sequences of the plurality of immunoglobulin variable domains.
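
The matching step described in this paragraph can be illustrated with a minimal Python sketch. It assumes that the NGS-derived variable domain sequences have already been translated into amino acid sequences and that the MS analysis has produced a list of peptide sequences. The exact substring comparison, the function names, and the example sequences are all hypothetical simplifications for illustration; they are not the matching or alignment procedure of this disclosure, and practical MS searches typically involve mass tolerances and scoring.

```python
from collections import defaultdict


def match_peptides(reference, peptides):
    """Map each reference sequence ID to the MS peptides found within it.

    reference: dict of sequence ID -> amino acid sequence (from NGS, translated).
    peptides:  iterable of peptide sequences determined by MS.
    """
    hits = defaultdict(list)
    for seq_id, aa_seq in reference.items():
        for peptide in peptides:
            if peptide in aa_seq:  # exact match only (simplifying assumption)
                hits[seq_id].append(peptide)
    return dict(hits)


def sequence_coverage(aa_seq, matched_peptides):
    """Fraction of residues in aa_seq covered by at least one matched peptide."""
    covered = [False] * len(aa_seq)
    for peptide in matched_peptides:
        start = aa_seq.find(peptide)
        while start != -1:
            covered[start:start + len(peptide)] = [True] * len(peptide)
            start = aa_seq.find(peptide, start + 1)
    return sum(covered) / len(aa_seq)


# Hypothetical example data, not taken from the disclosure.
reference = {
    "VH_001": "EVQLVESGGGLVQPGGSLRLSCAAS",
    "VH_002": "QVQLQESGPGLVKPSETLSLTCTVS",
}
peptides = ["LVQPGGSLR", "EVQLVESGGG", "AAAAAAA"]

hits = match_peptides(reference, peptides)
for seq_id, matched in hits.items():
    print(seq_id, matched, sequence_coverage(reference[seq_id], matched))
```

In this sketch, a reference sequence supported by several peptides with high coverage would be a candidate antigen-specific variable domain; how candidates are actually scored is left open here.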

[0098] In various embodiments, the method further includes obtaining a nucleotide sequence of the human variable domain of an antigen-specific antibody. Due to the degeneracy of the genetic code, multiple nucleotide sequences may encode the human variable domain of an antigen-specific antibody, and in some embodiments described herein, the nucleotide sequence may be optimized, for example, for expression in cells, for example, in mammalian cells.

Samples for sequencing

[0099] This disclosure includes the recognition that information regarding specific antibodies having specific binding properties may be identified using NGS and MS techniques, as further described herein. While the sources of nucleic acids encoding antibodies and the antibodies themselves for use in the methods disclosed herein are not limited to animals, the methods disclosed herein are particularly advantageous when animals (e.g., genetically modified animals as described herein) are the source of both nucleic acid and antibody samples. The methods disclosed herein can still be used in conjunction with other antibody platform technologies or other antibody expression technologies, including, for example, those using phage display or intelligent design approaches.

[0100] Furthermore, this disclosure provides the recognition that antibodies derived from restricted heavy or light chain variable sequences enable simplification of NGS and MS analyses. This is because these analyses can focus on determining the repertoire of variable domains or CDRs (e.g., CDR3) of the unrestricted immunoglobulin chain only. This disclosure also recognizes that antibodies derived from restricted heavy or light chain variable sequences can be obtained from genetically modified non-human animals, e.g., non-human animals containing restricted heavy or light chain variable sequences. Such animals offer advantages, for example, in that the antibodies they produce have undergone natural immune system processes, which may increase the likelihood of high-affinity and specific binding while simultaneously reducing the likelihood of immunogenicity.

[0101] In some embodiments, the antibody sequences analyzed by NGS include a population of antibodies with a limited light chain repertoire, such as a population of universal light chain antibodies. In some embodiments, the antibody sequences analyzed by NGS include a population of antibodies with a limited heavy chain repertoire, such as a population of universal heavy chain antibodies.

[0102] Nevertheless, current technologies enable the identification of full-length heavy and light chains in multiple immunoglobulin molecules using single-cell sequencing approaches (see, for example, DeKosky et al. (2015) Nat. Med. 21(1):85-91, Goldstein et al. (2019) Commun. Biol. 2:304, and Singh et al. (2019) Nat. Commun. 10(1):3120 (the entirety of which is incorporated herein by reference)). Thus, in some embodiments, multiple nucleic acid sequences encoding multiple immunoglobulin heavy and light chain variable domains can be obtained simultaneously from a first sample using a single B-cell next-generation sequencing approach, and therefore the method can encompass identification from non-human animal hosts without limitations on light or heavy chain sequences.

[0103] In some embodiments, the antigen of interest is a disease-associated antigen. In some embodiments, the disease-associated antigen is a tumor antigen. Various tumor antigens are listed in the database of T cell-defined tumor antigens (van der Bruggen P, Stroobant V, Vigneron N, Van den Eynde B. Peptide database: T cell-defined tumor antigens. Cancer Immun 2013). In some other embodiments, the antigen of interest is an infectious disease antigen, such as a viral or bacterial antigen. Non-human animals can be immunized with the antigen of interest in DNA or protein form using techniques known in the art.

[0104] In some embodiments, the first sample comprises a population of B cells. In some embodiments, the population of B cells is isolated from a bone marrow sample and / or a spleen sample. In additional embodiments, the first sample may be obtained from other lymphoid organs, such as lymph nodes or Peyer's patches in the intestines.

[0105] Those skilled in the art will understand that “B cells” may refer to a broad range of B cell subtypes, including but not limited to plasmablasts, plasma cells (e.g., long-lived plasma cells), memory B cells, B-2 cells, follicular (FO) B cells, and marginal zone (MZ) B cells. Those skilled in the art will understand that B cells from different sources may be used in the first sample, depending on the desired source of the antibody obtained by the method described herein.

Sequencing analysis

Sample preparation

[0106] In some embodiments, the methods provided herein may include creating a nucleic acid library containing multiple nucleic acid molecules. In some embodiments, creating a nucleic acid library includes isolating multiple nucleic acids from a host. In some embodiments, the multiple nucleic acids are multiple RNA molecules, such as mRNA molecules.

[0107] In some embodiments, creating a nucleic acid library includes creating a cDNA library. In some embodiments, the cDNA library includes multiple cDNA molecules corresponding to multiple mRNA molecules isolated from a host. In some embodiments, the multiple cDNA molecules are double-stranded cDNA molecules.

[0108] In various embodiments of the present invention, multiple nucleic acid sequences encoding multiple immunoglobulin variable domains or CDRs are obtained from a sample obtained from an immunized host (i.e., the sequencing sample or first sample described above).

[0109] In some embodiments, multiple nucleic acid sequences encoding multiple immunoglobulin variable domains or CDRs are obtained from a first sample after obtaining the first sample from an immunized host. In some embodiments, obtaining the multiple nucleic acid sequences encoding multiple immunoglobulin variable domains from the first sample includes preparing cDNA from nucleic acids in the first sample and sequencing the rearranged heavy chain VDJ sequences and / or rearranged light chain VJ sequences in the first sample.

[0110] In some embodiments, creating a nucleic acid library involves enriching multiple nucleic acid molecules. In some embodiments, enriching multiple nucleic acid molecules involves amplifying multiple nucleic acid molecules by, for example, PCR, for example, nested PCR. In some embodiments, enriching multiple nucleic acid molecules involves capturing multiple nucleic acid molecules. Capture techniques may include, for example, hybrid capture techniques.

[0111] In some embodiments, the methods provided herein involve attaching an index to each nucleic acid molecule in a nucleic acid library. The index may be sample-specific. In some embodiments, the index is 1 to 25 nucleotides long. In some embodiments, the index is 1 to 10 nucleotides long.

[0112] In some embodiments, the methods provided herein include attaching sequencing primers and / or their complementary sequences to each nucleic acid molecule in a nucleic acid library.

[0113] In some embodiments, multiple nucleic acid molecules in a nucleic acid library are fragmented. In some embodiments, nucleic acid molecules are fragmented by mechanical (e.g., sonication) or chemical (e.g., enzymatic) methods.

[0114] In some embodiments, the methods provided herein include performing size selection on nucleic acid molecules in a nucleic acid library. Size selection parameters may be determined based on the type of sequencing performed. In exemplary size selection, nucleic acids are sized to lengths in the range of 200–1000 bp, e.g., 400–900 bp, e.g., 400–700 bp.

[0115] In some embodiments, the methods provided herein include quantifying the amount of nucleic acid in a nucleic acid library. In some embodiments, the amount may be the total amount, for example, nanograms of nucleic acid. In some embodiments, the amount may be the concentration, for example, nanograms per milliliter of nucleic acid.

[0116] In some embodiments, multiple nucleic acid sequences encoding multiple immunoglobulin variable domains are determined using next-generation sequencing technology. In some embodiments, the multiple nucleic acid sequences encode a sufficient number of amino acid sequences to identify immunoglobulin variable domains that bind to a particular antigen. Typical numbers of amino acid sequences may include tens, hundreds, thousands, or tens of thousands of sequences. In some embodiments, single-read sequences (i.e., sequences observed in only a single read during a sequencing run) may be excluded from the final reference sequence database constructed from the multiple immunoglobulin variable regions determined using next-generation sequencing technology, in order to reduce the impact of sequencing errors. Thus, in some embodiments, the number of unique amino acid sequences encoded by the nucleic acid sequences may be determined after excluding such single-read sequences.
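
Excluding single-read sequences from a reference database can be sketched as follows. This is a minimal Python illustration, assuming the reads have already been translated into amino acid sequences; the function name, the `min_reads` parameter, and the example sequences are hypothetical and not part of this disclosure.

```python
from collections import Counter


def build_reference(aa_reads, min_reads=2):
    """Return {sequence: read count} for sequences seen at least min_reads times.

    With min_reads=2, sequences supported by only a single read are excluded,
    which reduces the influence of isolated sequencing errors.
    """
    counts = Counter(aa_reads)
    return {seq: n for seq, n in counts.items() if n >= min_reads}


# Hypothetical translated reads (e.g., CDR3-like fragments).
aa_reads = ["CARDY", "CARDY", "CARDY", "CARDF", "CTRGY", "CTRGY"]
reference = build_reference(aa_reads)
print(len(reference), "unique sequences retained")
```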

Next-generation sequencing (NGS)

[0117] The methods provided herein may include performing NGS. In some embodiments, the methods provided herein may include performing one or more NGS techniques.

[0118] Next-generation sequencing (NGS), also known as massively parallel sequencing or deep sequencing as used herein, relates to sequencing techniques that can sequence millions of tiny DNA fragments in parallel and detect variants in nucleic acid sequences. In some embodiments, nucleic acids are sequenced multiple times to obtain high-fidelity and high-depth results. NGS sequencing can be performed without physically separating individual reactions. While not wishing to be bound by theory, following nucleic acid extraction, NGS sequencing can be performed using a wide range of instruments and techniques, including targeted sequencing, whole exome sequencing, and whole genome sequencing, followed by library or template generation, and data analysis using bioinformatics. Generally, a wide range of platforms and bioinformatics tools exist for performing NGS and data analysis. See, for example, Levy SE and Myers RM, 2016 Annu. Rev. Genom. Hum. Genet. 17:95-115, Behjati S. and Tarpey PS, 2013 Arch Dis Child Pract Ed. 98(6):236-238, and Alekseyev, et al. 2018 Academic Pathology, 5:1-11. In some embodiments of the methods described herein, deeper sequencing increases the coverage of the antibody repertoire.

[0119] Exemplary NGS methods for use in accordance with this disclosure include sequencing techniques, including “second-generation sequencing,” “third-generation sequencing,” and “fourth-generation sequencing” techniques.

[0120] In some embodiments, the methods provided herein include sequencing by techniques including, but not limited to, 454 pyrosequencing, Ion Torrent sequencing, and Illumina sequencing.

[0121] In some embodiments, the methods provided herein include sequencing by 454 pyrosequencing. 454 pyrosequencing detects pyrophosphate, a byproduct of nucleotide incorporation, and thereby reports whether a particular base has been incorporated into the elongating DNA strand (see Ronaghi, Karamohamed, Pettersson, Uhlen, & Nyren, Anal. Biochem. 1996 Nov 1;242(1):84-9, and also Slatko, Gardner, & Ausubel, Curr. Protoc. Mol. Biol. 2018;122(1):e59, both of which are incorporated herein by reference in their entirety). In a typical 454 sequencing method, individual DNA fragments (e.g., 400–900 bp in length, e.g., 400–700 bp) are ligated to adapters and amplified by PCR in individual emulsion-based "bead" reactions (emPCR). Because the DNA sequence on the beads is complementary to the sequence on the adapter, DNA fragments can bind directly to the beads, ideally one fragment per bead. DNA synthesis is then carried out, and each incorporation is detected chemically by measuring pyrophosphate release. A picoliter-sized chamber containing the sample is filled with a sequencing reagent containing one of the four nucleotides. When the correct nucleotide is incorporated into the growing strand, the released pyrophosphate is measured through a light-generating reaction. Runs of identical nucleotides in the sequence (homopolymers) can be detected by measuring the intensity of the light produced by the reaction. Historically, 454 sequencing technology has been used for genome sequencing and metagenomic samples because its long read lengths (up to 600-800 nt) and relatively high throughput (25 million bases at over 99% accuracy in a 4-hour run) facilitate genome assembly.

[0122] In some embodiments, the methods provided herein include sequencing by Ion Torrent sequencing. Ion Torrent® technology directly converts nucleotide sequences into digital information on a semiconductor chip (Rothberg et al., Nature 475, 348-352 (2011), which is incorporated herein by reference in its entirety). In DNA synthesis reactions, a hydrogen ion is released when the correct nucleotide is incorporated opposite its complementary base in the elongating DNA strand. The release of hydrogen ions changes the pH of the solution, which can be recorded as a voltage change by an ion sensor very similar to a pH meter. If no nucleotide is incorporated, no voltage spike occurs. By repeatedly filling and flushing a “sequencing chamber” with a sequencing reagent containing only one of the four nucleotides at a time, a voltage change occurs when the appropriate nucleotide is incorporated. If the same nucleotide is incorporated at two adjacent positions, two hydrogen ions are released and the voltage change doubles. Thus, “runs” of a single nucleotide can also be determined.
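
The way flow-based chemistries (such as the pyrosequencing and semiconductor approaches described above) read out runs of a single nucleotide can be illustrated with a short Python sketch. It assumes that each nucleotide flow yields a signal already normalized so that a value near 1.0 corresponds to one incorporation, near 2.0 to two, and so on. The normalization, the function name, and the example values are hypothetical; actual instrument base-calling is considerably more involved.

```python
def call_bases(flow_order, signals):
    """Convert per-flow normalized signals into a base sequence.

    flow_order: string of nucleotides in the order they were flowed.
    signals:    one normalized signal per flow (assumption: ~1.0 per base).
    """
    bases = []
    for nucleotide, signal in zip(flow_order, signals):
        # Round to the nearest whole number of incorporations, never below zero.
        run_length = max(0, int(signal + 0.5))
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical flows: T once, C twice, A once, G three times.
flow_order = "TACGTACG"
signals = [1.02, 0.05, 2.10, 0.00, 0.00, 0.97, 0.10, 3.05]
print(call_bases(flow_order, signals))  # TCCAGGG
```

Because the run length is inferred from signal magnitude, long homopolymers are harder to call exactly than single incorporations, which is one reason downstream quality filtering matters.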

[0123] Ion Torrent sequencing begins with fragmenting DNA into fragments of 200-1500 base pairs, which are then ligated to adapters. The DNA fragments are attached to beads through complementary sequences on the beads and adapters, and then amplified on the beads by emulsion PCR (emPCR). Next, the beads are flowed across a chip containing wells, with only one bead per well. Then, the sequencing reagent is flowed through the wells, and a signal is recorded when the appropriate nucleotides are incorporated, releasing hydrogen ions.

[0124] In some embodiments, the methods provided herein include sequencing by Illumina sequencing. Illumina sequencing is based on a technique known as “bridge amplification,” in which DNA molecules (approximately 500 bp) ligated at both ends with appropriate adapters are used as substrates for repeated amplification synthesis reactions on a solid support containing oligonucleotide sequences complementary to the ligated adapters. The oligonucleotides on the support are spaced so that the DNA, which is subjected to repeated rounds of amplification, forms clonal “clusters” consisting of approximately 1000 copies of each oligonucleotide fragment. Each support can contain millions of parallel cluster reactions. During the synthesis reaction, modified nucleotides, each with a different fluorescent label corresponding to each of the four bases, are incorporated and then detected. The nucleotides also act as terminators of synthesis in each reaction, and are unblocked after detection for the next round of synthesis. The reaction is repeated for 300 or more rounds. The use of fluorescence detection increases the speed of detection because of direct imaging, in contrast to camera-based imaging.

[0125] In some embodiments, the methods provided herein involve sequencing by single-molecule real-time (SMRT) sequencing. SMRT sequencing can sequence very long fragments of up to 30–50 kb or longer. SMRT sequencing involves binding an engineered DNA polymerase, together with the bound DNA to be sequenced, to the bottom of a well (a zero-mode waveguide (ZMW)) in an SMRT flow cell. A ZMW is a small chamber that directs light energy to a region whose dimensions are small compared to the wavelength of the illumination light. Because of the design of the ZMW and the wavelength of light used, imaging occurs only at the bottom of the ZMW, where the DNA-bound DNA polymerase incorporates each base into the elongating chain. The four nucleotides are labeled with different phospho-linked fluorophores for differential detection. As nucleotides are incorporated into the elongating chain, the correct fluorescently labeled nucleotide is held by the polymerase and imaged on a millisecond time scale. After incorporation, the phospho-linked fluorescent moiety is released and is no longer detectable. The next nucleotide can then be incorporated. The imaging is timed to match the rate of nucleotide incorporation so that each base is identified as it is incorporated into the elongating DNA strand. This takes place simultaneously and in parallel across up to 1 million zeptoliter-scale ZMWs present on a single chip within the SMRT cell.

[0126] Template preparation for SMRT sequencing involves creating a "SMRTbell," a circular double-stranded DNA molecule with known adapter sequences complementary to the primers used to initiate DNA synthesis on the template. This configuration allows the polymerase to read a large template multiple times and build a consensus sequence (CCS, circular consensus sequence) by repeatedly traversing the circular molecule within each ZMW until the polymerase stops. Since the adapters ligated on both sides of the insert each have a DNA synthesis priming site, the sequencing polymerase can travel around the circular SMRTbell in the 5' to 3' direction on either DNA strand, providing complementary information from both strands of the "SMRTbell."

[0127] In some embodiments, the methods provided herein include sequencing by nanopore sequencing. In some embodiments, the methods provided herein include sequencing by in situ sequencing (ISS).

Bioinformatics

[0128] In some embodiments, bioinformatics is used to analyze the data generated by sequencing. For example, in some embodiments, bioinformatics can be used to describe a specific region of the antibody or antigen-binding protein under analysis, such as the nucleic acid sequence of the immunoglobulin variable region, the amino acid sequence of the immunoglobulin variable domain, the nucleic acid sequence encoding the framework region or complementarity-determining region, or the amino acid sequence of the framework region or complementarity-determining region.

[0129] NGS sequencing typically generates a large amount of sequencing data. In some embodiments, sequence reads can be demultiplexed. In some embodiments, demultiplexing involves in silico sorting of sequence reads based on the sample or source from which the sequenced nucleic acid was obtained. Demultiplexing can be performed by in silico sorting of sequence reads based on an associated index. In some embodiments, after demultiplexing, the index sequence can be removed from the sequence read. In some embodiments, the identification of the index, source, or sample can be added to the sequence information associated with the sequence read.
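
In silico demultiplexing by index can be sketched in Python as follows. The sketch assumes, purely for illustration, that a fixed-length index sits at the start of each read and must match a known index exactly; the function name, index sequences, and sample names are hypothetical. The index is removed from each assigned read, and the sample identity is retained as the dictionary key.

```python
def demultiplex(reads, index_to_sample):
    """Sort reads by sample based on a leading index sequence.

    reads:           list of read sequences, each starting with its index.
    index_to_sample: dict of index sequence -> sample name (equal-length keys).
    Returns (reads grouped by sample with the index removed, unassigned reads).
    """
    index_len = len(next(iter(index_to_sample)))
    by_sample = {sample: [] for sample in index_to_sample.values()}
    unassigned = []
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        sample = index_to_sample.get(index)
        if sample is None:
            unassigned.append(read)  # unknown index: keep aside, untrimmed
        else:
            by_sample[sample].append(insert)
    return by_sample, unassigned


# Hypothetical indices and reads.
index_to_sample = {"ACGT": "mouse_1", "TTAG": "mouse_2"}
reads = ["ACGTGGGCCC", "TTAGAAATTT", "CCCCGGGAAA"]
by_sample, unassigned = demultiplex(reads, index_to_sample)
print(by_sample, unassigned)
```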

[0130] In some embodiments, sequence reads are removed from further analysis ("selectively removed") based on a quality score (e.g., a Phred score). In some embodiments, the quality score represents the probability that one or more nucleotides in the sequence read are miscalled. In some embodiments, the quality score is a measure of the confidence assigned to specific bases within the read.
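
As general background, a Phred score Q relates to the base-call error probability P by Q = -10 log10(P), so Q20 corresponds to a 1% error probability and Q30 to 0.1%. The Python sketch below shows one possible read-level filter. It assumes FASTQ-style quality strings with an ASCII offset of 33 and uses a mean-quality threshold of 30; both the threshold and the use of the mean are illustrative choices, not values taken from this disclosure.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into per-base Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Probability that a base with Phred score q is miscalled."""
    return 10 ** (-q / 10)


def passes_quality(quality_string, min_mean_q=30):
    """Keep a read only if its mean Phred score reaches the threshold."""
    scores = phred_scores(quality_string)
    return sum(scores) / len(scores) >= min_mean_q


print(phred_scores("I5"))  # [40, 20]
print(passes_quality("IIII"), passes_quality("5555"))  # True False
```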

[0131] In some embodiments, sequence reads are removed from further analysis ("selectively removed") based on their sequence read length. For example, sequence reads that are too short or too long may be removed from the analysis.

[0132] In some embodiments, sequence reads are removed from further analysis ("selectively removed") based on the identity of a portion of the sequence reads with respect to known sequences. For example, in some embodiments, sequence reads may be removed from further analysis if a portion of the sequence reads corresponding to a primer (e.g., an IgG constant region primer) has less than 90%, less than 95%, or less than 100% identity with respect to the known sequence of the primer.
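
Filtering on read length and on identity to a known primer sequence could be combined as in the following Python sketch. It assumes the primer-derived portion lies at the start of the read and compares it position by position without gaps; the primer sequence, length bounds, and function names are hypothetical and chosen only for illustration.

```python
def percent_identity(observed, expected):
    """Ungapped percent identity of observed versus the expected sequence."""
    matches = sum(a == b for a, b in zip(observed, expected))
    return 100.0 * matches / len(expected)


def keep_read(read, primer, min_len, max_len, min_identity=90.0):
    """Keep a read if its length is in range and its primer region matches."""
    if not (min_len <= len(read) <= max_len):
        return False  # too short or too long
    primer_region = read[:len(primer)]
    return percent_identity(primer_region, primer) >= min_identity


# Hypothetical 10-nt primer and reads.
primer = "ACGTACGTAC"
exact = primer + "G" * 40
one_mismatch = "ACGTACGTAA" + "G" * 40
print(keep_read(exact, primer, 20, 100))
print(keep_read(one_mismatch, primer, 20, 100, min_identity=95.0))
```

Raising `min_identity` to 100.0 would retain only reads whose primer region matches the known sequence exactly.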

[0133] In some embodiments, sequence reads are removed from further analysis because only a small number of reads were detected for a particular nucleic acid sequence.

[0134] In some embodiments, unproductive rearrangements (e.g., those containing stop codons or out-of-frame rearrangements) can be removed before analysis.
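
A simple check for unproductive rearrangements is sketched below in Python. It assumes each input sequence has already been trimmed so that it begins at the first position of a codon within the coding region; under that assumption, a length that is not a multiple of three indicates an out-of-frame rearrangement, and any in-frame stop codon marks the sequence as unproductive. The function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_productive(nt_seq):
    """Return True if nt_seq is in frame and free of in-frame stop codons."""
    if len(nt_seq) % 3 != 0:
        return False  # out-of-frame under the trimming assumption
    codons = (nt_seq[i:i + 3] for i in range(0, len(nt_seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Hypothetical sequences: productive, stop codon, out of frame.
candidates = ["TGTGCGAGAGAT", "TGTGCGTAAGAT", "TGTGCGAGAGA"]
productive = [seq for seq in candidates if is_productive(seq)]
print(productive)
```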

[0135] In some embodiments, the method described herein includes performing NGS, which includes performing paired-end sequencing, and the method includes merging overlapping paired-end reads.
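
Merging the two reads of a pair can be sketched in Python as follows, assuming the reads overlap at their 3' ends. The sketch reverse-complements the second read and joins the reads at the longest exact overlap of at least `min_overlap` bases. Requiring an exact match, and the names and default used here, are simplifying assumptions for illustration; dedicated read-merging tools typically tolerate mismatches and weigh base qualities.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10):
    """Merge a read pair into one sequence, or return None if no overlap.

    read2 is given as sequenced (from the opposite strand), so it is
    reverse-complemented before looking for the overlap with read1.
    """
    r2 = reverse_complement(read2)
    for overlap in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        if read1[-overlap:] == r2[:overlap]:
            return read1 + r2[overlap:]
    return None


# Hypothetical 35-nt fragment read from both ends with a 15-nt overlap.
fragment = "ACGTTGCAAGGCTTACCGATGGCATCGATTACGGA"
read1 = fragment[:25]
read2 = reverse_complement(fragment[10:])
print(merge_pair(read1, read2) == fragment)  # True
```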

[0136] In some embodiments, duplicate reads can be removed. Duplicate reads are reads that correspond to the same original DNA fragment. Duplicate reads may be generated, for example, by the amplification step in sequencing techniques. In some embodiments, the removal of duplicate reads is performed before determining the amino acid sequences encoded by multiple nucleic acid sequences in the nucleic acid sequence library.

[0137] In some embodiments, sequencing information obtained by performing NGS is used to determine the consensus sequence corresponding to the original sequenced DNA fragment.

[0138] In some embodiments, nucleotide sequences obtained from NGS are ranked. In some embodiments, nucleotide sequences are ranked based on cDNA abundance, read length, and / or nucleotide sequence confidence. In some embodiments, the top 1,000 sequences from NGS analysis are ranked. In some embodiments, the top 500 sequences from NGS analysis are ranked. In some embodiments, the top 400 sequences from NGS analysis are ranked. In some embodiments, the top 300 sequences from NGS analysis are ranked. In some embodiments, the top 200 sequences from NGS analysis are ranked. In some embodiments, the top 100 sequences from NGS analysis are ranked.
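
As an illustration of ranking by abundance and read length, the sketch below collapses identical reads, counts them, and returns the top N. The function name `top_sequences` and the tie-breaking order (count, then length, then alphabetical) are assumptions made for the example; nucleotide sequence confidence is not modeled here.

```python
from collections import Counter


def top_sequences(reads: list[str], n: int = 100) -> list[tuple[str, int]]:
    """Collapse identical reads and return the n most abundant with their counts."""
    counts = Counter(reads)
    # Rank by descending abundance, then by descending length,
    # then alphabetically so that the output order is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], -len(kv[0]), kv[0]))
    return ranked[:n]
```

For example, `top_sequences(["AAA", "CCC", "AAA"], n=1)` keeps only the sequence observed twice.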

[0139] In some embodiments, multiple nucleic acid sequences obtained via NGS (e.g., those encoding immunoglobulin variable domains) are aligned to germline V(D)J sequences. In some embodiments, multiple nucleic acid sequences obtained via NGS (e.g., those encoding immunoglobulin variable domains) are aligned to germline V(D)J sequences and further analyzed to extract information such as variable region sequences, variable domain sequences, framework sequences, and / or CDR sequences (e.g., CDR3 sequences).

[0140] In some embodiments, sequencing reads are analyzed to determine the amino acid sequences they encode (e.g., by in silico translation) and the sequences are assembled into unique in-frame full-length amino acid sequences. In some embodiments, the provided method includes generating a library of amino acid sequences by in silico translating sequencing reads (e.g., those from a sequence read library).
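
In silico translation of in-frame reads into a library of unique amino acid sequences might look like the sketch below. It uses the standard genetic code; the function names `translate` and `build_library`, and the decision to discard any read with a stop codon, an ambiguous base, or a length that is not a multiple of three, are assumptions made for the example.

```python
from itertools import product
from typing import Optional

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> Optional[str]:
    """Translate an in-frame nucleotide sequence; return None if unproductive."""
    nt = nt.upper()
    if len(nt) % 3 != 0:
        return None  # length is not a whole number of codons
    protein = []
    for i in range(0, len(nt), 3):
        aa = CODON_TABLE.get(nt[i:i + 3])
        if aa is None or aa == "*":
            return None  # ambiguous base or stop codon
        protein.append(aa)
    return "".join(protein)


def build_library(reads: list[str]) -> list[str]:
    """Return unique amino acid sequences from productive reads, in first-seen order."""
    seen: dict[str, None] = {}
    for read in reads:
        protein = translate(read)
        if protein is not None:
            seen.setdefault(protein, None)
    return list(seen)
```

Because the genetic code is degenerate, distinct nucleotide reads can collapse to the same amino acid sequence, which is why the library is deduplicated after translation.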

[0141] In some embodiments, these extracted nucleic acid sequences or CDR3 sequences are analyzed to determine the amino acid sequences they encode (e.g., by in silico translation), and the resulting sequences are assembled into unique in-frame full-length amino acid sequences. In some embodiments, these unique amino acid sequences are used to construct a library of amino acid sequences representing multiple immunoglobulin variable domains or immunoglobulin CDRs.

[0142] As used herein, nucleic acid sequences encoding multiple immunoglobulin variable domains include nucleic acid sequences encoding approximately 10,000 to 500,000 unique amino acid sequences, including approximately 10,000, approximately 15,000, approximately 20,000, approximately 25,000, approximately 30,000, approximately 35,000, approximately 40,000, approximately 45,000, approximately 50,000, approximately 55,000, approximately 60,000, approximately 65,000, approximately 70,000, approximately 75,000, approximately 80,000, approximately 85,000, approximately 90,000, approximately 95,000, approximately 100,000, approximately 110,000, approximately 120,000, approximately 130,000, approximately 140,000, approximately 150,000, approximately 160,000, approximately 170,000, approximately 180,000, approximately 190,000, approximately 200,000, approximately 250,000, approximately 300,000, approximately 350,000, approximately 400,000, approximately 450,000, or approximately 500,000 unique amino acid sequences. In some embodiments, the nucleic acid sequences encoding multiple immunoglobulin variable domains may include nucleic acid sequences encoding approximately 10 to 100,000 unique amino acid sequences, or approximately 10, approximately 25, approximately 50, approximately 75, approximately 100, approximately 250, approximately 500, approximately 750, approximately 1000, approximately 1500, approximately 2000, approximately 2500, approximately 3000, approximately 3500, approximately 4000, approximately 4500, approximately 5000, approximately 10,000, approximately 15,000, approximately 20,000, approximately 25,000, approximately 30,000, approximately 35,000, approximately 40,000, approximately 45,000, approximately 50,000, approximately 55,000, approximately 60,000, approximately 65,000, approximately 70,000, approximately 75,000, approximately 80,000, approximately 85,000, approximately 90,000, approximately 95,000, or approximately 100,000 unique amino acid sequences. In some embodiments, a plurality of nucleic acid sequences may encode about 10,000 to 80,000 unique amino acid sequences and may include about 10,000, about 15,000, about 20,000, about 25,000, about 30,000, about 35,000, about 40,000, about 45,000, about 50,000, about 55,000, about 60,000, about 65,000, about 70,000, about 75,000, or about 80,000 unique amino acid sequences. Furthermore, in some embodiments, only a single amino acid sequence may be required to identify the immunoglobulin variable domain that binds to a specific antigen.

Samples for peptide analysis

[0143] In some embodiments, the methods provided herein include obtaining and / or determining multiple peptide sequences of human immunoglobulin heavy and / or light chain variable domains obtained from an antibody sample. In some embodiments, the antibody sample includes an antibody population obtained from an immunized host.

[0144] This disclosure encompasses the recognition that samples for peptide analysis can be enriched in vivo for antibodies having desired characteristics. For example, antibody samples can be enriched based on in vivo localization. Thus, in some embodiments, antibody-containing samples can be obtained from any desired source within the host, such as serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, placenta, or a combination thereof.

[0145] In some embodiments, the sample for peptide analysis is or includes any body fluid containing antibodies. In some embodiments, the sample for peptide analysis is or includes a sample obtained from serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, placenta, or a combination thereof. In some specific embodiments, the sample for peptide analysis is or includes antibodies obtained from the serum of an immunized host (e.g., a non-human animal, e.g., a rodent). In some embodiments, the sample for peptide analysis ("second sample") can be obtained from tissue lysates. In some embodiments, the second sample may contain various levels of circulating antibodies that can be isolated and sequenced. As described above, in some embodiments, the second sample may originate from a specific antibody source, e.g., a secreted antibody source, if evaluation of antibodies from a specific antibody source is desired. In some embodiments, the sample for peptide analysis includes antibodies obtained from a specific tissue in order to enrich for antibodies localized to that tissue.

[0146] In some embodiments, the sample for peptide analysis comprises an antibody population. In some embodiments, the sample for peptide analysis is enriched ex vivo for antibodies having desired characteristics. In some embodiments, the sample is enriched for antibodies using chromatography, such as ion-exchange chromatography. In some embodiments, the sample is enriched for antibodies having affinity for a specific target using, for example, affinity chromatography. In some embodiments, affinity chromatography is used to remove antibodies having certain undesirable (e.g., off-target) binding affinities. In some embodiments, the sample for peptide analysis is enriched for antibodies having desired characteristics by exposing the antibodies to one or more conditions, such as heat and / or oxidation, in order to select for antibody stability.

[0147] In some embodiments, the second sample contains antibodies targeting the antigen of interest from an immunized host, and is depleted of antibodies not targeting the antigen of interest. Sample depletion can be achieved using a variety of methods, including but not limited to chromatography, affinity purification, size exclusion, buffer exchange, albumin depletion techniques, protease inhibitors, immunoglobulin depletion techniques, and high-abundance protein depletion. In some embodiments, if the immunogen forms a complex with an adjuvant during immunization of a non-human animal, the second sample can be depleted of antibodies targeting that adjuvant. In some embodiments, where the immunogen is fused to an Fc portion, the second sample is depleted of antibodies targeting Fc. In other embodiments, the immunogen may be fused to a tag, such as His, FLAG, Myc, HA, GST, GFP, V5, etc., and the second sample is depleted of antibodies targeting that tag.

[0148] In some embodiments, the second sample is enriched with antibodies targeting the antigen of interest. Similar to depletion methods, sample enrichment can be achieved using a variety of methods, including chromatography, affinity purification, and size exclusion. In some embodiments, the second sample may be enriched by a variety of methods, including binding to the antigen (immunogen). Since the enrichment step may depend on the binding of the antibody to the polypeptide, in this step the antibody pool may be tailored to specific characteristics of the antibody of interest. In one example, the second sample may be enriched for the antibody of interest based on its ability to bind to the antigen under specific binding conditions. For example, the second sample may be enriched for the antibody of interest based on its ability to bind to a specific isoform / variant of the antigen, a specific fragment / epitope of the antigen, the monomeric or oligomeric form of the antigen, or other desired conformation of the antigen. In some embodiments, the sample for peptide analysis is enriched for a specific Ig class by affinity chromatography using, for example, protein A (or anti-IgA and anti-IgM antibodies for affinity purification of other major Ig classes).

[0149] In some embodiments, a sample containing an antibody population is digested and / or fragmented before peptide analysis. In some embodiments, an antibody sample for peptide analysis is digested into peptides. In some embodiments, an antibody sample for peptide analysis is enzymatically digested into peptides (e.g., using trypsin and / or pepsin). In some embodiments, an antibody sample for peptide analysis is denatured and reduced before digestion. In some embodiments, an antibody sample for peptide analysis is alkylated (e.g., using iodoacetamide) before digestion. In some embodiments, an antibody sample for peptide analysis is denatured, reduced, and / or alkylated and then enzymatically digested (e.g., using trypsin and / or pepsin). In some embodiments, the sample is divided into multiple aliquots digested with different enzymes and / or over different time periods. In some embodiments, the sample is divided into at least two aliquots digested with at least two different enzymes.
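
To illustrate what an enzymatic digest yields, the sketch below performs an in silico tryptic digest of a protein sequence. It assumes the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P); that rule, the function names, and the `missed_cleavages` parameter are assumptions made for the example and are not taken from the disclosure.

```python
def cleave(protein: str) -> list[str]:
    """Split a protein after K or R, except when the next residue is P."""
    fragments, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def tryptic_digest(protein: str, missed_cleavages: int = 0,
                   min_len: int = 1) -> list[str]:
    """Return peptides allowing up to `missed_cleavages` skipped cleavage sites."""
    fragments = cleave(protein)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if len(peptide) >= min_len:
                peptides.append(peptide)
    return peptides
```

Digesting separate aliquots with different enzymes, as described above, would correspond to running a different cleavage rule on the same sequence so that the resulting peptides overlap.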

[0150] In some embodiments, the antibody is digested into a peptide and sequenced using MS analysis (e.g., tandem mass spectrometry). In some embodiments, the peptide sequence from the MS analysis is matched against a library of antibody sequences.

[0151] In some embodiments, antibody peptides are separated and / or divided by chromatography, such as liquid chromatography. In some embodiments, antibody peptides are separated and / or divided by high-performance liquid chromatography. In some embodiments, antibody peptides are separated and / or divided by reverse-phase chromatography.

[0152] In certain embodiments, CDR3 peptides can be enriched from unrelated peptides via specific conjugation of the unique Cys residue at the terminus of the CDR3 sequence using a thiol-specific reagent that enables the purification of such peptides. In some embodiments, an antibody sample for peptide analysis is digested into multiple peptides (e.g., enzymatically), and these multiple peptides are enriched for CDR3 peptides using a thiol-specific reagent.

MS and Library Matching

[0153] In some embodiments, the methods described herein utilize mass spectrometry (MS). Mass spectrometry obtains molecular weight and structural information about a chemical compound by ionizing its molecules and measuring either their time of flight or the response of their trajectories to an electric and / or magnetic field.

[0154] This disclosure further intends that any MS method may be adapted for use in the methods of this disclosure. Exemplary MS methods include, but are not limited to, tandem MS (MS / MS), LC-MS, LC-MS / MS, matrix-assisted laser desorption / ionization mass spectrometry (MALDI-MS), Fourier transform mass spectrometry (FTMS), ion mobility separation with mass spectrometry (IMS-MS), electron transfer dissociation (ETD-MS), and combinations thereof. Such methods are described, for example, in Pitt, Clin. Biochem. Rev. 30:19-34 (2009). Mass spectrometers that can be used in the methods of this disclosure are known in the art and are commercially available, for example, from Agilent Inc., Bruker Corporation, and Thermo Scientific.

[0155] In some embodiments, the peptide sequence of the second sample is determined using mass spectrometry analysis of the heavy and / or light chain variable domains of the antibody population. In some embodiments, the mass spectrometry analysis is liquid chromatography coupled with mass spectrometry (LC-MS) following proteolytic digestion of the heavy and / or light chain variable domains of the antibody population. However, alternative separation and mass spectrometry methods can be used, including accelerator mass spectrometry, gas chromatography-mass spectrometry (GC-MS), ion mobility spectrometry-MS, matrix-assisted laser desorption-ionization time-of-flight (MALDI-TOF), and surface-enhanced laser desorption-ionization time-of-flight (SELDI-TOF). Top-down proteomics, which analyzes intact proteins without digesting them and thereby preserves their mass information, can also be used. See Chen et al. 2018 Anal Chem. 90(1):110-127. In some embodiments, the provided method incorporates multidimensional high-pressure liquid chromatography (LC / LC) and / or tandem mass spectrometry (MS / MS).

[0156] In some specific embodiments, MS analysis is quantitative.

[0157] In some embodiments, peptide sequences obtained from MS analysis are ranked. In some embodiments, peptide sequences are ranked based on peptide abundance and / or peptide confidence. In some embodiments, the top 1,000 peptides obtained by MS are ranked. In some embodiments, the top 500 peptides obtained by MS are ranked. In some embodiments, the top 400 peptides obtained by MS are ranked. In some embodiments, the top 300 peptides obtained by MS are ranked. In some embodiments, the top 200 peptides obtained by MS are ranked. In some embodiments, the top 100 peptides obtained by MS are ranked. In some embodiments, the MS spectral quality of the top-ranked peptide sequences is manually verified.

[0158] In various embodiments, peptide sequences obtained through MS analysis (e.g., peptide sequences of heavy and / or light chain variable domains) are compared with amino acid sequences of multiple immunoglobulin variable domains obtained from sequence analysis (e.g., of the first sample). In some embodiments, peptide sequences are compared with amino acid sequences obtained by translation of nucleotide sequences obtained by NGS (e.g., of the first sample).

[0159] In some embodiments, matching the amino acid sequences of multiple immunoglobulin variable domains with the peptide sequences of heavy chain and / or light chain variable domains of an antibody population includes aligning the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population with each other and with the amino acid sequences of multiple immunoglobulin variable domains. As used herein, alignment also means comparing the peptide sequences of the heavy chain and / or light chain variable domains of an antibody population with the amino acid sequences of multiple immunoglobulin variable domains, and optionally with each other. The peptide sequences obtained by mass spectrometry analysis of a second sample may, in some embodiments, be screened against a library containing multiple variable domains obtained from a first sample. As intended by this disclosure, amino acid sequence matching may be carried out using a variety of methods.

[0160] In some embodiments, peptide sequences obtained through mass spectrometry analysis of a second sample are mapped and / or searched against a library of antibody sequences (e.g., variable domain sequences and / or CDR sequences) obtained from sequencing analysis (e.g., of a first sample) using commercially available software (e.g., Mascot (Matrix Science), PEAKS (Bioinformatics Solutions, Inc.), Sequest (ThermoFisher Scientific), Byonic (Protein Metrics)). Based on various criteria, the sequence of the variable domain of the antibody of interest is obtained.

[0161] In some embodiments, obtaining antigen-specific human immunoglobulin heavy and / or light chain variable domains or CDRs is based on one or more of the following: (1) matching (e.g., specific homology) of a unique peptide obtained from a second sample with a CDR3 sequence in the amino acid sequence obtained from a first sample; (2) matching (e.g., specific homology) of a unique peptide obtained from a second sample with a CDR1 and / or CDR2 sequence in the amino acid sequence obtained from a first sample; (3) matching (e.g., specific homology) of one or more unique peptides obtained from a second sample with one or more framework sequences in the amino acid sequence obtained from a first sample; (4) next-generation sequencing count; (5) exclusion of CDR sequences containing methionine; and (6) exclusion of CDR sequences that may be N-glycosylated. In some embodiments, obtaining antigen-specific human immunoglobulin heavy and / or light chain variable domains or CDRs is based on combinations of two or more, three or more, four or more, five or more, or all six of these parameters.
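
Criteria (5) and (6) above can be expressed as simple sequence checks. The sketch below assumes the canonical N-glycosylation sequon N-X-S/T, where X is any residue except proline; the disclosure does not specify how potential N-glycosylation is detected, so that motif and the function names are assumptions made for the example.

```python
import re

# Canonical N-linked glycosylation sequon: N, any residue except P, then S or T.
N_GLYC_SEQUON = re.compile(r"N[^P][ST]")


def cdr_liabilities(cdr: str) -> list[str]:
    """List the exclusion criteria that a single CDR sequence triggers."""
    flags = []
    if "M" in cdr:
        flags.append("methionine")
    if N_GLYC_SEQUON.search(cdr):
        flags.append("n_glycosylation_sequon")
    return flags


def passes_exclusions(cdrs: list[str]) -> bool:
    """Return True if none of the CDRs contains methionine or a sequon."""
    return not any(cdr_liabilities(cdr) for cdr in cdrs)
```

A sequon that straddles the boundary between a CDR and its flanking framework would be missed by a per-CDR check, so a fuller implementation might scan the CDR together with a few flanking residues.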

[0162] In some embodiments, obtaining a human immunoglobulin heavy chain variable domain or CDR of an antigen-specific antibody is based on the homology of unique peptides obtained from MS analysis to CDR sequences and / or framework sequences in a library. In some embodiments, the library contains amino acid sequences of antibody heavy chain variable domains corresponding to nucleic acid sequences obtained by NGS (e.g., of a first sample obtained from an immunized host).

[0163] In some embodiments, the library is matched using peptide sequences obtained from MS analysis, and only amino acid sequences that share at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity are selected.

[0164] In some embodiments, matching includes querying a library for sequences homologous to the peptide sequence (e.g., CDR sequence) obtained through MS analysis. In some embodiments, matching includes querying a library for sequences homologous to the CDR3 peptide sequence obtained through MS analysis. In some embodiments, matching includes querying a library for sequences that are at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% homologous to the CDR3 peptide sequence obtained through MS analysis. In some embodiments, matching includes querying a library for sequences that are 100% homologous to the CDR3 peptide sequence obtained through MS analysis.

[0165] In some embodiments, peptide sequences obtained through MS analysis of a second sample are searched against a library of antibody sequences (e.g., variable domain sequences and / or CDR sequences) obtained from sequencing analysis using one or more of the following search parameters: enzyme cleavage sites, enzyme digestion specificity, missed cleavages, mass tolerance, and / or fixed modifications. In some embodiments, peptide sequences corresponding to the CDR of the antibody variable domain (e.g., CDR3) obtained through MS of the sample are mapped and / or searched against a library of antibody sequences (e.g., CDR sequences) obtained from sequencing analysis (e.g., of a first sample) using commercially available software.

[0166] In various embodiments of the present invention, a match of a peptide obtained from mass spectrometry analysis of a second sample to a library of amino acid sequences generated via NGS includes peptides that are 80% or more identical to the sequences obtained by NGS. In some embodiments, a peptide obtained from mass spectrometry analysis of a second sample is at least about 80%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identical to a sequence in the library of amino acid sequences generated via NGS. The term "identity," as used herein in relation to alignment or comparison of a peptide sequence with a sequence obtained by NGS, refers to identity determined by any of a number of different algorithms known in the art for measuring nucleotide sequence and / or amino acid sequence identity. In further embodiments, the match may be an exact match of the peptide sequence to the sequence obtained by NGS. In some embodiments, the peptides obtained from MS analysis may cover all or part of a CDR or framework sequence in an NGS database.
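
One simple way to apply a percent-identity threshold is shown below. It scores a peptide against every same-length window of each library sequence without gaps and keeps the entries that meet the threshold. This is only one of the many identity measures the text allows; the function names and the ungapped sliding-window approach are assumptions made for the example, and it does not stand in for the commercial search software named earlier.

```python
def best_identity(peptide: str, target: str) -> float:
    """Best ungapped identity of the peptide against any window of the target."""
    n = len(peptide)
    if n == 0 or n > len(target):
        return 0.0
    best = 0.0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches / n)
    return best


def match_library(peptide: str, library: dict[str, str],
                  min_identity: float = 0.9) -> list[tuple[str, float]]:
    """Return (name, identity) for library entries at or above the threshold."""
    hits = [(name, best_identity(peptide, seq)) for name, seq in library.items()]
    hits = [hit for hit in hits if hit[1] >= min_identity]
    # Highest identity first; ties are broken by name for a stable order.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Setting `min_identity=1.0` corresponds to the exact-match embodiment, while lower thresholds admit library sequences that differ from the peptide at a few positions.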

[0167] In some embodiments, the obtained antibodies, variable domains, and / or CDR sequences are selected based on one or more criteria. In some embodiments, antibody sequences (or parts thereof) are classified based on homology. In some embodiments, the obtained antibodies and / or variable domain sequences are classified based on homology of one or more CDRs. In some embodiments, the obtained antibodies and / or variable domain sequences are classified based on homology of CDR3.

[0168] In some embodiments, immunoglobulin heavy chain variable domain sequences are classified based on homology. In some embodiments, immunoglobulin light chain variable domain sequences are classified based on homology.

[0169] In some embodiments, peptide sequences mapped onto a library of antibody sequences (e.g., variable domain sequences and / or CDR sequences) obtained from sequencing analysis are ranked. In some embodiments, peptide sequences are ranked based on sequence coverage and / or peptide confidence. In some embodiments, the top 1,000 antibody hits are ranked. In some embodiments, the top 500 antibody hits are ranked. In some embodiments, the top 400 antibody hits are ranked. In some embodiments, the top 300 antibody hits are ranked. In some embodiments, the top 200 antibody hits are ranked. In some embodiments, the top 100 antibody hits are ranked. In some embodiments, the MS spectral quality of the top-ranked peptide sequences is manually verified.
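
Ranking antibody hits by sequence coverage could be sketched as follows. Coverage here is the fraction of residues in a library sequence that fall within at least one exactly matching peptide; that definition, the function names, and the omission of peptide confidence from the score are assumptions made for the example.

```python
def coverage(sequence: str, peptides: list[str]) -> float:
    """Fraction of residues covered by at least one exactly matching peptide."""
    if not sequence:
        return 0.0
    covered = [False] * len(sequence)
    for peptide in peptides:
        start = sequence.find(peptide)
        while peptide and start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = sequence.find(peptide, start + 1)
    return sum(covered) / len(sequence)


def rank_hits(library: dict[str, str], peptides: list[str],
              n: int = 100) -> list[tuple[str, float]]:
    """Return the top n library entries by coverage, dropping entries with none."""
    scored = [(name, coverage(seq, peptides)) for name, seq in library.items()]
    scored = [entry for entry in scored if entry[1] > 0]
    return sorted(scored, key=lambda entry: (-entry[1], entry[0]))[:n]
```

Overlapping peptides are counted once per residue, so two peptides that share a residue do not inflate the coverage of that position.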

[0170] In some embodiments, the identified immunoglobulin heavy chain and / or light chain variable domain sequences are expressed as recombinant antigen-binding proteins (e.g., antibodies). In some embodiments, the identified immunoglobulin heavy chain and / or light chain variable domain sequences are codon-optimized and expressed as recombinant antigen-binding proteins.

[0171] In some embodiments, recombinant antigen-binding proteins (e.g., antibodies) containing the identified variable domain sequence are characterized. In some embodiments, binding affinity to a target is evaluated for recombinant antibodies containing the identified variable domain sequence.

Non-human animals

[0172] The methods provided herein involve the use of non-human animals. Exemplary non-human animals for use in the methods of this disclosure are described below in detail. However, in various embodiments, the host (e.g., an immunized host) is a genetically modified non-human animal, e.g., a non-human mammal, whose genome includes an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a constant region.

[0173] In some embodiments, the genetically modified non-human animal can be any non-human animal. In some embodiments, the non-human animal is a vertebrate. In some embodiments, the non-human animal is a mammal. In some embodiments, the genetically modified non-human animals described herein may be selected from the group consisting of mice, rats, rabbits, pigs, bovids (e.g., cows, bulls, buffalo), deer, sheep, goats, llamas, chickens, cats, dogs, ferrets, and primates (e.g., marmosets, rhesus monkeys). For non-human animals in which suitable genetically modifiable ES cells are not readily available, other methods may be used to produce non-human animals containing the genetic modifications described herein. Such methods include, for example, modifying a non-ES cell genome (e.g., fibroblasts or induced pluripotent cells) and using nuclear transfer to transplant the modified genome into a suitable cell such as an oocyte, and growing the modified cell (e.g., modified oocyte) in a non-human animal under conditions suitable for embryo formation.

[0174] In some embodiments, the non-human animal is a mammal. In some embodiments, the non-human animal is a small mammal, for example, an animal of the superfamily Dipodoidea or Muroidea. In some embodiments, the non-human animal is a rodent. In certain embodiments, the rodent is a mouse, rat, or hamster. In some embodiments, the rodent is selected from the superfamily Muroidea. In some embodiments, the non-human animal is from a family selected from Calomyscidae (e.g., mouse-like hamsters), Cricetidae (e.g., hamsters, New World rats and mice, voles), Muridae (e.g., true mice and rats, gerbils, spiny mice, crested rats), Nesomyidae (e.g., climbing mice, rock mice, white-tailed rats, Malagasy rats and mice), Platacanthomyidae (e.g., spiny dormice), and Spalacidae (e.g., mole rats, bamboo rats, and zokors). In some embodiments, the rodent is selected from true mice or rats (family Muridae), gerbils, spiny mice, and crested rats. In some embodiments, the mouse is from a member of the family Muridae. In some embodiments, the non-human animal is a rodent. In some embodiments, the rodent is selected from mice and rats. In some embodiments, the non-human animal is a mouse.

[0175] In some embodiments, the non-human animal is a mouse of the C57BL strain. In some embodiments, the C57BL strain is selected from C57BL/A, C57BL/An, C57BL/GrFa, C57BL/KaLwN, C57BL/6, C57BL/6J, C57BL/6ByJ, C57BL/6NJ, C57BL/10, C57BL/10ScSn, C57BL/10Cr, and C57BL/Ola. In some embodiments, the non-human animal is a mouse of the 129 strain. In some embodiments, the 129 strain is selected from the group consisting of strains 129P1, 129P2, 129P3, 129X1, 129S1 (e.g., 129S1/SV, 129S1/SvIm), 129S2, 129S4, 129S5, 129S9/SvEvH, 129S6 (129/SvEvTac), 129S7, 129S8, 129T1, and 129T2. In some embodiments, the genetically modified mouse is a hybrid of the 129 strain and the C57BL strain. In some embodiments, the mouse is a hybrid of the 129 strain and / or the C57BL/6 strain. In some embodiments, the 129 strain of the hybrid is the 129S6 (129/SvEvTac) strain. In some embodiments, the mouse is of the BALB strain (e.g., BALB/c). In some embodiments, the mouse is a hybrid of the BALB strain and another strain (e.g., the C57BL strain and / or the 129 strain). In some embodiments, the non-human animals provided herein may be mice derived from any combination of the aforementioned strains.

[0176] In some embodiments, the non-human animal provided herein is a rat. In some embodiments, the rat is selected from a Wistar rat, an LEA strain, a Sprague-Dawley strain, a Fischer strain, F344, F6, and Dark Agouti. In some embodiments, the rat strain is a mixture of two or more strains selected from the group consisting of Wistar, LEA, Sprague-Dawley, Fischer, F344, F6, and Dark Agouti.

[0177] Therefore, in some embodiments, the immunized non-human animal host is a rodent, e.g., a rat or mouse. Therefore, in some embodiments, the host is a genetically modified rodent whose genome comprises: an immunoglobulin heavy chain variable region including one or more human heavy chain V gene segments (also called human VH gene segments), one or more human D gene segments (also called human DH gene segments), and one or more human heavy chain J gene segments (also called human JH gene segments), wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region including one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a constant region.

[0178] In some embodiments, the host is a genetically modified mouse whose genome comprises an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a murine constant region, and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a murine constant region.

[0179] In one embodiment, an immunoglobulin heavy chain variable region containing human heavy chain V, D, and J gene segments is operably linked to the mouse heavy chain constant region, and an immunoglobulin light chain variable region containing human light chain V and J gene segments is operably linked to the mouse light chain constant region. In a further embodiment, an immunoglobulin heavy chain variable region containing human heavy chain V, D, and J gene segments operably linked to the mouse heavy chain constant region resides at the endogenous mouse heavy chain locus, and an immunoglobulin light chain variable region containing human light chain V and J gene segments operably linked to the mouse light chain constant region resides at the endogenous mouse light chain locus. Various embodiments of genetically modified non-human animals, such as rodents, such as mice, are described in more detail later herein.

[0180] In some embodiments, the host is a genetically modified non-human animal containing a restricted heavy-chain or restricted light-chain variable sequence (for example, a limited repertoire of heavy-chain or light-chain variable V(D)J gene segments, e.g., a single rearranged heavy-chain or light-chain variable sequence, as described later herein).

Genetically modified hosts for identifying antigen-specific antibodies

[0181] The antibodies of the present invention are obtained by first immunizing a non-human animal host with an antigen of choice. Therefore, in some embodiments, the immunized non-human animal host described herein is a rodent, e.g., a rat or mouse. In some embodiments, the immunized non-human animal host described herein is a genetically modified non-human animal host, e.g., a genetically modified rodent. Various embodiments of the genetically modified non-human animal, e.g., a rodent, e.g., a rat or mouse, are described in more detail later herein.

[0182] In some embodiments, the immunized non-human animal host is a rodent, such as a rat or mouse. In some embodiments, the host is a genetically modified rodent having a genome comprising: an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, the light chains operably linked to a constant region.

[0183] In some embodiments, the host is a genetically modified mouse whose genome comprises an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a murine (e.g., rat or mouse) constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain variable region is operably linked to a murine constant region.

[0184] In some embodiments, the immunoglobulin heavy chain variable region is operably linked to a murine heavy chain constant region, and the immunoglobulin light chain variable region is operably linked to a murine light chain constant region. In some embodiments, the immunoglobulin heavy chain variable region operably linked to the murine heavy chain constant region is present at an endogenous murine heavy chain locus, and the immunoglobulin light chain variable region operably linked to the murine light chain constant region is present at an endogenous murine light chain locus. One exemplary embodiment is described in Macdonald et al., Proc. Natl. Acad. Sci. USA 111:5147-52 and Supplementary Information (www.pnas.org/cgi/content/short/1323896111) (which is incorporated herein by reference in its entirety). Various embodiments of genetically modified non-human animals, such as rodents, such as rats or mice, are described in more detail hereinbelow.

[0185] In some embodiments, the genetically modified rodent has in its genome (e.g., its germline genome) an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments upstream of (e.g., operably linked to) one or more rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes (e.g., one or more endogenous rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes). Such an engineered immunoglobulin heavy chain locus is referred to herein as the “HoH locus.” Rodents containing the HoH locus are exemplified, for example, in U.S. Patents 6,596,541, 8,642,835, and 8,697,940, and in Murphy, A., “VelocImmune: Immunoglobulin Variable Region Humanized Mouse,” in Recombinant Antibodies for Immunotherapy, New York, NY, Cambridge University Press, 101-107 (2009), each of which is incorporated by reference in its entirety. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the HoH locus.

[0186] In some embodiments, the one or more unrearranged human VH gene segments include at least six human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments include at least 18 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments include at least 39 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments include at least 80 human VH gene segments. In some embodiments, the one or more unrearranged human DH gene segments include at least 27 human DH gene segments. In some embodiments, the one or more unrearranged human JH gene segments include at least six human JH gene segments.

[0187] In some embodiments, the one or more unrearranged human VH gene segments include all functional human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments include fewer than 80 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments include fewer than 39 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments include fewer than 18 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments include fewer than 10 human VH gene segments.

[0188] In some embodiments, the one or more unrearranged human VH gene segments include at least 18 human VH gene segments, the one or more unrearranged human DH gene segments include 27 human DH gene segments, and the one or more unrearranged human JH gene segments include six human JH gene segments. Such an engineered immunoglobulin heavy chain locus is referred to herein as the "VelocImmune® 1 HoH locus." In some embodiments, the one or more unrearranged human VH gene segments include at least 39 human VH gene segments, the one or more unrearranged human DH gene segments include 27 human DH gene segments, and the one or more unrearranged human JH gene segments include six human JH gene segments. Such an engineered immunoglobulin heavy chain locus is referred to herein as the "VelocImmune® 2 HoH locus." In some embodiments, the one or more unrearranged human VH gene segments include at least 80 human VH gene segments, the one or more unrearranged human DH gene segments include 27 human DH gene segments, and the one or more unrearranged human JH gene segments include six human JH gene segments. Such an engineered immunoglobulin heavy chain locus is referred to herein as the "VelocImmune® 3 HoH locus."

[0189] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the HoH locus produce antibodies, for example, in response to antigen stimulation, which include, in particular, heavy chains, each heavy chain comprising a human heavy chain variable domain operably linked to a rodent (e.g., rat or mouse) heavy chain constant domain.

[0190] In some embodiments, the genetically modified rodent has in its genome (e.g., its germline genome) an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, wherein the unrearranged immunoglobulin heavy chain gene sequence further comprises at least one substitution of a non-histidine codon with a histidine codon, or at least one insertion of a histidine codon, in the sequence encoding complementarity-determining region 3 (CDR3) (see, for example, PCT Publications WO2013/138712 and WO2013/138681, which are incorporated herein by reference in their entirety). Immunizing genetically modified rodents containing such a histidine substitution or histidine insertion facilitates the identification of antibodies exhibiting pH-dependent characteristics against antigens using a combination of repertoire sequencing and MS methods as described herein and in the examples.

[0191] In some embodiments, genetically modified rodents (e.g., rats or mice) include engineered immunoglobulin heavy chain loci in their genome (e.g., their germline genome), such as those containing restricted heavy chain variable region sequences that include a limited repertoire of human heavy chain variable regions.

[0192] In some embodiments, the genetically modified rodent has in its genome (e.g., its germline genome) an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) comprising a single human VH gene segment, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments upstream of (e.g., operably linked to) one or more rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes (e.g., one or more endogenous rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes). Genetically modified rodents having such an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) are exemplified, for example, in U.S. Patent Publication 2019/0261612 and U.S. Patent No. 10,238,093, each of which is incorporated by reference in its entirety.

[0193] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) has an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) in its genome (e.g., its germline genome) that includes a single rearranged human heavy chain variable region upstream of (e.g., operably linked to) one or more rodent (e.g., rat or mouse) constant region genes. Such an engineered immunoglobulin heavy chain locus is referred to herein as the “UHC locus,” “universal heavy chain locus,” or “common heavy chain locus.” A rodent containing a UHC locus is exemplified, for example, in U.S. Patent No. 9,204,624, which is incorporated in its entirety by reference.

[0194] In some embodiments, the single rearranged human heavy chain variable region includes a single human VH gene segment, a single human DH gene segment, and a single human JH gene segment. In some embodiments, the single human VH gene segment is human VH3-23, the single human DH gene segment is human DH4-4, and the single human JH gene segment is human JH4.

[0195] In some embodiments, the single rearranged human heavy chain variable region includes a single human VH gene segment and a single human JH gene segment, which are separated by two amino acids. In some embodiments, the single human VH gene segment is human VH3-23, the single human JH gene segment is human JH4, and the two amino acids are glycine and tyrosine.

[0196] In some embodiments, one or more rodent (e.g., mouse or rat) heavy chain constant region genes are one or more endogenous rodent (e.g., mouse or rat) heavy chain constant region genes.

[0197] In some embodiments, genetically modified rodents (e.g., rats or mice) containing a UHC locus produce antibodies, for example, in response to antigen stimulation, which include, among other things, immunoglobulin chains, each immunoglobulin chain containing a human heavy chain variable domain operably linked to a constant domain.

[0198] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) has in its genome (e.g., its germline genome) an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) comprising one or more unrearranged human VL gene segments and one or more unrearranged human JL gene segments upstream of (e.g., operably linked to) one or more rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes (e.g., one or more endogenous rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes). In some embodiments, such a genetically modified rodent includes a hybrid heavy chain locus having both a light chain (e.g., light chain variable region) sequence and a heavy chain (e.g., heavy chain constant region) sequence. Such an engineered immunoglobulin heavy chain locus is referred to herein as a “LoH locus”. Rodents containing LoH loci are exemplified, for example, in U.S. Patent No. 9,686,970 and U.S. Patent Publication No. 2013/0212719, each of which is incorporated by reference in its entirety. In some embodiments, the one or more unrearranged human VL gene segments and the one or more unrearranged human JL gene segments are one or more unrearranged human Vκ gene segments and one or more unrearranged human Jκ gene segments. In some embodiments, the one or more unrearranged human VL gene segments and the one or more unrearranged human JL gene segments are one or more unrearranged human Vλ gene segments and one or more unrearranged human Jλ gene segments. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the LoH locus.

[0199] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the LoH locus produce antibodies, for example, in response to antigen stimulation, which include, among other things, immunoglobulin chains, each immunoglobulin chain containing a human light chain variable domain operably linked to a rodent (e.g., rat or mouse) heavy chain constant domain.

[0200] In some embodiments, immunized rodents produce antibodies containing two immunoglobulin heavy chains and two immunoglobulin light chains. In some embodiments, immunized rodents do not produce single-domain antibodies, heavy-chain-only antibodies, and / or nanobodies.

[0201] In some embodiments, the genetically modified rodents (e.g., rats or mice) provided herein have a genome (e.g., germline genome) that includes a modification comprising a deletion of the nucleic acid sequence encoding the CH1 domain of the endogenous IgG constant region gene, which is referred to herein as the "CH1 deletion modification." In some embodiments, the genetically modified rodents (e.g., rats or mice) comprising the CH1 deletion modification produce, among other things, IgG heavy chain antibodies comprising immunoglobulin heavy chains, each immunoglobulin heavy chain lacking all or part of the CH1 domain. In some embodiments, the genetically modified rodents (e.g., rats or mice) provided herein have a genome (e.g., germline genome) containing a sequence encoding a heavy-chain-only immunoglobulin, which includes an unrearranged human heavy-chain variable region in operable linkage with an endogenous heavy-chain constant region, the endogenous heavy-chain constant region comprising (1) an intact endogenous IgM gene encoding an IgM isotype that associates with a light chain, and (2) a non-IgM gene, e.g., an IgG gene, lacking a sequence encoding a functional CH1 domain, the non-IgM gene encoding a non-IgM isotype lacking a CH1 domain that can covalently bind to the light-chain constant domain. In some embodiments, the produced IgG antibody also lacks a corresponding light chain, and the rodent secretes an IgG heavy-chain-only antibody into its serum. Exemplary rodents containing CH1 deletion modifications are described, for example, in U.S. Patent No. 8,754,287, U.S. Patent Publication No. 2015/0289489, and PCT Publications WO2006/008548, WO2010/109165, and WO2016/062990 (each incorporated herein by reference in its entirety). In some embodiments, immunized rodents produce single-domain antibodies, heavy-chain-only antibodies, and/or nanobodies.

[0202] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin heavy chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent having a heavy chain immunoglobulin variable region containing a CH1 deletion modification in its germline genome, comprising: (i) obtaining a plurality of peptide sequences of human immunoglobulin heavy chain variable domains obtained from a sample comprising an antibody population produced by genetically modified rodents immunized with the antigen; and (ii) matching a library of human immunoglobulin heavy chain variable domain sequences with the plurality of peptide sequences, wherein the library comprises a plurality of human immunoglobulin heavy chain variable domain sequences encoded by B cells of the immunized rodent.

[0203] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin heavy chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent whose germline genome contains a heavy chain immunoglobulin variable region including a CH1 deletion modification, comprising: (i) obtaining a library of human immunoglobulin heavy chain variable domain sequences comprising a plurality of human immunoglobulin heavy chain variable domain sequences encoded by B cells of a rodent immunized with the antigen; and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin heavy chain variable domains obtained from a sample comprising an antibody population produced by a rodent immunized with the antigen.
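
The matching step (ii) of the two preceding paragraphs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `LibraryEntry`, `Match`, and `match_peptides` are hypothetical, the sequences used with it would be toy data, and exact substring matching with ranking by CDR3 support and sequence coverage is one simple way to align peptides to a sequence library, not the procedure the disclosure prescribes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LibraryEntry:
    """One heavy chain variable domain sequence from the B-cell-derived library."""
    name: str   # hypothetical clone identifier
    vh: str     # variable domain amino-acid sequence
    cdr3: str   # CDR3 amino-acid sequence contained in vh


@dataclass(frozen=True)
class Match:
    name: str
    peptides: Tuple[str, ...]  # peptides found in this library sequence
    coverage: float            # fraction of vh residues covered by peptides
    cdr3_supported: bool       # True if every CDR3 residue is covered


def match_peptides(library: List[LibraryEntry], peptides: List[str]) -> List[Match]:
    """Match peptide sequences to library sequences by exact substring search."""
    matches = []
    for entry in library:
        covered = [False] * len(entry.vh)
        hits = []
        for pep in peptides:
            if not pep:
                continue  # ignore empty peptide strings
            start = entry.vh.find(pep)
            if start == -1:
                continue
            hits.append(pep)
            # Mark every occurrence of the peptide as covered.
            while start != -1:
                for i in range(start, start + len(pep)):
                    covered[i] = True
                start = entry.vh.find(pep, start + 1)
        if not hits:
            continue
        c0 = entry.vh.find(entry.cdr3)
        cdr3_ok = c0 != -1 and all(covered[c0:c0 + len(entry.cdr3)])
        matches.append(
            Match(entry.name, tuple(hits), sum(covered) / len(entry.vh), cdr3_ok)
        )
    # Rank CDR3-supported sequences first, then by coverage.
    matches.sort(key=lambda m: (m.cdr3_supported, m.coverage), reverse=True)
    return matches
```

In this sketch a library sequence whose CDR3 is fully covered by observed peptides ranks above one supported only by framework peptides, since framework peptides are often shared by many library sequences and the CDR3 is what distinguishes them.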

[0204] In some embodiments, the genetically modified rodents (e.g., rats or mice) provided herein have a genome (e.g., germline genome) containing an engineered immunoglobulin heavy chain (e.g., HoH, UHC, LoH) locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) that lacks a functional endogenous rodent Adam6 gene. In some embodiments, the genetically modified rodents (e.g., rats or mice) provided herein have a genome (e.g., germline genome) containing one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, or functional orthologs, functional homologs, or functional fragments thereof. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6b. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a and mouse ADAM6b. Rodents comprising one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, or functional orthologs, functional homologs, or functional fragments thereof, are exemplified, for example, in U.S. Patents 8,642,835, 8,697,940, 9,706,759, 10,130,081, 10,238,093, and U.S. Patent Publication 2013/0212719 (each of which is incorporated by reference in its entirety). In some embodiments, the provided genetically modified rodents (e.g., rats or mice) express one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, or functional orthologs, functional homologs, or functional fragments thereof. In some embodiments, the provided genetically modified rodents (e.g., rats or mice) have a genome (e.g., germline genome) containing one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, or functional orthologs, functional homologs, or functional fragments thereof, located on the same chromosome as the engineered immunoglobulin heavy chain (e.g., HoH, UHC, LoH) locus. In some embodiments, the provided genetically modified rodents (e.g., rats or mice) have a genome (e.g., germline genome) comprising, in place of a human Adam6 pseudogene, one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, or functional orthologs, functional homologs, or functional fragments thereof.

[0205] In some embodiments, the provided genetically modified rodents have a genome (e.g., germline genome) comprising one or more human VH gene segments, including a first and a second human VH gene segment, and one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, or functional orthologs, functional homologs, or functional fragments thereof, between the first human VH gene segment and the second human VH gene segment. In some embodiments, the first human VH gene segment is VH1-2, and the second human VH gene segment is VH6-1.

[0206] In some embodiments, the one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, or functional orthologs, functional homologs, or functional fragments thereof, are located between a human VH gene segment and a human DH gene segment.

[0207] In some embodiments, the one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides restore or enhance the fertility of male rodents.

[0208] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) has in its genome (e.g., its germline genome) an engineered immunoglobulin light chain locus (e.g., an engineered endogenous rodent immunoglobulin light chain locus) comprising one or more unrearranged human VL gene segments and one or more unrearranged human JL gene segments upstream of (e.g., operably linked to) one or more immunoglobulin light chain constant region genes. In some embodiments, the one or more unrearranged human VL gene segments and the one or more unrearranged human JL gene segments are one or more unrearranged human Vκ gene segments and one or more unrearranged human Jκ gene segments. In some embodiments, the one or more unrearranged human VL gene segments and the one or more unrearranged human JL gene segments are one or more unrearranged human Vλ gene segments and one or more unrearranged human Jλ gene segments. In some embodiments, the one or more immunoglobulin light chain constant region genes are or include Cκ. In some embodiments, the one or more immunoglobulin light chain constant region genes are or include Cλ.

[0209] In some embodiments, the engineered immunoglobulin light chain locus (e.g., an engineered endogenous murine immunoglobulin light chain locus) includes a non-natural leader sequence. In some embodiments, the leader sequence includes a signal peptide. In some embodiments, the leader sequence includes a non-natural signal peptide.

[0210] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) includes in its genome (e.g., its germline genome) an engineered immunoglobulin light chain locus (e.g., an engineered endogenous rodent immunoglobulin light chain locus) comprising one or more unrearranged human Vκ gene segments and one or more unrearranged human Jκ gene segments upstream of (e.g., operably linked to) the Cκ gene. Such an engineered immunoglobulin light chain locus is referred to herein as the “KoK locus”. Rodents containing the KoK locus are exemplified, for example, in U.S. Patents 6,596,541, 8,642,835, and 8,697,940, each of which is incorporated by reference in its entirety. In some embodiments, the immunoglobulin κ light chain constant region gene of the KoK locus is the rodent (e.g., rat or mouse) Cκ gene. In some embodiments, the immunoglobulin κ light chain constant region gene at the KoK locus is the endogenous rodent (e.g., rat or mouse) Cκ gene. In some embodiments, the immunoglobulin κ light chain constant region gene at the KoK locus is the endogenous rodent (e.g., rat or mouse) Cκ gene located at the endogenous immunoglobulin κ light chain locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the KoK locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the KoK locus.

[0211] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the KoK locus produce antibodies, for example, in response to antigen stimulation, which include, in particular, κ light chains, each κ light chain comprising a human κ light chain variable domain operably linked to a rodent (e.g., rat or mouse) κ light chain constant domain.

[0212] In some embodiments, the one or more unrearranged human Vκ gene segments include at least six human Vκ gene segments. In some embodiments, the one or more unrearranged human Vκ gene segments include at least sixteen human Vκ gene segments. In some embodiments, the one or more unrearranged human Vκ gene segments include at least thirty human Vκ gene segments. In some embodiments, the one or more unrearranged human Vκ gene segments include at least forty human Vκ gene segments. In some embodiments, the one or more unrearranged human Jκ gene segments include at least five human Jκ gene segments.

[0213] In some embodiments, the one or more unrearranged human Vκ gene segments include at least 16 human Vκ gene segments, and the one or more unrearranged human Jκ gene segments include at least 5 human Jκ gene segments. Such an engineered immunoglobulin light chain locus is referred to herein as the "VelocImmune® 1 KoK locus". In some embodiments, the one or more unrearranged human Vκ gene segments include at least 30 human Vκ gene segments, and the one or more unrearranged human Jκ gene segments include at least 5 human Jκ gene segments. Such an engineered immunoglobulin light chain locus is referred to herein as the "VelocImmune® 2 KoK locus". In some embodiments, the one or more unrearranged human Vκ gene segments include at least 40 human Vκ gene segments, and the one or more unrearranged human Jκ gene segments include at least 5 human Jκ gene segments. Such an engineered immunoglobulin light chain locus is referred to herein as the "VelocImmune® 3 KoK locus".

[0214] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) has an engineered immunoglobulin light chain locus (e.g., an engineered endogenous rodent immunoglobulin light chain locus) in its genome (e.g., its germline genome) that includes one or more unrearranged human Jλ gene segments and one or more unrearranged human Vλ gene segments upstream of (e.g., operably linked to) one or more Cλ genes. Such an engineered immunoglobulin light chain locus is referred to herein as the “LoL locus.” Mice containing the LoL locus are exemplified, for example, in U.S. Patent Nos. 9,012,717, 9,226,484, 9,029,628, and U.S. Patent Publication No. 2018/0125043, each of which is incorporated by reference in its entirety. In some embodiments, the one or more unrearranged human Jλ gene segments and the one or more Cλ genes of the LoL locus reside in a Jλ-Cλ cluster. In some embodiments, the one or more Cλ genes at the LoL locus include one or more human Cλ genes. In some embodiments, the one or more Cλ genes at the LoL locus include one or more mouse Cλ genes. In some embodiments, the one or more Cλ genes at the LoL locus include one or more human Cλ genes and one or more mouse Cλ genes. In some embodiments, the one or more mouse Cλ genes at the LoL locus include a mouse Cλ1 gene. In some embodiments, genetically modified rodents (e.g., rats or mice) are homozygous at the LoL locus. In some embodiments, genetically modified rodents (e.g., rats or mice) are heterozygous at the LoL locus.

[0215] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the LoL locus produce antibodies, for example, in response to antigen stimulation, particularly those containing λ light chains, each λ light chain comprising a human λ light chain variable domain operably linked to a rodent (e.g., rat or mouse) λ light chain constant domain.

[0216] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) has an engineered immunoglobulin light chain locus in its genome (e.g., its germline genome) that includes one or more unrearranged human Vλ gene segments and one or more unrearranged human Jλ gene segments upstream of (e.g., operably linked to) the Cκ gene. Such an engineered immunoglobulin light chain locus is referred to herein as the “LoK locus”. Rodents containing the LoK locus are exemplified, for example, in U.S. Patents 9,006,511 and 9,035,128, each of which is incorporated by reference in its entirety. In some embodiments, the Cκ gene of the LoK locus is a rodent (e.g., a rat or mouse) Cκ gene. In some embodiments, the Cκ gene of the LoK locus is an endogenous rodent (e.g., a rat or mouse) Cκ gene. In some embodiments, the Cκ gene at the LoK locus is the endogenous rodent (e.g., rat or mouse) Cκ gene located at the endogenous immunoglobulin κ light chain locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoK locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the LoK locus.

[0217] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the LoK locus produce antibodies, for example, in response to antigen stimulation, which include, in particular, light chains, each light chain comprising a human λ light chain variable domain operably linked to a rodent (e.g., rat or mouse) κ light chain constant domain.

[0218] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) includes in its genome (e.g., its germline genome) an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) comprising one or more unrearranged human Vλ gene segments and one or more unrearranged human Jλ gene segments upstream of (e.g., operably linked to) the Cλ gene. Such an engineered immunoglobulin light chain locus is referred to herein as the “LiK locus”. A rodent containing the LiK locus is exemplified, for example, in U.S. Patent Publication 2019/0223418 (published as U.S. Patent No. 11,051,498), which is incorporated by reference in its entirety. In some embodiments, the Cλ gene of the LiK locus is the rodent (e.g., a rat or mouse) Cλ gene. In some embodiments, the Cλ gene of the LiK locus is the mouse Cλ1 gene. In some embodiments, genetically modified rodents (e.g., rats or mice) are homozygous at the LiK locus. In some embodiments, genetically modified rodents (e.g., rats or mice) are heterozygous at the LiK locus.

[0219] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the LiK locus produce antibodies, for example, in response to antigen stimulation, which include, among other things, λ light chains, each λ light chain comprising a human λ light chain variable domain operably linked to a rodent (e.g., rat or mouse) λ light chain constant domain.
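
As a reading aid, the locus abbreviations defined in the paragraphs above can be collected into a small lookup structure. This is only a sketch that restates those definitions: the names `EngineeredLocus`, `LOCI`, and `loci_with_constant` are hypothetical and appear nowhere in the disclosure, and each entry is condensed from the defining paragraph, so the optional embodiments (for example, human Cλ genes at the LoL locus) are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EngineeredLocus:
    variable_segments: str  # human gene segments placed in the locus
    constant_genes: str     # constant region gene(s) they sit upstream of
    note: str = ""          # extra detail from the defining paragraph


# Condensed from paragraphs [0185], [0193], [0198], [0210], [0214], [0216], [0218].
LOCI = {
    "HoH": EngineeredLocus("unrearranged human VH, DH, JH", "rodent heavy chain constant"),
    "UHC": EngineeredLocus("single rearranged human heavy chain variable region",
                           "rodent constant"),
    "LoH": EngineeredLocus("unrearranged human VL, JL", "rodent heavy chain constant",
                           "hybrid heavy chain locus"),
    "KoK": EngineeredLocus("unrearranged human Vκ, Jκ", "Cκ"),
    "LoL": EngineeredLocus("unrearranged human Vλ, Jλ", "Cλ"),
    "LoK": EngineeredLocus("unrearranged human Vλ, Jλ", "Cκ"),
    "LiK": EngineeredLocus("unrearranged human Vλ, Jλ", "Cλ",
                           "engineered κ light chain locus"),
}


def loci_with_constant(constant_genes: str) -> List[str]:
    """Return the names of loci whose segments sit upstream of the given constant gene(s)."""
    return sorted(name for name, locus in LOCI.items()
                  if locus.constant_genes == constant_genes)
```

For example, `loci_with_constant("Cκ")` picks out the KoK and LoK loci, which differ only in whether the human variable gene segments are κ or λ.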

[0220] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) has an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) in its genome (e.g., its germline genome) that includes one or more unrearranged human Jλ gene segments and one or more unrearranged human Vλ gene segments upstream of (e.g., operably linked to) one or more human Cλ genes. In some embodiments, the one or more unrearranged human Jλ gene segments and the one or more Cλ genes of such an engineered immunoglobulin κ light chain locus reside in a Jλ-Cλ cluster. In some embodiments, the genetically modified rodent (e.g., a rat or mouse) is homozygous for such an engineered immunoglobulin κ light chain locus. In some embodiments, the genetically modified rodent (e.g., a rat or mouse) is heterozygous for such an engineered immunoglobulin κ light chain locus. In some embodiments, genetically modified rodents (e.g., rats or mice) containing such an engineered immunoglobulin κ light chain locus produce antibodies, for example, in response to antigen stimulation, which include, among other things, λ light chains, each λ light chain containing a human λ light chain variable domain operably linked to a human λ light chain constant domain.

[0221] In some embodiments, genetically modified rodents (e.g., rats or mice) have a germline genome that includes a limited repertoire of human light chain variable regions.

[0222] Exemplary genetically modified rodents comprising a human V(D)J gene segment having a germline genome containing a limited human light chain variable region repertoire are described, for example, in U.S. Patents 9,796,788, 10,130,081, 10,143,186, 10,167,344, 10,412,940, and 10,130,081, as well as WO2019/008123, WO2020/247623, and WO2020/132557, each of which is incorporated herein by reference in its entirety. In some embodiments, the limited human light chain variable region repertoire includes a limited number of human VL gene segments. In some embodiments, the limited number of human VL gene segments includes two human VL gene segments. In some embodiments, the limited number of human VL gene segments includes one human VL gene segment. For example, in some embodiments, the limited number of human VL gene segments is one human Vκ gene segment. The one human Vκ gene segment may be, for example, the human Vκ1-39 gene segment, the human Vκ3-15 gene segment, the human Vκ3-11 gene segment, or the human Vκ3-20 gene segment. In some embodiments, the limited number of human VL gene segments is one human Vλ gene segment. The one human Vλ gene segment may be, for example, the human Vλ1-51 gene segment, the human Vλ5-45 gene segment, the human Vλ1-44 gene segment, the human Vλ1-40 gene segment, the human Vλ3-21 gene segment, or the human Vλ2-14 gene segment.

[0223] In some embodiments, the limited human light chain variable region repertoire includes one or more JL gene segments. In some embodiments, the limited human light chain variable region repertoire includes one JL gene segment. In some embodiments, the one JL gene segment is a Jκ gene segment. In some embodiments, the one JL gene segment is a Jλ gene segment. In some embodiments, the one JL gene segment is a human JL gene segment. In some embodiments, the one JL gene segment is a mouse JL gene segment.

[0224] In some embodiments, the limited human light chain variable region repertoire includes (i) a human Vκ gene segment and a human Jκ gene segment, (ii) a human Vκ gene segment and a mouse Jκ gene segment, (iii) a human Vκ gene segment and a human Jλ gene segment, or (iv) a human Vκ gene segment and a mouse Jλ gene segment.

[0225] In some embodiments, the limited repertoire of human light chain variable regions includes (i) a human Vλ gene segment and a human Jλ gene segment, (ii) a human Vλ gene segment and a mouse Jλ gene segment, (iii) a human Vλ gene segment and a human Jκ gene segment, or (iv) a human Vλ gene segment and a mouse Jκ gene segment.

[0226] In some embodiments, the limited human light chain variable region repertoire includes (i) the human Vκ1-39 gene segment and the human Jκ5 gene segment, (ii) the human Vκ1-39 gene segment and the human Jκ1 gene segment, (iii) the human Vκ3-20 gene segment and the human Jκ1 gene segment, or (iv) the human Vκ3-20 gene segment and the human Jκ5 gene segment.

[0227] In some embodiments, the limited human light chain variable region repertoire includes (i) the human Vκ1-39 gene segment and the mouse Jκ2 gene segment, (ii) the human Vκ3-20 gene segment and the mouse Jκ2 gene segment, or (iii) the human Vκ3-15 gene segment and the mouse Jκ2 gene segment.

[0228] In some embodiments, the limited human light chain variable region repertoire includes (i) the human Vλ1-51 gene segment and the human Jλ2 gene segment, (ii) the human Vλ5-45 gene segment and the human Jλ2 gene segment, (iii) the human Vλ1-44 gene segment and the human Jλ2 gene segment, (iv) the human Vλ1-40 gene segment and the human Jλ2 gene segment, (v) the human Vλ3-21 gene segment and the human Jλ2 gene segment, or (vi) the human Vλ2-14 gene segment and the human Jλ2 gene segment.

[0229] In some embodiments, a limited repertoire of human light chain variable regions is operably linked to the Cκ gene segment. In some embodiments, the Cκ gene segment is human. In some embodiments, the Cκ gene segment is mouse. In some embodiments, the mouse Cκ gene segment is an endogenous mouse Cκ gene segment located, for example, at the endogenous mouse immunoglobulin κ light chain locus. In some embodiments, the mouse Cκ gene segment is located at the endogenous mouse immunoglobulin λ light chain locus.

[0230] In some embodiments, a limited repertoire of human light chain variable regions is operably linked to the Cλ gene segment. In some embodiments, the Cλ gene segment is human. In some embodiments, the Cλ gene segment is mouse. In some embodiments, the mouse Cλ gene segment is an endogenous mouse Cλ gene segment located, for example, at the endogenous mouse immunoglobulin λ light chain locus. In some embodiments, the mouse Cλ gene segment is located at the endogenous mouse immunoglobulin κ light chain locus.

[0231] In some embodiments, genetically modified mice are heterozygous for a limited repertoire of human light chain variable regions. In some embodiments, genetically modified mice are homozygous for a limited repertoire of human light chain variable regions.

[0232] In some embodiments, genetically modified rodents include an engineered immunoglobulin light chain locus (e.g., an engineered endogenous rodent immunoglobulin light chain locus) that includes a limited repertoire of human light chain variable regions, comprising a restricted light chain variable region sequence. In some embodiments, the limited repertoire of human light chain variable regions includes one or two human light chain V gene segments and one or more human light chain J gene segments. In some embodiments, the limited repertoire of human light chain variable regions is operably linked to a light chain constant region gene segment. In some embodiments, genetically modified rodents (e.g., rats or mice) including a limited repertoire of human light chain variable regions include in their genome (e.g., their germline genome) exactly two unrearranged human light chain V gene segments and one or more unrearranged human light chain J gene segments operably linked to a light chain constant region sequence. Such an engineered immunoglobulin light chain locus is referred to herein as a “DLC locus”. In some embodiments, genetically modified rodents containing a limited repertoire of human light chain variable regions include a single rearranged light chain variable region locus in their genome (e.g., their germline genome) that contains a single human light chain V gene segment rearranged to a single human light chain J gene segment. In some embodiments, genetically modified rodents (e.g., rats or mice) containing a limited repertoire of human light chain variable regions include a single rearranged light chain variable region locus in their genome (e.g., their germline genome) that is operably linked to a light chain constant region sequence, and this single rearranged light chain variable region locus contains a single human light chain V gene segment rearranged to a single human light chain J gene segment. Such an engineered immunoglobulin light chain locus is referred to herein as a “ULC locus.” As used herein, the term “ULC locus” is interchangeable with “universal light chain locus” or “common light chain locus.”
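The distinction between a DLC locus and a ULC locus drawn in paragraph [0232] can be illustrated with a minimal Python sketch. The `LightChainLocus` class, its fields, and the `classify_locus` function are hypothetical names introduced here for illustration only; they are not part of the specification, and the sketch reduces each locus to its segment counts and rearrangement state.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LightChainLocus:
    """Hypothetical, simplified description of an engineered light chain locus."""
    v_segments: Tuple[str, ...]  # human light chain V gene segments at the locus
    j_segments: Tuple[str, ...]  # human light chain J gene segments at the locus
    rearranged: bool             # True if V and J are already joined in the germline


def classify_locus(locus: LightChainLocus) -> str:
    """Return 'ULC', 'DLC', or 'other' using the definitions in [0232]."""
    n_v = len(locus.v_segments)
    n_j = len(locus.j_segments)
    # ULC: a single V segment rearranged to a single J segment.
    if locus.rearranged and n_v == 1 and n_j == 1:
        return "ULC"
    # DLC: exactly two unrearranged V segments and one or more unrearranged J segments.
    if not locus.rearranged and n_v == 2 and n_j >= 1:
        return "DLC"
    return "other"


ulc = LightChainLocus(("Vκ1-39",), ("Jκ5",), rearranged=True)
dlc = LightChainLocus(("Vκ1-39", "Vκ3-20"), ("Jκ1", "Jκ2"), rearranged=False)
print(classify_locus(ulc), classify_locus(dlc))
```

The sketch deliberately ignores the constant region and zygosity, which the specification treats as separate embodiments.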

[0233] In some embodiments, genetically modified rodents (e.g., rats or mice) have a germline genome containing a limited human κ light chain variable region repertoire. In some embodiments, genetically modified rodents contain an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) containing a limited human κ light chain variable region repertoire. In some embodiments, the limited human κ light chain variable region repertoire includes one or two human Vκ gene segments and one or more human Jκ gene segments. In some embodiments, the limited human κ light chain variable region repertoire is operably linked to a light chain constant region gene segment. In some embodiments, the provided genetically modified rodents contain a limited human κ light chain variable region repertoire operably linked to a Cκ gene segment.

[0234] In some embodiments, genetically modified rodents (e.g., rats or mice) have a limited repertoire of human κ light chain variable regions in their genome (e.g., their germline genome), which includes a single rearranged human κ light chain variable region (Vκ / Jκ). The single rearranged human κ light chain variable region includes a human Vκ gene segment joined to a human Jκ gene segment. In some embodiments, genetically modified rodents (e.g., rats or mice) have an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) in their genome (e.g., their germline genome) which includes a single rearranged human κ light chain variable region upstream of (e.g., operably linked to) the Cκ gene. Such an engineered immunoglobulin light chain locus is referred to herein as a "κULC locus," which is an example of a ULC locus. Rodents containing the κULC locus are exemplified, for example, in U.S. Patent Nos. 10,130,081 and 10,143,186, each of which is incorporated by reference in its entirety.

[0235] In some embodiments, a single rearranged human κ light chain variable region includes a human Vκ gene segment and a human Jκ gene segment. In some embodiments, the human Vκ gene segment is the human Vκ1-39 gene segment or the human Vκ3-20 gene segment. In some embodiments, the human Jκ gene segment is the human Jκ1 gene segment, the human Jκ2 gene segment, the human Jκ3 gene segment, the human Jκ4 gene segment, or the human Jκ5 gene segment. In some embodiments, the human Vκ gene segment is the human Vκ1-39 gene segment and the human Jκ gene segment is the human Jκ5 gene segment. In some embodiments, a single rearranged human κ light chain variable region is human Vκ1-39 / Jκ5. In some embodiments, the human Vκ gene segment is the human Vκ3-20 gene segment and the human Jκ gene segment is the human Jκ1 gene segment. In some embodiments, a single rearranged human κ light chain variable region is human Vκ3-20 / Jκ1. In some embodiments, the human Vκ gene segment is the human Vκ3-11 gene segment, and the human Jκ gene segment is selected from the human Jκ1 gene segment, the human Jκ2 gene segment, the human Jκ3 gene segment, the human Jκ4 gene segment, or the human Jκ5 gene segment. In some embodiments, the human Vκ gene segment is the human Vκ3-11 gene segment, and the human Jκ gene segment is the human Jκ1 gene segment. In some embodiments, the single rearranged human κ light chain variable region is Vκ3-11 / Jκ1.

[0236] In some embodiments, the Cκ gene at the κULC locus is a rodent (e.g., rat or mouse) Cκ gene. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the κULC locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the κULC locus.

[0237] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κULC locus lack endogenous Vκ and / or Jκ gene segments that can be rearranged to form endogenous κ light chain variable regions. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κULC locus lack endogenous Vλ and / or Jλ gene segments that can be rearranged to form endogenous λ light chain variable regions.

[0238] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κULC locus produce antibodies, for example, in response to antigen stimulation, that include, among other things, κ light chains, each κ light chain containing a human κ light chain variable domain operably linked to a rodent (e.g., rat or mouse) κ light chain constant domain. In some embodiments, all κ light chains expressed by B cells of genetically modified rodents (e.g., rats or mice) containing the κULC locus contain a human κ light chain variable domain expressed from the single rearranged human κ light chain variable region or a somatically hypermutated variant thereof.

[0239] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) has an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) in its genome (e.g., its germline genome) that comprises exactly two unrearranged human Vκ gene segments and one or more unrearranged human Jκ gene segments operably linked to a κ light chain constant region sequence (e.g., operably linked to the Cκ gene). Such an engineered immunoglobulin κ light chain locus is referred to herein as the "κDLC locus," which is an example of a DLC locus. Rodents containing the κDLC locus are exemplified, for example, in U.S. Patents 9,796,788, 10,167,344, 10,412,940, and 10,130,081, each of which is incorporated by reference in its entirety.

[0240] In some embodiments, the exactly two unrearranged human Vκ gene segments include the human Vκ1-39 gene segment and the human Vκ3-20 gene segment. In some embodiments, the one or more unrearranged human Jκ gene segments include two human Jκ gene segments. In some embodiments, the one or more unrearranged human Jκ gene segments include three human Jκ gene segments. In some embodiments, the one or more unrearranged human Jκ gene segments include four human Jκ gene segments. In some embodiments, the one or more unrearranged human Jκ gene segments include five human Jκ gene segments. In some embodiments, the one or more unrearranged human Jκ gene segments include the human Jκ1 gene segment, the human Jκ2 gene segment, the human Jκ3 gene segment, the human Jκ4 gene segment, the human Jκ5 gene segment, or a combination thereof.

[0241] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) containing the κDLC locus contains in its genome (e.g., its germline genome) exactly two unrearranged human Vκ gene segments and five unrearranged human Jκ gene segments. In some embodiments, the exactly two unrearranged human Vκ gene segments include the human Vκ1-39 gene segment and the human Vκ3-20 gene segment, and the five unrearranged human Jκ gene segments include the human Jκ1 gene segment, the human Jκ2 gene segment, the human Jκ3 gene segment, the human Jκ4 gene segment, and the human Jκ5 gene segment.
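As a rough illustration of how limited the repertoire of paragraph [0241] is, the following Python sketch enumerates the V/J segment pairings available from two Vκ and five Jκ gene segments. It assumes only that each rearrangement joins one V segment to one J segment; it counts segment pairings and does not model junctional diversity or somatic hypermutation. The function name `germline_vj_pairings` is hypothetical and introduced here for illustration.

```python
from itertools import product

# Segment names taken from paragraph [0241].
V_KAPPA = ["Vκ1-39", "Vκ3-20"]
J_KAPPA = ["Jκ1", "Jκ2", "Jκ3", "Jκ4", "Jκ5"]


def germline_vj_pairings(v_segments, j_segments):
    """List every V/J pairing, assuming one V joins one J per rearrangement."""
    return [f"{v} / {j}" for v, j in product(v_segments, j_segments)]


pairings = germline_vj_pairings(V_KAPPA, J_KAPPA)
for pairing in pairings:
    print(pairing)
print(len(pairings), "possible V/J pairings")
```

Under that assumption the κDLC locus offers ten segment pairings, whereas a κULC locus such as Vκ1-39 / Jκ5 fixes a single one.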

[0242] In some embodiments, the Cκ gene at the κDLC locus is a rodent (e.g., rat or mouse) Cκ gene. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the κDLC locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the κDLC locus.

[0243] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κDLC locus lack endogenous immunoglobulin Vκ and / or Jκ gene segments that can be rearranged to form endogenous immunoglobulin κ light chain variable regions. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κDLC locus lack endogenous Vλ and / or Jλ gene segments that can be rearranged to form endogenous λ light chain variable regions.

[0244] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κDLC locus produce antibodies, for example, in response to antigen stimulation, that include, among other things, κ light chains, each κ light chain comprising a human κ light chain variable domain operably linked to a rodent (e.g., rat or mouse) κ light chain constant domain.

[0245] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) containing a limited repertoire of human λ light chain variable regions. In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) containing an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) containing a limited repertoire of human λ light chain variable regions. In some embodiments, the genetically modified rodent (e.g., rat or mouse) contains an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) containing a limited repertoire of human λ light chain variable regions, which includes one or two human Vλ gene segments and one or more human Jλ gene segments. In some embodiments, the genetically modified rodent (e.g., rat or mouse) includes a limited human λ light chain variable region repertoire operably linked to a light chain constant region gene segment. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) includes a limited human λ light chain variable region repertoire operably linked to a rodent (e.g., rat or mouse) Cκ gene segment. In some embodiments, the provided genetically modified rodent includes a limited human λ light chain variable region repertoire operably linked to a rodent (e.g., rat or mouse) Cλ gene segment.

[0246] In some embodiments, genetically modified rodents (e.g., rats or mice) have a genome (e.g., germline genome) containing an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) that includes a limited human λ light chain variable region repertoire, which includes a single rearranged human immunoglobulin λ light chain variable region (Vλ / Jλ). The single rearranged human λ light chain variable region includes a human Vλ gene segment joined to a human Jλ gene segment. In some embodiments, genetically modified rodents include a limited human λ light chain variable region repertoire operably linked to a rodent (e.g., rat or mouse) Cκ or Cλ gene segment (e.g., mouse Cλ1 gene segment). Such an engineered immunoglobulin light chain locus is an example of a ULC locus, which is referred to herein as the "ULCiK locus". Rodents containing the ULCiK locus are exemplified, for example, in WO2020 / 247623, which is incorporated in its entirety by reference.

[0247] In some embodiments, the human Vλ gene segment is selected from the group consisting of Vλ4-69, Vλ8-61, Vλ4-60, Vλ6-57, Vλ10-54, Vλ5-52, Vλ1-51, Vλ9-49, Vλ1-47, Vλ7-46, Vλ5-45, Vλ1-44, Vλ7-43, Vλ1-40, Vλ5-37, Vλ1-36, Vλ3-27, Vλ3-25, Vλ2-23, Vλ3-22, Vλ3-21, Vλ3-19, Vλ2-18, Vλ3-16, Vλ2-14, Vλ3-12, Vλ2-11, Vλ3-10, Vλ3-9, Vλ2-8, Vλ4-3, and Vλ3-1. In some embodiments, the human Vλ gene segment is selected from the group consisting of Vλ5-52, Vλ1-51, Vλ9-49, Vλ1-47, Vλ7-46, Vλ5-45, Vλ1-44, Vλ7-43, Vλ1-40, Vλ5-37, Vλ1-36, Vλ3-27, Vλ3-25, Vλ2-23, Vλ3-22, Vλ3-21, Vλ3-19, Vλ2-18, Vλ3-16, Vλ2-14, Vλ3-12, Vλ2-11, Vλ3-10, Vλ3-9, Vλ2-8, Vλ4-3, and Vλ3-1. In some embodiments, the human Vλ gene segment is selected from the group consisting of Vλ1-51, Vλ5-45, Vλ1-44, Vλ1-40, Vλ3-21, and Vλ2-14. In some embodiments, the human Vλ gene segment is Vλ1-51 or Vλ2-14. In some embodiments, the human Jλ gene segment is selected from the group consisting of Jλ1, Jλ2, Jλ3, Jλ6, and Jλ7. In some embodiments, the human Jλ gene segment is selected from the group consisting of Jλ1, Jλ2, Jλ3, and Jλ7. In some embodiments, the human Jλ gene segment is Jλ2.

[0248] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the ULCiK locus lack endogenous Vκ and / or Jκ gene segments that can be rearranged to form endogenous κ light chain variable regions. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the ULCiK locus lack endogenous Vλ and / or Jλ gene segments that can be rearranged to form endogenous λ light chain variable regions.

[0249] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the ULCiK locus produce antibodies, for example, in response to antigen stimulation, that include, among other things, light chains, each light chain containing a human λ light chain variable domain operably linked to a rodent (e.g., rat or mouse) light chain constant domain (e.g., a Cλ or Cκ domain). In some embodiments, all light chains expressed by B cells of genetically modified rodents (e.g., rats or mice) containing the ULCiK locus contain a human λ light chain variable domain expressed from the single rearranged human λ light chain variable region or a somatically hypermutated variant thereof.

[0250] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) containing an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) that includes a limited human λ light chain variable region repertoire, which includes two unrearranged human Vλ gene segments and one or more unrearranged human Jλ gene segments. In some embodiments, the limited human λ light chain variable region repertoire includes two unrearranged human Vλ gene segments and four unrearranged human Jλ gene segments. In some embodiments, the limited human λ light chain variable region repertoire includes two unrearranged human Vλ gene segments and five unrearranged human Jλ gene segments. In some embodiments, the genetically modified rodent includes a limited human λ light chain variable region repertoire operably linked to a rodent (e.g., rat or mouse) Cλ gene segment (e.g., mouse Cλ1 gene segment). Such an engineered immunoglobulin light chain locus is an example of a DLC locus, which is referred to herein as the “DLCiK locus.” Rodents containing the DLCiK locus are exemplified, for example, in WO2020 / 247623, which is incorporated in its entirety by reference.

[0251] In some embodiments, the germline genome of genetically modified rodents is homozygous for an engineered immunoglobulin κ light chain locus containing a limited human λ light chain variable region repertoire. In some embodiments, the germline genome of genetically modified rodents is heterozygous for an engineered immunoglobulin κ light chain locus containing a limited human λ light chain variable region repertoire.

[0252] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the DLCiK locus lack endogenous immunoglobulin Vκ and / or Jκ gene segments that can be rearranged to form endogenous immunoglobulin κ light chain variable regions. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the DLCiK locus lack endogenous Vλ and / or Jλ gene segments that can be rearranged to form endogenous λ light chain variable regions.

[0253] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the DLCiK locus produce antibodies, for example, in response to antigen stimulation, that include, among other things, light chains, each light chain comprising a human λ light chain variable domain operably linked to a rodent (e.g., rat or mouse) light chain constant domain (e.g., Cλ or Cκ domain).

[0254] In some embodiments, genetically modified rodents (e.g., rats or mice) contain an exogenous terminal deoxynucleotidyltransferase (TdT) gene. Rodents containing exogenous TdT are exemplified, for example, in U.S. Patent Publication 2019 / 0223418 and PCT Publication WO2017 / 210586, each of which is incorporated by reference in its entirety. In some embodiments, rodents (e.g., rats or mice) containing the exogenous TdT gene can have increased antigen receptor diversity compared to rodents that do not contain the exogenous TdT gene.

[0255] In some embodiments, the rodents described herein have a genome comprising an exogenous TdT gene operably linked to a transcriptional regulatory element.

[0256] In some embodiments, the transcriptional regulatory element includes a RAG1 transcriptional regulatory element, a RAG2 transcriptional regulatory element, an immunoglobulin heavy chain transcriptional regulatory element, an immunoglobulin κ light chain transcriptional regulatory element, an immunoglobulin λ light chain transcriptional regulatory element, or any combination thereof.

[0257] In some embodiments, exogenous TdT is located at the immunoglobulin κ light chain locus, immunoglobulin λ light chain locus, immunoglobulin heavy chain locus, RAG1 locus, or RAG2 locus.

[0258] In some embodiments, TdT is human TdT. In some embodiments, TdT is a short isoform of TdT (TdTS).

[0259] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH and KoK loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH and LoL loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH, KoK, and LoL loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH locus, the KoK locus, the LoL locus, or any combination thereof.

[0260] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH locus, KoK locus, and LoK locus in its genome (e.g., its germline genome).

[0261] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH and LoK loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH locus, the LoK locus, or a combination thereof.

[0262] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH and LiK loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH locus, the LiK locus, or a combination thereof.

[0263] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH loci and ULC loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH loci, the ULC loci, or a combination thereof.

[0264] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH loci and DLC loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH loci, the DLC loci, or a combination thereof.

[0265] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH locus and κULC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at HoH locus, κULC locus, or a combination thereof.

[0266] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the HoH locus and the κDLC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH locus, the κDLC locus, or a combination thereof.

[0267] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH loci and ULCiK loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at HoH loci, ULCiK loci, or a combination thereof.

[0268] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH loci and DLCiK loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at HoH loci, DLCiK loci, or a combination thereof.

[0269] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has HoH loci and HULC loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the HoH loci, the HULC loci, or a combination thereof.

[0270] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the UHC locus and the KoK locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the UHC locus and the LoL locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the UHC locus, the KoK locus, and the LoL locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the KoK locus, the LoL locus, or a combination thereof.

[0271] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a genome (e.g., its germline genome) that includes the UHC locus, the KoK locus, and the LoK locus.

[0272] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a LoK locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the LoK locus, or a combination thereof.

[0273] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a LiK locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the LiK locus, or a combination thereof.

[0274] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a ULC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the ULC locus, or a combination thereof.

[0275] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a DLC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the DLC locus, or a combination thereof.

[0276] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a κULC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the κULC locus, or a combination thereof.

[0277] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a κDLC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the κDLC locus, or a combination thereof.

[0278] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a ULCiK locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the ULCiK locus, or a combination thereof.

[0279] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a DLCiK locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the DLCiK locus, or a combination thereof.

[0280] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has a UHC locus and a HULC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the UHC locus, the HULC locus, or a combination thereof.

[0281] In some embodiments, genetically modified rodents (e.g., rats or mice) have LoH and KoK loci in their genome (e.g., their germline genome). In some embodiments, genetically modified rodents (e.g., rats or mice) have LoH and LoL loci in their genome (e.g., their germline genome). In some embodiments, genetically modified rodents (e.g., rats or mice) have LoH, KoK, and LoL loci in their genome (e.g., their germline genome). In some embodiments, genetically modified rodents (e.g., rats or mice) are homozygous at the LoH locus, the KoK locus, the LoL locus, or a combination thereof.

[0282] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has LoH locus, KoK locus, and LiK locus in its genome (e.g., its germline genome).

[0283] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has LoH and LoK loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus, the LoK locus, or a combination thereof.

[0284] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has LoH and LiK loci in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus, the LiK locus, or a combination thereof.

[0285] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the LoH locus and the κULC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus, the κULC locus, or a combination thereof.

[0286] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the LoH locus and the κDLC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus, the κDLC locus, or a combination thereof.

[0287] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the LoH locus and the ULCiK locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus, the ULCiK locus, or a combination thereof.

[0288] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the LoH locus and the DLCiK locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus, the DLCiK locus, or a combination thereof.

[0289] In some embodiments, the genetically modified rodent (e.g., rat or mouse) has the LoH locus and the HULC locus in its genome (e.g., its germline genome). In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the LoH locus, the HULC locus, or a combination thereof.

[0290] Exemplary rodents containing the kappa universal light chain locus: In some exemplary embodiments of the present invention, a genetically modified non-human animal, such as a rodent (e.g., a mouse), whose genome includes an immunoglobulin locus having a restricted ability to generate a broad repertoire of variable regions can be conveniently used in methods that rely on analysis of the repertoire of unrestricted immunoglobulin chain sequences and on mass spectrometry. In some embodiments, the restricted immunoglobulin chain is a light chain, such as a kappa light chain. In some embodiments, the genetically modified rodent has in its genome (e.g., its germline genome) an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) (i.e., the HoH locus) comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments upstream of (e.g., operably linked to) one or more rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes (e.g., one or more endogenous rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes), and an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) (i.e., the κULC locus) comprising a single rearranged human κ light chain variable region (Vκ/Jκ) upstream of (e.g., operably linked to) a Cκ gene. Exemplary rodents containing the HoH locus and the κULC locus are described, for example, in U.S. Patents 10,130,081 and 10,143,186, each of which is incorporated herein by reference in its entirety. In some embodiments, the genetically modified rodent (e.g., a rat or mouse) is homozygous at the HoH locus and / or the κULC locus. In some embodiments, the genetically modified rodent (e.g., a rat or mouse) is homozygous at the HoH locus and the κULC locus.

[0291] In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least six human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 18 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 39 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 80 human VH gene segments. In some embodiments, the one or more unrearranged human DH gene segments at the HoH locus include at least 27 human DH gene segments. In some embodiments, the one or more unrearranged human JH gene segments at the HoH locus include at least six human JH gene segments.

[0292] In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 18 human VH gene segments, the one or more unrearranged human DH gene segments at the HoH locus include 27 human DH gene segments, and the one or more unrearranged human JH gene segments at the HoH locus include six human JH gene segments. Where discussed herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune®1 HoH locus". In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 39 human VH gene segments, the one or more unrearranged human DH gene segments at the HoH locus include 27 human DH gene segments, and the one or more unrearranged human JH gene segments at the HoH locus include six human JH gene segments. Where discussed herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune®2 HoH locus". In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 80 human VH gene segments, the one or more unrearranged human DH gene segments at the HoH locus include 27 human DH gene segments, and the one or more unrearranged human JH gene segments at the HoH locus include six human JH gene segments. Where discussed herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune®3 HoH locus".

[0293] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the HoH and κULC loci have a genome (e.g., germline genome) lacking a functional endogenous rodent Adam6 gene. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the HoH and κULC loci also include in their genome (e.g., germline genome) one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6b. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a and mouse ADAM6b. A rodent comprising the HoH locus and the κULC locus, and comprising one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, is described, for example, in U.S. Patent No. 10,130,081, which is incorporated herein by reference in its entirety. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) expresses one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) comprising one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, located on the same chromosome as the HoH locus. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) containing an HoH locus that includes one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) that includes one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, in place of the human Adam6 pseudogene. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) that includes one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, replacing the human Adam6 pseudogene.

[0294] In some embodiments, the genetically modified rodent containing the HoH locus and the κULC locus has a genome (e.g., germline genome) comprising one or more human VH gene segments, including a first and a second human VH gene segment, and one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, located between the first human VH gene segment and the second human VH gene segment. In some embodiments, the first human VH gene segment is VH1-2, and the second human VH gene segment is VH6-1. In some embodiments, the one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments are located between a human VH gene segment and a human DH gene segment.

[0295] In some embodiments, one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides restore or enhance the fertilization ability of male rodents.

[0296] In some embodiments, a single rearranged human κ light chain variable region at the κULC locus includes a human Vκ gene segment and a human Jκ gene segment. In some embodiments, the human Vκ gene segment is the human Vκ1-39 gene segment or the human Vκ3-20 gene segment. In some embodiments, the human Jκ gene segment is the human Jκ1 gene segment, the human Jκ2 gene segment, the human Jκ3 gene segment, the human Jκ4 gene segment, or the human Jκ5 gene segment. In some embodiments, the human Vκ gene segment is the human Vκ1-39 gene segment, and the human Jκ gene segment is the human Jκ5 gene segment. In some embodiments, a single rearranged human κ light chain variable region at the κULC locus is human Vκ1-39 / Jκ5. In some embodiments, the human Vκ gene segment is the human Vκ3-20 gene segment, and the human Jκ gene segment is the human Jκ1 gene segment. In some embodiments, the single rearranged human κ light chain variable region at the κULC locus is human Vκ3-20 / Jκ1.

[0297] In some embodiments, the κULC locus includes a non-natural leader sequence. In some embodiments, the leader sequence includes a signal peptide. In some embodiments, the leader sequence includes a non-natural signal peptide.

[0298] In some embodiments, the Cκ gene at the κULC locus is a rodent (e.g., rat or mouse) Cκ gene. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the κULC locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the κULC locus.

[0299] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κULC locus lack endogenous Vκ and / or Jκ gene segments that can be rearranged to form endogenous κ light chain variable regions. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the κULC locus lack endogenous Vλ and / or Jλ gene segments that can be rearranged to form endogenous λ light chain variable regions.

[0300] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the HoH locus and the κULC locus produce antibodies, for example, in response to antigen stimulation, comprising, in particular, (a) heavy chains, each heavy chain containing a human heavy chain variable domain operably linked to a rodent (e.g., rat or mouse) heavy chain constant domain, and (b) κ light chains, each κ light chain containing a human κ light chain variable domain operably linked to a κ light chain constant domain. In some embodiments, all κ light chains expressed by the genetically modified rodents (e.g., rats or mice) contain a human κ light chain variable domain expressed from the single rearranged human κ light chain variable region or a somatically hypermutated version thereof.

[0301] In some embodiments, a genetically modified rodent (e.g., a rat or mouse) containing a κULC locus with a single rearranged human κ variable region further includes the substitution of at least one non-histidine residue in its light chain variable region (e.g., its CDR3 region) with a histidine residue. Such a genetically modified rodent is described in U.S. Patent No. 9,801,362, which is incorporated herein by reference in its entirety. Immunizing genetically modified rodents containing the substitution of a non-histidine residue with a histidine residue, or the insertion of a histidine residue, facilitates the identification of antibodies exhibiting pH-dependent binding to an antigen using a combination of repertoire sequencing and MS methods as described herein and in the examples.

[0302] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin heavy chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent having a κULC locus in its germline genome, comprising: (i) obtaining a plurality of peptide sequences of human immunoglobulin heavy chain variable domains obtained from a sample comprising an antibody population produced by the genetically modified rodent immunized with the antigen; and (ii) matching a library of human immunoglobulin heavy chain variable domain sequences with the plurality of peptide sequences, wherein the library comprises a plurality of human immunoglobulin heavy chain variable domain sequences encoded by B cells of the immunized rodent.

[0303] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin heavy chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent having a κULC locus in its germline genome, comprising: (i) obtaining a library of human immunoglobulin heavy chain variable domain sequences comprising a plurality of human immunoglobulin heavy chain variable domain sequences encoded by B cells of a rodent immunized with the antigen; and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin heavy chain variable domains obtained from a sample comprising an antibody population produced by the rodent immunized with the antigen.
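The matching step recited in (ii) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure of the disclosure: matching is reduced to exact substring search, the function names `normalize` and `match_peptides_to_library` are hypothetical, the sequences are made-up toy strings, and isoleucine is folded into leucine on the assumption that the two residues, having the same mass, are not distinguished in the peptide data.

```python
from collections import defaultdict


def normalize(seq):
    """Upper-case a sequence and collapse isoleucine into leucine.

    Assumption: Leu and Ile have identical mass, so MS-derived peptides
    are compared with I and L treated as the same residue.
    """
    return seq.upper().replace("I", "L")


def match_peptides_to_library(peptides, library):
    """Return {sequence_id: [peptides found as exact substrings]}.

    peptides: iterable of peptide strings derived from MS data.
    library:  dict mapping a sequence ID to a heavy chain variable
              domain amino acid sequence derived from B cell sequencing.
    """
    hits = defaultdict(list)
    # Skip empty strings, which would trivially match every sequence.
    normalized_peptides = [(p, normalize(p)) for p in peptides if p]
    for seq_id, sequence in library.items():
        target = normalize(sequence)
        for original, norm in normalized_peptides:
            if norm in target:
                hits[seq_id].append(original)
    return dict(hits)


if __name__ == "__main__":
    # Toy, made-up sequences for illustration only.
    library = {
        "hc1": "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGK",
        "hc2": "QVQLVQSGAEVKKPGASVKVSCKASGYTFTSYGISWVRQAPGQ",
    }
    peptides = ["LSCAASGFTFSSYAMSWVR", "VSCK", "GLSWVR"]
    for seq_id, found in match_peptides_to_library(peptides, library).items():
        print(seq_id, found)
```

Library sequences that collect no peptide are simply absent from the result, so the returned dictionary lists only candidate heavy chain variable domains supported by at least one observed peptide.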

[0304] Exemplary rodents containing the lambda universal light chain locus: In other embodiments of the present invention, the method utilizes a restricted lambda light chain. In some embodiments, a genetically modified rodent has in its genome (e.g., its germline genome) an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) (i.e., the HoH locus) comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments upstream of (e.g., operably linked to) one or more rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes (e.g., one or more endogenous rodent (e.g., rat or mouse) immunoglobulin heavy chain constant region genes), and an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) (i.e., the ULCiK locus) comprising a limited human λ light chain variable region repertoire, wherein the limited repertoire includes a single rearranged human immunoglobulin λ light chain variable region (Vλ/Jλ) upstream of (e.g., operably linked to) a light chain constant region gene. Rodents comprising the HoH and ULCiK loci are described, for example, in WO2020/247623, which is incorporated herein by reference in its entirety. In some embodiments, the genetically modified rodent (e.g., a rat or mouse) is homozygous at the HoH and / or ULCiK loci. In some embodiments, the genetically modified rodent (e.g., a rat or mouse) is homozygous at the HoH and ULCiK loci.

[0305] In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least six human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 18 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 39 human VH gene segments. In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 80 human VH gene segments. In some embodiments, the one or more unrearranged human DH gene segments at the HoH locus include at least 27 human DH gene segments. In some embodiments, the one or more unrearranged human JH gene segments at the HoH locus include at least six human JH gene segments.

[0306] In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 18 human VH gene segments, the one or more unrearranged human DH gene segments at the HoH locus include 27 human DH gene segments, and the one or more unrearranged human JH gene segments at the HoH locus include six human JH gene segments. Where discussed herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune®1 HoH locus". In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 39 human VH gene segments, the one or more unrearranged human DH gene segments at the HoH locus include 27 human DH gene segments, and the one or more unrearranged human JH gene segments at the HoH locus include six human JH gene segments. Where discussed herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune®2 HoH locus". In some embodiments, the one or more unrearranged human VH gene segments at the HoH locus include at least 80 human VH gene segments, the one or more unrearranged human DH gene segments at the HoH locus include 27 human DH gene segments, and the one or more unrearranged human JH gene segments at the HoH locus include six human JH gene segments. Where discussed herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune®3 HoH locus".

[0307] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the HoH and ULCiK loci have a genome (e.g., germline genome) lacking a functional endogenous rodent Adam6 gene. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the HoH and ULCiK loci also include in their genome (e.g., germline genome) one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6b. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a and mouse ADAM6b. A rodent comprising the HoH locus and the ULCiK locus, and comprising one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, is described, for example, in U.S. Patent No. 10,130,081, which is incorporated herein by reference in its entirety. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) expresses one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) comprising one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, located on the same chromosome as the HoH locus. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) containing an HoH locus that includes one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) that includes one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, in place of the human Adam6 pseudogene. In some embodiments, the provided genetically modified rodent (e.g., rat or mouse) has a genome (e.g., germline genome) that includes one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, replacing the human Adam6 pseudogene.

[0308] In some embodiments, the genetically modified rodent containing the HoH and ULCiK loci has a genome (e.g., germline genome) comprising one or more human VH gene segments, including a first and a second human VH gene segment, and one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, located between the first human VH gene segment and the second human VH gene segment. In some embodiments, the first human VH gene segment is VH1-2, and the second human VH gene segment is VH6-1. In some embodiments, the one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments are located between a human VH gene segment and a human DH gene segment.

[0309] In some embodiments, one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides restore or enhance the fertilization ability of male rodents.

[0310] In some embodiments, a single rearranged human λ light chain variable region at the ULC locus includes a human Vλ gene segment and a human Jλ gene segment. In some embodiments, the human Vλ gene segment is selected from the group consisting of Vλ4-69, Vλ8-61, Vλ4-60, Vλ6-57, Vλ10-54, Vλ5-52, Vλ1-51, Vλ9-49, Vλ1-47, Vλ7-46, Vλ5-45, Vλ1-44, Vλ7-43, Vλ1-40, Vλ5-37, Vλ1-36, Vλ3-27, Vλ3-25, Vλ2-23, Vλ3-22, Vλ3-21, Vλ3-19, Vλ2-18, Vλ3-16, Vλ2-14, Vλ3-12, Vλ2-11, Vλ3-10, Vλ3-9, Vλ2-8, Vλ4-3, and Vλ3-1. In some embodiments, the human Vλ gene segment is selected from the group consisting of Vλ5-52, Vλ1-51, Vλ9-49, Vλ1-47, Vλ7-46, Vλ5-45, Vλ1-44, Vλ7-43, Vλ1-40, Vλ5-37, Vλ1-36, Vλ3-27, Vλ3-25, Vλ2-23, Vλ3-22, Vλ3-21, Vλ3-19, Vλ2-18, Vλ3-16, Vλ2-14, Vλ3-12, Vλ2-11, Vλ3-10, Vλ3-9, Vλ2-8, Vλ4-3, and Vλ3-1. In some embodiments, the human Vλ gene segment is selected from the group consisting of Vλ1-51, Vλ5-45, Vλ1-44, Vλ1-40, Vλ3-21, and Vλ2-14. In some embodiments, the human Vλ gene segment is Vλ1-51 or Vλ2-14. In some embodiments, the human Jλ gene segment is selected from the group consisting of Jλ1, Jλ2, Jλ3, Jλ6, and Jλ7. In some embodiments, the human Jλ gene segment is selected from the group consisting of Jλ1, Jλ2, Jλ3, and Jλ7. In some embodiments, the human Jλ gene segment is Jλ2.

[0311] In some embodiments, the ULC locus includes a non-natural leader sequence. In some embodiments, the ULC locus includes a single rearranged human λ light chain variable region and a Vκ leader sequence. In some embodiments, the leader sequence includes a signal peptide. In some embodiments, the leader sequence includes a non-natural signal peptide.

[0312] In some embodiments, the genetically modified rodents include a limited repertoire of human λ light chain variable regions operably linked to a rodent (e.g., rat or mouse) Cκ or Cλ gene segment (e.g., mouse Cλ1 gene segment).

[0313] In some embodiments, the human Vλ gene segment is Vλ1-51, the human Jλ gene segment is Jλ2, and the light chain constant region gene is rodent Cλ (e.g., mouse Cλ1). In some embodiments, the human Vλ gene segment is Vλ1-51, the human Jλ gene segment is Jλ2, and the light chain constant region gene is rodent Cκ. In some embodiments, the human Vλ gene segment is Vλ2-14, the human Jλ gene segment is Jλ2, and the light chain constant region gene is rodent Cλ (e.g., mouse Cλ1). In some embodiments, the human Vλ gene segment is Vλ2-14, the human Jλ gene segment is Jλ2, and the light chain constant region gene is rodent Cκ.

[0314] In some embodiments, genetically modified rodents (e.g., rats or mice) are homozygous at the ULCiK locus. In some embodiments, genetically modified rodents (e.g., rats or mice) are heterozygous at the ULCiK locus.

[0315] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the ULCiK locus lack endogenous Vκ and / or Jκ gene segments that can be rearranged to form endogenous κ light chain variable regions. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the ULCiK locus lack endogenous Vλ and / or Jλ gene segments that can be rearranged to form endogenous λ light chain variable regions.

[0316] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the HoH and ULC loci produce antibodies, for example, in response to antigen stimulation, comprising, in particular, (a) heavy chains, each containing a human heavy chain variable domain operably linked to a rodent (e.g., rat or mouse) heavy chain constant domain, and (b) light chains, each containing a human λ light chain variable domain operably linked to a rodent (e.g., rat or mouse) light chain constant domain (e.g., a Cλ or Cκ domain). In some embodiments, all light chains expressed by B cells of genetically modified rodents (e.g., rats or mice) containing the ULCiK locus contain human λ light chain variable domains expressed from a single rearranged human λ light chain variable region or a somatically hypermutated version thereof.

[0317] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin heavy chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent having the ULCiK locus in its germline genome, comprising: (i) obtaining a plurality of peptide sequences of human immunoglobulin heavy chain variable domains obtained from a sample comprising an antibody population produced by genetically modified rodents immunized with the antigen; and (ii) matching a library of human immunoglobulin heavy chain variable domain sequences with the plurality of peptide sequences, wherein the library comprises a plurality of human immunoglobulin heavy chain variable domain sequences encoded by B cells of the immunized rodent.

[0318] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin heavy chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent having the ULCiK locus in its germline genome, comprising: (i) obtaining a library of human immunoglobulin heavy chain variable domain sequences comprising a plurality of human immunoglobulin heavy chain variable domain sequences encoded by B cells of a rodent immunized with the antigen; and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin heavy chain variable domains obtained from a sample comprising an antibody population produced by the rodent immunized with the antigen.
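Because the methods above refer to CDR sequences (e.g., CDR3 sequences), it can help to see how a peptide might be tested for overlap with the CDR3 of a library sequence. The following Python sketch is illustrative only: it assumes the CDR3 boundaries are already known from a separate annotation step, represents the CDR3 as a substring of the variable domain, and uses made-up toy sequences. The function names `find_all` and `peptides_overlapping_cdr3` are hypothetical.

```python
def find_all(haystack, needle):
    """Yield every start index at which needle occurs in haystack."""
    start = haystack.find(needle)
    while start != -1:
        yield start
        start = haystack.find(needle, start + 1)


def peptides_overlapping_cdr3(sequence, cdr3, peptides):
    """Return the peptides whose match in sequence overlaps the CDR3.

    sequence: heavy chain variable domain amino acid sequence.
    cdr3:     CDR3 amino acid string, assumed to occur in sequence.
    peptides: iterable of peptide strings derived from MS data.
    """
    cdr3_start = sequence.find(cdr3)
    if cdr3_start == -1:
        raise ValueError("CDR3 not found in sequence")
    cdr3_end = cdr3_start + len(cdr3)  # exclusive end index

    overlapping = []
    for peptide in peptides:
        if not peptide:
            continue  # an empty string carries no information
        for start in find_all(sequence, peptide):
            end = start + len(peptide)
            # Two half-open intervals overlap when each one starts
            # before the other one ends.
            if start < cdr3_end and end > cdr3_start:
                overlapping.append(peptide)
                break
    return overlapping


if __name__ == "__main__":
    # Toy, made-up sequence fragment for illustration only.
    sequence = "AAAYYCARDGGWFDYWGQGTLV"
    cdr3 = "ARDGGWFDY"
    peptides = ["YYCAR", "WGQGTLV", "AAA", "DGGW"]
    print(peptides_overlapping_cdr3(sequence, cdr3, peptides))
```

Peptides that match only outside the CDR3 are excluded; such peptides tend to be shared by many library sequences, whereas a peptide spanning the CDR3 is more likely to single out one sequence.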

[0319] Exemplary rodents containing the universal heavy chain locus: In other embodiments, the restricted immunoglobulin chain of the mouse used in the method described herein is a heavy chain. In some embodiments, the genetically modified rodent includes in its genome (e.g., its germline genome) an engineered immunoglobulin heavy chain locus (e.g., an engineered endogenous rodent immunoglobulin heavy chain locus) (i.e., a UHC locus or common heavy chain locus) comprising a single rearranged human heavy chain variable region upstream of (e.g., operably linked to) one or more rodent (e.g., rat or mouse) constant region genes, and an engineered immunoglobulin κ light chain locus (e.g., an engineered endogenous rodent immunoglobulin κ light chain locus) (i.e., a KoK locus) comprising one or more unrearranged human Vκ gene segments and one or more unrearranged human Jκ gene segments upstream of (e.g., operably linked to) the Cκ gene. In some embodiments, genetically modified rodents (e.g., rats or mice) are homozygous at the UHC locus and / or the KoK locus.

[0320] In some embodiments, the single rearranged human heavy chain variable region at the UHC locus includes a single human VH gene segment, a single human DH gene segment, and a single human JH gene segment. In some embodiments, the single human VH gene segment is human VH3-23, the single human DH gene segment is human DH4-4, and the single human JH gene segment is human JH4.

[0321] In some embodiments, the single rearranged human heavy chain variable region at the UHC locus includes a single human VH gene segment and a single human JH gene segment, which are separated by two amino acids. In some embodiments, the single human VH gene segment is human VH3-23, the single human JH gene segment is human JH4, and the two amino acids are glycine and tyrosine.

[0322] In some embodiments, one or more rodent (e.g., mouse or rat) heavy chain constant region genes located at the UHC locus are one or more endogenous rodent (e.g., mouse or rat) heavy chain constant region genes.

[0323] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the UHC and KoK loci lack a functional endogenous rodent Adam6 gene. In some embodiments, genetically modified rodents (e.g., rats or mice) containing the UHC and KoK loci contain one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6b. In some embodiments, the one or more rodent ADAM6 polypeptides are or include mouse ADAM6a and mouse ADAM6b. In some embodiments, the provided genetically modified rodents (e.g., rats or mice) express one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the provided genetically modified rodents (e.g., rats or mice) have a genome (e.g., germline genome) containing one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, located on the same chromosome as the UHC locus. In some embodiments, the provided genetically modified rodents (e.g., rats or mice) have a genome (e.g., germline genome) containing a UHC locus that includes one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments. In some embodiments, the provided genetically modified rodents (e.g., rats or mice) have a genome (e.g., germline genome) comprising one or more nucleotide sequences encoding one or more rodent (e.g., rat or mouse) ADAM6 polypeptides, their functional orthologues, functional homologs, or functional fragments, replacing the human Adam6 pseudogene.

[0324] In some embodiments, one or more nucleotide sequences encoding one or more rodent ADAM6 polypeptides restore or enhance the fertilization ability of male rodents.

[0325] In some embodiments, the one or more unrearranged human Vκ gene segments at the KoK locus include at least six human Vκ gene segments. In some embodiments, the one or more unrearranged human Vκ gene segments at the KoK locus include at least sixteen human Vκ gene segments. In some embodiments, the one or more unrearranged human Vκ gene segments at the KoK locus include at least thirty human Vκ gene segments. In some embodiments, the one or more unrearranged human Vκ gene segments at the KoK locus include at least forty human Vκ gene segments. In some embodiments, the one or more unrearranged human Jκ gene segments at the KoK locus include at least five human Jκ gene segments.

[0326] In some embodiments, the one or more unrearranged human Vκ gene segments at the KoK locus include at least 16 human Vκ gene segments, and the one or more unrearranged human Jκ gene segments at the KoK locus include at least 5 human Jκ gene segments. As described herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune® 1 KoK locus". In some embodiments, the one or more unrearranged human Vκ gene segments at the KoK locus include at least 30 human Vκ gene segments, and the one or more unrearranged human Jκ gene segments at the KoK locus include at least 5 human Jκ gene segments. As described herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune® 2 KoK locus". In some embodiments, the one or more unrearranged human Vκ gene segments at the KoK locus include at least 40 human Vκ gene segments, and the one or more unrearranged human Jκ gene segments at the KoK locus include at least 5 human Jκ gene segments. As described herein, such an engineered immunoglobulin heavy chain locus is referred to as the "VelocImmune® 3 KoK locus".

[0327] In some embodiments, the immunoglobulin κ light chain constant region gene at the KoK locus is a rodent (e.g., rat or mouse) Cκ gene. In some embodiments, the immunoglobulin κ light chain constant region gene at the KoK locus is an endogenous rodent (e.g., rat or mouse) Cκ gene. In some embodiments, the immunoglobulin κ light chain constant region gene at the KoK locus is an endogenous rodent (e.g., rat or mouse) Cκ gene located at the endogenous immunoglobulin κ light chain locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is homozygous at the KoK locus. In some embodiments, the genetically modified rodent (e.g., rat or mouse) is heterozygous at the KoK locus.

[0328] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the UHC and KoK loci produce antibodies, for example, in response to antigen stimulation, comprising, in particular, (a) heavy chains, each containing a human heavy chain variable domain operably linked to a rodent (e.g., rat or mouse) heavy chain constant domain, and (b) κ light chains, each containing a human κ light chain variable domain operably linked to a rodent (e.g., rat or mouse) κ light chain constant domain. In some embodiments, all heavy chains expressed by the genetically modified rodents (e.g., rats or mice) contain human heavy chain variable domains expressed from a single rearranged human heavy chain variable region or a somatic hypermutant version thereof.

[0329] In some embodiments, genetically modified rodents (e.g., rats or mice) containing the UHC and KoK loci also contain the exogenous terminal deoxynucleotidyltransferase (TdT) gene. In some embodiments, rodents containing the exogenous terminal deoxynucleotidyltransferase (TdT) gene (e.g., rats or mice) can increase antigen receptor diversity compared to rodents that do not contain the exogenous TdT gene.

[0330] In some embodiments, the rodents described herein have a genome comprising an exogenous terminal deoxynucleotidyltransferase (TdT) gene operably linked to a transcriptional regulatory element.

[0331] In some embodiments, the transcriptional regulatory element includes a RAG1 transcriptional regulatory element, a RAG2 transcriptional regulatory element, an immunoglobulin heavy chain transcriptional regulatory element, an immunoglobulin κ light chain transcriptional regulatory element, an immunoglobulin λ light chain transcriptional regulatory element, or any combination thereof.

[0332] In some embodiments, exogenous TdT is located at the immunoglobulin κ light chain locus, immunoglobulin λ light chain locus, immunoglobulin heavy chain locus, RAG1 locus, or RAG2 locus.

[0333] In some embodiments, TdT is human TdT. In some embodiments, TdT is a short isoform of TdT (TdTS).

[0334] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin light chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent having a UHC locus in its germline genome, comprising: (i) obtaining a plurality of peptide sequences of human immunoglobulin light chain variable domains obtained from a sample comprising an antibody population produced by genetically modified rodents immunized with the antigen; and (ii) matching a library of human immunoglobulin light chain variable domain sequences with the plurality of peptide sequences, wherein the library comprises a plurality of human immunoglobulin light chain variable domain sequences encoded by B cells of the immunized rodent.

[0335] In some embodiments, the present disclosure provides a method for identifying human immunoglobulin light chain variable domains or CDR sequences (e.g., CDR3 sequences) of antibodies specific to an antigen from a rodent having a UHC locus in its germline genome, comprising: (i) obtaining a library of human immunoglobulin light chain variable domain sequences comprising a plurality of human immunoglobulin light chain variable domain sequences encoded by B cells of a rodent immunized with the antigen; and (ii) matching the library with a plurality of peptide sequences of human immunoglobulin light chain variable domains obtained from a sample comprising an antibody population produced by a rodent immunized with the antigen.
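The matching step in (ii) can be pictured with a minimal Python sketch. It assumes, purely for illustration, that the library is a mapping from a clone identifier to a light chain variable domain amino acid sequence and that matching means exact substring containment; the function name, identifiers, and toy sequences are hypothetical and are not taken from this disclosure.

```python
from typing import Dict, List


def match_peptides_to_library(
    library: Dict[str, str], peptides: List[str]
) -> Dict[str, List[str]]:
    """Return, for each library entry, the peptides found inside its sequence.

    Assumption: a peptide "matches" a library sequence when it occurs in it
    as an exact substring. Entries with no matching peptide are omitted.
    """
    hits: Dict[str, List[str]] = {}
    for seq_id, domain in library.items():
        found = [p for p in peptides if p in domain]
        if found:
            hits[seq_id] = found
    return hits


# Toy data (hypothetical): two library entries and three observed peptides.
library = {
    "LC_clone_1": "DIQMTQSPSSLSASVGDRVTITCRASQSISSYLNWYQQKPGK",
    "LC_clone_2": "EIVLTQSPGTLSLSPGERATLSCRASQSVSSSYLAWYQQKPGQ",
}
peptides = ["VTITCR", "ASQSISSYLNWYQQKPGK", "ATLSCR"]

hits = match_peptides_to_library(library, peptides)
for seq_id, found in hits.items():
    print(seq_id, found)
```

Exact containment is the simplest possible criterion; a practical pipeline would likely also need to handle ambiguous residues and scoring, which this sketch does not attempt.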

[0336] Generated antigen-specific antibodies. After identifying a target antibody (e.g., a target variable domain and / or target CDR sequence(s)) from a genetically modified non-human animal (e.g., a rodent, e.g., a rat or mouse) using the method herein, the method may further include expressing the nucleotide sequence encoding the obtained antibody (i.e., the first antibody) or a portion thereof (e.g., the variable region) in an antigen-binding protein or a second recombinant antibody. In some embodiments, the antibody sequence identified by the method herein is subsequently expressed in host cells. In some embodiments, the variable region sequence of the antibody identified herein is cloned into a second recombinant antibody expressed in host cells. Various embodiments of the second recombinant antibody are described later herein. In various embodiments, the antibody obtained by the method herein is further tested to confirm binding to the antigen immunogen or to determine the antibody's kinetic binding parameters. In some embodiments, the supernatant or purified protein from cells expressing (e.g., transfected with) the second antibody obtained by the method herein is screened in various assays to determine the binding affinity and / or specificity to the antigen. Various assays that can be used include those described in the examples above, and others that will be obvious to those skilled in the art. In various embodiments, the antibody binds specifically to the antigen of interest or to an epitope on the antigen of interest (e.g., with a K_D in the micromolar, nanomolar, or picomolar range).
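As a small worked illustration of the kinetic binding parameters mentioned above, the equilibrium dissociation constant can be computed from the standard relation K_D = k_off / k_on and then assigned to a concentration range. The Python sketch below assumes rate constants in the usual units (k_off in 1/s, k_on in 1/(M·s)); the function names and example values are hypothetical and do not come from this disclosure.

```python
def dissociation_constant(k_off: float, k_on: float) -> float:
    """Equilibrium dissociation constant K_D (molar) from kinetic rate constants."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def affinity_range(kd_molar: float) -> str:
    """Name the concentration range in which a K_D value (in molar) falls."""
    if 1e-12 <= kd_molar < 1e-9:
        return "picomolar"
    if 1e-9 <= kd_molar < 1e-6:
        return "nanomolar"
    if 1e-6 <= kd_molar < 1e-3:
        return "micromolar"
    return "outside the micromolar-to-picomolar ranges"


# Hypothetical example: k_off = 5e-4 1/s and k_on = 1e5 1/(M*s).
kd = dissociation_constant(5e-4, 1e5)
print(f"K_D = {kd:.2e} M ({affinity_range(kd)})")
```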

[0337] In some embodiments, the nucleotide sequence encoding the resulting antibody is from an immunized host (e.g., a genetically modified non-human animal, e.g., a rodent, e.g., a mouse or a rat), and this host has a restricted repertoire of heavy and / or light chain variable regions in its genome (e.g., its germline genome). In some embodiments, the nucleotide sequence encoding the heavy chain variable domain is obtained from an immunized host (e.g., a genetically modified non-human animal, e.g., a rodent, e.g., a mouse or a rat), and this host has a restricted repertoire of immunoglobulin light chain variable regions in its genome (e.g., its germline genome). In some embodiments, the nucleotide sequence encoding the light chain variable domain is obtained from an immunized host (e.g., a genetically modified non-human animal, e.g., a rodent, e.g., a mouse or a rat), and this host has a restricted repertoire of immunoglobulin heavy chain variable regions in its genome (e.g., its germline genome).

[0338] In some embodiments, the nucleotide sequence encoding the heavy chain variable domain is obtained from an immunized rodent (e.g., mouse) whose genome (e.g., its germline genome) contains a single rearranged human light chain variable region comprising a single light chain V gene segment and a single light chain J gene segment, for example, a single human light chain Vκ gene segment and a single human light chain Jκ gene segment, or a single human light chain Vλ gene segment and a single human light chain Jλ gene segment (for rodents containing the ULC locus, see, for example, U.S. Patents 10,143,186 and 10,130,081, which are incorporated herein by reference in their entirety). Thus, immunizing such a rodent (e.g., mouse) with the antigen of interest allows for the analysis of the heavy chain variable region (e.g., heavy chain CDR3) sequence of an antibody targeting the antigen of interest, and the selection of the heavy chain variable region sequence, by the method described herein.

[0339] In some embodiments, the nucleotide sequences encoding antibodies obtained from an immunized host (e.g., genetically modified non-human animals, e.g., rodents, e.g., mice or rats) are codon-optimized. In some embodiments, the nucleotide sequences encoding the obtained heavy chain and / or light chain variable domains are codon-optimized. In some embodiments, the nucleotide sequences encoding one or more obtained CDR sequences are codon-optimized.
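Codon optimization can be sketched, in its simplest form, as re-encoding an amino acid sequence with one preferred codon per residue. The Python example below is only an illustration under stated assumptions: the `PREFERRED_CODON` table is a hypothetical choice of one valid codon per amino acid, not a codon usage table from this disclosure, and practical optimization would normally use host-specific usage frequencies and additional sequence constraints.

```python
# Hypothetical "preferred codon" table: one valid standard-code codon per
# amino acid, chosen for illustration only (an assumption, not source data).
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "AGA",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}


def codon_optimize(protein: str) -> str:
    """Re-encode an amino acid sequence using one preferred codon per residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]!r}") from None


# Toy CDR-like fragment (hypothetical).
optimized = codon_optimize("CARDY")
print(optimized)
```

The encoded protein is unchanged by construction; only the choice among synonymous codons differs.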

[0340] In some embodiments, the resulting nucleotide sequence encoding a human immunoglobulin variable domain (e.g., heavy chain and / or light chain variable region) is inserted into a construct for the expression of an antigen-binding protein. In some embodiments, the antigen-binding protein is an antibody.

[0341] In some embodiments, the obtained nucleotide sequence encoding a human immunoglobulin variable domain is operably linked to a nucleotide sequence encoding a human immunoglobulin constant domain and inserted into a construct such that the antibody is expressed as a fully human antibody having the human variable domain upstream of the human constant domain. Thus, in some embodiments, the method further comprises, after obtaining a nucleotide sequence encoding a human immunoglobulin heavy chain variable domain and / or a human immunoglobulin light chain variable domain as described herein, (i) joining or ligating the nucleotide sequence encoding the human immunoglobulin heavy chain variable domain with a nucleotide sequence encoding a human immunoglobulin heavy chain constant domain to form a human immunoglobulin heavy chain sequence encoding a fully human immunoglobulin heavy chain, and / or (ii) joining or ligating a nucleotide sequence encoding a human immunoglobulin light chain variable domain (e.g., human immunoglobulin κ and / or λ light chain variable domain) with a nucleotide sequence encoding a human immunoglobulin light chain constant domain (e.g., human immunoglobulin κ and / or λ light chain constant domain) to form a human immunoglobulin κ and / or λ light chain sequence encoding a fully human immunoglobulin κ and / or λ light chain. In certain embodiments, the human immunoglobulin heavy chain sequences and human immunoglobulin κ and / or λ light chain sequences are expressed in cells (e.g., host cells, mammalian cells) such that fully human immunoglobulin heavy chains and fully human immunoglobulin κ and / or λ light chains are expressed to form human antibodies. In some embodiments, the human antibodies are isolated from the cells or from culture media containing the cells.

[0342] In some embodiments, the antigen-binding protein (e.g., the second antibody) is a human antibody and / or a bispecific antibody. The term “bispecific antibody” includes antibodies that can selectively bind to two or more epitopes. A bispecific antibody generally comprises two non-identical heavy chains, each heavy chain specifically binding a different epitope, either on two different molecules (e.g., different epitopes on two different immunogens) or on the same molecule (e.g., different epitopes on the same immunogen). If a bispecific antibody can selectively bind to two different epitopes (a first epitope and a second epitope), the affinity of the first heavy chain to the first epitope is generally at least one to two or three or four orders of magnitude lower than the affinity of the first heavy chain to the second epitope, and vice versa. The epitopes specifically bound by a bispecific antibody can be on the same or different targets (e.g., on the same or different proteins). Bispecific antibodies can be produced, for example, by combining heavy chains that recognize different epitopes of the same immunogen. For instance, nucleic acid sequences encoding heavy chain variable sequences that recognize different epitopes of the same immunogen can be fused to nucleic acid sequences encoding the same or different heavy chain constant regions, and such sequences can be expressed in a cell that expresses an immunoglobulin light chain. A typical bispecific antibody has two heavy chains, each having three heavy chain CDRs, followed (from N-terminus to C-terminus) by a CH1 domain, hinge, CH2 domain, and CH3 domain, as well as an immunoglobulin light chain that either does not confer epitope binding specificity but can associate with each heavy chain, or can associate with each heavy chain and can bind to one or more of the epitopes bound by the heavy chain epitope-binding regions, or can associate with each heavy chain and enable one or both heavy chains to bind to one or both epitopes.

[0343] For example, if the antigen-binding protein (e.g., a second antibody) is a bispecific antibody, in some embodiments, the bispecific antibody is produced by immunizing a genetically modified non-human animal, e.g., a rodent, e.g., a mouse or rat, that contains a limited repertoire of heavy and / or light chain variable regions in its genome (e.g., its germline genome). In some embodiments, the non-human animal is a mouse that contains in its genome (e.g., its germline genome) a single rearranged human light chain variable region comprising a single light chain V gene segment and a single light chain J gene segment, e.g., a single human light chain Vκ gene segment and a single human light chain Jκ gene segment, or a single human light chain Vλ gene segment and a single human light chain Jλ gene segment (for rodents containing the ULC locus, see, for example, U.S. Patents 10,143,186 and 10,130,081, which are incorporated herein by reference in their entirety). Therefore, by immunizing such a mouse with the first target antigen, the method described herein allows for the analysis of the heavy chain variable region (e.g., heavy chain CDR3) sequence of an antibody targeting the first target antigen, and the selection of the first heavy chain variable region sequence for use in a bispecific antibody. The method is repeated by immunizing, with a second target antigen, a second mouse that similarly contains a single rearranged human light chain variable region including a single light chain V gene segment and a single light chain J gene segment (e.g., the same light chain V and J gene segments present in the first mouse), and by obtaining from the second mouse, using the method described herein, a second heavy chain variable region directed to the second target antigen. Alternatively, the second heavy chain variable region sequence can be obtained using methods known in the art (e.g., hybridoma techniques described in U.S. Patents 10,143,186 and 10,130,081, which are incorporated herein by reference in their entirety) or other methods. To generate bispecific antibodies, the first and second heavy chain variable regions are expressed in the first and second heavy chains (e.g., the first and second human heavy chains) together with the same light chain present in the first and second mice, or a somatically mutated version thereof.

[0344] In some embodiments, for example, when the antigen-binding protein (e.g., a second antibody) is a bispecific antibody, the resulting nucleotide sequence encoding a human immunoglobulin variable domain, for example, a human immunoglobulin heavy chain variable domain, is operably linked to a human heavy chain immunoglobulin constant region and inserted into the construct, wherein the Fc domain of the heavy chain includes modifications to promote the formation of a heavy chain heterodimer and / or inhibit the formation of a heavy chain homodimer. Such modifications are provided, for example, in U.S. Patents 5,731,168, 5,807,706, 5,821,333, 7,642,228, and 8,679,785, and U.S. Patent Publication 2013 / 0195849 (each of which is incorporated herein by reference). In yet another embodiment, for example, if the second antibody is a bispecific antibody, the resulting nucleotide sequence encoding a human immunoglobulin variable domain, for example, a human immunoglobulin heavy chain variable domain, is operably linked to a human heavy chain immunoglobulin constant region (e.g., a human IgG constant region) and inserted into the construct, in which case one of the heavy chains of the bispecific antibody is modified to remove the protein A binding determinant, resulting in a difference in affinity between the homodimeric antigen-binding protein and the heterodimeric antigen-binding protein. Thus, one immunoglobulin heavy chain of the bispecific antibody contains a first CH3 region of human IgG selected from IgG1, IgG2, and IgG4, which binds to protein A; the second immunoglobulin heavy chain contains a second CH3 region of human IgG selected from IgG1, IgG2, and IgG4, which includes modifications to reduce or eliminate the binding of the second CH3 region to protein A; and the other immunoglobulin light chain of the bispecific antibody pairs with both immunoglobulin heavy chains. Compositions and methods addressing this issue are described in U.S. Patent No. 9,309,326 (which is incorporated herein by reference in its entirety).

[0345] In some embodiments, the nucleotide sequences encoding the human variable domain obtained by the method described herein are operably linked to a nucleotide sequence encoding a human immunoglobulin constant region and expressed in a cell line so as to produce a fully human antibody. In some embodiments, the cell expressing the fully human antibody is any cell suitable for expressing a recombinant nucleic acid sequence. Cells include those of prokaryotes and eukaryotes (unicellular or multicellular), bacterial cells (e.g., strains of E. coli, Bacillus spp., Streptomyces spp.), Mycobacterium cells, fungal cells, yeast cells (e.g., S. cerevisiae, S. pombe, P. pastoris, P. methanolica, etc.), plant cells, insect cells (e.g., SF-9, SF-21, baculovirus-infected insect cells, Trichoplusia ni, etc.), non-human animal cells, human cells, or cell fusions such as hybridomas or quadromas. In some embodiments, the cells are human, monkey, ape, hamster, rat, or mouse cells. In some embodiments, the cells are eukaryotic and selected from the following cells: CHO (e.g., CHO K1, DXB-11 CHO, Veggie-CHO), COS (e.g., COS-7), retinal cells, Vero, CV1, kidney cells (e.g., HEK293, 293 EBNA, MSR 293, MDCK, HaK, BHK), HeLa, HepG2, WI38, MRC 5, Colo205, HB 8065, HL-60, (e.g., BHK21), Jurkat, Daudi, A431 (epidermal), CV-1, U937, 3T3, L cells, C127 cells, SP2 / 0, NS-0, MMT 060562, Sertoli cells, BRL 3A cells, HT1080 cells, myeloma cells, tumor cells, and cell lines derived from the aforementioned cells. In some embodiments, the cells include one or more viral genes, for example, retinal cells expressing viral genes (e.g., PER.C6® cells).

[0346] Mammalian host cells used to produce antibodies can be cultured in a variety of media. Commercial media such as Ham's F10 (Sigma), Minimum Essential Medium (MEM; Sigma), RPMI-1640 (Sigma), and Dulbecco's Modified Eagle Medium (DMEM; Sigma) are suitable for culturing host cells. The media may be supplemented as needed with hormones and / or other growth factors (such as insulin, transferrin, or epidermal growth factor), salts (such as sodium chloride, calcium, magnesium, and phosphates), buffers (such as HEPES), nucleosides (such as adenosine and thymidine), antibiotics (e.g., gentamicin), trace elements (defined as inorganic compounds that are normally present at final concentrations in the micromolar range), and glucose or equivalent energy sources. As is known to those skilled in the art, any other supplements may also be included in appropriate concentrations. Culture conditions such as temperature and pH are, in various embodiments, those previously used with the host cells selected for expression and will be apparent to those skilled in the art.

[0347] Method for producing antigen-binding proteins and nucleic acid sequences encoding them. This disclosure includes methods for obtaining the amino acid and / or nucleotide sequences of the light and / or heavy chains of an antibody from a host immunized with the antigen of interest (i.e., a genetically modified host as described herein).

[0348] In some embodiments, the method includes obtaining a nucleotide sequence encoding the human immunoglobulin variable domain of a first antibody specific to the antigen, which includes obtaining a first sample from an immunized host containing multiple nucleic acid sequences encoding multiple immunoglobulin variable domains and determining the amino acid sequences of the multiple immunoglobulin variable domains; obtaining a second sample from an immunized host containing an antibody population targeting the antigen of interest and determining the peptide sequences of the heavy and / or light chain variable domains of the antibody population therefrom; comparing the amino acid sequences of the multiple immunoglobulin variable domains with the peptide sequences of the heavy and / or light chain variable domains of the antibody population to obtain a sequence of the human variable domain of an antibody specific to the antigen; and obtaining a nucleotide sequence encoding the human immunoglobulin variable domain of an antibody specific to the antigen. In some embodiments, the method further includes utilizing the obtained nucleotide sequence encoding the human immunoglobulin variable domain in an antigen-binding protein (e.g., a second antibody). In some embodiments, the nucleotide sequence encoding the human immunoglobulin variable domain in the antigen-binding protein is codon-optimized.

[0349] In some embodiments, a method is provided herein for obtaining a nucleotide sequence encoding a human immunoglobulin variable domain of an antigen-specific antibody, comprising: obtaining a plurality of nucleic acid sequences encoding a plurality of immunoglobulin variable domains from a first sample from a host immunized with the antigen, and determining the amino acid sequences of the encoded plurality of immunoglobulin variable domains; obtaining a second sample from an immunized host containing an antibody population targeting the antigen of interest, and determining the peptide sequences of the heavy chain and / or light chain variable domains therefrom; comparing the amino acid sequences of the encoded plurality of immunoglobulin variable domains with the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population, thereby obtaining a human immunoglobulin variable domain of an antigen-specific antibody; and obtaining a nucleotide sequence encoding a human immunoglobulin variable domain of an antibody specific to the antigen.

[0350] In some embodiments, a method is provided herein for obtaining a nucleotide sequence encoding a human immunoglobulin variable domain CDR (e.g., CDR3) sequence of an antigen-specific antibody, comprising: obtaining a plurality of nucleic acid sequences encoding a plurality of immunoglobulin variable domains from a first sample from a host immunized with the antigen, and determining the amino acid sequences of the encoded plurality of immunoglobulin variable domains; obtaining a second sample from an immunized host containing an antibody population targeting the antigen of interest, and determining the peptide sequences of the heavy chain and / or light chain variable domains therefrom; and comparing the amino acid sequences of the plurality of human immunoglobulin variable domains with the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population from the second sample, thereby obtaining a human immunoglobulin variable domain CDR (e.g., CDR3) sequence of an antigen-specific antibody, and a nucleotide sequence encoding a human immunoglobulin variable domain CDR (e.g., CDR3) sequence of an antigen-specific antibody.

[0351] In some embodiments, a method is provided herein for obtaining a human immunoglobulin variable domain sequence of an antigen-specific antibody, comprising: obtaining a plurality of nucleic acid sequences encoding a plurality of immunoglobulin variable domains from a first sample from a host immunized with the antigen, and determining the amino acid sequences of the encoded plurality of immunoglobulin variable domains; obtaining a second sample from an immunized host containing an antibody population targeting the antigen of interest, and determining the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population therefrom; and matching the amino acid sequences of the plurality of immunoglobulin variable domains with the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population to obtain a human immunoglobulin variable domain sequence of an antibody specific to the antigen.

[0352] In some embodiments, a method is provided herein for obtaining a human immunoglobulin variable domain CDR (e.g., CDR3) sequence of an antigen-specific antibody, comprising: obtaining a plurality of nucleic acid sequences encoding a plurality of immunoglobulin variable domains from a first sample from a host immunized with the antigen, and determining the amino acid sequences of the encoded plurality of immunoglobulin variable domains; obtaining a second sample from an immunized host containing an antibody population targeting the antigen of interest, and determining the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population therefrom; and comparing the amino acid sequences of the plurality of immunoglobulin variable domains with the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population, thereby obtaining a human immunoglobulin variable domain CDR (e.g., CDR3) sequence of an antigen-specific antibody.
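The comparison recited in the preceding paragraphs (translate the nucleic acid sequences from the first sample, compare the encoded amino acid sequences with the peptide sequences from the second sample, and recover the encoding nucleotide sequence) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: reads are taken to be in frame, peptide coverage of the translated sequence is used as the comparison criterion, and the `min_coverage` threshold, function names, and toy reads are all hypothetical rather than drawn from this disclosure.

```python
from itertools import product
from typing import Dict, List, Tuple

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate an in-frame nucleotide sequence, ignoring a trailing partial codon."""
    nt = nt.upper()
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt) - len(nt) % 3, 3))


def coverage(domain: str, peptides: List[str]) -> float:
    """Fraction of residues in `domain` covered by at least one exact peptide match."""
    covered = [False] * len(domain)
    for p in peptides:
        start = domain.find(p)
        while start != -1:
            for i in range(start, start + len(p)):
                covered[i] = True
            start = domain.find(p, start + 1)
    return sum(covered) / len(domain) if domain else 0.0


def select_variable_domains(
    reads: Dict[str, str], peptides: List[str], min_coverage: float = 0.5
) -> List[Tuple[str, str, float]]:
    """Return (read id, nucleotide sequence, coverage) for reads supported by peptides."""
    results = []
    for read_id, nt in reads.items():
        cov = coverage(translate(nt), peptides)
        if cov >= min_coverage:  # hypothetical threshold
            results.append((read_id, nt, cov))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Toy data (hypothetical): two short in-frame reads and two observed peptides.
reads = {"read_A": "GAGGTGCAGCTGGTGGAG", "read_B": "CAGGTGCAGCTGCAGCAG"}
peptides = ["EVQL", "LVE"]
selected = select_variable_domains(reads, peptides)
print(selected)
```

Because each selected entry carries its original nucleotide sequence, the same lookup yields both the variable domain amino acid sequence and the nucleotide sequence encoding it.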

[0353] Accordingly, in some embodiments, nucleic acid sequences encoding human immunoglobulin variable domains, or human immunoglobulin variable domain CDRs (e.g., CDR3), obtained using the method described herein, are provided herein. In other embodiments, nucleic acid sequences encoding immunoglobulin light chains or heavy chains, obtained using the method described herein, are provided herein.

[0354] In some embodiments, amino acid sequences of human variable domains or CDRs (e.g., CDR3) obtained using the methods described herein are also provided herein. In other embodiments, amino acid sequences of immunoglobulin light chains or heavy chains obtained using the methods described herein are also provided herein.

[0355] In some embodiments, a method for producing an antibody is provided herein, comprising expressing in a host cell (i) a nucleic acid encoding an immunoglobulin heavy chain, comprising a human immunoglobulin heavy chain variable region sequence operably linked to an immunoglobulin heavy chain constant region sequence, and (ii) a nucleic acid encoding an immunoglobulin light chain, comprising a human immunoglobulin light chain variable region sequence operably linked to an immunoglobulin light chain constant region sequence, wherein the human immunoglobulin heavy chain variable region sequence and / or the human immunoglobulin light chain variable region sequence are identified by any of the methods provided herein. In some embodiments, the host cell is cultured under conditions such that the host cell expresses an antibody comprising an immunoglobulin heavy chain and an immunoglobulin light chain.

[0356] In some embodiments, a method for producing a fully human immunoglobulin heavy chain and / or fully human immunoglobulin light chain is also provided herein, comprising: (a) identifying a human immunoglobulin heavy chain and / or light chain variable domain sequence by any of the methods provided herein; (b) operably linking a nucleic acid encoding a human immunoglobulin heavy chain variable domain with a nucleic acid encoding a human immunoglobulin heavy chain constant domain to form a fully human immunoglobulin heavy chain and / or operably linking a nucleic acid encoding a human immunoglobulin light chain variable domain with a nucleic acid encoding a human immunoglobulin light chain constant domain to form a fully human immunoglobulin light chain; and (c) expressing the fully human immunoglobulin heavy chain and / or fully human immunoglobulin light chain. In some embodiments, the fully human immunoglobulin heavy chain and / or fully human immunoglobulin light chain are expressed in host cells.
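Step (b) above, operably linking a variable domain coding sequence upstream of a constant domain coding sequence, can be illustrated at the sequence level with a short Python sketch. It assumes both segments are given as in-frame coding sequences made of whole codons; the function name and the toy sequences are hypothetical, and real constructs would also involve signal sequences, vectors, and regulatory elements that are not modeled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(nt: str) -> list:
    """Split a nucleotide sequence into consecutive three-base codons."""
    return [nt[i:i + 3] for i in range(0, len(nt), 3)]


def link_in_frame(variable_nt: str, constant_nt: str) -> str:
    """Join a variable-region coding sequence upstream of a constant-region one.

    Assumptions: both inputs are in frame and contain only whole codons, and
    the variable segment must not contain a stop codon, so that translation
    continues into the constant segment.
    """
    variable_nt, constant_nt = variable_nt.upper(), constant_nt.upper()
    if len(variable_nt) % 3 or len(constant_nt) % 3:
        raise ValueError("both segments must contain whole codons")
    if any(c in STOP_CODONS for c in split_codons(variable_nt)):
        raise ValueError("variable segment contains an in-frame stop codon")
    return variable_nt + constant_nt


# Toy segments (hypothetical): a 4-codon variable fragment and a 5-codon
# constant fragment ending in a stop codon.
heavy_chain_nt = link_in_frame("GAGGTGCAGCTG", "GCCAGCACCAAGTGA")
print(heavy_chain_nt)
```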

[0357] In some embodiments, antibodies comprising sequences obtained using the method described herein are also provided herein.

[0358] In some embodiments, cells expressing antigen-binding proteins derived from human immunoglobulin variable domains obtained by the method described herein are provided. In some embodiments, the cells are a cell line used for the production of antigen-binding proteins, for example, for the production of antigen-binding proteins for administration to a subject.

[0359] Pharmaceutical composition In some embodiments, an antigen-binding protein, a nucleic acid encoding an antigen-binding protein, or a therapeutically relevant portion thereof, produced by or derived from an antibody, nucleic acid, or therapeutically relevant portion thereof produced by a method disclosed herein, may be administered to a subject (e.g., a human subject). In some embodiments, the pharmaceutical composition includes an antibody produced by a non-human animal as disclosed herein. In some embodiments, the pharmaceutical composition may include a buffer, a diluent, an excipient, or any combination thereof. In some embodiments, the composition may also optionally contain one or more additional therapeutically active substances.

[0360] The descriptions of pharmaceutical compositions provided herein are primarily directed toward pharmaceutical compositions suitable for ethical administration to humans, but it will be understood by those skilled in the art that such compositions are generally suitable for administration to all types of animals. Modification of pharmaceutical compositions suitable for administration to humans to make them suitable for administration to various animals is well understood, and a veterinary pharmacologist of ordinary skill can design and / or carry out such modification with merely ordinary, if any, experimentation.

[0361] For example, the pharmaceutical compositions provided herein may be in a sterile, injectable form (e.g., a form suitable for subcutaneous or intravenous injection). For example, in some embodiments, the pharmaceutical composition is provided in a liquid dosage form suitable for injection. In some embodiments, the pharmaceutical composition is provided as a powder (e.g., lyophilized and / or sterilized), optionally under vacuum, which can be reconstituted with an aqueous diluent (e.g., water, buffer solution, salt solution, etc.) before injection. In some embodiments, the pharmaceutical composition is diluted and / or reconstituted with water, sodium chloride solution, sodium acetate solution, benzyl alcohol solution, phosphate-buffered saline, etc. In some embodiments, the powder should be gently mixed with the aqueous diluent (e.g., without shaking).

[0362] Formulations of the pharmaceutical compositions described herein may be prepared by any method known or hereafter developed in the field of pharmacology. Typically, such preparation methods include associating the active ingredient with a diluent or another excipient and / or one or more other auxiliary components, and then, if necessary and / or desired, shaping and / or packaging the product into desired single or multiple dose units.

[0363] In some embodiments, a pharmaceutical composition comprising an antibody produced by the method disclosed herein may be contained in a container for storage or administration, e.g., a vial, a syringe (e.g., an IV syringe), or a bag (e.g., an IV bag). The pharmaceutical compositions according to this disclosure may be prepared, packaged, and / or sold in bulk, as a single unit dose, and / or as a plurality of single unit doses. As used herein, a “unit dose” is a discrete amount of a pharmaceutical composition containing a predetermined amount of the active ingredient. The amount of the active ingredient is generally equal to the dose of the active ingredient that would be administered to a subject and / or a convenient fraction of such a dose, such as one-half or one-third of such a dose.

[0364] The relative amounts of the active ingredient, pharmaceutically acceptable excipients, and / or any additional ingredients in the pharmaceutical compositions of this disclosure will vary depending on the identity, size, and / or condition of the subject being treated, and further depending on the route by which the composition is administered. For example, a composition may contain 0.1% to 100% (w / w) of the active ingredient.

[0365] Pharmaceutical compositions may additionally contain a pharmaceutically acceptable excipient, which, as used herein, includes any and all solvents, dispersion media, diluents, or other liquid vehicles, dispersing or suspending aids, surfactants, isotonic agents, thickeners or emulsifiers, preservatives, solid binders, lubricants, etc., as suited to the particular dosage form desired. Remington's *The Science and Practice of Pharmacy*, 21st Edition, A. R. Gennaro (Lippincott, Williams & Wilkins, Baltimore, MD, 2006) discloses various excipients used in the formulation of pharmaceutical compositions and known techniques for preparing them. Except insofar as a conventional excipient medium is incompatible with a substance or its derivatives, for example, by producing an undesirable biological effect or otherwise interacting in a detrimental manner with any other component(s) of the pharmaceutical composition, its use is intended to be within the scope of this disclosure.

[0366] In some embodiments, pharmaceutically acceptable excipients are at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% pure. In some embodiments, the excipients are approved for use in humans and for veterinary use. In some embodiments, the excipients are approved by the U.S. Food and Drug Administration. In some embodiments, the excipients are pharmaceutical grade. In some embodiments, the excipients meet the standards of the United States Pharmacopeia (USP), European Pharmacopeia (EP), British Pharmacopeia, and / or International Pharmacopoeia.

[0367] Pharmaceutically acceptable excipients used in the manufacture of pharmaceutical compositions include, but are not limited to, inert diluents, dispersing and / or granulating agents, surfactants and / or emulsifiers, disintegrants, binders, preservatives, buffers, lubricants, and / or oils. Such excipients may be optionally included in pharmaceutical formulations. Excipients such as cocoa butter and suppository waxes, colorants, coatings, sweeteners, flavorings, and / or fragrances may be present in the composition at the discretion of the formulator.

[0368] In some embodiments, the provided pharmaceutical composition comprises one or more pharmaceutically acceptable excipients (e.g., preservatives, inert diluents, dispersants, surfactants and / or emulsifiers, buffers, etc.). In some embodiments, the pharmaceutical composition comprises one or more preservatives. In some embodiments, the pharmaceutical composition does not contain preservatives.

[0369] In some embodiments, the pharmaceutical composition is provided in a form that can be refrigerated and / or frozen. In some embodiments, the pharmaceutical composition is provided in a form that cannot be refrigerated and / or frozen. In some embodiments, the reconstituted solution and / or liquid dosage form may be stored for a predetermined period after reconstitution (e.g., 2 hours, 12 hours, 24 hours, 2 days, 5 days, 7 days, 10 days, 2 weeks, 1 month, 2 months, or longer). In some embodiments, storage of the antibody composition for longer than a certain time results in antibody degradation.

[0370] Liquid dosage forms and / or reconstituted solutions may contain particulate matter and / or discoloration before administration. In some embodiments, solutions should not be used if they are discolored and / or turbid, and / or if particulate matter remains after filtration.

[0371] General considerations in the formulation and / or manufacture of pharmaceutical substances can be found, for example, in Remington: The Science and Practice of Pharmacy 21st ed., Lippincott Williams & Wilkins, 2005 (incorporated herein by reference).

Kits

[0372] This disclosure further provides packs or kits comprising one or more containers filled with at least a protein (single or complex (e.g., an antibody or a fragment thereof)) obtained by the methods described herein. The kits may be used in any applicable method (e.g., a research method). Such containers may optionally be accompanied by documentation in a format prescribed by government agencies regulating the manufacture, use or sale of pharmaceutical or biological products, the documentation indicating (a) an approval by the authority for manufacture, use or sale for human administration, (b) instructions for use, and / or (c) a contract governing the transfer of materials and / or biological products (e.g., non-human animals or non-human cells as described herein) between two or more entities and combinations thereof.

[0373] In some embodiments, a kit is provided comprising amino acids (e.g., an antibody or a fragment thereof) obtained by the method herein. In some embodiments, a kit is provided comprising nucleic acids (e.g., nucleic acids encoding an antibody or a fragment thereof) obtained by the method herein. In some embodiments, a kit is provided comprising sequences (amino acid sequences and / or nucleic acid sequences) identified by the method herein.

[0374] In some embodiments, kits described herein are provided for use in the manufacture and / or development of drugs (e.g., antibodies or fragments thereof) for therapeutic or diagnostic purposes.

[0375] In some embodiments, kits described herein are provided for use in the manufacture and / or development of drugs (e.g., antibodies or fragments thereof) for the treatment, prevention, or mitigation of diseases, disorders, or conditions.

[0376] Other characteristics of specific embodiments will become apparent in the course of describing the following exemplary embodiments. These are provided for illustrative purposes only and are not intended to limit other characteristics of specific embodiments.

[0377] While the present invention has been specifically illustrated and described with reference to numerous embodiments, it will be understood by those skilled in the art that modifications in form and detail may be made to the various embodiments disclosed herein without departing from the spirit and scope of the invention, and that the various embodiments disclosed herein are not intended to act as limitations on the claims.

Exemplary Embodiments

[0378] Embodiment 1. A method for obtaining human immunoglobulin variable domains or CDRs of antibodies specific to a particular antigen from a host immunized with that antigen, comprising: (i) obtaining a plurality of nucleic acids encoding a plurality of human immunoglobulin variable domains from a first sample from the immunized host, and determining the amino acid sequences of the encoded plurality of immunoglobulin variable domains; (ii) obtaining a second sample from the immunized host containing an antibody population targeting the antigen, and determining the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population therefrom; and (iii) comparing the amino acid sequences of the encoded plurality of human immunoglobulin variable domains from the first sample with the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population from the second sample, thereby obtaining a human immunoglobulin variable domain or CDR sequence of an antibody specific to the antigen; wherein the host is a genetically modified non-human mammal whose genome comprises an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a constant region.

[0379] Embodiment 2. The method according to Embodiment 1, wherein the host is a rodent.

[0380] Embodiment 3. The method according to Embodiment 2, wherein the host is a rat.

[0381] Embodiment 4. The method according to Embodiment 2, wherein the host is a mouse.

[0382] Embodiment 5. The method according to Embodiment 1, wherein the first sample contains a population of B cells.

[0383] Embodiment 6. The method according to Embodiment 5, wherein the first sample is a bone marrow sample and / or a spleen sample.

[0384] Embodiment 7. The method according to any one of the prior embodiments, wherein obtaining a plurality of nucleic acid sequences encoding a plurality of immunoglobulin variable domains from a first sample comprises preparing cDNA from the nucleic acid sequences and sequencing the rearranged heavy chain VDJ sequences and / or rearranged light chain VJ sequences in the first sample.

[0385] Embodiment 8. The method according to Embodiment 7, wherein the sequences of the plurality of nucleic acids encoding the plurality of immunoglobulin variable domains obtained from the first sample are determined using DNA sequencing technology.

[0386] Embodiment 9. The method according to Embodiment 8, wherein the DNA sequencing technology is next-generation DNA sequencing.

[0387] Embodiment 10. The method according to any one of the prior embodiments, wherein the second sample is selected from the group consisting of serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, and placenta.

[0388] Embodiment 11. The method according to any one of the prior embodiments, wherein determining the peptide sequence from a second sample includes mass spectrometry analysis of the heavy chain and / or light chain variable domains of the antibody population in the second sample.

[0389] Embodiment 12. The method according to Embodiment 11, wherein the mass spectrometry analysis is a combination of liquid chromatography and mass spectrometry (LC-MS).

[0390] Embodiment 13. The method according to Embodiment 11 or 12, further comprising proteolytic digestion of the heavy chain and / or light chain variable domains of the antibody population prior to mass spectrometry analysis.

[0391] Embodiment 14. The method according to any one of the prior embodiments, wherein obtaining a second sample containing an antibody population targeting a specific antigen from an immunized host comprises depleting antibodies not targeting the specific antigen from the second sample.

[0392] Embodiment 15. The method according to any one of the prior embodiments, wherein obtaining a second sample containing an antibody population targeting a specific antigen from an immunized host comprises concentrating the second sample with respect to antibodies targeting a specific antigen.

[0393] Embodiment 16. The method according to any one of the prior embodiments, comprising matching the amino acid sequences of multiple immunoglobulin variable domains from a first sample with the peptide sequences of heavy chain and / or light chain variable domains of an antibody population from a second sample, thereby aligning the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population with each other and with the amino acid sequences of multiple immunoglobulin variable domains.

[0394] Embodiment 17. The method according to any one of the prior embodiments, further comprising obtaining a nucleotide sequence of a human variable domain of an antigen-specific antibody.

[0395] Embodiment 18. The method according to Embodiment 17, further comprising expressing the obtained nucleotide sequence encoding a human immunoglobulin variable domain in a second recombinant antibody.

[0396] Embodiment 19. The method according to Embodiment 18, wherein the nucleotide sequence encoding the human variable domain is operably linked to a human immunoglobulin constant region and expressed in a cell line.

[0397] Embodiment 20. The method according to Embodiment 19, wherein the human variable domain is a human heavy chain variable domain that is operably linked to a human immunoglobulin heavy chain constant region and expressed to generate a human immunoglobulin heavy chain.

[0398] Embodiment 21. The method according to Embodiment 20, wherein human immunoglobulin heavy chains are expressed together with human immunoglobulin light chains in a cell line.

[0399] Embodiment 22. The method according to Embodiment 19, wherein the human variable domain is a human light chain variable domain that is operably linked to a human immunoglobulin light chain constant region and expressed to generate a human immunoglobulin light chain.

[0400] Embodiment 23. The method according to Embodiment 22, wherein human immunoglobulin light chains are expressed together with human immunoglobulin heavy chains in a cell line.

[0401] Embodiment 24. The method according to any one of Embodiments 18 to 23, wherein the second antibody is a fully human antibody.

[0402] Embodiment 25. The method according to any one of Embodiments 18 to 24, wherein the second antibody is a bispecific antibody.

[0403] Embodiment 26. The method according to any one of Embodiments 18 to 25, further comprising purifying a second antibody and determining the affinity and / or specificity of the purified second antibody for a specific antigen.

[0404] Embodiment 27. The method according to any one of the prior embodiments, wherein the host is a genetically modified mouse, the genome comprising an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a murine constant region, and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a murine constant region.

[0405] Embodiment 28. The method according to Embodiment 27, wherein an immunoglobulin heavy chain variable region is operably linked to a mouse heavy chain constant region, and / or an immunoglobulin light chain variable region is operably linked to a mouse light chain constant region.

[0406] Embodiment 29. The method according to Embodiment 28, wherein an immunoglobulin heavy chain variable region operably linked to the mouse heavy chain constant region is located at the endogenous mouse heavy chain locus, and / or an immunoglobulin light chain variable region operably linked to the mouse light chain constant region is located at the endogenous mouse light chain locus.

[0407] Embodiment 30. The method according to any one of Embodiments 27 to 29, wherein the host is a genetically modified mouse whose genome (including its germline genome) comprises an immunoglobulin heavy chain variable region comprising a plurality of human heavy chain V gene segments, a plurality of human D gene segments, and a plurality of human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to the murine heavy chain constant region, and an immunoglobulin light chain variable region operably linked to the murine light chain constant region, comprising exactly two unrearranged human Vκ gene segments and five unrearranged human Jκ gene segments, wherein the exactly two unrearranged human Vκ gene segments are a human Vκ1-39 gene segment and a human Vκ3-20 gene segment.

[0408] Embodiment 31. The method according to Embodiment 27, wherein the host is a genetically modified mouse whose genome comprises: (a) at an endogenous heavy chain locus: (i) an immunoglobulin heavy chain variable region operably linked to a mouse heavy chain constant region and comprising a plurality of unrearranged human VH gene segments, a plurality of unrearranged human DH gene segments, and a plurality of unrearranged human JH gene segments; (ii) a restricted unrearranged heavy chain variable region operably linked to a mouse heavy chain constant region and comprising a single human VH gene segment, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments; (iii) a universal heavy chain coding sequence comprising a single rearranged human heavy chain variable region operably linked to a mouse heavy chain constant region; (iv) a histidine-modified unrearranged heavy chain variable region operably linked to a mouse heavy chain constant region, comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, and further comprising a substitution of at least one non-histidine residue with a histidine, or an insertion of at least one histidine; (v) a heavy-chain-only immunoglobulin coding sequence comprising an immunoglobulin heavy chain variable region that comprises one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, and that is operably linked to a heavy chain constant region (a non-IgM gene, e.g., an IgG gene, lacking a sequence encoding a functional CH1 domain); or (vi) an engineered endogenous rodent immunoglobulin heavy chain locus comprising one or more unrearranged human VL gene segments and one or more unrearranged human JL gene segments operably linked to a mouse immunoglobulin heavy chain constant region gene; and / or (b) at an endogenous light chain locus: (i) an immunoglobulin light chain variable region comprising a plurality of unrearranged human Vκ gene segments and a plurality of unrearranged human Jκ gene segments; (ii) a universal light chain coding sequence comprising a single rearranged human light chain variable region operably linked to the mouse light chain constant region; (iii) a restricted light chain variable region operably linked to the mouse light chain constant region, comprising two unrearranged human Vκ gene segments and one or more unrearranged human Jκ gene segments; or (iv) a histidine-modified light chain variable region operably linked to the mouse light chain constant region, comprising one or more human light chain V gene segments and one or more human light chain J gene segments, and further comprising a substitution of at least one non-histidine residue with a histidine, or an insertion of at least one histidine.

[0409] Embodiment 32. The method according to any one of the prior embodiments, wherein the genetically modified mouse further comprises a functional ADAM6 gene, and optionally the functional ADAM6 gene is a mouse ADAM6 gene.

[0410] Embodiment 33. The method according to any one of the prior embodiments, wherein the genetically modified mouse further expresses the exogenous terminal deoxynucleotidyltransferase (TdT) gene.

[0411] Embodiment 34. A method for obtaining a human immunoglobulin heavy chain variable domain or CDR of an antibody specific to a particular antigen from a host immunized with that antigen, comprising: obtaining a plurality of nucleic acids encoding a plurality of immunoglobulin heavy chain variable domains from a first sample from the immunized host and determining the amino acid sequences of the plurality of encoded human immunoglobulin variable domains; obtaining a second sample from the immunized host containing an antibody population targeting the antigen and determining the peptide sequence of the human heavy chain variable domain of the antibody population therefrom; and comparing the amino acid sequences of the plurality of human immunoglobulin heavy chain variable domains with the peptide sequence of the human heavy chain variable domain of the antibody population, thereby obtaining a human immunoglobulin heavy chain variable domain or CDR of an antibody specific to that antigen; wherein the host is a genetically modified mouse whose genome (including its germline genome) comprises an immunoglobulin heavy chain variable region comprising a plurality of human heavy chain V gene segments, a plurality of human D gene segments, and a plurality of human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a murine constant region; and an immunoglobulin light chain variable region which is a single rearranged human light chain variable region comprising a single human light chain V gene segment and a single human light chain J gene segment, wherein the human immunoglobulin light chain variable region is operably linked to a murine light chain constant region.

[0412] Embodiment 35. The method according to Embodiment 34, wherein the single rearranged human light chain variable region is a single rearranged human kappa light chain variable region comprising a single human light chain Vκ gene segment and a single human light chain Jκ gene segment.

[0413] Embodiment 36. The method according to Embodiment 35, wherein the single human light chain Vκ gene segment is a Vκ1-39 or Vκ3-20 gene segment, and the single human light chain Jκ gene segment is a Jκ1 or Jκ5 gene segment.

[0414] Embodiment 37. The method according to Embodiment 35, wherein the murine light chain constant region is a mouse kappa light chain constant region.

[0415] Embodiment 38. The method according to Embodiment 35, wherein the single rearranged human light chain variable region is operably linked to a mouse light chain constant region located at the endogenous mouse kappa light chain locus.

[0416] Embodiment 39. The method according to any one of Embodiments 35 to 38, wherein the genetically modified mouse further comprises a functional ADAM6 gene, and optionally the functional ADAM6 gene is a mouse ADAM6 gene.

[0417] Embodiment 40. The method according to Embodiment 39, wherein the first sample contains a population of B cells.

[0418] Embodiment 41. The method according to Embodiment 40, wherein the first sample is a bone marrow sample and / or a spleen sample.

[0419] Embodiment 42. The method according to any one of Embodiments 34 to 41, wherein obtaining multiple nucleic acid sequences encoding multiple human immunoglobulin heavy chain variable domains from a first sample comprises preparing cDNA from the nucleic acid sequences and sequencing the rearranged heavy chain VDJ sequences in the first sample.

[0420] Embodiment 43. The method according to Embodiment 42, wherein the sequences of the plurality of nucleic acids encoding the plurality of immunoglobulin variable domains obtained from the first sample are determined using DNA sequencing technology.

[0421] Embodiment 44. The method according to Embodiment 43, wherein the DNA sequencing technology is next-generation DNA sequencing.

[0422] Embodiment 45. The method according to any one of Embodiments 34 to 44, wherein the second sample is selected from the group consisting of serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, and placenta.

[0423] Embodiment 46. The method according to any one of Embodiments 34 to 45, wherein determining the peptide sequence from the second sample includes mass spectrometry analysis of the heavy chain variable domains of the antibody population in the second sample.

[0424] Embodiment 47. The method according to Embodiment 46, wherein the mass spectrometry analysis is a combination of liquid chromatography and mass spectrometry (LC-MS).

[0425] Embodiment 48. The method of Embodiment 46 or 47, further comprising proteolytic digestion of the heavy chain variable domains of the antibody population prior to mass spectrometry analysis.

[0426] Embodiment 49. The method according to any one of Embodiments 34 to 48, wherein obtaining a second sample containing an antibody population targeting a specific antigen from an immunized host comprises depleting antibodies not targeting the specific antigen from the second sample.

[0427] Embodiment 50. The method according to any one of Embodiments 34 to 49, wherein obtaining a second sample containing an antibody population targeting a specific antigen from an immunized host comprises concentrating the second sample with respect to antibodies targeting a specific antigen.

[0428] Embodiment 51. The method according to any one of Embodiments 34 to 50, wherein the amino acid sequences of multiple human immunoglobulin heavy chain variable domains are matched with the peptide sequences of human heavy chain variable domains of an antibody population, thereby aligning the peptide sequences of human heavy chain variable domains of an antibody population with each other and with the amino acid sequences of multiple human immunoglobulin heavy chain variable domains.

[0429] Embodiment 52. The method according to any one of Embodiments 34 to 51, further comprising obtaining a nucleotide sequence of the human heavy chain variable domain of an antigen-specific antibody.

[0430] Embodiment 53. The method according to Embodiment 52, further comprising expressing the obtained nucleotide sequence encoding the variable domain of the human immunoglobulin heavy chain in a second recombinant antibody.

[0431] Embodiment 54. The method according to Embodiment 53, wherein a nucleotide sequence encoding a human heavy chain variable domain is operably linked to a human immunoglobulin heavy chain constant region and expressed in a cell line to generate a human immunoglobulin heavy chain.

[0432] Embodiment 55. The method according to Embodiment 54, wherein human immunoglobulin heavy chains are expressed together with human immunoglobulin light chains in a cell line.

[0433] Embodiment 56. The method according to Embodiment 55, wherein the human immunoglobulin light chain is derived from the same single rearranged variable region sequence as that present in the mouse, or a somatic variant thereof.

[0434] Embodiment 57. The method according to any one of Embodiments 53 to 56, wherein the second antibody is a human antibody.

[0435] Embodiment 58. The method according to any one of Embodiments 53 to 57, wherein the second antibody is a bispecific antibody.

[0436] Embodiment 59. The method according to any one of Embodiments 53 to 58, further comprising purifying a second antibody and determining the affinity and / or specificity of the purified second antibody for a specific antigen.

[0437] Embodiment 60. The method according to any one of the prior embodiments, wherein obtaining a human immunoglobulin heavy chain variable domain or CDR of an antigen-specific antibody is based on one or more of the following: (1) the matching of a unique peptide obtained from a second sample with a CDR3 sequence in the amino acid sequence obtained from a first sample; (2) the matching of a unique peptide obtained from a second sample with a CDR1 and / or CDR2 sequence in the amino acid sequence obtained from a first sample; (3) the matching of one or more unique peptides obtained from a second sample with one or more framework sequences in the amino acid sequence obtained from a first sample; (4) the number of next-generation sequencing sequences; (5) the exclusion of CDR sequences containing methionine; and (6) the exclusion of CDR sequences that may be N-glycosylated.
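The selection criteria listed in Embodiment 60 can be illustrated with a minimal Python sketch. It is not the method of the disclosure: the `Candidate` record, the field names, the toy CDR strings, and the decision to apply several criteria together (the embodiment requires only one or more) are assumptions made for illustration. The N-glycosylation check uses the common N-X-S/T sequon pattern, where X is any residue other than proline.

```python
import re
from dataclasses import dataclass

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N[^P][ST]")

@dataclass
class Candidate:
    """Hypothetical record for one NGS-derived variable domain."""
    seq_id: str
    cdrs: tuple             # (CDR1, CDR2, CDR3) amino acid strings
    ngs_reads: int          # criterion (4): number of NGS sequences
    cdr3_peptide_hits: int  # criterion (1): unique peptides matching CDR3

def has_liability(cdr: str) -> bool:
    """True if the CDR contains methionine (5) or an N-X-S/T sequon (6)."""
    return "M" in cdr or SEQUON.search(cdr) is not None

def select(candidates, min_reads=1):
    """Keep candidates with CDR3 peptide support and no flagged CDR,
    ranked by peptide support and then by NGS read count."""
    kept = [
        c for c in candidates
        if c.cdr3_peptide_hits > 0
        and c.ngs_reads >= min_reads
        and not any(has_liability(cdr) for cdr in c.cdrs)
    ]
    return sorted(kept, key=lambda c: (c.cdr3_peptide_hits, c.ngs_reads),
                  reverse=True)

# Toy, made-up data for illustration only.
candidates = [
    Candidate("A", ("GFTFSSYA", "ISGSGGST", "AKDRGYSSGWYFDY"), 120, 2),
    Candidate("B", ("GYTFTSYW", "IYPGDGDT", "ARMDYGSSYFDY"), 300, 3),    # Met in CDR3
    Candidate("C", ("GFTFSNYS", "ISGSGGST", "AKDRGYSSGWYFDY"), 80, 1),   # N-Y-S sequon
    Candidate("D", ("GGSISSGGYY", "IYYSGST", "ARDLGWELPFDY"), 40, 1),
    Candidate("E", ("GFTFSSYG", "IWYDGSNK", "ARDGYSSSWYFDV"), 500, 0),   # no CDR3 peptide
]

for c in select(candidates):
    print(c.seq_id, c.cdr3_peptide_hits, c.ngs_reads)
```

In this sketch the ranking key is arbitrary; criteria (2) and (3), matches to CDR1, CDR2, or framework sequences, could be added as further fields on the record in the same way.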

[0438] Embodiment 61. A method for obtaining an immunoglobulin variable domain or CDR of an antigen-specific antibody, comprising: obtaining a sample containing an antibody population targeting an antigen from a host immunized with the antigen; determining the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population; and matching the peptide sequences of the heavy chain and / or light chain variable domains of the antibody population from the sample with a library of amino acid sequences containing multiple human immunoglobulin variable domains, thereby obtaining a human immunoglobulin variable domain or CDR sequence of an antibody specific to the antigen; wherein the immunized host is a genetically modified non-human mammal having a germline genome comprising an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a constant region.

[0439] Embodiment 62. The method according to Embodiment 61, wherein a library of amino acid sequences containing a plurality of human immunoglobulin variable domains is encoded by a plurality of nucleic acid sequences obtained from an antigen-immunized host, and the immunized host is a genetically modified non-human mammal, the germline genome of which comprises an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a constant region.

[0440] Embodiment 63. The method according to Embodiment 61 or 62, wherein the sample is selected from the group consisting of serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, and placenta.

[0441] Embodiment 64. The method according to Embodiment 62 or 63, wherein the library of amino acid sequences containing multiple human immunoglobulin variable domains is encoded by multiple nucleic acids obtained from a B cell sample, which is a bone marrow and / or spleen sample.

[0442] Embodiment 65. A method for identifying a human immunoglobulin variable domain or CDR of an antibody specific to a particular antigen, comprising: comparing a plurality of amino acid sequences encoded by a plurality of nucleic acid sequences encoding a plurality of human immunoglobulin variable domains produced by an animal immunized with the antigen with an amino acid sequence comprising peptide fragments from light chain and / or heavy chain variable domains produced from an antibody population targeting the antigen; thereby identifying a human immunoglobulin variable domain or CDR of an antibody specific to the antigen, wherein the host is a genetically modified non-human mammal whose genome comprises an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the heavy chain variable region is operably linked to a constant region; and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the light chain is operably linked to a constant region.

[0443] Embodiment 66. The method according to Embodiment 65, wherein the plurality of nucleic acids and the peptide fragments are obtained from animals immunized with the antigen.

[Examples]

[0444] The present invention is further illustrated by the following non-limiting examples. These examples are provided to aid in understanding the present invention, but are not intended to limit the scope of the invention in any way, nor should they be construed as such. The examples do not include detailed descriptions of conventional methods (such as molecular cloning techniques) that would be well known to those skilled in the art. Unless otherwise indicated, parts are by weight, molecular weights are average molecular weights, and temperatures are in degrees Celsius. Those skilled in the art will understand that the order of the steps is not necessarily fixed and can be modified to achieve the same results in specific embodiments.

[0445] An exemplary overview of the process is shown in Figure 1 of this specification. Briefly, as described in the following examples, rodents (e.g., mice or rats) are immunized with the antigen of interest (e.g., CD22-Fc fusion protein) and their anti-antigen titers are evaluated. Animals showing high anti-antigen titers in blood samples are sacrificed, and bone marrow and / or spleen are obtained. B cells are purified and processed by next-generation sequencing (NGS) to generate a database of immunoglobulin sequences (e.g., variable domain sequences, such as heavy chain variable domain sequences). Serum (or another desired sample) is also obtained from the same sacrificed animals and enriched for antigen-specific antibodies (in the exemplary embodiments below, anti-Fc antibodies are depleted and anti-CD22 antibodies are enriched). The antigen-enriched antibodies are enzymatically digested into peptides, and these peptides are sequenced by mass spectrometry. The digested peptide sequences are searched against the generated NGS database to determine the variable domain sequences (e.g., heavy chain variable domain sequences) of antibodies specific to the antigen of interest.
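The final search step of this overview can be illustrated with a minimal sketch. It assumes plain exact-substring matching of already-sequenced peptides against translated variable domain sequences; the actual search engine and scoring are not specified in this paragraph. The function names, sequence identifiers, and toy sequences below are hypothetical and are not taken from the examples.

```python
from collections import defaultdict


def match_peptides(peptides, database):
    """Map each peptide to the IDs of database sequences that contain it.

    Assumption: exact substring matching; peptides with no hit are omitted.
    """
    hits = defaultdict(list)
    for pep in peptides:
        for seq_id, seq in database.items():
            if pep in seq:
                hits[pep].append(seq_id)
    return dict(hits)


def unique_hits(hits):
    """Keep only peptides that map to exactly one database sequence."""
    return {pep: ids[0] for pep, ids in hits.items() if len(ids) == 1}


# Hypothetical toy NGS database: two heavy chain variable domains that
# share framework sequence but differ around CDR3.
ngs_db = {
    "VH_001": (
        "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGSTYY"
        "ADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRGYSSGWYFDYWGQGTLVTVSS"
    ),
    "VH_002": (
        "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGSTYY"
        "ADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDLGYYGMDVWGQGTTVTVSS"
    ),
}

# Hypothetical peptides as they might be reported after MS sequencing.
ms_peptides = [
    "AEDTAVYYCAK",             # found only in VH_001
    "GLEWVSAISGSGGSTYYADSVK",  # shared framework/CDR2 peptide, found in both
    "DLGYYGMDVWGQGTTVTVSS",    # found only in VH_002
    "AAAAAK",                  # not found in the database
]

all_hits = match_peptides(ms_peptides, ngs_db)
informative = unique_hits(all_hits)
```

In this sketch, peptides shared by several database entries are kept in `all_hits` but dropped from `informative`, since only peptides unique to one sequence discriminate between candidate variable domains.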

[0446] Example 1. Immunization of universal light chain mice

Immunization: Kappa universal light chain (κULC) mice (mice containing either a single rearranged human Vκ1-39Jκ5 or Vκ3-20Jκ1 operably linked to mouse Cκ, and also containing multiple human heavy chain V, D, and J gene segments operably linked to the mouse heavy chain constant region; referred to as ULC1-39 or ULC3-20 mice, respectively) were immunized with a human CD22.Fc chimeric (hCD22.hFc) immunogen. Kappa universal light chain mice have been described previously, for example, in U.S. Patent Nos. 10,130,081 and 10,143,186 and in US2019/0090462, which are incorporated herein by reference in their entirety. Preimmune serum was collected from the mice prior to the initiation of immunization. Mice were immunized at various time intervals using standard adjuvant and immunization protocols. Blood was periodically collected from the mice, and antiserum titers were assayed for each antigen.

[0447] Determination of antiserum titer

On protein: Antibody titers in serum against the immunogen were determined on protein by ELISA. 96-well microtiter plates (Thermo Scientific) were coated overnight at 4°C with 2 μg / ml each of hCD22 or human Fc protein in phosphate-buffered saline (PBS, Irvine Scientific). The plates were washed with phosphate-buffered saline containing 0.05% Tween 20 (PBS-T, Sigma-Aldrich) and blocked at room temperature for 1 hour with 300 μl of 0.5% bovine serum albumin (BSA, Sigma-Aldrich) in PBS. Pre-immune and immune antisera were serially diluted 3-fold in 0.5% BSA-PBS and added to the plates for 1 hour at room temperature. The plates were washed, and goat anti-mouse IgG-Fc-horseradish peroxidase (HRP) conjugated secondary antibody (Jackson ImmunoResearch) was added and incubated at room temperature for 1 hour. The plates were washed and developed using TMB / H2O2 as the substrate according to the manufacturer's recommended procedure. Absorbance at 450 nm was recorded using a spectrophotometer (Victor, Perkin Elmer). Antibody titers were calculated using GraphPad Prism software and defined as the interpolated serum dilution factor at which the binding signal was twice the background.
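The titer definition above (the interpolated dilution factor at which the signal equals twice the background) can be illustrated with a minimal sketch. The titers in this example were calculated with GraphPad Prism, and the interpolation model used there is not stated; the sketch below assumes simple linear interpolation on a log10 dilution axis between the two dilutions that bracket the threshold. The function name and the numeric values are hypothetical.

```python
import math


def interpolated_titer(dilutions, signals, background, fold=2.0):
    """Return the dilution factor at which signal == fold * background.

    Assumption: linear interpolation of signal against log10(dilution)
    between the two adjacent dilutions that bracket the threshold.
    Returns None if the signal never crosses the threshold.
    """
    threshold = fold * background
    # Order the series from least diluted to most diluted.
    pairs = sorted(zip(dilutions, signals))
    for (d_lo, s_lo), (d_hi, s_hi) in zip(pairs, pairs[1:]):
        if s_lo >= threshold >= s_hi and s_lo != s_hi:
            frac = (s_lo - threshold) / (s_lo - s_hi)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_d
    return None


# Hypothetical 3-fold dilution series and absorbance readings.
dilution_factors = [100, 300, 900, 2700]
absorbance = [1.6, 0.8, 0.4, 0.2]
titer = interpolated_titer(dilution_factors, absorbance, background=0.3)
```

With these hypothetical readings the threshold of 0.6 falls midway between the 1:300 and 1:900 wells, so the interpolated titer lies between those two dilution factors.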

[0448] On cells: Antibody titers in serum against the immunogen were determined on cells using a Meso Scale Discovery (MSD) cell-binding ELISA. 96-well carbon surface plates were coated with 40,000 cells / well of Raji and Jurkat cells in PBS at 37°C for 1 hour. The cell coating solution was decanted, and the plates were blocked at room temperature (RT) for 1 hour with 150 μL of 2% bovine serum albumin (BSA, Sigma-Aldrich) in PBS. The plates were washed three times with PBS using a plate washer (AquaMax® 2000, Molecular Devices). Pre-immune and immune antisera were serially diluted 3-fold in 1% BSA-PBS and added to the plates for 1 hour at room temperature. The plates were washed, and then 1 μg / mL of goat anti-mouse IgG-Fc ruthenium-conjugated secondary antibody was added to the plates and incubated at room temperature for 1 hour. The plates were washed, developed by adding 150 μl per well of MSD 4× surfactant-free Read Buffer T (diluted to 1×), and read using an MSD SECTOR® Imager 600 instrument. Antiserum titers were calculated using GraphPad Prism software, and antibody titers were defined as the interpolated serum dilution factor at which the binding signal was twice the background.

[0449] Results: Humoral immune responses in ULC1-39 and ULC3-20 mice were investigated after immunization with the hCD22 protein immunogen. Serum antibody titers were determined by ELISA against human CD22 and human Fc proteins, and by MSD cell-binding assay against Raji and Jurkat cells. Mouse antisera showed high titers against hCD22 and hFc proteins. Highly specific titers were induced on Raji cells (Table 1). Antibody titer was defined as the interpolated serum dilution factor at which the binding signal was twice the background.

[Table 1]

[0450] For next-generation sequencing (NGS) experiments, spleens and bone marrow were collected from all mice. Serum from each mouse was used in liquid chromatography-mass spectrometry (LC-MS) experiments.

[0451] Example 2. Next-generation sequencing and construction of a reference antibody database

Example 2.1. Next-generation sequencing (NGS)

Next-generation sequencing, or repertoire sequencing, was performed on mouse bone marrow and splenocytes. Bone marrow was collected from the femurs of CD22-immunized mice by flushing the femurs with 1× phosphate-buffered saline (PBS, Gibco) containing 2.5% fetal bovine serum (FBS). Single-cell suspensions were prepared from mouse spleens. Erythrocytes in the spleen and bone marrow preparations were lysed with ACK lysis buffer (Gibco). Spleen B cells were positively enriched from whole splenocytes by magnetic cell sorting using anti-CD19 (mouse B cell marker) magnetic beads and a MACS® column (Miltenyi Biotec). Each mouse tissue was processed in four replicates for repertoire sequencing. Total RNA was isolated from bone marrow and purified spleen B cells using the RNeasy Plus RNA isolation kit (Qiagen) according to the manufacturer's instructions.

[0452] Reverse transcription was performed using the SMARTer® RACE cDNA amplification kit (Clontech) and oligo dT primers to generate human heavy chain cDNA containing the IgG constant region sequence. During reverse transcription, the reverse complement of the template switching (TS) primer sequence was attached to the 3' end of the newly synthesized cDNA. The purified cDNA was amplified by two rounds of semi-nested PCR to generate a plurality of cDNAs encoding the complete complement of IgG variable domains expressed by the cells from which the mRNA was obtained, and a third round of PCR was subsequently performed to attach sequencing primers and indices. Exemplary primers used to construct the IgG repertoire library are shown in Table 2.

[Table 2]

[0453] Human variable domain cDNA was size-selected to 400–700 bp using Pippin Prep (Sage Science), quantified by qPCR using the KAPA Library Quantification Kit (KAPA Biosystems), and then loaded onto a MiSeq sequencer (Illumina) for 2 × 300 cycles of sequencing.

[0454] Example 2.2. Construction of an antibody reference database

A mouse-specific protein sequence database was constructed using variable-diversity-joining (VDJ) region sequences from ULC mice and classified by tissue for each mouse sample. VDJ sequence data obtained by NGS were first demultiplexed and filtered based on quality, length, and perfect match with the IgG constant region primers. Duplicate pai...
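The three read filters named above (quality, length, and perfect primer match) can be illustrated with a minimal sketch. The thresholds, the primer sequence, the `Read` container, and the function names are hypothetical placeholders; the actual cut-offs and primer sequences are not given in this paragraph, and no later processing step is sketched here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Read:
    read_id: str
    seq: str
    quals: list  # per-base Phred quality scores, same length as seq


def passes_filters(read, primer, min_mean_q=30.0, min_len=300, max_len=700):
    """Apply the quality, length, and primer filters to one read.

    Assumptions: quality is judged by mean Phred score, and a 'perfect
    match' means the read contains an exact copy of the primer sequence.
    All default thresholds are placeholders.
    """
    if not read.seq or len(read.seq) != len(read.quals):
        return False
    if mean(read.quals) < min_mean_q:
        return False
    if not (min_len <= len(read.seq) <= max_len):
        return False
    return primer in read.seq


def filter_reads(reads, primer, **kwargs):
    """Return only the reads that pass all three filters."""
    return [r for r in reads if passes_filters(r, primer, **kwargs)]


# Hypothetical toy reads, with short lengths so the example stays readable.
toy_primer = "ACGTAC"
toy_reads = [
    Read("r1", "GGGGACGTACTTTT", [35] * 14),  # passes all filters
    Read("r2", "GGGGACGTACTTTT", [10] * 14),  # fails the quality filter
    Read("r3", "ACGTAC", [35] * 6),           # fails the length filter
    Read("r4", "GGGGACGAACTTTT", [35] * 14),  # one primer mismatch
]
kept = filter_reads(toy_reads, toy_primer, min_len=10, max_len=30)
```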

Claims

1. A method for identifying a human immunoglobulin variable domain or CDR sequence of an antigen-specific antibody, the method comprising: (i) obtaining a plurality of peptide sequences derived from a human immunoglobulin heavy chain variable domain and / or a human immunoglobulin light chain variable domain from an antibody population produced by a rodent immunized with the antigen, wherein the antibody population is specific to the antigen, the peptide sequences include a CDR sequence, and the rodent comprises, in its germline genome, an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the immunoglobulin heavy chain variable region is operably linked to a constant region, and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the immunoglobulin light chain variable region is operably linked to a constant region; and (ii) matching a library of human immunoglobulin heavy chain variable domain sequences and / or human immunoglobulin light chain variable domain sequences with the plurality of peptide sequences, thereby obtaining a human immunoglobulin variable domain or CDR sequence of an antibody specific to the antigen, wherein the library comprises a human immunoglobulin heavy chain variable domain sequence and / or a human immunoglobulin light chain variable domain sequence that is specific to the antigen and encoded by B cells of the rodent, and wherein the matching includes aligning the plurality of peptide sequences with each other and with the human immunoglobulin heavy chain variable domain and / or human immunoglobulin light chain variable domain sequences of the library.

2. A method for identifying a human immunoglobulin variable domain or CDR sequence of an antigen-specific antibody, the method comprising: (i) obtaining a library of human immunoglobulin heavy chain variable domain sequences and / or human immunoglobulin light chain variable domain sequences, wherein the library comprises a human immunoglobulin heavy chain variable domain sequence and / or a human immunoglobulin light chain variable domain sequence that is specific to the antigen and encoded by B cells of a rodent immunized with the antigen, and the rodent comprises, in its germline genome, an immunoglobulin heavy chain variable region comprising one or more human heavy chain V gene segments, one or more human D gene segments, and one or more human heavy chain J gene segments, wherein the immunoglobulin heavy chain variable region is operably linked to a constant region, and an immunoglobulin light chain variable region comprising one or more human light chain V gene segments and one or more human light chain J gene segments, wherein the immunoglobulin light chain variable region is operably linked to a constant region; and (ii) matching the library with a plurality of peptide sequences derived from human immunoglobulin heavy chain variable domains and / or human immunoglobulin light chain variable domains from an antibody population produced by the rodent immunized with the antigen, wherein the antibody population is specific to the antigen, the peptide sequences include CDR sequences, and the matching comprises aligning the plurality of peptide sequences with each other and with the human immunoglobulin heavy chain variable domain and / or human immunoglobulin light chain variable domain sequences of the library.

3. The method according to claim 1 or 2, wherein the human immunoglobulin heavy chain variable domain sequence and / or human immunoglobulin light chain variable domain sequence of the library are obtained by sequencing a sample containing a population of B cells from the bone marrow and / or spleen of a rodent.

4. The method according to any one of claims 1 to 3, wherein the human immunoglobulin heavy chain variable domain sequence and / or human immunoglobulin light chain variable domain sequence of the library are obtained by sequencing the cDNA of a rodent containing a rearranged heavy chain VDJ sequence and / or a rearranged light chain VJ sequence.

5. The method according to claim 3 or 4, wherein the sequencing includes next-generation DNA sequencing.

6. The method according to any one of claims 1 to 5, wherein the antibody population specific to the antigen is derived from the serum, plasma, lymphoid organs, intestines, cerebrospinal fluid, brain, spinal cord, and / or placenta of the rodent.

7. The method according to any one of claims 1 to 6, wherein the plurality of peptide sequences derived from the human immunoglobulin heavy chain variable domain and / or human immunoglobulin light chain variable domain are obtained or determined by mass spectrometry.

8. The method according to claim 7, wherein the mass spectrometry includes a combination of liquid chromatography and mass spectrometry (LC-MS).

9. The method according to claim 7 or 8, wherein the antibody population specific to the antigen is denatured before MS analysis.

10. The method according to any one of claims 7 to 9, wherein the antibody population specific to the antigen is proteolytically digested before MS analysis.

11. The method according to any one of claims 7 to 10, wherein the antibody population specific to the antigen is enriched for one or more features before MS analysis.

12. The method according to claim 11, wherein the antibody population specific to the antigen is enriched for antibodies that bind to the antigen.

13. The method according to claim 12, wherein the antibody population specific to the antigen is depleted of antibodies that bind to a second, different antigen.

14. The method according to any one of claims 7 to 13, wherein the matching of the plurality of peptide sequences with the library is based on: (1) matching of a CDR3 sequence in the library of human immunoglobulin heavy chain variable domain sequences and / or human immunoglobulin light chain variable domain sequences with a unique peptide obtained or determined by MS; (2) matching of unique CDR1 and / or CDR2 sequences in the library of human immunoglobulin heavy chain variable domain sequences and / or human immunoglobulin light chain variable domain sequences with one or more unique peptides obtained or determined by MS; (3) matching of one or more framework sequences in the library of human immunoglobulin heavy chain variable domain sequences and / or human immunoglobulin light chain variable domain sequences with one or more unique peptides obtained or determined by MS; (4) the next-generation sequencing count of the sequences in the library of human immunoglobulin heavy chain variable domain sequences and / or human immunoglobulin light chain variable domain sequences; (5) exclusion of CDR sequences containing methionine; (6) exclusion of CDR sequences that may be N-glycosylated; or (7) any combination thereof.

15. The method according to any one of claims 1 to 14, wherein by matching the plurality of peptide sequences with the library, a plurality of human immunoglobulin variable domain sequences or CDR sequences of an antibody specific to the antigen are identified, and the plurality of human immunoglobulin variable domain sequences or CDR sequences are ranked.

16. The method according to any one of claims 1 to 15, wherein the rodent is a rat.

17. The method according to any one of claims 1 to 15, wherein the rodent is a mouse.

18. The method according to any one of claims 1 to 17, wherein the immunoglobulin heavy chain variable region in the germline genome of the rodent is operably linked to a rodent heavy chain constant region, and / or the immunoglobulin light chain variable region in the germline genome of the rodent is operably linked to a rodent light chain constant region.

19. The method according to claim 18, wherein the immunoglobulin heavy chain variable region operably linked to the rodent heavy chain constant region is located at the endogenous rodent heavy chain locus, and / or the immunoglobulin light chain variable region operably linked to the rodent light chain constant region is located at the endogenous rodent light chain locus.

20. The method according to any one of claims 1 to 19, wherein the immunoglobulin heavy chain variable region in the germline genome of the rodent comprises a plurality of human heavy chain V gene segments, a plurality of human D gene segments, and a plurality of human heavy chain J gene segments and is operably linked to a rodent heavy chain constant region, and wherein the immunoglobulin light chain variable region in the germline genome of the rodent comprises: (i) a universal light chain coding sequence operably linked to a rodent light chain constant region and comprising a rearranged human light chain variable region that includes a single human VL gene segment and a single human JL gene segment; (ii) a restricted light chain variable region operably linked to a rodent light chain constant region and including two unrearranged human VL gene segments and one or more unrearranged human JL gene segments; or (iii) a histidine-modified light chain variable region operably linked to a rodent light chain constant region, comprising one or more human light chain V gene segments and one or more human light chain J gene segments, and further comprising a substitution of at least one non-histidine residue with a histidine residue or an insertion of at least one histidine residue.

21. The method according to any one of claims 1 to 19, wherein the immunoglobulin light chain variable region in the germline genome of the rodent comprises a plurality of human light chain V gene segments and a plurality of human light chain J gene segments and is operably linked to a rodent light chain constant region, and wherein the immunoglobulin heavy chain variable region comprises: (i) a restricted unrearranged heavy chain variable region operably linked to a rodent heavy chain constant region and containing a single human VH gene segment, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments; (ii) a universal heavy chain coding sequence operably linked to a rodent heavy chain constant region and comprising a single rearranged human heavy chain variable region that includes a single human VH gene segment, a single human DH gene segment, and a single human JH gene segment; or (iii) a histidine-modified unrearranged heavy chain variable region operably linked to a rodent heavy chain constant region, comprising one or more unrearranged human VH gene segments, one or more unrearranged human DH gene segments, and one or more unrearranged human JH gene segments, and further comprising a substitution of at least one non-histidine residue with a histidine residue or an insertion of at least one histidine residue.

22. The method according to any one of claims 1 to 19, wherein the immunoglobulin light chain variable region in the germline genome of the rodent comprises a universal light chain coding sequence including a rearranged human light chain variable region comprising a single human Vκ gene segment and a single human Jκ gene segment, wherein the rearranged human light chain variable region is located at the endogenous rodent κ light chain locus and is operably linked to a rodent light chain constant region, and wherein the immunoglobulin heavy chain variable region comprises a plurality of human heavy chain V gene segments, a plurality of human D gene segments, and a plurality of human heavy chain J gene segments and is operably linked to a rodent heavy chain constant region.

23. The method according to any one of claims 1 to 19, wherein the immunoglobulin light chain variable region in the germline genome of the rodent comprises an engineered immunoglobulin κ light chain locus comprising a single rearranged human immunoglobulin λ light chain variable region, which includes a human Vλ gene segment joined to a human Jλ gene segment, and wherein the immunoglobulin heavy chain variable region comprises a plurality of human heavy chain V gene segments, a plurality of human D gene segments, and a plurality of human heavy chain J gene segments and is operably linked to a rodent heavy chain constant region.

24. The method according to any one of claims 1 to 23, wherein the rodent further comprises a functional ADAM6 gene, and optionally the functional ADAM6 gene is a rodent ADAM6 gene.

25. The method according to any one of claims 1 to 24, wherein the rodent further expresses an exogenous terminal deoxynucleotidyltransferase (TdT) gene.

26. The method according to any one of claims 1 to 25, further comprising expressing the nucleotide sequence encoding the human immunoglobulin heavy chain variable domain and / or human immunoglobulin light chain variable domain obtained after matching in a recombinant antigen-binding protein.

27. The method according to claim 26, wherein the recombinant antigen-binding protein is a human antibody.

28. The method according to claim 26, wherein the recombinant antigen-binding protein is a bispecific antibody.

29. A method for producing a fully human immunoglobulin heavy chain and / or a fully human immunoglobulin light chain, the method comprising: (a) obtaining a human immunoglobulin heavy chain variable domain sequence and / or a human immunoglobulin light chain variable domain sequence by the method according to any one of claims 1 to 28; (b) operably linking a nucleic acid encoding the human immunoglobulin heavy chain variable domain to a nucleic acid encoding a human immunoglobulin heavy chain constant domain to form a fully human immunoglobulin heavy chain, and / or operably linking a nucleic acid encoding the human immunoglobulin light chain variable domain to a nucleic acid encoding a human immunoglobulin light chain constant domain to form a fully human immunoglobulin light chain; and (c) expressing the fully human immunoglobulin heavy chain and / or the fully human immunoglobulin light chain.
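The selection criteria of claim 14 and the ranking of claim 15 can be illustrated with a minimal sketch. The claims do not state how the criteria are weighted or ordered, so the sketch makes three assumptions: a CDR3 peptide match is treated as required, CDRs containing methionine or the canonical N-glycosylation sequon N-X-S/T (X not proline) are excluded, and the remaining candidates are sorted by CDR1/CDR2 peptide matches, then framework peptide matches, then NGS count. The `Candidate` fields, function names, and toy data are hypothetical.

```python
import re
from dataclasses import dataclass

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser/Thr.
SEQUON = re.compile(r"N[^P][ST]")


@dataclass
class Candidate:
    name: str
    cdr1: str
    cdr2: str
    cdr3: str
    cdr3_matched: bool       # criterion (1): unique MS peptide covers CDR3
    cdr12_peptides: int      # criterion (2): unique peptides on CDR1/CDR2
    framework_peptides: int  # criterion (3): unique peptides on framework
    ngs_count: int           # criterion (4): NGS read count


def is_excluded(c):
    """Criteria (5) and (6): methionine or a possible N-glycosylation site."""
    return any("M" in s or SEQUON.search(s) for s in (c.cdr1, c.cdr2, c.cdr3))


def rank_candidates(candidates):
    """Filter, then rank. The sort order is an assumption of this sketch."""
    kept = [c for c in candidates if c.cdr3_matched and not is_excluded(c)]
    return sorted(
        kept,
        key=lambda c: (c.cdr12_peptides, c.framework_peptides, c.ngs_count),
        reverse=True,
    )


# Hypothetical toy candidates.
toy_candidates = [
    Candidate("A", "GFTFSSYA", "ISGSGGST", "ARDLGYYGMDV", True, 2, 1, 900),
    Candidate("B", "GFTFSSYA", "INGSGGST", "AKDRGYSSGWYFDY", True, 2, 1, 800),
    Candidate("C", "GFTFSSYA", "ISGSGGST", "AKDRGYSSGWYFDY", True, 2, 1, 50),
    Candidate("D", "GYTFTSYY", "IYPGDGST", "ARGGYDFDY", True, 2, 1, 120),
    Candidate("E", "GYTFTSYY", "IYPGDGST", "ARGGYDFDY", False, 3, 2, 500),
]
ranked = rank_candidates(toy_candidates)
```

In this toy set, candidate A is dropped for a methionine in CDR3, B for an N-G-S sequon in CDR2, and E for lacking a CDR3 peptide match, leaving D ranked ahead of C on NGS count.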