Polypeptide libraries and methods for purifying them
A chaperone-assisted method for purifying nucleic acid-peptide complexes for nanopore sequencing removes the unreacted raw materials and byproducts that remain after complex formation, thereby improving the quality and effectiveness of the sequencing signals.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHENZHEN HUADA GENE INST
- Filing Date
- 2024-12-27
- Publication Date
- 2026-06-30
AI Technical Summary
In existing nanopore sequencing technologies, the purification methods for peptide-nucleic acid complexes cannot effectively remove incompletely reacted raw materials and non-target byproducts, resulting in poor sequencing signal quality and affecting signal analysis.
A nucleic acid-peptide complex containing a chaperone molecule is formed by complementary pairing of a first nucleic acid and a second nucleic acid with a template molecule, with the chaperone molecule and the peptide molecule linked to the nucleic acids; purification is then performed using specific cleavage sites and carriers to remove unreacted and non-target products.
The method improves the quality and proportion of peptide sequencing signals, simplifies the signal analysis process, and enhances the effectiveness of the sequencing signals.
Figure CN122302000A_ABST
Description
Technical Field
[0001] This application relates to the field of biotechnology, specifically to a polypeptide library, a method for purifying the polypeptide library, and a method for sequencing the polypeptide library.
Background Technology
[0002] Proteomics is one of the most important life science research topics of the post-Human Genome Project era, and its foundation lies in reading the amino acid sequences that make up proteins. Developing reliable protein sequencing technologies is crucial for obtaining protein amino acid sequence information more accurately and rapidly. Related protein sequencing methods include Edman degradation, mass spectrometry, sequence fluorescence identification, amino acid side-chain fluorescent labeling identification, terminal amino acid fluorescent identification, and nanopore single-molecule sequencing. Among these, nanopore peptide sequencing measures the change in current signal as a polypeptide chain passes through a nanopore and analyzes its correspondence with different amino acids, and can thereby read polypeptide sequences rapidly at the single-molecule level.
[0003] The current technical principles of nanopore sequencing for qualitative detection of proteins mainly fall into two categories: (1) allowing peptides to pass directly through nanopores under the action of an electric field, and analyzing the peptides using the blockade current signal generated during their passage; (2) linking nucleic acids with peptides to form a "nucleic acid-peptide" complex, allowing the complex to pass through nanopores under the traction of motor proteins such as helicases or polymerases, and analyzing the blockade current signal generated during the passage of the complex. However, forming the "nucleic acid-peptide" complex inevitably leaves unreacted raw materials and non-target byproducts, which generate a large number of invalid and difficult-to-distinguish signals during nanopore sequencing. This not only reduces the proportion of target peptide signals but also adversely affects subsequent signal analysis.
[0004] Therefore, there is an urgent need for a new peptide library and a method for purifying the peptide library so that high-quality sequencing signals can be obtained during sequencing.
Summary of the Invention
[0005] The present invention aims to at least partially solve one of the technical problems in the related art.
[0006] Therefore, a first aspect of the present invention provides a nucleic acid-peptide complex containing a chaperone molecule, the complex comprising: a peptide molecule and a double strand, the double strand comprising a template molecule, a chaperone molecule, a first nucleic acid and a second nucleic acid, at least a portion of the first nucleic acid being complementary to a first segment of the template molecule, at least a portion of the second nucleic acid being complementary to a second segment of the template molecule, the chaperone molecule being attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid, the peptide molecule being attached to the 3' end of the first nucleic acid and / or the 5' end of the second nucleic acid, wherein the template molecule has a first site.
[0007] In some embodiments, the chaperone molecule is selected from any one or more of the following: PEG, spacer, deoxyribose phosphate, ribose phosphate, nucleotide, deoxynucleotide, G-quadruplex, peptide nucleic acid, or locked nucleotide.
[0008] In some embodiments, the spacer is selected from any one or more of the following: Spacer C3, Spacer C6, Spacer 9, Spacer C12, or Spacer 18.
[0009] In some embodiments, the average molecular weight of the PEG is 300-20000.
[0010] In some embodiments, the PEG is selected from any one or more of the following: PEG-300, PEG-400, PEG-800, PEG-1000, PEG-1500, PEG-2000, PEG-3000, PEG-4000, PEG-6000, PEG-8000, or PEG-20000.
[0011] In some embodiments, the first site is located in the first segment or the second segment of the template molecule, or the first segment and the second segment of the template molecule are not adjacent, and the first site is located between the first segment and the second segment of the template molecule.
[0012] In some embodiments, the first site is selected from at least one of an AP (apurinic/apyrimidinic) site, a nuclease recognition site, and a cleavable chemical bond.
[0013] In some embodiments, the nuclease recognition site is: hypoxanthine nucleotide, ribonucleotide, deoxyuridine nucleotide, or restriction endonuclease recognition site.
[0014] In some embodiments, the cleavable chemical bond includes at least one of disulfide bond, diselenide bond, imine bond, acylhydrazone bond, ester bond, and borate ester bond.
[0015] In some embodiments, the number of constituent units of the chaperone molecule is ≥1; the number of amino acids of the polypeptide molecule is ≥2.
[0016] In some embodiments, the polypeptide molecule is linked to the first nucleic acid, the second nucleic acid, and / or the chaperone molecule via a coupling reaction.
[0017] In some embodiments, the coupling reaction comprises a reaction of at least one group of the following functional groups: amine with aryl azide, amine with hydroxymethylphosphine, amine with imidoester, amine with NHS ester, amine with PFP ester, amine or carboxyl with carbodiimide, carbohydrate with acyl hydrazine, hydroxyl with isocyanate, carbonyl with hydrazine, mercapto with maleimide or pyridyl disulfide, mercapto, amine, or hydroxyl with vinyl sulfone or vinyl sulfonamide, and azide with alkyne.
[0018] In some embodiments, the reaction of the azide with the alkyne includes: azide-DBCO click chemistry, azide-OCT click chemistry, azide-DIBO click chemistry, azide-BARAC click chemistry, azide-ALO click chemistry, azide-DIFO click chemistry, azide-MOFO click chemistry, azide-DIBAC click chemistry, azide-DIMAC click chemistry, or azide-cyclooctene click chemistry.
[0019] In some embodiments, the double strand is: a hybrid of a first nucleic acid - an internal (int) DBCO-modified deoxyribonucleotide - a chaperone molecule, a DBCO-modified deoxyribonucleotide - a second nucleic acid, and a template molecule; or a hybrid of a first nucleic acid - an internal (int) DBCO-modified deoxyribonucleotide, a chaperone molecule - an internal (int) DBCO-modified deoxyribonucleotide - a second nucleic acid, and a template molecule.
[0020] In some embodiments, the length of the first nucleic acid is ≥1 nt, the length of the second nucleic acid is ≥2 nt, and the length of the template molecule is ≥1 nt.
[0021] In some embodiments, the length of the first nucleic acid is 5-500 nt; the length of the second nucleic acid is 5-500 nt; and the length of the template molecule is 5-1200 nt.
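The length ranges in paragraph [0021] can be checked mechanically when designing strands. The following is a minimal sketch only: the `DuplexDesign` class, the `length_violations` function, and the example sequences are hypothetical names and values introduced for illustration, and only the numeric ranges come from the text.

```python
from dataclasses import dataclass

# Length ranges in nt, taken from paragraph [0021]. Everything else in this
# block (names, example sequences) is a hypothetical illustration.
LENGTH_RANGES_NT = {
    "first_nucleic_acid": (5, 500),
    "second_nucleic_acid": (5, 500),
    "template": (5, 1200),
}


@dataclass(frozen=True)
class DuplexDesign:
    first_nucleic_acid: str
    second_nucleic_acid: str
    template: str


def length_violations(design: DuplexDesign) -> list[str]:
    """Return the names of strands whose length is outside its allowed range."""
    violations = []
    for name, (low, high) in LENGTH_RANGES_NT.items():
        length = len(getattr(design, name))
        if not low <= length <= high:
            violations.append(name)
    return violations


example = DuplexDesign(
    first_nucleic_acid="ACGT" * 5,   # 20 nt, within range
    second_nucleic_acid="ACG",       # 3 nt, below the 5 nt minimum
    template="ACGT" * 12,            # 48 nt, within range
)
print(length_violations(example))
```

The check covers length only; it does not verify complementarity between the strands and the template segments.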
[0022] In some embodiments, the 5' end of the first nucleic acid carries a phosphate group, and the 3' end of the template molecule carries a single-nucleotide overhang.
[0023] In some embodiments, the complex further includes a linker connected to the 5' end of the first nucleic acid and / or the 3' end of the template molecule; preferably, the linker is a dual-linker, more preferably a Y-linker.
[0024] In some embodiments, the complex includes a first ligation product, a second ligation product, and / or a third ligation product, wherein the polypeptide molecule of the first ligation product is linked to both the 3' end of the first nucleic acid and the 5' end of the second nucleic acid, the polypeptide molecule of the second ligation product is linked only to the 3' end of the first nucleic acid, and the polypeptide molecule of the third ligation product is linked only to the 5' end of the second nucleic acid.
[0025] A second aspect of the present invention provides a method for preparing a nucleic acid-peptide complex containing a chaperone molecule as described in any embodiment of the first aspect of the present invention. The method includes: providing a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid to obtain a first mixture, wherein the chaperone molecule is linked to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid; annealing the first mixture such that the first nucleic acid and the second nucleic acid in the first mixture pair complementarily with the template molecule, thereby obtaining a second mixture containing a double strand; and adding a peptide molecule to the second mixture such that the peptide molecule is linked to the 3' end of the first nucleic acid and / or the 5' end of the second nucleic acid through a coupling reaction, thereby obtaining the nucleic acid-peptide complex containing the chaperone molecule. It is understood that the characteristics of the nucleic acid-peptide complex containing a chaperone molecule proposed in the first aspect of the present invention also apply to the method of the second aspect, and are not repeated here.
[0026] In some embodiments, the molar ratio of the template molecule, the first nucleic acid, the second nucleic acid, and the polypeptide molecule in the first mixture is 1:1:1:1.
[0027] In some embodiments, the annealing process includes heating the first mixture to 65°C and cooling it to 25°C at a rate of 0.1°C / s.
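As a worked illustration of the annealing ramp in paragraph [0027], cooling from 65°C to 25°C at 0.1°C / s takes (65 - 25) / 0.1 = 400 s, or about 6 min 40 s. The sketch below computes that duration and a list of temperature setpoints. The function name `annealing_profile` and the 10 s setpoint spacing are assumptions made for this example; they do not come from the application or from any thermocycler interface.

```python
def annealing_profile(start_c=65.0, end_c=25.0, rate_c_per_s=0.1, step_s=10):
    """Return (ramp duration in seconds, list of (time_s, temperature_c)).

    Defaults follow paragraph [0027]; step_s is a hypothetical spacing
    between setpoints and must be a positive integer number of seconds.
    """
    if rate_c_per_s <= 0 or step_s <= 0:
        raise ValueError("rate and step must be positive")
    if start_c < end_c:
        raise ValueError("this sketch only models a cooling ramp")

    duration_s = (start_c - end_c) / rate_c_per_s

    setpoints = []
    t = 0
    # The small tolerance guards against floating-point rounding in duration_s.
    while t < duration_s - 1e-9:
        setpoints.append((t, start_c - rate_c_per_s * t))
        t += step_s
    setpoints.append((duration_s, end_c))  # final setpoint at the end temperature
    return duration_s, setpoints


duration_s, setpoints = annealing_profile()
print(f"ramp time: {duration_s:.0f} s, {len(setpoints)} setpoints")
```

The hold time at 65°C before the ramp is not specified in the paragraph, so it is left out of the profile.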
[0028] In some embodiments, the coupling reaction is carried out at room temperature for 12-72 hours.
[0029] In some embodiments, the method further includes: providing a linker, and contacting the linker with the nucleic acid-peptide complex containing the chaperone molecule, such that the linker is attached to the 5' end of the first nucleic acid in the complex and / or the 3' end of the template molecule, thereby obtaining a nucleic acid-peptide complex containing the linker and the chaperone molecule.
[0030] In some embodiments, the linker is preferably a dual-linker, more preferably a Y-linker, and most preferably a Y-linker carrying a motor protein.
[0031] A third aspect of the present invention provides a method for purifying nucleic acid-peptide complexes containing chaperone molecules, the method comprising:
[0032] Step a. Provide a mixture containing a ligation product, wherein the ligation product is a nucleic acid-peptide complex containing a chaperone molecule according to any embodiment of the first aspect or a nucleic acid-peptide complex containing a chaperone molecule prepared by the preparation method according to any embodiment of the second aspect;
[0033] Step b. Capture the first ligation product, the second ligation product, and / or the third ligation product in the mixture by at least one of the carrier and the carrier complex, wherein the carrier complex comprises a third nucleic acid and the carrier;
[0034] Step c. Using a first reagent or a first physical condition, cleave the first site of the template molecule in the first ligation product, the second ligation product, and / or the third ligation product captured on at least one of the carrier and the carrier complex to obtain a first purified product; and
[0035] Step d. The second site in the first purified product is cleaved using a second reagent or a second physical condition to obtain a second purified product, wherein at least one of the third nucleic acid, the second nucleic acid, and the second segment of the template molecule has the second site; and the first reagent is different from the second reagent, and the first physical condition is different from the second physical condition.
[0036] In some embodiments, in step c, cleavage separates the template molecule into a first portion consisting of the sequence from the 5' end of the template molecule to the first site and a second portion consisting of the sequence from the first site to the 3' end of the template molecule. The first ligation product remains linked to at least one of the carrier and the carrier complex, as the first purified product, in its complete form comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the linker, and the first and second portions of the template molecule. The second ligation product remains linked to at least one of the carrier and the carrier complex, as the first purified product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the second nucleic acid and the second portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the second nucleic acid, the chaperone molecule, and the second portion of the template molecule. The third ligation product remains linked to at least one of the carrier and the carrier complex, as the first purified product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the second nucleic acid, the polypeptide molecule, and the second portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the second portion of the template molecule.
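The outcomes listed for step c can be reproduced with a small connectivity model. This is a simplified sketch, not the application's method: each species is drawn as a graph whose edges are covalent links or base pairing, cleaving the first site removes the edge between the two template portions, and whatever is still connected to the carrier is retained. The node names, the function names, and the assumptions that the carrier holds the second nucleic acid directly, that the linker sits on the first nucleic acid, and that the first site lies between the two template segments are all illustrative choices.

```python
from collections import deque

FIRST_SITE = ("template_part1", "template_part2")  # template backbone at the first site


def build_species(peptide_on, chaperone_on):
    """Return the edge set of one captured species.

    peptide_on: set of strands ("NA1", "NA2") the polypeptide is coupled to.
    chaperone_on: "NA1" or "NA2", the strand carrying the chaperone molecule.
    """
    edges = {
        ("linker", "NA1"),          # linker on the first nucleic acid
        ("NA1", "template_part1"),  # base pairing with the first segment
        ("NA2", "template_part2"),  # base pairing with the second segment
        FIRST_SITE,
        ("carrier", "NA2"),         # capture through the second nucleic acid
        ("chaperone", chaperone_on),
    }
    for strand in peptide_on:
        edges.add(("peptide", strand))
    return edges


def retained_after_first_cleavage(edges):
    """Cleave the first site, then return every part still joined to the carrier."""
    remaining = edges - {FIRST_SITE}
    adjacency = {}
    for a, b in remaining:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {"carrier"}
    queue = deque(["carrier"])
    while queue:  # breadth-first search outward from the carrier
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {"carrier"}


species = {
    "first ligation product": build_species({"NA1", "NA2"}, "NA1"),
    "second ligation product": build_species({"NA1"}, "NA1"),
    "third ligation product": build_species({"NA2"}, "NA1"),
}
for name, edges in species.items():
    print(name, sorted(retained_after_first_cleavage(edges)))
```

In this model only the first ligation product, where the polypeptide bridges both nucleic acids, keeps the first nucleic acid and the linker after cleavage, which matches the forms enumerated in the paragraph above.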
[0037] In some embodiments, in step d, after cleavage, 1) where the second site is located on the third nucleic acid, the third nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the third nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to the 3' end of the third nucleic acid. The first ligation product serves as the second purified product in its complete form, comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the linker, the first and second portions of the template molecule, and the fourth portion of the third nucleic acid. The second ligation product serves as the second purified product i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the second nucleic acid, the second portion of the template molecule, and the fourth portion of the third nucleic acid, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the second nucleic acid, the chaperone molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid. The third ligation product serves as the second purified product i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the second nucleic acid, the polypeptide molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid; or
[0038] 2) where the second site is located in the second nucleic acid, the second nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the second nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex. The first ligation product serves as the second purified product in its complete form, comprising the first nucleic acid, the third portion of the second nucleic acid, the chaperone molecule, the polypeptide molecule, the linker, and the first and second portions of the template molecule. The second ligation product serves as the second purified product i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the third portion of the second nucleic acid and the second portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the third portion of the second nucleic acid, the chaperone molecule, and the second portion of the template molecule. The third ligation product serves as the second purified product i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the third portion of the second nucleic acid, the polypeptide molecule, and the second portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the third portion of the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the second portion of the template molecule; or
[0039] 3) where the second site is located in the second segment of the template molecule, the second portion of the template molecule is separated into a third portion consisting of the sequence from the 3' end of the second portion to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex. The first ligation product serves as the second purified product in its complete form, comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the linker, and the first and third portions of the template molecule. The second ligation product serves as the second purified product i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the second nucleic acid and the third portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the second nucleic acid, the chaperone molecule, and the third portion of the template molecule. The third ligation product serves as the second purified product i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, in a form comprising the second nucleic acid, the polypeptide molecule, and the third portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, in a form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the third portion of the template molecule.
[0040] In some embodiments, the mixture containing the ligation product further includes a double strand not linked to a polypeptide molecule, the double strand comprising a linker, a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid, the linker being attached to the 5' end of the first nucleic acid and / or the 3' end of the template molecule, at least a portion of the first nucleic acid being complementary to a first segment of the template molecule, at least a portion of the second nucleic acid being complementary to a second segment of the template molecule, the chaperone molecule being attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid, and the double strand not containing a polypeptide molecule.
[0041] In some embodiments, the method further includes:
[0042] In step b, the double strands not linked to a polypeptide molecule in the mixture are captured by at least one of the carrier and the carrier complex;
[0043] In step c, the first site in the double strand not linked to a polypeptide molecule captured on the carrier or carrier complex is cleaved using a first reagent or a first physical condition. After cleavage, the template molecule is separated into a first portion consisting of the sequence from the 5' end of the template molecule to the first site and a second portion consisting of the sequence from the first site to the 3' end of the template molecule. The double strand not linked to a polypeptide molecule remains linked to at least one of the carrier and the carrier complex, as the first purified product, in a form comprising the second nucleic acid and the second portion of the template molecule.
[0044] In step d, the second site in the first purified product is cleaved using a second reagent or a second physical condition to obtain a second purified product.
[0045] Wherein, after cleavage, 1) where the second site is located on the third nucleic acid, the third nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the third nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to the 3' end of the third nucleic acid, and the double strand not linked to a polypeptide molecule serves as the second purified product in a form comprising the second nucleic acid, the second portion of the template molecule, and the fourth portion of the third nucleic acid; or
[0046] 2) where the second site is located in the second nucleic acid, the second nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the second nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex, and the double strand not linked to a polypeptide molecule serves as the second purified product in a form comprising the third portion of the second nucleic acid and the second portion of the template molecule; or
[0047] 3) where the second site is located in the second segment of the template molecule, the second portion of the template molecule is separated into a third portion consisting of the sequence from the 3' end of the second portion to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex, and the double strand not linked to a polypeptide molecule serves as the second purified product in a form comprising the second nucleic acid and the third portion of the template molecule.
[0048] In some embodiments, the first reagent or the first physical condition can specifically cleave the first site, and the second reagent or the second physical condition can specifically cleave the second site.
[0049] In some embodiments, the first site and the second site are each independently selected from at least one of an AP (apurinic/apyrimidinic) site, a nuclease recognition site, and a cleavable chemical bond, and the first site is different from the second site.
[0050] In some embodiments, the nuclease recognition site is: hypoxanthine nucleotide, ribonucleotide, deoxyuridine nucleotide, or restriction endonuclease recognition site.
[0051] In some embodiments, the cleavable chemical bond includes at least one of disulfide bond, diselenide bond, imine bond, acylhydrazone bond, ester bond, and borate ester bond.
[0052] In some embodiments, the first reagent and the second reagent each independently include at least one of apurinic/apyrimidinic endonuclease 1 (APE1), RNase, endonuclease, USER enzyme, or restriction endonuclease, and the first reagent is different from the second reagent.
[0053] In some embodiments, the first physical condition and the second physical condition each independently include at least one of light, heat or electromagnetic radiation, and the first physical condition is different from the second physical condition.
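The requirement that the first and second sites differ, and that each is cleaved by its own reagent, can be expressed as a simple lookup. The pairing below is an assumption made for illustration: the application lists the site types and the reagents but does not state which reagent goes with which site, so the table follows the commonly known activities of these enzyme classes. The function name `pick_reagents` is hypothetical.

```python
# Assumed pairing of site types with reagents, for illustration only.
SITE_TO_REAGENT = {
    "AP site": "apurinic/apyrimidinic endonuclease 1",
    "deoxyuridine nucleotide": "USER enzyme",
    "ribonucleotide": "RNase",
    "restriction endonuclease recognition site": "restriction endonuclease",
}


def pick_reagents(first_site: str, second_site: str) -> tuple[str, str]:
    """Return (first reagent, second reagent) for two distinct site types."""
    for site in (first_site, second_site):
        if site not in SITE_TO_REAGENT:
            raise ValueError(f"unknown site type: {site}")
    if first_site == second_site:
        raise ValueError("the first site must differ from the second site")
    # Each site type maps to a different reagent, so the reagents differ too.
    return SITE_TO_REAGENT[first_site], SITE_TO_REAGENT[second_site]


print(pick_reagents("AP site", "ribonucleotide"))
```

This check only enforces that the two site types are different. It does not model cross-reactivity between reagents, and it cannot represent two distinct restriction sites cut by different restriction endonucleases, so actual selectivity would need experimental confirmation.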
[0054] In some embodiments, where the first ligation product, the second ligation product, and / or the third ligation product are captured through the carrier complex, the carrier surface of the carrier complex has a first modification, the end of the third nucleic acid has a second modification, and the carrier and the third nucleic acid form the carrier complex through a reaction between the first modification and the second modification.
[0055] At least a portion of the third nucleic acid is complementary to a portion of the second nucleic acid; or at least a portion of the third nucleic acid is complementary to a portion of the second segment of the template molecule.
[0056] In some embodiments, the third nucleic acid has the second site.
[0057] In some embodiments, the first modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the second modification is a group capable of reacting with the first modification.
[0058] In some embodiments, the second modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the first modification is a group capable of reacting with the second modification.
[0059] In some embodiments, the carrier and the third nucleic acid form the carrier complex through the interaction of biotin and streptavidin.
[0060] In some embodiments, the carrier is at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles.
[0061] In some embodiments, the carrier is a magnetic bead.
[0062] In some embodiments, where the first ligation product, the second ligation product, and / or the third ligation product are captured through the carrier, the surface of the carrier has a first modification, and the end of the second nucleic acid has a third modification or the end of the second segment of the template molecule has a fourth modification; the carrier and the second nucleic acid are linked through a reaction between the first modification and the third modification, and the second nucleic acid or the second segment of the template molecule has the second site; or the carrier and the second segment of the template molecule are linked through a reaction between the first modification and the fourth modification, and the second segment of the template molecule or the second nucleic acid has the second site.
[0063] In some embodiments, the first modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the third modification is a group capable of reacting with the first modification.
[0064] In some embodiments, the third modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the first modification is a group capable of reacting with the third modification.
[0065] In some embodiments, the first modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the fourth modification is a group capable of reacting with the first modification.
[0066] In some embodiments, the fourth modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the first modification is a group capable of reacting with the fourth modification.
[0067] In some embodiments, the carrier and the second nucleic acid are linked by biotin and streptavidin.
[0068] In some embodiments, the carrier and the second segment of the template molecule are linked by biotin and streptavidin.
[0069] In some embodiments, the carrier is at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles. In some preferred embodiments, the carrier is magnetic beads.
[0070] A fourth aspect of the present invention provides a polypeptide library obtained by the purification method according to any embodiment of the third aspect.
[0071] In some embodiments, the polypeptide library is obtained by the purification method according to any embodiment of the third aspect, wherein the polypeptide library contains a second purified product having a linker.
[0072] In some embodiments, the polypeptide library is a second purified product obtained after purifying the first, second, and third ligation products in the third aspect embodiment of the present invention.
[0073] In some embodiments, the polypeptide library further includes a second purified product obtained by purifying the double strand not linked to a polypeptide molecule described in the third aspect of the present invention.
[0074] A fifth aspect of the present invention provides a kit for preparing a nucleic acid-peptide complex containing a chaperone molecule, comprising: a double strand, wherein the double strand is used to link with a polypeptide molecule to be tested, wherein the double strand comprises a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid, wherein at least a portion of the first nucleic acid is complementary to a first segment of the template molecule, at least a portion of the second nucleic acid is complementary to a second segment of the template molecule, the chaperone molecule is linked to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid, wherein the template molecule has a first site.
[0075] A sixth aspect of the present invention provides a kit for purifying nucleic acid-peptide complexes containing chaperone molecules, comprising a first reagent, a second reagent, and at least one of a carrier and a carrier complex, wherein the at least one of the carrier and the carrier complex is used to capture the nucleic acid-peptide complex containing the chaperone molecule, the first reagent is used to cleave a first site of the nucleic acid-peptide complex containing the chaperone molecule, the second reagent is used to cleave a second site located in either the nucleic acid-peptide complex containing the chaperone molecule or the carrier complex, and the first reagent and the second reagent are different.
[0076] In some embodiments, the first reagent and the second reagent are each independently selected from at least one of apurinic/apyrimidinic endonuclease 1 (APE1), RNase, endonuclease, USER enzyme, or restriction endonuclease, and the first reagent is different from the second reagent.
[0077] In some embodiments, the carrier is selected from at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles.
[0078] In some embodiments, the surface of the carrier has a first modification, and the first modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen.
[0079] In some embodiments, the carrier complex comprises a third nucleic acid and the carrier, and the third nucleic acid has the second site.
[0080] In some embodiments, the carrier surface in the carrier complex has a first modification, the end of the third nucleic acid has a second modification, and the carrier and the third nucleic acid react between the first modification and the second modification to form the carrier complex.
[0081] The seventh aspect of the present invention proposes the application of the nucleic acid-peptide complex containing a chaperone molecule according to any embodiment of the first aspect of the present invention, the nucleic acid-peptide complex containing a chaperone molecule prepared according to any embodiment of the second aspect, the nucleic acid-peptide complex containing a chaperone molecule purified according to any embodiment of the third aspect, the peptide library according to any embodiment of the fourth aspect, and / or the kit according to any embodiment of the fifth or sixth aspect of the present invention in high-throughput sequencing.
[0082] An eighth aspect of the present invention provides a method for sequencing a polypeptide, the method comprising: adding a polypeptide library according to any embodiment of the fourth aspect of the present invention into a detection solution chamber; under the action of an electric field, controlling the polypeptide library to pass through a nanopore by a motor protein, thereby obtaining an electrical signal corresponding to the polypeptide; and decoding the electrical signal to determine the amino acid information of the polypeptide, wherein optionally, the amino acid information of the polypeptide includes amino acid sequence information and amino acid modification information.
[0083] The advantages and technical effects brought about by the independent claims according to the embodiments of the present invention are as follows:
[0084] (1) The present invention provides a chaperone-assisted peptide library, wherein the single-end free chaperone molecule can pass through the nanopore synchronously with the peptide to be tested. In the synchronous pore-passing state, the volume of "peptide + chaperone molecule" is more suitable for the nanopore protein channel used, which can effectively amplify the signal difference generated by the subtle amino acid residue changes in the peptide to be tested in the nanopore sensing region.
[0085] (2) The purification method for the chaperone-assisted polypeptide-double strand complex described in the embodiments of the present invention introduces specific cleavage sites to facilitate subsequent purification using magnetic beads, effectively removing byproducts, unreacted raw materials, and excess sequencing adapters. This increases the proportion of sequencing signals from the chaperone-assisted target library in nanopore sequencing, yields more effective sequencing signals, and facilitates subsequent sequencing signal processing, classification, modeling, and algorithm development.
Description of the Drawings
[0086] Figure 1 is a schematic diagram of the structure of a nucleic acid-peptide complex containing a chaperone molecule to be purified in one embodiment of the present invention.
[0087] Figure 2 is a schematic diagram of a method by which a carrier complex captures the nucleic acid-peptide complex containing the chaperone molecule to be purified, according to one embodiment of the present invention.
[0088] Figures 3(a), 3(b), 3(c), and 3(d) are schematic flowcharts of purification methods for various nucleic acid-peptide complexes containing chaperone molecules according to embodiments of the present invention.
[0089] Figure 4 is a schematic diagram of the binding structure of the polypeptide library and the anchoring sequence in an embodiment of the present invention.
[0090] Figure 5 is a sequencing signal diagram of the polypeptide library in an embodiment of the present invention.
Detailed Description
[0091] Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain the present invention, and should not be construed as limiting the present invention.
[0092] This invention is based on the inventor's discoveries and understanding of the following facts and problems:
[0093] Currently, nanopore-based protein sequencing technologies mainly fall into two categories. (1) Peptides or proteins are driven directly through nanopores by an electric field, and the translocation signal is used to analyze them. For example, Yu et al. used guanidine hydrochloride to denature proteins, destroying their secondary, tertiary, and quaternary structures so that they become free peptide chains that can translocate through nanopores. By analyzing the current changes generated when different protein peptide chains translocate through the pores, they were able to distinguish between three proteins (DOI: https://doi.org/10.1038/s41587-022-01598-3). Lucas et al. used trypsin to digest the proteins to be tested and passed the resulting peptide fragments through nanopores. They collected the electrical signals during translocation, focusing mainly on the degree of current blockade and the blockade duration. By comparing the signals of the digested peptides with the signals obtained by separately sequencing the corresponding model peptides, the peptides to be tested can be identified (DOI: https://doi.org/10.1021/acs.nanolett.1c02371). (2) The sugar-phosphate backbone of a nucleic acid is linked to a polypeptide to form a "nucleic acid-polypeptide" complex, so that the complex can pass stably through the nanopore under the traction of motor proteins such as helicases or polymerases while its signal is resolved. For example, Huang et al. combined Phi29 DNA polymerase as the motor protein with the MspA nanopore protein, using DNA polymerization to achieve relatively stable control of the translocation rate of specific negatively charged polypeptide chains and readout of the nanopore electrical signals (DOI: https://doi.org/10.1021/acs.nanolett.1c02371); Bai et al. combined MTA helicase as the motor protein with the MspA-M2 nanopore protein to achieve rate-controlled sequencing of single-stranded "nucleic acid-peptide-nucleic acid" constructs through the inherent stepping ability of the helicase on single-stranded DNA (DOI: https://doi.org/10.1039/D1SC04342K); Oxford Nanopore Technologies Limited (ONT) disclosed a protein sequencing scheme based on rate control by an oligonucleotide control protein, in which an adapter-peptide-dsDNA tail complex is first synthesized, the single-stranded DNA is partially annealed to form double-stranded DNA, and the complex is then bound to the oligonucleotide control protein. By controlling the DNA translocation rate through the oligonucleotide control protein, the translocation rate of the peptide is indirectly controlled, thereby enabling the nanopore electrical signal of the peptide to be read (WO2021/111125A1).
[0094] A peptide-oligonucleotide conjugate (POC or PO) is a product formed by linking peptide and nucleic acid molecules together through a specific chemical structure. This type of conjugate combines the functions and advantages of both nucleic acids and peptides, offering advantages such as programmability, high specificity, and diverse biological activities. Nucleic acid-peptide conjugates can be used in nanopore sequencing technology for proteins.
[0095] However, the preparation of the aforementioned "nucleic acid-peptide" complex inevitably produces various byproducts, such as redundant linkers, free target peptides, unlinked nucleic acid products, or peptide-nucleic acid single-end ligation products. The inventors of this application recognized that: firstly, because amino acids are smaller than nucleotides, they cannot generate effective current blocking, resulting in weak sequencing signals; secondly, the various byproducts contaminate the sequencing signal; furthermore, whether the nucleic acid molecule and the peptide are coupled at both ends or only at the 5' or 3' end, there is almost no difference in their physicochemical properties, making traditional HPLC purification methods ineffective. Consequently, incompletely reacted raw materials and undesirable byproducts inevitably remain. Their presence not only reduces the signal proportion of the target peptide but also adversely affects subsequent signal analysis, resulting in poor sequencing signal quality.
[0096] Therefore, it is necessary to develop a nanopore-based protein sequencing library purification method and the protein sequencing library obtained therefrom, so as to achieve the effect of simultaneously purifying the sequencing library and improving the sequencing signal quality.
[0097] Therefore, a first aspect of the present invention provides a nucleic acid-peptide complex containing a chaperone molecule, the complex comprising: a peptide molecule and a double strand, the double strand comprising a template molecule, a chaperone molecule, a first nucleic acid and a second nucleic acid, at least a portion of the first nucleic acid being complementary to a first segment of the template molecule, at least a portion of the second nucleic acid being complementary to a second segment of the template molecule, the chaperone molecule being attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid, the peptide molecule being attached to the 3' end of the first nucleic acid and / or the 5' end of the second nucleic acid, wherein the template molecule has a first site.
[0098] In some embodiments, the chaperone molecule is selected from any one or more of the following: PEG, spacer, deoxyribose phosphate, ribose phosphate, nucleotide, deoxynucleotide, G-quadruplex, peptide nucleic acid, or locked nucleotide.
[0099] In some embodiments, the spacer is selected from any one or more of the following: Spacer C3, Spacer C6, Spacer 9, Spacer C12, or Spacer 18.
[0100] In some embodiments, the average molecular weight of the PEG is 300-20000.
[0101] In some embodiments, the PEG is selected from any one or more of the following: PEG-300, PEG-400, PEG-800, PEG-1000, PEG-1500, PEG-2000, PEG-3000, PEG-4000, PEG-6000, PEG-8000, or PEG-20000.
[0102] In some embodiments, the first site is located in the first segment or the second segment of the template molecule, or the first segment and the second segment of the template molecule are not adjacent, and the first site is located between the first segment and the second segment of the template molecule.
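The three placements of the first site described above can be illustrated with a minimal Python sketch. The 0-based template coordinates, the half-open (start, end) segment intervals, and the function name `locate_first_site` are assumptions made for this illustration and are not part of the application.

```python
def locate_first_site(site, first_segment, second_segment):
    """Classify where a site lies on the template molecule.

    Assumed conventions (hypothetical, for illustration only):
    - positions are 0-based indices along the template;
    - each segment is a half-open interval (start, end).
    """
    (s1, e1), (s2, e2) = first_segment, second_segment
    if s1 <= site < e1:
        return "first segment"
    if s2 <= site < e2:
        return "second segment"
    # The gap only exists when the two segments are not adjacent.
    gap_start, gap_end = min(e1, e2), max(s1, s2)
    if gap_start <= site < gap_end:
        return "between segments"
    return "outside both segments"


# Non-adjacent segments: positions 20-24 form the gap.
print(locate_first_site(22, (0, 20), (25, 45)))
# Adjacent segments: there is no gap, so position 20 is in the second segment.
print(locate_first_site(20, (0, 20), (20, 40)))
```

The function is agnostic to which segment comes first along the template, so it can be used with either orientation.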
[0103] In some embodiments, the first site is selected from at least one of the AP site, nuclease recognition site, and cleavable chemical bond.
[0104] In some embodiments, the nuclease recognition site is: hypoxanthine nucleotide, ribonucleotide, deoxyuridine nucleotide, or restriction endonuclease recognition site.
[0105] In some embodiments, the cleavable chemical bond includes at least one of disulfide bond, diselenide bond, imine bond, acylhydrazone bond, ester bond, and borate ester bond.
[0106] In some embodiments, the number of constituent units of the chaperone molecule is ≥1; the number of amino acids of the polypeptide molecule is ≥2.
[0107] In some embodiments, the polypeptide molecule is linked to the first nucleic acid, the second nucleic acid, and / or the chaperone molecule via a coupling reaction.
[0108] In some embodiments, the coupling reaction comprises a reaction of at least one group of the following functional groups: amine with aryl azide, amine with hydroxymethylphosphine, amine with imine ester, amine with NHS ester, amine with PFP ester, amine or carboxyl with carbodiimide, carbohydrate with acyl hydrazine, hydroxyl with isocyanate, carbonyl with hydrazine, mercapto with maleimide or pyridyl disulfide, mercapto, amine, or hydroxyl with vinyl sulfone or vinyl sulfonamide, and azide with alkyne.
[0109] In some embodiments, the reaction of the azide with the alkyne includes: azide-DBCO click chemistry, azide-OCT click chemistry, azide-DIBO click chemistry, azide-BARAC click chemistry, azide-ALO click chemistry, azide-DIFO click chemistry, azide-MOFO click chemistry, azide-DIBAC click chemistry, azide-DIMAC click chemistry, or azide-cyclooctene click chemistry.
[0110] In some embodiments, the double strand is: a hybrid of a first nucleic acid - an int DBCO-modified deoxyribonucleotide - a chaperone molecule, a DBCO-modified deoxyribonucleotide - a second nucleic acid, and a template molecule; or a hybrid of a first nucleic acid - an int DBCO-modified deoxyribonucleotide, a chaperone molecule - an int DBCO-modified deoxyribonucleotide - a second nucleic acid, and a template molecule.
[0111] In some embodiments, the length of the first nucleic acid is ≥1 nt, the length of the second nucleic acid is ≥2 nt, and the length of the template molecule is ≥1 nt.
[0112] In some embodiments, the length of the first nucleic acid is 5-500 nt; the length of the second nucleic acid is 5-500 nt; and the length of the template molecule is 5-1200 nt.
[0113] In some embodiments, the 5' end of the first nucleic acid contains a phosphate group, and the 3' end of the template molecule contains a single-nucleotide overhang.
[0114] In some embodiments, the complex further includes a linker connected to the 5' end of the first nucleic acid and / or the 3' end of the template molecule; preferably, the linker is a dual-linker, more preferably a Y-linker.
[0115] In some embodiments, the complex includes a first ligation product, a second ligation product, and / or a third ligation product, wherein the polypeptide molecule of the first ligation product is linked to both the 3' end of the first nucleic acid and the 5' end of the second nucleic acid, the polypeptide molecule of the second ligation product is linked only to the 3' end of the first nucleic acid, and the polypeptide molecule of the third ligation product is linked only to the 5' end of the second nucleic acid.
[0116] The structure of the nucleic acid-peptide complex containing a chaperone molecule to be purified in some embodiments of the present invention is described below with reference to Figure 1. In some embodiments, the complex comprises a template molecule (DNA3), a chaperone molecule, a first nucleic acid (DNA1), and a second nucleic acid (DNA2). At least a portion of the first nucleic acid is complementary to a first segment of the template molecule, and at least a portion of the second nucleic acid is complementary to a second segment of the template molecule. The chaperone molecule is attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid; in other words, the chaperone molecule is a single-ended free molecule. The peptide molecule is attached to the 3' end of the first nucleic acid and/or the 5' end of the second nucleic acid. It can be understood that the peptide molecule (peptide) in the target library is attached to both DNA1 and DNA2 simultaneously, whereas in byproducts the peptide is attached at only one end or is not attached to DNA1 or DNA2 at all. DNA3 has a first site for subsequent cleavage.
[0117] A second aspect of the present invention provides a method for preparing a nucleic acid-peptide complex containing a chaperone molecule as described in any embodiment of the first aspect of the present invention. The method includes: providing a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid to obtain a first mixture, wherein the chaperone molecule is linked to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid; annealing the first mixture such that the first nucleic acid and the second nucleic acid in the first mixture are complementarily paired with the template molecule, thereby obtaining a second mixture containing a double strand; and adding a peptide molecule to the second mixture such that the peptide molecule is linked to the 3' end of the first nucleic acid and/or the 5' end of the second nucleic acid through a coupling reaction, thereby obtaining the nucleic acid-peptide complex containing the chaperone molecule.
[0118] In some embodiments, the molar ratio of the template molecule, the first nucleic acid, the second nucleic acid, and the polypeptide molecule in the first mixture is 1:1:1:1.
[0119] In some embodiments, the annealing process includes heating the first mixture to 65 °C and cooling it to 25 °C at a rate of 0.1 °C/s.
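For planning purposes, the duration of the cooling ramp follows directly from the stated temperatures and rate: (65 - 25) / 0.1 = 400 s. The short Python sketch below performs this arithmetic; the function name `ramp_seconds` is hypothetical, and the calculation assumes a strictly linear ramp with no hold steps, which the application does not specify.

```python
def ramp_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Duration in seconds of a linear temperature ramp (assumed linear, no holds)."""
    if rate_c_per_s <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(start_c - end_c) / rate_c_per_s


# Values taken from the embodiment: 65 °C down to 25 °C at 0.1 °C/s.
duration = ramp_seconds(65.0, 25.0, 0.1)
print(f"cooling ramp: {duration:.0f} s (about {duration / 60:.1f} min)")
```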
[0120] In some embodiments, the coupling reaction is carried out at room temperature for 12-72 hours.
[0121] In some embodiments, the method further includes: providing a linker and contacting the linker with the nucleic acid-peptide complex containing the chaperone molecule, such that the linker is attached to the 5' end of the first nucleic acid in the complex and/or the 3' end of the template molecule, thereby obtaining a nucleic acid-peptide complex containing the linker and the chaperone molecule.
[0122] In some embodiments, the linker is a dual-linker, more preferably a Y-linker, and most preferably a Y-linker containing a motor protein. Figure 4 of this application shows a Y-shaped linker structure containing a motor protein, in which part of the sequences of DNA5 and DNA6 are complementary and form a double-stranded structure, while the remaining sequences are not complementary and form a forked single-stranded structure, and the motor protein binds to the forked single-stranded DNA.
[0123] A third aspect of the present invention provides a method for purifying nucleic acid-peptide complexes containing chaperone molecules, the method comprising:
[0124] Step a. Provide a mixture containing a ligation product, wherein the ligation product is a nucleic acid-peptide complex containing a chaperone molecule according to any embodiment of the first aspect or a nucleic acid-peptide complex containing a chaperone molecule prepared by the preparation method according to any embodiment of the second aspect;
[0125] Step b. Capture the first ligation product, the second ligation product, and/or the third ligation product in the mixture by at least one of the carrier and the carrier complex, wherein the carrier complex comprises a third nucleic acid and the carrier;
[0126] Step c. Using a first reagent or a first physical condition, cleave the first site of the template molecule in the first ligation product, the second ligation product, and/or the third ligation product captured on at least one of the carrier and the carrier complex to obtain a first purified product; and
[0127] Step d. The second site in the first purified product is cleaved using a second reagent or a second physical condition to obtain a second purified product, wherein at least one of the third nucleic acid, the second nucleic acid, and the second segment of the template molecule has the second site; and the first reagent is different from the second reagent, and the first physical condition is different from the second physical condition.
[0128] In some embodiments, the amino acids of the polypeptide include one or more of any of the 20 canonical amino acids, any unnatural amino acids, and any non-canonical amino acids.
[0129] In some embodiments, a mixture containing ligation products refers to the ligation products (nucleic acid-peptide complexes) and various byproducts generated during the preparation of nucleic acid-peptide complexes, which may include, but are not limited to, redundant linkers, free peptide molecules, free duplexes, and single-end ligation products. It is understood that a free duplex consists of a template molecule and a first nucleic acid and a second nucleic acid not linked to a peptide molecule. A single-end ligation product is formed by a template molecule, a first nucleic acid linked to a peptide molecule, and a second nucleic acid not linked to a peptide molecule; or by a template molecule, a second nucleic acid linked to a peptide molecule, and a first nucleic acid not linked to a peptide molecule; or by a template molecule, a second nucleic acid linked to a peptide molecule, and a first nucleic acid linked to another peptide molecule. The chaperone molecule is attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid; in other words, the chaperone molecule is a single-end free molecule.
[0130] In some embodiments, the ligation product includes a double-ended ligation product and a single-ended ligation product, wherein the polypeptide molecule is simultaneously ligated to the first nucleic acid and the second nucleic acid to form the double-ended ligation product, i.e., the first ligation product, and wherein the polypeptide molecule is respectively ligated to the first nucleic acid or the second nucleic acid to form the single-ended ligation product, i.e., the second ligation product or the third ligation product.
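The naming of the ligation products can be summarized as a small decision rule on where the polypeptide is attached. The Python sketch below is only an illustration of that rule; the function name and the boolean inputs are hypothetical, and the "free double strand" label for the case with no attached peptide follows the description of byproducts above.

```python
def classify_ligation_product(on_first_3p: bool, on_second_5p: bool) -> str:
    """Name a product from the peptide attachment points.

    on_first_3p:  peptide is linked to the 3' end of the first nucleic acid
    on_second_5p: peptide is linked to the 5' end of the second nucleic acid
    """
    if on_first_3p and on_second_5p:
        return "first ligation product (double-ended)"
    if on_first_3p:
        return "second ligation product (single-ended)"
    if on_second_5p:
        return "third ligation product (single-ended)"
    return "free double strand (no peptide)"


for attachment in [(True, True), (True, False), (False, True), (False, False)]:
    print(attachment, "->", classify_ligation_product(*attachment))
```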
[0131] In some embodiments, the sequence length of the template molecule can be any length between 1 and 10,000 bases.
[0132] In some embodiments, in step c, after cleavage, the template molecule is separated into a first portion consisting of the sequence from the 5' end of the template molecule to the first site and a second portion consisting of the sequence from the first site to the 3' end of the template molecule. For the first ligation product, the first ligation product remains linked to at least one of the carrier and the carrier complex in its complete form, comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, and the first and second portions of the template molecule, as the first purified product. Figure 3(a) (2) shows the structure of the first purified product obtained after purification of the first ligation product. For the second ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the second ligation product, in the form comprising the second nucleic acid and the second portion of the template molecule, or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the second ligation product, in the form comprising the second nucleic acid, the chaperone molecule, and the second portion of the template molecule, is linked to at least one of the carrier and the carrier complex as the first purified product. Figure 3(b) (2) shows the structure of the first purified product obtained after purification of the second ligation product in i) above.
For the third ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the third ligation product, in the form comprising the second nucleic acid, the polypeptide molecule, and the second portion of the template molecule, or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the third ligation product, in the form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the second portion of the template molecule, is linked to at least one of the carrier and the carrier complex as the first purified product. Figure 3(c) (2) shows the structure of the first purified product obtained after purification of the third ligation product in i) above. It can be understood that the first portion of the template molecule in the second and third ligation products is no longer linked to the carrier after cleavage and is free in solution, so it can be removed by a washing step (e.g., washing the carrier).
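The outcome of step c for each ligation product can be restated as a lookup, which may make the case analysis easier to check. The following Python sketch is a simplified model under stated assumptions: the component labels, the function name `retained_after_step_c`, and the string keys for the product and the chaperone position are all hypothetical, and the model only encodes what this paragraph says remains on the carrier.

```python
def retained_after_step_c(product: str, chaperone_on: str) -> set:
    """Components still linked to the carrier after cleaving the first site.

    product:      "first", "second", or "third" ligation product
    chaperone_on: "first_3p" (3' end of the first nucleic acid) or
                  "second_5p" (5' end of the second nucleic acid)
    """
    if product not in {"first", "second", "third"}:
        raise ValueError(f"unknown product: {product}")
    if chaperone_on not in {"first_3p", "second_5p"}:
        raise ValueError(f"unknown chaperone position: {chaperone_on}")

    if product == "first":
        # The double-ended product stays on the carrier in its complete form.
        return {
            "first nucleic acid", "second nucleic acid", "chaperone",
            "polypeptide", "adapter",
            "template first portion", "template second portion",
        }

    # Single-ended products keep only the half anchored via the second nucleic acid.
    retained = {"second nucleic acid", "template second portion"}
    if product == "third":
        retained.add("polypeptide")   # peptide sits on the second nucleic acid
    if chaperone_on == "second_5p":
        retained.add("chaperone")     # chaperone sits on the retained half
    return retained


print(sorted(retained_after_step_c("second", "first_3p")))
print(sorted(retained_after_step_c("third", "second_5p")))
```

In this model only the first ligation product retains the adapter and the first portion of the template after washing.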
[0133] In some embodiments, in step d, after cleavage, 1) based on the second site being located on the third nucleic acid, the third nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the third nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to the 3' end of the third nucleic acid. The first ligation product is the second purified product in its complete form, comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, the first portion and the second portion of the template molecule, and the fourth portion of the third nucleic acid, as shown in Figure 3(a) (3). For the second ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the second ligation product is the second purified product in the form comprising the second nucleic acid, the second portion of the template molecule, and the fourth portion of the third nucleic acid, or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the second ligation product is the second purified product in the form comprising the second nucleic acid, the chaperone molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid. Figure 3(b) (3) shows the structure of the second purified product obtained after purification of the second ligation product.
For the third ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the third ligation product is the second purified product in the form comprising the second nucleic acid, the polypeptide molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid; or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the third ligation product is the second purified product in the form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid. Figure 3(c) (3) shows the structure of the second purified product obtained after purification of the third ligation product.
[0134] 2) Based on the second site being located in the second nucleic acid, the second nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the second nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex. The first ligation product is the second purified product in its complete form, comprising the first nucleic acid, the third portion of the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, and the first portion and the second portion of the template molecule. For the second ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the second ligation product is the second purified product in the form comprising the third portion of the second nucleic acid and the second portion of the template molecule, or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the second ligation product is the second purified product in the form comprising the third portion of the second nucleic acid, the chaperone molecule, and the second portion of the template molecule. For the third ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the third ligation product is the second purified product in the form comprising the third portion of the second nucleic acid, the polypeptide molecule, and the second portion of the template molecule, or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the third ligation product is the second purified product in the form comprising the third portion of the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the second portion of the template molecule; or
[0135] 3) Based on the second site being located in the second segment of the template molecule, the second portion of the template molecule is separated into a third portion consisting of the sequence from the 3' end of the second portion to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex. The first ligation product is the second purified product in its complete form, comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, and the first portion and the third portion of the template molecule. For the second ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the second ligation product is the second purified product in the form comprising the second nucleic acid and the third portion of the template molecule, or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the second ligation product is the second purified product in the form comprising the second nucleic acid, the chaperone molecule, and the third portion of the template molecule. For the third ligation product, i) based on the chaperone molecule being linked to the 3' end of the first nucleic acid, the third ligation product is the second purified product in the form comprising the second nucleic acid, the polypeptide molecule, and the third portion of the template molecule, or ii) based on the chaperone molecule being linked to the 5' end of the second nucleic acid, the third ligation product is the second purified product in the form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the third portion of the template molecule.
[0136] In some embodiments, the mixture containing ligation products further includes a double strand not ligated to a polypeptide molecule, the unligated double strand comprising a linker, a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid, the linker being attached to the 5' end of the first nucleic acid and/or the 3' end of the template molecule, at least a portion of the first nucleic acid being complementary to a first segment of the template molecule, at least a portion of the second nucleic acid being complementary to a second segment of the template molecule, the chaperone molecule being attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid, and the unligated double strand not containing a polypeptide molecule.
[0137] In some embodiments, the first reagent or the first physical condition can specifically cleave the first site, and the second reagent or the second physical condition can specifically cleave the second site.
[0138] In some embodiments, the first site and the second site are each independently selected from at least one of the AP site, the nuclease recognition site, and the cleavable chemical bond, and the first site is different from the second site.
[0139] In some embodiments, the nuclease recognition site is: hypoxanthine nucleotide, ribonucleotide, deoxyuridine nucleotide, or restriction endonuclease recognition site.
[0140] In some embodiments, the cleavable chemical bond includes at least one of disulfide bond, diselenide bond, imine bond, acylhydrazone bond, ester bond, and borate ester bond.
[0141] In some embodiments, the first reagent and the second reagent each independently include at least one of apurinic/apyrimidinic endonuclease 1 (APE1), RNase, endonuclease, USER enzyme, or restriction endonuclease, and the first reagent is different from the second reagent.
[0142] In some embodiments, the first physical condition and the second physical condition each independently include at least one of light, heat or electromagnetic radiation, and the first physical condition is different from the second physical condition.
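The requirement that the two cleavage steps be distinct (different sites, and different reagents or physical conditions) can be expressed as a simple consistency check. The Python sketch below is an illustration only: the `CleavageStep` class, the `check_distinct` function, and the free-text labels are hypothetical, and the site and reagent pairings in the usage example (an AP site cleaved by APE1, a deoxyuridine nucleotide cleaved by USER enzyme) are example choices rather than pairings prescribed by the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleavageStep:
    site: str     # e.g. "AP site", "deoxyuridine nucleotide"
    trigger: str  # reagent or physical condition used to cleave the site


def check_distinct(first: CleavageStep, second: CleavageStep) -> list:
    """Return a list of problems; an empty list means the two steps are distinct."""
    problems = []
    if first.site == second.site:
        problems.append("first and second sites must differ")
    if first.trigger == second.trigger:
        problems.append("first and second reagents/conditions must differ")
    return problems


# Example pairing (an assumption for illustration).
step_c = CleavageStep(site="AP site", trigger="APE1")
step_d = CleavageStep(site="deoxyuridine nucleotide", trigger="USER enzyme")
print(check_distinct(step_c, step_d))   # no problems reported
print(check_distinct(step_c, step_c))   # both constraints violated
```

Keeping the two steps distinct is what allows step c to leave the second site intact until step d.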
[0143] In some embodiments, the first ligation product, the second ligation product, and the third ligation product are captured by the carrier complex, wherein the surface of the carrier in the carrier complex has a first modification, the end of the third nucleic acid has a second modification, and the carrier and the third nucleic acid are joined by a reaction between the first modification and the second modification to form the carrier complex.
[0144] At least a portion of the third nucleic acid is complementary to a portion of the second nucleic acid; or at least a portion of the third nucleic acid is complementary to a portion of the second segment of the template molecule.
[0145] In some embodiments, the third nucleic acid has the second site.
[0146] In some embodiments, the first modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the second modification is a group capable of reacting with the first modification.
[0147] In some embodiments, the second modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the first modification is a group capable of reacting with the second modification.
[0148] In some embodiments, the carrier and the third nucleic acid form the carrier complex through the interaction of biotin and streptavidin.
[0149] In some embodiments, the carrier is at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles.
[0150] In some embodiments, the carrier is a magnetic bead.
[0151] In some embodiments, the first ligation product, the second ligation product, and / or the third ligation product are captured by the carrier, wherein the surface of the carrier has a first modification, and the end of the second nucleic acid has a third modification or the end of the second segment of the template molecule has a fourth modification; the carrier and the second nucleic acid are linked through a reaction between the first modification and the third modification, and the second nucleic acid or the second segment of the template molecule has the second site; or the carrier and the second segment of the template molecule are linked through a reaction between the first modification and the fourth modification, and the second segment of the template molecule or the second nucleic acid has the second site.
[0152] In some embodiments, the first modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the third modification is a group capable of reacting with the first modification.
[0153] In some embodiments, the third modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the first modification is a group capable of reacting with the third modification.
[0154] In some embodiments, the first modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the fourth modification is a group capable of reacting with the first modification.
[0155] In some embodiments, the fourth modification is selected from amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide, and antigen, and the first modification is a group capable of reacting with the fourth modification.
[0156] In some embodiments, the carrier is linked to the second nucleic acid via biotin and streptavidin.
[0157] In some embodiments, the carrier and the second segment of the template molecule are linked by biotin and streptavidin.
[0158] In some embodiments, the carrier is at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles.
[0159] In some preferred embodiments, the carrier is a magnetic bead.
[0160] Figure 2 illustrates a method of the present invention by which a carrier complex captures the nucleic acid-peptide complex containing a chaperone molecule (first ligation product) that is to be purified. The template molecule (DNA3) in the first ligation product contains a first site (RNase H cleavage site) for subsequent cleavage. The carrier complex consists of a carrier (magnetic beads) and a third nucleic acid (DNA4), the third nucleic acid containing a second site (RNase A cleavage site) for subsequent cleavage. The surface of the carrier (magnetic beads) is modified with streptavidin, and the 5' end of the third nucleic acid (DNA4) is biotin-labeled. The carrier and the third nucleic acid are linked through the binding of biotin and streptavidin.
[0161] With the purification methods discussed above, only the intact, free polypeptide-double-stranded complex (the target nucleic acid-polypeptide complex, i.e., the second purified product of the first ligation product, whose structure is shown as (3) in Figure 3(a)) carries a linker and can therefore yield an effective sequencing signal in the subsequent sequencing step. Other types of second purified products (such as those of the second ligation product, the third ligation product, and the unligated polypeptide molecule) lack a linker, so they cannot be captured by the nanopore and cannot be sequenced. The chaperone-assisted purification method for polypeptide-double-stranded complexes described in this embodiment of the invention introduces specific cleavage sites that enable subsequent purification on magnetic beads, effectively removing byproducts, unreacted raw materials, and excess sequencing linkers. This increases the proportion of sequencing signals from the chaperone-assisted target library in nanopore sequencing, yields more effective sequencing signals, and simplifies subsequent sequencing signal processing, classification, modeling, and algorithm development.
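The selection logic of the two cleavage steps can be illustrated with a small connectivity model. The following is only a sketch under simplifying assumptions that are not part of the source: each captured species is reduced to a set of bonds between hypothetical component labels (LINKER, NA1, NA2, T1, T2, PEPTIDE, BEAD), the linker is assumed to sit on the first-nucleic-acid side, capture is assumed to occur through the second nucleic acid, and a wash is assumed to remove everything no longer tethered to the bead.

```python
from collections import deque

# Hypothetical labels used only in this sketch:
# LINKER, NA1 (first nucleic acid + chaperone), NA2 (second nucleic acid),
# T1 / T2 (template parts on either side of the first site), PEPTIDE,
# BEAD (carrier complex holding NA2 through the second site).
FIRST_SITE = frozenset(("T1", "T2"))
SECOND_SITE = frozenset(("BEAD", "NA2"))

def build_species(peptide_on_na1, peptide_on_na2):
    """Return the bond set of one bead-captured duplex species."""
    bonds = {
        frozenset(("LINKER", "NA1")),  # linker ligated on the NA1 side (assumption)
        frozenset(("NA1", "T1")),      # hybridization to the template
        frozenset(("NA2", "T2")),      # hybridization to the template
        FIRST_SITE,                    # template backbone across the first site
        SECOND_SITE,                   # capture on the carrier complex
    }
    if peptide_on_na1:
        bonds.add(frozenset(("PEPTIDE", "NA1")))
    if peptide_on_na2:
        bonds.add(frozenset(("PEPTIDE", "NA2")))
    return bonds

def tethered_to(bonds, start):
    """Breadth-first search: all components connected to `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for bond in bonds:
            if node in bond:
                for other in bond - {node}:
                    if other not in seen:
                        seen.add(other)
                        queue.append(other)
    return seen

def purify(bonds):
    """Cleave the first site, wash, cleave the second site, collect supernatant."""
    bonds = bonds - {FIRST_SITE}
    on_bead = tethered_to(bonds, "BEAD")
    bonds = {b for b in bonds if b <= on_bead}  # wash away untethered parts
    bonds = bonds - {SECOND_SITE}
    return tethered_to(bonds, "NA2")            # released second purified product

SPECIES = {
    "first ligation product": (True, True),
    "second ligation product": (True, False),
    "third ligation product": (False, True),
    "unligated duplex": (False, False),
}

for name, (on_na1, on_na2) in SPECIES.items():
    released = purify(build_species(on_na1, on_na2))
    print(f"{name}: linker retained = {'LINKER' in released}")
```

In this simplified model, only the species in which the polypeptide bridges both nucleic acids keeps its linker after the first cleavage and wash, which is consistent with the statement above that only the second purified product of the first ligation product can be sequenced.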
[0162] A fourth aspect of the present invention provides a polypeptide library obtained by the purification method described in any embodiment of the third aspect. It is understood that the nucleic acid-peptide complex containing a chaperone molecule, as described in the first aspect of the present invention, can itself also be a polypeptide library (with lower purity). The second purified product obtained by the purification method of the embodiments of the present invention can also be a polypeptide library. In some embodiments, the polypeptide library is obtained by the purification method described in any embodiment of the third aspect, wherein the polypeptide library is a second purified product containing a linker (with high purity).
[0163] In summary, the embodiments of the present invention provide a polypeptide library in which a single-ended free chaperone molecule can pass through a nanopore simultaneously with the polypeptide to be tested, pushing the amino acid residues on the polypeptide to the nanopore wall, enhancing the interaction between the amino acid residues on the polypeptide and the amino acid residues at the nanopore contraction site, and enhancing the electrical signal characteristics of individual amino acids of the polypeptide.
[0164] A fifth aspect of the present invention provides a kit for preparing a nucleic acid-peptide complex containing a chaperone molecule, comprising: a double strand, wherein the double strand is used to link with a polypeptide molecule to be tested, wherein the double strand comprises a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid, wherein at least a portion of the first nucleic acid is complementary to a first segment of the template molecule, at least a portion of the second nucleic acid is complementary to a second segment of the template molecule, the chaperone molecule is linked to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid, wherein the template molecule has a first site.
[0165] A sixth aspect of the present invention provides a kit for purifying nucleic acid-peptide complexes containing chaperone molecules, comprising a first reagent, a second reagent, and at least one of a carrier and a carrier complex, wherein the at least one of the carrier and the carrier complex is used to capture the nucleic acid-peptide complex containing the chaperone molecule, the first reagent is used to cleave a first site of the nucleic acid-peptide complex containing the chaperone molecule, the second reagent is used to cleave a second site located either in the nucleic acid-peptide complex containing the chaperone molecule or in the carrier complex, and the first reagent and the second reagent are different.
[0166] In some embodiments, the first reagent and the second reagent are each independently selected from at least one of apurinic/apyrimidinic endonuclease 1 (APE1), RNase, endonuclease, USER enzyme, or restriction endonuclease, and the first reagent is different from the second reagent.
[0167] In some embodiments, the carrier is selected from at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles.
[0168] In some embodiments, the surface of the carrier has a first modification, and the first modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen.
[0169] In some embodiments, the carrier complex comprises a third nucleic acid and the carrier, and the third nucleic acid has the second site.
[0170] In some embodiments, the carrier surface in the carrier complex has a first modification, the end of the third nucleic acid has a second modification, and the carrier and the third nucleic acid react between the first modification and the second modification to form the carrier complex.
[0171] The seventh aspect of the present invention proposes the application of the nucleic acid-peptide complex containing a chaperone molecule according to any embodiment of the first aspect of the present invention, the nucleic acid-peptide complex containing a chaperone molecule prepared according to any embodiment of the second aspect, the nucleic acid-peptide complex containing a chaperone molecule purified according to any embodiment of the third aspect, the peptide library according to any embodiment of the fourth aspect, and / or the kit according to any embodiment of the fifth or sixth aspect of the present invention in high-throughput sequencing.
[0172] An eighth aspect of the present invention provides a method for sequencing a polypeptide, the method comprising: adding a polypeptide library according to any embodiment of the fourth aspect of the present invention into a detection solution chamber; under the action of an electric field, controlling the polypeptide library to pass through a nanopore by a motor protein, thereby obtaining an electrical signal corresponding to the polypeptide; and decoding the electrical signal to determine the amino acid information of the polypeptide, wherein optionally, the amino acid information of the polypeptide includes amino acid sequence information and amino acid modification information.
[0173] In some embodiments, the nanopore used for sequencing may be one or more of the following protein nanopores: α-hemolysin nanopores, Aerolysin nanopores, MspA nanopores, CsgG nanopores, FraC nanopores, and Phi29 nanopores, including corresponding variants and mutants of the above nanopores, or may be a solid-state nanopore (e.g., graphene nanopores, gold nanopores, silicon nitride nanopores, silica nanopores, alumina nanopores, etc.).
[0174] In summary, the embodiments of the present invention provide a chaperone-assisted peptide library and its application in sequencing, wherein the single-end free chaperone molecule can pass through the nanopore synchronously with the target peptide. In this synchronous passage state, the volume of "peptide + chaperone molecule" is more suitable for the nanopore protein channel used, which can effectively amplify the signal difference caused by subtle changes in amino acid residues in the target peptide in the nanopore sensing region.
[0175] Unless otherwise specified, the experimental methods used in the following examples are conventional methods, performed according to the techniques or conditions described in the literature in this field or according to the product instructions. Unless otherwise specified, the materials and reagents used in the following examples are commercially available.
[0176] Example
[0177] The DNA sequences used in the following examples were all synthesized by Changzhou New Life Technology Co., Ltd.
[0178] The DNA sequences used in the following examples are as follows:
[0179] 1. DNA 1-chaperone molecules: in - As a 5' phosphorylation modification, the structural formula of Int DBCO dT is: XXXXXXXXXX(SEQ ID NO: 4)-(int DBCO dT) is a chaperone molecule, where X is an abasic deoxynucleoside, and its structural formula is:
[0180] 2. DNA2:
[0181] DBCO-TCCCTTTTTTTTTTGCTGTCTTCTGTCGTCGTTTGTCCTCTTGGTGTTCTTCTTGTCAGTG / ddC / (SEQ ID NO: 2), wherein the DBCO structure modified at the 5' end is:
[0182] ddC is a dideoxycytosine nucleoside.
[0183] 3. DNA3:
[0184] AAACGACGACAGAAGACAGCAAAAAAAAAAGGGATTGGAAGGTTAGG / rG / / rA / / rA / / rA / AAAAAAA CACGAGAAGCA (SEQ ID NO: 3), where / rG / and / rA / are the ribonucleotides guanosine and adenosine, respectively.
[0185] 4. DNA4:
[0186] Biotin-TEG- / rC / / rU / / rC / / rU / / rC / / rU / GCACTGACAAGAAGAACACCAAGAGGA / ddC / (SEQ ID NO: 5), where the structural formula of Biotin-TEG is:
[0187]
[0188] 5. DNA5 and DNA6, where DNA5 and DNA6 combine to form a Y-linker.
[0189] DNA5:
[0190] ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZTTTTTTTTTTTTYYYYGGTTGTTTCTGTTGGTGCTGATATTG CT(SEQ ID NO: 6), where Z is an Int C3 Spacer (iSpC3), and its structural formula is:
Y is an Int Spacer 18 (iSp18), and its structure is:
DNA6:
[0191]
[0192] 6. DNA7 is the anchoring sequence, which is partially complementary to DNA6.
[0193] DNA7: / Chol-TEG / TTYYYYTTGACCGCTCGCCTC (SEQ ID NO: 8), where 5'-Chol-TEG represents 5'-cholesterol-polyethylene glycol, and Y stands for Int Spacer 18 (iSp18).
[0194] 7. DNA8:
[0195]
[0196] 8. DNA9:
[0197] Maleimide-TCCCTTTTTTTTTTGCTGTCTTCTGTCGTCGTTTGTCCTCTTGGTGTTCTTCTTGTCAGTG / ddC / (SEQ ID NO: 10), wherein the 5' modified Maleimide has the following structural formula:
[0198]
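The oligonucleotide notation used above, with ribonucleotides and terminal modifications written between slashes, can be parsed programmatically to locate the ribonucleotide stretches that the description designates as the first site (in DNA3) and the second site (in DNA4). The following is a minimal sketch; the function names are hypothetical, whitespace inside the printed sequences is assumed to be a typesetting artifact, and the 5' Biotin-TEG label of DNA4 is omitted from the input string for simplicity.

```python
import re

def tokenize(oligo):
    """Split an oligo string into single bases and /modification/ codes."""
    compact = re.sub(r"\s+", "", oligo)  # assumption: spaces are typesetting artifacts
    return re.findall(r"/[^/]+/|[ACGTU]", compact)

def ribo_runs(tokens):
    """Return (start_index, [codes]) for each run of consecutive ribonucleotides."""
    runs, current, start = [], [], None
    for i, tok in enumerate(tokens):
        if re.fullmatch(r"/r[ACGU]/", tok):
            if not current:
                start = i
            current.append(tok.strip("/"))
        elif current:
            runs.append((start, current))
            current = []
    if current:
        runs.append((start, current))
    return runs

# Sequences copied from the listing above (SEQ ID NO: 3 and SEQ ID NO: 5).
DNA3 = ("AAACGACGACAGAAGACAGCAAAAAAAAAAGGGATTGGAAGGTTAGG "
        "/ rG / / rA / / rA / / rA / AAAAAAA CACGAGAAGCA")
# 5' Biotin-TEG label omitted here; only the nucleotide part is parsed.
DNA4_BODY = ("/ rC / / rU / / rC / / rU / / rC / / rU / "
             "GCACTGACAAGAAGAACACCAAGAGGA / ddC /")

for name, oligo in (("DNA3", DNA3), ("DNA4", DNA4_BODY)):
    for start, codes in ribo_runs(tokenize(oligo)):
        print(f"{name}: ribonucleotide run {codes} starting at token {start}")
```

The sketch only reports where the ribonucleotides are; whether a given stretch is actually cleaved depends on the enzyme and on the hybridization state of the strand.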
[0199] Example 1
[0200] 1.1 Preparation of nucleic acid-peptide complexes
[0201] The present disclosure describes the process for preparing nucleic acid-peptide complexes.
[0202] (1) The synthesized DNA1-intDBCO-chaperone molecules, DNA2, and DNA3 were annealed in 1×PBS (purchased from Sangon Biotech (Shanghai) Co., Ltd., catalog number: E607008-0500) at a ratio of 1:1:1, so that the DNA1-intDBCO-chaperone molecules and DNA2 hybridize to DNA3 by complementary base pairing.
[0203] (2) After annealing, add 1 equivalent (molar ratio) of peptide1 or peptide2 (final concentration: 20 μM) to the above solution and react at room temperature for 24 h. This allows the azide on the peptide to link the peptide molecule, through a click chemistry reaction, to the dibenzocyclooctyne (DBCO) modifications on the DNA1-int DBCO-chaperone molecule and on DNA2, both of which are complementarily paired with DNA3.
[0204] The peptide1 sequence is: GGSGSSGSR{Lys(N3)} (SEQ ID NO: 11).
[0205] The peptide2 sequence is: GRSGSSGSG{Lys(N3)} (SEQ ID NO: 12).
[0206] The structure of {Lys(N3)} is as follows:
[0207]
[0208] (3) Obtain a mixture of nucleic acid-peptide complexes. The mixture contains not only nucleic acid-peptide complexes with double-ended peptide 1 (target product), but also nucleic acid-peptide complexes with peptide 1 coupled only at the 5' end or the 3' end, double-stranded nucleic acid annealing products of uncoupled peptides, and non-target products such as free peptide molecules.
[0209] 1.2 Purification of nucleic acid-peptide complexes
[0210] This disclosure describes a method for purifying a nucleic acid-peptide complex targeting a dual-terminal coupled peptide in a mixture of nucleic acid-peptide complexes. As discussed above, the step of providing the adapter can be performed before or after any step. This embodiment describes the process in detail using the example of providing the adapter first.
[0211] (1) Preparation of Y-type adapters: Anneal the adapter top strand DNA5 and the adapter bottom strand DNA6 to obtain the annealed product, and then co-incubate it with DNA helicase (Dda) to obtain Y-type adapters.
[0212] (2) Add the Y-linker (2 μM) to the nucleic acid-peptide complex mixture at a ratio of 2:1, and use the T4 ligase kit (purchased from New England Biolabs, catalog number: #M2200L) to ligate the nucleic acid-peptide complex with the Y-linker complex to obtain the library to be purified.
[0213] (3) Incubate 10 μL of SA magnetic beads (purchased from Nanjing Novizan Biotechnology Co., Ltd., catalog number N512-01) with 20 pmol of DNA4 in a shaker at room temperature for 1 h, and then discard the supernatant.
[0214] (4) Wash the magnetic beads in (3) twice with 1×PBS (containing 0.1% Tween) to remove excess free DNA4 and obtain DNA4-modified affinity magnetic beads.
[0215] (5) Add the magnetic beads obtained in step (4) to the 20 pmol library to be purified obtained in step (2) and incubate in a shaker at room temperature for 1 h to enrich the library on the surface of the magnetic beads by utilizing the complementary pairing of DNA2 and DNA4.
[0216] (6) Discard the supernatant, retain the magnetic beads that have been incubated in (5), and wash twice with 1×PBS (0.1% Tween).
[0217] (7) Add 10 μL of 1×NEBuffer (purchased from New England Biolabs, catalog number: #B7001) to resuspend the magnetic beads obtained in (6), and add 0.5 μL of the first reagent RNase H (purchased from New England Biolabs, catalog number: M0297L), and digest at 37°C for 1 h.
[0218] (8) Discard the supernatant, retain the digested magnetic beads from (7), and wash twice with 1×PBS (0.1% Tween).
[0219] (9) Add 10 μL of 1×PBS to resuspend the magnetic beads obtained in (8), and add 0.5 μL of the second reagent RNase A (purchased from New England Biolabs, catalog number: #M0305). Digest at 30°C for 40 min.
[0220] (10) Discard the magnetic beads and keep the supernatant.
[0221] (11) Co-incubate with the anchor sequence DNA7 to achieve complementary pairing.
[0222] Step (10) of this embodiment yields the second purified products, namely the free double-ended ligation product, the second nucleic acid and second segment of a single-ended ligation product, and / or the second nucleic acid and second segment of a single-ended ligation product to which a polypeptide molecule is attached. Only the free double-ended ligation product carries an adapter. Step (11) of this embodiment yields a sequencing library that can be inserted into a membrane for nanopore sequencing.
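Step (5) relies on complementary pairing between DNA2 and DNA4. This can be checked directly from the listed sequences. The following is a minimal sketch; it assumes that the 3' dideoxycytidine (ddC) pairs like an ordinary C, it ignores the 5' ribonucleotides and the Biotin-TEG label of DNA4, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_duplex(capture, target):
    """Longest stretch of `capture` whose reverse complement occurs in `target`."""
    best = ""
    n = len(capture)
    for i in range(n):
        # Only try stretches longer than the best one found so far.
        for j in range(n, i + len(best), -1):
            if reverse_complement(capture[i:j]) in target:
                best = capture[i:j]
                break
    return best

# DNA2 (SEQ ID NO: 2) without the 5' DBCO; the 3' ddC is written as C (assumption).
DNA2 = "TCCCTTTTTTTTTTGCTGTCTTCTGTCGTCGTTTGTCCTCTTGGTGTTCTTCTTGTCAGTG" + "C"
# DNA part of DNA4 (SEQ ID NO: 5); the 3' ddC is written as C (assumption).
DNA4_DNA = "GCACTGACAAGAAGAACACCAAGAGGA" + "C"

duplex = longest_duplex(DNA4_DNA, DNA2)
print(f"{len(duplex)} of {len(DNA4_DNA)} nt of DNA4 can pair with DNA2")
```

Under these assumptions, the whole deoxynucleotide part of DNA4 is the reverse complement of the 3' region of DNA2, which is consistent with the capture mechanism described in step (5).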
[0223] 1.3 Nanopore sequencing of nucleic acid-peptide complexes
[0224] The peptide nanopore sequencing library described in section 1.2 was co-incubated with the anchor sequence DNA7 and then added to the solution chamber of the sequencing chip. Nanopore sequencing was performed by applying an external electric field and controlling the translocation rate with motor proteins, and sequencing data were collected. The results are shown in Figure 5, in which the dashed boxes in (a) and (b) correspond to the nanopore sequencing signals of peptide1 and peptide2, respectively. The two signals differ significantly, demonstrating that the proposed method can effectively distinguish peptides with different sequences through the differences in their nanopore sequencing signals. The proportion of effective sequencing signals among all sequencing signals can reach 80-90%.
[0225] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of that feature. In the description of this invention, "a plurality of" means at least two, such as two, three, etc., unless otherwise explicitly specified.
[0226] In this invention, unless otherwise explicitly specified and limited, the terms "installation," "connection," "linking," and "fixing," etc., should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral part; they can refer to a mechanical connection, an electrical connection, or a connection that allows communication between them; they can refer to a direct connection or an indirect connection through an intermediate medium; they can refer to the internal communication of two components or the interaction between two components, unless otherwise explicitly limited. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances.
[0227] In this invention, unless otherwise explicitly specified and limited, a first feature being "above" or "below" a second feature can mean that the first feature is in direct contact with the second feature, or that the first feature is in indirect contact with the second feature through an intermediate medium. Furthermore, a first feature being "above," "over," or "on top of" a second feature can mean that the first feature is directly above or diagonally above the second feature, or simply that the first feature is at a higher horizontal level than the second feature. A first feature being "below," "beneath," or "under" a second feature can mean that the first feature is directly below or diagonally below the second feature, or simply that the first feature is at a lower horizontal level than the second feature.
[0228] In this invention, the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., refer to a specific feature, structure, material, or characteristic described in connection with that embodiment or example, which is included in at least one embodiment or example of the invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
[0229] Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
Claims
1. A nucleic acid-peptide complex containing a chaperone molecule, characterized in that the complex comprises a polypeptide molecule and a double strand; the double strand comprises a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid; at least a portion of the first nucleic acid is complementary to a first segment of the template molecule, and at least a portion of the second nucleic acid is complementary to a second segment of the template molecule; the chaperone molecule is attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid; the polypeptide molecule is attached to the 3' end of the first nucleic acid and / or the 5' end of the second nucleic acid; and the template molecule has a first site.
2. The complex according to claim 1, characterized in that, The chaperone molecule is selected from any one or more of the following: PEG, spacer, deoxyribose phosphate, nucleotide, deoxynucleotide, G-quadruplex, peptide nucleic acid, or locked nucleotide. Optionally, the spacer is selected from any one or more of the following: Spacer C3, Spacer C6, Spacer 9, Spacer C12, or Spacer 18. Optionally, the average molecular weight of the PEG is 300-20000. Preferably, the PEG is selected from any one or more of the following: PEG-300, PEG-400, PEG-800, PEG-1000, PEG-1500, PEG-2000, PEG-3000, PEG-4000, PEG-6000, PEG-8000 or PEG-20000.
3. The complex according to claim 1 or 2, characterized in that, The first site is located in the first segment or the second segment of the template molecule, or The first and second segments of the template molecule are not adjacent, and the first site is located between the first and second segments of the template molecule.
4. The complex according to claim 1, characterized in that the first site is selected from at least one of an AP site, a nuclease recognition site, and a cleavable chemical bond; preferably, the nuclease recognition site is an inosine nucleotide, ribonucleotide, deoxyuridine nucleotide, or restriction endonuclease recognition site; preferably, the cleavable chemical bond includes at least one of a disulfide bond, diselenide bond, imine bond, acylhydrazone bond, ester bond, and borate ester bond.
5. The complex according to claim 1 or 2, characterized in that, The number of constituent units of the chaperone molecule is ≥1; the number of amino acids of the polypeptide molecule is ≥2.
6. The complex according to claim 1, characterized in that, The polypeptide molecule is linked to the first nucleic acid, the second nucleic acid, and / or the chaperone molecule via a coupling reaction. Preferably, the coupling reaction comprises a reaction of at least one group of the following functional groups: amine with aryl azide, amine with hydroxymethylphosphine, amine with imine ester, amine with NHS ester, amine with PFP ester, amine or carboxyl group with carbodiimide, carbohydrate with acyl hydrazine, hydroxyl group with isocyanate, carbonyl group with hydrazine, mercapto group with maleimide or pyridyl disulfide, mercaptoamine or hydroxyl group with vinyl sulfone or vinyl sulfonamide, and azide with alkyne. Preferably, the reaction of azides with alkynes includes: azide-DBCO click chemistry reaction, azide-OCT click chemistry reaction, azide-DIBO click chemistry reaction, azide-BARAC click chemistry reaction, azide-ALO click chemistry reaction, azide-DIFO click chemistry reaction, azide-MOFO click chemistry reaction, azide-DIBAC click chemistry reaction, azide-DIMAC click chemistry reaction, or azide-cyclooctene click chemistry reaction.
7. The complex according to claim 6, characterized in that, The double strand is: a first nucleic acid - an int DBCO-modified deoxyribonucleotide - a chaperone molecule, a DBCO-modified deoxyribonucleotide - a second nucleic acid, and a hybrid with the template molecule; or First nucleic acid - DBCO-modified deoxyribonucleotide, chaperone molecule - int DBCO-modified deoxyribonucleotide - second nucleic acid, hybrid with template molecule. Optionally, the length of the first nucleic acid is ≥1 nt, the length of the second nucleic acid is ≥2 nt, and the length of the template molecule is ≥1 nt; Preferably, the length of the first nucleic acid is 5-500 nt; the length of the second nucleic acid is 5-500 nt; and the length of the template molecule is 5-1200 nt.
8. The complex according to claim 1, characterized in that, The first nucleic acid has a phosphorylation group at its 5' end, and the template molecule has a protruding mononucleotide at its 3' end.
9. The complex according to claim 1, characterized in that the complex further includes a linker that is attached to the 5' end of the first nucleic acid and / or the 3' end of the template molecule; preferably, the linker is a double-stranded linker, and more preferably a Y-type linker.
10. The complex according to claim 9, characterized in that, The complex includes a first ligation product, a second ligation product, and / or a third ligation product, wherein the polypeptide molecule in the first ligation product is linked to both the 3' end of the first nucleic acid and the 5' end of the second nucleic acid, the polypeptide molecule in the second ligation product is linked only to the 3' end of the first nucleic acid, and the polypeptide molecule in the third ligation product is linked only to the 5' end of the second nucleic acid.
11. A method for preparing a nucleic acid-peptide complex containing a chaperone molecule as described in any one of claims 1 to 10, characterized in that, The method includes: A template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid are provided to obtain a first mixture, wherein the chaperone molecule is attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid; The first mixture is annealed to allow the first and second nucleic acids in the first mixture to complementary pair with the template molecule, thereby obtaining a second mixture containing a double strand; and Add polypeptide molecules to the second mixture, so that the polypeptide molecules are linked to the 3' end of the first nucleic acid and / or the 5' end of the second nucleic acid through a coupling reaction, thereby obtaining the nucleic acid-polypeptide complex containing the chaperone molecule.
12. The method according to claim 11, characterized in that, The molar ratio of the template molecule, the first nucleic acid, the second nucleic acid, and the polypeptide molecule in the first mixture is 1:1:1:1. Optionally, the annealing treatment includes heating the first mixture to 65°C and then cooling it to 25°C at a rate of 0.1°C / s. Optionally, the coupling reaction is carried out at room temperature for 12-72 hours.
13. The method according to claim 11, characterized in that the method further includes: providing a linker and contacting it with the nucleic acid-peptide complex containing the chaperone molecule, such that the linker is attached to the 5' end of the first nucleic acid in the complex and / or the 3' end of the template molecule, thereby obtaining a nucleic acid-peptide complex containing the linker and the chaperone molecule. Preferably, the linker is a double-stranded linker, more preferably a Y-type linker, and most preferably a Y-type linker containing motor proteins.
14. A method for purifying nucleic acid-peptide complexes containing chaperone molecules, characterized in that, The method includes: Step a. Provide a mixture containing the linkage product. The ligation product is the nucleic acid-peptide complex containing a chaperone molecule according to claim 10 or the nucleic acid-peptide complex containing a chaperone molecule prepared by the preparation method according to claim 13. Step b. Capture the first ligation product, the second ligation product, and / or the third ligation product in the mixture by at least one of the vector and the vector complex, wherein the vector complex comprises a third nucleic acid and the vector; Step c. Using a first reagent or first physical conditions, cleave the first site of the template molecule in the first linker, second linker, and / or third linker captured on at least one of the carrier and the carrier complex to obtain a first purified product; and Step d. The second site in the first purified product is cleaved using a second reagent or a second physical condition to obtain a second purified product, wherein at least one of the third nucleic acid, the second nucleic acid, and the second segment of the template molecule has the second site, and the first reagent is different from the second reagent, and the first physical condition is different from the second physical condition.
15. The purification method according to claim 14, characterized in that, in step c, after cleavage, the template molecule is separated into a first portion consisting of the sequence from the 5' end of the template molecule to the first site, and a second portion consisting of the sequence from the first site to the 3' end of the template molecule. For the first ligation product, the first ligation product, in its complete form comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, and the first and second portions of the template molecule, is linked to at least one of the carrier and the carrier complex as the first purified product. For the second ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the second ligation product is in a form comprising the second nucleic acid and the second portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the second ligation product is in a form comprising the second nucleic acid, the chaperone molecule, and the second portion of the template molecule, and is linked to at least one of the carrier and the carrier complex as the first purified product. For the third ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the third ligation product is in a form comprising the second nucleic acid, the polypeptide molecule, and the second portion of the template molecule, or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the third ligation product is in a form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the second portion of the template molecule, and is linked to at least one of the carrier and the carrier complex as the first purified product.
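The step c outcomes listed in claim 15 can be summarized as a small bookkeeping rule. The Python sketch below is only an illustration under stated assumptions: it encodes which components the claim says stay on the carrier for each ligation product and each chaperone attachment position, and it does not model the underlying chemistry. The names `ChaperoneSite` and `retained_after_step_c` and the component labels are hypothetical.

```python
from enum import Enum


class ChaperoneSite(Enum):
    FIRST_NA_3P = "3' end of the first nucleic acid"
    SECOND_NA_5P = "5' end of the second nucleic acid"


# Component labels are illustrative shorthand for the terms used in claim 15.
UPSTREAM = {"first nucleic acid", "adapter", "template (first portion)"}
DOWNSTREAM = {"second nucleic acid", "template (second portion)"}


def retained_after_step_c(product: str, site: ChaperoneSite) -> frozenset:
    """Components still on the carrier after first-site cleavage, per claim 15."""
    if product not in ("first", "second", "third"):
        raise ValueError("product must be 'first', 'second', or 'third'")
    retained = set(DOWNSTREAM)
    if site is ChaperoneSite.SECOND_NA_5P:
        # A chaperone on the second nucleic acid stays with the captured part.
        retained.add("chaperone molecule")
    if product in ("first", "third"):
        retained.add("polypeptide molecule")
    if product == "first":
        # Only the first ligation product is retained in its complete form.
        retained |= UPSTREAM
        retained.add("chaperone molecule")
    return frozenset(retained)


for name in ("first", "second", "third"):
    for site in ChaperoneSite:
        print(name, "|", site.value, "|", sorted(retained_after_step_c(name, site)))
```

Under this reading, only the first ligation product still carries the adapter and the first nucleic acid after step c.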
16. The purification method according to claim 14 or 15, characterized in that, in step d, after cleavage: 1) where the second site is located in the third nucleic acid, the third nucleic acid is divided into a third portion consisting of the sequence from the 5' end of the third nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to the 3' end of the third nucleic acid. The first ligation product serves as the second purified product in its complete form, comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, the first and second portions of the template molecule, and the fourth portion of the third nucleic acid. For the second ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the second ligation product is in a form comprising the second nucleic acid, the second portion of the template molecule, and the fourth portion of the third nucleic acid; or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the second ligation product is in a form comprising the second nucleic acid, the chaperone molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid, as the second purified product. For the third ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the third ligation product is in a form comprising the second nucleic acid, the polypeptide molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid; or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the third ligation product is in a form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, the second portion of the template molecule, and the fourth portion of the third nucleic acid, as the second purified product; or
2) where the second site is located in the second nucleic acid, the second nucleic acid is divided into a third portion consisting of the sequence from the 5' end of the second nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex. The first ligation product serves as the second purified product in its complete form, comprising the first nucleic acid, the third portion of the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, and the first and second portions of the template molecule. For the second ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the second ligation product is in a form comprising the third portion of the second nucleic acid and the second portion of the template molecule; or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the second ligation product is in a form comprising the third portion of the second nucleic acid, the chaperone molecule, and the second portion of the template molecule, as the second purified product. For the third ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the third ligation product is in a form comprising the third portion of the second nucleic acid, the polypeptide molecule, and the second portion of the template molecule; or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the third ligation product is in a form comprising the third portion of the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the second portion of the template molecule, as the second purified product; or
3) where the second site is located in the second segment of the template molecule, the second portion of the template molecule is divided into a third portion consisting of the sequence from the 3' end of the second portion to the second site, and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex. The first ligation product serves as the second purified product in its complete form, comprising the first nucleic acid, the second nucleic acid, the chaperone molecule, the polypeptide molecule, the adapter, and the first and third portions of the template molecule. For the second ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the second ligation product is in a form comprising the second nucleic acid and the third portion of the template molecule; or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the second ligation product is in a form comprising the second nucleic acid, the chaperone molecule, and the third portion of the template molecule, as the second purified product. For the third ligation product, i) where the chaperone molecule is linked to the 3' end of the first nucleic acid, the third ligation product is in a form comprising the second nucleic acid, the polypeptide molecule, and the third portion of the template molecule; or ii) where the chaperone molecule is linked to the 5' end of the second nucleic acid, the third ligation product is in a form comprising the second nucleic acid, the polypeptide molecule, the chaperone molecule, and the third portion of the template molecule, as the second purified product.
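The three alternatives of claim 16 differ only in which component is split by the second-site cleavage. The Python sketch below illustrates that bookkeeping under stated assumptions: it takes the component set of a first purified product and returns the set released from the carrier in step d. The function name `release_in_step_d`, the location strings, and the component labels are hypothetical shorthand, not terms defined by the claims.

```python
def release_in_step_d(components: frozenset, location: str) -> frozenset:
    """Return the released (second purified) component set for one second-site location."""
    required = {"second nucleic acid", "template (second portion)"}
    if not required <= components:
        raise ValueError("a first purified product must contain " + ", ".join(sorted(required)))
    released = set(components)
    if location == "third nucleic acid":
        # Alternative 1): the fourth portion of the third nucleic acid stays with the product.
        released.add("third nucleic acid (fourth portion)")
    elif location == "second nucleic acid":
        # Alternative 2): only the third portion of the second nucleic acid is released.
        released.discard("second nucleic acid")
        released.add("second nucleic acid (third portion)")
    elif location == "template second segment":
        # Alternative 3): the template's second portion is shortened to its third portion.
        released.discard("template (second portion)")
        released.add("template (third portion)")
    else:
        raise ValueError("unknown second-site location: " + location)
    return frozenset(released)


# Third ligation product after step c, chaperone on the 3' end of the first nucleic acid.
third_product = frozenset({"second nucleic acid", "polypeptide molecule", "template (second portion)"})
for loc in ("third nucleic acid", "second nucleic acid", "template second segment"):
    print(loc, "->", sorted(release_in_step_d(third_product, loc)))
```

The same function applies to the first and second ligation products, since the claim describes the same split for each of them.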
17. The purification method according to claim 14, characterized in that the mixture containing ligation products further includes a duplex that is not ligated to a polypeptide molecule (the unligated duplex). The unligated duplex comprises an adapter, a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid. The adapter is attached to the 5' end of the first nucleic acid and / or the 3' end of the template molecule. At least a portion of the first nucleic acid is complementary to a first segment of the template molecule, and at least a portion of the second nucleic acid is complementary to a second segment of the template molecule. The chaperone molecule is attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid. The unligated duplex does not contain a polypeptide molecule. Optionally, the method further includes: in step b, capturing the unligated duplex in the mixture by at least one of the carrier and the carrier complex; in step c, cleaving the first site in the unligated duplex captured on the carrier or the carrier complex using the first reagent or the first physical condition, whereby, after cleavage, the template molecule is separated into a first portion consisting of the sequence from the 5' end of the template molecule to the first site and a second portion consisting of the sequence from the first site to the 3' end of the template molecule, and the unligated duplex, in a form comprising the second nucleic acid and the second portion of the template molecule, is linked to at least one of the carrier and the carrier complex as the first purified product; and in step d, cleaving the second site in the first purified product using the second reagent or the second physical condition to obtain a second purified product.
After cleavage in step d: 1) where the second site is located in the third nucleic acid, the third nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the third nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to the 3' end of the third nucleic acid, and the unligated duplex serves as the second purified product in a form comprising the second nucleic acid, the second portion of the template molecule, and the fourth portion of the third nucleic acid; or 2) where the second site is located in the second nucleic acid, the second nucleic acid is separated into a third portion consisting of the sequence from the 5' end of the second nucleic acid to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex, and the unligated duplex serves as the second purified product in a form comprising the third portion of the second nucleic acid and the second portion of the template molecule; or 3) where the second site is located in the second segment of the template molecule, the second portion of the template molecule is separated into a third portion consisting of the sequence from the 3' end of the second portion to the second site and a fourth portion consisting of the sequence from the second site to at least one of the carrier and the carrier complex, and the unligated duplex serves as the second purified product in a form comprising the second nucleic acid and the third portion of the template molecule.
18. The purification method according to any one of claims 14 to 17, characterized in that the first reagent or the first physical condition can specifically cleave the first site, and the second reagent or the second physical condition can specifically cleave the second site; Preferably, the first site and the second site are each independently selected from at least one of an AP site, a nuclease recognition site, and a cleavable chemical bond, and the first site is different from the second site; Preferably, the nuclease recognition site is an inosine nucleotide, a ribonucleotide, a deoxyuridine nucleotide, or a restriction endonuclease recognition site; Preferably, the cleavable chemical bond includes at least one of a disulfide bond, a diselenide bond, an imine bond, an acylhydrazone bond, an ester bond, and a borate ester bond; Preferably, the first reagent and the second reagent each independently include at least one of apurinic/apyrimidinic endonuclease 1 (APE1), RNase, endonuclease, USER enzyme, or restriction endonuclease, and the first reagent is different from the second reagent; Preferably, the first physical condition and the second physical condition each independently include at least one of light, heat, or electromagnetic radiation, and the first physical condition is different from the second physical condition.
19. The purification method according to claim 14, characterized in that, where the first ligation product, the second ligation product, and / or the third ligation product are captured via the carrier complex, the carrier surface in the carrier complex has a first modification, the third nucleic acid has a second modification at its end, and the carrier and the third nucleic acid form the carrier complex through a reaction between the first modification and the second modification. At least a portion of the third nucleic acid is complementary to a portion of the second nucleic acid; or at least a portion of the third nucleic acid is complementary to a portion of the second segment of the template molecule; Optionally, the third nucleic acid has the second site; Optionally, the first modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen, and the second modification is a group capable of reacting with the first modification; Optionally, the second modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen, and the first modification is a group capable of reacting with the second modification; Preferably, the carrier and the third nucleic acid form the carrier complex through the interaction of biotin and streptavidin; Optionally, the carrier is at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles; Preferably, the carrier is a magnetic bead.
20. The purification method according to claim 14, characterized in that, where the first ligation product, the second ligation product, and / or the third ligation product are captured via the carrier, the surface of the carrier has a first modification, and the end of the second nucleic acid has a third modification or the end of the second segment of the template molecule has a fourth modification; the carrier and the second nucleic acid are linked through a reaction between the first modification and the third modification, and the second nucleic acid or the second segment of the template molecule has the second site, or the carrier and the second segment of the template molecule are linked through a reaction between the first modification and the fourth modification, and the second segment of the template molecule or the second nucleic acid has the second site. Optionally, the first modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen, and the third modification is a group capable of reacting with the first modification; Optionally, the third modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen, and the first modification is a group capable of reacting with the third modification; Optionally, the first modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen, and the fourth modification is a group capable of reacting with the first modification; Optionally, the fourth modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen, and the first modification is a group capable of reacting with the fourth modification; Preferably, the carrier is linked to the second nucleic acid via biotin and streptavidin; Preferably, the carrier is linked to the second segment of the template molecule via biotin and streptavidin; Optionally, the carrier is at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles; Preferably, the carrier is a magnetic bead.
21. A polypeptide library, characterized in that the polypeptide library is obtained by the purification method according to any one of claims 14 to 20.
22. A kit for preparing nucleic acid-peptide complexes containing chaperone molecules, characterized in that the kit includes a duplex, the duplex being used for ligation to the polypeptide molecule to be tested. The duplex comprises a template molecule, a chaperone molecule, a first nucleic acid, and a second nucleic acid. At least a portion of the first nucleic acid is complementary to a first segment of the template molecule, and at least a portion of the second nucleic acid is complementary to a second segment of the template molecule. The chaperone molecule is attached to the 3' end of the first nucleic acid or the 5' end of the second nucleic acid. The template molecule has a first site.
23. A kit for purifying nucleic acid-peptide complexes containing chaperone molecules, characterized in that the kit includes a first reagent, a second reagent, and at least one of a carrier and a carrier complex. At least one of the carrier and the carrier complex is used to capture the nucleic acid-peptide complex containing the chaperone molecule. The first reagent is used to cleave the first site of the nucleic acid-peptide complex containing the chaperone molecule. The second reagent is used to cleave the second site in the nucleic acid-peptide complex containing the chaperone molecule or the second site in the carrier complex. The first reagent is different from the second reagent; Preferably, the first reagent and the second reagent are each independently selected from at least one of apurinic/apyrimidinic endonuclease 1 (APE1), RNase, endonuclease, USER enzyme, or restriction endonuclease; Preferably, the carrier is selected from at least one of the following: magnetic beads, magnetic sheets, gold nanoparticles, silicon particles, and gel particles; Preferably, the surface of the carrier has a first modification, and the first modification is selected from one of amino, carboxyl, hydroxyl, carbonyl, azide, alkyne, biotin, streptavidin, amylose, Ni-NTA, peptide and antigen; Preferably, the carrier complex comprises a third nucleic acid and the carrier, and the third nucleic acid has the second site; Preferably, the carrier surface in the carrier complex has a first modification, the end of the third nucleic acid has a second modification, and the carrier and the third nucleic acid form the carrier complex through a reaction between the first modification and the second modification.
24. The use of the nucleic acid-peptide complex containing a chaperone molecule according to any one of claims 1 to 10, the nucleic acid-peptide complex containing a chaperone molecule prepared by the method according to any one of claims 11 to 13, the nucleic acid-peptide complex containing a chaperone molecule purified by the method according to any one of claims 14 to 20, the peptide library according to claim 21, and / or the kit according to claim 22 or 23 in high-throughput sequencing.
25. A method for sequencing a polypeptide, characterized in that the method includes: adding the polypeptide library of claim 21 to a detection solution chamber; under the action of an electric field, passing the polypeptide library through a nanopore under the control of a motor protein, thereby obtaining an electrical signal corresponding to the polypeptide; and decoding the electrical signal to determine the amino acid information of the polypeptide. Optionally, the amino acid information of the polypeptide includes amino acid sequence information and amino acid modification information.