Method for preparing nucleic acid-polypeptide complex
By preparing nucleic acid-peptide complexes and constructing peptide libraries, the problem of difficult targeted detection in existing peptide sequencing has been solved, achieving high resolution and high accuracy of peptide library sequencing data, simplifying the peptide library construction process, and enabling accurate acquisition of peptide amino acid information.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SHENZHEN HUADA GENE INST
- Filing Date
- 2024-12-27
- Publication Date
- 2026-07-02
AI Technical Summary
Existing peptide sequencing methods cannot achieve targeted detection of natural peptides, resulting in complex signals and low resolution and accuracy in data processing.
By preparing nucleic acid-peptide complexes, a peptide library is constructed by directionally linking oligonucleotide fragments with peptide fragments using modifying groups, and then performing directional detection in nanopore sequencing.
It improves the resolution and accuracy of peptide library sequencing data, simplifies the peptide library construction process, and enables accurate acquisition of peptide amino acid information.
Abstract
Description
Methods for preparing nucleic acid-peptide complexes
Technical Field
[0001] This invention relates to the field of protein sequencing, and more specifically, to a method for preparing nucleic acid-peptide complexes. Further, it relates to a method for preparing peptide libraries, a kit for use therein, and a method for detecting peptides.
Background Technology
[0002] Polypeptides are an important class of molecules in living organisms, composed of amino acid sequences linked by peptide bonds. As intermediate products in protein degradation, peptides play a crucial role in human growth, development, immune regulation, and metabolism. Currently, Edman degradation, mass spectrometry, and nanopore sequencing are the main methods for peptide sequencing. However, existing peptide sequencing methods have relatively low accuracy.
[0003] Therefore, there is an urgent need to develop a new method for constructing peptide sequencing libraries to improve the accuracy of peptide sequencing.
Summary of the Invention
[0004] The present invention aims to at least partially address one of the technical problems existing in the prior art. To this end, the present invention provides a method for constructing a polypeptide library.
[0005] This invention is based on the following discoveries of the inventors:
[0006] Nanopore sequencing is currently the main method for peptide sequencing. However, current nanopore sequencing methods can only identify differences in single amino acids and cannot achieve directional sequencing of natural peptides. This results in complex peptide sequence readout signals and low resolution and accuracy in data processing. To overcome this problem, this invention provides a method for constructing peptide libraries that enables directional detection of peptides in nanopore sequencing, improving the resolution and accuracy of peptide library sequencing data.
[0007] In a first aspect, the present invention provides a method for preparing a nucleic acid-peptide complex. According to an embodiment of the invention, the method includes: providing a peptide fragment to be tested and an oligonucleotide fragment, wherein the oligonucleotide fragment includes a first oligonucleotide fragment and a second oligonucleotide fragment, the first oligonucleotide fragment having a first modifying group at its 3' end and the second oligonucleotide fragment having a second modifying group at its 5' end; directionally linking the peptide fragment to be tested with the first oligonucleotide fragment and the second oligonucleotide fragment to obtain the nucleic acid-peptide complex; wherein the peptide fragment to be tested has a modifying group N at its N-terminus and a modifying group C at its C-terminus, the modifying group N at the N-terminus of the peptide fragment to be tested is adapted to be linked with the first modifying group at the 3' end of the first oligonucleotide, and the modifying group C at the C-terminus of the peptide fragment to be tested is adapted to be linked with the second modifying group at the 5' end of the second oligonucleotide fragment. According to the method of the embodiment of the invention, the peptide and nucleic acid can be directionally linked, and the linking conversion rate and yield are high.
[0008] In a second aspect, the present invention provides a method for preparing a polypeptide library. According to an embodiment of the present invention, the method includes: preparing the nucleic acid-peptide complex according to the method described in the first aspect of the present invention; providing an adapter, and ligating the adapter to the nucleic acid-peptide complex to obtain the polypeptide library. According to an embodiment of the present invention, the polypeptide library construction process is simple, and the obtained polypeptide library can achieve directional detection in nanopores, especially in nanopore sequencing, which can improve the resolution and accuracy of polypeptide library sequencing data.
[0009] In a third aspect, the present invention provides a kit. According to embodiments of the invention, the kit comprises a first oligonucleotide fragment, a second oligonucleotide fragment, and a third oligonucleotide fragment, wherein the first oligonucleotide fragment has a first modifying group at its 3' end, the second oligonucleotide fragment has a second modifying group at its 5' end, and the third oligonucleotide fragment is capable of hybridizing with the first and second oligonucleotide fragments through complementary base pairing. According to embodiments of the invention, the kit can be used to prepare directionally conjugated peptide-nucleic acid complexes and libraries, with high conjugation conversion efficiency and high yield.
[0010] In a fourth aspect, the present invention proposes the use of the kit described in the third aspect of the present invention in high-throughput sequencing.
[0011] In a fifth aspect, the present invention provides a method for detecting peptides. According to an embodiment of the present invention, the method includes: preparing a peptide library using the method described in the second aspect of the present invention; and detecting the peptide library to obtain amino acid information of the target peptide. The method according to the embodiments of the present invention can accurately obtain amino acid information of the target peptide, such as amino acid sequence, amino acid side chain modifications, or the presence or absence of the target peptide. In particular, when used for nanopore sequencing, it can improve the resolution and accuracy of peptide library sequencing data.
[0012] Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Description of the Drawings
[0013] The above and / or additional aspects and advantages of the present invention will become apparent and readily understood from the description of the embodiments taken in conjunction with the following drawings, in which:
[0014] Figure 1 is a schematic diagram of the "one-pot" directional coupling method in Example 1.
[0015] Figure 2 is a schematic diagram of the "stepwise" directional coupling method in Example 2.
[0016] Figure 3 is a denaturing gel image of the "nucleic acid-peptide-nucleic acid" directed coupling reaction formed in Example 1.
[0017] Figure 4 is a mass spectrum of the "nucleic acid-peptide-nucleic acid" directional coupling product formed by the "one-pot" directional coupling reaction in Example 1.
[0018] Figure 5 shows the nanopore signal of the “nucleic acid-peptide-nucleic acid” directional coupling products formed by the two polypeptides and nucleic acids in Example 1.
[0019] Figure 6 is a denaturing gel image of the "nucleic acid-peptide-nucleic acid" directional conjugate formed by the "stepwise" directional coupling reaction in Example 2.
[0020] Figure 7 is the mass spectrum of the "nucleic acid-peptide-nucleic acid" directed conjugate formed by the "stepwise" directed coupling reaction in Example 2.
[0021] Figure 8 shows the nanopore signal diagrams of the “nucleic acid-peptide-nucleic acid” directional coupling products formed by three polypeptides with different sequences and nucleic acids in Example 2.
[0022] Figure 9 is a schematic diagram of the products and raw materials formed after the reaction in the non-directional coupling method, the "one-pot" directional coupling method, and the "step-by-step" directional coupling method.
[0023] Figure 10 is a bar chart for peptide sample P2 in Example 2, comparing the OPO products formed by "stepwise" directional coupling with those formed by comparative non-directional coupling. For each, it shows the total number of sequencing reads (read counts) and the number of reads of the nucleic acid-peptide library linked at both ends (OPO counts) obtained in signal extraction.
Detailed Description
[0024] The embodiments of the present invention are described in detail below. The embodiments described below are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.
[0025] It should be noted that the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. Furthermore, in the description of this invention, unless otherwise stated, "a plurality of" means two or more.
[0026] To facilitate understanding of this invention, certain technical and scientific terms are specifically defined below. Unless otherwise expressly defined elsewhere in this invention, all other technical and scientific terms used herein have the meanings commonly understood by one of ordinary skill in the art to which this invention pertains.
[0027] In this invention, the terms "comprising" or "including" are open-ended expressions, meaning they include the contents specified in this invention but do not exclude other aspects.
[0028] In this invention, the terms "optional" or "optionally" generally mean that the event or condition described subsequently may or may not occur, and the description includes both cases in which the event or condition occurs and cases in which it does not.
[0029] In this invention, the minimum and maximum numbers of carbon atoms in a hydrocarbon group are indicated by a prefix. For example, the prefix Ca~b indicates the presence of "a" to "b" carbon atoms. Exemplarily, "C0~n" refers to a saturated or unsaturated carbon chain, straight or branched, containing 0, 1, 2, 3, 4, 5, ..., or n carbon atoms; further, "C0~n" should be understood to include any subrange, for example C0~6 includes C0~6, C0~3, C0~2, C2~6, C2~5, C2~4, C2~3, C3~6, C3~5, C3~4, C4~6, and C4~5. "C1~n" refers to a saturated or unsaturated carbon chain, straight or branched, containing 1, 2, 3, 4, 5, ..., or n carbon atoms; further, "C1~n" should be understood to include any subrange, for example C1~6 includes C1~6, C1~3, C1~2, C2~6, C2~5, C2~4, C2~3, C3~6, C3~5, C3~4, C4~6, and C4~5. The term "C1-C10 alkyl" should be understood to mean a straight-chain or branched saturated monovalent hydrocarbon group having 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 carbon atoms. The alkyl group is, for example, methyl, ethyl, propyl, butyl, pentyl, hexyl, isopropyl, isobutyl, sec-butyl, tert-butyl, isopentyl, 2-methylbutyl, 1-methylbutyl, 1-ethylpropyl, 1,2-dimethylpropyl, neopentyl, 1,1-dimethylpropyl, 4-methylpentyl, 3-methylpentyl, 2-methylpentyl, 1-methylpentyl, 2-ethylbutyl, 1-ethylbutyl, 3,3-dimethylbutyl, 2,2-dimethylbutyl, 1,1-dimethylbutyl, 2,3-dimethylbutyl, 1,3-dimethylbutyl, or 1,2-dimethylbutyl, etc.
[0030] In the chemical structures of the ligands or compounds described in this disclosure, the bond [symbol not reproduced] indicates that the configuration is not specified; that is, if chiral isomers exist in the chemical structure, the bond may be [symbol not reproduced] or [symbol not reproduced], or may include both configurations simultaneously. Although all the above structural formulas are shown in certain isomeric forms for simplicity, this disclosure can include all isomers, such as tautomers, rotational isomers, geometric isomers, diastereomers, racemates, and enantiomers.
[0031] This invention proposes methods for preparing nucleic acid-peptide complexes, methods for preparing peptide libraries, kits, applications, and methods for detecting peptides, which will be described in detail below.
[0032] Methods for preparing nucleic acid-peptide complexes
[0033] In a first aspect, the present invention provides a method for preparing a nucleic acid-peptide complex. According to an embodiment of the invention, the method includes: providing a peptide fragment to be tested and an oligonucleotide fragment, wherein the oligonucleotide fragment includes a first oligonucleotide fragment and a second oligonucleotide fragment, the first oligonucleotide fragment having a first modifying group at its 3' end and the second oligonucleotide fragment having a second modifying group at its 5' end; directionally linking the peptide fragment to be tested with the first oligonucleotide fragment and the second oligonucleotide fragment to obtain the nucleic acid-peptide complex; wherein the peptide fragment to be tested has a modifying group N at its N-terminus and a modifying group C at its C-terminus, the modifying group N at the N-terminus of the peptide fragment to be tested is adapted to be linked with the first modifying group at the 3' end of the first oligonucleotide, and the modifying group C at the C-terminus of the peptide fragment to be tested is adapted to be linked with the second modifying group at the 5' end of the second oligonucleotide fragment. According to the method of the embodiment of the invention, the peptide and nucleic acid can be directionally linked, and the linking conversion rate and yield are high.
[0034] According to an embodiment of the present invention, the polypeptide fragment to be tested is obtained after prior enzymatic digestion.
[0035] According to an embodiment of the present invention, the directional ligation includes a first directional ligation and a second directional ligation. The first directional ligation includes linking the N-terminal modifying group N of the polypeptide fragment to the first modifying group at the 3' end of the first oligonucleotide fragment. The second directional ligation includes linking the C-terminal modifying group C of the polypeptide fragment to the second modifying group at the 5' end of the second oligonucleotide fragment. The first directional ligation and the second directional ligation are performed simultaneously or stepwise. After directional ligation, the obtained nucleic acid-peptide complex structure includes: 5'-first oligonucleotide fragment-3'-N-terminus-peptide-C-terminus-5'-second oligonucleotide fragment-3'.
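The layout of the complex described in the preceding paragraph can be written down as a small data structure. The following is a minimal sketch for illustration only: the class name, field names, separator characters, and example sequences are assumptions and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NucleicAcidPeptideComplex:
    """Linear layout: 5'-oligo1-3' ~ N-peptide-C ~ 5'-oligo2-3'."""

    oligo1: str   # first oligonucleotide, written 5'->3'; its 3' end is linked to the peptide N-terminus
    peptide: str  # peptide to be tested, written N->C
    oligo2: str   # second oligonucleotide, written 5'->3'; its 5' end is linked to the peptide C-terminus

    def layout(self) -> str:
        # "~" marks the two directional linkages (hypothetical notation).
        return f"5'-{self.oligo1}-3'~N-{self.peptide}-C~5'-{self.oligo2}-3'"


# Hypothetical sequences, for illustration only.
example = NucleicAcidPeptideComplex(oligo1="ACGT", peptide="GASK", oligo2="TTGCA")
print(example.layout())
```

Because the two linkages use different chemistries at the two peptide termini, only this one orientation is represented; a non-directional coupling would also need to model the reversed arrangement.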
[0036] According to an embodiment of the present invention, when the first directional ligation and the second directional ligation are performed simultaneously, the process is also called "one-pot" directional coupling, as shown in Figure 1.
[0037] According to an embodiment of the present invention, when the first directional ligation and the second directional ligation are performed stepwise, the process is also called "stepwise" directional coupling, as shown in Figure 2.
[0038] According to embodiments of the present invention, the first directional linking is carried out via SuFEx reaction, nucleophilic ring-opening reaction, reductive amination reaction, alkylation substitution reaction or amide reaction.
[0039] According to embodiments of the present invention, the SuFEx reaction is exemplarily shown below:
[0040] R1-SO2F + R2-NH2 → R1-SO2-NH-R2 + HF. It should be noted that R1 is the first oligonucleotide fragment after removal of the sulfonyl fluoride group, and R2 is the polypeptide fragment to be tested, bearing the C-terminal modifying group, after removal of the N-terminal NH2.
[0041] According to embodiments of the present invention, the nucleophilic ring-opening reaction is exemplarily shown below:
[0042] It should be noted that R1 is the first oligonucleotide fragment after removal of the epoxy group or aziridine group; R2 is the polypeptide fragment to be tested, bearing the C-terminal modifying group, after removal of the N-terminal NH2; and R is a substituent that can be selected from H, alkyl, alkyl acid, alkoxycarbonyl, silyl, and Lewis acids such as zinc chloride, boron fluoride, and aluminum chloride.
[0043] According to embodiments of the present invention, the reductive amination reaction is exemplarily shown below:
[0044] R1-CHO + R2-NH2 → R1-CH2-NH-R2.
[0045] It should be noted that R1 is the first oligonucleotide fragment after removal of the aldehyde group, and R2 is the polypeptide fragment to be tested, bearing the C-terminal modifying group, after removal of the N-terminal NH2.
[0046] According to embodiments of the present invention, alkylation substitution reactions are exemplarily shown below:
[0047] R1-X (X = Br, I, Cl) + R2-NH2 → R1-NH-R2.
[0048] It should be noted that R1 is the first oligonucleotide fragment after removal of the haloalkyl group, and R2 is the polypeptide fragment to be tested, bearing the C-terminal modifying group, after removal of the N-terminal NH2.
[0049] According to embodiments of the present invention, the amide reaction is exemplarily illustrated below:
[0050] (1) R1-COOH + R2-NH2 → R1-CONH-R2
[0051] (2)
[0052] It should be noted that R1 is the first oligonucleotide fragment after removal of the carboxyl or enone group, and R2 is the polypeptide fragment to be tested, bearing the C-terminal modifying group, after removal of the N-terminal NH2.
[0053] According to an embodiment of the present invention, the first modifying group is at least one of a sulfonyl fluoride group, an epoxy group, an aziridine group, an aldehyde group, a haloalkyl group, a carboxyl group, and an enone group.
[0054] According to an embodiment of the present invention, structural formulas are given for the sulfonyl fluoride group, the epoxy group, the aziridine group, the aldehyde group, the haloalkyl group (where X = Br, I, or Cl), the carboxyl group, and the enone group. [Structural formulas not reproduced.]
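Since the coupling products are checked by mass spectrometry (Figures 4 and 7), the net equations above can be turned into an expected mass for the conjugate. The sketch below is an illustration under stated assumptions: the function and dictionary names are hypothetical, the reactant masses in the usage lines are invented, and the water by-product of the amide reaction is assumed because the equation in the text does not list it.

```python
# Monoisotopic atomic masses in daltons.
H, O, F = 1.007825, 15.994915, 18.998403

# Net mass change of each coupling, relative to the summed masses of the
# two reactants (R1-bearing oligonucleotide and R2-bearing peptide).
MASS_SHIFT = {
    # R1-SO2F + R2-NH2 -> R1-SO2-NH-R2 + HF
    "SuFEx": -(H + F),
    # R1-CHO + R2-NH2 -> R1-CH2-NH-R2 (net loss of one oxygen atom)
    "reductive_amination": -O,
    # R1-COOH + R2-NH2 -> R1-CONH-R2 (loss of H2O is an assumption)
    "amide": -(2 * H + O),
}


def expected_conjugate_mass(oligo_mass: float, peptide_mass: float, reaction: str) -> float:
    """Return the expected monoisotopic mass of the oligonucleotide-peptide conjugate."""
    return oligo_mass + peptide_mass + MASS_SHIFT[reaction]


# Invented reactant masses, for illustration only.
print(round(expected_conjugate_mass(6000.0, 1000.0, "SuFEx"), 4))
```

The calculation ignores charge state, counter-ions, and isotope distribution, all of which matter when comparing with a real spectrum.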
[0055] According to embodiments of the present invention, the second directional ligation is carried out via a Staudinger ligation reaction, CuAAC reaction, SPAAC reaction, tetrazolium ligation reaction, IEDDA reaction, oxime ligation reaction, thiol-ene reaction, or Diels-Alder diene cycloaddition reaction.
[0056] According to an embodiment of the present invention, the Staudinger ligation reaction is exemplarily shown below:
[0057] (1) R3-PH2 + R4-N3 → R3-P(N3)-R4 → R3-P(NH)-R4 + N2;
[0058] (2)
[0059] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the azide group, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the phosphorus group.
[0060] According to embodiments of the present invention, the CuAAC reaction is exemplarily illustrated below:
[0061] (1) R4-N3 + R3-C≡CH + Cu(I) → 1,4-substituted triazole;
[0062] (2)
[0063] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the azide group, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the alkyne group.
[0064] According to embodiments of the present invention, the SPAAC reaction is exemplarily illustrated as follows:
[0065] (1) R4-N3 + cyclooctyne → 1,2,3-triazole;
[0066] (2)
[0067] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the azide group, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the cycloalkyne group.
[0068] According to embodiments of the present invention, the tetrazolium ligation reaction is exemplarily illustrated below:
[0069] (1) R4-C2N4 + R3-C=C → diazacyclohexene
[0070] (2)
[0071] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the tetrazolium group, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the cycloalkenyl group; and R and R' are H, hydrocarbons, or hydrocarbon derivatives.
[0072] According to embodiments of the present invention, the IEDDA reaction is exemplarily shown below:
[0073] (1) Tetrazine + trans-cyclooctene (TCO) → pyridazine
[0074] (2)
[0075] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the tetrazine group, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the trans-cyclooctene.
[0076] According to embodiments of the present invention, the oxime ligation reaction is exemplarily illustrated as follows:
[0077] (1) R4-CO-R + R3-O-NH2 → R4-C(R)=N-O-R3 + H2O
[0078] (2)
[0079] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the ketone or aldehyde group, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the aminooxy group; and R is a substituent that can be selected from H, alkyl, or alkyl derivatives.
[0080] According to embodiments of the present invention, the thiol-ene reaction is exemplarily illustrated below:
[0081] (1) R4-SH + R3-CH=CH2 → R4-S-CH2-CH2-R3
[0082] (2)
[0083] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the thiol group, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the alkenyl group.
[0084] According to embodiments of the present invention, the Diels-Alder diene cycloaddition reaction is exemplarily illustrated below:
[0085] It should be noted that R4 is the polypeptide fragment to be tested, bearing the N-terminal modifying group N, after removal of the cyclopentadiene, wherein the modifying group N is -NH2; R3 is the second oligonucleotide fragment after removal of the alkenyl group.
[0086] According to embodiments of the present invention, the modifying group C and the second modifying group are selected from any one of the following pairs: (a) azido and phosphorus groups; (b) azido and alkyne groups; (c) tetrazolium and cycloalkenyl groups; (d) tetrazolium and 1,2,4-triazine groups; (e) ketone and aminooxy groups; (f) aldehyde and aminooxy groups; (g) thiol and alkenyl groups; (h) cyclopentadiene and bicyclo[6.1.0]nonyne.
[0087] According to an embodiment of the present invention, structural formulas may be given for the following modifying groups C: the azide group, the tetrazolium group, the ketone group, the aldehyde group, the thiol group, and cyclopentadiene. [Structural formulas not reproduced.]
[0088] According to an embodiment of the present invention, structural formulas may be given for the following second modifying groups: the phosphorus group, the alkynyl group, the cycloalkenyl group, the 1,2,4-triazine group, the aminooxy group, the alkenyl group, and bicyclo[6.1.0]nonyne. [Structural formulas not reproduced.]
[0089] According to embodiments of the present invention, when the modifying group C is an azide group, the second modifying group is a phosphorus group or an alkyne group; when the modifying group C is a tetrazolium group, the second modifying group is a cycloalkenyl group or a 1,2,4-triazine group; when the modifying group C is a ketone group or an aldehyde group, the second modifying group is an aminooxy group; when the modifying group C is a thiol group, the second modifying group is an alkenyl group; or when the modifying group C is cyclopentadiene, the second modifying group is a bicyclo[6.1.0]nonyne.
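The pairing rules above amount to a lookup table from the modifying group C on the peptide to the second modifying groups it can react with. The following minimal sketch encodes that table; the dictionary and function names are hypothetical, and the group labels are simply the names used in the text.

```python
# Modifying group C (on the peptide C-terminus) -> allowed second modifying
# groups (on the 5' end of the second oligonucleotide), as listed in the text.
COMPATIBLE_SECOND_GROUPS = {
    "azide": {"phosphorus", "alkyne"},
    "tetrazolium": {"cycloalkenyl", "1,2,4-triazine"},
    "ketone": {"aminooxy"},
    "aldehyde": {"aminooxy"},
    "thiol": {"alkenyl"},
    "cyclopentadiene": {"bicyclo[6.1.0]nonyne"},
}


def is_compatible(group_c: str, second_group: str) -> bool:
    """Return True if the two groups form one of the listed pairs."""
    return second_group in COMPATIBLE_SECOND_GROUPS.get(group_c, set())


print(is_compatible("azide", "alkyne"))   # a listed pair
print(is_compatible("thiol", "alkyne"))   # not a listed pair
```

A check of this kind could be used to validate a kit configuration before ordering modified oligonucleotides, although the text itself does not describe any such software step.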
[0090] According to embodiments of the present invention, the enzyme used for the enzymatic digestion is a protease selected from at least one of the following: lysine C-terminal endoproteinase (Lys-C), glutamic acid C-terminal endoproteinase (Glu-C), and chymotrypsin.
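The three enzymes listed above have well-known cleavage preferences: Lys-C cuts on the C-terminal side of lysine, Glu-C on the C-terminal side of glutamic acid, and chymotrypsin mainly on the C-terminal side of phenylalanine, tryptophan, and tyrosine. The sketch below performs a simplified in-silico digestion under those rules. It is an illustration only: the function name and the example sequence are hypothetical, and secondary specificities, proline effects, and missed cleavages are deliberately ignored.

```python
import re

# Simplified rules: cleave after (C-terminal to) any of the listed residues.
CLEAVE_AFTER = {
    "Lys-C": "K",
    "Glu-C": "E",
    "chymotrypsin": "FWY",
}


def digest(protein: str, enzyme: str) -> list:
    """Split a one-letter amino acid sequence into peptide fragments."""
    residues = CLEAVE_AFTER[enzyme]
    # A zero-width lookbehind splits right after each target residue
    # (splitting on empty matches requires Python 3.7 or later).
    return [frag for frag in re.split(f"(?<=[{residues}])", protein) if frag]


# Hypothetical sequence, for illustration only.
print(digest("GAFKLWE", "Lys-C"))
print(digest("GAFKLWE", "chymotrypsin"))
```

Every fragment produced this way has a free N-terminal amine and a defined C-terminal residue, which is what makes end-specific modification of the two termini possible.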
[0091] According to an embodiment of the present invention, the method further includes: providing a third oligonucleotide fragment, wherein the third oligonucleotide fragment hybridizes with the first oligonucleotide fragment and the second oligonucleotide fragment through complementary base pairing to obtain the nucleic acid-peptide complex. The nucleic acid-peptide complex comprises a double-stranded nucleic acid-peptide-double-stranded nucleic acid complex, the structure of which is shown in Figure 1 or Figure 2.
[0092] According to embodiments of the present invention, the third oligonucleotide fragment is hybridized with the first and second oligonucleotide fragments at the same time as the polypeptide fragment to be tested is directionally ligated with the first and second oligonucleotide fragments. In this case, ligation of the polypeptide fragment to the first and second oligonucleotide fragments and hybridization with the third oligonucleotide fragment take place simultaneously; this is also known as "one-pot" directional coupling. Figure 1 illustrates the principle and steps of a "one-pot" directional coupling method.
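Hybridization through complementary base pairing can be illustrated with a simple sequence check: a third oligonucleotide can pair with another strand if it contains that strand's reverse complement. The sketch below is an idealized perfect-match test with hypothetical sequences and function names; real hybridization also depends on length, temperature, and buffer conditions, which are not modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_pair(third: str, other: str) -> bool:
    """True if `third` contains the full reverse complement of `other`."""
    return reverse_complement(other) in third


# Hypothetical sequences, for illustration only.
first_oligo = "ACGTAC"
second_oligo = "TTGCAG"
# One possible third strand that is complementary to both fragments.
third_oligo = reverse_complement(second_oligo) + reverse_complement(first_oligo)

print(third_oligo)
print(can_pair(third_oligo, first_oligo), can_pair(third_oligo, second_oligo))
```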
[0093] According to an embodiment of the present invention, the polypeptide fragment to be tested is first directionally ligated with the first oligonucleotide fragment and the second oligonucleotide fragment, and then the third oligonucleotide fragment is hybridized with the first oligonucleotide fragment and the second oligonucleotide fragment. In this process, the polypeptide fragment to be tested is first ligated with the first oligonucleotide fragment and the second oligonucleotide fragment, and then hybridized with the third oligonucleotide fragment; this is also known as "stepwise" directional coupling.
[0094] According to an embodiment of the present invention, the polypeptide fragment to be tested is first directionally ligated to the first oligonucleotide fragment to obtain a polypeptide-first oligonucleotide fragment ligation product. Then, the polypeptide-first oligonucleotide fragment ligation product is simultaneously directionally ligated to the second oligonucleotide fragment, and the third oligonucleotide fragment is hybridized to the first and second oligonucleotide fragments. In this case, the polypeptide fragment to be tested is first ligated to the first oligonucleotide fragment, then to the second oligonucleotide fragment, and finally hybridized to the third oligonucleotide fragment, which is another method of "stepwise" directional coupling. Figure 2 illustrates the principle and steps of a "stepwise" directional coupling method.
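The three coupling variants described above differ only in which operations are run together. As a compact summary, the sketch below writes each variant as an ordered list of stages, where the operations inside one stage are carried out simultaneously. The variable names and workflow labels are hypothetical.

```python
LIGATE_FIRST = "ligate peptide N-terminus to first oligonucleotide"
LIGATE_SECOND = "ligate peptide C-terminus to second oligonucleotide"
HYBRIDIZE_THIRD = "hybridize third oligonucleotide"

# Each workflow is an ordered list of stages; a stage is a set of
# operations performed at the same time.
WORKFLOWS = {
    "one-pot": [{LIGATE_FIRST, LIGATE_SECOND, HYBRIDIZE_THIRD}],
    "stepwise, both ligations first": [{LIGATE_FIRST, LIGATE_SECOND}, {HYBRIDIZE_THIRD}],
    "stepwise, first ligation alone": [{LIGATE_FIRST}, {LIGATE_SECOND, HYBRIDIZE_THIRD}],
}


def describe(name: str) -> list:
    """Return one human-readable line per stage of the named workflow."""
    return [
        f"stage {i}: " + " + ".join(sorted(stage))
        for i, stage in enumerate(WORKFLOWS[name], start=1)
    ]


for workflow in WORKFLOWS:
    print(workflow, describe(workflow))
```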
[0095] According to an embodiment of the present invention, the directional ligation further includes removing unligated peptide fragments to be tested.
[0096] According to an embodiment of the present invention, an unlinked polypeptide fragment to be tested is removed using a nucleic acid purification column.
[0097] Methods for preparing peptide libraries
[0098] In a second aspect, the present invention provides a method for preparing a polypeptide library. According to an embodiment of the present invention, the method includes: preparing the nucleic acid-peptide complex according to the method described in the first aspect of the present invention; providing an adapter, and ligating the adapter to the nucleic acid-peptide complex to obtain the polypeptide library. According to an embodiment of the present invention, the polypeptide library construction process is simple, and the obtained polypeptide library can achieve directional detection in nanopores, especially in nanopore sequencing, which can improve the resolution and accuracy of polypeptide library sequencing data.
[0099] According to an embodiment of the present invention, the adapter is linked to a first oligonucleotide fragment in the nucleic acid-peptide complex.
[0100] According to embodiments of the present invention, the adapter is a Y-shaped adapter, which is ligated to the first and third oligonucleotide fragments in the nucleic acid-peptide complex. The ligation can be achieved in various ways. For example, the 5' end of the first oligonucleotide fragment may carry a phosphate modification, the 3' end of the third oligonucleotide fragment may carry a single-nucleotide A overhang, and the Y-shaped adapter may carry a single-nucleotide T overhang, so that TA ligation can be performed under the catalysis of a ligase.
[0101] According to an embodiment of the present invention, the adapter includes a molecular tag.
[0102] According to an embodiment of the present invention, there are multiple polypeptide fragments to be tested, and different polypeptide fragments to be tested are ligated to adapters carrying different molecular tags, so that a polypeptide library is obtained in which each polypeptide fragment to be tested carries its own label.
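When each peptide sample receives its own molecular tag, reads can later be assigned back to their sample by looking up the tag. The sketch below shows the idea under strong simplifying assumptions: the tag sequences and sample names are invented, the tag is assumed to sit at the start of the nucleic acid portion of the read, and matching is exact with no error tolerance.

```python
from typing import Optional

# Hypothetical molecular tags and the samples they label.
TAG_TO_SAMPLE = {
    "ACGT": "peptide_A",
    "TGCA": "peptide_B",
}


def assign_sample(read: str, tag_length: int = 4) -> Optional[str]:
    """Return the sample whose tag matches the start of the read, or None."""
    return TAG_TO_SAMPLE.get(read[:tag_length])


# Invented reads, for illustration only.
for read in ["ACGTTTGACC", "TGCAGGATCA", "GGGGACGTAC"]:
    print(read, "->", assign_sample(read))
```

Reads whose leading bases match no known tag are returned as `None` rather than being forced into a sample.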
[0103] Reagent test kit
[0104] In a third aspect, the present invention provides a kit. According to embodiments of the invention, the kit comprises a first oligonucleotide fragment, a second oligonucleotide fragment, and a third oligonucleotide fragment, wherein the first oligonucleotide fragment has a first modifying group at its 3' end, the second oligonucleotide fragment has a second modifying group at its 5' end, and the third oligonucleotide fragment is capable of hybridizing with the first and second oligonucleotide fragments through base complementarity pairing. According to embodiments of the invention, the kit can be used to prepare libraries of peptides and nucleic acids with directed conjugation, and exhibits high conjugation conversion efficiency and high yield.
[0105] According to an embodiment of the present invention, the first modifying group is at least one of a thioyl fluoride group, an epoxy group, an aziridine group, an aldehyde group, a haloalkyl group, a carboxyl group, and an enone group.
[0106] According to an embodiment of the present invention, the second modifying group is at least one of azide, phosphorus, azide, alkynyl, tetrazolium, cycloalkenyl, 1,2,4-triazine, ketone, aminooxy, aldehyde, thiol, alkene, cyclopentadiene, and bicyclo[6.1.0]nonyne.
[0107] According to embodiments of the present invention, the kit further includes at least one of the following reagents: lysine restriction endonuclease, glutamate C-terminal restriction endonuclease, chymotrypsin, enzyme digestion reaction buffer, adapter, ligase, ligation reaction buffer, reagents required for SuFEx reaction, nucleophilic ring-opening reaction, reductive amination reaction, alkylation substitution reaction, amide reaction, Staudinger ligation reaction, CuAAC reaction, SPAAC reaction, tetrazolium ligation reaction, IEDDA reaction, oxime ligation reaction, thiol-ene reaction, or Diels-Alder diene cycloaddition reaction.
[0108] Use
[0109] In a fourth aspect, the present invention proposes the use of the kit described in the third aspect of the present invention in high-throughput sequencing.
[0110] According to an embodiment of the present invention, the high-throughput sequencing includes single-molecule nanopore sequencing.
[0111] Methods for detecting peptides
[0112] In a fifth aspect, the present invention provides a method for detecting peptides. According to an embodiment of the present invention, the method includes: preparing a peptide library to be tested using the method described in the second aspect of the present invention; and detecting the peptide library to be tested to obtain amino acid information of the peptide to be tested. The method according to the embodiments of the present invention can accurately obtain amino acid information of the peptide, such as amino acid sequence, amino acid side chain modifications, or the presence or absence of the target peptide. In particular, when used for nanopore sequencing, it can improve the resolution and accuracy of peptide library sequencing data.
[0113] According to an embodiment of the present invention, detecting the polypeptide library to be tested and obtaining the amino acid information of the polypeptide to be tested includes: adding the polypeptide library to be tested into a detection solution chamber, and controlling the polypeptide library to pass through a nanopore by a motor protein under the action of an electric field, thereby obtaining the electrical signal corresponding to the polypeptide; decoding the electrical signal to determine the amino acid information of the polypeptide to be tested.
[0114] According to embodiments of the present invention, the amino acid information of the polypeptide includes amino acid sequence information and amino acid modification information.
[0115] The sequence of the present invention is shown in the table below:
[0116] Sequence list
[0117] In O1, P at the 5' end denotes a phosphate modification and CHO at the 3' end denotes a benzaldehyde modification; in O2, DBCO at the 5' end denotes a dibenzocyclooctyne modification; in O4, P at the 5' end denotes a phosphate modification and DBCO at the 3' end denotes a dibenzocyclooctyne modification; in the first linker chain, P at the 5' end denotes a phosphate modification; in the anchoring sequence, Chol-TEG at the 5' end denotes cholesterol-TEG (triethylene glycol); in the second linker chain, Z refers to iSpC3 and Y refers to iSpC18.
[0118] The present invention will be explained below with reference to embodiments. Those skilled in the art will understand that the following embodiments are for illustrative purposes only and should not be considered as limiting the scope of the invention. Where specific techniques or conditions are not specified in the embodiments, they are performed according to the techniques or conditions described in the literature in the field or according to the product instructions. Reagents or instruments whose manufacturers are not specified are all conventional products that can be obtained commercially.
[0119] Example 1: Preparation of sequencing libraries using the "one-pot" directional coupling method
[0120] 1. Point-to-point modification of samples
[0121] Single-end modification of peptides: Peptide samples P1 and P2 were dissolved in 20 μL of 0.1 mol / L NaHCO3 solution (pH=8.0) (Thermo D12345), respectively. 0.5 μL of 10 mM azide-PEG4-NHS ester (Thermo 26130) was added as a C-terminal amino-selective modification reagent. The reaction was carried out at 25 °C for 2 hours. 2 μL of 1 M Tris-HCl buffer (pH=8.0) was added as a quencher, and the reaction was carried out at room temperature for half an hour. The peptides were then desalted and purified using a peptide desalting column (Thermo 89851), and lyophilized to obtain the modified peptide samples M-P1 and M-P2, respectively.
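The amounts implied by the volumes and concentrations in this step can be checked with simple arithmetic. The sketch below assumes additive volumes; the helper name is illustrative, and the peptide amount is not stated in the text, so no reagent-to-peptide ratio is computed.

```python
def nmol(volume_ul, conc_mM):
    """Amount in nmol: 1 uL of a 1 mM solution contains 1 nmol."""
    return volume_ul * conc_mM


# 0.5 uL of 10 mM azide-PEG4-NHS ester
reagent_nmol = nmol(0.5, 10)                      # 5.0 nmol
# 2 uL of 1 M (1000 mM) Tris-HCl quencher
quencher_nmol = nmol(2, 1000)                     # 2000 nmol
# Quencher relative to the total reagent added
quencher_excess = quencher_nmol / reagent_nmol    # 400.0
# Reagent concentration in the 20 uL + 0.5 uL reaction (volumes assumed additive)
reagent_mM = reagent_nmol / (20 + 0.5)            # about 0.244 mM
```

On these figures the Tris quencher is present in roughly 400-fold molar excess over the total NHS ester added, which is consistent with its role of consuming unreacted reagent.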
[0122] 2. Sequencing library construction and sequencing
[0123] (1) Dissolve the required nucleic acid sequences, namely the first oligonucleotide fragment (O1) (nucleotide sequence as shown in SEQ ID NO: 1), the second oligonucleotide fragment (O2) (nucleotide sequence as shown in SEQ ID NO: 2), and the raw material nucleic acid sequence (O3) (nucleotide sequence as shown in SEQ ID NO: 3), in 250 mM phosphate solution (pH = 5.5) to form 200 μM stock solutions.
[0124] (2) Dissolve the modified peptide samples M-P1 and M-P2 from step 1 in dimethyl sulfoxide (Thermo D12345), respectively. Take 10 μL of the lyophilized modified peptide samples M-P1 and M-P2 dissolved in dimethyl sulfoxide (Thermo D12345), add them to a 1.5 mL centrifuge tube containing 20 μL of O1, 20 μL of O2, and 20 μL of O3, and react overnight at 65°C. After overnight reaction, a double-stranded nucleic acid-peptide-double-stranded nucleic acid complex is obtained. The structure of the complex is as follows:
[0125] However, since the conversion rate of bilateral coupling reactions cannot reach 100%, there are some products that are only coupled to one end of the peptide, such as O1-peptide M-P1 / P2 or O2-peptide M-P1 / P2, which are called single-end coupling products.
[0126] The reacted sample was desalted and purified using a nucleic acid purification column (NEB T1030L).
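As a rough check on the "one-pot" mixture of step (2), the final oligonucleotide concentrations follow from the stock concentration and the pipetted volumes. This sketch assumes additive volumes and a single 10 μL peptide aliquot per tube; the variable and function names are illustrative.

```python
def final_conc_uM(stock_uM, volume_ul, total_ul):
    """Concentration after dilution into the combined reaction volume."""
    return stock_uM * volume_ul / total_ul


volumes_ul = {"peptide_in_DMSO": 10, "O1": 20, "O2": 20, "O3": 20}
total_ul = sum(volumes_ul.values())  # 70 uL

# Each oligonucleotide comes from a 200 uM stock.
oligo_final_uM = {
    name: final_conc_uM(200, volumes_ul[name], total_ul)
    for name in ("O1", "O2", "O3")
}
# Each oligonucleotide amount in pmol (uM x uL = pmol)
oligo_pmol = {name: 200 * volumes_ul[name] for name in ("O1", "O2", "O3")}
# Volume fraction of DMSO contributed by the peptide aliquot
dmso_fraction = volumes_ul["peptide_in_DMSO"] / total_ul
```

Under these assumptions O1, O2, and O3 are equimolar at about 57 μM (4000 pmol each), and DMSO makes up about 14% of the reaction volume.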
[0127] (3) The sample desalted and purified in step (2) using a nucleic acid purification column (NEB T1030L) was electrophoresed at 150 V for 90 minutes on a 12% denaturing polyacrylamide gel. The results are shown in Figure 3. In Figure 3, the first lane is the DNA marker; in this embodiment, the Thermo Fisher ultra-low range DNA molecular weight standard ladder (10597012) is used. The second lane is the band of the raw nucleic acid sequence O3 (nucleotide sequence as shown in SEQ ID NO: 3). The third lane is the single-end ligation product O1-peptide M-P1 composed of the raw nucleic acid sequence O1 and the modified peptide M-P1 (nucleotide sequence as shown in SEQ ID NO: 4). The fourth lane is the double-end ligation product O1-peptide M-P1-O2 composed of the raw nucleic acid sequences O1 and O2 and the modified peptide M-P1. The fifth lane is the double-stranded complex band composed of the raw nucleic acid sequence O3 and the double-end ligation product O1-peptide M-P1-O2. The sixth lane is the double-stranded complex band composed of the raw nucleic acid sequence O3 and the double-end ligation product O1-peptide M-P2-O2 (nucleotide sequence as shown in SEQ ID NO: 4). The seventh lane is the single-end ligation product O2-peptide M-P1 composed of the raw nucleic acid sequence O2 and the modified peptide M-P1. The eighth lane is the double-stranded complex band of the positive control product O1-peptide-O2. The ninth lane is the negative control without ligated peptide, containing the raw nucleic acid sequences O1, O2, and O3. The band in the solid box is the double-stranded complex band composed of O3 and O1-peptide-O2.
[0128] (4) After desalting and purifying the samples in lanes 5 and 6, the samples were analyzed on a Thermo Fisher LCQ DECA XP PLUS triple quadrupole mass spectrometer. The deconvoluted mass spectrometry results are shown in Figure 4: the double-stranded complex composed of O3 and O1-peptide M-P1-O2, [M+H]+ = 20077.7; the double-stranded complex composed of O3 and O1-peptide M-P2-O2, [M+H]+ = 20074.1.
[0129] (5) Adapters were added to the double-stranded complex composed of O3 and O1-peptide M-P1-O2 (lane 5 sample) and to the double-stranded complex composed of O3 and O1-peptide M-P2-O2 (lane 6 sample), both desalted and purified in step (2), to prepare the O1-peptide M-P1-O2 sequencing library and the O1-peptide M-P2-O2 sequencing library, respectively. The adapters were made as follows: three partially complementary DNA strands (first strand: SEQ ID NO: 9, second strand: SEQ ID NO: 11, anchoring sequence: SEQ ID NO: 10) were annealed and then cross-linked with the motor protein (Dda) to form adapters.
[0130] (6) Nanopore sequencing was performed on the adapter-coupled O1-peptide M-P1-O2 and O1-peptide M-P2-O2 sequencing libraries from step (5) using a nanopore detection platform based on a patch-clamp platform. The platform was built with reference to the single-channel electrophysiological detection system in Geng Jia and Guo Peixuan (“Application of Phage phi29 DNA Packaging Motor Phospholipid Membrane Chimera in Single Molecule Detection and Nanomedicine”, Life Sciences, 2011, 23(11):1114-1129), and the porin (Sigma-Aldrich, H9395-5mg) was inserted into the phospholipid bilayer membrane to form a single-channel nanopore. The O1-peptide M-P1-O2 and O1-peptide M-P2-O2 sequencing libraries obtained in step (5) were added to the single-channel system. The changes in current amplitude and the current signals generated when the different libraries passed through the nanopore were detected and recorded using the patch-clamp system, and the data were analyzed to obtain the current signal diagrams shown in Figure 5. The upper part of Figure 5 is the sequencing signal diagram of the O1-peptide M-P1-O2 sequencing library, and the lower part of Figure 5 is the sequencing signal diagram of the O1-peptide M-P2-O2 sequencing library. The signals in the dashed boxes are the signals of the different peptide segments. Comparing the diagrams shows that the current decreases and fluctuates over time. Because the amino acids in the peptide sequence differ in charge and polarity, they generate different electrical signals, as shown in the dashed boxes. If peptides enter the pore in opposite orientations, the difference is clearly visible in the peptide signals, so different peptides can be sequenced according to their direction.
[0131] Example 2: Preparation of sequencing libraries using a stepwise directed coupling method
[0132] 1. Point-to-point modification of samples
[0133] Single-end modification of peptides: Peptide samples P2, P3, and P4 were dissolved in 20 μL of 0.1 mol / L NaHCO3 solution (pH = 8.0) (Thermo D12345), respectively. 0.5 μL of 10 mM azide-PEG4-NHS ester (Thermo 26130) was added as a C-terminal amino-selective modification reagent. The reaction was carried out at 25 °C for 2 hours. Then, 2 μL of 1 M Tris-HCl buffer (pH = 8.0) was added as a quencher, and the reaction was carried out at room temperature for half an hour. The peptides were then desalted and purified using a peptide desalting column (Thermo 89851), lyophilized, and the modified peptide samples M-P2, M-P3, and M-P4 were obtained, respectively.
[0134] 2. Sequencing library construction and sequencing
[0135] (1) The first oligonucleotide fragment (O4) of the required nucleic acid sequence (nucleotide sequence as shown in SEQ ID NO: 4) was dissolved in 1×PBS buffer (Thermo, 10010001) to form a 200 μM, 20 μL reaction system. The modified peptide samples M-P2, M-P3, and M-P4 from step 1 were each dissolved in dimethyl sulfoxide (Thermo D12345). 10 μL of each dissolved modified peptide sample was added to a 1.5 mL centrifuge tube containing O4 and reacted overnight at room temperature to obtain the nucleic acid-peptide single-end complexes O4-M-P2, O4-M-P3, and O4-M-P4. The reacted samples were then desalted and purified using a nucleic acid purification column (NEB T1030L). To 10 μL of each of the nucleic acid-peptide single-end complexes O4-M-P2, O4-M-P3, and O4-M-P4 were added 2 μL of 0.1 M KHCO3 solution (pH = 8.0) and 1 μL of 0.1 M 1H-imidazolium-1-sulfonyl azide hydrochloride (Santa Cruz sc-506866), and the reaction was carried out overnight at room temperature to convert the N-terminal amino group of the peptide to an azide group. The resulting sample was then desalted and purified using a nucleic acid purification column (NEB T1030L). 20 μL each of the nucleic acid sequences O2 and O3 (nucleotide sequences as shown in SEQ ID NO: 2 and SEQ ID NO: 3, respectively) dissolved in 250 mM phosphate solution (pH = 5.5) were added to 20 μL of the above 100 μM nucleic acid-peptide single-end complexes and reacted overnight at 65 °C. The resulting sample was then desalted and purified using a nucleic acid purification column (NEB T1030L).
[0136] (2) The sample reacted overnight at 65°C in step (1) was electrophoresed on a 12% denaturing polyacrylamide gel at 150 V for 90 minutes. The results are shown in Figure 6. In Figure 6, the first lane is the DNA marker; in this embodiment, the Thermo Fisher ultra-low range DNA molecular weight standard ladder (10597012) was used. The second lane is the band of the raw material nucleic acid sequence O3. The third lane is the band of the single-end ligation product O4-peptide M-P2 formed by reaction of the first oligonucleotide fragment O4 with the modified peptide M-P2. The fourth lane is the band of the single-end ligation product O4-peptide M-P3 formed by reaction of the first oligonucleotide fragment O4 with the modified peptide M-P3 (nucleotide sequence as shown in SEQ ID NO: 7). The fifth lane is the band of the single-end ligation product O4-peptide M-P4 formed by reaction of the first oligonucleotide fragment O4 with the modified peptide M-P4 (nucleotide sequence as shown in SEQ ID NO: 7). The sixth lane is the double-stranded complex band of O4-peptide M-P2-O2 formed by reaction of the raw material nucleic acid sequence O3, the second oligonucleotide fragment O2, and the single-end ligation product O4-M-P2. The seventh lane is the double-stranded complex band of O4-peptide M-P3-O2 formed by reaction of O3, O2, and the single-end ligation product O4-M-P3. The eighth lane is the double-stranded complex band of O4-peptide M-P4-O2 formed by reaction of O3, O2, and the single-end ligation product O4-M-P4. The ninth lane is the band of the raw material nucleic acid sequence O2.
The band in the solid box is the double-stranded complex band composed of O4-peptide-O2, and the band in the dashed box is the single-end ligation product band of O4-peptide. Because the gel is denaturing, in theory only single nucleic acid strands or covalent nucleic acid products appear as bands, and O3 from the double-stranded complex should appear as the band below the solid box.
[0137] (3) After desalting and purifying the samples in lanes 6, 7, and 8, the samples were analyzed on a Thermo Fisher LCQ DECA XP PLUS triple quadrupole mass spectrometer. The deconvoluted mass spectrometry results are shown in Figure 7: the double-stranded complex composed of O4-peptide M-P2-O2, [M+H]+ = 20364.9; the double-stranded complex composed of O4-peptide M-P3-O2, [M+H]+ = 19883.4; the double-stranded complex composed of O4-peptide M-P4-O2, [M+H]+ = 20415.1.
[0138] (4) Add adapters to the double-stranded complex composed of O3 and O4-peptide M-P2-O2 in the sixth lane sample, the double-stranded complex composed of O3 and O4-peptide M-P3-O2 in the seventh lane sample, and the double-stranded complex composed of O3 and O4-peptide M-P4-O2 in the eighth lane sample, respectively, to prepare O4-peptide M-P2-O2 sequencing libraries, O4-peptide M-P3-O2 sequencing libraries, and O4-peptide M-P4-O2 sequencing libraries, respectively.
[0139] (5) The O4-peptide M-P2-O2, O4-peptide M-P3-O2, and O4-peptide M-P4-O2 sequencing libraries obtained in step (4) were subjected to nanopore sequencing by the same method as step 2(6) in Example 1. The current signal diagrams obtained through data analysis are shown in Figure 8. The upper diagram of Figure 8 is the sequencing signal diagram of the O4-peptide M-P2-O2 sequencing library, the middle diagram is that of the O4-peptide M-P3-O2 sequencing library, and the lower diagram is that of the O4-peptide M-P4-O2 sequencing library. The signals in the dashed boxes are the signals of the different peptide segments. As shown in the figure, although the N-terminal and C-terminal amino acids of the three peptide sequences are the same, the current signals generated on entering the pore depend on the length of the sequence and the electrophoretic behavior of the amino acids, and the fluctuation trends of the high-order current differ. Three completely different signals can be seen by comparing the diagrams, so the peptides can be further distinguished and sequenced by an algorithm.
[0140] Comparative Example 1: Preparation of sequencing libraries using non-directional coupling method
[0141] The process of this comparative example is basically the same as that of Example 1, except that...
[0142] 1. Dissolve peptide samples P1 and P2 in 20 μL of 0.1 mol / L PBS (pH 7.4) (Thermo 10010023) to obtain dissolved peptide samples P1 and P2. Add an excess of the modification reagent NHS-PEG4-azide (Thermo 26130) to modify both the N-terminal amino group and the C-terminal K (lysine) side chain amino group of P1 and P2.
[0143] 2. Sequencing library construction and sequencing
[0144] (1) Dissolve the required nucleic acid sequence first oligonucleotide fragment (O4) (nucleotide sequence as shown in SEQ ID NO: 4), second oligonucleotide fragment (O2) (nucleotide sequence as shown in SEQ ID NO: 2), and raw material nucleic acid sequence (O3) (nucleotide sequence as shown in SEQ ID NO: 3) in PBS (pH 7.4) (Thermo 10010023) to form a 200 μM stock solution.
[0145] (2) Dissolve peptide samples P1 and P2 from step 1 in dimethyl sulfoxide (Thermo D12345), respectively. Take 10 μL of peptide samples P1 and P2 dissolved in NaHCO3 solution and add them to a 1.5 mL centrifuge tube containing 20 μL of O2, 20 μL of O3, and 20 μL of O4. Incubate overnight at room temperature. After overnight incubation, a double-stranded nucleic acid-peptide-double-stranded nucleic acid complex is obtained. The structure of the complex is as follows:
[0146] However, since the conversion rate of the bilateral coupling reaction cannot reach 100%, or the proportions of oligonucleotides O2, O3 and O4 added are not completely equal, there are some products that only couple one end of the polypeptide, such as O2-peptide P1 / P2 or O4-peptide P1 / P2, which are called single-end coupling products. See Figure 9 for the 10 coupling products.
[0147] The reacted sample was desalted and purified using a nucleic acid purification column (NEB T1030L).
[0148] (3) Add adapters to the desalted and purified sample from step (2) to prepare O4-peptide P1-O2 sequencing libraries, O2-peptide P1-O4 sequencing libraries, O2-peptide P2-O4 sequencing libraries, O4-peptide P2-O2 sequencing libraries, O2-peptide P1 sequencing libraries, O4-peptide P1 sequencing libraries, O2-peptide P2 sequencing libraries, and O4-peptide P2 sequencing libraries, respectively. The specific experimental procedure for making adapters is as follows: anneal three partially complementary DNA strands (first strand: SEQ ID NO: 9, second strand: SEQ ID NO: 11, anchoring sequence: SEQ ID NO: 10), and then crosslink them with motor protein (Dda) to form adapters.
[0149] (4) Nanopore sequencing was performed on the sequencing libraries with adapters added in step (3) using a patch-clamp platform. Referring to the single-channel electrophysiological detection system described by Geng Jia and Guo Peixuan (“Application of Phage phi29 DNA Packaging Motor Phospholipid Membrane Chimera in Single-Molecular Detection and Nanomedicine,” Life Sciences, 2011, 23(11):1114-1129), a nanopore sequencing platform based on a patch-clamp platform was constructed. Porin (Sigma-Aldrich, H9395-5mg) was inserted into the phospholipid bilayer membrane to form a single-channel nanopore. The sequencing libraries obtained in step (3) were added to this single-channel system. The changes in current amplitude and the current signals generated when different protein libraries passed through the nanopores were detected and recorded using the patch-clamp system, and the data were analyzed.
[0150] Comparative Example 1 and Example 1 constructed peptide libraries using the non-directional coupling method and the "one-pot" directional coupling method, respectively. After the libraries were sequenced, the two types of sequencing reads obtained by each method were counted, and the results are shown in Figure 10. The total number of sequencing reads (read counts) is the number of times a current signal was read after a peptide library fragment passed through the pore, that is, the number of current fluctuation signals below 200 pA appearing under the action of intermittent opening current. Each current fluctuation signal can be regarded as produced by one peptide library fragment passing through the pore. The number of sequencing reads of the paired-end ligated nucleic acid-peptide library (OPO counts) is the number of times the current signal within the dashed box shown in Figure 5 appeared after a peptide library fragment passed through the pore. This current signal is characterized by relatively high peak regions on both sides. The effective signal percentage was then calculated from the results in Figure 10: effective signal percentage = number of sequencing reads of the paired-end ligated nucleic acid-peptide library (OPO counts) ÷ total number of sequencing reads (read counts). The effective signal percentage of the "one-pot" directional coupling method was 17.77%, higher than the 13.03% of the non-directional coupling method. This result indicates that, compared with the non-directional coupling method, the directional coupling method has a higher ligation conversion rate and yield and gives a higher effective signal percentage during sequencing.
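The two quantities defined above can be sketched in a few lines. The example assumes a simple threshold rule for counting current fluctuation signals below 200 pA; the application does not describe its event-detection algorithm. The function names, the toy current trace, and the toy counts are all hypothetical.

```python
def count_reads(trace_pA, threshold_pA=200.0):
    """Count contiguous runs of samples below the threshold as separate reads."""
    reads = 0
    in_event = False
    for current in trace_pA:
        below = current < threshold_pA
        if below and not in_event:
            reads += 1  # a new fluctuation signal begins here
        in_event = below
    return reads


def effective_signal_percentage(opo_counts, read_counts):
    """Effective signal percentage = OPO counts / read counts, as a percentage."""
    if read_counts <= 0:
        raise ValueError("read_counts must be positive")
    return 100.0 * opo_counts / read_counts


# Toy trace (pA): two separate excursions below 200 pA.
toy_trace = [250, 250, 150, 140, 260, 180, 255]
toy_reads = count_reads(toy_trace)
# Toy counts: 2 paired-end (OPO) reads out of 8 total reads.
toy_percentage = effective_signal_percentage(2, 8)
```

The percentages reported above (17.77% and 13.03%) would follow from the same ratio applied to the counts plotted in Figure 10, which are not reproduced here.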
[0151] In the description of this specification, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
[0152] Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
Claims
1. A method for preparing nucleic acid-peptide complexes, characterized in that the method includes: providing a polypeptide fragment to be tested and an oligonucleotide fragment, wherein the oligonucleotide fragment includes a first oligonucleotide fragment and a second oligonucleotide fragment, the first oligonucleotide fragment having a first modifying group at its 3' end and the second oligonucleotide fragment having a second modifying group at its 5' end; and directionally linking the polypeptide fragment to be tested to the first oligonucleotide fragment and the second oligonucleotide fragment to obtain the nucleic acid-peptide complex; wherein the polypeptide fragment to be tested has a modifying group N at its N-terminus and a modifying group C at its C-terminus, the modifying group N at the N-terminus of the polypeptide fragment to be tested is adapted to be linked to the first modifying group at the 3' end of the first oligonucleotide fragment, and the modifying group C at the C-terminus of the polypeptide fragment to be tested is adapted to be linked to the second modifying group at the 5' end of the second oligonucleotide fragment.
2. The method according to claim 1, characterized in that the polypeptide fragment to be tested is obtained by prior enzymatic digestion.
3. The method according to claim 1, characterized in that the directional linking includes a first directional linking and a second directional linking; the first directional linking includes linking the modifying group N at the N-terminus of the polypeptide fragment to be tested to the first modifying group at the 3' end of the first oligonucleotide fragment; the second directional linking includes linking the modifying group C at the C-terminus of the polypeptide fragment to be tested to the second modifying group at the 5' end of the second oligonucleotide fragment; and the first directional linking and the second directional linking are performed simultaneously or stepwise.
4. The method according to claim 3, characterized in that the first directional linking is carried out via a SuFEx reaction, nucleophilic ring-opening reaction, reductive amination reaction, alkylation substitution reaction, or amide reaction.
5. The method according to claim 4, characterized in that the first modifying group is at least one of the following: a thioyl fluoride group, an epoxy group, an aziridine group, an aldehyde group, a haloalkyl group, a carboxyl group, and an enone group; optionally, the modifying group N is -NH2.
6. The method according to claim 3, characterized in that the second directional linking is carried out via a Staudinger ligation reaction, CuAAC reaction, SPAAC reaction, tetrazolium ligation reaction, IEDDA reaction, oxime ligation reaction, thiol-ene reaction, or Diels-Alder diene cycloaddition reaction.
7. The method according to claim 6, characterized in that the modifying group C and the second modifying group are selected from any one of the following pairs: (a) azide and phosphorus groups; (b) azide and alkyne groups; (c) tetrazolyl and cycloalkenyl groups; (d) tetrazolyl and 1,2,4-triazine groups; (e) ketone and aminooxy groups; (f) aldehyde and aminooxy groups; (g) thiol and olefin groups; and (h) cyclopentadiene and bicyclo[6.1.0]nonyne.
8. The method according to claim 7, characterized in that when the modifying group C is an azide group, the second modifying group is a phosphorus group or an alkyne group; when the modifying group C is a tetrazolium group, the second modifying group is a cycloalkenyl group or a 1,2,4-triazine group; when the modifying group C is a ketone group or an aldehyde group, the second modifying group is an aminooxy group; when the modifying group C is a thiol group, the second modifying group is an olefin group; or when the modifying group C is cyclopentadiene, the second modifying group is bicyclo[6.1.0]nonyne.
9. The method according to claim 2, characterized in that the enzymatic digestion is performed using an endopeptidase.
10. The method according to claim 9, characterized in that the endopeptidase is selected from at least one of the following: lysine C-terminal endopeptidase, glutamate C-terminal endopeptidase, and chymotrypsin.
11. The method according to any one of claims 1-10, characterized in that the method further includes: providing a third oligonucleotide fragment, which hybridizes with the first and second oligonucleotide fragments through base complementary pairing to obtain the nucleic acid-peptide complex.
12. The method according to claim 11, characterized in that the hybridization of the third oligonucleotide fragment with the first and second oligonucleotide fragments and the directional linking of the polypeptide fragment to be tested with the first and second oligonucleotide fragments are carried out simultaneously.
13. The method according to claim 11, characterized in that the polypeptide fragment to be tested is first directionally linked to the first oligonucleotide fragment and the second oligonucleotide fragment, and the third oligonucleotide fragment is then hybridized with the first oligonucleotide fragment and the second oligonucleotide fragment.
14. The method according to claim 11, characterized in that the polypeptide fragment to be tested is first directionally linked to the first oligonucleotide fragment to obtain a polypeptide-first oligonucleotide fragment ligation product; the polypeptide-first oligonucleotide fragment ligation product is then directionally linked to the second oligonucleotide fragment, and the third oligonucleotide fragment is hybridized to the first oligonucleotide fragment and the second oligonucleotide fragment.
15. The method according to any one of claims 1-14, characterized in that, following the directional linking, the method further includes removing unlinked polypeptide fragments to be tested; optionally, the unlinked polypeptide fragments are removed using a nucleic acid purification column.
16. A method for preparing a polypeptide library, characterized in that the method includes: preparing the nucleic acid-peptide complex by the method according to any one of claims 1-15; and providing an adapter and linking the adapter to the nucleic acid-peptide complex to obtain the polypeptide library.
17. The method according to claim 16, characterized in that the adapter is linked to the first oligonucleotide fragment in the nucleic acid-peptide complex.
18. The method according to claim 16, characterized in that the adapter is a Y-type adapter, which is linked to the first oligonucleotide fragment and the third oligonucleotide fragment in the nucleic acid-peptide complex.
19. The method according to claim 16, characterized in that the adapter includes a molecular tag.
20. The method according to claim 19, characterized in that there are multiple polypeptide fragments to be tested, and different polypeptide fragments to be tested are linked to adapters with different molecular tags to obtain a polypeptide library in which each different polypeptide fragment to be tested is labeled.
21. A kit, characterized in that it includes a first oligonucleotide fragment, a second oligonucleotide fragment, and a third oligonucleotide fragment, wherein the first oligonucleotide fragment has a first modifying group at its 3' end, the second oligonucleotide fragment has a second modifying group at its 5' end, and the third oligonucleotide fragment is capable of hybridizing with the first oligonucleotide fragment and the second oligonucleotide fragment through base complementary pairing.
22. The kit according to claim 21, characterized in that the first modifying group is selected from at least one of the following: a thioyl fluoride group, an epoxy group, an aziridine group, an aldehyde group, a haloalkyl group, a carboxyl group, and an enone group.
23. The kit according to claim 21, characterized in that the second modifying group is selected from at least one of azide, phosphorus, alkynyl, tetrazolium, cycloalkenyl, 1,2,4-triazine, ketone, aminooxy, aldehyde, thiol, alkene, cyclopentadiene, and bicyclo[6.1.0]nonyne.
24. The kit according to any one of claims 21-23, characterized in that the kit further includes at least one of the following: lysine C-terminal endopeptidase, glutamate C-terminal endopeptidase, chymotrypsin, an enzyme digestion reaction buffer, an adapter, a ligase, a ligation reaction buffer, and the reagents required for a SuFEx reaction, nucleophilic ring-opening reaction, reductive amination reaction, alkylation substitution reaction, amide reaction, Staudinger ligation reaction, CuAAC reaction, SPAAC reaction, tetrazolium ligation reaction, IEDDA reaction, oxime ligation reaction, thiol-ene reaction, or Diels-Alder diene cycloaddition reaction.
25. Use of the kit according to any one of claims 21-24 in high-throughput sequencing; optionally, the high-throughput sequencing includes single-molecule nanopore sequencing.
26. A method for detecting polypeptides, characterized in that the method includes: preparing a polypeptide library to be tested using the method according to any one of claims 16-20; and detecting the polypeptide library to be tested to obtain the amino acid information of the polypeptide to be tested.
27. The method according to claim 26, characterized in that detecting the polypeptide library to be tested to obtain the amino acid information of the polypeptide to be tested includes: adding the polypeptide library to be tested to a detection solution chamber; controlling the polypeptide library to pass through a nanopore by a motor protein under the action of an electric field, thereby obtaining the electrical signal corresponding to the polypeptide; and decoding the electrical signal to determine the amino acid information of the polypeptide to be tested.
28. The method according to claim 27, characterized in that the amino acid information of the polypeptide includes amino acid sequence information and amino acid modification information.