A method for improving the accuracy of antibody CDR3 polypeptide mass spectrum alignment

By chemically modifying the N-terminus of antibody CDR3 peptides and screening secondary spectra for N-terminal ion matching, the method addresses the false positive problem in mass spectrometric identification of antibody CDR3 peptides and improves identification accuracy.

CN122218239A | Pending | Publication Date: 2026-06-16 | CHANGPING NAT LAB +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHANGPING NAT LAB
Filing Date
2024-12-09
Publication Date
2026-06-16

Abstract

The present application provides a method for improving the accuracy of mass spectrum alignment of antibody CDR3 polypeptide. The method changes the logic of CDR3 spectrum recognition from the bottom according to the sequence characteristics of the antibody itself, thereby improving the accuracy of recognition.

Description

Technical Field

[0001] This application relates to the fields of analytical chemistry and biochemistry, specifically to a method for improving the accuracy of mass spectrometry alignment of antibody CDR3 peptides and its application in mass spectrometry-based proteomics detection of antibodies.

Background Technology

[0002] Antibodies are an important component of humoral immunity. Detecting antibodies produced by the body plays a crucial role in exploring immune mechanisms, vaccine development, and monoclonal antibody development. The antibodies produced by the body are diverse, but in practice, only a specific subset is focused on, such as antibodies against a particular antigen. Therefore, detecting the actual types of antibodies present in a sample is a more important technical challenge than detecting the total antibody count.

[0003] Mass spectrometry-based proteomics is a direct method for detecting antibody types in samples. The antigen specificity of an antibody is primarily determined by the CDR3 region of the antibody heavy chain; therefore, mass spectrometry analysis focuses on CDR3 peptides. Whether a sample contains a CDR3 peptide from a database is determined by comparing the mass spectrometry signal with a simulated spectrum generated from the database. Specifically, mass spectrometry first measures the intact molecular weight of each peptide, generating a spectrum called the primary spectrum. Once the intact molecular weight of a peptide has been determined from the primary spectrum, peptide sequences of similar molecular weight are retrieved from the database as candidate sequences. Next, individual peptides are fragmented in the mass spectrometer, generating fragment spectra (hereinafter referred to as secondary spectra). Each fragment spectrum is compared with the theoretical spectra generated from the corresponding candidate sequences, and the most similar candidate sequence is taken as the peptide sequence. Because the algorithms of general-purpose spectrum matching software are designed for ordinary proteomic samples, their ability to discriminate between extremely similar antibody sequences is limited. This can lead to mismatches in the secondary spectra, resulting in a high false positive rate and an inability to correctly identify the types of antibody in the sample.
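The candidate search by intact molecular weight can be sketched in Python as follows. This is a minimal illustration, not the implementation used by any search software: the function names, the toy database, and the default tolerance of 3 ppm are assumptions made for this example, and the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_within_tolerance(observed_mass, database, tol_ppm=3.0):
    """Keep database sequences whose mass lies within +/- tol_ppm of the observation."""
    return [seq for seq in database
            if abs(ppm_error(observed_mass, peptide_mass(seq))) <= tol_ppm]


# Toy database (hypothetical): GAK and AGK are isomers, GGK is lighter.
toy_db = ["GAK", "AGK", "GGK"]
hits = candidates_within_tolerance(peptide_mass("GAK"), toy_db)
```

In this toy database, GAK and AGK have the same composition and therefore the same mass, so both pass the primary-spectrum filter. Only the fragment spectrum can tell them apart.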

[0004] For the problem of mismatches in secondary spectra, existing solutions involve reducing the tolerance of the primary spectra, which is the difference between the complete molecular weight of the peptide detected by the primary spectra and the theoretical molecular weight of the candidate peptide. When secondary spectra identification is difficult, reducing the tolerance of the primary spectra narrows the range of candidate peptides, thereby reducing the probability of false matches. This method effectively reduces false positives in the identification of general antibody sample spectra and is widely used. However, this global strategy does not address the characteristics of the antibody sequence; for signals with low abundance, false positive matches are still difficult to avoid.

Summary of the Invention

[0005] To address the aforementioned issues, the inventors, through in-depth research, propose a novel method to improve the accuracy of mass spectrometry alignment of antibody CDR3 peptides. The method consists of two parts: an algorithm and an experimental procedure. For the algorithm, statistical analysis of the sequences surrounding the antibody CDR3 revealed that mismatches arise primarily because the conserved region of the CDR3 peptide carries a large weight in the spectral matching score, which inflates the scores of false signals. The conserved region is located at the C-terminus, while the highly variable region is located at the N-terminus. Therefore, in addition to reducing the tolerance of the primary spectrum, we propose to directly assess the degree of N-terminal ion matching in the secondary spectrum. Only spectra with good N-terminal ion matching are regarded as true signals, while those with poor N-terminal ion matching are regarded as false positives. This strategy starts from the antibody's inherent sequence characteristics and fundamentally changes the logic of CDR3 spectrum recognition, thereby improving accuracy. For the experimental procedure, on top of traditional enzymatic digestion and sample preparation, we propose to chemically modify the N-terminus of the peptide to enhance the N-terminal ion signal, further reducing false positives in CDR3 spectrum recognition and improving accuracy.

[0006] Therefore, on the one hand, the present invention provides a method for improving the accuracy of mass spectrometry alignment of the antibody CDR3 peptide, the method comprising:

[0007] 1) Mass spectrometry analysis is performed on the sample containing the antibody CDR3 peptide to obtain a primary spectrum. By comparing the spectrum with a database, multiple candidate sequences of the CDR3 peptide are obtained from the database. The molecular weight of each candidate sequence is extremely close to that of the CDR3 peptide, i.e., the difference (error) in molecular weight between the candidate sequence and the CDR3 peptide is within the tolerance range, typically ±1.5 parts per million (ppm), ±3 ppm, or ±4.5 ppm, etc.

[0008] 2) The secondary spectrum (i.e., fragment spectrum) of the sample is obtained and compared with the theoretical secondary spectrum of each candidate sequence, and the candidate sequence(s) whose theoretical N-terminal ion spectrum matches the N-terminal ion spectrum of the sample's secondary spectrum are selected.

[0009] In some embodiments, the method further includes reducing the tolerance of the primary spectrum in step 1) when secondary spectrum identification is difficult.

[0010] In some embodiments, the sample is subjected to enzymatic digestion before step 1) in the method.

[0011] In some embodiments, enzymatic digestion is performed using a protease, such as trypsin, chymotrypsin, endoproteinase Asp-N, or endoproteinase Glu-C.

[0012] In some embodiments, the sample in the method is a sample containing antibody molecules, such as plasma or serum.

[0013] In some embodiments, the method further includes: chemically modifying the N-terminus of the antibody CDR3 peptide with a molecule that can enhance ionic charge and has N-terminal modification specificity to enhance its N-terminal ion spectrum signal.

[0014] In some embodiments, the method involves using a small molecule ox1a to chemically modify the N-terminus of the antibody CDR3 peptide.

[0015] In another aspect of the invention, the application of the method of the invention in mass spectrometry-based proteomics detection is provided, wherein the protein detected in the proteomics is an antibody.

[0016] In another aspect of the present invention, a mass spectrometry-based proteomics detection method is provided, wherein the protein detected by the proteomics is an antibody, and the method is characterized by comprising: in a secondary spectrum comparison, using the ion spectrum of the hypervariable region of CDR3, CDR1 or CDR2 or a combination thereof as one of the screening criteria.

[0017] Other aspects and advantages of this application will readily be apparent to those skilled in the art from the detailed description below. Only exemplary embodiments of this application are shown and described in the following detailed description. As will be appreciated by those skilled in the art, the content of this application enables them to make modifications to the disclosed specific embodiments without departing from the spirit and scope of the invention to which this application pertains. Accordingly, the descriptions in the accompanying drawings and specification of this application are merely exemplary and not restrictive.

Attached Figure Description

[0018] The above features and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, wherein:

[0019] Figure 1 presents a statistical analysis of sequence differences in the CDR3 region of the antibody heavy chain;

[0020] Figure 2 shows an example of matching a real mass spectrum against highly similar CDR3 peptides (a comparison between the real spectrum generated by mass spectrometry and the theoretical spectra of two candidate peptides obtained from the library search. The real spectrum is drawn in black lines, and the colors in the spectrum mark the theoretical spectra of the corresponding peptide fragments. MW: molecular weight);

[0021] Figure 3 is a flowchart of the secondary spectrum identification strategy based on N-terminal fragments;

[0022] Figure 4 shows an example of antibody identification in a sample using secondary spectra of N-terminal fragments (A: SDS-PAGE result. The left lane is the antibody standard (NIST reference material RM 8671), and the right lane is the affinity-purified antibody sample of unknown sequence against mumps virus core protein from volunteer plasma. B: the top shows the list of peptide results detected by the Peaks search software, sorted in descending order of software score; the bottom shows the "spectrum-peptide" matching result for candidate peptide number 71. C: ELISA results verifying antibody specificity. The leftmost is the positive control, the rightmost is the selected candidate antibody, and the five in the middle are all negative control antibodies); and

[0023] Figure 5 shows an example of enhancing the N-terminal fragment signal through chemical modification (A: chemical modification process. Ox1a is the small molecule used for modification; it modifies the N-terminal amino group of the peptide and enhances the N-terminal ion spectrum signal in mass spectrometry. B: the top is an example of the secondary spectrum result of the unmodified peptide, and the bottom is an example of the secondary spectrum result of the modified peptide. C: statistical results of spectrum matching accuracy before and after modification. Blue represents the unmodified peptide, and red represents the modified peptide).

Detailed Implementation

[0024] Unless otherwise indicated, the terms used herein have their general technical meanings as understood by those skilled in the art.

[0025] This invention provides a novel method for improving the accuracy of mass spectrometry alignment of antibody CDR3 peptides. It consists of two parts: an algorithmic part and an experimental part.

[0026] (a) Algorithm: Because antibody CDR3 peptides are prone to misidentification in secondary spectra, false positives are reduced by directly screening spectrum-peptide matches on how well the N-terminal ions of the secondary spectrum match. Furthermore, the traditional strategy of reducing the primary-spectrum tolerance is combined with the N-terminal ion screening of the secondary spectrum to identify the antibody CDR3 peptides in real samples.

[0027] (b) Experimental aspects: N-terminal chemical modification of the peptide enhances its N-terminal ion signal, improving the accuracy of spectral identification. This, combined with the developed algorithm, reduces false positives in spectral identification.

[0028] The present invention is further illustrated in the following embodiments. These embodiments are for illustrative purposes only and are not intended to limit the scope of the invention.

[0029] Example 1: Statistical analysis of sequence differences in the CDR3 peptide (feasibility of using sequence difference features for spectral identification)

[0030] The antigen specificity of an antibody is primarily determined by the heavy chain CDR3 sequence, so analysis of antigen specificity focuses mainly on the CDR3 region. We performed high-throughput nucleic acid sequencing (for example, Illumina sequencing) on human peripheral blood B cells to obtain a collection of antibody sequences. Statistical analysis of the CDR3 region of these sequences yielded a distribution of amino acid usage at each site, as shown in Figure 1. In the upper part of the statistical graph, CDR3 is marked by a blue bar. Furthermore, the antibody mixture must be enzymatically digested before mass spectrometry analysis; trypsin is most commonly used, and it cleaves at the carboxyl side of arginine (R) and lysine (K). The peptide covering CDR3 (hereinafter referred to as the CDR3 peptide) therefore mainly corresponds to the peptide marked by the green label at the top of Figure 1. As shown in Figure 1, the sequence diversity of the antibody CDR3 peptide is uneven: the diversity at its amino terminus (hereinafter referred to as the N-terminus) is significantly higher than that at its carboxyl terminus (hereinafter referred to as the C-terminus).
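One common way to quantify per-site diversity of this kind is the Shannon entropy of amino acid usage at each position. The sketch below is illustrative only: the description does not state which statistic underlies Figure 1, the function name is hypothetical, the four peptides are invented toy data, and the sequences are assumed to be pre-aligned to equal length.

```python
from collections import Counter
from math import log2


def positional_entropy(aligned):
    """Shannon entropy (bits) of amino acid usage at each aligned position."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    entropies = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column)
        entropies.append(
            sum(-(n / total) * log2(n / total) for n in counts.values())
        )
    return entropies


# Toy peptides (invented): variable N-terminus, conserved C-terminus.
TOY = ["DTWGK", "HGWGK", "DSWGK", "ESWGK"]
entropy_per_site = positional_entropy(TOY)
```

For the toy set, the two N-terminal positions carry all of the entropy and the three conserved C-terminal positions carry none, which is the qualitative pattern the paragraph above describes.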

[0031] As mentioned earlier, when analyzing mass spectrometry proteomics data, the theoretical secondary spectra of candidate peptides must be matched against the experimentally acquired secondary spectra. If two candidate peptide sequences are too similar, they are difficult to distinguish. Figure 2 shows an actual mass spectrum and the search engine results for two candidate CDR3 peptides (HGSIGARQNWFDPWGQGTLVTVSSASTK, SEQ ID No: 1; DSGTYPPVPIFEYWGQGTLVTVSSASTK, SEQ ID No: 2). The two CDR3 peptides have very similar molecular weights, making them difficult to distinguish in the primary spectrum. In the secondary spectrum, their C-terminal ions are identical and give clear matching signals. These signals dominate the confidence score in the mass spectrometry search software, further hindering differentiation between the two peptides. We therefore focus on using the N-terminal fragments in the secondary spectrum to improve identification accuracy.
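The difficulty with these two candidates can be checked numerically. The following sketch computes their precursor mass difference in ppm and counts the C-terminal (y) ions they share. It assumes standard monoisotopic residue masses, singly charged unmodified fragments, and a fragment tolerance of 0.02 Da; the tolerance and the function names are choices made for this illustration and are not taken from the description.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

SEQ1 = "HGSIGARQNWFDPWGQGTLVTVSSASTK"  # SEQ ID No: 1
SEQ2 = "DSGTYPPVPIFEYWGQGTLVTVSSASTK"  # SEQ ID No: 2


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def y_ions(seq):
    """Singly charged y-ion m/z values y1..y(n-1), built from the C-terminus."""
    ions, running = [], WATER + PROTON
    for aa in reversed(seq[1:]):
        running += MONO[aa]
        ions.append(running)
    return ions


def shared_y_count(a, b, tol_da=0.02):
    """Number of same-index y ions of a and b that agree within tol_da."""
    return sum(abs(x - y) <= tol_da for x, y in zip(y_ions(a), y_ions(b)))


delta_ppm = (peptide_mass(SEQ1) - peptide_mass(SEQ2)) / peptide_mass(SEQ2) * 1e6
shared = shared_y_count(SEQ1, SEQ2)
```

Under these assumptions the two precursor masses differ by only a few ppm, and 15 of the 27 y ions coincide, because both peptides end in WGQGTLVTVSSASTK. A score that counts every matched fragment equally therefore rewards both candidates for the same C-terminal peaks.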

[0032] Example 2: New Algorithm: Secondary Spectrum Recognition Strategy Based on N-Terminal Ions

[0033] Based on the above analysis, N-terminal ions play a crucial role in spectral identification. We therefore propose a secondary spectrum screening strategy based on the degree of N-terminal ion matching.

[0034] As shown in Figure 3, after mass spectrometry signals were acquired on a Thermo Orbitrap Eclipse Tribrid mass spectrometer, a stringent primary spectrum comparison was performed with the Peaks search software against a sequence database obtained by Illumina sequencing of human peripheral blood B cells, yielding preliminary results. The resulting spectrum-peptide matches were then screened by the degree of N-terminal ion matching: matches with good N-terminal ion matching were considered true matches, while those without were assigned low confidence. This confirmed the presence of the true CDR3 peptide and its corresponding antibody in the sample.

[0035] Specifically, after enzymatic digestion, the antibody is analyzed by mass spectrometry to obtain spectra. In the primary-spectrum search against the CDR3 sequence database, a low tolerance is used to reduce false positives. Each candidate spectrum-peptide match obtained is then judged by its N-terminal fragment matching: if the N-terminal fragments cover the peptide well, the match is considered true; otherwise it is considered unreliable. This process yields the CDR3 peptides actually present in the sample and their corresponding antibodies.
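The N-terminal screening step can be sketched as a coverage check on the N-terminal (b) ions of each candidate. This is a simplified illustration and not the inventors' implementation: the description gives no numerical criterion for "good" matching, so the coverage threshold of 0.5, the fragment tolerance of 0.02 Da, the restriction to singly charged b ions, and all function names are assumptions.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276


def b_ions(seq):
    """Singly charged b-ion m/z values b1..b(n-1), built from the N-terminus."""
    ions, running = [], PROTON
    for aa in seq[:-1]:
        running += MONO[aa]
        ions.append(running)
    return ions


def b_ion_coverage(seq, peaks, tol_da=0.02):
    """Fraction of theoretical b ions found among the observed peak m/z values."""
    ions = b_ions(seq)
    matched = sum(any(abs(mz - p) <= tol_da for p in peaks) for mz in ions)
    return matched / len(ions)


def screen_matches(candidates, peaks, min_coverage=0.5, tol_da=0.02):
    """Keep candidates whose b-ion coverage reaches min_coverage (assumed cutoff)."""
    return [seq for seq in candidates
            if b_ion_coverage(seq, peaks, tol_da) >= min_coverage]


# Toy example (invented peptides): both candidates end in WGK.
true_peptide, decoy = "DSGTWGK", "HGSIWGK"
# Observed peaks: b ions of the true peptide plus the shared y1..y3 of "WGK".
observed = b_ions(true_peptide) + [147.1128, 204.1343, 390.2136]
kept = screen_matches([true_peptide, decoy], observed)
```

In the toy example the shared C-terminal peaks do not help the decoy, because only N-terminal ions are counted. A practical version would also have to handle multiply charged ions, peak intensities, and peptides too long to fragment evenly.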

[0036] Example 3: New Algorithm: Experimental Example of the Secondary Spectrum Recognition Strategy Based on N-Terminal Ions

[0037] The sample was an antibody specific for the mumps virus core protein (mumps nucleocapsid, MumpsNP), purified from volunteer plasma (Figure 4A). It was digested with trypsin (Promega) and then analyzed by mass spectrometry.

[0038] The data were first screened using the traditional primary-spectrum tolerance method, and the search software returned the possible candidate peptides (Figure 4B). Using the N-terminal fragment identification strategy, we obtained a CDR3 "spectrum-peptide" match that we considered a true positive, ranked 71st by software score (Figure 4C). The corresponding antibody was expressed and synthesized, and verification showed that it is specific for the mumps virus core protein (Figure 4D). This demonstrates that we successfully identified the antibody CDR3 peptide in the sample using the secondary spectrum identification strategy based on N-terminal fragments, and thereby found the corresponding antibody.

[0039] Example 4: Chemical modification of the N-terminus of the polypeptide to enhance its N-terminal ion signal.

[0040] Our proposed secondary spectrum recognition strategy based on N-terminal ions effectively reduces false positives in mass spectrometry spectrum identification. However, when the CDR3 peptide fragment is too long or its abundance is too low, a good N-terminal ion signal cannot be obtained, thus failing to achieve a correct spectrum-peptide matching result. Therefore, enhancing its N-terminal ion signal through chemical modification is beneficial for obtaining better spectrum-peptide matching results.

[0041] As shown in Figure 5A, we commissioned Genscript Biotech to synthesize the CDR3 peptide with the sequence DTYYGGHSNLDLWGQGTLVTVSSASTK (SEQ ID No: 3), and commissioned Hangzhou Yuhao Chemical Technology Co., Ltd. to synthesize the small molecule ox1a (https://doi.org/10.1002/ange.202007608), which we then used to chemically modify the peptide. This small molecule specifically modifies the N-terminus of the peptide and enhances the charge-carrying capacity of the modified portion, thereby increasing the signal of the N-terminal ions in the mass spectrum. In Figure 5B, the top panel shows an example of a library search result for the secondary spectrum of the unmodified peptide, and the bottom panel shows an example for the modified peptide. In the top panel, a mismatch occurs because N-terminal ions are absent. The bottom panel shows that the small molecule enhances the N-terminal ion signal of the peptide (indicated by the orange-yellow arrow), which helps obtain better "spectrum-peptide" matching results and reduces mismatches. Figure 5C shows the statistics of spectrum matching accuracy before and after modification. As peptide abundance decreases, precursor ion intensity decreases and recognition accuracy falls. At the same precursor ion intensity, the modified peptide yields spectra with higher recognition accuracy.
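When modified peptides are searched, the theoretical spectrum has to account for the mass added at the N-terminus: every b ion carries the modification, while y ions are unchanged. The sketch below illustrates this bookkeeping only. The mass of the ox1a adduct is not given in this description, so `HYPOTHETICAL_DELTA` is a placeholder and not a real value. The code also treats the modification as a plain mass shift on singly protonated ions, which is an assumption; a tag carrying a fixed charge would need different charge handling.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(seq, nterm_delta=0.0):
    """Return (b, y) lists of singly charged m/z values.

    nterm_delta is the mass added by an N-terminal modification (assumed
    to behave as a simple mass shift).
    """
    b, running = [], PROTON + nterm_delta
    for aa in seq[:-1]:
        running += MONO[aa]
        b.append(running)
    y, running = [], WATER + PROTON
    for aa in reversed(seq[1:]):
        running += MONO[aa]
        y.append(running)
    return b, y


PEPTIDE = "DTYYGGHSNLDLWGQGTLVTVSSASTK"  # SEQ ID No: 3
HYPOTHETICAL_DELTA = 100.0  # placeholder, NOT the real mass of the ox1a adduct

plain_b, plain_y = fragment_ions(PEPTIDE)
mod_b, mod_y = fragment_ions(PEPTIDE, HYPOTHETICAL_DELTA)
```

Because only the b series moves, peaks that shift by the modification mass between the unmodified and modified spectra can be attributed to the N-terminal side of the peptide.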

[0042] In addition, the experimental method of enhancing N-terminal ions through chemical modification of the N-terminus of peptides and the new algorithm for secondary spectrum recognition based on N-terminal fragment ions can be used together to obtain better results.

[0043] Those skilled in the art should understand that although the present invention has been specifically described with reference to the above embodiments, the present invention is not limited to these specific embodiments. Based on the methods and technical solutions taught in this invention, those skilled in the art can make appropriate modifications or improvements without departing from the spirit of the present invention, and the equivalent embodiments obtained therefrom are all within the scope of the present invention.

Claims

1. A method for improving the accuracy of mass spectrometry alignment of an antibody CDR3 peptide, the method comprising: 1) performing mass spectrometry analysis on a sample containing the antibody CDR3 peptide to obtain a primary spectrum, and obtaining candidate sequences of the CDR3 peptide from a database by comparison with the database, wherein the candidate sequences have molecular weights extremely similar to that of the CDR3 peptide (i.e., the molecular weight error is within the tolerance range, typically ±1.5 parts per million, ±3 parts per million, or ±4.5 parts per million, etc.); and 2) obtaining the secondary spectrum (i.e., fragment spectrum) of the sample, comparing it with the theoretical secondary spectra of the candidate sequences, and selecting from the candidate sequences the sequence whose theoretical N-terminal ion spectrum matches the N-terminal ion spectrum of the sample's secondary spectrum.

2. The method according to claim 1, wherein the method further comprises: reducing the tolerance of the primary spectrum in step 1) in cases where secondary spectrum identification is difficult.

3. The method according to claim 1 or 2, wherein the sample is subjected to enzymatic digestion before step 1).

4. The method according to claim 3, wherein the enzymatic digestion is performed using a protease, such as trypsin, chymotrypsin, endoproteinase Asp-N, or endoproteinase Glu-C.

5. The method according to any one of claims 1 to 4, wherein the sample comprises one or more antibody molecules and is, for example, serum or plasma.

6. The method according to any one of claims 1 to 5, wherein the method further comprises: chemically modifying the antibody CDR3 peptide at its N-terminus, wherein the N-terminal chemical modification enhances the charge of the N-terminal ion fragments after fragmentation, thereby enhancing their N-terminal ion spectrum signal.

7. The method according to claim 6, wherein a molecule that enhances ionic charge and has N-terminal modification activity of the peptide, such as ox1a, is used to chemically modify the N-terminus of the antibody CDR3 peptide.

8. The application of the method according to any one of claims 1 to 7 in mass spectrometry-based proteomics detection, wherein the protein detected in the proteomics is an antibody.

9. A mass spectrometry-based proteomics detection method, wherein the protein detected in the proteomics is an antibody, the method being characterized by comprising: in secondary spectrum comparison, using the ion spectra of the hypervariable regions of CDR3, CDR1, or CDR2, or combinations thereof, as one of the screening criteria.