Probe set, method for removing tRNA contamination in Ribo-seq intermediates, and application thereof

By using a specially designed oligonucleotide probe set and magnetic bead capture technology to remove tRNA contamination in Ribo-seq, the problem of tRNA contamination in existing technologies is solved, achieving efficient RPF signal preservation and improved data quality.

CN122279005A · Pending · Publication Date: 2026-06-26 · BEIJING NOVOGENE TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING NOVOGENE TECH CO LTD
Filing Date
2026-05-29
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing technologies cannot effectively remove tRNA contamination from Ribo-seq intermediates, leading to wasted sequencing resources, increased data noise, and exacerbated quantitative bias, thus affecting the accuracy of translational omics research.

Method used

Using a specially designed oligonucleotide probe set combined with magnetic bead capture technology, tRNA contamination is removed through hybridization and variable temperature incubation, while preserving the biological characteristics of RPFs.

Benefits of technology

It significantly reduces the proportion of tRNA-derived reads, increases the proportion of effective RPF reads by over 300%, improves the signal-to-noise ratio and the ability to detect low-abundance translation events, and simplifies bioinformatics analysis.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a probe set, a method for removing tRNA contamination from Ribo-seq intermediates, and their applications. The probe set includes one or more probes with nucleotide sequences of any one of SEQ ID NOs: 1-21. This invention addresses the problem of ineffective removal of tRNA-derived fragments from Ribo-seq libraries in existing technologies and is applicable to the field of library construction.

Description

TECHNICAL FIELD

[0001] The present application relates to the field of library construction, and in particular, to a probe set, a method for removing tRNA contamination in Ribo-seq intermediates and application thereof.

BACKGROUND

[0002] Ribosome profiling (Ribo-seq) is a revolutionary high-throughput sequencing technology that can capture and sequence ribosome-protected mRNA fragments (RPFs) to resolve the dynamic process of gene translation at near single-nucleotide resolution across the whole transcriptome. This technology has become an indispensable core tool in basic life science research, disease mechanism exploration and drug development. With the help of Ribo-seq, researchers can accurately quantify protein translation levels, identify translation start sites, reveal translation pausing phenomena, discover new open reading frames (such as uORFs and dORFs), and systematically analyze the complex network of translation regulation.

[0003] The standard Ribo-seq experimental procedure usually includes the following key steps: first, use a translation inhibitor (such as cycloheximide) to stabilize the ribosome that is in translation; then use a nuclease (such as RNase I) to digest the mRNA region that is not protected by the ribosome; then recover the ribosome-mRNA complex by sucrose gradient centrifugation or immunoprecipitation; finally, isolate RPF fragments of about 28-32 nucleotides in length from the complex, construct high-throughput sequencing libraries and perform deep sequencing.

[0004] However, despite its excellent resolution and sensitivity in theory, the practical application of Ribo-seq has long been limited by a key technical bottleneck—the serious contamination of tRNA-derived fragments.

[0005] tRNA is one of the most abundant non-coding RNAs in cells, and its highly conserved, compact secondary and tertiary structure strongly resists nuclease digestion. As a result, a large number of tRNA degradation intermediates of 22-36 nt remain after digestion. These fragments overlap heavily in size with genuine RPFs and cannot be effectively separated by conventional polyacrylamide gel electrophoresis (PAGE) size selection.

[0006] Furthermore, in certain tissues or cell types (such as tumor tissue, neurons, and liver), tRNA-derived small RNAs (tsRNAs) are also highly enriched, further exacerbating the contamination problem. These tRNA-derived fragments are efficiently enriched along with RPFs in subsequent library construction steps—such as 3' or 5' adapter ligation, reverse transcription, and PCR amplification—and are ultimately integrated into the sequencing library. In actual sequencing data, tRNA-derived reads often account for 10%-50%, severely diluting the effective RPF signal and significantly reducing the signal-to-noise ratio and statistical power of the data, particularly affecting the ability to detect the translational status of low-abundance transcripts.

[0007] Meanwhile, the bioinformatics analysis phase requires a significant additional computational investment to identify, align, and filter these non-specific reads. This not only increases the complexity and runtime of the analysis process but also introduces systematic false positives because tRNA fragments may be incorrectly aligned to repetitive sequences, pseudogenes, or homologous regions in the genome. This interferes with the accurate calculation of key biological indicators such as translation efficiency, start codon identification, and reading frame phase.

[0008] Currently, although some studies have attempted to perform post-sequencing data cleaning using bioinformatics methods, these strategies cannot undo the waste of sequencing throughput and sample resources in the early stages, nor can they improve data quality at the source. Gel purification methods relying on physical separation are ineffective because RPFs and tRNA fragments overlap in length. To date, no mature, efficient, highly specific, and scalable preprocessing solution has been widely adopted.

SUMMARY OF THE INVENTION

[0009] The main objective of this invention is to provide a probe set, a method for removing tRNA contamination from Ribo-seq intermediates, and their applications, in order to solve the problem that existing methods for removing tRNA-derived fragments from Ribo-seq libraries are ineffective.

[0010] To achieve the above objectives, according to a first aspect of the present invention, a probe set is provided, the probe set comprising one or more probes having nucleotide sequences of any one or more of SEQ ID NOs: 1 to 21.

[0011] Furthermore, the probe set described above includes probes with nucleotide sequences of SEQ ID NO: 2, SEQ ID NO: 7, SEQ ID NO: 11, and SEQ ID NO: 1, SEQ ID NOs: 3-6, SEQ ID NOs: 8-10 or SEQ ID NOs: 12-21.

[0012] Furthermore, the 5' end of the probe is attached with a chemically modified group.

[0013] Furthermore, the aforementioned chemical modification groups include biotin.

[0014] To achieve the above objective, according to a second aspect of the present invention, a method for removing tRNA-derived fragments from Ribo-seq intermediates is provided, the method comprising: mixing the Ribo-seq intermediate to be treated with the above-mentioned probe set and reacting to obtain Ribo-seq intermediates with tRNA-derived fragments removed.

[0015] Furthermore, the above reaction includes a first reaction and a second reaction; the first reaction includes mixing the Ribo-seq intermediate to be treated with the probe set and performing a first incubation through temperature change stages to obtain a first reaction product; the second reaction includes mixing the first reaction product with magnetic beads and performing a second incubation to obtain a second reaction product.

[0016] Furthermore, the aforementioned temperature change stages include the following first to seventh temperature change stages: First temperature change stage: 68℃, time 5-10 min; Second temperature change stage: 65℃, time 1-3 min; Third temperature change stage: 60℃, time 1-3 min; Fourth temperature change stage: 55℃, time 1-3 min; Fifth temperature change stage: 37℃, time 1-3 min; Sixth temperature change stage: 25℃, time 5-10 min; Seventh temperature change stage: 4℃, time held for 5-30 min.

[0017] Furthermore, the temperature for the second incubation is 25~60°C; preferably, the time for the second incubation is 5~10 minutes.

[0018] Further, after the second reaction is completed, the magnetic beads in the product of the second reaction are removed to obtain the Ribo-seq intermediate product with the tRNA-derived fragment removed.

[0019] To achieve the above objectives, according to a third aspect of the present invention, a method for constructing a Ribo-seq library is provided, the method comprising: constructing a Ribo-seq library using the above-described probe set and the above-described method for removing tRNA contamination from Ribo-seq intermediates.

[0020] The probe set of this application, including probes with nucleotide sequences of any one of SEQ ID NOs: 1-21, achieves good results when applied to remove tRNA-derived fragments from Ribo-seq libraries. Applying the technical solution of this invention can reduce the proportion of tRNA-derived reads, achieving a tRNA removal efficiency of over 95%, while completely preserving the biological characteristics of RPFs, increasing the effective read ratio by over 300%. This application can achieve tRNA contamination removal without changing the original library construction process, improving the data signal-to-noise ratio and the detection capability of low-abundance translation events, and reducing the burden of bioinformatics analysis.

DESCRIPTION OF THE DRAWINGS

[0021] The accompanying drawings, which form part of this application, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:

[0022] Figure 1 shows the read length distribution after non-coding RNA filtering for the two libraries in Example 3 of this application. Panel A shows the results for the control group; panel B shows the results for the experimental group.

[0023] Figure 2 shows the proportion of uniquely aligned reads in different regions of the genome (Coding, UTR, and Intron) for the two libraries in Example 3 of this application. Panel A shows the results for the control group; panel B shows the results for the experimental group.

[0024] Figure 3 shows the distribution of read start sites relative to the CDS start and stop codons for the two datasets in Example 3 of this application. Panel A shows the results for the control group; panel B shows the results for the experimental group.

[0025] Figure 4 shows the tRNA contamination rates after removing tRNA contamination sequences from human samples using the probe set of this application and the probe set of CN 116837071 A in Comparative Example 1 of this application.

[0026] Figure 5 shows the tRNA contamination rates after removing tRNA contamination sequences from mouse samples using the probe set of this application and the probe set of CN 116837071 A in Comparative Example 1 of this application.

DETAILED DESCRIPTION

[0027] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other. The present invention will now be described in detail with reference to the embodiments.

[0028] As mentioned in the background section, tRNA-derived fragments in Ribo-seq libraries have a high degree of overlap with RPFs in length and are extremely abundant. Conventional gel purification and bioinformatics filtering cannot effectively remove them, leading to wasted sequencing resources, increased data noise, and exacerbated quantification bias, severely restricting the accurate application of this technology. Therefore, in this application, the inventors attempted to develop a new probe set that targets and physically removes tRNA contamination, rather than relying on later computational removal, thereby achieving contamination removal at the library construction stage. Based on this, a series of protection schemes are proposed in this application.

[0029] In a first typical embodiment of this application, a probe set is provided, which includes one or more probes with nucleotide sequences of any one of SEQ ID NOs: 1 to 21.

[0030] In a preferred embodiment, the probe set includes one or more probes with nucleotide sequences of SEQ ID NO: 2, SEQ ID NO: 7, SEQ ID NO: 11, and SEQ ID NO: 1, SEQ ID NOs: 3-6, SEQ ID NOs: 8-10 or SEQ ID NOs: 12-21.

[0031] The names, sequences, lengths, and modification information of the probes in this application are shown in Table 1 below.

[0032] Table 1

[0033]

[0034] During the preparation of different Ribo-seq libraries, the composition of contaminating fragments entering the final library varies due to differences in the abundance and proportion of tRNAs within different samples. In large-sample cohort production practices, the contamination library of tRNA-related fragments is complex and diverse. Therefore, designing a probe set with high coverage and the ability to accurately remove complex and diverse tRNAs is crucial for removing contaminating fragments from Ribo-seq libraries.

[0035] This application provides an oligonucleotide probe set for specifically removing tRNA-derived contaminants from Ribo-seq libraries. Each probe in the probe set contains a nucleotide sequence selected from any one of SEQ ID NO: 1 to SEQ ID NO: 21. In the early stages of this research, the sequence of each probe was carefully designed to be reverse complementary to its target, namely the most abundant tRNA molecule or its stable degradation fragment in the Ribo-seq library of the target species (e.g., human, mouse, rat, etc.). This ensures specific recognition and binding of tRNA contaminant sequences during hybridization and prevents cross-reaction with RPFs or other non-target RNAs.

[0036] The target sequences of the probe set in this application are derived from tRNA contamination hotspot sequences screened by systematic bioinformatics analysis of Ribo-seq data from multiple species and tissue samples, resulting in high coverage of the probe set and improved relevance to clinical and research scenarios.

[0037] To achieve efficient and stable hybridization, the length of each probe in this probe set is optimized to be between 22 and 36 nucleotides. This length range balances binding specificity and thermodynamic stability: on the one hand, a sufficiently long sequence ensures the formation of a stable DNA-RNA hybrid double strand with a sufficient melting temperature (Tm) with the target tRNA fragment, thus improving binding efficiency; on the other hand, it avoids problems such as increased synthetic complexity, increased risk of non-specific hybridization, and increased cost that may be caused by excessively long probes, which is in line with the general principles of oligonucleotide probe design.

[0038] The probe set described in this application, after systematic screening and validation in the early stages of the research, can simultaneously cover the vast majority of tRNA contaminants in Ribo-seq libraries from major model organisms such as humans, mice, and rats. Due to the high conservation of tRNA sequences in mammals, this probe set can also be extended to other mammalian samples. Experimental data from specific embodiments of this application show that after processing Ribo-seq libraries with the probe set described in this application, the proportion of sequencing reads derived from tRNA is significantly reduced, with a removal efficiency exceeding 90%, thus increasing the proportion of effective RPF reads.

[0039] All probes underwent rigorous bioinformatics alignment (e.g., BLAST) to eliminate potential cross-binding with mRNA coding regions, rRNA, or genomic repetitive elements, exhibiting high sequence specificity. Practical application has demonstrated that this probe set efficiently removes tRNA contamination without significant non-specific capture or loss of genuine RPF fragments: the RPF read length distribution, trinucleotide periodicity characteristics, and genomic annotation distribution (e.g., coding region proportion) of the treated library are highly consistent with the untreated control group, proving that this invention achieves efficient contaminant removal while fully preserving the biological authenticity and analytical value of RPFs, laying a technical foundation for high-precision translational omics research.

[0040] In a preferred embodiment, the 5' end of the probe is attached with a chemically modified group.

[0041] In a preferred embodiment, the chemically modified group includes biotin.

[0042] The 5' end of the probe in this application may carry chemical modifications, preferably biotin. These chemical modifications are covalently attached to the 5' end of the probe, without interfering with the hybridization ability of the probe with the target tRNA fragment, while simultaneously endowing the probe with affinity properties that allow it to be recognized by a solid-phase carrier. It should be noted that this application does not exclude other commercially viable modification groups.

[0043] Specifically, after the probe forms a DNA-RNA hybrid double strand with the target tRNA fragment, the modification group at its 5' end can bind with high affinity and specificity to the streptavidin-coated magnetic beads, thereby achieving rapid and efficient capture of the tRNA-probe complex. Subsequently, the contaminating fragments are removed from the reaction system by magnetic separation technology, and the target RPFs in the supernatant are retained to obtain the final library.

[0044] In a second typical embodiment of this application, a method for removing tRNA-derived fragments from Ribo-seq intermediates is provided. The method includes: mixing the Ribo-seq intermediate to be treated with the above-mentioned probe set and reacting to obtain Ribo-seq intermediates with tRNA-derived fragments removed.

[0045] In a preferred embodiment, the Ribo-seq intermediate product comprises RPF fragments obtained after nuclease digestion and ribosome recovery. The RPF fragments are purified by PAGE gel. Preferably, the RPF fragments purified by PAGE gel comprise RNA. Preferably, the size of the RNA is 28-32 nt (including but not limited to 28 nt, 29 nt, 30 nt, 31 nt or 32 nt).

[0046] This application utilizes a probe set to specifically identify and capture tRNA contaminant fragments, thereby eliminating invalid reads at the source. Because tRNA fragments and RPFs (ribosome-protected fragments) have highly overlapping lengths, traditional size selection methods cannot effectively distinguish between them. However, this application employs sequence-complementary probes to selectively bind to and remove tRNA-derived molecules without interfering with the structure and abundance of RPFs. Therefore, at the same sequencing depth, this method can increase the proportion of RPF reads: the 10%-50% of sequencing throughput originally occupied by tRNA contamination is redistributed to the actual translation signal, increasing the proportion of effective data and enhancing the detection sensitivity and quantitative accuracy of low-abundance gene translation events.
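The redistribution of sequencing throughput described in [0046] can be illustrated with a small worked calculation. The following is a minimal sketch, not part of the patent: it assumes that depletion removes a fixed fraction of tRNA-derived molecules and leaves all other molecules untouched, and that read fractions mirror molecule fractions. The function names are hypothetical, and "non-tRNA reads" here includes everything that is not tRNA-derived, not only genuine RPFs.

```python
def trna_fraction_after_depletion(trna_fraction: float, removal_efficiency: float) -> float:
    """Fraction of reads that is still tRNA-derived after depletion.

    trna_fraction: fraction of tRNA-derived reads before depletion (0 <= f < 1).
    removal_efficiency: fraction of tRNA-derived molecules removed (0 <= e <= 1).
    Assumption (not from the patent): non-tRNA molecules are unaffected.
    """
    if not 0.0 <= trna_fraction < 1.0:
        raise ValueError("trna_fraction must be in [0, 1)")
    if not 0.0 <= removal_efficiency <= 1.0:
        raise ValueError("removal_efficiency must be in [0, 1]")
    remaining_trna = trna_fraction * (1.0 - removal_efficiency)
    other = 1.0 - trna_fraction
    # Renormalize, because the depleted library is sequenced to the same depth.
    return remaining_trna / (remaining_trna + other)


def non_trna_fraction_gain(trna_fraction: float, removal_efficiency: float) -> float:
    """Fold change of the non-tRNA read fraction at equal sequencing depth."""
    before = 1.0 - trna_fraction
    after = 1.0 - trna_fraction_after_depletion(trna_fraction, removal_efficiency)
    return after / before


if __name__ == "__main__":
    # Illustrative inputs only: the 10%-50% contamination range quoted in the text.
    for f in (0.10, 0.30, 0.50):
        left = trna_fraction_after_depletion(f, 0.90)
        gain = non_trna_fraction_gain(f, 0.90)
        print(f"tRNA before={f:.2f}  after={left:.3f}  non-tRNA gain={gain:.2f}x")
```

Under these assumptions the gain depends strongly on the starting contamination level, which is why heavily contaminated libraries benefit most from depletion.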

[0047] Meanwhile, since tRNA contamination is physically removed before sequencing, downstream bioinformatics analysis does not require additional resources for complex filtering, reducing the computational burden and risk of misjudgment in data processing, and improving the efficiency, signal-to-noise ratio, and reproducibility of Ribo-seq experiments. Furthermore, the method described in this application is simple to operate and highly compatible, and can be integrated into existing standard Ribo-seq library preparation workflows, providing technical support for achieving high-precision translatomics analysis.

[0048] In a preferred embodiment, the reaction includes a first reaction and a second reaction; the first reaction includes mixing the Ribo-seq intermediate to be processed with the probe set and performing a first incubation through temperature change stages to obtain a first reaction product; the second reaction includes mixing the first reaction product with magnetic beads and performing a second incubation to obtain a second reaction product.

[0049] In a preferred embodiment, the temperature change stage includes the following first to seventh temperature change stages: First temperature change stage: 68℃, time 5-10 min (including but not limited to 5 min, 6 min, 7 min, 8 min, 9 min or 10 min); Second temperature change stage: 65℃, time 1-3 min (including but not limited to 1 min, 2 min or 3 min); Third temperature change stage: 60℃, time 1-3 min (including but not limited to 1 min, 2 min or 3 min); Fourth temperature change stage: 55℃, time 1-3 min (including but not limited to 1 min, 2 min or 3 min); Fifth temperature change stage: 37℃, time 1-3 min (including but not limited to 1 min, 2 min or 3 min); Sixth temperature change stage: 25℃, time 5-10 min (including but not limited to 5 min, 6 min, 7 min, 8 min, 9 min or 10 min); Seventh temperature change stage: 4℃, time held for 5-10 min (including but not limited to 5 min, 6 min, 7 min, 8 min, 9 min or 10 min). Preferably, the Ribo-seq library to be processed and the probe set are mixed in the following ratio: 0.5-5 μL (including but not limited to 0.5 μL, 1 μL, 1.5 μL, 2 μL, 2.5 μL, 3 μL, 3.5 μL, 4 μL, 4.5 μL, or 5 μL) of the probe set is added per 1 μg of the nuclease digestion intermediate product in the Ribo-seq library; more preferably, the concentration of each probe in the probe set is 0.05-1 pmole / μL (including but not limited to 0.05 pmole / μL, 0.1 pmole / μL, 0.15 pmole / μL, 0.2 pmole / μL, 0.25 pmole / μL, 0.3 pmole / μL, 0.35 pmole / μL, 0.4 pmole / μL, 0.45 pmole / μL, 0.5 pmole / μL, 0.55 pmole / μL, 0.6 pmole / μL, 0.65 pmole / μL, 0.7 pmole / μL, 0.75 pmole / μL, 0.8 pmole / μL, 0.85 pmole / μL, 0.9 pmole / μL, 0.95 pmole / μL or 1 pmole / μL).

[0050] In one specific embodiment of this application, the first reaction includes mixing the Ribo-seq intermediate product to be processed (a nucleic acid mixture containing RPFs and tRNA contamination fragments) with an excess of the probe set composed of the above-mentioned probes, and reacting in a hybridization buffer (any hybridization buffer known to those skilled in the art).

[0051] The first reaction described above is a hybridization reaction. The multi-stage temperature-controlled hybridization program in this application has been systematically optimized. The high-temperature stage rapidly disrupts RNA secondary structure and fully exposes the target sequences. The subsequent stepwise cooling accommodates probes with different Tm values in the probe set, achieving synchronous, efficient, and specific binding to all target sites while suppressing non-specific hybridization between the probes and non-target RNAs (such as RPFs and rRNA). This multi-stage program accounts for both the high structural stability of tRNA fragments and the complex hybridization kinetics, improving probe capture efficiency and avoiding the uneven binding or increased background that single-temperature hybridization may cause.
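The seven-stage program of [0049] can be written down as data, which makes it easy to check that the temperatures step down monotonically and to bound the total hold time. This is a minimal sketch under stated assumptions: the `Stage` class and function names are hypothetical, the seventh stage uses the 5-10 min range from [0049], and ramp times between stages are ignored because the text does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    temp_c: int        # hold temperature in degrees Celsius
    min_minutes: int   # shortest hold time allowed by [0049]
    max_minutes: int   # longest hold time allowed by [0049]


# Stage values transcribed from paragraph [0049].
PROGRAM: List[Stage] = [
    Stage(68, 5, 10),
    Stage(65, 1, 3),
    Stage(60, 1, 3),
    Stage(55, 1, 3),
    Stage(37, 1, 3),
    Stage(25, 5, 10),
    Stage(4, 5, 10),
]


def total_minutes(program: List[Stage]) -> Tuple[int, int]:
    """Return (shortest, longest) total hold time, excluding ramp times."""
    shortest = sum(stage.min_minutes for stage in program)
    longest = sum(stage.max_minutes for stage in program)
    return shortest, longest


def is_step_down(program: List[Stage]) -> bool:
    """True if every stage is cooler than the one before it."""
    return all(a.temp_c > b.temp_c for a, b in zip(program, program[1:]))


if __name__ == "__main__":
    low, high = total_minutes(PROGRAM)
    print(f"step-down program: {is_step_down(PROGRAM)}, hold time {low}-{high} min")
```

The approximately 25 minute run time quoted for the hybridization step in Example 2 falls inside the range this sketch computes.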

[0052] In a preferred embodiment, the temperature of the second incubation is 25~50℃ (including but not limited to 25℃, 26℃, 27℃, 28℃, 29℃, 30℃, 31℃, 32℃, 33℃, 34℃, 35℃, 36℃, 37℃, 38℃, 39℃, 40℃, 41℃, 42℃, 43℃, 44℃, 45℃, 46℃, 47℃, 48℃, 49℃ or 50℃); preferably, the time of the second incubation is 5~10 minutes (including but not limited to 5 minutes, 6 minutes, 7 minutes, 8 minutes, 9 minutes or 10 minutes).

[0053] The second reaction described above is a capture reaction. After the first reaction is completed, magnetic beads are added to the product of the first reaction, and a second incubation is performed within the aforementioned temperature and time range. This allows the streptavidin on the surface of the magnetic beads to bind to the biotin tag at the 5' end of the probe, thereby capturing the tRNA-probe complex onto the surface of the magnetic beads. This second reaction uses mild temperatures and a short incubation, so it does not damage the RPF structure, preserves the key biological characteristics of the RPFs, and does not reduce library quality.

[0054] In a preferred embodiment, after the second reaction is completed, the magnetic beads in the second reaction product are removed to obtain a Ribo-seq intermediate product free of tRNA contamination.

[0055] After the second reaction (capture reaction) is completed, the magnetic beads carrying the adsorbed tRNA-probe complex can be removed by simple magnetic separation. The supernatant contains RPFs from which tRNA contamination has been removed and can be used for subsequent library construction (such as adapter ligation, reverse transcription, PCR amplification).

[0056] Following separation, the supernatant is recovered and purified to remove salts and other impurities from the hybridization buffer, preparing it for downstream library construction steps (such as ligation and amplification). Purification methods can include commercially available RNA purification kits (such as the Zymo® RNAClean & Concentrator Kit) or any recovery and purification method known to those skilled in the art; this application makes no limitation on these methods.

[0057] Furthermore, this invention encompasses various technical variations to adapt to different experimental needs. For example, in addition to the physical separation scheme based on biotin-streptavidin magnetic bead capture described above, an RNase H enzyme degradation strategy can also be used as an alternative technical path: after the first reaction, instead of introducing magnetic beads, RNase H enzyme is directly added to the hybridization system. This enzyme specifically recognizes and catalyzes the degradation of the RNA strand (i.e., tRNA contamination fragment) in the DNA-RNA hybrid double strand, while being inactive against unhybridized single-stranded RPFs (RNA) and DNA probes. DNase I can then be used to degrade the DNA probe, and the intact RPFs can be recovered using an RNA purification kit. This approach eliminates the need for magnetic bead manipulation, is suitable for high-throughput automated platforms, achieves efficient tRNA removal, and avoids background interference that may arise from affinity tagging systems.

[0058] This application achieves sequence-specific recognition of tRNA by the probe through the first reaction and achieves efficient physical capture of the tRNA-probe complex through the second reaction, together forming a "recognition-capture-separation" tRNA contamination removal system. This system breaks through the technical bottleneck of the long-standing difficulty in removing tRNA contamination in Ribo-seq. The method steps of this application are clear and simple, compatible with mainstream library construction platforms, and have practical value and industrial transformation potential.

[0059] In a third typical embodiment of this application, a method for constructing a Ribo-seq library is provided, which includes constructing the Ribo-seq library using the above-described probe set and the above-described method for removing tRNA contamination from Ribo-seq intermediates.

[0060] The method described in this application has a clear flow and simple operation steps. Experiments have shown that the method can remove tRNA contamination within 2 hours. Furthermore, based on initial experiments, the method can be directly applied to existing Ribo-seq library preparation procedures without significant modifications, demonstrating good practicality and compatibility.

[0061] The beneficial effects of this application will be explained in more detail below with reference to specific embodiments.

[0062] Example 1: Design and preparation of tRNA removal probe kits for mouse liver Ribo-seq libraries

[0063] 1.1 Probe set design principles:

[0064] The goal of this embodiment is to design a probe set that can efficiently remove tRNA contamination from Ribo-seq libraries derived from mouse livers.

[0065] Target selection: First, Ribo-seq data and tRNA-derived fragments from the mouse liver samples used in internal testing were analyzed to screen for the most abundant tRNA fragments, which constitute the main contaminants in this sample type. These tRNA fragments were derived from tRNA-Gly-GCC, tRNA-Glu-TTC, tRNA-Glu-CTC, tRNA-His-GTG, tRNA-Lys-CTT, tRNA-Lys-TTT, tRNA-Lys-TGT, and tRNA-Val-CAC. The probe set in this application is a targeted design driven by data from 50 internal library samples. Specifically, bioinformatics analysis was used to identify and summarize the actual tRNA-related contaminant sequences in the Ribo-seq libraries, and probes were then designed against these contaminant sequences. This yields a highly specific and cost-effective probe set.

[0066] Sequence design: Design reverse complementary oligonucleotide probes for each target tRNA.

[0067] The target region of the probe has a complementary pairing of 22-36 bases with the tRNA contamination fragment. Using bioinformatics tools such as BLAST, the candidate probe sequences are compared with the mouse reference genome and the RefSeq mRNA database to eliminate any sequences that have potential cross-reactivity with the mRNA coding region or important non-coding RNA.
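The reverse-complement design step in [0066]-[0067] can be sketched as follows. This is an illustration only: the function name is hypothetical, the input sequence in the usage example is an arbitrary placeholder rather than a real tRNA fragment or any of SEQ ID NOs: 1-21, and the BLAST cross-reactivity screen described in the text is not reproduced here.

```python
# Complement of each RNA (or DNA) base, written as the DNA base of the probe.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}

MIN_LEN, MAX_LEN = 22, 36  # probe length range stated in the text


def design_probe(target_rna: str) -> str:
    """Return the DNA probe that is reverse complementary to target_rna."""
    seq = target_rna.strip().upper()
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        raise ValueError(f"target must be {MIN_LEN}-{MAX_LEN} nt, got {len(seq)}")
    unknown = set(seq) - set(_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters in target: {sorted(unknown)}")
    # Read the target 3'->5' and complement each base, giving the probe 5'->3'.
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    placeholder_target = "ACGUACGUACGUACGUACGUAC"  # arbitrary 22-nt example
    print(design_probe(placeholder_target))
```

In practice each candidate produced this way would still need the cross-reactivity screening and the 5' biotin modification described elsewhere in this example.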

[0068] Thermodynamic parameter optimization: The melting temperature (Tm) of each probe binding to the target tRNA fragment was calculated using tools such as OligoAnalyzer to ensure that the Tm value of all probes is within a narrow range (e.g., 60-80°C) so as to achieve similar binding efficiency at a uniform hybridization temperature.
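The Tm window check in [0068] can be sketched with a simple GC-content approximation. This is a rough stand-in, not the calculation used in the patent: the text names OligoAnalyzer, whose nearest-neighbor model and salt corrections will give different values, and the formula below is a generic estimate for DNA oligos longer than about 13 nt rather than for DNA-RNA hybrids. The function names are hypothetical; the 60-80 °C window is the example range given in the text.

```python
from typing import Dict, List


def basic_tm(probe: str) -> float:
    """Crude Tm estimate in degrees Celsius from GC content and length.

    Uses the common approximation Tm = 64.9 + 41 * (G + C - 16.4) / N,
    which ignores salt, probe concentration and nearest-neighbor effects.
    """
    seq = probe.upper()
    if not seq:
        raise ValueError("probe sequence is empty")
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def within_window(probe: str, low: float = 60.0, high: float = 80.0) -> bool:
    """True if the estimated Tm lies inside the target window."""
    return low <= basic_tm(probe) <= high


def tm_outliers(probes: Dict[str, str], low: float = 60.0, high: float = 80.0) -> List[str]:
    """Names of probes whose estimated Tm falls outside the window."""
    return [name for name, seq in probes.items() if not within_window(seq, low, high)]


if __name__ == "__main__":
    # Placeholder sequences for illustration, not probes from Table 1.
    candidates = {"gc_rich": "GCGCGGCCGCGGCGCCGGCGCG", "at_rich": "ATATTAATATTTAAATATATTA"}
    print(tm_outliers(candidates))
```

A probe flagged by such a check would be a candidate for lengthening, shortening, or shifting along the target before re-evaluation with a proper thermodynamic tool.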

[0069] 1.2 Chemical synthesis and modification of probes:

[0070] Based on the designed sequence, oligonucleotide synthesis was commissioned to a commercial company. All probes were DNA probes, ranging in length from 22 to 36 nucleotides. A biotin molecule was covalently linked to the 5' end of each probe.

[0071] The synthesized probes were purified by high-performance liquid chromatography (HPLC) and their quality was verified by mass spectrometry (MS). Finally, all synthesized probes were mixed in equimolar amounts and dissolved in TE buffer to prepare a probe stock solution with a concentration of 1 pmole / μL for each probe.

[0072] The probe group information is shown in Table 1 above.

[0073] Example 2: Experimental procedure for removing tRNA contamination from mouse liver Ribo-seq libraries using probe sets

[0074] This embodiment describes the entire process of applying the probe set prepared in Example 1 to the actual Ribo-seq library sample processing.

[0075] 2.1 Experimental Materials:

[0076] Ribo-seq samples: RPF fragments extracted from mouse livers, which have undergone nuclease digestion and ribosome recovery. These fragments have been purified by PAGE gel extraction, recovering RNA ranging in size from 28 to 32 nt.

[0077] tRNA removal probe set: the probe set stock solution prepared in Example 1, containing each probe at 1 pmol/μL.

[0078] Main reagents: Animal RNA enrichment probe + magnetic beads (Tiangen NR201-T1, this module is used to remove rRNA from animal Ribo-seq samples, containing streptavidin-conjugated magnetic beads and hybridization buffer, etc., and can be directly used with the tRNA removal probe set of this invention).

[0079] 2.2 Experimental Procedure:

[0080] Magnetic bead preparation: Take 350 μL of Animal RNA Enrichment beads suspension and wash twice with STB Resuspension Solution according to the kit instructions. Resuspend the beads in 350 μL of STB Resuspension Solution, then transfer 110 μL of the suspension to a separate tube and label it magnetic bead 2. Place the remaining 240 μL of bead suspension on a magnetic rack and remove the supernatant, then resuspend the beads in 100 μL of STB Resuspension Solution, add 1 μL of RNase Inhibitor, mix well, and set aside as magnetic bead 1.
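The share of beads that ends up in each aliquot follows directly from the volumes above. The short sketch below computes it, assuming the beads are uniformly suspended when the 110 μL aliquot is taken (an assumption, not something the text states); the function name is hypothetical.

```python
# Bead split arithmetic for the preparation step. Uniform suspension at the
# moment of aliquoting is assumed for illustration.
TOTAL_UL = 350.0   # washed bead suspension
BEAD2_UL = 110.0   # aliquot set aside as magnetic bead 2


def split_fractions(total_ul: float, aliquot_ul: float):
    """Return (aliquot fraction, remaining fraction) of the bead suspension."""
    if not 0 < aliquot_ul < total_ul:
        raise ValueError("aliquot must be positive and smaller than the total")
    remaining_ul = total_ul - aliquot_ul
    return aliquot_ul / total_ul, remaining_ul / total_ul


bead2_fraction, bead1_fraction = split_fractions(TOTAL_UL, BEAD2_UL)

if __name__ == "__main__":
    print(f"magnetic bead 1: {bead1_fraction:.1%} of beads")
    print(f"magnetic bead 2: {bead2_fraction:.1%} of beads")
```

Under that assumption, roughly two thirds of the beads are used for the first capture and one third for the second clean-up capture.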

[0081] Construction of hybridization reaction system:

[0082] Dilute the tRNA probe set stock solution prepared in Example 1 by 20× with nuclease-free water (NF-water) to obtain the 1× working solution (prepare an appropriate volume according to the number of samples). Then, in a 200 μL RNase-free PCR tube on ice, prepare the probe reaction solution according to Table 2 and mix by pipetting 10 times.

[0083] Table 2

[0084]
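The 20× dilution of the probe stock described above can be planned with a few lines of arithmetic. In this sketch the final working-solution volume is a hypothetical input chosen by the user, and the function name is illustrative; only the 1 pmol/μL stock concentration and the 20× factor come from the text.

```python
# Dilution planning for the 1x probe working solution.
def dilution_plan(final_volume_ul: float,
                  stock_conc_pmol_per_ul: float = 1.0,
                  fold: float = 20.0):
    """Return (stock volume, water volume, per-probe working concentration)."""
    if final_volume_ul <= 0 or fold < 1:
        raise ValueError("final volume must be positive and fold at least 1")
    stock_ul = final_volume_ul / fold
    water_ul = final_volume_ul - stock_ul
    working_conc = stock_conc_pmol_per_ul / fold
    return stock_ul, water_ul, working_conc


if __name__ == "__main__":
    stock_ul, water_ul, conc = dilution_plan(100.0)  # hypothetical 100 uL batch
    print(f"stock: {stock_ul:.1f} uL, NF-water: {water_ul:.1f} uL, "
          f"each probe at {conc:.2f} pmol/uL")
```

With the stock at 1 pmol/μL per probe, the 1× working solution therefore contains each probe at 0.05 pmol/μL.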

[0085] 1. Hybridization and incubation:

[0086] After brief centrifugation, place the sample in a PCR instrument (with the heated lid on; a lid temperature of 99-105 °C is acceptable) and run the program below. The total time is approximately 25 minutes. The program is shown in Table 3.

[0087] Table 3

[0088]
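As a rough cross-check on the approximately 25 minute total, the sketch below sums the stage durations recited in claim 7. Whether the final 4 °C hold counts toward the stated total is not specified, so both cases are computed; the data structure and function name are illustrative.

```python
# Temperature stages as recited in claim 7:
# (temperature in degrees C, minimum minutes, maximum minutes).
STAGES = [
    (68, 5, 10),
    (65, 1, 3),
    (60, 1, 3),
    (55, 1, 3),
    (37, 1, 3),
    (25, 5, 10),
    (4, 5, 30),   # final hold
]


def total_time_range(stages):
    """Return (shortest, longest) total program time in minutes."""
    shortest = sum(low for _, low, _ in stages)
    longest = sum(high for _, _, high in stages)
    return shortest, longest


if __name__ == "__main__":
    print("with 4 C hold:   ", total_time_range(STAGES))
    print("without 4 C hold:", total_time_range(STAGES[:-1]))
```

A total of about 25 minutes falls inside both computed ranges, so the stated duration is consistent with the claimed stage times.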

[0089] 2. Capture and removal of tRNA-probe complexes:

[0090] Transfer the 40 μL RNA-probe reaction solution from the previous step to 100 μL of washed Animal RNA Enrichment beads (magnetic bead 1, the aliquot containing RNase Inhibitor), and mix by pipetting 10 times.

[0091] Incubate at room temperature (25 °C) for 5 min, then incubate in a metal bath at 50 °C for 10 min. After the first 5 min at 50 °C, gently mix once with a pipette to prevent the magnetic beads from settling.

[0092] After incubation, place the 110 μL of washed Animal RNA Enrichment beads prepared earlier (magnetic bead 2) on a magnetic rack for 1 min. Once the solution becomes clear, carefully remove the supernatant with a pipette. If the number of samples is small, cap the tube after removing the supernatant and proceed quickly to the next step so that the magnetic beads do not dry out. If the number of samples is large, use two magnetic racks and process the samples in batches.

[0093] Immediately place the RNA-bead binding system from step 2 on the magnetic rack, open the tube, and allow the liquid to clarify for 1 min. Transfer 140 μL of the RNA-containing supernatant to magnetic bead 2 (from step 3, with the supernatant already removed), and mix thoroughly by pipetting 10 times.

[0094] Incubate at 25 °C for 10 min. After 5 min, gently mix once with a pipette to prevent the magnetic beads from settling. Immediately place the mixture on a magnetic rack, open the tube, and allow it to clarify for 1 min. To avoid carrying over magnetic beads and bound rRNA, transfer the supernatant to a new 1.5 mL EP tube, place it back on the magnetic rack to capture any remaining beads, and allow it to clarify for 1 min. Without aspirating any magnetic beads, transfer approximately 135 μL of the RNA-containing supernatant to a new 2 mL EP tube, place it on ice, and proceed immediately to the purification and recovery step.

[0095] 3. Purification and recovery of RPFs:

[0096] The RNA was purified and recovered from the supernatant containing RPFs according to the Zymo RNA Clean & Concentrator instructions.

[0097] The resulting purified RPF sample can be directly used for subsequent library construction steps such as adapter ligation and reverse transcription.

[0098] Example 3: Verification of the tRNA removal effect

[0099] To verify the effectiveness of the present invention, this embodiment set up an experimental group (treated using the method of the present invention) and a control group (treated by the same procedure, with an equal volume of water in place of the probe set). The same mouse liver RPF sample was used for both groups, after which Ribo-seq libraries were constructed and subjected to high-throughput sequencing.

[0100] 3.1 tRNA removal efficiency assessment:

[0101] Using a bioinformatics workflow, clean reads obtained from sequencing were aligned to both the tRNA database and the reference genome. The proportion of reads aligned to the tRNA database was calculated out of the total number of reads. The experimental results comparing tRNA contamination rates are shown in Table 4.

[0102] Table 4

[0103]

[0104] Analysis: Compared with the control group, the proportion of tRNA-derived reads in the experimental library treated with the probe set of the present invention dropped sharply from 40.58% to 1.86%, corresponding to a tRNA fragment removal efficiency of 95.42%, which demonstrates the highly efficient tRNA removal capability of the present invention.
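The removal efficiency quoted above is consistent with a simple relative-reduction calculation on the two contamination rates. The sketch below assumes that definition, (control − treated) / control, which reproduces the reported figure; the function name is illustrative.

```python
# Relative reduction in the tRNA read fraction between control and treated
# libraries. The definition used here is an assumption that happens to
# reproduce the reported 95.42%.
def removal_efficiency(control_pct: float, treated_pct: float) -> float:
    """Return the percentage of tRNA contamination removed."""
    if control_pct <= 0:
        raise ValueError("control contamination rate must be positive")
    return (control_pct - treated_pct) / control_pct * 100.0


if __name__ == "__main__":
    print(f"{removal_efficiency(40.58, 1.86):.2f}%")  # prints 95.42%
```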

[0105] 3.2 Assessment of the impact on the integrity of ribosome-protected fragments (RPFs):

[0106] The effect of removing tRNA contamination on the genuine RPF signal was analyzed.

[0107] Read length distribution: The read length distributions of the two libraries after non-coding RNA filtering were compared. The results are shown in Figure 1, where Figure 1A shows the control group and Figure 1B shows the experimental group.

[0108] Analysis: The RPF read length distribution patterns of the experimental group and the control group were almost identical, with the main peak clearly located between 28-32 nt, indicating that the processing of the present invention did not cause any perceptible change to the length characteristics of the RPF.
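A read length comparison of this kind can be sketched as follows. The read lengths below are made-up toy values and the function names are hypothetical; the sketch only illustrates how a length distribution and the share of reads in the 28-32 nt window might be computed for each library.

```python
from collections import Counter


def length_distribution(read_lengths):
    """Return {length: fraction of reads}, sorted by length."""
    counts = Counter(read_lengths)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}


def fraction_in_window(read_lengths, low: int = 28, high: int = 32) -> float:
    """Return the fraction of reads whose length lies in [low, high] nt."""
    in_window = sum(1 for n in read_lengths if low <= n <= high)
    return in_window / len(read_lengths)


if __name__ == "__main__":
    toy_lengths = [26, 28, 29, 29, 30, 30, 30, 31, 32, 34]  # illustrative only
    print(length_distribution(toy_lengths))
    print(f"28-32 nt share: {fraction_in_window(toy_lengths):.0%}")
```

Running the same functions on the control and experimental libraries would allow the two distributions to be compared side by side.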

[0109] Genomic distribution of uniquely aligned reads: Ideally, the vast majority of uniquely aligned reads in Ribo-seq should be RPFs, and the vast majority of RPFs should map to coding regions. The proportions of uniquely aligned reads in different genomic regions (coding, UTR, and intron) were compared between the two libraries. The results are shown in Figure 2, where Figure 2A shows the control group and Figure 2B shows the experimental group. In Figure 2, "regions" denotes the genomic region.

[0110] Analysis: The genomic distribution patterns of the experimental group and the control group were almost identical, with approximately 97% of uniquely aligned reads falling in coding regions, indicating that the processing of this invention did not cause any perceptible change to the distribution pattern of RPFs on the genome.

[0111] Trinucleotide periodicity (P-site analysis): A key quality control indicator for Ribo-seq data is whether the distribution of reads within the coding region exhibits clear trinucleotide periodicity. The distribution of read start sites relative to the CDS start and stop codons was analyzed in the two datasets. The results are shown in Figure 3, where Figure 3A shows the control group and Figure 3B shows the experimental group.

[0112] Analysis: Both sets of data exhibit the typical trinucleotide periodicity characteristics of high-quality Ribo-seq data, demonstrating that the process of removing tRNA contamination did not interfere with the phase information of the RPF fragments and preserved the integrity of the translation frames.
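A simple way to quantify trinucleotide periodicity is to count how read start sites distribute across the three reading frames relative to the CDS start. The sketch below uses raw 5' read start positions without a P-site offset correction, and the coordinates are made-up toy values; it is an illustration, not the analysis workflow used in this example.

```python
from collections import Counter


def frame_fractions(read_starts, cds_start: int):
    """Return the fraction of read starts in frames 0, 1 and 2.

    Frames are taken relative to the first nucleotide of the start codon.
    Python's modulo keeps the frame non-negative for upstream positions.
    """
    frames = Counter((pos - cds_start) % 3 for pos in read_starts)
    total = sum(frames.values())
    return [frames[f] / total for f in range(3)]


if __name__ == "__main__":
    toy_starts = [88, 91, 94, 97, 100, 103, 89, 106, 90, 109]  # illustrative
    print(frame_fractions(toy_starts, cds_start=100))
```

A strong excess of reads in one frame in both libraries would be consistent with preserved periodicity, whereas a flat distribution across the three frames would indicate that phase information had been lost.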

[0113] 3.3 Improvement of overall sequencing data quality:

[0114] The final sequencing data alignment of the two libraries was compared. The experimental results of the sequencing data alignment quality comparison are shown in Table 5.

[0115] Table 5

[0116]

[0117] Analysis: The most significant change is that, after removal of tRNA contamination, the percentage of effective reads remaining after non-coding RNA filtering rose from 16.34% to 65.80%, a relative increase of over 300% (about 4-fold). This means that, for the same sequencing cost, researchers obtain far more effective data than before, significantly improving the efficiency and cost-effectiveness of Ribo-seq experiments.
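The figure of "over 300%" can be checked with the two reported percentages. The sketch below computes both the relative increase and the fold change; the function names are illustrative.

```python
# Relative increase and fold change of the effective read percentage.
def relative_increase(before_pct: float, after_pct: float) -> float:
    """Return the relative increase as a percentage of the starting value."""
    return (after_pct - before_pct) / before_pct * 100.0


def fold_change(before_pct: float, after_pct: float) -> float:
    """Return the ratio of the final value to the starting value."""
    return after_pct / before_pct


if __name__ == "__main__":
    print(f"relative increase: {relative_increase(16.34, 65.80):.1f}%")
    print(f"fold change: {fold_change(16.34, 65.80):.2f}x")
```

The two numbers describe the same change: an increase of roughly 303% relative to the control corresponds to about 4.03 times as many effective reads.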

[0118] Comparative Example 1: Performance comparison of the probe sets in this application and those in CN 116837071 A

[0119] To verify the superiority of this invention, this comparative example used the probe set of this application and the probe set of CN 116837071 A, respectively, to remove tRNA contamination sequences from the same samples, after which Ribo-seq libraries were constructed and subjected to high-throughput sequencing. The tRNA contamination rates obtained with the two probe sets in mouse and human samples are shown in Table 7. The tRNA contamination rates in human samples after removal with each probe set are shown in Figure 4, and those in mouse samples are shown in Figure 5. The results show that the probes of CN 116837071 A have limited applicability in production practice and cannot completely eliminate fragment contamination in some samples, whereas the probe set of this application can fully identify and remove possible tRNA contamination fragments.

[0120] In this comparative example, the experimental group using the probes of this application followed the steps described in Example 2. For the control group using the CN 116837071 A probe set, the library construction steps were identical to those of the experimental group except for the rRNA and tRNA removal steps, which were performed according to the procedure disclosed in CN 116837071 A: before the reverse transcription reaction, 1 μL of QIAseq FastSelect-rRNA HMR and 1 μL of the CN 116837071 A probe set were added, and the program shown in Table 6 was run on the PCR instrument so that the probes hybridized to the target fragments and blocked their reverse transcription.

[0121] Table 6

[0122]

[0123] Table 7

[0124]

[0125] In Table 7, human samples 1 and 2 are derived from K562 cells, and human samples 3 and 4 are derived from HeLa cells. The human cell line samples (human samples 1 to 4) came from different culture batches; for example, although human sample 1 and human sample 2 are of the same cell type, their culture batches were completely different. After washing with PBS, the cell pellets were collected by centrifugation, flash-frozen in liquid nitrogen, and stored at -80 °C until the experiment. Mouse sample 1 is derived from mouse heart; mouse samples 2-8 from mouse liver; mouse samples 9-10 from mouse spleen; mouse samples 11-12 from mouse lung; and mouse samples 13-15 from mouse kidney. The mouse samples (mouse sample 1 to mouse sample 15) came from different mice; the tissues were washed with PBS after sampling, flash-frozen in liquid nitrogen, and stored at -80 °C until the experiment.

[0126] As can be seen from the above description, the embodiments of the present invention achieve the following technical effects. This application provides a probe set for removing tRNA-derived fragments from Ribo-seq libraries and a method for applying it. Using specifically designed oligonucleotide probes bearing affinity tags, combined with magnetic bead capture, tRNA contamination can be specifically removed with an efficiency of over 90% while the quantity, length distribution, and biological information of the genuine RPF fragments are retained. The method improves the effectiveness and quality of Ribo-seq data and overcomes an existing technical bottleneck. It is also simple to operate, compatible with existing workflows, and effective, giving it value for scientific research and potential for commercial development.

[0127] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A probe set, characterized in that the probe set includes one or more probes having the nucleotide sequence of any one of SEQ ID NOs: 1 to 21.

2. The probe set according to claim 1, characterized in that the probe set includes one or more probes with nucleotide sequences of SEQ ID NO: 2, SEQ ID NO: 7, SEQ ID NO: 11, and SEQ ID NO: 1, SEQ ID NOs: 3-6, SEQ ID NOs: 8-10 or SEQ ID NOs: 12-21.

3. The probe set according to claim 1 or 2, characterized in that the probe has a chemical modification group attached to its 5' end.

4. The probe set according to claim 3, characterized in that the chemical modification group includes biotin.

5. A method for removing tRNA contamination from Ribo-seq intermediates, characterized in that the method includes: mixing the Ribo-seq intermediate to be processed with the probe set according to any one of claims 1 to 4 and reacting them to obtain a Ribo-seq intermediate free of tRNA contamination.

6. The method according to claim 5, characterized in that the reaction includes a first reaction and a second reaction; the first reaction includes mixing the Ribo-seq intermediate to be processed with the probe set and performing a first incubation through variable-temperature stages to obtain a first reaction product; the second reaction includes mixing the first reaction product with magnetic beads and performing a second incubation to obtain a second reaction product.

7. The method according to claim 6, characterized in that the variable-temperature stages include the following first to seventh stages: first stage: 68 °C for 5-10 min; second stage: 65 °C for 1-3 min; third stage: 60 °C for 1-3 min; fourth stage: 55 °C for 1-3 min; fifth stage: 37 °C for 1-3 min; sixth stage: 25 °C for 5-10 min; seventh stage: 4 °C, held for 5-30 min.

8. The method according to claim 6, characterized in that the second incubation is performed at a temperature of 25-60 °C for 5-10 min.

9. The method according to claim 6, characterized in that, after the second incubation is completed, the magnetic beads in the second reaction product are removed to obtain the Ribo-seq intermediate free of tRNA contamination.

10. A method for constructing a Ribo-seq library, characterized in that the method includes: constructing a Ribo-seq library using the probe set according to any one of claims 1 to 4 and the method for removing tRNA contamination from Ribo-seq intermediates according to any one of claims 5 to 9.