A SPARK-seq high-throughput platform for identification and kinetic analysis of aptamers and their target proteins

By combining the SPARK-seq platform with CRISPR and single-cell multi-omics sequencing technologies, the low-throughput problem of nucleic acid aptamer target identification has been solved, enabling high-throughput screening of low-abundance proteins, expanding the scope of biomarker discovery, and improving the diagnostic and therapeutic applications of nucleic acid aptamers.

CN122283136A · Pending · Publication Date: 2026-06-26 · HANGZHOU INSTITUTE OF MEDICAL SCIENCES CHINESE ACADEMY OF SCIENCES

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HANGZHOU INSTITUTE OF MEDICAL SCIENCES CHINESE ACADEMY OF SCIENCES
Filing Date
2025-12-15
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing technologies for identifying nucleic acid aptamer targets suffer from low throughput and low efficiency. It is difficult to screen a large number of nucleic acid aptamers and targets at the same time, and existing methods favor high-abundance proteins while overlooking low-abundance proteins, making high-throughput biomarker discovery impossible.

Method used

By combining single-cell perturbation sequencing (Perturb-seq) and high-throughput nucleic acid aptamer sequencing, and using CRISPR gene perturbation and single-cell multi-omics sequencing technologies, a SPARK-seq platform was constructed. The interaction between nucleic acid aptamers and target proteins was screened based on protein differences between two cell populations to achieve high-throughput identification.

Benefits of technology

It has enabled systematic, high-throughput screening of thousands of nucleic acid aptamer-protein interactions, discovered low-abundance cell surface proteins, expanded the range of biomarkers, and developed a seamlessly integrated nucleic acid aptamer screening platform that improves the performance of nucleic acid aptamers in diagnosis and treatment.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122283136A_ABST

Abstract

This invention provides a SPARK-seq high-throughput platform for the identification and kinetic analysis of nucleic acid aptamers and their target proteins. This platform innovatively combines cell screening, CRISPR gene perturbation, and single-cell multi-omics sequencing technologies, enabling the systematic identification of thousands of nucleic acid aptamers and their target protein interactions in a single experiment. Simultaneously, it achieves, for the first time, high-throughput screening of highly stable nucleic acid aptamers with "slow dissociation" characteristics based on dissociation kinetics. The method provided by this invention overcomes the limitations of traditional techniques, such as low throughput, difficulty in identifying relatively low-abundance targets, and inability to perform efficient screening in natural cellular environments. It not only achieves large-scale, unbiased discovery of nucleic acid aptamer targets but also accurately screens for highly stable aptamers with slow dissociation rates, providing a more powerful molecular tool and a novel biomarker discovery pathway for tumor diagnosis, targeted therapy, and precision medicine.

Description

[0001] This application claims priority to the earlier Chinese application No. 2024119392732, filed on December 26, 2024, the entire contents of which are incorporated into this application.

Technical Field

[0002] This application relates to the field of biotechnology, specifically to a SPARK-seq high-throughput platform for the identification and kinetic analysis of nucleic acid aptamers and their target proteins.

Background Technology

[0003] Affinity reagents are crucial for recognizing and binding specific molecular targets, playing a fundamental role in molecular biology and therapeutic discovery. Nucleic acid aptamers, identified through Systematic Evolution of Ligands by Exponential Enrichment (SELEX), are short single-stranded DNA or RNA molecules with high target specificity. To date, thousands of nucleic acid aptamers have been generated using SELEX, capable of binding to targets such as small molecules, metal ions, proteins, peptides, bacteria, viruses, and living cells. Unlike antibodies, which need to be generated in vivo, nucleic acid aptamers are selected in vitro, allowing for customized design. Their nucleic acid composition offers unique advantages, such as ease of synthesis, precise chemical modification, and high stability, ensuring consistency across experiments. These properties have expanded the applications of nucleic acid aptamers from basic research to clinical diagnostics, targeted drug delivery, and molecular imaging. Nucleic acid aptamers have therefore become an important tool for advancing precision medicine and innovative therapeutic strategies across a wide range of scientific and clinical applications.

[0004] Cell-SELEX offers a key advantage: it selects nucleic acid aptamers in the native cellular environment, enabling the identification of aptamers that bind to cell surface proteins in their native conformation. Furthermore, it facilitates the discovery of novel biomarkers without prior knowledge of the target protein, making it more versatile than protein-based screening methods. A crucial step after cell-based screening is identifying the protein targets of the selected aptamers. To address this challenge, Larry Gold et al. proposed using red blood cells (RBCs) as a model system in 1998, demonstrating that nucleic acid aptamers can recognize complex cellular targets, specifically the CD71 protein. Building on this concept, Blank et al. employed a similar strategy and successfully generated a nucleic acid aptamer targeting the pigpen protein in rat brain tumor microvessels. In this approach, once an aptamer has been selected, affinity purification is used to pull down its target protein, which is then identified by mass spectrometry. This method remains the mainstream strategy today.

[0005] However, this technology faces the following limitations: (1) target identification is inefficient: targets can only be identified for one specific nucleic acid aptamer at a time, so large numbers of aptamers cannot be matched to their targets simultaneously in high throughput; (2) the selection of nucleic acid aptamer candidates is usually empirical and low-throughput (a few aptamer sequences are picked from millions), making it difficult to discover valuable targets; (3) targets identified by Cell-SELEX are often protein molecules that readily bind nucleic acids, while low-abundance proteins are often overlooked, so relatively few usable nucleic acid aptamers target specific cells such as tumor cells. These limitations hinder its effectiveness in broader applications. The growing demand for aptamers calls for a high-throughput method for identifying nucleic acid aptamer targets as a fundamental tool for scientific research.

Summary of the Invention

[0006] In view of the shortcomings of the existing technologies described above, and in order to solve the problem of identifying these target proteins, this invention provides an innovative technology platform, single-cell perturbation-driven aptamer recognition and kinetics sequencing (SPARK-seq), combined with high-throughput aptamer sequencing. The platform combines Cell-SELEX and CRISPR gene perturbation with single-cell multi-omics sequencing to systematically identify the interactions between thousands of aptamers and their target proteins in a single experiment.

[0007] In simple terms, a core idea of this invention is to use two cell populations that differ in their proteins. A nucleic acid aptamer library (containing a large number of different nucleic acid aptamers) is bound to each of the two cell populations separately, and the differences in binding are used to screen for or identify target proteins, or to screen for or identify new nucleic acid aptamers. The two cell populations can be normal cells (or target cells) and a perturbed cell population. "Normal" is defined relative to perturbed cells: perturbed cells are cells or cell populations produced by deliberate interference, while normal cells have not undergone any deliberate interference. "Deliberate interference" refers to modifying cells by technical means to reduce, eliminate, or increase the expression of certain proteins on the cell surface. These proteins can be total cellular proteins or proteins on the cell membrane, and the reduction, absence, or increase is relative to normal cells. Thus, normal cells and perturbed cells differ in their proteins. The difference can be in a single protein in a single cell (between a normal cell and a perturbed cell), between multiple cell populations (between normal and perturbed cell populations), in multiple proteins in a single cell, or in multiple proteins across multiple perturbed cells.

[0008] Therefore, one objective of this invention is to achieve systematic and high-throughput identification or screening of thousands of nucleic acid aptamer-protein interactions; a second objective is to address the problem that traditional Cell-SELEX methods often tend to enrich proteins with high abundance or easy binding to nucleic acids, while failing to detect low-abundance cell surface proteins, aiming to expand the range of discoverable biomarkers; a third objective is to develop an innovative platform (SPARK-seq) that seamlessly integrates nucleic acid aptamer screening, target perturbation, and detection analysis (single-cell sequencing), realizing a complete workflow from nucleic acid aptamer enrichment to target identification; and a fourth objective is to screen for nucleic acid aptamers with high affinity and slow dissociation rate (high stability), improving the performance of the obtained nucleic acid aptamers in diagnostic and therapeutic applications.

[0009] To achieve the above objectives, the present invention employs the following technical solution:

[0010] In one aspect, the present invention provides a method for high-throughput identification of target proteins of nucleic acid aptamers, the method comprising the following steps:

[0011] (1) The enriched nucleic acid aptamer library is bound to a target cell or a target cell population;

[0012] (2) The enriched nucleic acid aptamer library is bound to a perturbed cell or a perturbed cell population, in which at least one protein is altered relative to the target cell;

[0013] (3) The binding of the nucleic acid aptamers to the perturbed cell population and to the target cell population is compared, and the differences are used to identify the target proteins or the nucleic acid aptamers that bind them.

[0014] The nucleic acid aptamer is a nucleic acid aptamer that can bind to the target cell population. The protein here can be the target protein.

[0015] Cell-SELEX can screen nucleic acid aptamers from massive nucleic acid sequences that can specifically bind to specific living cells (not just a single protein). The series of nucleic acid aptamers obtained by screening can effectively target and bind to the target cell population. However, it is unknown which target proteins on the target cell population the series of nucleic acid aptamers specifically bind to. In order to identify the target proteins corresponding to the series of nucleic acid aptamers, a lot of work is still needed.

[0016] In existing methods for screening aptamers and target proteins that bind to target cell populations, the first step is to enrich an aptamer library using SELEX technology. Then, a few candidate sequences are selected through cloning and identification. Finally, the target protein for each candidate aptamer is identified individually. It is evident that traditional screening methods can only identify the target protein of one aptamer per cycle, and cannot identify the target protein of every aptamer in the enriched library. Only a few can be randomly selected. In the process of selecting aptamers, many low-abundance but high-performance aptamer sequences may be missed. Target identification also tends to favor high-abundance proteins, making it difficult to detect relatively low-abundance target proteins. This is typically a low-throughput, serial, and experience-dependent process.

[0017] To overcome the obstacles and bottlenecks of existing technologies, this invention proposes the SPARK-seq high-throughput matching platform. First, Cell-SELEX is used to obtain enriched nucleic acid aptamer libraries that can bind to the target cell population. Then, the enriched libraries are incubated directly with the proteins of the target cell population (the collection of proteins obtained by lysing the target cell population), and a series of target proteins capable of binding the nucleic acid aptamers is selected from them. Next, perturbed cell populations are constructed from the target cell population, each lacking a different target protein. The nucleic acid aptamers bound by each perturbed cell population are compared with those bound by the target cell population. Because each perturbed population differs from the target population only in the missing protein, the target protein of each differential nucleic acid aptamer can be deduced to be the protein missing from that population.

[0018] For example, if the target cell population contains four proteins (A, B, C, and D), four perturbed cell populations are constructed: the first lacks protein A, the second lacks protein B, the third lacks protein C, and the fourth lacks protein D. The enriched aptamer library is then bound to each of the four perturbed cell populations and to the target cell population to identify differential aptamers. For instance, if the set of aptamers bound to the first perturbed cell population is m and the set of aptamers bound to the target cell population is n, then the differential aptamers are n − m, that is, the aptamers in n that are absent from m. Since the first perturbed cell population differs from the target cell population by one protein (A), it can be directly deduced that protein A is the target protein of the differential aptamers n − m. The differential aptamers between the target cell population and each of the remaining perturbed cell populations can be identified in the same way, yielding the differential aptamers and their corresponding target proteins.
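The four-protein example can be illustrated with a minimal Python sketch. The aptamer identifiers and the bound sets are hypothetical toy data, not results from the patent; the function name `deduce_targets` is likewise an assumption introduced for illustration. The sketch only shows the set difference between the aptamers bound to the target population (n) and to each knockout population (m).

```python
# Hypothetical toy data: aptamer IDs detected on the unperturbed target population (set n).
target_bound = {"apt1", "apt2", "apt3", "apt4", "apt5"}

# One knockout population per protein; each value is the set m bound to that population.
perturbed_bound = {
    "A": {"apt2", "apt3", "apt4", "apt5"},
    "B": {"apt1", "apt3", "apt4", "apt5"},
    "C": {"apt1", "apt2", "apt4", "apt5"},
    "D": {"apt1", "apt2", "apt3"},
}


def deduce_targets(target_set, perturbed_sets):
    """Assign every aptamer lost in a knockout population to the missing protein."""
    assignments = {}
    for protein, bound in perturbed_sets.items():
        # n - m: aptamers that bind the target population but not the knockout.
        for aptamer in sorted(target_set - bound):
            assignments.setdefault(aptamer, []).append(protein)
    return assignments


assignments = deduce_targets(target_bound, perturbed_bound)
for aptamer, proteins in sorted(assignments.items()):
    print(aptamer, "->", proteins)
```

In this toy data, two aptamers are lost when protein D is removed, so both are assigned to D; real data would need abundance-based statistics rather than strict presence or absence.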

[0019] For example, if a target cell population contains four proteins (A, B, C, and D), four perturbed cell populations are constructed: the first shows decreased expression of protein A, the second shows decreased expression of protein B, the third shows decreased expression of protein C, and the fourth shows decreased expression of protein D. The enriched nucleic acid aptamer library is then bound to each of the four perturbed cell populations and to the target cell population to identify differential nucleic acid aptamers. For instance, if the set of nucleic acid aptamers bound to the first perturbed cell population is m and the set of nucleic acid aptamers bound to the target cell population is n, then the differential nucleic acid aptamers are n − m. Since the expression of protein A is decreased in the first perturbed cell population compared to the target cell population, it can be inferred that protein A, whose expression is decreased, may be the target protein of the differential nucleic acid aptamers n − m. The differential nucleic acid aptamers between the target cell population and each of the remaining perturbed cell populations can be identified in the same way, yielding the differential nucleic acid aptamers and their corresponding target proteins.

[0020] For example, if a target cell population contains four proteins (A, B, C, and D), four perturbed cell populations are constructed: the first shows increased expression of protein A, the second shows increased expression of protein B, the third shows increased expression of protein C, and the fourth shows increased expression of protein D. The enriched nucleic acid aptamer library is then bound to each of the four perturbed cell populations and to the target cell population to identify differential nucleic acid aptamers. For instance, if the set of nucleic acid aptamers bound to the first perturbed cell population is m and the set of nucleic acid aptamers bound to the target cell population is n, then the differential nucleic acid aptamers are those whose binding differs between m and n. Since the first perturbed cell population shows increased expression of protein A compared to the target cell population, it can be inferred that protein A, whose expression is increased, may be the target protein of these differential nucleic acid aptamers. The differential nucleic acid aptamers between the target cell population and each of the other perturbed cell populations can be identified in the same way, yielding the differential nucleic acid aptamers and their corresponding target proteins.
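When expression is decreased or increased rather than abolished, binding changes in degree rather than disappearing, so a signed comparison of abundances is more natural than a strict set difference. The following Python sketch is one possible way to express this. The log2 fold change, the pseudocount, the cutoff of 1.0, and all counts are assumptions made for illustration and are not specified in the source.

```python
import math


def log2_fold_change(perturbed_count, target_count, pseudocount=1.0):
    """Signed change in aptamer abundance; the pseudocount avoids division by zero."""
    return math.log2((perturbed_count + pseudocount) / (target_count + pseudocount))


def candidate_aptamers(target_counts, perturbed_counts, direction, min_abs_lfc=1.0):
    """Return aptamers whose binding shifts in the direction expected for the perturbation.

    direction="down": knockdown or knockout, binding is expected to drop.
    direction="up":   overexpression, binding is expected to rise.
    """
    hits = []
    for aptamer, target_count in target_counts.items():
        lfc = log2_fold_change(perturbed_counts.get(aptamer, 0), target_count)
        if direction == "down" and lfc <= -min_abs_lfc:
            hits.append(aptamer)
        elif direction == "up" and lfc >= min_abs_lfc:
            hits.append(aptamer)
    return hits


# Hypothetical read counts per aptamer on each population.
target_counts = {"apt1": 99, "apt2": 99, "apt3": 99}
knockdown_a = {"apt1": 24, "apt2": 99, "apt3": 90}
overexpress_a = {"apt1": 399, "apt2": 99, "apt3": 110}

print(candidate_aptamers(target_counts, knockdown_a, "down"))
print(candidate_aptamers(target_counts, overexpress_a, "up"))
```

In both toy perturbations only `apt1` changes by at least twofold, so it is the only aptamer proposed as a candidate binder of protein A.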

[0021] When multiple perturbed cell populations and target cell populations are simultaneously combined with enriched nucleic acid aptamer libraries, the differential nucleic acid aptamers of each perturbed cell population and target cell population can be analyzed simultaneously, thereby obtaining a series of differential nucleic acid aptamers. Furthermore, each target protein corresponding to each differential nucleic acid aptamer can be directly analyzed and obtained. In other words, a large number of target proteins corresponding to nucleic acid aptamers can be obtained at once without having to search for their target proteins one by one based on each nucleic acid aptamer. This truly realizes high-throughput identification of nucleic acid aptamers and their target proteins based on target cell populations.

[0022] Furthermore, the protein includes proteins in the target cell population that can bind to nucleic acid aptamers.

[0023] Target cells contain a variety of proteins, including membrane proteins on the cell membrane surface and proteins inside the cell. However, not every protein can bind to nucleic acid aptamers. Theoretically, only proteins that can bind to nucleic acid aptamers can become target proteins. Therefore, a subset of these proteins is preferentially selected and bound to an enriched nucleic acid aptamer library for screening a series of target proteins in the target cell population.

[0024] Furthermore, the proteins include cell membrane proteins of the target cell population.

[0025] Since nucleic acid aptamers primarily bind to surface proteins (i.e., proteins located on the cell membrane) of the target cell population, it is preferable to use cell membrane proteins of the target cell population to screen for a series of target proteins.

[0026] There are many types of cell membrane proteins in the target cell population. The present invention uses the sum of cell membrane proteins of the target cell population, which is a mixed solution. This solution is simultaneously contacted with an enriched nucleic acid aptamer library for screening a series of target proteins.

[0027] Furthermore, step (1) is followed by screening for candidate target proteins of the target cell population.

[0028] Furthermore, the method for screening candidate target proteins of the target cell population includes: first obtaining a protein mixture solution of the target cell population, binding the protein mixture solution of the target cell population to an enriched nucleic acid aptamer library, and screening candidate target proteins of the target cell population from there.

[0029] Furthermore, the protein mixture solution of the target cell population comprises the sum of cell membrane proteins of the target cell population.

[0030] Furthermore, the method for obtaining the protein mixture solution of the target cell population includes:

[0031] i. Separate the cell membranes of the target cell population;

[0032] ii. Lyse the cell membrane to obtain a cell membrane protein solution of the target cell population.

[0033] In some methods, cell membranes of the target cell population are extracted using a cell membrane protein extraction kit, and then lysed to obtain a cell membrane protein solution containing all the cell membrane proteins of the target cell population.

[0034] Furthermore, the screening of candidate target proteins for the target cell population includes screening a series of target proteins that can bind to more nucleic acid aptamers and are ranked highly.

[0035] Furthermore, the ranking is based on binding differences, as follows: the enriched nucleic acid aptamer library and a random nucleic acid aptamer library are each bound to the proteins of the target cell population. For each protein, N is the amount of protein bound by the enriched nucleic acid aptamer library minus the amount bound by the random control library. Proteins are ranked by their N values, and proteins with larger N values are selected as target proteins. Together, the selected target proteins form the set of candidate target proteins.
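The ranking by N can be sketched in a few lines of Python. The protein names, the abundance values, and the `top_k` cutoff are hypothetical; the source does not state the units of the bound amount or how many proteins are retained.

```python
def rank_by_enrichment(enriched, random_control, top_k):
    """Rank proteins by N = amount bound by enriched library - amount bound by random library."""
    n_values = {
        protein: enriched[protein] - random_control.get(protein, 0.0)
        for protein in enriched
    }
    ranked = sorted(n_values.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical pulled-down amounts per protein (arbitrary units).
enriched = {"P1": 12.0, "P2": 9.5, "P3": 10.0, "P4": 6.0}
random_control = {"P1": 4.0, "P2": 9.0, "P3": 5.0, "P4": 1.5}

candidates = rank_by_enrichment(enriched, random_control, top_k=3)
print(candidates)
```

Note that P2 binds strongly to the enriched library but almost as strongly to the random library, so its N is small and it drops out of the candidate set.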

[0036] In some methods, the random nucleic acid aptamer library is the initial R0 library, and the enriched nucleic acid aptamer library is a nucleic acid aptamer library that has undergone 1 to n rounds of screening. For example, for the R4 (fourth round of screening) enriched library, candidate target proteins can be screened based on the ranking of the binding differences between R4 and R0; for the R3 (third round of screening) enriched library, candidate target proteins can be screened based on the ranking of the binding differences between R3 and R0. The greater the binding difference, the higher the ranking.

[0037] In some embodiments, this invention first enriches a sub-library of nucleic acid aptamers targeting specific cells using Cell-SELEX. Then, the nucleic acid aptamer library from each round (the R0 initial library and the R1, R2, R3, and R4 enriched libraries from the first to fourth rounds of screening) is incubated with cell lysate. The protein complexes bound to the nucleic acid aptamers are pulled down with streptavidin beads. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is then used to identify the specific proteins that have been "fished out," and a series of target proteins is screened from them.

[0038] In some approaches, since there are many types of cell membrane proteins in the target cell population, preliminary screening can be performed based on the binding amount to identify a series of membrane proteins with higher binding amounts to nucleic acid aptamers, and then the membrane proteins with the highest differences can be screened.

[0039] In some methods, the screening for higher binding amounts to nucleic acid aptamers is performed by a computer program, which can select proteins by setting threshold parameters, such as R4 > 8 (8 represents the amount of protein that can bind to nucleic acid aptamers), and screen out a series of membrane proteins with higher binding amounts through this parameter.
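The two-stage selection (a threshold on binding amount, followed by ranking on the difference from the initial library) might look like the sketch below. The threshold of 8 comes from the text; the protein names, the values, and the use of R4 minus R0 as the ranking difference for this particular step are illustrative assumptions.

```python
def prefilter_then_rank(r4_amounts, r0_amounts, min_r4=8.0):
    """Keep membrane proteins with R4 > min_r4, then order them by R4 - R0 (largest first)."""
    kept = [protein for protein, amount in r4_amounts.items() if amount > min_r4]
    return sorted(
        kept,
        key=lambda protein: r4_amounts[protein] - r0_amounts.get(protein, 0.0),
        reverse=True,
    )


# Hypothetical amounts of each membrane protein pulled down by the R4 and R0 libraries.
r4_amounts = {"M1": 12.0, "M2": 8.0, "M3": 9.0, "M4": 15.0}
r0_amounts = {"M1": 3.0, "M2": 1.0, "M3": 8.5, "M4": 10.0}

ranked_membrane_proteins = prefilter_then_rank(r4_amounts, r0_amounts)
print(ranked_membrane_proteins)
```

M2 sits exactly at the threshold and is excluded because the comparison is strict, mirroring the `R4 > 8` condition.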

[0040] Furthermore, the enriched nucleic acid aptamer library is derived from the random nucleic acid aptamer library. The random nucleic acid aptamer library is contacted with the target cell population, and the enriched nucleic acid aptamer library is obtained through screening.

[0041] Furthermore, the enriched library is obtained by applying different screening pressures, including any one or more of the following: the number of screening rounds, the number of washes, and an increased content of BSA or herring sperm DNA.

[0042] In some approaches, different screening pressures include reducing the number of positive screening cells (disease cells), increasing the number of negative screening cells (healthy cells), the number of washes, and increasing the content of BSA and herring sperm DNA.

[0043] Furthermore, the number of screening rounds is 1 to 26.

[0044] In some methods, 1 to Y screening cycles are performed under different screening pressures, where Y may be an integer from 2 to 6, such as 2, 3, 4, or 5.

[0045] In a preferred approach, screening consists of 4 cycles.

[0046] Further, the random nucleic acid aptamer library includes a library represented by the following sequence: 5'-CTCGTGGGCTCGGAGATGTGTATAAGAGACAG-Nx-GCAGCTCGGCCCATATAAGAAA-3', where Nx is a random sequence of X nucleotides, and X is 10 to 100.

[0047] In some embodiments, X is 40 to 50.

[0048] In some approaches, to adapt the platform for recognizing nucleic acid aptamers targeting cell phenotypes affected by specific gene inactivation, this invention first designed a nucleic acid aptamer library compatible with single-cell sequencing. This library contains a central 46-base randomized region, flanked by a 32-nucleotide 5' PCR handle and a 22-nucleotide 3' capture sequence. Through several rounds of Cell-SELEX, the library was enriched for nucleic acid aptamers that specifically bind to the target cells. In the sequence listing of this invention, the underlined portion of each nucleic acid sequence is the N46 base sequence.
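The library layout (constant 5' handle, random core, constant 3' capture sequence) can be made concrete with a short Python sketch. The two flanking sequences are copied from the library definition in the text; the function names and the simple exact-match extraction are assumptions for illustration, and a real pipeline would normally tolerate sequencing errors in the flanks.

```python
import random

FIVE_PRIME_HANDLE = "CTCGTGGGCTCGGAGATGTGTATAAGAGACAG"  # 32-nt PCR handle from the text
THREE_PRIME_CAPTURE = "GCAGCTCGGCCCATATAAGAAA"  # 22-nt capture sequence from the text


def make_library_member(x=46, rng=random):
    """Build one library sequence with a random core of x nucleotides."""
    core = "".join(rng.choice("ACGT") for _ in range(x))
    return FIVE_PRIME_HANDLE + core + THREE_PRIME_CAPTURE


def extract_random_region(read):
    """Return the sequence between the constant flanks, or None if a flank is missing."""
    flank_length = len(FIVE_PRIME_HANDLE) + len(THREE_PRIME_CAPTURE)
    if (
        len(read) >= flank_length
        and read.startswith(FIVE_PRIME_HANDLE)
        and read.endswith(THREE_PRIME_CAPTURE)
    ):
        return read[len(FIVE_PRIME_HANDLE):len(read) - len(THREE_PRIME_CAPTURE)]
    return None


member = make_library_member(46, random.Random(0))  # seeded only for reproducibility
core = extract_random_region(member)
print(len(member), len(core))
```

With X = 46 the full oligonucleotide is 32 + 46 + 22 = 100 nucleotides long.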

[0049] In some embodiments, during the screening of candidate target proteins in step (1), the present invention not only focuses on proteins that are dynamically enriched across the screening rounds, but also includes proteins that fall in different enrichment intervals or have research significance, and finally obtains a combination of a series of candidate target proteins.

[0050] Furthermore, the change in at least one protein in step (2) includes a decrease or increase in the expression level of at least one protein or a loss of at least one protein.

[0051] Further, the perturbation cell population in step (2) is constructed by gene editing, knockout, knockdown or silencing of the target cell population to make one or more of its candidate target proteins not expressed or reduced in expression; the enriched nucleic acid aptamer library is simultaneously bound to one or more perturbation cell populations.

[0052] A perturbed cell population refers to a cell population obtained by modifying a target cell population. This is achieved by deleting at least one candidate target protein from the surface membrane proteins of the target cell population.

[0053] Since there are many candidate target proteins screened in step (1), there can also be many perturbation cell populations.

[0054] The enriched aptamer library can be bound to one or more perturbed cell populations simultaneously. If it is bound to one perturbed cell population, the differential aptamers between the perturbed and target cell populations are analyzed, and their target protein is deduced to be the candidate target protein missing from the perturbed cell population. If it is bound to multiple perturbed cell populations simultaneously, the target protein of each differential aptamer can be deduced in the same way, achieving high-throughput identification of target proteins.

[0055] Furthermore, the perturbed cell population is obtained by knocking out a candidate target protein through gene editing; different perturbed cell populations knock out different types of candidate target proteins.

[0056] Each perturbation cell population is obtained by knocking out one candidate target protein from the target cell population. In other words, each perturbation cell population differs from the target cell population by one candidate target protein, and different perturbation cell populations knock out different types of candidate target proteins. For example, when there are 13 candidate target proteins, 13 different perturbation cell populations can be generated.

[0057] Furthermore, the gene editing method is CRISPR-Cas9.

[0058] Using CRISPR-Cas9 technology, gRNA sequences are designed to target candidate proteins that need to be knocked out, thereby constructing a perturbed cell population.

[0059] In some approaches, enriched aptamer libraries are incubated with a mixed cell population in which each cell has undergone CRISPR knockout (CRISPR KO) of a different surface protein, followed by 10x Genomics 5' sequencing. In this assay, guide RNA (gRNA), aptamers, and mRNA are captured via CRISPR poly-dT RT primers, template-switch oligonucleotides (TSO), and poly(T) sequences, and then subjected to single-cell sequencing. After sequencing, cell barcodes from each omics layer are identified and corrected so that each cell and its corresponding perturbation can be mapped to an aptamer binding profile. The unique DNA barcode assigned to each guide RNA and aptamer allows the pooled perturbation and aptamer libraries to be deconvolved, enabling cell phenotypes to be associated precisely with guide RNA and aptamer sequences.
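The deconvolution step, in which the shared cell barcode links each cell's guide RNA to the aptamers detected on it, can be sketched as a simple join. All identifiers below (cell barcodes, guide names such as `sgGENE1`, the non-targeting control `sgNT`, and the aptamer IDs) are hypothetical, and the sketch assumes barcode correction and guide assignment have already been done upstream.

```python
from collections import Counter, defaultdict

# Hypothetical guide assignment per corrected cell barcode.
grna_by_cell = {"c1": "sgGENE1", "c2": "sgGENE1", "c3": "sgNT"}

# Hypothetical aptamer observations as (cell barcode, aptamer ID) pairs.
aptamer_reads = [
    ("c1", "apt1"),
    ("c2", "apt1"),
    ("c2", "apt2"),
    ("c3", "apt1"),
    ("c3", "apt1"),
    ("c4", "apt9"),  # c4 has no guide assignment and is skipped
]


def aptamer_profile_by_perturbation(guides, reads):
    """Aggregate aptamer counts per guide by joining on the cell barcode."""
    profiles = defaultdict(Counter)
    for cell, aptamer in reads:
        guide = guides.get(cell)
        if guide is None:
            continue  # cell without a guide cannot be linked to a perturbation
        profiles[guide][aptamer] += 1
    return profiles


profiles = aptamer_profile_by_perturbation(grna_by_cell, aptamer_reads)
for guide, counts in sorted(profiles.items()):
    print(guide, dict(counts))
```

The resulting per-guide profiles are the input that a downstream comparison between perturbed and unperturbed cells would consume.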

[0060] Furthermore, the binding difference in step (3) is determined by comparing the nucleic acid aptamers bound to the perturbed cell population with those bound to the target cell population and identifying the differential nucleic acid aptamers.

[0061] Furthermore, a differential nucleic acid aptamer is one that binds to the target cell population but not to a given perturbed cell population, as determined by comparing the nucleic acid aptamers bound to each; the membrane protein knocked out in that perturbed cell population is then the target protein of the differential nucleic acid aptamer.

[0062] Furthermore, the differential nucleic acid aptamers are identified through single-cell multi-omics sequencing, which includes any one or more of single-cell mRNA sequencing, single-cell nucleic acid aptamer sequencing, and single-cell CRISPR gRNA sequencing.

[0063] Furthermore, by simultaneously analyzing the differential nucleic acid aptamers between each perturbed cell population and the target cell population, the target proteins corresponding to all differential nucleic acid aptamers can be obtained simultaneously in high throughput.

[0064] Furthermore, the differential nucleic acid aptamers between each perturbed cell population and the target cell population are analyzed simultaneously using the SPARK-seq algorithm, which includes:

[0065] (1) Calculate the difference in binding abundance of nucleic acid aptamers between the perturbed cell population and the target cell population;

[0066] (2) Set a statistical threshold based on a Gaussian mixture model to screen for nucleic acid aptamer-target protein pairs with significant binding differences.
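The two steps above can be sketched as follows. The source does not give the exact formulation of the SPARK-seq algorithm, so everything here is an assumption made for illustration: the binding difference is taken as target abundance minus perturbed abundance, a two-component one-dimensional Gaussian mixture is fitted with a plain EM loop, and a pair is called significant when its posterior probability of belonging to the higher-mean component is at least 0.95. The input values are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_component_gmm(values, n_iter=200, min_var=1e-6):
    """Fit a 1-D two-component Gaussian mixture by EM; returns (weights, means, variances)."""
    overall_mean = sum(values) / len(values)
    start_var = max(sum((v - overall_mean) ** 2 for v in values) / len(values), min_var)
    weights, means, variances = [0.5, 0.5], [min(values), max(values)], [start_var, start_var]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weights, means and variances from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            spread = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(spread, min_var)
    return weights, means, variances


def significant_pairs(diffs, posterior_cutoff=0.95):
    """diffs maps (aptamer, protein) to target abundance minus perturbed abundance."""
    weights, means, variances = fit_two_component_gmm(list(diffs.values()))
    signal = 0 if means[0] > means[1] else 1  # component with the larger binding loss
    hits = []
    for pair, d in diffs.items():
        p = [weights[k] * normal_pdf(d, means[k], variances[k]) for k in range(2)]
        total = sum(p) or 1e-300
        if p[signal] / total >= posterior_cutoff:
            hits.append(pair)
    return hits


# Hypothetical differences: most pairs scatter around zero, three show a clear loss of binding.
background = [-0.2, -0.1, 0.0, 0.1, 0.2, 0.05, -0.05, 0.15]
diffs = {("apt%d" % i, "P_bg"): d for i, d in enumerate(background)}
diffs.update({("aptX", "A"): 5.0, ("aptY", "A"): 5.2, ("aptZ", "B"): 4.8})

hits = significant_pairs(diffs)
print(sorted(hits))
```

A library implementation of Gaussian mixtures would normally replace the hand-written EM loop; it is written out here only to keep the sketch dependency-free.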

[0067] The SPARK-seq platform constructed in this invention combines Cell-SELEX, CRISPR and single-cell multi-omics sequencing technologies, and designs the SPARK-seq algorithm for high-throughput analysis. In a single experiment, it can perform global scanning and correlation analysis on the entire nucleic acid aptamer library and the entire target protein library, and simultaneously obtain all nucleic acid aptamers and their corresponding target proteins.

[0068] To investigate the impact of each perturbation on the binding of specific nucleic acid aptamer families and to discover potential cell surface protein-nucleic acid aptamer interactions, this invention developed the SPARK-seq algorithm, combining Markov clustering (MCL) and statistical methods to identify nucleic acid aptamer families and analyze their binding kinetics under perturbation. This ultimately enables high-throughput identification of the interactions between nucleic acid aptamers and cell surface antigens, while simultaneously discovering multi-parameter cell markers.
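For orientation, Markov clustering alternates an expansion step (squaring a column-stochastic matrix) with an inflation step (element-wise powering followed by column normalization). The following minimal Python sketch groups hypothetical aptamer sequences into families. The similarity measure (Hamming distance of at most one mismatch between equal-length sequences), the inflation value, and the sequences themselves are assumptions for illustration; the description does not state how the similarity graph is built.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def similarity_graph(seqs, max_mismatches=1):
    n = len(seqs)
    return [[1.0 if i != j and hamming(seqs[i], seqs[j]) <= max_mismatches else 0.0
             for j in range(n)] for i in range(n)]

def normalise_columns(m):
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mcl(adjacency, inflation=2.0, n_iter=50):
    """Markov clustering: repeat expansion and inflation, then read clusters from rows."""
    n = len(adjacency)
    # Add self-loops, then make every column sum to one.
    m = normalise_columns([[adjacency[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(n_iter):
        m = matmul(m, m)                                   # expansion
        m = [[v ** inflation for v in row] for row in m]   # inflation
        m = normalise_columns(m)
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Hypothetical 8-nt sequences standing in for aptamer random regions.
seqs = ["ACGTACGT", "ACGTACGA", "ACGTACGC", "TTGGCCAA", "TTGGCCAT"]
families = [[seqs[i] for i in cluster] for cluster in mcl(similarity_graph(seqs))]
```

Here the first three sequences differ from each other at a single position and form one family, while the last two form a second family.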

[0069] In some embodiments, this invention constructs a perturbation cell population using CRISPR-Cas9 technology, including knockout cells targeting 13 candidate target proteins. The highest enrichment subset library R4 is co-incubated with the perturbation cell population. After incubation, single-cell multi-omics sequencing is performed, simultaneously reading three pieces of information from each cell: (1) gRNA sequence: identifying which gene was knocked out; (2) aptamer sequence: identifying which aptamer(s) are bound to the cell surface; and (3) mRNA sequence: confirming the knockout effect and cell type using mRNA. Finally, data analysis is performed using the SPARK-seq algorithm.

[0070] The core logic of SPARK-seq is that if a nucleic acid aptamer specifically binds to a particular target protein, then in a cell population where that target protein has been knocked out by CRISPR, the binding signal of that aptamer will be significantly weakened or disappear. Using the SPARK-seq platform, the system can simultaneously and in parallel match a large number of nucleic acid aptamers with multiple target proteins within the same dataset. For example, it can match 5535 nucleic acid aptamer sequences with 8 corresponding target proteins, achieving the goal of identifying multiple nucleic acid aptamers and their target proteins in a single operation.
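The core logic stated above can be illustrated with a short Python sketch that, for one aptamer, looks for the knockout population in which its binding signal drops the most. The mean signal values, the protein set, and the 0.5 minimum fractional loss are hypothetical values chosen for illustration and are not parameters disclosed for SPARK-seq.

```python
def fractional_loss(control_mean, knockout_mean):
    """Fraction of the control binding signal that is lost in a knockout population."""
    if control_mean <= 0:
        return 0.0
    return max(0.0, 1.0 - knockout_mean / control_mean)

def best_knockout(control_mean, knockout_means, min_loss=0.5):
    """Return (protein, loss) for the knockout with the largest loss, or (None, loss)."""
    losses = {protein: fractional_loss(control_mean, value)
              for protein, value in knockout_means.items()}
    protein, loss = max(losses.items(), key=lambda kv: kv[1])
    return (protein, loss) if loss >= min_loss else (None, loss)

# Hypothetical mean binding signal of one aptamer in control and knockout populations.
control_signal = 40.0
knockout_signals = {"PTK7": 4.0, "CDCP1": 38.0, "ITGA3": 41.0}
target, loss = best_knockout(control_signal, knockout_signals)
```

In this example the signal falls by about 90% only when PTK7 is knocked out, so PTK7 is proposed as the target of the aptamer.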

[0071] Furthermore, the target cell population includes diseased cells and / or healthy cells.

[0072] Furthermore, the diseased cells include tumors, inflammatory cells, or any other diseased cells that are not healthy.

[0073] This invention provides a tri-omics approach utilizing single-cell transcriptomics, perturbationomics, and aptameromics. This approach effectively combines high-throughput single-cell sequencing and CRISPR technology to achieve high-throughput sequencing of different protein knockout cell subtypes, thereby characterizing the aptamer binding profiles of these subtypes. Furthermore, by comparing the binding differences of aptamers in different cell populations, target proteins can be identified, achieving high-throughput target identification of aptamers.

[0074] This technology allows for the simultaneous and precise identification of the binding target proteins of 5535 aptamers. Single-cell perturbation aptamer sequencing (SPARK-seq) combines multiplexed CRISPR-mediated gene inactivation with single-cell RNA sequencing, enabling comprehensive analysis of gene expression phenotypes induced by each specific perturbation.

[0075] Furthermore, the method also includes screening slow-dissociating nucleic acid aptamers based on dissociation rate, wherein the binding difference value of nucleic acid aptamers is negatively correlated with the dissociation rate, and is used to select nucleic acid aptamers with high binding stability.

[0076] This invention is the first to realize high-throughput screening based on dissociation kinetics (k_off). The research found that the difference in binding of nucleic acid aptamers between perturbed and control cells is highly correlated with their dissociation rate. This allows SPARK-seq to actively screen for highly stable aptamers with "slow dissociation" characteristics from massive sequences. These aptamers exhibit superior performance in therapy (long-lasting effects) and diagnosis (high signal-to-noise ratio), directly improving the quality of the aptamers obtained, something traditional methods cannot achieve. The advantages of the SPARK-seq platform provided by this invention extend beyond increased throughput; it comprehensively surpasses traditional methods in terms of realism, target range, aptamer quality, and discovery capabilities, providing a powerful tool for the development of nucleic acid aptamer drugs and for basic research.

[0077] Furthermore, the method is able to identify nucleic acid aptamers of cell surface proteins with different abundances, the expression levels of which span two orders of magnitude.

[0078] This invention develops a novel high-throughput SPARK-seq platform that seamlessly integrates Cell-SELEX (cell screening), CRISPR perturbation (functional gene knockout), and single-cell multi-omics sequencing (simultaneous detection of mRNA, gRNA, and aptamers) into a single workflow. It no longer relies on traditional, low-throughput biochemical pull-down methods, but instead utilizes CRISPR to create thousands of microenvironments with "differential expression of natural proteins" at the single-cell level. By calculating changes in the binding signals of nucleic acid aptamers in these environments, it directly infers their interaction targets. This represents a significant leap in the ability to identify nucleic acid aptamer-target interactions, increasing the number of aptamer-target interactions from "1-2 per experiment" in traditional methods to "5,535 per experiment." Its target identification is not dependent on the physical abundance of proteins, but rather on their functional binding differences. Therefore, it can effectively discover low-abundance membrane proteins (such as NRP2 and PTPRD) that are difficult to detect using traditional mass spectrometry methods, detecting proteins spanning more than two orders of magnitude, greatly expanding the potential target space for drug development.

[0079] The two orders of magnitude mentioned here refer to a difference of 100-fold or more in the expression levels of two proteins. This invention assessed target abundance at both the whole-cell and cell-surface levels. Mining published mass spectrometry datasets (6,447 quantified proteins) revealed that five targets (e.g., PTK7) ranked in the top 30% of cellular abundance, NRP1 was near the 30% mark, while NRP2, PTPRD, and PTPRS were below the detection limit. A high-sensitivity MS platform (detecting 10,970 proteins) recovered NRP2 and PTPRS in the bottom third, while PTPRD remained undetectable, indicating extremely low expression. Calibrated flow cytometry confirmed these trends at the cell surface, with quantifications exceeding 10⁵. ITGA3 had 1,000 copies per cell, while PTPRD, PTPRF, and PTPRS had fewer than 1,000 copies per cell. As can be seen, SPARK-seq can reliably discover nucleic acid aptamers for cell surface proteins that differ in physicochemical properties and whose expression levels span at least two orders of magnitude.
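The "two orders of magnitude" criterion is simply a base-10 logarithm of the abundance ratio. A minimal Python sketch, using hypothetical copy numbers per cell chosen only to illustrate the arithmetic:

```python
import math

def orders_of_magnitude(high_copies, low_copies):
    """Number of orders of magnitude separating two abundance values."""
    return math.log10(high_copies / low_copies)

# Hypothetical surface copy numbers per cell for a high- and a low-abundance protein.
span = orders_of_magnitude(1e5, 1e3)
spans_two_orders = span >= 2.0   # a 100-fold difference corresponds to a span of 2
```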

[0080] On the other hand, the present invention provides a system for high-throughput identification of nucleic acid aptamers, the system comprising:

[0081] (1) Nucleic acid aptamer library;

[0082] (2) CRISPR perturbation cell population library, which contains multiple cell subpopulations with differential expression of surface proteins;

[0083] (3) Single-cell multi-omics sequencing module, including single-cell mRNA sequencing, single-cell nucleic acid aptamer sequencing, and single-cell CRISPR gRNA sequencing;

[0084] (4) Data analysis module, used to execute the SPARK-seq algorithm.

[0085] In another aspect, the present invention provides a method for screening slow-dissociating nucleic acid aptamers. The method involves screening a series of nucleic acid aptamers that target a target protein, binding the series of nucleic acid aptamers to the target protein, and selecting the nucleic acid aptamer that binds to the target protein the most.

[0086] The binding process between nucleic acid aptamers and target proteins is actually a dynamic process of continuous dissociation and binding. A fast dissociation rate may lead to the nucleic acid aptamer easily losing its target after binding to the target protein; the slower the dissociation rate, the higher the probability of the nucleic acid aptamer binding firmly to the target protein, and the better the targeting ability of the nucleic acid aptamer relative to the target protein.

[0087] High affinity and slow dissociation rate are two distinct aspects of nucleic acid aptamers; a nucleic acid aptamer with high affinity does not necessarily have a slow dissociation rate. Therefore, screening for nucleic acid aptamers with slow dissociation rates also has high application value. However, existing nucleic acid aptamer screening processes typically focus on affinity and specificity, neglecting dissociation rate. In the pursuit of high affinity and specificity, slow-dissociation-rate nucleic acid aptamers are easily overlooked, thus failing to be screened. This invention creatively proposes a method for screening slow-dissociation nucleic acid aptamers, offering a new direction for nucleic acid aptamer screening.
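The distinction between affinity and dissociation rate can be made concrete with the standard 1:1 binding relation K_D = k_off / k_on. The Python sketch below compares two hypothetical aptamers with the same K_D but different dissociation rates; the rate constants and the 300-second wash time are illustrative values, and the first-order decay assumes no rebinding after dissociation.

```python
import math

def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D (M) for a 1:1 binding model."""
    return k_off / k_on

def fraction_remaining(k_off, seconds):
    """Fraction of complexes still bound after a wash, assuming first-order decay."""
    return math.exp(-k_off * seconds)

def half_life(k_off):
    """Time (s) for half of the complexes to dissociate."""
    return math.log(2) / k_off

# Hypothetical rate constants: k_on in 1/(M*s), k_off in 1/s.
fast = {"k_on": 1e6, "k_off": 1e-2}   # binds fast, dissociates fast
slow = {"k_on": 1e5, "k_off": 1e-3}   # binds slowly, dissociates slowly

kd_fast = dissociation_constant(**fast)   # both come out at about 10 nM
kd_slow = dissociation_constant(**slow)
left_fast = fraction_remaining(fast["k_off"], 300)   # about 5% still bound
left_slow = fraction_remaining(slow["k_off"], 300)   # about 74% still bound
```

Both aptamers have the same affinity, yet after the same wash far more of the slow-dissociating aptamer remains bound, which is why a screen based on affinity alone can overlook it.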

[0088] Furthermore, the nucleic acid aptamer is a nucleic acid aptamer that targets a specific cell population, and the method includes the following steps:

[0089] a) Combining a nucleic acid aptamer library with a target cell or a target cell population;

[0090] b) The enriched nucleic acid aptamer library is combined with a perturbed cell or a population of perturbed cells; the perturbed cell population has at least one protein altered compared to the target cell.

[0091] c) Compare the differences in binding of nucleic acid aptamers to the perturbed cell population and the target cell population to identify the protein or the nucleic acid aptamer of the protein, and select the nucleic acid aptamer that binds to the protein the most.

[0092] In some embodiments, the method includes the following steps:

[0093] (1) The enriched nucleic acid aptamer library is combined with the protein of the target cell population to screen for candidate target proteins of the target cell population;

[0094] (2) The enriched nucleic acid aptamer library is combined with a perturbed cell population; the perturbed cell population is constructed by the target cell population lacking at least one candidate target protein;

[0095] (3) Compare the nucleic acid aptamers that bind to the perturbed cell population and the target cell population respectively, and find the differential nucleic acid aptamer. The target protein of the differential nucleic acid aptamer is the candidate target protein missing in the perturbed cell population.

[0096] (4) Differential nucleic acid aptamers bind to target proteins, and the nucleic acid aptamer with the highest number of bindings to the target protein is selected.

[0097] Furthermore, this invention provides a method for screening nucleic acid aptamers capable of recognizing cell surface proteins of varying abundances, wherein the nucleic acid aptamers are nucleic acid aptamers targeting a specific cell population, and the method includes the following steps:

[0098] a) Bind an enriched nucleic acid aptamer library to a target cell or a target cell population.

[0099] b) The enriched nucleic acid aptamer library is combined with a perturbed cell or a population of perturbed cells; the perturbed cell population has at least one protein altered compared to the target cell.

[0100] c) Compare the differences in binding of nucleic acid aptamers to perturbed cell populations and target cell populations to identify proteins or nucleic acid aptamers of proteins;

[0101] The protein contains cell surface proteins whose expression levels span two or more orders of magnitude.

[0102] In some embodiments, the method includes the following steps:

[0103] (1) The enriched nucleic acid aptamer library is combined with the protein of the target cell population to screen for candidate target proteins of the target cell population;

[0104] (2) The enriched nucleic acid aptamer library is combined with a perturbed cell population; the perturbed cell population is constructed by the target cell population lacking at least one candidate target protein;

[0105] (3) Compare the nucleic acid aptamers that bind to the perturbed cell population and the target cell population respectively, and find the differential nucleic acid aptamer. The target protein of the differential nucleic acid aptamer is the candidate target protein missing in the perturbed cell population.

[0106] The candidate target proteins include cell surface proteins whose expression levels span two or more orders of magnitude.

[0107] Traditional IP-MS typically only reliably identifies a subset of high-abundance targets (such as PTK7) in the same sample, while easily missing low-expression proteins such as CDCP1. Furthermore, it cannot simultaneously identify both low-abundance and high-abundance targets. The method provided in this invention can simultaneously identify both low-abundance and high-abundance targets, with expression levels spanning two or more orders of magnitude.

[0108] In another aspect, the present invention provides a nucleic acid aptamer for binding PTK7 protein, having a nucleotide sequence as shown in SEQ ID No. 1, SEQ ID No. 7, or SEQ ID No. 17; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with SEQ ID No. 1, SEQ ID No. 7, or SEQ ID No. 17, and still capable of binding PTK7 protein; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with the middle 46 bases of SEQ ID No. 1 or SEQ ID No. 7, and still capable of binding PTK7 protein; or a nucleotide sequence having two conserved regions, Motif 1 and Motif 2; or a nucleotide sequence having two conserved regions, wherein the two conserved regions are less than 30% altered compared to the nucleotide sequences of Motif 1 and Motif 2; wherein Motif 1 and Motif 2 have nucleotide sequences as shown in SEQ ID NO. 18 and SEQ ID NO. 19, respectively.

[0109] Specifically, the middle 46 bases (N46) of SEQ ID No. 1 or SEQ ID No. 7 refer to the 46 bases located between the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence.
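The layout described here (a 32-nucleotide 5' PCR handle, the 46-base middle region, and a 22-nucleotide 3' capture sequence) can be expressed as a simple slice. The Python sketch below assumes a full-length 100-nucleotide sequence and uses a placeholder sequence, since the actual SEQ ID sequences are not reproduced in this passage.

```python
HANDLE_5P_LEN = 32    # 5' PCR handle
RANDOM_LEN = 46       # middle region (N46)
CAPTURE_3P_LEN = 22   # 3' capture sequence

def extract_n46(full_sequence):
    """Return the middle 46 bases of a full-length aptamer sequence."""
    expected = HANDLE_5P_LEN + RANDOM_LEN + CAPTURE_3P_LEN
    if len(full_sequence) != expected:
        raise ValueError(f"expected {expected} nt, got {len(full_sequence)}")
    return full_sequence[HANDLE_5P_LEN:HANDLE_5P_LEN + RANDOM_LEN]

# Placeholder sequence (not a real aptamer): A's for the handle, C's for N46, G's for capture.
placeholder = "A" * 32 + "C" * 46 + "G" * 22
middle = extract_n46(placeholder)
```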

[0110] In some embodiments, the present invention discovers through analysis that, within the N46 region of nucleic acid aptamers binding to the PTK7 protein, there are two fixed conserved regions, Motif 1 (SEQ ID NO. 18) and Motif 2 (SEQ ID NO. 19) (Figure 15). As long as a nucleic acid sequence contains both conserved regions (a and h) simultaneously (or differs from each conserved region in less than 30% of its bases), it can bind to the PTK7 protein with higher affinity and a slower dissociation rate.
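The two-motif criterion can be sketched in Python as a sliding-window comparison. The motif strings below are hypothetical placeholders, because the sequences of SEQ ID NO. 18 and SEQ ID NO. 19 are not reproduced in this passage, and the sketch assumes that "changed" bases are substitutions only (no insertions or deletions).

```python
def best_mismatch_fraction(sequence, motif):
    """Smallest fraction of mismatched bases over all windows of the motif's length."""
    m = len(motif)
    if len(sequence) < m:
        return 1.0
    best = min(sum(a != b for a, b in zip(sequence[i:i + m], motif))
               for i in range(len(sequence) - m + 1))
    return best / m

def has_both_motifs(sequence, motif1, motif2, max_changed=0.30):
    """True if both motifs occur with less than max_changed of their bases altered."""
    return all(best_mismatch_fraction(sequence, motif) < max_changed
               for motif in (motif1, motif2))

# Hypothetical placeholder motifs and candidate sequences.
MOTIF_1 = "GGTTAGG"
MOTIF_2 = "CCATTGC"
candidate_ok = "AA" + "GGTTAGG" + "TTTT" + "CCATAGC" + "AA"   # Motif 2 with one substitution
candidate_bad = "AA" + "GGTTAGG" + "TTTTTTTTTTTTT"            # Motif 2 absent
```

In this example one substitution in a 7-base motif changes about 14% of its bases and is tolerated, whereas a sequence lacking the second motif is rejected.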

[0111] Furthermore, the nucleic acid aptamer that binds to the PTK7 protein includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 1, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 7.

[0112] In another aspect, the present invention provides a nucleic acid aptamer for binding the CDCP1 protein, having a nucleotide sequence as shown in SEQ ID No. 5, SEQ ID No. 11, or SEQ ID No. 13; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with SEQ ID No. 5, SEQ ID No. 11, or SEQ ID No. 13, and still capable of binding the CDCP1 protein; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with the middle 46 bases of SEQ ID No. 5, SEQ ID No. 11, or SEQ ID No. 13, and still capable of binding the CDCP1 protein.

[0113] Specifically, the middle 46 bases of SEQ ID No. 5, SEQ ID No. 11, or SEQ ID No. 13 refer to the 46 bases located between the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence.

[0114] Figures 20-27 list a library of nucleic acid aptamers obtained through screening in this invention, which exhibit better binding to the CDCP1 protein (higher affinity and slower dissociation) and have at least 60% homology with each other. The listed nucleic acid sequences are the N46 portions only; the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence are omitted. The sequences marked in red are the most preferred aptamers, namely those with the highest affinity or the slowest dissociation.

[0115] Furthermore, the nucleic acid aptamer that binds to the CDCP1 protein includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 5, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 11.

[0116] In another aspect, the present invention provides a nucleic acid aptamer for binding NRP1 protein, having a nucleotide sequence as shown in SEQ ID No. 2, SEQ ID No. 8, SEQ ID No. 14 or SEQ ID No. 15; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with SEQ ID No. 2, SEQ ID No. 8, SEQ ID No. 14 or SEQ ID No. 15, and still capable of binding NRP1 protein; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with the middle 46 bases of SEQ ID No. 2, SEQ ID No. 8, SEQ ID No. 14 or SEQ ID No. 15, and still capable of binding NRP1 protein.

[0117] Specifically, the middle 46 bases of SEQ ID No. 2, SEQ ID No. 8, SEQ ID No. 14 or SEQ ID No. 15 refer to the 46 bases located between the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence.

[0118] Figures 28-46 list a library of nucleic acid aptamers obtained through screening in this invention, which exhibit better binding to the NRP1 protein (higher affinity and slower dissociation) and have at least 60% homology with each other. The listed nucleic acid sequences are the N46 portions only; the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence are omitted. The sequences marked in red are the most preferred aptamers, namely those with the highest affinity or the slowest dissociation.

[0119] Furthermore, the nucleic acid aptamer that binds to the NRP1 protein includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 2, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 8.

[0120] In another aspect, the present invention provides a nucleic acid aptamer for binding NRP2 protein, having a nucleotide sequence as shown in SEQ ID No. 6, SEQ ID No. 9, or SEQ ID No. 16; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with SEQ ID No. 6, SEQ ID No. 9, or SEQ ID No. 16, and still capable of binding NRP2 protein; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with the middle 46 bases of SEQ ID No. 6, SEQ ID No. 9, or SEQ ID No. 16, and still capable of binding NRP2 protein.

[0121] Specifically, the middle 46 bases of SEQ ID No. 6, SEQ ID No. 9, or SEQ ID No. 16 refer to the 46 bases located between the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence.

[0122] Figures 47-51 list a library of nucleic acid aptamers obtained through screening in this invention, which exhibit better binding to the NRP2 protein (higher affinity and slower dissociation) and have at least 60% homology with each other. The listed nucleic acid sequences are the N46 portions only; the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence are omitted. The sequences marked in red are the most preferred aptamers, namely those with the highest affinity or the slowest dissociation.

[0123] Furthermore, the nucleic acid aptamer binding to the NRP2 protein includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 6, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 9.

[0124] In another aspect, the present invention provides a nucleic acid aptamer that simultaneously binds PTPRD, PTPRF, and PTPRS proteins, having a nucleotide sequence as shown in SEQ ID No. 4 or SEQ ID No. 10; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with SEQ ID No. 4 or SEQ ID No. 10, and still capable of simultaneously binding PTPRD, PTPRF, and PTPRS proteins; or a nucleotide sequence having at least 60%, 70%, 80%, or 90% homology with the middle 46 bases of SEQ ID No. 4 or SEQ ID No. 10, and still capable of simultaneously binding PTPRD, PTPRF, and PTPRS proteins.

[0125] Specifically, the middle 46 bases of SEQ ID No. 4 or SEQ ID No. 10 refer to the 46 bases located between the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence.

[0126] Figures 52-60 list a library of nucleic acid aptamers obtained through screening in this invention, which bind better to PTPRD, PTPRF, and PTPRS proteins simultaneously (higher affinity and slower dissociation rate) and have at least 60% homology with each other. The listed nucleic acid sequences are the N46 portions only; the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence are omitted. The sequences marked in red are the most preferred aptamers, namely those with the highest affinity or the slowest dissociation.

[0127] Furthermore, the nucleic acid aptamer that simultaneously binds PTPRD, PTPRF, and PTPRS proteins includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 4, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 10.

[0128] In another aspect, the present invention provides a nucleic acid aptamer for binding ITGA3 protein, having nucleotide sequences as shown in SEQ ID No. 3 and SEQ ID No. 12; or having at least 60%, 70%, 80%, or 90% homology with SEQ ID No. 3 and SEQ ID No. 12, and still being able to bind ITGA3 protein; or having at least 60%, 70%, 80%, or 90% homology with the middle 46 bases of SEQ ID No. 3 and SEQ ID No. 12, and still being able to bind ITGA3 protein.

[0129] Specifically, the middle 46 bases of SEQ ID No. 3 and SEQ ID No. 12 refer to the 46 bases located between the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence.

[0130] Figures 61-65 list a library of nucleic acid aptamers obtained through screening in this invention, which exhibit better binding to the ITGA3 protein (higher affinity and slower dissociation) and have at least 60% homology with each other. The listed nucleic acid sequences are the N46 portions only; the 32-nucleotide 5' PCR handle and the 22-nucleotide 3' capture sequence are omitted. The sequences marked in red are the most preferred aptamers, namely those with the highest affinity or the slowest dissociation.

[0131] Furthermore, the nucleic acid aptamer that binds to the ITGA3 protein includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 3, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 12.

[0132] The method provided by this invention achieves the goal of screening target proteins using multiple nucleic acid aptamers. It also detects novel nucleic acid aptamers with good affinity for the same protein, as well as novel nucleic acid aptamers that can bind simultaneously to three proteins, PTPRD, PTPRF, and PTPRS. This overcomes the previous difficulty of consistently recognizing multiple proteins of the same family despite differences in protein function and structural domains. The nucleic acid aptamers obtained for the screened target proteins are also more reasonable and comprehensive.

[0133] In another aspect, the present invention provides the use of nucleic acid aptamers as described in any of the preceding claims in the preparation of reagents for the detection, diagnosis or treatment of diseases.

[0134] In another aspect, the present invention provides the use of the nucleic acid aptamers described above in the preparation of reagents for the detection, diagnosis or treatment of diseases related to the target molecule PTK7.

[0135] In another aspect, the present invention provides the use of the nucleic acid aptamers described above in the preparation of reagents for the detection, diagnosis or treatment of diseases related to the target molecule CDCP1.

[0136] In another aspect, the present invention provides the use of the nucleic acid aptamers described above in the preparation of reagents for the detection, diagnosis or treatment of diseases related to the target molecule NRP1.

[0137] In another aspect, the present invention provides the use of the nucleic acid aptamers described above in the preparation of reagents for the detection, diagnosis or treatment of diseases related to the target molecule NRP2.

[0138] In another aspect, the present invention provides the use of the nucleic acid aptamers described above in the preparation of reagents for the detection, diagnosis or treatment of diseases related to target molecules PTPRD, PTPRF and PTPRS.

[0139] In another aspect, the present invention provides the use of the nucleic acid aptamers described above in the preparation of reagents for the detection, diagnosis or treatment of diseases related to the target molecule ITGA3.

[0140] Compared with the prior art, the beneficial effects of this application are as follows:

[0141] 1. We developed the SPARK-seq platform for high-throughput identification of nucleic acid aptamers and their target proteins. It is an innovative technology platform that combines single-cell perturbation sequencing (Perturb-seq) and high-throughput nucleic acid aptamer sequencing. It combines single-cell multi-omics sequencing, CRISPR-based gene perturbation and real-time dynamic analysis, which has completely changed the discovery process of nucleic acid aptamers and their target proteins.

[0142] 2. High throughput and resolution: SPARK-seq can systematically map more than 5,535 aptamer-protein interactions in a single experiment, which greatly surpasses the traditional methods that limit each study to only one or two aptamer-protein interactions. It can also simultaneously screen for new aptamers and discover new target proteins.

[0143] 3. Kinetic prioritization of stability: by directly quantifying the dissociation rate (k_off), SPARK-seq prioritizes slow-dissociation-rate aptamers with enhanced binding stability to ensure excellent performance in both therapeutic and diagnostic applications.

[0144] 4. SPARK-seq can identify nucleic acid aptamers for proteins with a wide abundance range: Unlike the traditional SELEX method, which often ignores low-abundance cell surface proteins, SPARK-seq excels at identifying nucleic acid aptamers for cell surface proteins with expression levels spanning two orders of magnitude, expanding its utility in discovering new biomarkers and therapeutic opportunities.

[0145] 5. First discovery of nucleic acid aptamers for low-abundance proteins CDCP1, NRP1, and NRP2; the same nucleic acid aptamer can simultaneously bind to three proteins of the same LAR-PTPR family: PTPRD / PTPRF / PTPRS;

[0146] 6. Compared with traditional target identification methods, SPARK-seq uniquely allows for the simultaneous identification of multiple target proteins and nucleic acid aptamer sequences, thereby facilitating high-throughput screening of optimal nucleic acid aptamers and the discovery of multiple biomarkers on the cell surface; the large amount of nucleic acid aptamer-target protein interaction data generated by SPARK-seq paves the way for more precise targeting strategies in therapeutic and diagnostic applications.

Attached Figure Description

[0147] Figure 1 is an overview of the SPARK-seq workflow, which includes: (A) the Cell-SELEX process for generating a library of nucleic acid aptamers targeting cell surface proteins; (B) the binding of the enriched nucleic acid aptamer library to cells under native conditions; (C) the composition of the enriched library, with a 46-nucleotide random sequence flanked by 54-nucleotide PCR primer regions, and the 3' primer containing a capture sequence for compatibility with single-cell sequencing; (D) differential analysis of single-cell multi-omics data; and (E) single-cell sequencing and integration analysis revealing nucleic acid aptamer-protein interactions based on sgRNA perturbation and transcriptome changes.

[0148] Figure 2 shows the proof-of-concept study using the nucleic acid aptamer sgc8c: (A) Western blot analysis of PTK7 protein knockout in human SUM159 and mouse EMT6 cells, with GAPDH as the loading control and WT as the PTK7 normal expression control, N=3 biologically independent replicates; (B) flow cytometry results for binding of 100 nM sgc8c and the negative control library to human SUM159 (Control, PTK7g1, PTK7g2) and mouse EMT6 (Control, PTK7g3, PTK7g4) cells at 4°C, N=3 biologically independent replicates, with Control being wild-type; (C) 5' single-cell sequencing of the above six cell types after incubation with the aptamers, followed by tri-omics analysis of mRNA, nucleic acid aptamer, and gRNA; (D) uniform manifold approximation and projection (UMAP) plot of mixed SUM159 and EMT6 cells based on the single-cell transcriptome, in which human (blue) and mouse (yellow) cells form distinct clusters, indicating effective species separation; (E, F) UMAP plots of the expression of the different gRNAs (PTK7g1, PTK7g2, PTK7g3, PTK7g4) and of sgc8c binding in human-derived SUM159 and mouse-derived EMT6 cells; (G) binding of sgc8c in PTK7-containing and non-PTK7-containing cell populations obtained from single-cell sequencing.

[0149] Figure 3 Proof-of-concept study using the PTK7-binding aptamer sgc8c: (A) PTK7g1, PTK7g2, PTK7g3, and PTK7g4 gRNAs were introduced into SUM159 and EMT6 cells, followed by Western blot analysis to quantify PTK7 protein levels; GAPDH was used as the loading control and wild-type (WT) cells as the PTK7 expression control, N = 3 biologically independent replicates; (B) quantitative analysis of sgc8c binding fluorescence values; in the Western blot, GAPDH was used as the loading control and WT as the PTK7 expression control, N = 3 biologically independent replicates; (C) flow cytometry results for 100 nM sgc8c and the negative control library incubated at 4°C with a 1:1 mixture of human SUM159 cells (control: PTK7g1) and mouse EMT6 cells (control: PTK7g4), N = 3 biologically independent replicates, where the control (NC) was wild-type cells; (D, E) SPR association and dissociation curves with PTK7 protein at 25°C in DPBS containing 1 mM MgCl2, using the sgc8c aptamer modified with a 5' PCR primer and a 3' capture sequence (5'-AAGCTGCACGCTGACTGTACT (21 nt) ATCTAACTGCTGCGCCGCCGGGAAAATACTGTACGGTTAGA (41 nt) TACAATCTGCGATCTCCAATTTGGCTAGTCCGTTATCAACTTG (43 nt)-3'); the affinity of the modified sgc8c was 3.21±1.51 nM (D), while that of the unmodified sgc8c was 2.36±0.905 nM (E); (F) transcript counts associated with each cell barcode: red (>90% human reads), green (>90% mouse reads), and blue (>10% of both human and mouse reads, indicating multiplets), fitted using a Gaussian mixture distribution; (G) PTK7g1 count detected per cell, retaining cells with counts greater than 8.786629 for further analysis; (H) PTK7g2 count per cell, retaining cells with counts greater than 6.852745 for further analysis; (I) filtered sgRNA reads associated with each cell barcode (light blue represents PTK7g1; dark blue represents PTK7g2); (J) PTK7g3 count per cell, retaining cells with counts greater than 6.563874 for further analysis; (K) PTK7g4 count per cell, retaining cells with counts greater than 5.99208 for further analysis; (L) filtered sgRNA reads associated with each cell barcode (dark green represents PTK7g3; light green represents PTK7g4); (M) sgc8c-S count detected in each human cell, retaining cells with counts greater than 3 for further analysis; (N) sgc8c-S count detected in each mouse cell, retaining cells with counts greater than 2.1 for further analysis. Significant differences are indicated as *p<0.05, **p<0.01, ***p<0.001, and ****p<0.0001.
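
The filtering logic described in panels (F) to (K) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the caption does not state the scale on which the counts are expressed, and the rule that a cell is kept only when exactly one gRNA passes its threshold is an assumption rather than something stated in the caption.

```python
# Per-guide thresholds quoted in the Figure 3 caption (a count must exceed them).
GUIDE_THRESHOLDS = {
    "PTK7g1": 8.786629,
    "PTK7g2": 6.852745,
    "PTK7g3": 6.563874,
    "PTK7g4": 5.99208,
}


def assign_species(human_reads, mouse_reads):
    """Label a cell barcode by the fraction of its reads that map to human."""
    total = human_reads + mouse_reads
    if total == 0:
        return "empty"
    human_fraction = human_reads / total
    if human_fraction > 0.9:
        return "human"
    if human_fraction < 0.1:  # equivalent to more than 90% mouse reads
        return "mouse"
    return "multiplet"  # more than 10% of reads from each species


def assign_guide(guide_counts):
    """Return the one guide whose count exceeds its threshold, else None.

    Requiring exactly one passing guide is an assumption of this sketch.
    """
    passing = [
        guide
        for guide, count in guide_counts.items()
        if count > GUIDE_THRESHOLDS.get(guide, float("inf"))
    ]
    return passing[0] if len(passing) == 1 else None
```

For example, `assign_species(95, 5)` labels the barcode as human, and `assign_guide({"PTK7g1": 10.0, "PTK7g2": 1.0})` assigns it to PTK7g1.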

[0150] Figure 4 Selection of enriched nucleic acid aptamer libraries and knockout cells: (A) flow cytometry characterization of the binding of the R1–R4 enriched aptamer libraries to SUM159 cells; (B) flow cytometry characterization of the binding of the R4 enriched nucleic acid aptamer library to SUM159 cells at different concentrations; (C) differences in the proteins bound by the enriched nucleic acid aptamer libraries on SUM159 cells in each round, compared by affinity purification and LC-MS/MS analysis, with the differential abundance of library-bound proteins monitored across screening rounds; (D) tracking of the selected proteins that were subsequently subjected to CRISPR knockout and single-cell sequencing.

[0151] Figure 5 Comparison of aptamer binding between perturbed and control cells for the different families: (A–H) plots showing the statistical differences in aptamer binding between target knockout (KO) cells and control cells for each family. Each data point represents the median difference in aptamer binding among cell populations in which different proteins were perturbed. The significance of the binding difference is expressed by the p-value, where *p<0.05, **p<0.01, ***p<0.001, and ****p<0.0001 indicate progressively higher levels of statistical significance. The results validated the effectiveness of the aptamer families in identifying target proteins and confirmed the specificity of the aptamer-protein interactions. (J–P) Families 8, 9, 10, 12, 16, 18, 19, and 20 did not show significant differences or did not reach the algorithm threshold (e.g., family 8 had insufficient sgRNA counts; family 9 had more than three output proteins, but no protein accounted for more than 50% of the total binding score).
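
The family 9 example implies a dominance criterion: a target is only called when a single protein carries more than half of the total binding score. A minimal sketch of such a rule is given below. The function name and the score dictionary are hypothetical, and how the binding score itself is computed is not described in this caption, so only the 50% share test is taken from the text.

```python
def call_target(binding_scores, min_share=0.5):
    """Return the dominant protein, or None when no protein dominates.

    binding_scores maps a candidate protein name to a non-negative score
    (hypothetical input; the scoring itself is outside this sketch).
    """
    total = sum(binding_scores.values())
    if total <= 0:
        return None
    protein, score = max(binding_scores.items(), key=lambda item: item[1])
    # A call is made only if the top protein exceeds the required share.
    return protein if score / total > min_share else None
```

With scores of 8, 1, and 1 the top protein holds 80% of the total and is returned; with four candidates scoring 3, 3, 2, and 2, no protein exceeds 50% and the function returns `None`, mirroring the situation described for family 9.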

[0152] Figure 6 The development of a high-throughput identification system for nucleic acid aptamer libraries includes: (A) a workflow for nucleic acid aptamer library family analysis; (B) the distribution of nucleic acid aptamer families; (C) multi-omics analysis, which involves analyzing and quality-controlling omics data, and then integrating them to create a matrix with sgRNA identity, nucleic acid aptamer binding abundance, and cell-specific differences; (D) a heatmap of differential nucleic acid aptamer binding in protein-perturbed cell populations; (E) an algorithmic workflow for predicting nucleic acid aptamer-protein interactions; (F) a mapping between nucleic acid aptamer families and predicted protein targets, with red squares representing significant binding interactions between nucleic acid aptamer families and protein targets, providing a visual summary of predicted nucleic acid aptamer-protein pairs; and (G) statistical analysis of nucleic acid aptamer binding differences, with graphs comparing nucleic acid aptamer binding between CRISPR-targeted knockout cells and control groups in different families, with statistical significance expressed as *p<0.05, **p<0.01, ***p<0.001, and ****p<0.0001.

[0153] Figure 7 Validation of nucleic acid aptamer targets for eight surface proteins (PTK7, CDCP1, NRP1, NRP2, ITGA3 / ITGB1, PTPRD, PTPRF, PTPRS): the volcano plot (left) shows the log2FC and -log10(P) of nucleic acid aptamer abundance (control / knockout) in single-cell SPARK-seq analysis (two-sided Wilcoxon rank-sum test); the flow cytometry histogram (middle) compares the binding of FAM-labeled nucleic acid aptamers to SUM159 control and knockout cells, with the non-binding control aptamer (Ctrl-lib) as a negative control (n = 3 biological replicates); the SPR sensorgram (right) shows concentration-dependent binding titration of purified extracellular domains (n = 3 biological replicates); (Y) real-time interaction cytometry (RT-IC, live-cell SPR) sensorgram showing dose-dependent binding of the nucleic acid aptamer to live SUM159 cells; (Z) flow cytometry determination of the binding affinity of the nucleic acid aptamers for their target proteins; all nucleic acid aptamers were tested on SUM159 cells, except for the NRP2 aptamer Apt-14-26, which was assayed on A549 cells (n = 3 biological replicates); (AA) the Sankey diagram links 5535 unique sequences to the eight validated protein targets, showing the family size (n) and percentage of total reads, with the line thickness reflecting the number of unique sequences assigned to each target.

[0154] Figure 8 Validation of the target proteins of the nucleic acid aptamers: (A) flow cytometry histograms showing the binding of 100 nM nucleic acid aptamers (Apt-1-25, Apt-1-24, Apt-1-22, Apt-1-21, Apt-1-19, Apt-1-14, Apt-1-9, and sgc8c) to SUM159 cells and PTK7 knockout (KO) cells at 4 °C; the fluorescence intensity of each nucleic acid aptamer was compared under control and PTK7 KO conditions; the negative control was an unselected nucleic acid aptamer library, and "cells only" represents unstained cells, N = 3 independent biological replicates; (B) surface plasmon resonance (SPR) binding and dissociation kinetics of nucleic acid aptamers (Apt-1-1 and Apt-15-27) with PTK7 protein in DPBS containing 5 mM MgCl2 at 25 °C, with responses measured in relative units (RU) over time, providing insight into the binding affinity and dissociation rate of each nucleic acid aptamer; (C) Western blot showing NRP1 protein levels in SUM159 control and NRP1 KO cells, with GAPDH as a loading control; NRP1 bands were observed at 130 kDa and GAPDH at 37 kDa, N = 3 independent biological replicates; (D) flow cytometry analysis of CDCP1 antibody binding to SUM159 and SUM159 NRP2 KO cells, with IgG as a negative control; PE intensity indicates the binding of CDCP1 antibody across the different cell types, N = 3 independent biological replicates; (E) mass spectrometry (MS) identification of proteins pulled down by the Apt-2-2 aptamer; scatter plots show the relationship between log2-transformed protein binding intensity and fold change, with NRP1 highlighted as the primary target; (F) MS identification of proteins pulled down by the Apt-5-4 aptamer; based on the fold change in binding intensity, CDCP1 was highlighted as the key binding target; (G) flow cytometry analysis of NRP2 antibody binding to A-549, SUM159, and SUM159 NRP2 KO cells, with IgG as a negative control and FAM intensity representing NRP2 antibody binding across the different cell types, N = 3 independent biological replicates; (H) flow cytometry histograms showing the binding of nucleic acid aptamers Apt-11-15 and Apt-14-26 to A-549 cells, with an unselected aptamer library as the negative control and unstained cells shown as "cells only"; (I) quantitative comparison of the normalized fluorescence intensity (FL) of anti-NRP2, Apt-11-15, and Apt-14-26 binding to A-549, SUM159, and SUM159 NRP2 KO cells, with significant differences in binding marked as *p<0.05, **p<0.01, ***p<0.001, and ****p<0.0001, N = 3 independent biological replicates; (J) SPR binding and dissociation kinetics of Apt-14-26 and Apt-11-15 with NRP2 protein in DPBS containing 5 mM MgCl2 at 25 °C.

[0155] Figure 9 Validation of the target proteins of Apt-3-3: (A) Flow cytometry analysis of the binding of 100 nM Apt-3-3 to SUM159 ITGB1 knockout (KO) cells, compared with the library control and the cell-only control. Changes in fluorescence intensity highlighted the interaction between Apt-3-3 and ITGB1. Data represent the mean of three independent biological replicates; (B) Western blot analysis of ITGA3 and ITGB1 protein levels in SUM159 control cells, ITGA3 KO cells, and ITGB1 KO cells; GAPDH was used as a loading control. ITGA3 and ITGB1 bands were detected at 130 kDa, and GAPDH was detected at 37 kDa; (C) Densitometric (grayscale) quantification of ITGA3 and ITGB1 proteins, N = 2 biological replicates. Significance of differences is expressed as *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001. (D) Flow cytometry analysis of ITGA3 and ITGB1 antibody binding: binding of ITGA3 and ITGB1 antibodies to SUM159, ITGA3 KO, and ITGB1 KO cells confirmed that ITGA3 and ITGB1 are key proteins involved in Apt-3 binding. IgG was used as a negative control. Data were based on three independent biological experiments. (E) Mass spectrometry (MS) analysis of proteins isolated with the ITGA3 antibody: scatter plots show the log2-transformed protein binding intensity and the fold change relative to the library. ITGA3 and ITGB1 were again identified as major targets, confirming the binding specificity of Apt-3 for these proteins. (F) MS analysis of proteins isolated with Apt-3: scatter plots show the relationship between the log2-transformed protein binding intensity and the fold change relative to the library control. Based on binding intensity and statistical significance, ITGA3 and ITGB1 were highlighted as major target proteins.

[0156] Figure 10 Validation of aptamer binding and protein knockout: (A) An Apt-4-5 aptamer pull-down assay confirmed its binding to specific proteins. Mass spectrometry (MS) analysis showed the relationship between log2-transformed binding intensity and fold change, with PTPRF identified as the primary target, while PTPRD and PTPRS showed secondary binding. (B) The mRNA expression levels of the different proteins after knockout were quantified by RT-qPCR and normalized to control sgRNA. Data are expressed as the ratio of mRNA level to GAPDH expression, with N = 3 independent biological replicates. Each bar represents the mean mRNA expression, and error bars represent the standard deviation. Significance is indicated as ****p < 0.0001. (C) Flow cytometry quantification of the binding shift of Apt-4-5 on different SUM159 cell types (control, PTPRD KO, PTPRF KO, and PTPRS KO), with the library serving as a negative control for nonspecific binding (FNC); relative fluorescence intensity is shown compared with the IgG negative control (FNC). Data represent the mean ± SEM of N = 3 independent biological replicates. (D–H) Affinities of the nucleic acid aptamers for their respective targets were determined by flow cytometry: fluorescence intensity values (F–F0, where F is the fluorescence intensity with aptamer bound and F0 is the fluorescence intensity of cells alone) were plotted against the concentrations of Apt-2-2 (D), Apt-6-6 (E), Apt-11-15 (F), Apt-13-18 (G), and Apt-17-39 (H). The dissociation constant (Kd) of each nucleic acid aptamer was calculated by curve fitting. The R² value represents the goodness of fit, and the data represent the mean ± SEM. N = 3 independent biological replicates.
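
The Kd determination in panels (D) to (H) can be illustrated with a small fitting sketch. The caption only says "curve fitting", so the one-site binding model F − F0 = Bmax·c / (Kd + c) used here is an assumption, as are the function name, the grid-search approach, and the synthetic data; none of these are taken from the application.

```python
def fit_one_site(concs, signals, kd_min=0.01, kd_max=1e4, steps=2000):
    """Least-squares fit of signal = bmax * c / (kd + c).

    Kd is searched on a log-spaced grid; for each trial Kd the best Bmax
    has a closed-form least-squares solution. Returns (kd, bmax, r_squared).
    """
    mean_y = sum(signals) / len(signals)
    ss_tot = sum((y - mean_y) ** 2 for y in signals)
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        occupancy = [c / (kd + c) for c in concs]
        denom = sum(o * o for o in occupancy)
        if denom == 0:
            continue
        bmax = sum(o * y for o, y in zip(occupancy, signals)) / denom
        ss_res = sum((y - bmax * o) ** 2 for o, y in zip(occupancy, signals))
        if best is None or ss_res < best[2]:
            best = (kd, bmax, ss_res)
    if best is None:
        raise ValueError("at least one non-zero concentration is required")
    kd, bmax, ss_res = best
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return kd, bmax, r_squared


# Synthetic, noise-free illustration (not data from this application):
# signals generated with Kd = 10 nM and Bmax = 1000.
concs_nM = [1, 5, 10, 25, 50, 100, 200]
signals = [1000 * c / (10 + c) for c in concs_nM]
kd, bmax, r_squared = fit_one_site(concs_nM, signals)
```

In practice a nonlinear least-squares routine from a numerical library would normally replace the grid search; the grid version is used here only to keep the sketch free of external dependencies.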

[0157] Figure 11 Validation of the specificity of the nucleic acid aptamers: (A) flow cytometry competition assays of representative nucleic acid aptamers. Density maps show the binding of FAM-labeled nucleic acid aptamers to SUM159 cells (the NRP2 signal was measured in A549 cells), from bottom to top: Ctrl-lib (gray, non-binding aptamer control), FAM-labeled nucleic acid aptamer binding (binder), and FAM-labeled nucleic acid aptamer binding in the presence of a 10-fold molar excess of unlabeled non-homologous competing aptamer (indicated by an asterisk), with colors distinguishing the different competing aptamers; (B) surface plasmon resonance (SPR) sensorgrams of different nucleic acid aptamers binding to the indicated proteins, with Ctrl-lib and a homopolymeric T sequence (Ctrl-polyT) included as negative controls (n = 3 biological replicates); (C) structural alignment of the extracellular domains of NRP1 (green) and NRP2 (brownish), with the root mean square deviation (RMSD) of the N-terminal, intermediate, and C-terminal domains indicating high overall structural similarity; (D) Apt-2-2 (green, Alexa Fluor); (E) flow cytometry competition of Apt-2-2 with a known NRP1 ligand (SUM159 cells), from bottom to top: Ctrl-lib (gray, non-binding aptamer control), FAM-labeled aptamer binding (binder), and FAM-labeled aptamer binding in the presence of a 10-fold molar excess of the known NRP1 ligand (indicated by an asterisk) (n = 3 biological replicates); (F) flow cytometry competition of Apt-14-26 with a known NRP2 ligand (A549 cells), from bottom to top: Ctrl-lib (gray, non-binding aptamer control), binding of Alexa Fluor 647-labeled aptamer (binder), and binding of Alexa Fluor 647-labeled aptamer in the presence of a 10-fold molar excess of the known ligand (indicated by an asterisk) (n = 3 biological replicates); (G) quantitative SUM159 cell surface calibration, top panel: banding of the eight validated target proteins, bottom panel: fluorescence-based calibration curves converting average fluorescence intensity into protein copy number per cell, thus revealing target protein abundance across the entire panel (n = 3 biological replicates, mean ± SD).

[0158] Figure 12 Orthogonal validation of the interaction between nucleic acid aptamers and targets through competition analysis and pull-down experiments: (A) homologous nucleic acid aptamer competition, performed by titrating unlabeled nucleic acid aptamer (0-10 μM) against a fixed amount of Alexa Fluor 647-labeled nucleic acid aptamer (20 nM) (n = 3 biological replicates); (B) competition curves, with the x-axis representing the concentration ratio of competing nucleic acid aptamer (competitor) to labeled nucleic acid aptamer (binder) and the y-axis representing the residual fluorescence intensity; (C) streptavidin pull-down using biotinylated nucleic acid aptamers, followed by Western blot analysis with the specified target-specific antibodies, with Ctrl-lib and Ctrl-polyT serving as negative controls.

[0159] Figure 13 Physicochemical properties and expression abundance of the eight nucleic acid aptamer targets. (A–E) Structural and biophysical properties of each target (PTK7, CDCP1, ITGA3, NRP1, NRP2, PTPRD, PTPRF, and PTPRS), arranged from left to right as follows: (A) protein name and isoelectric point (pI) of the extracellular domain (ECD); (B) ECD hydrophobicity profile, showing the GRAVY index (top) and the Kyte & Doolittle scale (bottom); (C) schematic diagram of the ECD domain architecture, illustrating key motifs and repeat sequences; (D) electrostatic surface potential maps from two opposite views (rotated 180°), calculated using the Adaptive Poisson-Boltzmann Solver (APBS) and visualized in PyMOL; (E) enlarged view of the boxed area in panel (D), highlighting positively charged, surface-exposed pockets that may interact with negatively charged nucleic acid aptamers; (F) protein levels in SUM159 cells quantified using data-dependent acquisition (DDA) on an Orbitrap Fusion LC-MS/MS system (Uddin M.H. et al., Front. Oncol. 2022, 12:908603), identifying 6447 proteins (ND indicates "not detected"); (G) protein levels in SUM159 cells quantified using data-independent acquisition (DIA) with 24-minute gradients on a Thermo Fisher Astral platform, expanding the proteome coverage to 10,970 proteins (each data point represents the mean of three independent biological replicates); (H) cell surface copy number of each target determined using fluorescence calibration analysis (n = 3 biological replicates; mean ± SD).

[0160] Figure 14 Benchmarking of two target identification methods. (A) Target identification results for the top-ranked sequence in each aptamer family, determined by IP-MS and SPARK-seq. A validated target is strictly defined as the candidate protein ranked first in all three biological replicates. If the top-ranked identifications are inconsistent, they are marked as "NA". A line connecting the blue (IP-MS) and red (SPARK-seq) dots indicates that the two methods identified the same target for a given aptamer. (B) Target identification results for the top 10 sequences in the CDCP1 aptamer family, determined by IP-MS and SPARK-seq. "NA" indicates no reliable target identification.

[0161] Figure 15 Obtaining aptamers with different dissociation rates: (a) conserved regions of 3096 aptamers targeting PTK7 protein, with letter size representing the frequency of each base at that position. A negative log2[fold change] indicates reduced aptamer binding on knockout cells compared with control cells; the larger its magnitude, the greater the difference; (b) single-cell sequencing results (left) and flow cytometry results (right) for the binding of Apt-1-21 (and -1, -25, -14, -19, -9, -22, -24) to PTK7 knockout cells and control cells, N = 3 biologically independent replicates; (c) comparison of the binding differences obtained by single-cell sequencing and flow cytometry (N = 3 biologically independent samples); (d) correlation between the binding differences of the different nucleic acid aptamers and the affinity characterized by flow cytometry, as well as the correlation with dissociation rate (e: SPR, f: flow cytometry) (N = 3 biologically independent samples); (g) dissociation rates of the different nucleic acid aptamers measured by SPR (N = 3 biologically independent samples); (h) mutations (in color) of the nucleic acid aptamers in the conserved regions.

[0162] Figure 16 Correlation between differential logFC and dissociation rate: (a) flow cytometry analysis at 4°C of the affinity of nucleic acid aptamers (Apt-1-21, Apt-1-1, Apt-1-25, Apt-1-14, Apt-1-19, Apt-1-9, Apt-1-22, and Apt-1-24) at different concentrations for SUM159 cells. F0 represents "cells only", i.e., the fluorescence background of unstained cells. R² is a goodness-of-fit index measuring the fit between the model and the data (the closer to 1, the better the fit). N = 3 independent biological replicates; (b) surface plasmon resonance (SPR) binding and dissociation kinetics of the nucleic acid aptamers (Apt-1-21, Apt-1-1, Apt-1-25, Apt-1-14, Apt-1-19, Apt-1-9, Apt-1-22, and Apt-1-24) with PTK7 protein in DPBS containing 5 mM MgCl2 at 25°C, with the response measured in relative units (RU) over time, providing insight into the relationship between the binding affinity and the differential logFC of each nucleic acid aptamer, N = 3 independent biological replicates; (c) relationship between the association rate constant (kon) measured by SPR and the differential logFC, N = 3 independent biological replicates; (d) SPR binding and dissociation kinetics of the above aptamers with different concentrations of PTK7 protein and the corresponding KD values, measured in DPBS containing 5 mM MgCl2 at 25°C, N = 3 independent biological replicates; (e–h) SPR sensorgrams of the interaction between aptamers (200 nM) and recombinant CDCP1 (e), NRP1 (f), NRP2 (g), and PTPRF (h), measured at 25°C in DPBS supplemented with 1 mM MgCl2, with distinct concentration-dependent binding and dissociation curves observed for each aptamer-target pair; (i–l) correlation between -log2(FC) and the binding kinetics measured by SPR. Although no general trend was found for the equilibrium dissociation constant (KD) or the association rate constant (kon), a consistent negative correlation was observed between -log2(FC) and the dissociation rate (koff), suggesting that nucleic acid aptamers with higher KO-dependent enrichment tend to exhibit slower dissociation and stronger retention. All flow cytometry and SPR measurements were repeated three times (n = 3 biological replicates). Error bars represent standard deviation (SD), and dashed lines represent 95% confidence intervals.

[0163] Figure 17 Effect of conserved-region mutations on the binding of nucleic acid aptamers to PTK7 protein: (a) surface plasmon resonance (SPR) binding and dissociation kinetics of PTK7 protein with different mutant nucleic acid aptamers (200 nM) in DPBS containing 1 mM MgCl2 at 25 °C. The "normal" sequence refers to the unmutated conserved-region sequence that binds PTK7. Mutations were as follows: "4A8T" indicates mutations at positions 4(A) and 8(T); "4G8C" indicates mutations at positions 4(G) and 8(C); "4T8A" indicates mutations at positions 4(T) and 8(A); "4A" indicates a mutation only at position 4(A); and "8A" indicates a mutation only at position 8(A). (b) Binding and dissociation kinetics of PTK7 protein with different mutant nucleic acid aptamers under the same conditions (200 nM nucleic acid aptamers, 25 °C, 1 mM MgCl2 in DPBS). (c) SPR binding and dissociation kinetics of PTK7 protein with different groups of mutant aptamers (normal, 9A20T, 9T20A, 9G20C, 9T, and 20A); (d) SPR response (relative units, RU) of the various mutant aptamers at 180 seconds compared with controls (Norm represents the aptamer with the unmutated conserved region, N represents a random mutation of a specific base to any of the other three nucleotides, and Control represents a random aptamer library). A total of 43 mutant clusters were examined, of which 13 clusters (30.23%, shown in red) showed strong binding to PTK7. (d) Distribution of aptamers matching the conserved sequences among the 3096 PTK7-binding aptamer sequences, of which 752 aptamers (92.50%) showed strong binding, while 61 (7.50%) showed weak or no binding; (e) analysis of the number of reads corresponding to clusters matching the conserved regions among the 3096 PTK7-binding aptamer sequences, of which 29,313,117 reads (99.79%) represented strongly binding aptamers, while 62,555 reads (0.21%) corresponded to weakly binding or non-binding aptamers.

[0164] Figure 18 Screening of aptamers with slow dissociation rates: relationship between the enrichment and the binding difference of aptamers (selected by conventional sequencing methods) binding to PTK7 (a, b), NRP1 (c, d), NRP2 (e, f), and PTPRD / F / S (g, h). Flow cytometry was used to characterize the dissociation of aptamers (Apt-1-7 and Apt-1-21, Apt-6-292 and Apt-2-2, Apt-11-116 and Apt-11-15, Apt-4-200 and Apt-4-5) over time on SUM159 cells (b, d, h) or A-549 cells (f), with n = 3 biological replicates.

[0165] Figure 19 Screening of aptamers with slow dissociation rates and characterization of their affinity: (a) correlation of the abundance of each aptamer sequence between single-cell sequencing and conventional sequencing; relationship between the enrichment and the binding difference values of aptamers (selected by conventional sequencing methods) bound by CDCP1 (b, c) and ITGA3 (d, e). Flow cytometry was used to characterize the dissociation of aptamers Apt-13-18 and Apt-5-4, and Apt-3-191 and Apt-3-3, over time on SUM159 cells (c, e) or A-549 cells (f), with n = 3 biological replicates. (f–h) Flow cytometry affinity measurements of Apt-4-200 (f), Apt-6-292 (g), and Apt-11-116 (h) at different concentrations at 4°C. NC represents the background fluorescence of the library at different concentrations; R² represents the goodness of fit (the closer to 1, the better the fit); N = 3 independent biological replicates.

[0166] Figures 20-27 show partial N46 sequences of nucleic acid aptamers that bind the CDCP1 protein and share at least 60% homology with one another;

[0167] Figures 28-46 show partial N46 sequences of nucleic acid aptamers that bind the NRP1 protein and share at least 60% homology with one another;

[0168] Figures 47-51 show partial N46 sequences of nucleic acid aptamers that bind the NRP2 protein and share at least 60% homology with one another;

[0169] Figures 52-60 show partial N46 sequences of nucleic acid aptamers that simultaneously bind the PTPRD, PTPRF, and PTPRS proteins and share at least 60% homology with one another;

[0170] Figures 61-65 show partial N46 sequences of a library of nucleic acid aptamers that bind the ITGA3 protein and share at least 60% homology with one another.

Detailed Implementation

[0171] To make the inventive objectives, technical solutions, and beneficial effects of this application clearer, the following description, in conjunction with embodiments, further illustrates this application. It should be understood that the embodiments described are for illustrative purposes only and are not intended to limit the scope of the application. Unless otherwise specified, the experimental methods used in the following embodiments are conventional methods, and those skilled in the art can easily understand other advantages and effects of this application from the content disclosed in this description.

[0172] The following examples are provided to facilitate a better understanding of the present invention. Unless otherwise specified, the experimental methods used in the following examples are conventional methods. Unless otherwise specified, the experimental materials used in the following examples were all purchased from conventional biochemical reagent stores.

[0173] Example 1: Workflow of the SPARK-seq method

[0174] I. Cell-SELEX screening of nucleic acid aptamer subsets

[0175] The process of Cell-SELEX screening for nucleic acid aptamer libraries targeting cell surface proteins is shown in Figure 1A; the details are as follows:

[0176] (I) Synthesize the random single-stranded DNA library and primers with the following sequences:

[0177] Random single-stranded DNA library TSO: 5'-CTCGTGGGCTCGGAGATGTGTATAAGAGACAG-N46-GCAGCTCGGCCCATATAAGAAA-3'(N46)

[0178] In this context, "N46" represents a sequence consisting of 46 arbitrary nucleotide bases linked together. This library and subsequent sequences were synthesized by Sangon Biotech (Shanghai) Co., Ltd. Primer information is shown in Table 1:
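
As a rough illustration of what an N46 random region implies, the sketch below compares the theoretical sequence space with the number of molecules in 1 nmol of library (the amount used in the first screening round). The fixed-region sequences are copied from the library definition above; the function and variable names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

FIVE_PRIME = "CTCGTGGGCTCGGAGATGTGTATAAGAGACAG"  # fixed 5' region
THREE_PRIME = "GCAGCTCGGCCCATATAAGAAA"           # fixed 3' region
RANDOM_LENGTH = 46                               # N46 random region


def library_stats(amount_nmol):
    """Return (total length, sequence space, molecules, coverage fraction)."""
    total_length = len(FIVE_PRIME) + RANDOM_LENGTH + len(THREE_PRIME)
    sequence_space = 4 ** RANDOM_LENGTH          # all possible N46 sequences
    molecules = amount_nmol * 1e-9 * AVOGADRO    # molecules physically present
    return total_length, sequence_space, molecules, molecules / sequence_space


total_length, sequence_space, molecules, coverage = library_stats(1.0)
```

The full-length oligonucleotide is 100 nt, and 1 nmol (about 6 × 10^14 molecules) samples only a minute fraction of the roughly 5 × 10^27 possible N46 sequences, so essentially every molecule in the starting library is expected to be unique.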

[0179] Table 1 Primers and their sequences

[0180]
Primer name        Sequence (5'-3')
TSO-For-FAM        FAM-CTCGTGGGCTCGGAGATGT (SEQ NO.1)
TSO-Rev-biotin     Biotin-TTTCTTATATGGGCCGAGCTGC (SEQ NO.2)
TSO-For-Q          CTCGTGGGCTCGGAGATGT (SEQ NO.3)
TSO-Rev-Q          TTTCTTATATGGGCCGAGCTGC (SEQ NO.4)

[0181] Note: "For" in primer names indicates a forward primer and "Rev" a reverse primer; biotin binds streptavidin (SA), and FAM is a fluorescent reporter group.
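
A quick consistency check of the primers in Table 1 against the fixed regions of the library can be sketched as follows. The sequences are copied from the text with the FAM and biotin labels omitted; the helper and variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


LIBRARY_5P = "CTCGTGGGCTCGGAGATGTGTATAAGAGACAG"  # fixed 5' region of the library
LIBRARY_3P = "GCAGCTCGGCCCATATAAGAAA"           # fixed 3' region of the library
FORWARD_PRIMER = "CTCGTGGGCTCGGAGATGT"          # TSO-For (label omitted)
REVERSE_PRIMER = "TTTCTTATATGGGCCGAGCTGC"       # TSO-Rev (label omitted)

# The forward primer should match the start of the 5' fixed region, and the
# reverse primer should be the reverse complement of the 3' fixed region.
forward_ok = LIBRARY_5P.startswith(FORWARD_PRIMER)
reverse_ok = reverse_complement(LIBRARY_3P) == REVERSE_PRIMER
```

Both checks hold for the sequences as listed: the forward primer covers the first 19 nt of the 5' fixed region, and the reverse primer is the exact reverse complement of the 22-nt 3' fixed region.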

[0182] The library and primers (Table 1) were prepared into 100 μM stock solutions using calcium- and magnesium-free DPBS buffer and stored at -20°C for later use. Wash buffer (WB) consisted of DPBS buffer (pH 7.4), 1 mM MgCl2, and 4.5 g / L glucose. Binding buffer (BB) consisted of wash buffer with 1 mg / mL bovine serum albumin (BSA) and 0.1 mg / mL herring sperm DNA. 1 mL of 5X PCR Mix (reagents purchased from TaKaRa) consisted of 500 μL of 10X enzyme buffer, 200 μL of dNTPs (2.5 mM), 30 μL of Taq enzyme, 50 μL of TSO-For-FAM (50 μM), 50 μL of TSO-Rev-biotin (50 μM), and 170 μL of ddH2O.
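
The composition of the 5X PCR Mix can be checked with a short dilution calculation. This is only an arithmetic sketch: the dictionary and function names are hypothetical, the stock concentrations are those stated above, and whether "2.5 mM" refers to each dNTP or to the total is not specified in the text.

```python
# Volumes (in microliters) of each component in 1 mL of 5X PCR Mix, as listed.
MIX_COMPONENTS_UL = {
    "10X enzyme buffer": 500,
    "dNTPs (2.5 mM)": 200,
    "Taq enzyme": 30,
    "TSO-For-FAM (50 uM)": 50,
    "TSO-Rev-biotin (50 uM)": 50,
    "ddH2O": 170,
}


def diluted(stock_concentration, volume_ul, total_ul):
    """Concentration after diluting volume_ul of stock into total_ul."""
    return stock_concentration * volume_ul / total_ul


total_ul = sum(MIX_COMPONENTS_UL.values())
buffer_fold = diluted(10, 500, total_ul)   # buffer strength in the mix (X)
dntp_mM = diluted(2.5, 200, total_ul)      # dNTP concentration in the mix
primer_uM = diluted(50, 50, total_ul)      # each primer in the mix

# A 5-fold dilution of the mix gives the working (1X) concentrations.
working_dntp_mM = dntp_mM / 5
working_primer_uM = primer_uM / 5
```

The listed volumes sum to 1000 μL, the buffer ends up at 5X, and a 5-fold dilution to working strength gives 0.1 mM dNTPs and 0.5 μM of each primer.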

[0183] (II) The specific screening method is as follows:

[0184] 1. Cell Culture

[0185] All cells used in the experiment were obtained from the American Type Culture Collection (ATCC), including MCF-10A cells (healthy cells, cultured in MCF-10A-specific medium) and SUM159 cells (disease cells, triple-negative breast cancer cells, cultured in DMEM medium supplemented with 10% FBS and 1% penicillin-streptomycin). All cells were cultured at 37°C in an incubator with 5% CO2. The digestion solution used for cell passage was 0.25% Trypsin-EDTA, and the cryopreservation solution was a commercially available serum-free cryopreservation solution.

[0186] 2. The specific steps for cell screening are as follows:

[0187] (1) First round of screening

[0188] 1) Library preparation: Dissolve 1 nmol TSO library in 200 μL of DPBS, mix well, denature at 95 °C for 10 min, and cool on ice for 10 min.

[0189] 2) Prepare positive screening target cells SUM159 cells and negative screening MCF-10A cells (newly revived cells should be cultured for three generations or more). Culture the cells until they reach 90% confluence and adhere more naturally. Use 10cm×2cm culture dishes for the first round of screening.

[0190] 3) Remove the culture medium and wash twice with WB buffer stored at 4°C.

[0191] 4) Add the prepared TSO single-stranded DNA (ssDNA) library solution to 800 μL of BB buffer for resuspending and mix well.

[0192] 5) Add the resuspension (TSO single-chain library) from the previous step to an MCF-10A cell culture dish for negative screening, incubate on a shaker at 4°C for 30 min, then take the supernatant and add it to a SUM159 cell culture dish for positive screening, incubate on a shaker at 4°C for 30 min.

[0193] 6) After incubation, remove the supernatant and gently wash three times with WB buffer. Ensure that the sequences on the cells are not washed away to avoid sequence loss.

[0194] 7) After washing, add 300 μL of ultrapure water and collect the cells in a 1.5 mL EP tube.

[0195] 8) The collected cell suspension was heated at 95°C for 10 min. After denaturation, it was immediately placed on ice for 10 min and then placed at room temperature (15 min) for later use as a template for PCR.

[0196] 9) PCR amplification

[0197] a) PCR1 system: 300 μL supernatant + 100 μL 5X Mix + 100 μL ultrapure water, annealing temperature set at 60℃, cycle number 11;

[0198] b) Optimization of the PCR2 cycle number: 1 μL PCR1 product + 10 μL 5X Mix + 39 μL ultrapure water, aliquoted into 5 tubes at 10 μL per tube. Use the optimal annealing temperature and set the cycle numbers from 6 to 14 at 2-cycle intervals. Examine the bands on an agarose gel to determine the optimal cycle number (judged by the uniformity of the bands at the different cycle numbers).

[0199] c) PCR2: PCR2 was performed under optimized conditions (PCR2 system: 10 μL PCR1 product + 100 μL 5X Mix + 390 μL ultrapure water).

[0200] 10) Preparation of ssDNA

[0201] a) Add 70 μL of streptavidin agarose bead suspension to an empty microcolumn and filter under pressure. The remaining beads are in the upper part of the filter cartridge.

[0202] b) Wash the column twice with 100 μL of DPBS.

[0203] c) Pass the solution obtained from PCR2 through a microcolumn.

[0204] d) Wash twice with 100 μL of DPBS each time.

[0205] e) Rinse the microcolumn with 200 μL of 0.2 M NaOH solution (containing 0.2 M NaCl to prevent biotin and SA dissociation) and collect the eluent.

[0206] f) Desalting using NAP-5 desalting column (GE Healthcare, UK). Prewash the desalting column with deionized water, at least 15 mL.

[0207] g) Add 200 μL of ssDNA to the column.

[0208] h) After all the contents have entered the column, add 300 μL of ultrapure water.

[0209] i) After all the DNA has entered the column, add 700 μL of ultrapure water to elute the ssDNA (keep the desalting column moist).

[0210] j) Collect the eluent using a 1.5 mL EP tube, with a volume of approximately 700 μL.

[0211] k) Quantify the ssDNA by UV absorbance at 260 nm.

[0212] l) Vacuum dry and freeze at -20℃.

[0213] m) Label the prepared single-stranded DNA for use in the next round of screening.

[0214] (2) Rounds 2 to 4 of screening

[0215] 11) Denature the single-stranded DNA prepared in the previous round and resuspend it in binding buffer.

[0216] 12) To further enhance the affinity and specificity of the nucleic acid aptamers, the screening pressure is gradually increased during the screening process by reducing the number of positive-screening cells, increasing the number of negative-screening cells, increasing the number of washing cycles, and increasing the content of BSA and herring sperm DNA.

[0217] After completing four rounds of screening, the PCR products from the fourth round were characterized by flow cytometry and then used for high-throughput sequencing.

[0218] (III) Flow cytometry monitoring and high-throughput assays

[0219] (1) Flow cytometry monitoring

[0220] Library preparation: The prepared single-stranded DNA libraries of different rounds were denatured at 95°C for 5 min, cooled on ice for 5 min, and annealed at room temperature for 15 min for later use.

[0221] Cell suspension preparation: Prepare SUM159 cells, remove the culture medium, wash the cells three times with DPBS, and then digest them in DPBS solution containing 5 mM EDTA at 37°C for 3 min. Gently pipette the cells and collect them into 1.5 mL EP tubes. Add washing buffer, centrifuge and wash twice, and resuspend in binding buffer for later use.

[0222] The enrichment of the library during the screening process was monitored in real time by flow cytometry. As shown in Figure 1B, the binding of the enriched libraries from successive rounds to the target cell line SUM159 gradually became stronger as the number of screening rounds increased.

[0223] (2) High-throughput measurement

[0224] Based on the flow cytometry monitoring results, we selected the single-stranded DNA library obtained in the fourth round of screening. After PCR amplification, products showing a clear, bright target band and no nonspecific bands were selected for further amplification.

[0225] After amplification, the library was sequenced by Sangon Biotech (Shanghai) Co., Ltd. Analysis of the high-throughput sequencing results showed that most random regions were about 46 bases long, consistent with the designed library sequences (Figure 1C).

[0226] II. Screening of target proteins (candidate target proteins) based on nucleic acid aptamer enrichment libraries

[0227] 1. Affinity purification of nucleic acid aptamer libraries and enrichment of target proteins by LC-MS / MS

[0228] To enable protein pulldown based on nucleic acid aptamer libraries, standard FAM-labeled ssDNA libraries are converted to biotin-labeled ssDNA libraries. The workflow employed in this invention is as follows:

[0229] 1) Preparation of biotinylated nucleic acid aptamers

[0230] For each selection round, 20 μL of PCR1 product was mixed with 400 μL of 5× Biotin-TSO PCR mixture, which consisted of 1 mL 10× PCR buffer, 600 μL 2.5 mM dNTPs, 100 μL 50 μM forward primer TSO-For-biotin (biotin-CTCGTGGGCTCGGAGATGT), 100 μL 50 μM reverse primer TSO-Rev-polyA20 (AAAAAAAAAAAAAAAAAAAAA / iSp18 / TTTCTTATATGGGCCGAGCTGC), 50 μL Taq DNA polymerase (TaKaRa, TKR-RR001A), and 150 μL ddH2O (stored at -20°C). The final volume was adjusted with sterile water, and PCR was performed for 20 cycles of 95°C for 30 s, 60°C for 30 s, and 72°C for 30 s.

[0231] 2) DNA concentration and purification

[0232] Transfer the PCR product to a 15 mL centrifuge tube and add 4-5 volumes of n-butanol to precipitate the DNA. Invert the centrifuge tube to mix thoroughly. If there is no turbidity, add 50-100 μL of ultrapure water. Centrifuge the sample at 7,500 g for 10 minutes (centrifuge tube strength has been pre-validated to prevent breakage). After centrifugation, the solution will become clear and phase separation will occur. Transfer the lower phase, rich in PCR product, to a 1.5 mL centrifuge tube and centrifuge at 13,000 g for 2 minutes. Aspirate the remaining n-butanol, leaving approximately 100 μL of concentrated product.

[0233] 3) Denaturing PAGE and ssDNA purification

[0234] Add 2×TBE-urea loading buffer (Sangon Biotech, C506046) to the concentrated product. Load the sample onto an 8% denaturing PAGE gel and run at 300 V for 15 minutes. Excise the band at approximately 100 nt, using the DNA ladder as a reference (a weak FAM signal may be visible at approximately 100 nt under UV light due to residual FAM-labeled single-stranded DNA; avoid the 120 nt antisense band containing TSO-Rev-polyA20). Recover the DNA from the band using the EZ-10 Spin Column DNA PAGE gel extraction kit (Sangon Biotech, B610357) and quantify the purified DNA using a NanoDrop spectrophotometer.

[0235] 2. Target recognition based on LC-MS / MS

[0236] Target proteins interacting with the enriched nucleic acid aptamer library were identified by liquid chromatography-tandem mass spectrometry (LC-MS / MS).

[0237] a) Membrane protein extraction: SUM159 cells (>5×10⁷) were dissociated with enzyme-free cell dissociation buffer at 37°C for 6 minutes. The supernatant was discarded, and membrane proteins were extracted using a membrane and cytoplasmic protein extraction kit (Beyotime, P0033).

[0238] b) Nucleic acid aptamer library-based pulldown: For each of rounds R1-R4, 100 nM of the biotinylated enriched aptamer library (with the initial biotinylated library R0 as a control) was heat-denatured at 95 °C and refolded at 4 °C. Each aptamer library was then incubated with membrane proteins (pre-blocked with binding buffer containing 10% FBS) under gentle rotation at 4 °C for 1 hour. Subsequently, 50 μL of streptavidin (SA) agarose beads (Cytiva, 17511301), pre-blocked for 1 hour in TBST containing 5% BSA, were added, and incubation continued at 4 °C for 1 hour. After centrifugation, the agarose beads were washed three times with washing buffer to remove unbound proteins.

[0239] c) Protein elution and SDS-PAGE electrophoresis: Add an equal volume of protein loading buffer (Sangon Biotech, C506025) to the bead pellet. Denature the sample at 100°C for 10 min, cool at 4°C for 10 min, and centrifuge to collect the supernatant. Perform SDS-PAGE using a 4-20% gradient gel (GenScript, M42012C) in GenScript electrophoresis buffer. Rinse the wells with electrophoresis buffer before loading. Load 20 μL of sample; if more sample is needed, load another 20 μL after 10 min at 60 V. Run the gel at 60 V for 30 min, then at 120 V for 70 min, until the bromophenol blue front reaches the bottom. Stain the gel with Coomassie Brilliant Blue staining solution (Sangon Biotech, E607056) for more than 2 hours, then destain with Coomassie Brilliant Blue destaining solution (Sangon Biotech, E607057) until the bands are clearly visible. The gel strips were cut out and sent to Westlake Omics (Astral DIA, 24 min) for proteomics analysis.

[0240] 3. Data Analysis and Filtering

[0241] To avoid interference from false signals, the protein list was filtered against the 5567 membrane proteins annotated in the UniProt database to remove non-membrane proteins, leaving 239 membrane proteins. In the negative control experiment, the initial library (R0), labeled with biotin and attached to streptavidin-coated magnetic beads, was used to eliminate DNA-binding and bead-binding proteins. Proteins were ranked according to the difference between each round and R0. For example, for R4, the N value was calculated as N = (amount of protein bound by R4) - (amount of protein bound by R0). Proteins were ranked by N value, and those with larger N values were selected as target proteins. Furthermore, this embodiment only considered proteins identified with high confidence by mass spectrometry (proteins with a binding count greater than 8 in R4). As shown in Figure 4C, after comprehensive evaluation, 13 proteins were selected as candidate target proteins for further identification: PTPRF, NRP1, PTK7, ITGA3, NRP2, ITGB1, PTPRS, PTPRD, CD151, SLC25A5, RAB23, RAB5C, and CDCP1.
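The ranking rule described above can be sketched in a few lines of Python. This is only an illustration, assuming the per-protein quantities for R4 and R0 are available as dictionaries; the function name `rank_candidates`, the placeholder protein names, and all numbers are hypothetical and are not data from this embodiment.

```python
from typing import Dict, List, Set, Tuple


def rank_candidates(
    r4: Dict[str, float],
    r0: Dict[str, float],
    membrane: Set[str],
    min_r4: float = 8.0,
) -> List[Tuple[str, float]]:
    """Rank membrane proteins by N = (amount bound by R4) - (amount bound by R0)."""
    ranked = []
    for protein, r4_amount in r4.items():
        if protein not in membrane:
            continue  # keep only proteins on the membrane-protein list
        if r4_amount <= min_r4:
            continue  # keep only proteins with a binding count greater than 8 in R4
        # A protein absent from R0 is treated as 0 here (an assumption).
        n_value = r4_amount - r0.get(protein, 0.0)
        ranked.append((protein, n_value))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Toy input with placeholder names (hypothetical values).
r4_amounts = {"PROT_A": 40.0, "PROT_B": 55.0, "PROT_C": 90.0, "PROT_D": 6.0}
r0_amounts = {"PROT_A": 5.0, "PROT_B": 10.0, "PROT_C": 85.0}
membrane_set = {"PROT_A", "PROT_B", "PROT_D"}
ranking = rank_candidates(r4_amounts, r0_amounts, membrane_set)
```

In this toy input, PROT_C is dropped because it is not on the membrane list and PROT_D because its R4 count does not exceed 8, so only PROT_B and PROT_A are ranked.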

[0242] III. Constructing a Perturbed Cell Population using CRISPR-Cas9

[0243] For proof-of-concept experiments, we used the pLentiCRISPRv2 plasmid (Addgene, 52961) as the vector. In the high-throughput identification phase, we used pLentiCas9-BFP (Addgene, 78545) and lentiGuide-Puro (Addgene, 52963) to target specific genes (the genes corresponding to the 13 candidate target proteins, with one target protein gene knocked out in each perturbed cell population). Guide RNAs (gRNAs) were designed using CRISPR design tools to minimize off-target effects. Each gRNA sequence (Table 2) was synthesized and cloned into the BsmBI restriction site of the pLentiCRISPRv2 plasmid following the Zhang Lab assembly method. After ligation, the constructs were transformed into Stbl3 competent cells by heat shock. The plasmids were purified using the EndoFree Mini Plasmid Kit, and Sanger sequencing confirmed successful gRNA insertion.

[0244] For lentivirus production, HEK293T cells were co-transfected with lentiCas9-Blast, the gRNA-containing lentiGuide-Puro or pLentiCRISPRv2 plasmid, and packaging plasmids pRSV-Rev (Addgene, 12259), pMDLg / pRRE (Addgene, 12251), and pMD2.G (Addgene, 12253) using Lipofectamine 3000 (Thermo Fisher Scientific). Forty-eight hours post-transfection, viral supernatant was collected, filtered, and used to transduce target cells in the presence of 8 μg / mL polybrene. Successfully transduced cells were selected with puromycin (1 μg / mL) for 48 hours. For pLentiCas9-BFP, transduced cells were isolated by fluorescence-based cell sorting and then expanded. LentiGuide-Puro viruses containing different gRNAs were introduced into these cells in the presence of 8 μg / mL polybrene, and successfully transduced cells were then selected with puromycin (1 μg / mL) for 48 hours. Knockout efficiency was assessed by Western blotting, flow cytometry using specific antibodies, or RT-qPCR.

[0245] IV. Single-cell multi-omics sequencing

[0246] The R4 library (single-stranded DNA library obtained in the 4th round of screening) was dissolved in DPBS, denatured at 95°C for 10 min, annealed at 4°C for 10 min, and diluted with BB to a final concentration of 8 μM.

[0247] Premixed cells carrying different gene perturbations (a total of 46 gRNAs, mixed in equal volumes 3 days in advance) were dissociated using an enzyme-free dissociation buffer and washed once with washing buffer (WB), and 1×10⁶ cells were then resuspended in 50 μL of BB.

[0248] After 10 minutes, add 50 μL of R4 library (8 μM) and incubate at 4°C for 30 minutes. Centrifuge the cells, wash three times, and resuspend them in DPBS containing 1 mM Mg²⁺. Filter the cells through a 40 μm filter to obtain a single-cell suspension. After incubation at 4°C for 30 minutes, a single-cell RNA sequencing library was generated using the 10X Genomics Next GEM Single Cell 5' Kit v2 (Dual Indexing) according to the manufacturer's instructions.

[0249] CRISPR perturbation sequences were integrated into the library by capturing gRNA transcript barcodes (5′-GGCTAGTCCGTTATCAACTTG-3′), and aptamer sequences were integrated into the library by capturing aptamer library barcodes (5′-GCAGCTCGGCCCATATAAGAAA-3′). The libraries were pooled and sequenced on an Illumina NovaSeq 6000 platform. Data were processed using the Cell Ranger pipeline, and differential expression analysis was performed in RStudio. Cell quality control was performed by analyzing mRNA, cell populations were confirmed by analyzing guide RNA expression, and differences in aptamer binding among the cell populations were analyzed.

[0250] V. SPARK-seq mapping of 5535 nucleic acid aptamer-target interactions

[0251] 1. Data Preprocessing for SPARK-seq Analysis Method

[0252] Raw data was processed using 10x CellRanger (v.6.0.0), including the extraction and analysis of cell barcodes and UMI barcodes, followed by alignment with the reference genome “Homo_sapiens.GRCh38.dna_sm.toplevel.fa”. This process generated gene expression and gRNA UMI count matrices via the gene expression output and CRISPR Guide Capture output modules, respectively. Subsequently, aptamer sequences (45-47 bp) were extracted from Read2 of the aptamer omics data, upstream of the adaptor primer (GCAGCTCGGCCCATATAAGAAA), while cell barcode information was obtained from the corresponding Read1 sequence. This enabled the generation of UMI count matrices for the aptamer sequences.
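The aptamer extraction step can be illustrated with a short Python sketch. It assumes that the random region in Read2 is delimited by the forward primer region and the adaptor primer given in this document, and that cell barcode and UMI have already been parsed from Read1; the function names `extract_aptamer` and `count_umis` and the toy barcodes are hypothetical, and real reads may require error-tolerant matching.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

FORWARD = "CTCGTGGGCTCGGAGATGTGTATAAGAGACAG"  # forward primer region of the library
ADAPTOR = "GCAGCTCGGCCCATATAAGAAA"            # adaptor primer downstream of the aptamer


def extract_aptamer(read2: str, min_len: int = 45, max_len: int = 47) -> Optional[str]:
    """Return the 45-47 nt region between FORWARD and ADAPTOR, or None."""
    fwd_pos = read2.find(FORWARD)
    if fwd_pos == -1:
        return None
    start = fwd_pos + len(FORWARD)
    end = read2.find(ADAPTOR, start)
    if end == -1:
        return None
    insert = read2[start:end]
    return insert if min_len <= len(insert) <= max_len else None


def count_umis(records: Iterable[Tuple[str, str, str]]) -> Dict[Tuple[str, str], int]:
    """Count distinct UMIs per (cell barcode, aptamer) from (barcode, umi, read2) records."""
    seen = defaultdict(set)
    for barcode, umi, read2 in records:
        aptamer = extract_aptamer(read2)
        if aptamer is not None:
            seen[(barcode, aptamer)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Toy records: the 46 nt example sequence from this document, two cells.
n46 = "GAGTGGCATCTATTACTTAGTGCTACGGCTCTCGGGATGCTCTTCA"
read = FORWARD + n46 + ADAPTOR
toy_records = [("BC1", "U1", read), ("BC1", "U1", read), ("BC1", "U2", read), ("BC2", "U1", "ACGT")]
umi_counts = count_umis(toy_records)
```

Duplicate reads sharing a UMI are counted once, so the toy cell BC1 receives a count of 2 for the example aptamer, and the read lacking both primers is ignored.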

[0253] 2. Cell quality assessment and filtration

[0254] Low-quality cells were filtered out based on the following criteria: high mitochondrial gene content (>10%), low gene detection (<200), and low aptamer detection (<100). Downstream analysis was performed using the Seurat package (version 4.2.1) to enable comprehensive analysis of multimodal single-cell datasets (transcriptomics, aptamers, and gRNA). Gene expression counts were log-normalized according to the standard Seurat workflow. Aptamer and gRNA counts were normalized using a centered log-ratio (CLR) transformation with a margin of 2 (across cells).
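As a rough illustration of the filtering criteria and the CLR normalization, the following Python sketch applies both to a single cell. It is not the Seurat implementation: the function names are hypothetical, "aptamer detection" is assumed to mean aptamer UMI count, and the log1p-based CLR form shown is one common variant whose exact agreement with Seurat is an assumption.

```python
import math
from typing import List, Sequence


def passes_qc(pct_mito: float, n_genes: int, n_aptamer_umis: int) -> bool:
    """Keep a cell unless mito content > 10%, genes < 200, or aptamer UMIs < 100."""
    return pct_mito <= 10.0 and n_genes >= 200 and n_aptamer_umis >= 100


def clr(counts: Sequence[float]) -> List[float]:
    """Centered log-ratio of one cell's counts across features (margin 2)."""
    n = len(counts)
    if n == 0:
        return []
    # Geometric-mean-like denominator computed on log1p values of non-zero counts.
    log_gm = sum(math.log1p(c) for c in counts if c > 0) / n
    gm = math.exp(log_gm)
    return [math.log1p(c / gm) for c in counts]


# Toy aptamer-family counts for one cell (hypothetical values).
normalized = clr([1.0, 10.0, 0.0])
```

Zero counts stay at zero after the transformation, and larger counts map to larger normalized values within the same cell.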

[0255] 3. gRNA identity allocation in single cells

[0256] To ensure effective gene knockout, at least two gRNAs were designed for each target protein (e.g., three gRNAs for PTK7: PTK7-1, PTK7-2, and PTK7-3). Based on gRNA expression data, each cell was assigned a corresponding gRNA identity. All cells with gRNA counts less than 200 were classified as “negative.” For the remaining cells, gRNA enrichment was calculated as the ratio of the most abundant gRNA count to the total gRNA count in the cell. Cells with a ratio exceeding 70% were assigned the corresponding gRNA; those not meeting this threshold were classified as “dual.” “Negative” and “dual” cells were excluded from further analysis. Furthermore, cells with fewer than 50 detected gRNAs or no other gRNAs targeting the same protein were removed. Finally, cells were divided into two groups: cells with effective gRNA-mediated target protein knockout and control cells with non-targeting gRNAs (7NC gRNAs).
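The per-cell assignment rule above (a minimum of 200 gRNA counts and a dominant gRNA exceeding 70% of the total) can be sketched as follows. The function name `assign_grna` and the toy counts are hypothetical; the subsequent group-level filtering is not shown.

```python
from typing import Dict


def assign_grna(counts: Dict[str, int], min_total: int = 200, min_fraction: float = 0.70) -> str:
    """Return the assigned gRNA name, or 'negative' / 'dual' for excluded cells."""
    total = sum(counts.values())
    if total < min_total:
        return "negative"  # too few gRNA counts to call an identity
    top_grna, top_count = max(counts.items(), key=lambda item: item[1])
    if top_count / total > min_fraction:
        return top_grna    # one gRNA clearly dominates the cell
    return "dual"          # no dominant gRNA: likely more than one perturbation


# Toy cells (hypothetical counts).
cell_a = assign_grna({"PTK7-1": 300, "NRP1-2": 20})
cell_b = assign_grna({"PTK7-1": 150, "NRP1-2": 150})
cell_c = assign_grna({"PTK7-1": 100})
```

Only cells such as `cell_a`, which return a gRNA name, would be carried forward; "negative" and "dual" cells are excluded.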

[0257] 4. Construct a nucleic acid aptamer family binding spectrum matrix

[0258] We analyzed the nucleic acid aptamer sequences and identified 4.027 × 10⁷ different sequences among 1.72 × 10⁸ reads. The top 10,000 sequences, representing 68.84% of the total reads, were selected for family clustering and classified into 1,905 families using the SMART-Aptamer “BLAST-short-MCL” strategy. Specifically, an all-against-all nucleotide-nucleotide BLAST search was performed using BLAST v2.7.1+ (https://blast.ncbi.nlm.nih.gov/) in blastn-short mode with an e-value threshold of 0.05. The BLAST results were used to construct a similarity graph in which homologous aptamer pairs were connected by edges weighted according to normalized BLAST bit scores. Finally, MCL was used to cluster the aptamers into discrete families, and an aptamer family abundance matrix was generated based on the UMI count matrix of the aptamer sequences.
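The graph-construction step can be sketched as a conversion from BLAST tabular output (the standard 12-column format) to a weighted edge list in the label-label-weight ("abc") layout that MCL accepts. The normalization used here, dividing each bit score by the smaller of the two self-hit bit scores, is an assumption, since the text does not specify how bit scores were normalized; the function name `blast_to_abc` and the toy hits are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def blast_to_abc(lines: Iterable[str], max_evalue: float = 0.05) -> List[str]:
    """Turn 12-column BLAST tabular hits into 'a<TAB>b<TAB>weight' edge lines."""
    hits: List[Tuple[str, str, float]] = []
    self_score: Dict[str, float] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed lines
        query, subject = fields[0], fields[1]
        evalue, bits = float(fields[10]), float(fields[11])
        if query == subject:
            self_score[query] = max(self_score.get(query, 0.0), bits)
        elif evalue <= max_evalue:
            hits.append((query, subject, bits))
    best: Dict[Tuple[str, str], float] = {}
    for query, subject, bits in hits:
        # Assumed normalization: bit score relative to the smaller self-hit score.
        denom = min(self_score.get(query, bits), self_score.get(subject, bits))
        weight = bits / denom if denom > 0 else 0.0
        a, b = sorted((query, subject))  # undirected edge, one entry per pair
        best[(a, b)] = max(best.get((a, b), 0.0), weight)
    return [f"{a}\t{b}\t{w:.4f}" for (a, b), w in sorted(best.items())]


# Toy hits (hypothetical scores).
blast_lines = [
    "apt1\tapt1\t100.0\t46\t0\t0\t1\t46\t1\t46\t1e-20\t90.0",
    "apt2\tapt2\t100.0\t46\t0\t0\t1\t46\t1\t46\t1e-18\t80.0",
    "apt1\tapt2\t95.0\t40\t2\t0\t1\t40\t1\t40\t1e-10\t60.0",
    "apt2\tapt1\t95.0\t40\t2\t0\t1\t40\t1\t40\t1e-10\t60.0",
    "apt1\tapt3\t100.0\t10\t0\t0\t1\t10\t5\t14\t0.5\t20.0",
]
edges = blast_to_abc(blast_lines)
```

The resulting lines could be written to a file and passed to MCL for clustering; the hit with an e-value above 0.05 is dropped before any edge is created.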

[0259] 5. Prediction of nucleic acid aptamer-target interactions

[0260] The top 20 aptamer families with the highest abundance were selected for further analysis. For each aptamer family, the binding abundance difference between each cell and the control reference (median abundance in control cells) was calculated. The median of these differences was used as the difference value for each gRNA-defined cell group. Aptamer families whose difference values were nearly identical across all gRNA-defined cell groups (differences within three unique values) were excluded. For each aptamer family, the three gRNA-defined cell groups with the highest difference values were considered to have large binding differences. gRNA-defined groups with large binding differences in more than half of the aptamer families were removed. For each remaining aptamer family, the binding difference values were further analyzed by Gaussian mixture modeling using the normalmixEM function in the "mixtools" package (version 2.0.0). We hypothesized that binding differences caused by random factors and binding differences caused by target protein knockout follow different distributions. The threshold for distinguishing between these two cases was set to the mean of the first Gaussian distribution. gRNA-defined cell groups with binding differences below this threshold were attributed to random factors and excluded. Furthermore, if only one gRNA-defined cell group remained for a specific protein, that protein was excluded to minimize false positives due to random variation. For the remaining gRNA-defined cell groups, a scoring system was applied that sums the difference values of the gRNA groups for each potential target protein into a composite score: $\mathrm{Score}_{\text{protein}} = \sum_{\text{gRNA} \in \text{gRNAs targeting the protein}} AD_{\text{gRNA}}$ (where AD represents the binding difference value of a gRNA-defined group). This score represents the protein's binding potential to a family of nucleic acid aptamers; a higher score indicates a greater likelihood of binding. The predictions are made as follows:

[0261] If only one protein remains, it is considered a target of the nucleic acid aptamer family.

[0262] If two proteins remain, choose the one with the larger binding score if it is at least twice that of the other (≥2×); otherwise, retain both proteins.

[0263] If more than two proteins remain, select the protein whose binding score is ≥ 50% of the total score.
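The composite score and the three decision rules above can be sketched as follows. This is only an illustration for one aptamer family, assuming that the AD values passed in are those of gRNA-defined groups that already passed the threshold and that larger positive AD values mean larger binding differences; the function names and the toy AD values are hypothetical and are not measured data.

```python
from typing import Dict, List


def score_proteins(ad_by_grna: Dict[str, float], grna_to_protein: Dict[str, str]) -> Dict[str, float]:
    """Sum AD values per protein, dropping proteins with a single remaining gRNA group."""
    groups: Dict[str, List[float]] = {}
    for grna, ad in ad_by_grna.items():
        groups.setdefault(grna_to_protein[grna], []).append(ad)
    return {protein: sum(values) for protein, values in groups.items() if len(values) >= 2}


def predict_targets(scores: Dict[str, float]) -> List[str]:
    """Apply the one-, two-, and multi-protein rules to the composite scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= 1:
        return [protein for protein, _ in ranked]
    if len(ranked) == 2:
        (first, s1), (second, s2) = ranked
        return [first] if s1 >= 2 * s2 else [first, second]
    total = sum(scores.values())
    return [protein for protein, s in ranked if s >= 0.5 * total]


# Toy AD values for one aptamer family (hypothetical).
ad_values = {"PTK7-1": 2.0, "PTK7-2": 1.5, "NRP1-1": 0.8, "NRP1-2": 0.7, "CDCP1-1": 3.0}
mapping = {"PTK7-1": "PTK7", "PTK7-2": "PTK7", "NRP1-1": "NRP1", "NRP1-2": "NRP1", "CDCP1-1": "CDCP1"}
family_scores = score_proteins(ad_values, mapping)
family_targets = predict_targets(family_scores)
```

In the toy input, CDCP1 is dropped because only one of its gRNA groups remains, and PTK7 is selected because its score is more than twice that of NRP1.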

[0264] VI. Results Analysis

[0265] Given that the 10x Genomics single-cell platform can analyze approximately 10,000 cells per sample, we balanced the number of cells per gRNA transduction cluster with the number of perturbed target proteins. To refine our target proteome, we performed pull-down analysis and mass spectrometry analysis on the enriched libraries in each round of screening (R0, R1, R2, R3, R4) to identify potential nucleic acid aptamer targets.

[0266] Mass spectrometry results showed that specific target proteins were gradually enriched over the screening rounds, which led us to select the top nine enriched membrane proteins (NRP1, PTK7, PTPRF, NRP2, PTPRS, CD151, ITGB1, PTPRD, ITGA3) and four proteins distributed across different enrichment intervals (SLC25A5, RAB5C, RAB23, CDCP1) as perturbation targets (Figure 4A-Figure 4D). We then transduced SUM159 cells with a mixed library of 46 gRNAs: three different gRNAs for each of the 13 selected proteins, plus seven negative control sgRNAs. These mixed cell populations were then incubated with the enriched R4 aptamer library and subjected to SPARK-seq to assess target interactions at single-cell resolution.

[0267] To investigate the impact of each perturbation on the binding of specific nucleic acid aptamer families and to reveal potential cell surface protein-nucleic acid aptamer interactions, multi-omics data obtained from SPARK-seq were analyzed using SPARTA. Based on transcriptome data, empty and low-quality cells were removed, retaining 14,275 high-quality cells. Then, gRNA data were denoised, cells were grouped into different populations based on gRNA expression, and doublets expressing an additional gRNA were removed. A total of 8,466 single cells with definitive gRNA expression were identified, including 473 control cells (NC, seven gRNAs) (Figure 5D). Cell populations with small cell numbers, such as those with RAB23 and RAB5C knockout, were filtered out, leaving 11 protein perturbation populations (33 gRNA populations) and 1 NC population (7 gRNAs) for further analysis.

[0268] To obtain the nucleic acid aptamer binding profiles of cells in these cell populations, we subsequently analyzed the nucleic acid aptamer sequences and identified 4.027 × 10⁷ different sequences among 1.72 × 10⁸ reads, of which the top 10,000 sequences accounted for 68.84% of the total reads. For downstream analysis, we focused on this subset. To improve the reliability of subsequent differential analysis, we used the BLAST-short-MCL algorithm to group similar nucleic acid aptamer sequences (assuming they bind to the same target protein (3, 31, 32)) into 1,906 families (Figure 3A). The top 20 families account for 97.92% of the total sequences (Figure 6B). Overall, we obtained 8,466 cells from 34 cell populations with different gRNA expression, and aptamer binding profiles for 20 nucleic acid aptamer families (Figure 6C and Figure 6D).

[0269] To identify nucleic acid aptamer-protein interactions, SPARK integrates three main steps: binding-difference calculation, quality control, and a scoring system (Figure 6E). First, the median abundance difference of each aptamer family in each gRNA-defined cell population relative to the NC population is calculated, representing the relative potential binding difference (details in Formula * in the Methods section). Aptamer families with nearly identical binding differences across the gRNA-defined cell populations, as well as gRNAs that cause large binding differences in more than half of the aptamer families, are excluded to minimize false positives (additional quality control details are provided in the Methods section). For each aptamer family, a threshold for significant binding differences is established by Gaussian fitting. Proteins with fewer than two corresponding gRNA cell populations exceeding the threshold are excluded from the target candidate list to minimize the impact of random variation. If a single candidate protein remains, it is designated as the target; if multiple candidates remain, a scoring system is applied that prioritizes the most likely binding targets using score ratios and proportions (…) (Figure 6E). This method effectively filters out nonspecific signals and improves the reliability of nucleic acid aptamer-target protein pairing.

[0270] Finally, we characterized each inferred protein-nucleic acid aptamer pair by calculating the log2 fold change (logFC) of aptamer abundance between the corresponding protein-knockout cell population and the NC cell population. Of the 20 aptamer families, 8 did not match any well-defined knockout target cell population. This was primarily due to low knockout efficiency of these targets or insufficient cell numbers available for analysis in specific perturbations (Figure 13A-Figure 13H). Therefore, SPARK-seq identified binding differences among 12 nucleic acid aptamer families in 8 different perturbed cell populations (Figure 6F). Specifically, aptamers from families 5, 13, and 17 bind to CDCP1; aptamers from family 3 bind to the ITGA3 / ITGB1 protein complex; families 2, 6, and 7 bind to NRP1; families 11 and 14 bind to NRP2; families 1 and 15 bind to PTK7; and family 4 binds to the PTPRD / PTPRF proteins (Figure 6G and Figure 5A-Figure 5H).
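The log2 fold change between a knockout population and the NC population can be illustrated with a minimal Python sketch. The use of population means, the pseudocount of 1, and the direction (knockout over NC) are assumptions made for illustration; the function name and the toy abundances are hypothetical.

```python
import math
from typing import Sequence


def log2_fold_change(knockout: Sequence[float], control: Sequence[float], pseudocount: float = 1.0) -> float:
    """log2 of (mean knockout abundance + pseudocount) / (mean NC abundance + pseudocount)."""
    if not knockout or not control:
        raise ValueError("both populations must contain at least one cell")
    mean_ko = sum(knockout) / len(knockout)
    mean_nc = sum(control) / len(control)
    return math.log2((mean_ko + pseudocount) / (mean_nc + pseudocount))


# Toy per-cell abundances of one aptamer family (hypothetical values).
log_fc = log2_fold_change([1.0, 1.0], [3.0, 3.0])
```

Under these assumptions, a negative value indicates reduced aptamer binding in the knockout population relative to NC, which is the pattern expected when the knocked-out protein is the aptamer's target.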

[0271] This embodiment also systematically compares the SPARK-seq high-throughput platform with the traditional Cell-SELEX combined with IP-MS method. In target identification, traditional methods typically perform serial affinity purification and mass spectrometry analysis on individual aptamers, identifying only 1-2 interactions per experimental cycle, and heavily relying on the researcher's empirical selection. SPARK-seq, by integrating CRISPR-mediated perturbation of 13 surface proteins and tri-omics sequencing at the single-cell level, achieves simultaneous parallel scanning of 5535 aptamer sequences and potential targets. Results show that SPARK-seq successfully mapped the correspondence between 12 aptamer families and 8 clearly defined target proteins in a single experiment, including several low-abundance membrane proteins (expression levels spanning two orders of magnitude) that are difficult to detect using traditional mass spectrometry methods, such as NRP2 and PTPRD. In contrast, traditional IP-MS can only consistently identify some high-abundance targets (such as PTK7) in the same sample, while missing low-expression proteins such as CDCP1. This demonstrates that SPARK-seq not only significantly improves identification throughput, but more importantly, it expands the abundance range and accuracy of detectable targets, overcoming the systematic bias caused by traditional methods that rely on the physical abundance of proteins.

[0272] VII. Proof of Concept

[0273] PTK7 is a novel biomarker for tumor diagnosis and prognosis. As a proof of concept, this invention selected the sgc8c aptamer that binds to PTK7 as the model to verify the universality of the SPARK-seq method.

[0274] 1. Single-cell sequencing

[0275] The aptamer Sgc8-C was denatured and then dissolved in BB. Cells (SUM159, SUM159 PTK7g1 (PTK7 gRNA 1: protein gRNA sequence), SUM159 PTK7g2, EMT6, EMT6 PTK7g3, EMT6 PTK7g4) were dissociated with trypsin-free digestion solution.

[0276] Table 2 gRNA sequence listing

[0277]
PTK7 CRISPR guide RNA | Sequence (5'-3')
human PTK7 gRNA1 | CACGGAGCGGCGTTTCGCCC
human PTK7 gRNA2 | GCTGCAGGACTCACGGTTCG
mouse PTK7 gRNA3 | CAAACCGTCGCTCCGTGTCC
mouse PTK7 gRNA4 | TTACCGGCTTCGGTTAGTGA

[0278] Resuspend the cells in BB buffer, then take 1×10⁶ cells in equal proportions. The cells were incubated with 200 nM Sgc8-C at 4 °C for 30 min, followed by centrifugation to remove the supernatant, washing three times, and finally resuspension in DPBS containing 1 mM Mg²⁺. Single-cell suspensions were prepared by passing the cells through a 40 μm filter. After 30 min at 4 °C, single-cell RNA-seq libraries were generated using the 10X Genomics Next GEM Single Cell 5' Reagent Kits v2 (Dual Index) according to the manufacturer's protocol.

[0279] The CRISPR perturbation sequence was integrated into the library by capturing the sgRNA transcript barcode (5'-GGCTAGTCCGTTATCAACTTG-3') (Figure 4). The aptamer sequences were integrated into the library by capturing the Sgc8-C barcode (identical to the sgRNA barcode). The libraries were pooled and sequenced on the Illumina NovaSeq 6000 platform. Data were processed using the Cell Ranger pipeline, and differential expression analysis was performed in RStudio. Cell quality control and human-mouse separation were performed by analyzing mRNA, guide RNA expression was analyzed to confirm cell populations, and the binding differences of aptamers among the cell populations were analyzed.

[0280] 2. Human-mouse mixed data analysis

[0281] To process 10X Genomics single-cell sequencing data containing both human and mouse cells, we used CellRanger software (version 3.0.2). We first used the "mkref" function in CellRanger to assemble the human (Homo_sapiens.GRCh38.dna_sm.toplevel.fa) and mouse (Mus_musculus.GRCm38.dna_sm.toplevel.fa) genomes into a combined reference. Sequencing reads were then aligned and quantified according to an integrated annotation file combining Homo_sapiens.GRCh38.94.gtf and Mus_musculus.GRCm38.94.gtf.

[0282] Downstream analysis was performed using the Seurat package (version 4.2.1). To distinguish between human and mouse cells in the mixed dataset, we calculated the proportion of human and mouse gene expression in each cell. Cells in which more than 90% of the expression came from human genes were classified as human, and cells in which more than 90% came from mouse genes were classified as mouse. High-quality cells retained for further analysis met the following criteria: (1) mitochondrial RNA content <10%; (2) >500 genes detected. RNA expression data were normalized using the NormalizeData function. Principal component analysis (PCA) was performed after scaling and centering the data. Uniform manifold approximation and projection (UMAP) dimensionality reduction was performed using the RunUMAP function with the first 20 principal components and default parameters.
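The species call described above reduces to a simple proportion test per cell, sketched here in Python. The function name `classify_species`, the use of UMI counts as the measure of expression, and the labels for ambiguous cells are assumptions made for illustration.

```python
def classify_species(human_umis: int, mouse_umis: int, threshold: float = 0.90) -> str:
    """Classify a cell as 'human' or 'mouse' when one species exceeds 90% of expression."""
    total = human_umis + mouse_umis
    if total == 0:
        return "unassigned"  # no informative counts
    if human_umis / total > threshold:
        return "human"
    if mouse_umis / total > threshold:
        return "mouse"
    return "mixed"           # candidate cross-species doublet


# Toy cells (hypothetical counts).
call_1 = classify_species(9500, 500)
call_2 = classify_species(300, 9700)
call_3 = classify_species(5000, 5000)
```

Cells labeled "mixed" would be natural candidates for removal as cross-species doublets before the gRNA and aptamer analyses.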

[0283] False positives can occur when detecting gRNA and aptamer expression due to inherent random errors in sequencing and minor variations in cell barcoding. To address this issue, we applied a Gaussian mixture model (GMM) to accurately identify gRNA and aptamer expression. First, we performed a log2 transformation on the expression data of the PTK7-targeting gRNAs and the sgc8 DNA aptamer sequence. Gaussian mixture modeling was performed using the normalmixEM function in the mixtools package (version 2.0.0) to estimate an appropriate threshold. The threshold was set to the mean (μ) minus 1.96 times the standard deviation (σ) of the second Gaussian distribution. Cells meeting this criterion were then visualized on a UMAP plot.
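The thresholding idea can be illustrated with a small, dependency-free Python sketch that fits a two-component one-dimensional Gaussian mixture by expectation-maximization and reads the threshold as μ − 1.96σ of the second (higher-mean) component. This is a stand-in for illustration, not the normalmixEM implementation: the initialization, the fixed iteration count, the interpretation of the threshold formula, and the simulated data are all assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def fit_two_gaussians(values: Sequence[float], n_iter: int = 200) -> List[Tuple[float, float, float]]:
    """Fit a 2-component 1D Gaussian mixture by EM; return (weight, mean, sd) sorted by mean."""
    xs = sorted(values)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least four values")
    half = n // 2
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]  # start from lower/upper halves
    mean_all = sum(xs) / n
    sd_all = math.sqrt(max(sum((v - mean_all) ** 2 for v in xs) / n, 1e-6))
    sigma = [sd_all, sd_all]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibilities (the 1/sqrt(2*pi) factor cancels in the ratio).
        resp = []
        for v in xs:
            dens = [weight[k] * math.exp(-0.5 * ((v - mu[k]) / sigma[k]) ** 2) / sigma[k] for k in (0, 1)]
            total = dens[0] + dens[1]
            resp.append((dens[0] / total, dens[1] / total) if total > 0 else (0.5, 0.5))
        # M-step: update weight, mean, and standard deviation of each component.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            mu[k] = sum(r[k] * v for r, v in zip(resp, xs)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, xs)) / nk
            sigma[k] = math.sqrt(max(var, 1e-6))
            weight[k] = nk / n
    return sorted(zip(weight, mu, sigma), key=lambda comp: comp[1])


def noise_threshold(values: Sequence[float]) -> float:
    """Threshold = mean - 1.96 * sd of the higher-mean component."""
    _, mu2, sigma2 = fit_two_gaussians(values)[1]
    return mu2 - 1.96 * sigma2


# Simulated log2 counts: a low background mode and a high signal mode (hypothetical).
random.seed(0)
log_counts = [random.gauss(1.0, 0.3) for _ in range(200)] + [random.gauss(6.0, 0.5) for _ in range(200)]
components = fit_two_gaussians(log_counts)
threshold = noise_threshold(log_counts)
```

With well-separated modes such as these, the threshold falls between the background and signal populations, so cells above it would be called positive for the gRNA or aptamer.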

[0284] 3. Analysis and Verification

[0285] This invention first uses the CRISPR knockout system with gRNAs targeting PTK7 to knock out PTK7 in human SUM159 cells and mouse EMT6 cells, and a significant decrease in PTK7 expression was observed (Figure 2A, Figure 3A). Binding to sgc8c was then characterized by flow cytometry, showing no binding on cells that do not express PTK7 and a 20% relative shift on cells with incomplete knockout (Figure 1B, Figure 3B). This is consistent with the protein content, indicating that the aptamer sgc8c shows a strong binding difference only when cells express PTK7 gRNA and the PTK7 protein is successfully perturbed. When cell populations that do and do not express PTK7 are mixed together, sgc8c can clearly distinguish the two populations in both cell lines (Figure 3C), indicating that, in the same environment, an aptamer specific for the PTK7 protein binds differently to two cell populations that differ in PTK7 expression. We expected to reach similar conclusions by single-cell sequencing. To be compatible with the 10x sequencing platform, we extended the sgc8c sequence (sgc8-C), and SPR results showed no significant change in its binding ability (Figure 3D, Figure 3E). We then mixed the six human and mouse cell types, with and without PTK7 protein, and incubated them with sgc8-C for single-cell perturbation sequencing (Figure 2C), obtaining information from three omics layers: gRNA, mRNA, and aptamer sequences. First, human and mouse cells were separated by mRNA data analysis (Figure 1D, Figure S1F). After using a double Gaussian model to determine the thresholds for denoising the gRNA and sgc8-C data (Figure 3G-Figure 3I), we observed that the concordance between species assignment based on gRNA and on the endogenous cellular transcriptome was 99.1% (Figure 3M), which helps identify otherwise undetectable intraspecies doublets. We then clustered cells expressing different gRNAs (Figure 3M-Figure 3N) and mapped the binding of sgc8-C on the UMAP (Figure 2E-Figure 2F). sgc8-C exhibited high binding levels in cell populations that did not express PTK7 gRNA and low binding levels in cell populations with PTK7 perturbation (Figure 2G). This result indicates that the binding differences of sgc8c can be clearly distinguished at the single-cell level, which is beneficial for identifying the target proteins of nucleic acid aptamers. In summary, nucleic acid aptamers produce significant binding differences between cell populations with and without their target proteins, confirming the feasibility of using single-cell sequencing to identify the target proteins of nucleic acid aptamers.

[0286] Example 2: Experimental verification of the interaction between nucleic acid aptamers and targets predicted by SPARK-seq

[0287] I. Experimental Methods

[0288] 1. Sequence selection

[0289] The sequenced sequences were sorted by sequence number, and then family analysis of similar nucleic acid aptamer sequences was performed using the BLAST-short-MCL algorithm. 5535 different sequences from each family were selected (SPARK seq data 1): for example, N46 = GAGTGGCATCTATTACTTAGTGCTACGGCTCTCGGGATGCTCTTCA (5'-3', targeting NRP1), and the primer sequence was added: 5'-CTCGTGGGCTCGGAGATGTGTATAAGAGACAG-N46-GCAGCTCGGCCCATATAAGAAA-3'. The fluorescent tag FAM was added in subsequent experiments for further study.

[0290] 2. Target identification and verification of nucleic acid aptamer sequences

[0291] 1) Extraction of cell membrane proteins

[0292] Preparation: Cell membrane protein and cytoplasmic protein extraction kit (P0033, Beyotime), PMSF (ST506, Beyotime).

[0293] (1) SUM159 cells were seeded in 10 cm culture dishes and cultured for 48 hours, yielding approximately 10 dishes and about 40 million cells.

[0294] (2) Discard the old culture medium and wash twice with DPBS;

[0295] (3) Add trypsin-free digestion solution to the cells and digest at 37°C for 7 min. Then aspirate the supernatant, detach the cells by pipetting with washing buffer (WB), and collect them in a 15 mL centrifuge tube.

[0296] (4) Wash with WB at 4℃, centrifuge at 400g, and discard the supernatant;

[0297] (5) Add 1 mL of membrane protein extraction solution A (containing 1 mM PMSF) to the 15 mL centrifuge tube, vortex to mix, and incubate at 4°C for 10-15 min. Then disrupt the cells: seal the tube with sealing film and freeze-thaw it 4 times, alternating between liquid nitrogen and room temperature. After cell disruption, centrifuge at 400 g, 4°C for 10 min and carefully collect the supernatant without aspirating the pellet. Then centrifuge the supernatant at 14000 g, 4°C for 30 min to collect the cell membrane fragments.

[0298] (6) Extraction of membrane proteins: Centrifuge at 14000 g for 10 s at 4°C and remove as much supernatant as possible; aspirating a small amount of the pellet is acceptable. Add 300 μL of membrane protein extraction reagent B, vortex vigorously for 10 s to resuspend the pellet, and incubate at 4°C for 5 min. Repeat the vortexing and incubation 3-5 times to fully extract the membrane proteins. Then centrifuge at 14000 g for 5 min at 4°C and collect the supernatant, which is the cell membrane protein solution. This solution can be used directly in the subsequent capture experiments.

[0299] 2) Aptamer-pulldown experiment

[0300] (1) Blocking of streptavidin-coated agarose beads: Take three 1.5 mL EP tubes, add 100 μL of agarose beads to each tube, centrifuge at 2500 rpm for 3 min, and label the tubes as blank sample, Biotin library sample, and Biotin-GY-2 sample, respectively. Add 1 mL of 5% BSA (dissolved in TBST) to each tube, mix, and block on a shaker at 4°C for 1 h. Wash 3 times with washing buffer (2500 rpm, 4°C, 3 min each) and keep on ice until use.

[0301] (2) Blocking of membrane proteins: Add 3 mL of DNA blocking solution (BB + 10% FBS to adsorb non-specific nucleic acids + 1 mg / mL salmon DNA to block protein sites that easily bind nucleic acids) to the collected protein sample, shake well and incubate at 4℃ for 1 h. After blocking, take out an appropriate amount to keep as the whole protein sample group.

[0302] (3) Incubation of the Biotin library and the Biotin-GY-2 aptamer with membrane protein: The blocked membrane protein was divided evenly into 3 groups, and the library and TY8 were added to their respective groups. The mixtures were incubated on a shaker at 4°C for 1 hour.

[0303] (4) Incubation of agarose beads with the above mixtures: After the reaction, each membrane protein mixture was added to the correspondingly labeled EP tube containing pre-blocked agarose beads. The tubes were mixed and incubated on a shaker at 4°C for 1 hour.

[0304] (5) After incubation, wash 5 times with washing buffer, centrifuging at 2500 rpm, 4°C for 3 min each time and discarding the supernatant after each centrifugation.

[0305] (6) Protein denaturation: Add a volume of 2× loading buffer equal to the volume of the agarose beads, denature at 100°C for 10 min, place on ice for 10 min, and store at -80°C.

[0306] 3) SDS-PAGE

[0307] (1) Preparation of PAGE gel: Use a commercially available 4-20% gradient gel, or an 8% separating gel with a 5% stacking gel. For a 10% SDS-PAGE protein separating gel (5 mL): in a small beaker, add 2 mL ddH2O, 1.6 mL 30% acrylamide, 1.25 mL 1.5 M Tris-HCl (pH = 8.8), 0.05 mL 10% SDS, 0.05 mL 10% APS, and 0.002 mL TEMED in sequence, mix thoroughly, and pour into the gel plate. Let stand at room temperature until the gel solidifies, then add the 5% SDS-PAGE protein stacking gel, prepared by adding in sequence 3.4 mL ddH2O, 0.85 mL 30% acrylamide, 0.625 mL 1.0 M Tris-HCl (pH = 6.8), 0.05 mL 10% SDS, 0.05 mL 10% APS, and 0.005 mL TEMED.

[0308] (2) Electrophoresis: Correctly load the prepared SDS-PAGE gel into the electrophoresis tank, add 1× electrophoresis buffer (750mL per gel), and add the prepared samples in the order of marker, blank bead, library group bead, and nucleic acid aptamer bead into the wells. Perform electrophoresis at 60V. After the bromophenol blue band migrates to the lower separating gel, increase the voltage to 120V and continue electrophoresis until the bromophenol blue band migrates to the bottom of the gel, thus completing the electrophoresis.

[0309] (3) Protein fixation: Take out the SDS-PAGE gel, remove the stacking gel, put the separating gel into the fixative for 2 hours, and then rinse with ultrapure water 3 times for 15 minutes each time.

[0310] (4) Transfer the PAGE gel to Coomassie Brilliant Blue staining solution and leave it overnight. Wash the Coomassie Brilliant Blue stained gel four times with boiling ultrapure water for 15 minutes each time, until the protein bands on the gel are clearly visible.
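As a quick arithmetic check of the gel recipes in step (1) of this SDS-PAGE procedure, the sketch below computes the final acrylamide percentage from the listed component volumes. The function name is hypothetical; the volumes are those given in the text.

```python
def final_acrylamide_percent(stock_percent, stock_ml, all_volumes_ml):
    """Final acrylamide concentration (%) after mixing.

    stock_percent  -- concentration of the acrylamide stock (30 in the text)
    stock_ml       -- volume of acrylamide stock added
    all_volumes_ml -- volumes of every component, including the stock
    """
    total_ml = sum(all_volumes_ml)
    return stock_percent * stock_ml / total_ml

# Component volumes (mL) in the order listed in the text:
# ddH2O, 30% acrylamide, Tris-HCl, 10% SDS, 10% APS, TEMED
SEPARATING = [2.0, 1.6, 1.25, 0.05, 0.05, 0.002]
STACKING = [3.4, 0.85, 0.625, 0.05, 0.05, 0.005]

if __name__ == "__main__":
    sep = final_acrylamide_percent(30.0, SEPARATING[1], SEPARATING)
    stack = final_acrylamide_percent(30.0, STACKING[1], STACKING)
    print(f"separating gel: {sep:.2f}% in {sum(SEPARATING):.3f} mL")
    print(f"stacking gel:   {stack:.2f}% in {sum(STACKING):.3f} mL")
```

The separating recipe works out to slightly under the nominal 10% and the stacking recipe to slightly over 5%, which is within normal rounding for hand-cast gels.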

[0311] 4) To identify differentially enriched proteins, the bands of interest in the SDS-PAGE gel were excised, digested enzymatically, and analyzed by LC-MS/MS mass spectrometry. To identify high-confidence membrane-binding targets, we filtered out non-membrane proteins using the UniProt database, retaining only plasma membrane proteins. Proteins were further screened according to the supplier-recommended scoring threshold (score ≥ 20), which reflects the quality of the match between the observed MS/MS spectra and the theoretical peptide spectra; a higher score indicates higher confidence. For downstream analysis, we compared the aptamer-enriched samples with the Ctrl-lib IP/MS results using the following formula: log2[(average aptamer intensity + 1) / (average Ctrl-lib intensity + 1)]. Proteins were then ranked by fold change, and the top-ranked proteins were designated as primary candidate targets for the aptamers. For proteins with significant differences (e.g., NRP1), we hypothesized that the top-ranked protein (top 1) was the likely target of the corresponding aptamer (e.g., Apt-2-2).
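The filtering and ranking described above can be sketched as follows. The record layout (`protein`, `plasma_membrane`, `score`, `aptamer`, `ctrl`) and the function names are assumptions made for the example; only the score threshold and the log2 formula come from the text.

```python
import math
from statistics import mean

def log2_fold_change(aptamer_intensities, ctrl_intensities):
    """log2[(mean aptamer intensity + 1) / (mean Ctrl-lib intensity + 1)]."""
    return math.log2((mean(aptamer_intensities) + 1.0) /
                     (mean(ctrl_intensities) + 1.0))

def rank_candidates(records, min_score=20.0):
    """Keep plasma membrane proteins with score >= min_score and rank them
    by log2 fold change, highest first. Returns (protein, log2FC) tuples."""
    kept = [r for r in records
            if r["plasma_membrane"] and r["score"] >= min_score]
    scored = [(r["protein"], log2_fold_change(r["aptamer"], r["ctrl"]))
              for r in kept]
    return sorted(scored, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical placeholder records, not data from the source.
    demo = [
        {"protein": "PROT_A", "plasma_membrane": True, "score": 85.0,
         "aptamer": [7.0, 7.0], "ctrl": [0.0, 0.0]},
        {"protein": "PROT_B", "plasma_membrane": True, "score": 12.0,
         "aptamer": [900.0, 950.0], "ctrl": [10.0, 12.0]},
        {"protein": "PROT_C", "plasma_membrane": False, "score": 60.0,
         "aptamer": [500.0, 480.0], "ctrl": [5.0, 6.0]},
    ]
    for name, lfc in rank_candidates(demo):
        print(name, round(lfc, 3))
```

The pseudocount of 1 keeps the ratio defined when a protein is absent from the control pull-down.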

[0312] 3. Surface Plasmon Resonance (SPR) Analysis

[0313] Instrument software settings were configured according to the manufacturer's instructions. Control molecules (e.g., His-tagged peptides or IgG) were immobilized on channel 1, while the target protein was immobilized on channel 2. Protein immobilization was performed using default instrument parameters. Specifically, equal volumes of 0.4 M EDC and 0.1 M NHS were premixed and added to 96-well microplates to activate the carboxyl groups on the CM5 sensor chip. The target protein was diluted in 10 mM sodium acetate buffer and covalently immobilized for 900 seconds. The chip surface was then blocked with ethanolamine. All immobilization steps were performed at a flow rate of 10 μL/min. Aptamers were first diluted to 20 μM in DPBS, denatured at 95°C for 5 min, and then cooled on ice for 5 min. They were then further diluted to 200 nM, or serially diluted to create a concentration gradient, using SPR running buffer (DPBS with 1 mM MgCl2 added). Binding was measured on a Biacore SPR instrument with 120 s of association and 180 s of dissociation. Regeneration was performed between cycles using 1.5 M NaCl at a flow rate of 30 μL/min. Kinetic parameters, including KD, k_on, and k_off, were calculated using Biacore Insight Evaluation Software (v5.0.18.22102) with a 1:1 Langmuir model. The raw sensorgram data were exported as PowerPoint plots and further visualized using GraphPad Prism. Affinity and dissociation values were determined based on response units (RUs) from three biologically independent replicates.

[0314] For the SPR experiments, recombinant human PTK7 protein (Met1–Thr704; Cat#19399-H08H) was purchased from Sino Biological. The other recombinant proteins, including CDCP1 (Phe30–Leu666; Cat#10402-CU), NRP1 (Phe22–Lys644; Cat#3870-N1), NRP2 (Gln23–Tyr855; Cat#2215-N2), ITGA3 (Phe33–Glu991) / ITGB1 (Glu21–Asp728) heterodimer (Cat#2840-A3), PTPRD (Glu21–Ser1174; Cat#9995-PR), PTPRF (Ala27–Glu1251; Cat#9377-PF), and PTPRS (Glu30–Gly1260; Cat#3430-PR), were all purchased from R&D Systems.
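For readers unfamiliar with the 1:1 Langmuir model used in the SPR analysis above, the following sketch simulates an idealized sensorgram with the 120 s association and 180 s dissociation phases given in the text. It is a textbook form of the model under assumed rate constants, not the fitting procedure of the instrument software; all function names and parameter values are illustrative.

```python
import math

def kd_from_rates(k_on, k_off):
    """Equilibrium dissociation constant KD = k_off / k_on (units: M)."""
    return k_off / k_on

def association(t, conc, k_on, k_off, r_max):
    """Response at time t (s) after analyte injection starts."""
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + k_off / k_on)
    return r_eq * (1.0 - math.exp(-k_obs * t))

def dissociation(t, r_start, k_off):
    """Response at time t (s) after the switch to buffer only."""
    return r_start * math.exp(-k_off * t)

def sensorgram(conc, k_on, k_off, r_max,
               t_assoc=120.0, t_dissoc=180.0, dt=1.0):
    """Return a list of (time, response) pairs covering both phases."""
    n_assoc = int(round(t_assoc / dt))
    n_total = int(round((t_assoc + t_dissoc) / dt))
    r_end = association(t_assoc, conc, k_on, k_off, r_max)
    points = []
    for i in range(n_total + 1):
        t = i * dt
        if i <= n_assoc:
            r = association(t, conc, k_on, k_off, r_max)
        else:
            r = dissociation(t - t_assoc, r_end, k_off)
        points.append((t, r))
    return points

if __name__ == "__main__":
    # Assumed example values: k_on in 1/(M*s), k_off in 1/s, conc in M.
    k_on, k_off = 1.0e5, 1.0e-3
    print("KD (M):", kd_from_rates(k_on, k_off))
    curve = sensorgram(conc=200e-9, k_on=k_on, k_off=k_off, r_max=100.0)
    print("peak response (RU):", round(max(r for _, r in curve), 2))
```

Fitting software estimates k_on and k_off by adjusting them until curves of this shape match the measured responses across the concentration series.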

[0315] 4. Microscale thermophoresis (MST)

[0316] Microscale thermophoresis (MST) was performed on a Monolith NT.115 instrument (NanoTemper Technologies) to measure the binding affinity between aptamers and target proteins. Aptamers were labeled at the 3′ end with Alexa Fluor 647 or Cy5 (hydrophilic) and resuspended in MST buffer (DPBS with 1 mM MgCl2 added) to a final concentration of 10 nM or 15 nM. The recombinant target protein (the same as used for SPR) was serially diluted to 16 concentrations in MST buffer. The labeled aptamers and protein dilutions were incubated at 4°C for 25 min. After incubation, the mixtures were loaded into standard treated capillaries (NanoTemper, MO-K022). Thermophoresis was recorded at 22°C with the LED power set to 40% (or 20%).

[0317] KD was determined using the built-in KD model in MO.Affinity Analysis software (NanoTemper). Binding curves were replotted in GraphPad Prism using the dose-response (stimulation) model "log(agonist) vs. response, variable slope (four parameters)". All measurements were performed in at least three independent biological replicates, and results are expressed as mean ± SD.
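The four-parameter variable-slope curve used for replotting has the standard form Y = Bottom + (Top - Bottom) / (1 + 10^((LogEC50 - X) * HillSlope)), where X is the log10 of the concentration. The sketch below evaluates that curve over a 16-point dilution series; the parameter values and the 2-fold dilution factor are assumptions for illustration, and the midpoint of such a curve is not necessarily identical to the KD reported by the MST software.

```python
import math

def four_parameter_response(log_conc, bottom, top, log_ec50, hill_slope):
    """Variable-slope (four-parameter) dose-response curve.

    log_conc and log_ec50 are log10 of molar concentrations.
    """
    return bottom + (top - bottom) / (
        1.0 + 10.0 ** ((log_ec50 - log_conc) * hill_slope))

def serial_dilution(top_conc, n_points=16, factor=2.0):
    """Concentrations from top_conc downward, dividing by `factor` each step."""
    return [top_conc / (factor ** i) for i in range(n_points)]

if __name__ == "__main__":
    # Assumed illustration values only.
    concs = serial_dilution(1.0e-6)              # 1 uM down over 16 points
    for c in concs:
        y = four_parameter_response(math.log10(c), bottom=0.0, top=1.0,
                                    log_ec50=-8.0, hill_slope=1.0)
        print(f"{c:.3e} M -> fraction bound {y:.3f}")
```

At X equal to LogEC50 the curve sits exactly halfway between Bottom and Top, which is how the midpoint is read off a fitted curve.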

[0318] 5. Nucleic acid aptamer immunoprecipitation assay (AptIP) and Western blot analysis

[0319] To investigate the interaction between nucleic acid aptamers and target proteins, biotinylated nucleic acid aptamers were used for immunoprecipitation, followed by Western blot analysis. Biotin-labeled aptamers were dissolved in DPBS, heat-denatured at 95°C for 10 min, and then rapidly cooled on ice for 30 min to allow proper refolding. A total of 300 pmol of biotinylated aptamer was mixed with an equal volume of binding buffer (containing 5 mM MgCl2) and briefly incubated on ice for pre-blocking. Each aptamer was paired with one 15 cm dish of SUM159 cells (approximately 90% confluence, >10^8 cells). Cells were collected by enzyme-free dissociation and centrifuged at 300 g for 5 minutes. The pellet was washed twice with washing buffer (containing 5 mmol/L MgCl2), with centrifugation at 300 g for 5 minutes after each wash. Cells were then lysed at 4°C for 15 minutes in 1 mL RIPA buffer (APPLYGEN, C1053) supplemented with protease inhibitor (Roche, 81410500) and phosphatase inhibitor (Beyotime, P1082). The lysate was centrifuged at 13,000 g for 10 minutes at 4°C, and the supernatant (total cell lysate) was collected; a portion was reserved as the input control for IP-WB. The lysate was blocked at 4°C for 1 hour with binding buffer containing 20% FBS at a 2:1 volume ratio, with gentle inversion. The pre-blocked biotin-aptamer was incubated with the lysate at 4°C for 1 hour to form the aptamer-target complex.

[0320] In parallel, 30 μL of streptavidin magnetic beads (Smart-Lifesciences, SM017010) were prepared for each nucleic acid aptamer sample and washed with 500 μL of TBST. For the antibody-based control immunoprecipitation (IP), target-specific IP antibody or isotype IgG was added at 1:50 (v/v) to 500 μL of lysate, and the mixture was incubated overnight at 4°C with inversion mixing; 30 μL of protein A/G magnetic beads were used for each sample. The magnetic beads were blocked in 0.5 mL of TBST containing 5% BSA at 4°C for 1 hour, washed with 1 mL of TBST, and then resuspended in 30 μL of RIPA buffer. They were then incubated with the aptamer-protein complex at 4°C for 1 hour. After incubation, the magnetic beads were separated and the supernatant was discarded. The beads were washed once with 1 mL of TBST on ice, then once with 1 mL of washing buffer on ice, and resuspended in 30 μL of RIPA buffer. SDS loading buffer (Sangon Biotech, C506032-0005) was added, and the sample was denatured at 95°C for 10 minutes. Proteins were separated by 4-20% gradient SDS-PAGE, with pre-stained protein markers indicating molecular weight, and then transferred by wet transfer to a PVDF membrane (Millipore, 0000377714) at 80 V for 120 minutes on ice. The PVDF membrane was pre-activated in 100% methanol for 1-2 minutes before use. The membrane was blocked with 5% (w/v) skim milk powder (Beyotime, PO216-300G) in TBST for 1 hour at room temperature and incubated with primary antibody at the recommended dilution overnight at 4°C. The next day, the membrane was washed three times with TBST and incubated with HRP-labeled secondary antibody (1:5000 dilution) at room temperature for 1 hour. After washing, the membrane was incubated with ECL substrate (BioSharp, BL520B), and the chemiluminescence signal was detected using an Amersham ImageQuant 800 imaging system.

[0321] Immunoblots (IBs) were performed using the following antibodies: anti-PTK7 (CST, 25618, 1:2000), anti-CDCP1 (CST, 4115S, 1:2000), anti-NRP1 (HUABIO, ET1609-69, 1:2000), anti-NRP2 (Proteintech, 27193-1-AP, 1:2000), anti-ITGA3 (HUABIO, HA600100, 1:1000), and anti-PTPRF (Proteintech, 14138-1-AP, 1:1000). SUM159 cell lysates were analyzed by co-immunoprecipitation (Co-IP). The lysates were subjected to immunoprecipitation (IP) with anti-ITGA3 antibody (Invitrogen, MA5-28565), anti-NRP2 antibody (Proteintech, 27193-1-AP), or control IgG (Wanleibio, WLA125). Immunoblotting was performed using the following antibodies: anti-ITGA3 (HUABIO, HA600100, 1:1000), anti-CD151 (Proteintech, 66567-1, 1:10000), anti-NRP2 (Proteintech, 27193-1, 1:2000), and anti-SLC25A5 (Abclonal, A15639, 1:2000).

[0322] 6. Flow cytometry: Cross-competition between targets

[0323] SUM159 or A549 cells were seeded in 15cm cell culture dishes 48 hours in advance. Once the cells reached 90% confluence, they were collected using enzyme-free dissociation buffer and centrifuged at 300g for 5 minutes. The cell pellet was washed sequentially with DPBS and binding buffer. After another 5-minute centrifugation at 300g, the cell pellet was resuspended in binding buffer to a final suspension volume of 50μL per sample.

[0324] Fluorescently labeled aptamers (Alexa Fluor 647 or FAM-labeled Binder-Apt) and unlabeled competing aptamers (Competitor-Apt) were dissolved in DPBS, denatured at 95°C for 10 min, and then rapidly cooled on ice for 30 min to allow for proper refolding. Competitor-Apt is an aptamer targeting different proteins. To initiate the competition experiment, 20 pmol Binder-Apt (or a saturated concentration of aptamer) was added to 50 μL of binding buffer. After a brief incubation, the aptamer was mixed with 50 μL of cell suspension and incubated for at least 10 min. Subsequently, 200 pmol Competitor-Apt (different concentrations of homologous aptamers or an excess (≥20-fold) of competing protein) was added, gently mixed, and incubated on ice in the dark for 30 min with gentle agitation. After incubation, the cells were centrifuged at 300 g for 5 min. The supernatant was discarded, and the cells were washed once with washing buffer. The cells were then resuspended in 100 μL of washing buffer, filtered through a 40 μm cell filter, and then analyzed by flow cytometry to assess fluorescence intensity.

[0325] The protein competitors used in this experiment were VEGF165 (11066-HNAH), TGFBR1 (10459-H02H), ANGPTL4 (10563-H01H), VEGF-C (10542-H08H), SARS-CoV-2 (2019-nCoV) Spike S1 (40591-V08H), HGF (10463-HNAS), PDGFRA (10556-H02H), and PGF (10274-H05H), all purchased from Sino Biological.

[0326] 7. Flow cytometry quantitative fluorescence calibration of surface protein expression

[0327] Forty-eight hours prior to the assay, SUM159 cells were seeded in 15 cm cell culture dishes. Once the cells reached 90% confluence, they were collected with enzyme-free dissociation buffer and centrifuged at 300 g for 5 minutes. The supernatant was discarded, and the cell pellet was washed once with DPBS, followed by another centrifugation at 300 g for 5 minutes. The cells were then resuspended in flow cytometry binding buffer (DPBS containing 2% FBS and 0.05% NaN3), and the volume was adjusted to 50 μL of final suspension per sample. The primary antibody was diluted 1:50 in flow cytometry binding buffer, and 50 μL was added to each 50 μL cell suspension, giving a final antibody dilution of 1:100. The cells were incubated at 4°C for 30 minutes with gentle mixing. After incubation, the cells were centrifuged at 300 g for 5 minutes, and the supernatant was discarded. Alexa Fluor 647 conjugated secondary antibody (species-matched) was diluted 1:500 in flow cytometry binding buffer, and the cell pellet was resuspended in 100 μL of this dilution per sample. Cells were incubated at 4°C in the dark for 20 min with gentle agitation, followed by centrifugation at 300 g for 5 min. The supernatant was discarded. Cells were then washed once with 200 μL of wash buffer and resuspended in 100 μL of wash buffer. The sample was filtered through a 40 μm cell strainer and analyzed by flow cytometry to measure fluorescence intensity.

[0328] Quantitative fluorescence calibration was performed using the Quantum™ MESF kit (Bangs Laboratories). Calibration microspheres were diluted with wash buffer and analyzed on the same flow cytometer. Data analysis was performed using FlowJo software in conjunction with the Bangs Laboratories v3.0 quantitative analysis template (www.bangslabs.com/quickcal).
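A bead-based MESF calibration of this kind is commonly expressed as a straight-line fit between log10 of the measured fluorescence and log10 of the assigned MESF value of each bead population, which is then used to convert sample fluorescence to MESF units. The sketch below shows that idea with a hand-written least-squares fit. It is an assumption about the general approach, not a description of the vendor template, and the bead values in the example are made-up placeholders.

```python
import math

def fit_loglog_calibration(bead_mfi, bead_mesf):
    """Least-squares line log10(MESF) = intercept + slope * log10(MFI).

    Returns (slope, intercept).
    """
    xs = [math.log10(v) for v in bead_mfi]
    ys = [math.log10(v) for v in bead_mesf]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mfi_to_mesf(mfi, slope, intercept):
    """Convert a sample's median fluorescence intensity to MESF units."""
    return 10.0 ** (intercept + slope * math.log10(mfi))

if __name__ == "__main__":
    # Placeholder bead values for illustration; use the lot-specific
    # values supplied with the actual calibration kit.
    bead_mfi = [120.0, 1100.0, 9800.0, 52000.0]
    bead_mesf = [2500.0, 23000.0, 205000.0, 1100000.0]
    slope, intercept = fit_loglog_calibration(bead_mfi, bead_mesf)
    print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
    print("sample MESF:", round(mfi_to_mesf(3000.0, slope, intercept)))
```

Beads and samples must be acquired with identical instrument settings for the conversion to be meaningful, as the protocol above specifies.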

[0329] 8. Proteomics analysis of SUM159 cells

[0330] More than 2 × 10^7 SUM159 cells were dissociated using enzyme-free cell dissociation buffer and washed once with DPBS, and the cell pellet was snap-frozen in liquid nitrogen. Samples were transported on dry ice to a commercial service provider (Westlake Omics) for proteomics analysis using the Astral DIA 24-minute workflow. Three independent biological replicates were analyzed, and downstream quantification and visualization were performed using average protein intensity values.

[0331] 9. Real-time interaction cytometry (RT-IC)

[0332] Real-time interaction cytometry (RT-IC) was performed on an RT-IC system (Dynamic Biosensors) that integrates a flow-permeable microfluidic cell trap with time-resolved fluorescence detection, enabling kinetic analysis of aptamer-target binding on the surface of living cells. SUM159 or A549 cells were seeded in culture dishes 48 hours prior to the experiment. When cell confluence reached approximately 90%, cells were detached using enzyme-free dissociation buffer and washed once with binding buffer. Cells were then resuspended in buffer A (wash buffer supplemented with 2% FBS, 0.1 mg/mL BSA, and 0.01 mg/mL salmon sperm DNA), filtered through a 30 μm cell strainer, and gently loaded into the microfluidic traps (6-25 μm) on the chip. Captured cells were equilibrated at 25°C in DPBS containing 1 mM MgCl2. FAM-labeled aptamers (10 nM) were injected under continuous flow. The association phase typically lasted 420 seconds, followed by a 1200-second dissociation phase with buffer only. Fluorescence signals from individual cells were recorded in real time. The association flow rate was 25 μL/min, and the dissociation flow rate was 50 μL/min. The raw sensorgrams were double-referenced by subtracting the cell-free trap signal and the baseline buffer signal. Data were normalized to account for autofluorescence and nonspecific binding. Curve fitting was performed using GraphPad Prism software, applying global nonlinear least-squares fitting of a 1:1 Langmuir binding model across the association and dissociation phases. The extracted kinetic parameters were the association rate constant (k_on), the dissociation rate constant (k_off), and KD.

[0333] 10. Quantitative Reverse Transcription PCR (RT-qPCR)

[0334] Primers were designed to target the region downstream of the gRNA target site, with one primer of each pair spanning two exons to ensure specificity. Primer sequences (5′–3′) are as follows: NRP1-For GGATGACAGCAAACGCAAGG, NRP1-Rev AGAGAGCTGGAAAAGTCCGC; SLC25A5-For GACACTGCAAAGGGAATGCTT, SLC25A5-Rev GCAGTGACAGTCTGTGCGAT; PTPRF-For CTCCTCTGACCCTGTGGAGA, PTPRF-Rev CTTTGAGGCGCTCGATGTTG; PTPRD-For ATCCACAAGGGTATGCCTGC, PTPRD-Rev CGCCAGAAGTCTTCAGTGGT; PTPRS-For TCAGAAGAGCGAGCCTACCT, PTPRS-Rev CGATCACCCAGATAAGCCCC; PTK7-For TGTGGCCTACATCATTGCCG, PTK7-Rev CTGGTCAAGGCCACTTCTTCT; ITGA3-For ATGGCAAGTGGCTGCTGTAT, ITGA3-Rev TGTCCCCAGGGTCAGAAAGA; CD151-For CGGAGCTCAAGGAGAACCTG, CD151-Rev CGGATCCACTCACTGTCTCG; GAPDH-For TTCCAGGAGCGAGATCCCT, GAPDH-Rev GGCTGTTGTCATACTTCTCATGG.

[0335] Total RNA was extracted from cells using the FastPure Cell/Tissue Total RNA Isolation Kit V2 (Vazyme Biotech Co., Ltd.). RNA (1 μg) was reverse transcribed using the HiScript III All-in-one RT SuperMix Perfect for qPCR kit. Quantitative PCR (qPCR) was performed on a CFX96 real-time PCR detection system (Bio-Rad) using the Taq Pro Universal SYBR qPCR Master Mix. Relative mRNA expression levels were normalized to GAPDH, and fold changes were calculated using the 2^-ΔΔCt method. Data were plotted using GraphPad Prism.
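The 2^-ΔΔCt calculation referred to above can be written out as a short sketch. The function name and the Ct values in the example are hypothetical; the reference gene is GAPDH, as stated in the text.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference gene, e.g. GAPDH) within each condition
    ddCt = dCt(sample) - dCt(control)
    Returns 2 ** (-ddCt).
    """
    dct_sample = ct_target_sample - ct_ref_sample
    dct_control = ct_target_control - ct_ref_control
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)

if __name__ == "__main__":
    # Hypothetical Ct values: a knockout sample versus a control sample.
    fc = fold_change_ddct(ct_target_sample=28.0, ct_ref_sample=18.0,
                          ct_target_control=25.0, ct_ref_control=18.0)
    print("relative expression:", fc)
```

A value below 1 indicates lower target mRNA in the sample than in the control, as expected after a successful knockout. The method assumes roughly equal amplification efficiency for the target and reference primers.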

[0336] II. Results Analysis

[0337] To validate the specificity of the candidate aptamer-target interactions identified by SPARK-seq, we assessed copy number differences among aptamers within each aptamer family by comparing target knockout cells with control cells. For each of the eight protein targets, aptamer binding levels changed to different degrees after target knockout, indicating that these aptamers differ in their sensitivity to target expression levels (Figure 7A-Figure 7). To confirm that aptamers with different copy number changes specifically bind to their targets, we selected aptamers with different binding abilities for validation. For example, within the PTK7-binding aptamer family, we selected Apt-1-1 (showing a large fold change in copy number, -logFC = 2.3060) and Apt-15-27 (showing a small fold change, -logFC = 0.1049) for evaluation. Flow cytometry confirmed that both aptamers specifically bound to PTK7-expressing cells but failed to bind to PTK7 knockout cells, demonstrating their specificity for PTK7 (Figure 7C). Furthermore, surface plasmon resonance (SPR) assays validated the in vitro binding affinity of the two aptamers to the PTK7 protein, with the best dissociation constants in the picomolar range, further supporting their high specificity and binding strength for PTK7 (Figure 8A and Figure 8B).

[0338] Similarly, for the targets NRP1, CDCP1, and NRP2, we selected aptamers with different copy number changes and performed flow cytometry and SPR assays (Figure 7D-Figure 7L, and Figure 8C-Figure 8). These results consistently confirm that the nucleic acid aptamers identified by the SPARK algorithm specifically bind to their targets with affinities in the nanomolar to picomolar range (Figure 7Y and Figure 7Z), confirming the accuracy of SPARK-seq in identifying nucleic acid aptamer-target interactions.

[0339] Interestingly, the SPARK algorithm predicted multiple targets for aptamer families 3 and 4: ITGA3 / ITGB1 and PTPRD / PTPRF, respectively. To validate these predictions, we first examined aptamer family 3 and selected Apt-3-3 (SEQ ID NO. 3) for further analysis. Flow cytometry showed that Apt-3-3 could not bind to ITGB1 knockout or ITGA3 knockout cells, reinforcing the specificity of Apt-3-3 for these targets (Figure 7N and Figure 9A). Because ITGB1 and ITGA3 belong to the same protein family and have been reported to interact functionally, we hypothesized that the expression of one might affect the other. To test this, we examined ITGB1 expression after ITGA3 knockout and vice versa. Western blot analysis showed that ITGA3 knockout did not affect ITGB1 protein levels, whereas ITGB1 knockout significantly reduced ITGA3 protein levels (Figure 9B and Figure 9C). Flow cytometry further confirmed the absence of ITGA3 membrane expression in ITGB1 knockout cells, while ITGB1 membrane expression was unaffected in ITGA3 knockout cells. These findings suggest that ITGB1 regulates ITGA3 membrane protein expression (Figure 9D). Co-immunoprecipitation (Co-IP) using an antibody against ITGA3 further confirmed a strong interaction between ITGB1 and ITGA3 (Figure 9E), consistent with the Apt-IP/MS results for Apt-3-3 (Figure 9F). Furthermore, because ITGA3 knockout does not alter ITGB1 expression but still disrupts Apt-3-3 binding, Apt-3-3 specifically binds to ITGA3. For aptamer family 4, we selected Apt-4-5 for validation. Using the same Co-IP method, we identified PTPRS, PTPRD, and PTPRF as candidate targets (Figure 10A). To refine these results, we performed knockout experiments for each target (Figure 10B). Knockout of PTPRS, PTPRD, or PTPRF disrupted the binding of Apt-4-5 to varying degrees, and SPR analysis revealed nanomolar binding affinity for each purified target (Figure 8P-Figure 8X); however, knocking out any one of these proteins did not significantly alter the expression of the other two on the cell membrane (Figure 10C). These results indicate that the targets of nucleic acid aptamer family 4, represented by Apt-4-5, are three members of the LAR-RPTP subfamily: PTPRS, PTPRD, and PTPRF. In summary, as shown in Figure 7Z and Figure 10D-Figure 10H, for the eight target proteins the method of this invention identified several novel nucleic acid aptamers with good affinity: Apt-1-1 (SEQ ID No. 1) showed the best binding affinity for PTK7 protein, especially its central 46-base sequence (see the underlined part of the sequence listing; the sequence can be truncated to this region); Apt-13-18 (SEQ ID No. 5) and Apt-17-39 (SEQ ID No. 13) both showed good affinity for CDCP1 protein, with Apt-13-18 showing the highest affinity; Apt-2-2 (SEQ ID No. 2), Apt-6-6 (SEQ ID No. 14), and Apt-7-10 (SEQ ID No. 15) showed high affinity for NRP1 protein, with Apt-2-2 showing the highest affinity; Apt-11-15 (SEQ ID No. 16) and Apt-14-26 (SEQ ID No. 6) showed good affinity for NRP2 protein, with Apt-14-26 showing the highest affinity; and Apt-4-5 (SEQ ID No. 4) binds to three proteins, PTPRS, PTPRD, and PTPRF, with excellent affinity.

[0340] Furthermore, taking Apt-4-5 as an example, this invention also compared SPARK-seq with the conventional co-immunoprecipitation-Western blotting (Co-IP/WB) method. Through multi-gRNA perturbation and cluster analysis, SPARK-seq directly indicated that Apt-4-5 binds three members of the LAR-RPTP subfamily: PTPRD, PTPRF, and PTPRS. Traditional Co-IP detected interactions with all three but could not determine whether the aptamer targets all three directly. Knocking out each protein separately using CRISPR, followed by flow cytometry validation, showed that knocking out any one of the proteins significantly reduced the Apt-4-5 binding signal, and SPR confirmed nanomolar affinity for all three recombinant proteins. This indicates that SPARK-seq can accurately identify complex binding patterns between an aptamer and multiple targets. The finding highlights the advantage of SPARK-seq in target identification: it can identify, in a single experiment, multiple proteins of the same family that an aptamer binds, especially proteins with similar conserved regions. Compared with traditional methods, SPARK-seq overcomes the difficulty of identifying multiple same-family targets that arises from differences in protein function and domain structure, demonstrating its unique advantages in molecular development.

[0341] In summary, we identified 5,535 aptamer sequences using SPARK-seq high-throughput sequencing and validated these sequences in subsequent experiments (Figure 7AA). For all 8 identified target proteins, we obtained aptamers with high binding affinity in the nanomolar range, and for 5 of these target proteins (NRP1, NRP2, PTPRF / PTPRD / PTPRS) the aptamer affinity even reached the picomolar range (Figure 7Y and Figure 7Z, Figure 10A-Figure 10E). Furthermore, this invention compared the six nucleic acid aptamers with the highest affinity for the eight target proteins (Apt-1-1 (SEQ ID No. 1, targeting PTK7 protein); Apt-2-2 (SEQ ID No. 2, targeting NRP1 protein); Apt-3-3 (SEQ ID No. 3, targeting ITGA3); Apt-4-5 (SEQ ID No. 4, targeting PTPRS, PTPRD, and PTPRF); Apt-13-18 (SEQ ID No. 5, targeting CDCP1 protein); Apt-14-26 (SEQ ID No. 6, targeting NRP2 protein)) with the best-affinity nucleic acid aptamers for the same eight target proteins selected by traditional methods. The results show that the aptamers selected by the method of this invention have significantly better affinity than those selected by traditional methods, indicating that the selection of optimal aptamers for target proteins by this invention is more comprehensive and reasonable.

[0342] This outstanding performance is primarily attributed to the SPARK-seq method, which relies on differential expression of key target proteins between two cell populations to determine aptamer sequences. By integrating thousands of control cells (including non-knockout cells and other target protein knockout cells) into a single-cell sequencing framework, the nucleic acid aptamers identified by this method exhibit extremely high sensitivity to their corresponding target expression, ensuring high affinity and specificity. Notably, because SPARK-seq identifies targets based on differential expression rather than protein abundance, it is able to recognize low-abundance surface proteins, such as NRP2. These results highlight the superior accuracy, specificity, and sensitivity of SPARK-seq, establishing its advantage over traditional nucleic acid aptamer target discovery methods.

[0343] Example 4: Validation of nucleic acid aptamer specificity and target diversity

[0344] To confirm the molecular specificity of the nucleic acid aptamers obtained from SPARK-seq, we employed a three-layer validation strategy. First, in a flow cytometry-based competition assay, cells were incubated with fluorescently labeled "hot" aptamers and then challenged with progressively increasing concentrations of unlabeled competitors. Homologous competition resulted in a significant dose-dependent decrease in fluorescence (Figure 12A and Figure 12B), indicating saturable, high-affinity binding to the target surface antigen. Conversely, a tenfold excess of non-homologous aptamers failed to attenuate the "hot" aptamer signal (Figure 11A), confirming that off-target sequences do not disrupt binding. Second, we performed aptamer-protein co-immunoprecipitation to validate binding biochemically in complex cell lysates. Biotinylated aptamers captured their targets via streptavidin pull-down, followed by detection with target-specific antibodies (Figure 12C). Consistent and stable enrichment of the predicted protein was obtained in each experiment, highlighting selective recognition across the entire cellular proteome. Finally, we performed SPR cross-binding assays on 12 representative aptamer families, covering all 8 targets identified by SPARK-seq. Each aptamer was tested against its cognate protein and 7 irrelevant proteins. High-affinity interactions were observed only with the expected target, while signals from non-target proteins were consistent with the negative control (Figure 11B). These orthogonal experiments strongly demonstrate the precise specificity of our nucleic acid aptamer library at the molecular level.

[0345] Building on this specificity, we evaluated the ability of SPARK-seq to discriminate between members of highly conserved protein families. Neuropilin-1 (NRP1) and neuropilin-2 (NRP2) share extensive sequence and structural homology (Figure 11C). Notably, SPARK-seq identified distinct aptamers that specifically bind NRP1 or NRP2 without cross-reactivity or competition (Figure 11D). Furthermore, competitive binding assays showed that the NRP1-targeting aptamer specifically blocked VEGF-165 binding (Figure 11E), indicating that it binds the canonical VEGF-binding epitope. This highlights its potential as a molecular probe for dissecting angiogenesis signaling and as a candidate for anti-angiogenic therapeutic development. Conversely, the NRP2 aptamer binds a non-canonical site, distinct from the known interaction regions for VEGF-165, VEGF-C, and TGF-β1 (Figure 11F). This binding interface may provide a new perspective for exploring NRP2-specific biology and differential ligand recognition within the neuropilin family.

[0346] To investigate whether SPARK-seq can capture proteins with diverse physicochemical properties, we analyzed the charge, hydrophobicity, structural motifs, and abundance of the eight validated targets. First, we calculated their isoelectric points (pI), which showed a wide distribution (pI 4.5–8.5) encompassing proteins that are negatively charged, neutral, or positively charged at physiological pH (Figure 13A). Furthermore, hydrophobicity analysis showed no correlation between the hydrophobicity of the primary sequence and nucleic acid aptamer capture efficiency (Figure 13B). Together, these results show that the performance of SPARK-seq is independent of target charge and of the overall hydrophobicity of the primary sequence.
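As an illustration of the kind of primary-sequence hydrophobicity score referred to above, the following Python sketch computes the grand average of hydropathy (GRAVY) using the Kyte-Doolittle scale. The text does not state which hydrophobicity measure was used, so the choice of GRAVY is an assumption, and the two peptides are toy sequences rather than any of the eight targets.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(protein_seq):
    """Mean Kyte-Doolittle hydropathy over all residues of a protein sequence."""
    seq = protein_seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

# Toy peptides (hypothetical, for illustration only).
toy_peptides = {"hydrophobic_toy": "LLIVAF", "charged_toy": "DEKRDE"}
for name, seq in toy_peptides.items():
    print(f"{name}: GRAVY = {gravy(seq):.2f}")
```

A positive score indicates a predominantly hydrophobic sequence and a negative score a predominantly hydrophilic one; a per-target score of this kind could then be compared against aptamer capture efficiency.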

[0347] Next, using AlphaFold models, we observed that all targets possess large extracellular domains composed of repetitive or modular motifs: immunoglobulin folds in PTK7 and the PTPR family proteins, and fibronectin repeats in integrins (Figure 13C and Figure 13D). These structures may provide stable binding surfaces for nucleic acid ligands. Surface electrostatic potential mapping further identified positively charged patches on these domains, providing a plausible basis for electrostatic complementarity with nucleic acid aptamers (Figure 13E).

[0348] Finally, we assessed target abundance at both the whole-cell and cell-surface levels. Mining published mass spectrometry datasets (6,447 quantified proteins), we found that five targets (e.g., PTK7) were among the top 30% by cellular abundance, NRP1 ranked close to the 30% mark, and NRP2, PTPRD, and PTPRS were below the detection level (Figure 13F). A higher-sensitivity MS platform (detecting 10,970 proteins) recovered NRP2 and PTPRS in the bottom third, while PTPRD remained undetectable, indicating extremely low expression (Figure 13G). Calibrated flow cytometry confirmed these trends at the cell surface, quantifying more than 10^5 copies per cell for ITGA3 but fewer than 1,000 copies per cell for PTPRD, PTPRF, and PTPRS (Figure 11G and Figure 13). In summary, these findings demonstrate that SPARK-seq can reliably identify nucleic acid aptamers for cell surface proteins whose expression levels span two orders of magnitude and whose physicochemical properties are diverse.

[0349] Compared with traditional methods that focus on high-abundance sequences and rely on pull-down experiments, SPARK-seq offers a more robust and sensitive single-cell differential screening approach. It effectively identifies nucleic acid aptamer-target interactions, such as those involving CDCP1, that traditional biochemical techniques may miss because of protein denaturation or low ionization efficiency (Figure 14A and Figure 14B). This cell-based functional strategy highlights the ability of SPARK-seq to discover targets that are sensitive to in vitro artifacts and conformational changes, providing a powerful complementary tool for aptamer discovery.

[0350] Example 3: SPARK-seq-based screening and characterization of aptamers with slow dissociation rates

[0351] This invention has yielded an unprecedentedly large number of nucleic acid aptamers against known targets, which we analyzed in depth. To explore why different sequences exhibit different binding properties, we focused on PTK7-binding aptamers, which showed the most pronounced differences in log fold change (logFC). We obtained 3,096 PTK7-binding aptamer sequences (from the top 10,000 sequences), and analysis revealed two fixed conserved regions: Motif 1 (SEQ ID NO. 18) and Motif 2 (SEQ ID NO. 19) (Figure 15a). That is, as long as both conserved regions are present (or fewer than 30% of the total bases of the two conserved regions are changed), nucleic acid sequences whose N46 regions share 60% or more homology with one another can bind the PTK7 protein. Subsequently, we selected eight aptamers with different sequences from different logFC ranges and analyzed them by SPARK-seq, flow cytometry, and SPR. The binding differences between PTK7-perturbed cells and control cells were consistent between SPARK-seq and flow cytometry (Figure 15b), demonstrating the accuracy of the single-cell perturbation sequencing analysis and indicating that the logFC value is related to the performance of the nucleic acid aptamers.
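The conserved-region criterion can be illustrated with a minimal Python sketch. It assumes that a "changed" base means a substitution, that each motif may sit anywhere in the random region, and that the 30% limit applies to the 21 bases of the two motifs combined; none of these details is specified in the text. The motif strings and the two N46 regions are taken from the sequence listing (SEQ ID NO. 18, 19, 1, and 7); the function names are hypothetical.

```python
MOTIF_1 = "CTGCGCCGCCGGGA"  # SEQ ID NO. 18
MOTIF_2 = "GTACGGT"         # SEQ ID NO. 19

def best_mismatches(region, motif):
    """Fewest substitutions between the motif and any same-length window of region."""
    return min(
        sum(a != b for a, b in zip(region[i:i + len(motif)], motif))
        for i in range(len(region) - len(motif) + 1)
    )

def has_conserved_regions(region, max_changed_fraction=0.30):
    """True if both motifs are present with fewer than 30% of their bases changed."""
    changed = best_mismatches(region, MOTIF_1) + best_mismatches(region, MOTIF_2)
    return changed / (len(MOTIF_1) + len(MOTIF_2)) < max_changed_fraction

# N46 regions of Apt-1-1 (SEQ ID No. 1) and Apt-1-21 (SEQ ID No. 7).
apt_1_1 = "CTGCGCCGCCGGGAGGTAAATGTGTTTCGCTGTACGGTCTTGGTAA"
apt_1_21 = "CTGCGCCGCCGGGGTAAGATGTTTATCGAACGGTACGGGTTCTTTG"

for name, region in [("Apt-1-1", apt_1_1), ("Apt-1-21", apt_1_21)]:
    print(name, has_conserved_regions(region))
```

Apt-1-1 carries both motifs unchanged, while Apt-1-21 carries variants of both motifs that stay within the tolerance, so both satisfy the criterion under these assumptions.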

[0352] To explore which properties of the nucleic acid aptamers cause the differences in binding, we measured the affinity of the eight nucleic acid aptamers on SUM195 cells (Figure 16a) and found no correlation between affinity and binding difference (Figure 15c). However, in SPR verification experiments, Apt-1-24 showed a weak correlation with its SPR-measured affinity (Figure 16b and Figure 16d); the discrepancy between the two measurements stems from the different formulas used to calculate affinity. Because SPR calculates affinity as the ratio of the dissociation rate to the association rate, we examined each term separately and found that the binding difference is not correlated with the association rate (Figure 16c) but is strongly correlated with the dissociation rate (Figure 15e). To verify this result in a more physiological environment, we also confirmed by flow cytometry that the dissociation rate and the binding difference of the nucleic acid aptamers are strongly correlated (Figure 15f): the smaller the binding difference of a nucleic acid aptamer, the faster its dissociation rate. We verified the same relationship for other target proteins (Figure 16e to Figure 16l).
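The reason affinity alone can hide differences in dissociation rate follows from the ratio used in SPR, K_D = k_off / k_on. The Python sketch below uses two hypothetical aptamers with invented rate constants (not measured values from this work) that share the same K_D but differ 100-fold in k_off, and models dissociation as simple first-order decay without rebinding, which is an assumption of the sketch.

```python
import math

def dissociation_constant(k_on, k_off):
    """Equilibrium K_D in M, from k_off (1/s) divided by k_on (1/(M*s))."""
    return k_off / k_on

def fraction_remaining(k_off, seconds):
    """Fraction of complexes still bound after a wash of the given duration,
    assuming first-order dissociation and no rebinding."""
    return math.exp(-k_off * seconds)

# Hypothetical rate constants chosen so that both aptamers have the same K_D.
fast = {"k_on": 1e6, "k_off": 1e-2}
slow = {"k_on": 1e4, "k_off": 1e-4}

kd_fast = dissociation_constant(**fast)
kd_slow = dissociation_constant(**slow)
left_fast = fraction_remaining(fast["k_off"], seconds=600)
left_slow = fraction_remaining(slow["k_off"], seconds=600)

print(f"K_D fast = {kd_fast:.1e} M, K_D slow = {kd_slow:.1e} M")
print(f"bound after 10 min: fast = {left_fast:.3f}, slow = {left_slow:.3f}")
```

Although the two aptamers are indistinguishable by K_D, almost all of the fast-dissociating aptamer is lost during a ten-minute wash while most of the slow-dissociating aptamer remains bound, which is the kind of behavior that a wash-based cell binding readout would register as a difference in binding.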

[0353] This invention further explores which factors directly determine the dissociation rate of a nucleic acid aptamer sequence. By analyzing the dissociation rates and the base differences in the conserved regions of 10 representative PTK7 aptamers (Figure 15g and Figure 15h), we found that the most frequently occurring base sequence of the conserved region does not have the slowest dissociation rate. The classic Sgc8c sequence contains the same conserved regions that we observed, but its dissociation rate is not the slowest either. The dissociation rate changes when some of the bases are mutated. These patterns show that when position 9 mutates to T, position 20 tends to mutate to the complementary base A; the co-mutation rates are 76.50% (9T-20A) and 96.66% (20A-9T), respectively (Figure 17b), indicating an interaction between the two conserved regions. Similar analysis showed that such interactions also exist within conserved region 1: for example, when the 4th base mutates to T or G, the 8th base mutates to A or C, and a certain binding ability is retained (Figure 17a). Most mutations increase the dissociation rate, and not every site tolerates mutations that preserve strong binding affinity (Figure 17c to Figure 17e). For this reason, only a small number of base mutations are found across all sequences of the family: once a nucleic acid aptamer reaches a sufficiently fast dissociation rate, it is eliminated during screening and is unlikely to be retained.
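One plausible reading of the paired percentages above is that each is a conditional frequency: the share of sequences carrying the first mutation that also carry the second. The Python sketch below computes such a frequency in both directions. This interpretation is an assumption, as are the function name and the toy aligned sequences, which are not aptamer data.

```python
def co_mutation_rate(seqs, pos_a, base_a, pos_b, base_b):
    """Among aligned sequences with base_a at pos_a, the fraction that also
    have base_b at pos_b. Positions are 1-based."""
    carriers = [s for s in seqs if s[pos_a - 1] == base_a]
    if not carriers:
        return float("nan")
    return sum(s[pos_b - 1] == base_b for s in carriers) / len(carriers)

# Toy aligned sequences (hypothetical): position 2 plays the role of
# "position 9" and position 4 the role of "position 20".
toy = ["ATGAC", "ATGAC", "ATGCC", "ATGGC", "ACGGC"]

rate_t_then_a = co_mutation_rate(toy, 2, "T", 4, "A")  # P(4A | 2T)
rate_a_then_t = co_mutation_rate(toy, 4, "A", 2, "T")  # P(2T | 4A)

print(f"2T-4A: {rate_t_then_a:.2%}")
print(f"4A-2T: {rate_a_then_t:.2%}")
```

The two directions generally differ, as they do in the reported 76.50% and 96.66% values: nearly every sequence with the second mutation also has the first, but not the reverse.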

[0354] In summary, we found that the differences in binding to the same target are not directly related to the magnitude of the affinity, but rather to the dissociation rate; a higher dissociation rate results in smaller binding differences. Nucleic acid aptamers with slow dissociation rates exhibit significant advantages in therapeutic and diagnostic applications. Their long-term binding to the target improves potency and efficacy, making them particularly valuable for sustained drug-receptor interactions and for enhancing molecular imaging by improving the signal-to-noise ratio. In drug delivery, these aptamers can control the stepwise release of drugs, contributing to more precise treatment. Unlike antibodies, which are limited by intra-receptor selectivity and batch-to-batch variability, aptamers can be engineered in vitro to achieve specific binding properties, including customized dissociation rates, thus providing greater flexibility for biotechnological and medical applications.

[0355] To obtain the nucleic acid aptamers with the slowest dissociation rates for the different target proteins, we first compared the logFC values of PTK7-binding aptamers with their enrichment in SPARK-seq (Figure 19a). A significant difference in logFC was found between the slow-dissociating aptamer Apt-1-21 and Apt-1-7, the most enriched aptamer in R4 as determined by NGS. Flow cytometry showed that Apt-1-21 dissociated more slowly than Apt-1-7 (Figure 19b). In traditional identification methods, the most enriched sequence is typically selected as the aptamer, but this is often not the sequence with the slowest dissociation rate, so the slowest-dissociating aptamer is difficult to obtain in that way. Apt-1-1, the most enriched PTK7-binding aptamer in SPARK-seq, still dissociates faster than Apt-1-21 (Figure 15g) but more slowly than Apt-1-7, which suggests to some extent that single-cell perturbation sequencing increased the screening pressure. For the aptamers binding several of the targets, the most enriched sequences were the same in SPARK-seq and NGS (Figure 18a). Similar methods were then used to analyze the aptamers of the other targets. Apt-6-292, which binds NRP1, dissociated more slowly than the most enriched sequence Apt-2-2 (Figure 19c and Figure 19d); Apt-11-116, which binds NRP2, dissociated more slowly than the most enriched sequence Apt-11-15 (Figure 19e and Figure 19f); and Apt-4-200, which binds PTPRD / PTPRF, dissociated more slowly than the most enriched sequence Apt-4-5 (Figure 19g and Figure 19h). Similar results were observed for the targets CDCP1 and ITGA3 (Figure 18b to Figure 18e). We characterized the affinity of Apt-6-292, Apt-11-116, and Apt-4-200 by flow cytometry (Figure 18f to Figure 18h) and reached the same conclusion as before (Figure 15d): affinity was not significantly improved. These results demonstrate that the aptamer with the strongest affinity for the target protein does not necessarily have the slowest dissociation rate. For the eight target proteins screened in this invention, namely PTK7 (UniProt: Q13308), NRP1 (UniProt: O14786), NRP2 (UniProt: O60462), PTPRD (UniProt: P23468) / PTPRF (UniProt: P10586) / PTPRS (UniProt: Q13332), CDCP1 (UniProt: Q9H5V8), and ITGA3 (UniProt: P26006), the aptamers with the slowest dissociation rates were, respectively: Apt-1-21 (SEQ ID No. 7), Apt-6-292 (SEQ ID No. 8), Apt-11-116 (SEQ ID No. 9), Apt-4-200 (SEQ ID No. 10), Apt-5-4 (SEQ ID No. 11), and Apt-3-191 (SEQ ID No. 12). In summary, we have developed a novel method for screening aptamers with slow dissociation rates, which outperforms traditional identification methods because the high-throughput identification system analyzes a large number of aptamers simultaneously.

[0356] Although the present invention has been disclosed above with reference to preferred embodiments, it is not intended to limit the present invention. Any modifications and alterations made by those skilled in the art without departing from the spirit and scope of the present invention shall still fall within the protection scope of the present invention.

[0357] Sequence Listing SEQ ID No. 1

[0358] Apt-1-1:

[0359] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG CTGCGCCGCCGGGAGGTAAATGTGTTTCGCTGTACGG TCTTGGTAA GCAGCTCGGCCCATATAAGAAA

[0360] SEQ ID No.2

[0361] Apt-2-2:

[0362] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG GAGTGGCATCTATTACTTAGTGCTACGGCTCTCGGGA TGCTCTTCA GCAGCTCGGCCCATATAAGAAA

[0363] SEQ ID No.3

[0364] Apt-3-3:

[0365] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG TTTCGGCGGGTGAATATCCAACTGGTCCGTCCCTTGG GATCTTTGT GCAGCTCGGCCCATATAAGAAA

[0366] SEQ ID No.4

[0367] Apt-4-5:

[0368] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG GGTTTGCTGAGGTGGGCGTCGTTGAATGTTAGTTCGG GAATACTTG GCAGCTCGGCCCATATAAGAAA

[0369] SEQ ID No.5

[0370] Apt-13-18:

[0371] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG TTGCTTTTCCCCGCAGCAGGACGTAAGCTCGTCCATT GGGTGGGTA GCAGCTCGGCCCATATAAGAAA

[0372] SEQ ID No.6

[0373] Apt-14-26:

[0374] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG CGTGCATTCGTAACGCTTAGTAATGTGGAATTCCATG TCTTGTCAA GCAGCTCGGCCCATATAAGAAA

[0375] SEQ ID No.7

[0376] Apt-1-21:

[0377] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG CTGCGCCGCCGGGGTAAGATGTTTATCGAACGGTACG GGTTCTTTG GCAGCTCGGCCCATATAAGAAA

[0378] SEQ ID No.8

[0379] Apt-6-292:

[0380] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG ATAGCGTTCCTAGAGCGTGGGGTGGCAGGTTTTTGAG CTTGGCTGC GCAGCTCGGCCCATATAAGAAA

[0381] SEQ ID No.9

[0382] Apt-11-116:

[0383] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG CAAGCTGTCCTAATAGATGACGTCCTCACCTCGCTGT GTTTAATCG GCAGCTCGGCCCATATAAGAAA

[0384] SEQ ID No.10

[0385] Apt-4-200:

[0386] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG GGTTTGCTGAGGTGGGCGTCGTTGAATGTTAGTTCGG GAATACTTT GCAGCTCGGCCCATATAAGAAA

[0387] SEQ ID No.11

[0388] Apt-5-4:

[0389] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG CGGTGGAGCTGTCGGTGAGCAGCGGTTGACATTGTGA GCCTTATTC GCAGCTCGGCCCATATAAGAAA

[0390] SEQ ID No.12

[0391] Apt-3-191:

[0392] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG TTTCGGCGGGTGAATATCCAACTGGTCCGTCCCTTGG GGTCTTTGT GCAGCTCGGCCCATATAAGAAA

[0393] SEQ ID No.13

[0394] Apt-17-39:

[0395] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG TCCGGGATTCTACCTACTTCCCTGATAAAGGGGAGGC TTGTCGTAA GCAGCTCGGCCCATATAAGAAA

[0396] SEQ ID No.14

[0397] Apt-6-6:

[0398] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG ATAGCGTTCCTAGAGCGTGGGGTGGCAGGTTTTTGAG CTTAGCTGC GCAGCTCGGCCCATATAAGAAA

[0399] SEQ ID No.15

[0400] Apt-7-10:

[0401] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG GGCTCCTCTTAGGGGCTGTGACCGGCGGGCGGGAATG TAGCAGGAT GCAGCTCGGCCCATATAAGAAA

[0402] SEQ ID No.16

[0403] Apt-11-15:

[0404] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG CAAGCTGTCCTAATAGATGACGTCTTCACCTCGCTGT GTTTAATCG GCAGCTCGGCCCATATAAGAAA

[0405] SEQ ID No.17

[0406] Apt-1-7:

[0407] CTCGTGGGCTCGGAGATGTGTATAAGAGACAG CTGCACCGCCGGGAGAGCTCTTGGAGTCTCGTACGGT TTATTAGAC GCAGCTCGGCCCATATAAGAAA

[0408] SEQ ID NO.18

[0409] Motif 1:CTGCGCCGCCGGGA

[0410] SEQ ID NO.19

[0411] Motif 2:GTACGGT
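For readers working with the listed sequences, the following Python sketch strips the two fixed primer regions (as given in the library formula of claim 13) to recover the middle 46 bases and then compares two aptamers position by position. The document does not define how "homology" is calculated; the gap-free percent identity used here is an assumption, and the function names are hypothetical.

```python
FWD_PRIMER = "CTCGTGGGCTCGGAGATGTGTATAAGAGACAG"  # fixed 5' region
REV_PRIMER = "GCAGCTCGGCCCATATAAGAAA"            # fixed 3' region

def random_region(listed_sequence):
    """Remove whitespace and the fixed primer regions from a listed sequence."""
    seq = "".join(listed_sequence.split()).upper()
    if not (seq.startswith(FWD_PRIMER) and seq.endswith(REV_PRIMER)):
        raise ValueError("fixed primer regions not found")
    return seq[len(FWD_PRIMER):len(seq) - len(REV_PRIMER)]

def percent_identity(a, b):
    """Position-wise identity of two equal-length sequences, in percent."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

# Apt-4-5 (SEQ ID No. 4) and Apt-4-200 (SEQ ID No. 10), copied from the listing.
apt_4_5 = ("CTCGTGGGCTCGGAGATGTGTATAAGAGACAG GGTTTGCTGAGGTGGGCGTCGTTGAATGTTAGTTCGG "
           "GAATACTTG GCAGCTCGGCCCATATAAGAAA")
apt_4_200 = ("CTCGTGGGCTCGGAGATGTGTATAAGAGACAG GGTTTGCTGAGGTGGGCGTCGTTGAATGTTAGTTCGG "
             "GAATACTTT GCAGCTCGGCCCATATAAGAAA")

core_a = random_region(apt_4_5)
core_b = random_region(apt_4_200)
identity = percent_identity(core_a, core_b)
print(len(core_a), len(core_b), f"{identity:.1f}%")
```

The two PTPRD / PTPRF / PTPRS aptamers differ at a single position of the 46-base region, which illustrates how closely related the most enriched and the slowest-dissociating members of a family can be.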

Claims

1. A method for high-throughput identification of target proteins of nucleic acid aptamers, characterized in that, Includes the following steps: (1) Combine the enriched nucleic acid aptamer library with a target cell or a target cell population; (2) Combine the enriched nucleic acid aptamer library with a perturbed cell or a population of perturbed cells; compared with the target cell, at least one protein of the perturbed cell is altered; (3) Compare the differences in binding of nucleic acid aptamers to perturbed cell populations and target cell populations to identify proteins or nucleic acid aptamers of proteins.

2. The method as described in claim 1, characterized in that, The proteins include those in the target cell population that can bind to nucleic acid aptamers.

3. The method as described in claim 2, characterized in that, The proteins include cell membrane proteins of the target cell population.

4. The method as described in claim 1, characterized in that, Step (1) is followed by screening for candidate target proteins of the target cell population.

5. The method as described in claim 4, characterized in that, The method for screening candidate target proteins of a target cell population includes: first obtaining a protein mixture solution of the target cell population, then binding the protein mixture solution of the target cell population to an enriched nucleic acid aptamer library, and screening candidate target proteins of the target cell population from there.

6. The method as described in claim 5, characterized in that, The protein mixture solution of the target cell population includes the sum of cell membrane proteins of the target cell population.

7. The method as described in claim 5, characterized in that, The method for obtaining the protein mixture solution of the target cell population includes: i. Separate the cell membranes of the target cell population; ii. Lyse the cell membrane to obtain a cell membrane protein solution of the target cell population.

8. The method of claim 7, wherein screening candidate target proteins for the target cell population includes screening a series of target proteins that can bind to more nucleic acid aptamers and are ranked highly.

9. The method as described in claim 8, characterized in that, The high ranking is a ranking based on a difference value, determined as follows: the enriched nucleic acid aptamer library and a random nucleic acid aptamer library are each bound to proteins of the target cell population; for each protein, the amount bound by the enriched nucleic acid aptamer library minus the amount bound by the random control library equals N; proteins are ranked according to their N values, and proteins with larger N values are selected as target proteins; the multiple selected target proteins together form the candidate target proteins.

10. The method as described in claim 9, characterized in that, The enriched nucleic acid aptamer library is derived from the random nucleic acid aptamer library. The random nucleic acid aptamer library is contacted with the target cell population, and the enriched nucleic acid aptamer library is obtained through screening.

11. The method as described in claim 10, characterized in that, The screening is achieved by applying different screening pressures, including any one or more of the following: the number of screening rounds, the number of washes, and an increased content of BSA or herring sperm DNA.

12. The method as described in claim 11, characterized in that, The number of screening rounds is 1 to 26.

13. The method as described in claim 12, characterized in that, The random nucleic acid aptamer library includes a library represented by the following sequence: 5'-CTCGTGGGCTCGGAGATGTGTATAAGAGACAG-Nx-GCAGCTCGGCCCATATAAGAAA-3', where Nx is a random sequence of X nucleotides, and X is 10 to 100.

14. The method as described in claim 1, characterized in that, The change in at least one protein in step (2) includes a decrease or increase in the expression level of at least one protein or a loss of at least one protein.

15. The method as described in claim 1, characterized in that, The perturbation cell population in step (2) is constructed by gene editing, knockout, knockdown or silencing of the target cell population to make one or more of its candidate target proteins not expressed or reduced in expression; the enriched nucleic acid aptamer library is simultaneously bound to one or more perturbation cell populations.

16. The method as described in claim 14, characterized in that, The perturbed cell population is obtained by knocking out a candidate target protein through gene editing; different perturbed cell populations knock out different types of candidate target proteins.

17. The method as described in claim 16, characterized in that, The gene editing method used is CRISPR-Cas9.

18. The method as described in claim 1, characterized in that, The binding difference mentioned in step (3) refers to finding differential nucleic acid aptamers by comparing the nucleic acid aptamers that bind to the perturbed cell population with those that bind to the target cell population.

19. The method as described in claim 18, characterized in that, The differentially expressed nucleic acid aptamer is a nucleic acid aptamer that, when the nucleic acid aptamers binding to a perturbed cell population are compared with the nucleic acid aptamers binding to the target cell population, binds only to the target cell population and not to the perturbed cell population. The candidate target protein knocked out in that perturbed cell population is the target protein corresponding to the differentially expressed nucleic acid aptamer.

20. The method as described in claim 19, characterized in that, The differentially expressed nucleic acid aptamers are identified through single-cell multi-omics sequencing, which includes any one or more of single-cell mRNA sequencing, single-cell nucleic acid aptamer sequencing, and single-cell CRISPR gRNA sequencing.

21. The method as described in claim 20, characterized in that, By simultaneously analyzing the differentially expressed nucleic acid aptamers that bind to each perturbed cell population and the target cell population, high-throughput acquisition of the target proteins corresponding to all differentially expressed nucleic acid aptamers can be achieved.

22. The method as described in claim 21, characterized in that, The analytical method for simultaneously analyzing the differentially expressed nucleic acid aptamers binding to each perturbed cell population and the target cell population uses the SPARK-seq algorithm, which includes: a) Calculate the difference in binding abundance of nucleic acid aptamers between the perturbed and target cell populations; b) Based on the Gaussian mixture model, a statistical threshold is set to screen for nucleic acid aptamer-target proteins with significant binding differences.

23. The method as described in claim 1, characterized in that, The target cell population includes diseased cells and / or healthy cells.

24. The method as described in claim 23, characterized in that, The diseased cells include cells that are tumorous, inflammatory, or otherwise diseased and not healthy.

25. A method for screening slow-dissociating nucleic acid aptamers, characterized in that, A series of nucleic acid aptamers targeting the target protein are screened, and the series of nucleic acid aptamers are bound to the target protein. The nucleic acid aptamer that binds to the target protein the most is selected.

26. The method as described in claim 25, characterized in that, The nucleic acid aptamer is a nucleic acid aptamer that targets a specific cell population, and the method includes the following steps: a) Combining a nucleic acid aptamer library with a target cell or a target cell population; b) The enriched nucleic acid aptamer library is combined with a perturbed cell or a population of perturbed cells; the perturbed cell population has at least one protein altered compared to the target cell; c) Compare the differences in binding of nucleic acid aptamers to the perturbed cell population and the target cell population to identify the protein or the nucleic acid aptamer of the protein, and select the nucleic acid aptamer that binds to the protein the most.

27. The method as described in claim 26, characterized in that, The nucleic acid aptamer is a nucleic acid aptamer that targets a specific cell population, and the method includes the following steps: a) Enriched nucleic acid aptamer libraries are combined with proteins in the target cell population to screen for candidate target proteins in the target cell population; b) Combining an enriched nucleic acid aptamer library with a perturbed cell population; the perturbed cell population is constructed from a target cell population lacking at least one candidate target protein; c) Compare the nucleic acid aptamers that bind to the perturbed cell population and the target cell population respectively, and identify the differential nucleic acid aptamers. The target protein of the differential nucleic acid aptamer is the candidate target protein missing in the perturbed cell population. d) Differential nucleic acid aptamers bind to target proteins, and the nucleic acid aptamer with the highest number of bindings to the target protein is selected.

28. A method for screening nucleic acid aptamers capable of recognizing cell surface proteins of different abundances, characterized in that, The nucleic acid aptamer is a nucleic acid aptamer that targets a specific cell population, and the method includes the following steps: a) Bind an enriched nucleic acid aptamer library to a target cell or a target cell population; b) The enriched nucleic acid aptamer library is combined with a perturbed cell or a population of perturbed cells; the perturbed cell population has at least one protein altered compared to the target cell; c) Compare the differences in binding of nucleic acid aptamers to perturbed cell populations and target cell populations to identify proteins or nucleic acid aptamers of proteins; The protein contains cell surface proteins whose expression levels span two or more orders of magnitude.

29. A nucleic acid aptamer that binds to the PTK7 protein, characterized in that, The nucleotide sequence has the nucleotide sequence shown in SEQ ID No. 1, SEQ ID No. 7, or SEQ ID No. 17; or a nucleotide sequence that has at least 60% homology with SEQ ID No. 1, SEQ ID No. 7, or SEQ ID No. 17 and can still bind to the PTK7 protein; or a nucleotide sequence that has at least 60% homology with the middle 46 bases of SEQ ID No. 1, SEQ ID No. 7, or SEQ ID No. 17 and can still bind to the PTK7 protein; or a nucleotide sequence that simultaneously has two conserved regions, Motif 1 and Motif 2; or a nucleotide sequence that simultaneously has two conserved regions, and the two conserved regions are less than 30% altered compared to the nucleotide sequences of Motif 1 and Motif 2; wherein Motif 1 and Motif 2 have the nucleotide sequences shown in SEQ ID NO. 18 and SEQ ID NO. 19, respectively.

30. The nucleic acid aptamer as described in claim 29, characterized in that, The aptamer includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 1 and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 7.

31. A nucleic acid aptamer that binds to the CDCP1 protein, characterized in that, The nucleotide sequence has the nucleotide sequence shown in SEQ ID No. 5, SEQ ID No. 11, or SEQ ID No. 13; or has at least 60% homology with SEQ ID No. 5, SEQ ID No. 11, or SEQ ID No. 13 and can still bind to the CDCP1 protein; or has at least 60% homology with the middle 46 bases of SEQ ID No. 5, SEQ ID No. 11, or SEQ ID No. 13 and can still bind to the CDCP1 protein.

32. The nucleic acid aptamer as described in claim 31, characterized in that, The aptamer includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 5 and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 11.

33. A nucleic acid aptamer that binds to the NRP1 protein, characterized in that, The nucleotide sequence has the nucleotide sequence shown in SEQ ID No. 2, SEQ ID No. 8, SEQ ID No. 14 or SEQ ID No. 15; or has at least 60% homology with SEQ ID No. 2, SEQ ID No. 8, SEQ ID No. 14 or SEQ ID No. 15 and can still bind to the NRP1 protein; or has at least 60% homology with the middle 46 bases of SEQ ID No. 2, SEQ ID No. 8, SEQ ID No. 14 or SEQ ID No. 15 and can still bind to the NRP1 protein.

34. The nucleic acid aptamer as described in claim 33, characterized in that, The aptamer includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 2, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 8.

35. A nucleic acid aptamer that binds to the NRP2 protein, characterized in that, The nucleotide sequence has the nucleotide sequence shown in SEQ ID No. 6, SEQ ID No. 9 or SEQ ID No. 16; or has at least 60% homology with SEQ ID No. 6, SEQ ID No. 9 or SEQ ID No. 16 and can still bind to the NRP2 protein; or has at least 60% homology with the middle 46 bases of SEQ ID No. 6, SEQ ID No. 9 or SEQ ID No. 16 and can still bind to the NRP2 protein.

36. The nucleic acid aptamer as described in claim 35, characterized in that, The aptamer includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 6, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 9.

37. A nucleic acid aptamer that simultaneously binds PTPRD, PTPRF, and PTPRS proteins, characterized in that, The nucleotide sequence has the nucleotide sequence shown in SEQ ID No. 4 or SEQ ID No. 10; or has at least 60% homology with SEQ ID No. 4 or SEQ ID No. 10 and can still bind PTPRD, PTPRF and PTPRS proteins simultaneously; or has at least 60% homology with the middle 46 bases of SEQ ID No. 4 or SEQ ID No. 10 and can still bind PTPRD, PTPRF and PTPRS proteins simultaneously.

38. The nucleic acid aptamer as described in claim 37, characterized in that, The aptamer includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 4, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 10.

39. A nucleic acid aptamer that binds to the ITGA3 protein, characterized in that, It has a nucleotide sequence as shown in SEQ ID No. 3 or SEQ ID No. 12; or a nucleotide sequence that has at least 60% homology with SEQ ID No. 3 or SEQ ID No. 12 and can still bind to the ITGA3 protein; or a nucleotide sequence that has at least 60% homology with the middle 46 bases of SEQ ID No. 3 or SEQ ID No. 12 and can still bind to the ITGA3 protein.

40. The nucleic acid aptamer as described in claim 39, characterized in that, The aptamer includes a high-affinity nucleic acid aptamer and a slow-dissociating nucleic acid aptamer, wherein the high-affinity nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 3, and the slow-dissociating nucleic acid aptamer has a nucleotide sequence as shown in SEQ ID No. 12.

41. Use of the nucleic acid aptamer according to any one of claims 29-40 in the preparation of reagents for detecting, diagnosing or treating diseases.
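The two analysis steps recited in claim 22 (a binding-difference value per aptamer, followed by a threshold derived from a Gaussian mixture model) can be illustrated with the following Python sketch. It is not the SPARK-seq algorithm itself: the two-component model, the plain expectation-maximization fit, the 0.95 posterior cutoff, and the logFC values are all assumptions made for the example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_two_component_gmm(values, n_iter=200):
    """Fit a 1-D two-component Gaussian mixture by expectation-maximization.

    Component 0 is initialized at the minimum and component 1 at the maximum.
    Returns (weights, means, sigmas).
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]
    spread = (hi - lo) / 4 or 1.0
    sigmas = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and spread of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sigmas[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            weights[k] = nk / len(values)
    return weights, means, sigmas

def posterior_high(x, weights, means, sigmas):
    """Posterior probability that x belongs to the component with the larger mean."""
    k_high = 0 if means[0] > means[1] else 1
    p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
    total = sum(p)
    return p[k_high] / total if total > 0 else 0.0

# Hypothetical logFC values: ten background aptamers and four target-dependent ones.
background = [-0.3, -0.1, 0.0, 0.1, 0.2, -0.2, 0.15, -0.05, 0.05, 0.25]
dependent = [3.8, 4.1, 4.4, 3.9]
logfc = {f"apt_{i + 1:02d}": v for i, v in enumerate(background + dependent)}

weights, means, sigmas = fit_two_component_gmm(list(logfc.values()))
hits = [name for name, v in logfc.items()
        if posterior_high(v, weights, means, sigmas) >= 0.95]
print("means:", [round(m, 2) for m in means])
print("hits:", hits)
```

In this toy data set the mixture separates a background component centered near zero from a high-logFC component, and only aptamers assigned to the latter with high posterior probability are reported as binding the knocked-out protein.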