Method for providing oligonucleotide of specific length including predictive thermodynamic characteristic information
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SEEGENE INC
- Filing Date
- 2025-12-08
- Publication Date
- 2026-06-25
AI Technical Summary
Existing methods for designing oligonucleotides are time-consuming and inefficient due to the need for extensive computational resources to calculate thermodynamic properties for various sequences, particularly in multiplex PCR and nucleic acid hybridization techniques, where precise oligonucleotide design is crucial for specific binding and stability.
The method generates adjacent base pair profiles for oligonucleotides within a specified length range and assigns predicted thermodynamic properties to each profile using adjacent base pair thermodynamic parameters, allowing optimal oligonucleotide sequences to be derived rapidly and accurately.
The method reduces the time and computational effort required to obtain oligonucleotides with desired thermodynamic properties, improving the efficiency and reproducibility of oligonucleotide design, especially for applications involving target nucleic acid detection.
Description
Method for providing oligonucleotides of a specific length containing information on predictive thermodynamic properties
[0001] Cross-reference to related applications
[0002] This patent application claims priority to Korean Patent Application No. 2024-0191255, filed with the Korean Intellectual Property Office on December 19, 2024, the disclosure of which is incorporated herein by reference.
[0003] Technical field
[0004] The present invention relates to a method for providing an oligonucleotide of a specific length containing information on predicted thermodynamic properties.
[0005] Background art
[0006] Polymerase Chain Reaction (PCR) is a widely used technique for nucleic acid analysis in the fields of biology and medicine, capable of amplifying target nucleic acid sequences for detection, sequencing, or other analyses. This technique involves a process in which two or more primers hybridize to different sites on the target nucleic acid, followed by the hybridization and dissociation of complementary strands through repeated cycles of high and low temperatures. In this process, oligonucleotides must be precisely designed to ensure specific binding to the target sequence, and the selection and design of appropriate oligonucleotides are essential for applications in diagnosis and forensics.
[0007] In particular, in multiplex PCR, oligonucleotide primers must be precisely designed to ensure specific binding to the target sequence. Multiplex PCR is a technique that simultaneously amplifies and detects multiple target nucleic acids within a single PCR mixture, contributing to reduced analysis time and increased efficiency. However, because it requires the simultaneous use of multiple primer pairs in the mixture to analyze various targets at the same time, there is a high likelihood of problems occurring, such as primer-to-primer interactions, non-specific binding, and reduced amplification efficiency.
[0008] Along with this, other technologies based on nucleic acid hybridization are also playing an important role in biological and medical research. For example, nucleic acid detection techniques such as Southern blotting involve immobilizing target nucleic acid molecules within a sample onto a solid surface (membrane support) and detecting them by hybridizing them with complementary nucleic acid probes. For these techniques as well, the design of primers and probes—specifically oligonucleotides—that can specifically bind to the target nucleic acid is crucial.
[0009] Ultimately, the success of these technologies depends on the design of primers and probes capable of specifically binding to target nucleic acids. In other words, the design of oligonucleotides is critical, and understanding their thermodynamic properties is essential for deriving optimal oligonucleotides. The thermodynamic properties of oligonucleotides significantly influence the stability of the double helix they form, their binding characteristics, and their degradation reactions. For example, thermodynamic property information such as the melting temperature (Tm), enthalpy (△H), and entropy (△S) of each oligonucleotide is a key factor in determining their binding characteristics and reactivity.
[0010] The stability of an oligonucleotide is often expressed by the melting temperature (Tm) of the duplex formed between the oligonucleotide and its complementary strand. Tm refers to the temperature at which half of the duplexes have dissociated into single strands. Nucleic acid hybridization is preferably performed at a temperature slightly lower than Tm, to optimize hybridization between the primer or probe and its target nucleic acid and to minimize non-specific hybridization between the primer or probe and non-target nucleic acids. Tm is also important in PCR, which involves thermal cycling.
[0011] Various methods have been proposed to predict thermodynamic information, such as Tm and △G, of specific oligonucleotides. Marmur and Doty (1962) used a simple equation in which Tm depends only on the relative content of cytosine and guanine. This equation was subsequently improved by adding a correction factor accounting for salt concentration, so that Tm values could be adjusted for different experimental conditions (Wetmur, 1991). In-depth analysis of DNA oligonucleotides and their measured Tm values led to the conclusion that not only does the relative content of cytosine and guanine determine the thermal denaturation of DNA, but the sequential arrangement of the different nucleotides within the DNA sequence also plays a significant role in the measured Tm value. Subsequently, Breslauer et al. (Proc. Natl. Acad. Sci. USA 1986, 83:3746-3750) adopted a thermodynamic property prediction method known as the "Nearest Neighbor (NN)" model (SantaLucia et al., Biochemistry 1996, 35:3555-3562; SantaLucia, Proc. Natl. Acad. Sci. USA 1998, 95:1460-1465). The NN model utilizes nearest neighbor thermodynamic parameters, and several optimized tables of NN parameters have been published (Gotoh and Tagashira, 1981; Vologodskii et al., 1984; Breslauer et al., 1986; Delcourt and Blake, 1991; Doktycz et al., 1992; SantaLucia et al., 1996; Sugimoto et al., 1996; Allawi and SantaLucia, 1997). Using these, it is possible to calculate the Tm of an oligonucleotide, or its △G under specific temperature conditions.
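As a rough illustration of the nearest-neighbor approach described above, the following Python sketch sums △H and △S over the adjacent base pairs of one fixed sequence. The parameter values are intended to follow the unified nearest-neighbor table cited above (Allawi and SantaLucia, 1997; SantaLucia, 1998), in kcal/mol and cal/(mol·K), and should be checked against the publications before use. The names NN_PARAMS, revcomp, and nn_sums are hypothetical, and initiation, symmetry, and salt terms are deliberately omitted.

```python
# Nearest-neighbor (NN) sums for a single, fixed DNA sequence.
# Values are (dH in kcal/mol, dS in cal/(mol*K)). They are intended to follow
# the published unified NN table; verify against the source before relying on them.
NN_PARAMS = {
    "AA": (-7.9, -22.2), "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "GT": (-8.4, -22.4), "CT": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "CG": (-10.6, -27.2), "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9),
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def nn_sums(seq):
    """Return (sum of dH, sum of dS) over all adjacent pairs of `seq`."""
    dh = ds = 0.0
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        # Each of the 16 dinucleotides maps to one of 10 duplex steps;
        # e.g. "TT" is the same duplex step as "AA" read on the other strand.
        key = step if step in NN_PARAMS else revcomp(step)
        h, s = NN_PARAMS[key]
        dh += h
        ds += s
    return dh, ds
```

In this conventional workflow the sequence comes first and the sums are computed afterwards, so every candidate sequence requires its own pass through `nn_sums`.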
[0012] As such, the thermodynamic properties of an oligonucleotide are generally obtained by first fixing its base sequence and then calculating the properties using a specific method. In other words, to design an oligonucleotide with desired thermodynamic properties, one must first select a sequence and calculate its corresponding properties, and then repeat the calculation over many sequences to select the most suitable one. Calculating thermodynamic properties for all possible sequences in order to derive the optimal sequence is very time-consuming and consumes significant computational resources.
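To give a sense of scale for the exhaustive approach described above, the short Python sketch below counts how many distinct sequences fall within a length range, assuming a four-letter alphabet (A, C, G, T). The function name count_sequences is hypothetical.

```python
def count_sequences(min_len, max_len, alphabet_size=4):
    """Number of distinct sequences with min_len <= length <= max_len."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))


# A single length of 20 bases already gives 4**20 candidate sequences.
print(count_sequences(20, 20))  # 1099511627776
```

Each of those candidates would need its own thermodynamic calculation if properties were evaluated sequence by sequence.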
[0013] Therefore, in order to drastically reduce the time and effort required in existing methods, it is necessary to develop a new method that can more efficiently obtain information on oligonucleotide sequences and their corresponding thermodynamic properties.
[0014]
[0015] The inventors have endeavored to develop a method for obtaining information on the predicted thermodynamic properties of oligonucleotides more rapidly and precisely. As a result, the inventors developed a method that utilizes adjacent base pair profiles at an intermediate stage to provide information on the thermodynamic properties of oligonucleotides when generating oligonucleotides within a specific desired length range, and confirmed that this allows for the more rapid and accurate acquisition of oligonucleotide sequences and information on their predicted thermodynamic properties.
[0016] Accordingly, the object of the present invention is to provide a method for providing an oligonucleotide of a specific length containing information on predicted thermodynamic properties.
[0017] Another object of the present invention is to provide a computer-readable recording medium comprising instructions that cause a processor to execute a method of providing an oligonucleotide of a specific length containing information on predicted thermodynamic properties.
[0018] Another object of the present invention is to provide a computer program, to be stored on a computer-readable recording medium, that causes a processor to execute a method of providing an oligonucleotide of a specific length containing information on predicted thermodynamic properties.
[0019] Another object of the present invention is to provide a method for providing a tagging oligonucleotide used for the detection of a target nucleic acid molecule in a sample.
[0020] Other objects and advantages of the present invention will become more apparent from the following detailed description together with the appended claims and drawings.
[0021]
[0022] According to one aspect of the present invention, a method is provided for providing an oligonucleotide of a specific length containing predictive thermodynamic property information, comprising the following steps.
[0023] (a) specifying a length range of oligonucleotides to be obtained;
[0024] (b) generating one or more adjacent base pair profiles for sequences of oligonucleotides satisfying the length range;
[0025] (c) providing predicted thermodynamic property information for each profile, using adjacent base pair thermodynamic parameters, for some or all of the generated adjacent base pair profiles; and
[0026] (d) generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information has been assigned.
[0027] In one embodiment, the thermodynamic property may be an enthalpy change (△H), an entropy change (△S), a free energy change (△G), or a melting temperature (Tm).
[0028] In one embodiment, in step (b), an adjacent base pair profile can be generated through a search algorithm.
[0029] In one embodiment, the search algorithm may be any one of depth-first search, breadth-first search, dynamic programming, recursive backtracking, or divide and conquer.
[0030] In one embodiment, in step (b), all adjacent base pair profiles that can be derived from an oligonucleotide sequence satisfying the length range are generated; in step (c), all generated adjacent base pair profiles are assigned predictive thermodynamic properties; and in step (d), oligonucleotides corresponding to all adjacent base pair profiles assigned predictive thermodynamic property information are generated; thereby providing the sequences and predictive thermodynamic property information of all oligonucleotides satisfying the specified length range.
[0031] In one embodiment, in step (a) or step (c), a range of predicted thermodynamic properties of the oligonucleotide to be obtained may be further specified.
[0032] In one embodiment, in step (b), all adjacent base pair profiles that can be derived from an oligonucleotide sequence satisfying the length range may be generated; and in step (c), the step of filtering adjacent base pair profiles satisfying the range of specified predicted thermodynamic properties may be further included.
[0033] In one embodiment, in step (d), oligonucleotides can be generated only for adjacent base pair profiles satisfying an arbitrarily selected length within a specified length range.
[0034] In one embodiment, the method may further include the step of selecting a single adjacent base pair profile by calculating the predicted thermodynamic properties of some adjacent base pair profiles through an optimization algorithm based on a specified length range and a range of predicted thermodynamic properties.
[0035] In one embodiment, in step (b), one or more adjacent base pair profiles that can be derived from the sequence of an oligonucleotide satisfying the length range are generated, and in step (c), after providing predictive thermodynamic property information of the adjacent base pair profiles, adjacent base pair profiles satisfying the specified predictive thermodynamic property range can be filtered.
[0036]
[0037] In one embodiment, the single adjacent base pair profile may satisfy a length arbitrarily selected within a specified length range.
[0038] In one embodiment, the optimization algorithm may be any one of a greedy algorithm, a branch-and-bound method, Monte Carlo simulation, a genetic algorithm, Bayesian optimization, and restricted random sampling.
[0039] In one embodiment, in step (d), oligonucleotides can be derived from the generated profile through a search algorithm.
[0040] In one embodiment, the search algorithm may be any one of depth-first search, breadth-first search, dynamic programming, recursive backtracking, or divide and conquer.
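One way to picture step (d) with recursive backtracking is the following Python sketch, which enumerates every sequence whose adjacent-pair counts match a given profile. It assumes a profile is a mapping from two-letter strings (such as "AT") to counts of directly adjacent pairs; the function name sequences_from_profile is hypothetical, and the source does not specify this exact procedure.

```python
from collections import Counter


def sequences_from_profile(profile):
    """Enumerate all sequences whose adjacent-pair counts equal `profile`."""
    remaining = Counter(profile)
    total_pairs = sum(remaining.values())
    results = []

    def extend(seq):
        # A sequence with k adjacent pairs has k + 1 bases.
        if len(seq) == total_pairs + 1:
            results.append(seq)
            return
        last = seq[-1]
        for nxt in "ACGT":
            pair = last + nxt
            if remaining[pair] > 0:
                remaining[pair] -= 1      # consume one occurrence of the pair
                extend(seq + nxt)
                remaining[pair] += 1      # backtrack

    for start in "ACGT":
        extend(start)
    return sorted(results)


print(sequences_from_profile({"AT": 1, "TA": 1}))  # ['ATA', 'TAT']
```

Because every sequence returned shares the same profile, all of them inherit the predicted thermodynamic values already assigned to that profile, and no per-sequence recalculation is needed under a purely additive model.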
[0041] In one embodiment, in step (c), the adjacent base pair thermodynamic parameter may include the nearest neighbor (NN) thermodynamic parameter or the next-nearest neighbor (NNN) thermodynamic parameter.
[0042] In one embodiment, in step (c), the adjacent base pair thermodynamic parameter may include parameters of enthalpy change (△H) and entropy change (△S).
[0043] In one embodiment, in step (c), the predicted thermodynamic properties can be obtained using any one selected from the group consisting of an enthalpy change (△H) for an adjacent base pair profile, an entropy change (△S) for an adjacent base pair profile, a parameter for correcting the entropy change, a parameter for correcting the Tm contribution due to the length of the oligonucleotide, and combinations thereof.
[0044] In one embodiment, the thermodynamic property can be calculated as Tm using the following Equation I.
[0045] <Equation I>
[0046] Tm = (m × △H°) / △S° − n
[0047] In the above equation, Tm is the melting temperature of the oligonucleotide; △H° is the sum of enthalpy changes; △S° is the sum of entropy changes; and m and n are constants.
[0048] In one embodiment, m can be 1000 and n can be 273.15.
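The following Python sketch works through a Tm calculation under stated assumptions. The form of Equation I is not legible in this text, so the sketch assumes Tm = m × △H° / △S° − n, which is consistent with the listed symbols and with m = 1000 and n = 273.15 if △H° is in kcal/mol, △S° is in cal/(mol·K), and Tm is in degrees Celsius. The function name melting_temperature and the input values are hypothetical.

```python
def melting_temperature(dh_kcal, ds_cal, m=1000.0, n=273.15):
    """Tm in degrees Celsius, ASSUMING Tm = m * dH / dS - n.

    m converts kcal to cal; n converts kelvin to degrees Celsius.
    """
    if ds_cal == 0:
        raise ValueError("entropy sum must be non-zero")
    return m * dh_kcal / ds_cal - n


# Hypothetical sums for illustration only (not taken from the source).
print(round(melting_temperature(-150.0, -420.0), 2))  # 83.99
```

Any correction terms mentioned elsewhere in the text (for entropy or for oligonucleotide length) are not modeled here.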
[0049] In one embodiment, the oligonucleotide is an artificial base sequence used in a method for detecting a target nucleic acid sequence, and may be a nucleotide sequence that does not hybridize with the target nucleic acid sequence.
[0050] In one embodiment, the oligonucleotide may be a tagging site, of a tagging oligonucleotide, that does not hybridize with the target nucleic acid. Additionally, it may be a templating site that hybridizes with neither the target nucleic acid nor the tagging site, included in an oligonucleotide comprising a capturing site to which the tagging site hybridizes.
[0051] According to another aspect of the present invention, a method is provided for providing a tagging oligonucleotide used for detecting a target nucleic acid molecule in a sample, comprising the following steps.
[0052] (a) specifying a length range of oligonucleotides to be obtained;
[0053] (b) generating one or more adjacent base pair profiles for sequences of oligonucleotides satisfying the length range;
[0054] (c) providing predicted thermodynamic property information for each profile, using adjacent base pair thermodynamic parameters, for some or all of the generated adjacent base pair profiles;
[0055] (d) generating oligonucleotides corresponding to adjacent base pair profiles to which predicted thermodynamic property information has been assigned; and
[0056] (e) i) selecting a sequence containing any one of the generated oligonucleotides as a tagging site, and ii) selecting a sequence containing an oligonucleotide that hybridizes to a target nucleic acid sequence as a targeting site, and providing a tagging oligonucleotide containing the tagging site and the targeting site.
[0057] According to another aspect of the present invention, a computer-readable recording medium is provided that includes instructions for causing a processor to execute a method of providing an oligonucleotide of a specific length containing predicted thermodynamic property information, the method comprising the following steps.
[0058] (a) specifying a length range of oligonucleotides to be obtained;
[0059] (b) generating one or more adjacent base pair profiles for sequences of oligonucleotides satisfying the length range;
[0060] (c) providing predicted thermodynamic property information for each profile, using adjacent base pair thermodynamic parameters, for some or all of the generated adjacent base pair profiles; and
[0061] (d) generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information has been assigned.
[0062] According to another aspect of the present invention, a computer program to be stored in a computer-readable recording medium is provided, which causes a processor to execute a method of providing an oligonucleotide of a specific length containing predicted thermodynamic property information, the method comprising the following steps.
[0063] (a) specifying a length range of oligonucleotides to be obtained;
[0064] (b) generating one or more adjacent base pair profiles for sequences of oligonucleotides satisfying the length range;
[0065] (c) providing predicted thermodynamic property information for each profile, using adjacent base pair thermodynamic parameters, for some or all of the generated adjacent base pair profiles; and
[0066] (d) generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information has been assigned.
[0067]
[0068] The features and advantages of the present invention will be summarized as follows:
[0069] (a) The present invention can provide an oligonucleotide of a specific length containing information on predictive thermodynamic properties in a more effective and reproducible manner by first generating an adjacent base pair profile and then generating an oligonucleotide sequence therefrom.
[0070] (b) The longer the sequence length of the desired oligonucleotide, the more significantly the time required to derive an oligonucleotide having optimal predicted thermodynamic properties can be reduced.
[0071] (c) In particular, when stringent thermodynamic properties are required, the utility of the present invention becomes more prominent, as it can evaluate whether oligonucleotides with the desired thermodynamic properties can be generated at all, or rapidly and accurately derive the optimal sequence.
[0072] (d) According to the present invention, it is practical to develop a computer program for carrying out the method of the present invention, which provides an oligonucleotide of a specific length containing information on predicted thermodynamic properties.
[0073] (e) In addition, by utilizing the oligonucleotide generated by the method of the present invention to provide a probe (e.g., a tagging oligonucleotide), it can be usefully applied in the field of detecting target nucleic acid molecules in a sample.
[0074] Brief description of the drawings
[0075] FIG. 1 is a flowchart illustrating one embodiment of the present invention for providing an oligonucleotide containing predictive thermodynamic information.
[0076] FIG. 2 is a flowchart illustrating another embodiment of the present invention for providing an oligonucleotide containing predictive thermodynamic information.
[0077] FIG. 3 is a flowchart illustrating a comparative technique for providing an oligonucleotide containing predicted thermodynamic information.
[0078] FIG. 4 is a flowchart illustrating another embodiment of the present invention for providing an oligonucleotide containing predictive thermodynamic information.
[0079] FIG. 5a is a flowchart schematically illustrating the process of providing an oligonucleotide according to Example 2 of the present specification.
[0080] FIG. 5b is a flowchart schematically illustrating the process of generating an adjacent base pair profile in the process of providing an oligonucleotide according to Example 2 of the present specification.
[0081] FIG. 5c is a flowchart schematically illustrating the process of generating an oligonucleotide sequence from an adjacent base pair profile in the process of providing an oligonucleotide according to Example 2 of the present specification.
[0082] FIG. 6 is a flowchart schematically illustrating the process of providing an oligonucleotide according to Comparative Example 2 of the present specification.
[0083] FIG. 7 is a flowchart schematically illustrating the process of providing an oligonucleotide according to Example 3 of the present specification.
[0084] FIG. 8 is a flowchart schematically illustrating the process of providing an oligonucleotide according to Comparative Example 3 of the present specification.
[0085] FIG. 9 is a flowchart schematically illustrating the process of providing an oligonucleotide according to Example 4 of the present specification.
[0086]
[0087] I. Method for providing oligonucleotides of a specific length containing information on predicted thermodynamic properties
[0088] According to one aspect of the present invention, a method is provided for providing an oligonucleotide of a specific length containing predictive thermodynamic property information, comprising the following steps.
[0089] (a) specifying a length range of oligonucleotides to be obtained;
[0090] (b) generating one or more adjacent base pair profiles for sequences of oligonucleotides satisfying the length range;
[0091] (c) providing predicted thermodynamic property information for each profile, using adjacent base pair thermodynamic parameters, for some or all of the generated adjacent base pair profiles; and
[0092] (d) generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information has been assigned.
[0093] The most significant feature of the present invention is that, rather than evaluating the thermodynamic properties of an oligonucleotide of a specific sequence, it derives an oligonucleotide possessing such thermodynamic property information by utilizing an adjacent base pair table containing thermodynamic property information. In the process of manufacturing artificial nucleic acid sequences—particularly those that hybridize directly to the target nucleic acid sequence in signal mechanisms generated to confirm the presence of a target nucleic acid in a sample—the inventors have previously verified the thermodynamic properties of all potential candidate oligonucleotides and then selected and manufactured an oligonucleotide suitable for the required conditions. However, verifying the thermodynamic properties of oligonucleotides with countless base sequence combinations was time-consuming and somewhat inefficient due to the inability to control unnecessary overlapping sequences. In contrast, the method of the present invention solves this problem by deriving the nucleic acid sequence of an oligonucleotide from a table that already contains thermodynamic property information, rather than verifying the thermodynamic properties of the oligonucleotide.
[0094]
[0095] Step (a): Specify the length range of the oligonucleotides to be obtained
[0096] First, the length range of the oligonucleotide to be obtained is specified. The length range is defined by the number of bases in the oligonucleotide and spans from an oligonucleotide having at least two bases up to a maximum number of bases set according to a specific purpose. The length range can be adjusted as appropriate for the reaction environment or application field in which the generated oligonucleotide will be used.
[0097] As used herein, the term "oligonucleotide" refers to a linear oligomer of natural or modified monomers or linkages, including deoxyribonucleotides and ribonucleotides, that is capable of specifically hybridizing to a target nucleic acid sequence and is either naturally occurring or artificially synthesized. Oligonucleotides are preferably single-stranded for maximum hybridization efficiency. Specifically, oligonucleotides are oligodeoxyribonucleotides. The oligonucleotides of the present invention may include naturally occurring dNMPs (i.e., dAMP, dGMP, dCMP, and dTMP), nucleotide analogs, or derivatives. Additionally, oligonucleotides may also include ribonucleotides. For example, the oligonucleotides of the present invention may include backbone-modified nucleotides, such as peptide nucleic acid (PNA) (M. Egholm et al., Nature, 365:566-568 (1993)), phosphorothioate DNA, phosphorodithioate DNA, phosphoramidate DNA, amide-linked DNA, MMI-linked DNA, 2'-O-methyl RNA, alpha-DNA, and methylphosphonate DNA; sugar-modified nucleotides, e.g., 2'-O-methyl RNA, 2'-fluoro RNA, 2'-amino RNA, 2'-O-alkyl DNA, 2'-O-allyl DNA, 2'-O-alkynyl DNA, hexose DNA, pyranosyl RNA, and anhydrohexitol DNA; and nucleotides having base modifications, e.g., C-5 substituted pyrimidines (substituents include fluoro-, bromo-, chloro-, iodo-, methyl-, ethyl-, vinyl-, formyl-, ethynyl-, propynyl-, alkynyl-, thiazolyl-, imidazolyl-, and pyridyl-), 7-deazapurines having a C-7 substituent (substituents include fluoro-, bromo-, chloro-, iodo-, methyl-, ethyl-, vinyl-, formyl-, alkynyl-, alkenyl-, thiazolyl-, imidazolyl-, and pyridyl-), inosine, and diaminopurine.
[0098] In particular, the term “oligonucleotide” as used herein refers to a single strand composed of deoxyribonucleotides. The term also includes an oligonucleotide that hybridizes with a cleavage fragment generated in a manner dependent on the target nucleic acid sequence.
[0099] In this specification, “target nucleic acid sequence” refers to a nucleic acid sequence to be amplified or detected using the tagging oligonucleotide of the present invention, and may be briefly denoted herein as “target” or “target sequence.” The target nucleic acid sequence may be double-stranded or single-stranded. If the target nucleic acid sequence is double-stranded, each strand may be named the forward strand or the reverse strand. Alternatively, it may be named the (+) strand (coding strand, sense strand, non-template strand) or the (-) strand (non-coding strand, antisense strand, template strand).
[0100] The oligonucleotides used herein are generally less than 200 nucleotides in length, particularly less than 150 nucleotides, more particularly less than 100 nucleotides, more particularly less than 50 nucleotides, and more particularly less than 30 nucleotides.
[0101] The term “sequence” as used herein refers to a specific arrangement order of monomers within a macromolecule. The term “nucleic acid sequence” as used herein refers to the arrangement order of nucleotides within a nucleic acid molecule, representing a nucleic acid molecule as a specific nucleic acid sequence.
[0102] The terms "nucleic acid sequence" or "nucleic acid sequence data" as used herein refer to the arrangement order of nucleotides within a nucleic acid molecule or information regarding the arrangement order of nucleotides within a nucleic acid molecule, and may be used interchangeably. The term "nucleic acid sequence data set" refers to a collection of said nucleic acid sequence data, and said nucleic acid sequence data set may be provided in the form of a list of nucleic acid sequence data or an alignment file.
[0103] The length of an oligonucleotide is a factor that directly influences its thermodynamic properties. For example, longer sequences tend to have higher binding stability and melting temperatures due to more bindable base pairs and strong stacking interactions. Therefore, in step (a) above, the length range of the oligonucleotide of interest is specified first, and in subsequent steps, other factors such as base sequence are considered in relation to thermodynamic property information.
[0104] As used herein, the term "predicted thermodynamic property information" refers to information predicting the thermodynamic properties related to the formation of a double helix by an oligonucleotide sequence under specific conditions (temperature, salt concentration, etc.). In one embodiment, the predicted thermodynamic property information may include an enthalpy change (△H), an entropy change (△S), a free energy change (△G), or a melting temperature (Tm). Such information may be calculated through mathematical modeling or simulation based on adjacent base pair prediction methodologies.
[0105] The aforementioned predicted thermodynamic property information collectively refers to information utilized as a standard or result for oligonucleotide sequence design. For example, the predicted thermodynamic property information evaluates the characteristics of complementary hybridization of oligonucleotide sequences and can be used as a design standard to minimize non-specific binding.
[0106]
[0107] Step (b): Generate one or more adjacent base pair profiles for the sequences of oligonucleotides satisfying the above length range.
[0108] Based on the length range specified in step (a) above, adjacent base pair profiles for all possible cases of an oligonucleotide of that length are generated.
[0109] The term "adjacent base pair profile" as used herein refers to the result of counting, for each combination of adjacent base pairs, how many times that combination appears within an oligonucleotide sequence. Counting may be based on various kinds of combinations, such as directly adjacent base pairs, base pairs separated by one intervening base pair, or runs of three or more consecutive base pairs.
[0110] The aforementioned adjacent base pair profile itself does not directly contain any numerical values or values related to thermodynamic properties, but can provide the frequency or distribution of specific reference base pairs within the sequence. The adjacent base pair profile is subsequently used as basic data for predicting the thermodynamic properties of oligonucleotides by applying corresponding thermodynamic parameters, and plays a key role in providing information on predicted thermodynamic properties in the present invention.
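A minimal Python sketch of such a profile, assuming the simplest case of directly adjacent bases on one strand, is shown below. The function name adjacent_pair_profile is hypothetical; the other counting schemes mentioned above are not covered.

```python
from collections import Counter


def adjacent_pair_profile(seq):
    """Count each directly adjacent two-base combination in `seq`."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


# Two different sequences can share one profile.
print(adjacent_pair_profile("ATA") == adjacent_pair_profile("TAT"))  # True
```

The profile records only how often each combination occurs, not where it occurs, which is why several sequences can collapse onto a single profile.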
[0111] In one embodiment, in step (b), one or more adjacent base pair profiles may be generated through a search algorithm. In this step, based on the length information of a specified oligonucleotide sequence, adjacent base pair profiles satisfying the corresponding length are derived through the search algorithm. In this process, all possible arrangements of adjacent base pairs suitable for the corresponding length are generated by considering the types and number of bonds that each base pair can form. At this time, the algorithm calculates the number of base pair bonds satisfying the given length and can derive all possible solutions. Additionally, due to the characteristics of the algorithm, for example, if an optimization algorithm is used, only some profiles may be generated, and an additional profile generation step may be performed by evaluating the suitability of the generated profiles. The generated adjacent base pair profiles provide possible base pair combinations within a sequence of a specific length and are utilized as basic data for predicting thermodynamic properties in subsequent steps.
[0112] In one embodiment, the search algorithm may be any one of Depth-First Search (DFS), Breadth-First Search (BFS), Dynamic Programming (DP), Recursive Backtracking, and Divide and Conquer, and various other algorithms capable of achieving the same purpose may also be applied.
[0113] Each algorithm has its own advantages and characteristics in generating adjacent base pair profiles. Graph traversal algorithms, such as Depth-First Search (DFS) and Breadth-First Search (BFS), follow the connections between nodes in a graph and can thereby visit every node systematically.
[0114] For example, Depth-First Search explores the possible combinations of adjacent base pairs one at a time, extending each selection step by step; when a path is no longer valid, the search along that path is halted and the process returns to the previous step to explore other paths. By applying pruning during this process, paths that are no longer valid (e.g., predicted to overlap with previously reviewed adjacent base pair profiles) or that cannot be optimal are cut off early, which efficiently reduces the search space.
[0115] As another example, dynamic programming (DP) combined with memoization can reduce redundant calculations and efficiently generate adjacent base pair profiles. For instance, when calculating the number of possible combinations of specific base pairs in an oligonucleotide of a given length, a combination that has already been calculated can be reused rather than recomputed, which significantly improves the overall search speed.
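The reuse of already-computed combinations can be sketched as a layer-by-layer dynamic program. This is only an illustration under stated assumptions, not the procedure of the specification: it assumes that an adjacent base pair profile is the vector of counts of the 16 nearest neighbor (NN) sequences, and the names `nn_profiles`, `DINUCS`, and `INDEX` are hypothetical. Partial sequences that share the same last base and the same counts are merged into one state, so the search never has to hold every individual sequence.

```python
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]  # the 16 NN sequences
INDEX = {d: i for i, d in enumerate(DINUCS)}


def nn_profiles(length):
    """Return the set of NN count vectors realizable by a sequence of `length` bases."""
    if length < 2:
        raise ValueError("length must be at least 2")
    # A state is (last base, counts so far). Partial sequences that reach the
    # same state are merged, which is the reuse step of the dynamic program.
    frontier = {(b, (0,) * len(DINUCS)) for b in BASES}
    for _ in range(length - 1):
        next_frontier = set()
        for last, counts in frontier:
            for b in BASES:
                updated = list(counts)
                updated[INDEX[last + b]] += 1
                next_frontier.add((b, tuple(updated)))
        frontier = next_frontier
    # Drop the last base: a profile is only the count vector.
    return {counts for _, counts in frontier}


profiles_len5 = nn_profiles(5)
```

Each returned tuple lists the counts in the order given by `DINUCS`, so a profile can be mapped back to readable form with `dict(zip(DINUCS, counts))`.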
[0116] These algorithms play an important role in generating adjacent base pair profiles and deriving optimal oligonucleotide sequences based on their different strengths and characteristics.
[0117]
[0118] Step (c): Assigning predicted thermodynamic property information to each adjacent base pair profile using adjacent base pair thermodynamic parameters
[0119] In this step, for some or all of the generated adjacent base pair profiles, predicted thermodynamic characteristic information for each profile is assigned using adjacent base pair thermodynamic parameters.
[0120] The term "adjacent base pair thermodynamic parameter" as used herein refers to a value representing the thermodynamic properties of base pairs within an oligonucleotide sequence that are related by predetermined criteria (e.g., directly adjacent base pairs, base pairs one step apart, or four consecutive base pairs). This value mainly includes the binding strength, binding stability, enthalpy change (△H), and entropy change (△S) between two specific base pairs, and provides information for numerically evaluating the thermodynamic properties of the binding and double helix structure of the corresponding base pairs.
[0121] As used herein, "assigns information on predicted thermodynamic properties" may be used interchangeably with "obtains information on predicted thermodynamic properties" or "calculates information on predicted thermodynamic properties."
[0122] In one embodiment, in step (c), the adjacent base pair thermodynamic parameter may include the nearest neighbor (NN) thermodynamic parameter or the next-nearest neighbor (NNN) thermodynamic parameter.
[0123] As used herein with regard to adjacent base pairs, “nearest neighbor (NN)” refers to a sequence consisting of two adjacent nucleotides (a dinucleotide) in an oligonucleotide. The term may be used interchangeably with “nearest neighbor base pair” or “nearest neighbor pair.”
[0124] For example, the nearest neighbor base pairs 'AT / TA' and 'GC / CG' both form complementary bonds but have different thermodynamic properties, and this difference affects predicted values such as the melting temperature (Tm). NN models integrate the combination and positional information of these adjacent base pairs to predict more precisely whether a specific sequence forms a double helix and how stable it is.
[0125] For example, the 13-base oligonucleotide sequence 5'-ATTGCTTGCTTCG-3' contains 12 overlapping nearest neighbor (NN) sequences, of which 7 are distinct: "AT", "TT", "TG", "GC", "CT", "TC", and "CG". In this case, "TT" appears with a frequency of 3; "TG", "GC", and "CT" each appear with a frequency of 2; and the remaining NN sequences each appear once.
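The counting in this example can be reproduced with a few lines of code. The sketch below assumes that a profile is simply the frequency of each overlapping dinucleotide; the function name `nn_profile` is hypothetical.

```python
from collections import Counter


def nn_profile(seq):
    """Count every overlapping dinucleotide (NN sequence) in `seq`."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


profile = nn_profile("ATTGCTTGCTTCG")
distinct_nn = len(profile)           # number of distinct NN sequences
total_nn = sum(profile.values())     # number of overlapping NN sequences
```

Sequences that differ in base order but yield the same `Counter` share one profile, which is why a profile can stand in for several sequences at once.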
[0126] There are 16 possible nearest neighbor (NN) sequences (4 bases x 4 bases), such as "AA", "AT", "AG", "AC", "TA", "TT", "TG", "TC", "GA", "GT", "GG", "GC", "CA", "CT", "CG", and "CC". However, it is known that "AA", "CA", "GT", "CT", "GA", and "GG" have the same NN parameter values as "TT", "TG", "AC", "AG", "TC", and "CC", respectively. Therefore, the total number of nearest neighbor (NN) sequences for which parameter values will be determined is 10.
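The six equalities listed above pair each NN sequence with its reverse complement, which describes the same duplex read from the opposite strand. A minimal sketch of that grouping follows; the helper names `reverse_complement` and `canonical_nn` are hypothetical, and choosing the alphabetically smaller member as the group label is an arbitrary convention.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_nn(nn):
    """Label an NN sequence and its reverse complement with one shared key."""
    return min(nn, reverse_complement(nn))


groups = {}
for pair in product("ACGT", repeat=2):
    nn = "".join(pair)
    groups.setdefault(canonical_nn(nn), []).append(nn)
```

Four NN sequences ("AT", "TA", "CG", "GC") are their own reverse complements and form single-member groups; the other twelve collapse into six pairs, giving ten groups in total.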
[0127] Meanwhile, the adjacent base pair thermodynamic parameters used in this invention are not limited to NN models, and any predictive model capable of providing information to numerically evaluate the thermodynamic properties between adjacent base pairs of oligonucleotides, such as interactions occurring between predefined adjacent base pairs, kinetic properties in the double helix formation and dissociation processes, and statistical thermodynamic calculation results regarding the probability of double helix formation under specific conditions, can be applied without limitation.
[0128] The next-nearest neighbor (NNN) model extends the range of base pair interactions considered by the NN model by also applying thermodynamic parameters to interactions between base pairs separated by one intermediate base pair. Although this may consume more computational resources, it can provide more precise predicted thermodynamic property information.
[0129] In one embodiment, in step (c), the adjacent base pair thermodynamic parameter may include parameters of enthalpy change (△H) and entropy change (△S).
[0130] In one embodiment, in step (c), the predicted thermodynamic properties can be obtained using any one selected from the group consisting of an enthalpy change (△H) for an adjacent base pair profile, an entropy change (△S) for an adjacent base pair profile, a parameter for correcting the entropy change, a parameter for correcting the Tm contribution due to the length of the oligonucleotide, and combinations thereof.
[0131] In one embodiment, the thermodynamic property may be Tm. The specified Tm calculation formula refers to any formula based on an NN model using thermodynamic parameters. The formula may be known in the art or a variation thereof. The predicted Tm of the oligonucleotide may vary depending on the formula used.
[0132] In one embodiment, Tm, as predictive thermodynamic information that the oligonucleotide produced herein may contain, can be calculated by the following Equation I.
[0133] <Equation I>
[0134]
[0135] In the above equation, Tm is the melting temperature of the oligonucleotide; △H° is the sum of enthalpy changes; △S° is the sum of entropy changes; and m and n are constants.
[0136] In the above equation, △H° can be calculated by the sum of each NN parameter △H°NN as shown in Equation II below, and △S° can be calculated by the sum of each NN parameter △S°NN.
[0137] <Equation II>
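[0138] $\Delta H^{\circ} = \sum \left(\text{each } \Delta H^{\circ}_{NN}\right), \quad \Delta S^{\circ} = \sum \left(\text{each } \Delta S^{\circ}_{NN}\right)$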
[0139] In Equation I, m can be 1000 and n can be 273.15, but m and n can be adjusted by the user. The Tm calculation formula can be found in the literature [SantaLucia, J. Jr (2007) Physical principles and visual-OMP software for optimal PCR design. Methods Mol. Biol., 402, 3-34].
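The calculation described for Equations I and II can be sketched as follows. The body of Equation I is not reproduced in this text, so the form Tm = m × △H° / △S° − n used here is an assumption inferred from the stated roles of m = 1000 and n = 273.15 (△H° in kcal/mol, △S° in cal/(mol·K), result in °C). The function name `predicted_tm` and the parameter values in `TOY_PARAMS` are hypothetical placeholders and are not taken from any published NN table.

```python
def predicted_tm(profile, nn_params, m=1000.0, n=273.15):
    """Return (dH, dS, Tm) for an NN profile.

    profile:   mapping of NN sequence -> count
    nn_params: mapping of NN sequence -> (dH in kcal/mol, dS in cal/(mol*K))
    Assumed form of Equation I: Tm = m * dH / dS - n
    """
    # Equation II: sum each NN parameter, weighted by how often the NN occurs.
    d_h = sum(count * nn_params[nn][0] for nn, count in profile.items())
    d_s = sum(count * nn_params[nn][1] for nn, count in profile.items())
    tm = m * d_h / d_s - n
    return d_h, d_s, tm


# Placeholder values for illustration only; substitute a published NN table.
TOY_PARAMS = {"AT": (-7.0, -20.0), "TT": (-8.0, -22.0)}
d_h, d_s, tm = predicted_tm({"AT": 1, "TT": 3}, TOY_PARAMS)
```

Because the inputs are counts rather than a base order, every sequence sharing the profile receives the same △H°, △S°, and Tm from this calculation.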
[0140] In one specific embodiment, Equation I can also be expressed as Equation I-1.
[0141] <Equation I-1>
[0142]
[0143] In one embodiment, the Tm calculation formula includes parameters of enthalpy change (△H) and entropy change (△S) for each nearest neighbor (NN) sequence, and one or more additional parameters.
[0144] In one embodiment, the one or more additional parameters include a parameter for correcting (or supplementing, modifying) the entropy change and / or a parameter for correcting (or supplementing, modifying) the Tm contribution due to the length of the oligonucleotide.
[0145] In a specific implementation, the Tm calculation formula is represented by the following Equation III.
[0146] <Equation III>
[0147]
[0148] In the above equation, Tm is the melting temperature of the oligonucleotide; △H° is the sum of enthalpy changes; △S° is the sum of entropy changes; α is an additional parameter for entropy correction; and m and n are constants. The value of the additional parameter may depend on the reaction environment.
[0149] In the above equation, △H° and △S° can be calculated as described above.
[0150] In one embodiment, m is 1000 and n is 273.15. m and n can be adjusted by the user.
[0151] In one specific embodiment, Equation III may also be expressed as Equation III-1.
[0152] <Equation III-1>
[0153]
[0154] In another embodiment, the Tm calculation formula can be expressed as the following Equation IV.
[0155] <Equation IV>
[0156]
[0157] In the above equation, Tm is the melting temperature of the oligonucleotide; △H° is the sum of enthalpy changes; △S° is the sum of entropy changes; β is an additional parameter for correcting the contribution of the oligonucleotide length to Tm; length is the length of the oligonucleotide; and m and n are constants.
[0158] In this equation, β is a parameter that reflects how the reaction environment affects the length-dependent contribution to Tm, and it may therefore depend on the reaction environment.
[0159] In the above equation, △H° and △S° can be calculated as described above.
[0160] In one embodiment, m is 1000 and n is 273.15. m and n can be adjusted by the user.
[0161] In one specific embodiment, Equation IV may also be expressed as Equation IV-1.
[0162] <Equation IV-1>
[0163]
[0164] In another embodiment, the Tm calculation formula can be expressed as the following Equation V.
[0165] <Equation V>
[0166]
[0167] In the above equation, Tm is the melting temperature of the oligonucleotide; △H° is the sum of enthalpy changes; △S° is the sum of entropy changes; α is an additional parameter for correcting entropy changes; β is an additional parameter for correcting the contribution of the oligonucleotide length to Tm; length is the length of the oligonucleotide; and m and n are constants.
[0168] In one embodiment, m is 1000 and n is 273.15. m and n can be adjusted by the user.
[0169] In one specific embodiment, Equation V can also be expressed as Equation V-1.
[0170] <Equation V-1>
[0171]
[0172] The formula for calculating Tm can be selected by those skilled in the art, and it should be understood that various formulas may be used in addition to those described above. For establishing such a formula, that is, for determining the parameter values in the formula for calculating Tm, refer to WO 2020/005019, which relates to an invention providing parameters optimized for various reaction environments.
[0173] Any of the published NN thermodynamic parameter tables (Breslauer et al. (1986) Proc Natl Acad Sci USA 83: 3746-3750; Sugimoto et al. (1996) Nuc Acids Res 24: 4501-4505; Allawi and SantaLucia (1997) Biochemistry 36: 10581-10594; SantaLucia & Hicks (2004) Annu. Rev. Biophys. Biomol. Struct. 33: 415-440), as well as NN thermodynamic parameters appropriately corrected to reflect each reaction environment according to the method proposed in WO 2020/005019, can be applied to the present invention.
[0174] In this step, adjacent base pair thermodynamic parameters provided in various known ways may be applied to each adjacent base pair profile generated in step (b) to assign predicted thermodynamic property information unique to that profile. For example, if the adjacent base pair profile generated in step (b) is a nearest neighbor base pair profile, any one of the known NN thermodynamic parameter tables may be applied to calculate the predicted thermodynamic property information for each profile. This involves quantifying, for each adjacent base pair profile, the enthalpy change (△H), entropy change (△S), free energy change (△G), melting temperature (Tm), or the like that an oligonucleotide sequence having that profile may exhibit.
[0175] Consequently, this step utilizes thermodynamic models for adjacent base pairs and related databases to quantify the characteristics of each adjacent base pair profile, and based on this, enables the prediction and design of sequences expected to have the same predicted thermodynamic characteristics.
[0176]
[0177] Step (d): Generating oligonucleotides corresponding to adjacent base pair profiles with assigned predicted thermodynamic property information
[0178] In this step, oligonucleotide sequences satisfying the adjacent base pair profiles to which predicted thermodynamic property information has been assigned are designed. The adjacent base pair profile serves as a direct condition for oligonucleotide design, and the predicted thermodynamic properties assigned to the profile carry over to every sequence generated from it.
[0179] Various methods can be applied to the process of finding sequences that meet the conditions of sequence length elements and base pair combination pattern elements defined in a specific adjacent base pair profile.
[0180] In one embodiment, in step (d), oligonucleotides can be derived from the generated profile through a search algorithm. In one embodiment, the search algorithm may be any one of depth-first search, breadth-first search, dynamic programming, recursive backtracking, or divide and conquer.
[0181] For example, depth-first search can increase efficiency by adding the possible options for each base position one at a time and excluding early any path that does not satisfy the conditions. In the case of dynamic programming, the optimal sequence can be derived by breaking the profile conditions down into subproblems and solving them. These search methods make the generation process more efficient while enabling sequence design that strictly adheres to the profile conditions.
[0182] The derived oligonucleotide sequences are generated based on adjacent base pair profiles, and the profiles themselves carry the information needed to predict thermodynamic properties. In other words, thermodynamic properties need not be directly verified or used during the sequence design process; the predicted thermodynamic properties of the generated sequences are already known from the profiles.
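The depth-first approach to step (d) can be sketched as a backtracking search that extends a sequence one base at a time and abandons a path as soon as the required NN sequence is no longer available in the profile. This is a minimal illustration, assuming a profile is a mapping from NN sequence to count; the function name `sequences_for_profile` is hypothetical.

```python
from collections import Counter


def sequences_for_profile(profile):
    """Enumerate every sequence whose NN counts equal `profile`."""
    remaining = Counter(profile)
    target_len = sum(remaining.values()) + 1
    found = []

    def dfs(seq):
        if len(seq) == target_len:
            found.append(seq)
            return
        last = seq[-1]
        for base in "ACGT":
            nn = last + base
            if remaining[nn] > 0:      # prune: this NN is not left in the profile
                remaining[nn] -= 1
                dfs(seq + base)
                remaining[nn] += 1     # backtrack

    for start in "ACGT":
        dfs(start)
    return sorted(found)


same_profile = sequences_for_profile({"AC": 1, "CG": 1, "GA": 1, "AG": 1})
alternating = sequences_for_profile({"AT": 2, "TA": 2})
```

Because the search branches on the next base rather than on individual NN occurrences, each matching sequence is produced exactly once.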
[0183] Furthermore, after verifying whether the generated oligonucleotides satisfy the conditions of an adjacent base pair profile, their thermodynamic properties can be further evaluated as needed. For example, depending on the sequence length, anywhere from one to thousands or millions of oligonucleotides satisfying a single adjacent base pair profile can be generated. These can all be stored in a database, and some oligonucleotides can be selected based on the desired reaction environment, probe design conditions, etc. The method used to evaluate whether the designed sequence satisfies specific characteristics such as desired stability and Tm can be applied appropriately depending on the intended reaction mechanism.
[0184] The term “providing” as used in this specification when referring to oligonucleotides includes providing the sequence of an oligonucleotide and / or preparing an oligonucleotide material.
[0185]
[0186] FIG. 1 is a flowchart illustrating an embodiment of the present invention for providing an oligonucleotide containing predictive thermodynamic information. The method of the present invention will be described in detail with reference to FIG. 1 as follows.
[0187] According to one embodiment (100) of the present invention, in step (a), a length range of oligonucleotides to be obtained is specified (110); in step (b), an adjacent base pair profile is generated for a sequence of oligonucleotides satisfying the length range (120); in step (c), predicted thermodynamic property information is assigned to each generated adjacent base pair profile using adjacent base pair thermodynamic parameters (130); and in step (d), an oligonucleotide corresponding to the adjacent base pair profile to which predicted thermodynamic property information has been assigned is generated (140).
[0188] Specifically, in step (b), all adjacent base pair profiles that can be derived from an oligonucleotide sequence satisfying the length range are generated; in step (c), all generated adjacent base pair profiles are assigned predictive thermodynamic properties; and in step (d), oligonucleotides corresponding to all adjacent base pair profiles assigned predictive thermodynamic property information can be generated.
[0189] For example, if the sequence length of the oligonucleotide is specified as 5 in step (a) above, according to the present embodiment, 742 adjacent base pair profiles are generated for all combinations in which there are 5 bases, that is, 4 nearest neighbor (NN) base pairs, and after the predicted thermodynamic property information for the 742 profiles is calculated, the possible oligonucleotide sequences for each profile are calculated, thereby providing 1,024 oligonucleotide sequences and their predicted thermodynamic property information.
[0190] In this regard, the sequence generation amount and nearest neighbor base pair profile generation amount according to specific length specification were compared and are shown in Table 1 below:
[0191]
Sequence Length | Sequence Generation Amount | NN Profile Generation Amount | (NN Profile / Sequence) %
5 | 1,024 | 742 | 72.5%
6 | 4,096 | 2,332 | 56.9%
7 | 16,384 | 6,658 | 40.6%
8 | 65,536 | 17,444 | 26.6%
9 | 262,144 | 42,293 | 16.1%
10 | 1,048,576 | 95,924 | 9.1%
11 | 4,194,304 | 205,286 | 4.9%
12 | 16,777,216 | 417,812 | 2.5%
13 | 67,108,864 | 813,800 | 1.2%
14 | 268,435,456 | 1,525,188 | 0.6%
15 | 1,073,741,824 | 2,762,514 | 0.3%
[0192]
[0193] Generally, to determine the thermodynamic properties of oligonucleotides, sequences must first be generated, adjacent base pair profiles of the generated sequences must be counted, and thermodynamic properties such as Tm must be calculated by applying various thermodynamic parameters to each combination. However, as shown in Table 1 above, under specific length conditions, the amount of sequence generated increases exponentially as the length increases, and consequently, predicting thermodynamic properties for all possible cases requires excessive time and resources.
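The conventional order of operations described here (generate every sequence first, then count its adjacent base pairs) can be sketched by brute force for short lengths. The sketch assumes a profile is the set of counts of the 16 NN sequences; the function name `brute_force_counts` is hypothetical.

```python
from collections import Counter
from itertools import product


def brute_force_counts(length):
    """Return (number of sequences, number of distinct NN profiles) for `length`."""
    profiles = set()
    n_sequences = 0
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        n_sequences += 1
        counts = Counter(seq[i:i + 2] for i in range(length - 1))
        # frozenset of (NN, count) pairs makes the profile hashable.
        profiles.add(frozenset(counts.items()))
    return n_sequences, len(profiles)


n_seq, n_prof = brute_force_counts(5)
```

The loop visits 4^length sequences, so its running time grows by a factor of four with each added base, whereas the number of distinct profiles it finds grows far more slowly.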
[0194] In contrast, in one embodiment of the present invention, nearest neighbor base pair profiles satisfying the length condition are generated first, and their number is significantly smaller than the number of sequences; this difference becomes more pronounced as the length increases. For example, for oligonucleotides of 20 bases or more, the number of possible sequences can exceed the number of profiles by a factor of millions. Therefore, when it is necessary to predict thermodynamic property information for oligonucleotides of a specific length, deriving the sequences according to one embodiment of the present invention allows the work to be performed much more efficiently. Based on this efficiency, the methodology of the present invention presents a new paradigm for designing oligonucleotides of a specific length and predicting their thermodynamic properties.
[0195]
[0196] FIG. 2 is a flowchart illustrating each step of a method (200) according to another embodiment of the present invention, and FIG. 3 is a flowchart illustrating a comparative technique for providing an oligonucleotide containing predicted thermodynamic information. Another embodiment of the present invention will be described as follows with reference to FIG. 2 and FIG. 3.
[0197] According to one embodiment (200) of the present invention, in step (a), a length range and a predicted thermodynamic property range of the oligonucleotide to be obtained are specified (210), in step (b), adjacent base pair profiles are generated for the sequence of the oligonucleotide satisfying the length range (220), in step (c), adjacent base pair profiles are given predicted thermodynamic property information for each profile using adjacent base pair thermodynamic parameters (230), in an additional step, adjacent base pair profiles satisfying the specified range of predicted thermodynamic property are filtered (240), and in step (d), an oligonucleotide corresponding to the adjacent base pair profile to which the predicted thermodynamic property information is given is generated (250).
[0198] The technical feature of this embodiment is that it pre-sets a specific range of predicted thermodynamic properties of interest, filters adjacent base pair profiles based on this range, and ultimately generates only oligonucleotide sequences that satisfy the corresponding conditions. Through this, the user can quickly and accurately design sequences that meet the desired range of thermodynamic properties. For example, if oligonucleotides with a specific Tm range are required, this methodology allows for the efficient generation of only those sequences from filtered adjacent base pair profiles.
[0199] Referring to FIG. 2, in step (a) of the present invention, a range of desired predicted thermodynamic properties is specified. This specification is intended to tailor the entire design process to specific thermodynamic property conditions.
[0200] In one embodiment, in step (a) or step (c), a range of predicted thermodynamic properties of the oligonucleotide to be obtained may be further specified. After deriving adjacent base pair profiles satisfying the specified length range and assigning predicted thermodynamic properties, the predicted thermodynamic properties of interest may be specified.
[0201] Additionally, in step (b), all adjacent base pair profiles that can be derived from an oligonucleotide sequence satisfying the length range are generated; and in step (c), the method may further include the step of filtering adjacent base pair profiles that satisfy the range of specified predicted thermodynamic properties. By filtering only the adjacent base pair profiles that satisfy the thermodynamic properties, unnecessary calculations during the sequence design process are minimized, thereby enabling the efficient design of an oligonucleotide sequence with the desired thermodynamic properties.
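The filtering step can be sketched as a simple range check applied to profiles before any sequence is generated. The profile labels and Tm values below are hypothetical placeholders, as is the function name `filter_profiles`.

```python
def filter_profiles(scored_profiles, tm_min, tm_max):
    """Keep only the profiles whose predicted Tm lies within [tm_min, tm_max]."""
    return {
        profile: tm
        for profile, tm in scored_profiles.items()
        if tm_min <= tm <= tm_max
    }


# Hypothetical profile labels and predicted Tm values (degrees C), for illustration only.
scored = {"profile_1": 48.2, "profile_2": 55.0, "profile_3": 61.7, "profile_4": 57.3}
kept = filter_profiles(scored, tm_min=54.0, tm_max=58.0)
```

Only the profiles in `kept` would then be passed to step (d), so no sequence is generated for a profile outside the specified range.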
[0202] On the other hand, referring to FIG. 3, the comparative technique (10) proceeds by specifying the length and thermodynamic property information of the desired sequence (1), generating all sequences that satisfy the length condition (2), generating adjacent base pair profiles for all generated sequences (3), assigning predicted thermodynamic property information to each sequence using its adjacent base pair profile (4), and then filtering (5) to obtain only the sequences with the thermodynamic property information of interest (6). This technique generates all sequences in bulk and evaluates each one individually; unlike the method of the present invention, it involves many unnecessary calculations and is therefore inefficient.
[0203] In one embodiment, in step (d), oligonucleotides can be generated only for adjacent base pair profiles that satisfy an arbitrarily selected length within a specified length range. For example, by specifying a specific interval length range as an initial setting, obtaining adjacent base pair profiles satisfying this range and information on their predicted thermodynamic properties, oligonucleotides can be generated only for adjacent base pair profiles with a length corresponding to an arbitrarily selected value.
[0204] In addition, all adjacent base pair profiles satisfying a specific length range can be obtained, their thermodynamic information can be predicted, adjacent base pair profiles corresponding to the desired thermodynamic information can be filtered, and only base pair profiles satisfying an arbitrary length value can be selected and designated as targets for deriving oligonucleotides.
[0205] Such selective sequence generation strategies are advantageous when dealing with large amounts of sequence data, particularly in sequence design involving specific thermodynamic property information.
[0206]
[0207] FIG. 4 is a flowchart illustrating each step of a method (300) according to another embodiment of the present invention. The method of the present invention will be described with reference to FIG. 4 as follows:
[0208] According to one embodiment (300) of the present invention, in step (a), a length range and a predicted thermodynamic property range of the oligonucleotide to be obtained are specified (310), in step (b), one or more adjacent base pair profiles are generated for the sequence of the oligonucleotide satisfying the length range (320), in step (c), based on the specified length range and the range of predicted thermodynamic properties, predicted thermodynamic property information is assigned to only some adjacent base pair profiles through an optimization algorithm, and further, a single adjacent base pair profile is selected (330), and in step (d), an oligonucleotide corresponding to the adjacent base pair profile to which the predicted thermodynamic property information is assigned is generated (340).
[0209] The technical feature of this embodiment is that it derives a single solution predicted to possess desired thermodynamic properties without calculating all predicted thermodynamic property information for all generated adjacent base pair profiles. Therefore, this embodiment has the advantage of enabling the design of oligonucleotides having target thermodynamic properties more economically and effectively.
[0210] In one embodiment, the method may further include a step of selecting a single adjacent base pair profile by calculating the predicted thermodynamic properties of some adjacent base pair profiles through an optimization algorithm based on a specified length range and a range of predicted thermodynamic properties. For example, in step (b), one or more adjacent base pair profiles that can be derived from the sequence of an oligonucleotide satisfying the length range may be generated, and in step (c), after providing information on the predicted thermodynamic properties of the adjacent base pair profiles, adjacent base pair profiles satisfying the specified range of predicted thermodynamic properties may be filtered.
[0211] In one embodiment, the single adjacent base pair profile may satisfy a length arbitrarily selected within a specified length range. The single adjacent base pair profile thus selected satisfies a specific length arbitrarily selected within the specified length condition and is designed to derive results close to the desired thermodynamic properties through an efficient search process.
[0212] In one embodiment, the optimization algorithm may be any one of a greedy algorithm, a branch-and-bound method, Monte Carlo simulation, a genetic algorithm, Bayesian optimization, or restricted random sampling, and is not limited thereto as long as it is a method capable of selecting adjacent base pair profiles that satisfy the desired predictive thermodynamic information and length conditions.
[0213] For example, when using restricted random sampling as an optimization algorithm, adjacent base pair profiles corresponding to a specified sequence length range can be randomly generated. The suitability of the generated adjacent base pair profiles is evaluated to determine if they satisfy specified predicted thermodynamic property information, and this process can be repeated until a specific adjacent base pair profile is selected. In this process, one or more adjacent base pair profiles may be generated, and the maximum limit may reach the total number of possible profiles within the corresponding sequence length range.
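A restricted random sampling loop of this kind can be sketched as follows. The specification does not state how a random profile is drawn, so this sketch assumes one stand-in: draw a random sequence of the requested length and take its NN counts. The predicate `accept` is a hypothetical placeholder for the check against the specified thermodynamic range, and `sample_profile` is a hypothetical name.

```python
import random
from collections import Counter


def sample_profile(length, accept, rng, max_tries=1000):
    """Draw random NN profiles until one satisfies `accept`, or give up."""
    for _ in range(max_tries):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        profile = Counter(seq[i:i + 2] for i in range(length - 1))
        if accept(profile):
            return profile
    return None  # no acceptable profile found within max_tries


def accept(profile):
    """Placeholder criterion; a real check would compare a predicted Tm to a range."""
    return profile["GC"] + profile["CG"] >= 1


rng = random.Random(0)  # fixed seed so the sketch is repeatable
picked = sample_profile(8, accept, rng)
```

The `max_tries` bound keeps the loop finite when the acceptance criterion is rarely or never met.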
[0214] Additionally, when using a genetic algorithm as the optimization algorithm, the number of adjacent base pair profiles can be set based on an arbitrary parameter for the initial population. In this case, the maximum number of profiles in the initial population can reach all possible profiles within the corresponding sequence length range.
[0215] This approach can effectively identify a single adjacent base pair profile satisfying the desired thermodynamic properties without an exhaustive search, which saves time and computational resources and supports an optimized design process.
[0216]
[0217] In one embodiment, the oligonucleotide is an artificial base sequence used in a method for detecting a target nucleic acid sequence, and may be a nucleotide sequence that does not hybridize with the target nucleic acid sequence.
[0218] As used herein, the term “non-hybridized nucleotide sequence” includes not only sequences that are completely non-complementary to a target nucleic acid sequence, but also sequences that are sufficiently non-complementary that they do not hybridize to the target nucleic acid sequence under certain stringent conditions. For example, a “non-hybridized nucleotide sequence” may include one or more complementary nucleotides (i.e., matches), such as 1-2, 1-3, 1-4, or 1-6 complementary nucleotides, depending on the total length. Accordingly, the term “non-hybridized nucleotide sequence” has a different meaning from the term “perfectly non-complementary.”
[0219] These artificial base sequences are designed with a priority on ensuring that nucleotides of the required length possess specific thermodynamic properties, and can be selected and utilized for their non-hybridization with target nucleic acid sequences. This reduces the possibility of non-specific binding or false positives that may occur with conventional hybridization methods when detecting target nucleic acids in samples, and provides a more precise signal generation mechanism.
[0220] The process of selecting sequences that do not hybridize with a target nucleic acid sequence among the oligonucleotides obtained through the method of the present invention can be performed through a method referred to as "specificity evaluation." The specificity evaluation process may include the step of searching for homologous sequences in a known nucleotide sequence database (e.g., GenBank) using any sequence alignment algorithm or program (e.g., BLAST) for the oligonucleotides generated according to the method of the present invention (homology search), and the step of analyzing the generated homologous sequences to confirm whether the generated oligonucleotides do not hybridize with the desired target nucleic acid sequence. Additionally, oligonucleotides having characteristics that do not participate in the amplification mechanism of the target nucleic acid itself to be detected can be selected by referring to various known methods.
[0221] The oligonucleotide provided in the present invention may be a base sequence that hybridizes with a cleavage fragment occurring dependently on the target nucleic acid sequence, and as it is selected based on optimized thermodynamic property information, it can contribute to reliable detection without direct interaction with the target nucleic acid sequence.
[0222] In one embodiment, the oligonucleotide may be a tagging site of a tagging oligonucleotide, the tagging site not hybridizing with the target nucleic acid. Additionally, the oligonucleotide may be a templating site that hybridizes with neither the target nucleic acid nor the tagging site and that is included in an oligonucleotide comprising a capturing site to which the tagging site hybridizes.
[0223] The term “tagging oligonucleotide” as used herein refers to an oligonucleotide comprising a targeting site containing a nucleotide sequence that hybridizes to a target nucleic acid sequence and a tagging site containing a nucleotide sequence that does not hybridize to the target nucleic acid sequence. Specifically, in a hybridization reaction with a target nucleic acid sequence, a tagging oligonucleotide comprises a site that hybridizes with the target nucleic acid sequence to form a double strand and a site that does not hybridize with the target nucleic acid sequence and remains single-stranded. More specifically, under stringent conditions, the targeting site of a tagging oligonucleotide contains a complementary nucleotide sequence that hybridizes with the target nucleic acid sequence, and the tagging site contains a non-complementary nucleotide sequence that does not hybridize with the target nucleic acid sequence.
[0224] The term "tagging site" as used herein refers to a cleavage fragment that is generated in a manner dependent on the presence of a target nucleic acid; it can be provided in various ways, such as those described in WO 2012/096523 or Korean Patent Application No. 10-2017-0049845.
[0225] Tagging oligonucleotides are useful as probes and primers and can be used in various analyses.
[0226] When a tagging oligonucleotide hybridizes to a target nucleic acid sequence, its targeting site is included in the hybridization, but the tagging site is not hybridized to the target nucleic acid sequence and may form a single strand. As such, oligonucleotides containing both single-stranded and double-stranded structures can be cleaved using an enzyme having 5' nuclease activity by various techniques known in the art. A number of prior art techniques can be used for such cleavage reactions, releasing a fragment containing the tagging site or a portion of the tagging site.
[0227] The above-mentioned tagging site may be required to possess specific thermodynamic properties by considering the characteristics and reaction environment of the target nucleic acid sequence and other coexisting oligonucleotides. Since the oligonucleotides generated by the method of the present invention allow for the verification of predicted thermodynamic property information, they are useful when utilized as tagging sites as they can be appropriately provided to meet the desired thermodynamic property conditions.
[0228] Additionally, the oligonucleotide provided according to the present invention may be a templating site that hybridizes with neither the target nucleic acid nor the tagging site and that is included in an oligonucleotide comprising a capturing site to which the tagging site hybridizes.
[0229] The oligonucleotide comprising the capturing site and the templating site used herein is a template in which a cleavage fragment (tagging site) can hybridize and extend when said fragment occurs in a manner dependent on the presence of a target nucleic acid; specifically, it serves as a template for extending the tagging site released from said tagging oligonucleotide. The tagging site, acting as a primer, hybridizes to said oligonucleotide and extends to form an extended dimer.
[0230] The above-mentioned templating site allows an oligonucleotide containing it to serve as a template for extension of the tagging site and, at the same time, may include any sequence that is not complementary to the tagging site or to the target nucleic acid. Therefore, an appropriate artificial sequence can be designed by considering the extended dimer formed when the tagging site hybridizes and is extended.
[0231] As used herein, the term “extension dimer” (also referred to herein as an extended dimer) refers to a dimer formed, in an oligonucleotide comprising a capturing site and a templating site, by an extension reaction in which a fragment hybridized to the capturing site is extended by a template-dependent nucleic acid polymerase using the templating site as a template.
[0232] Meanwhile, the oligonucleotide comprising the above-mentioned capturing site and templating site may include a single label or an interactive dual label.
[0233] A single label system is capable of detecting a target signal by causing a change in signal level according to the formation and melting of an extension dimer, and the label may be present in a template site, a capturing site, or within the extension dimer. The single label includes fluorescent labels, luminescent labels, chemiluminescent labels, electrochemical labels, and metal labels. Preferably, the single label includes a fluorescent label. Preferably, the single fluorescent label includes JOE, FAM, TAMRA, ROX, and fluorescein-based labels.
[0234] An interactive labeling system is a signaling system in which energy is transferred non-radiatively between a donor molecule and an acceptor molecule. As a representative example of an interactive labeling system, the FRET (fluorescence resonance energy transfer) labeling system comprises a fluorescent reporter molecule (donor molecule) and a quenching molecule (acceptor molecule). In FRET, the energy donor is fluorescent, while the energy acceptor may be fluorescent or non-fluorescent. In another form of an interactive labeling system, the energy donor is non-fluorescent, e.g., a chromophore, and the energy acceptor is fluorescent. In yet another form of an interactive labeling system, the energy donor is luminescent, e.g., bioluminescent, chemiluminescent, or electrochemiluminescent, and the acceptor is fluorescent. The donor molecule and the acceptor molecule may be described in the present invention as the reporter molecule and the quencher molecule, respectively.
[0235] Preferably, a signal indicating the presence of an extended dimer (i.e., the presence of a target nucleic acid sequence) is generated by an interactive labeling system, and more preferably by a FRET labeling system (i.e., an interactive dual labeling system).
[0236] For example, when a tagging site is hybridized to the capturing site and an extended dimer is formed using the templating site as a template, a signal can be provided by inducing a change in the signal from the interactive dual label. The formation of the extended dimer structurally separates the reporter molecule and the quencher molecule so that the quencher molecule unquenches the signal from the reporter molecule, and the melting of the extended dimer structurally brings the reporter molecule and the quencher molecule closer together so that the quencher molecule quenches the signal from the reporter molecule.
[0237] Therefore, when detecting target nucleic acids using the system described above, the thermodynamic property known as Tm of the extended dimer is one of the most critical factors to consider during the oligonucleotide design process; in particular, when performing multiplex PCR, it is essential to precisely design the Tm characteristics of each extended dimer formed for each target.
[0238] Since the method according to the present invention can provide oligonucleotides satisfying the length and Tm conditions of interest more quickly and accurately, it is very easy to select a sequence suitable as a template site for an oligonucleotide comprising the capturing site and template site as described above.
[0239]
[0240] II. Method for providing a tagging oligonucleotide used for the detection of target nucleic acid molecules in a sample
[0241] The method for providing the tagging oligonucleotide described below is intended to apply the oligonucleotide generated using the method of the present invention to the tagging site of the tagging oligonucleotide, and parts common to the above-mentioned content are omitted to avoid excessive duplication that would cause complexity in this specification.
[0242] According to another aspect of the present invention, a method is provided for providing a tagging oligonucleotide used for detecting a target nucleic acid molecule in a sample, comprising the following steps.
[0243] (a) A step of specifying a length range of oligonucleotides to be obtained;
[0244] (b) generating one or more adjacent base pair profiles for a sequence of oligonucleotides satisfying the above length range;
[0245] (c) a step of providing predicted thermodynamic characteristic information for each profile using adjacent base pair thermodynamic parameters for some or all of the generated adjacent base pair profiles;
[0246] (d) generating oligonucleotides corresponding to adjacent base pair profiles to which predicted thermodynamic property information is assigned; and
[0247] (e) i) selecting a sequence containing any one of the generated oligonucleotides as a tagging site, and ii) selecting a sequence containing a hybridized oligonucleotide to a target nucleic acid sequence as a targeting site, and providing a tagging oligonucleotide containing the tagging site and the targeting site.
[0248] Among the oligonucleotides containing the predicted thermodynamic property information generated in step (d) above, a necessary sequence can be selected as a tagging site according to the thermodynamic property conditions that must be considered in the signal generation mechanism to which the tagging oligonucleotide is applied. Furthermore, after securing a sequence pool based on the thermodynamic property conditions, a final oligonucleotide can be selected according to the reaction environment, probe design conditions, etc., to which the tagging oligonucleotide is applied. An appropriate method for selecting the final oligonucleotide as a tagging site can be applied depending on the desired reaction conditions.
[0249] According to another aspect of the present invention, a method is provided for providing an oligonucleotide to which a cleavage fragment, generated dependently on the presence of a target nucleic acid molecule in a sample, hybridizes and along which the fragment is extended, comprising the following steps.
[0250] (a) A step of specifying a length range of oligonucleotides to be obtained;
[0251] (b) generating one or more adjacent base pair profiles for a sequence of oligonucleotides satisfying the above length range;
[0252] (c) a step of providing predicted thermodynamic characteristic information for each profile using adjacent base pair thermodynamic parameters for some or all of the generated adjacent base pair profiles;
[0253] (d) generating oligonucleotides corresponding to adjacent base pair profiles to which predicted thermodynamic property information is assigned; and
[0254] (e) i) selecting a sequence containing any one of the generated oligonucleotides as a template site, and ii) selecting a sequence in which a cleavage fragment that occurs dependent on the presence of a target nucleic acid sequence is hybridized as a capture site, and providing an oligonucleotide (Capturing and Templating Oligonucleotide; CTO) comprising the capture site and the template site.
[0255]
[0256] III. Recording medium and computer program for providing oligonucleotides of a specific length containing information on predicted thermodynamic properties
[0257] The recording medium and computer program of the present invention described below are intended for executing the present invention on a computer, and common content between them is omitted to avoid excessive duplication that would cause complexity in this specification.
[0258] According to another aspect of the present invention, a computer-readable recording medium is provided that includes instructions for implementing a processor for executing a method of providing an oligonucleotide of a specific length containing predicted thermodynamic property information, said method comprising the following steps:
[0259] (a) A step of specifying a length range of oligonucleotides to be obtained;
[0260] (b) generating one or more adjacent base pair profiles for a sequence of oligonucleotides satisfying the above length range;
[0261] (c) a step of providing predicted thermodynamic characteristic information for each profile using adjacent base pair thermodynamic parameters for some or all of the generated adjacent base pair profiles; and
[0262] (d) A step of generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information is assigned.
[0263] According to another aspect of the present invention, a computer program to be stored in a computer-readable recording medium is provided, which implements a processor for executing a method of providing an oligonucleotide of a specific length containing predicted thermodynamic property information, comprising the following steps.
[0264] (a) A step of specifying a length range of oligonucleotides to be obtained;
[0265] (b) generating one or more adjacent base pair profiles for a sequence of oligonucleotides satisfying the above length range;
[0266] (c) a step of providing predicted thermodynamic characteristic information for each profile using adjacent base pair thermodynamic parameters for some or all of the generated adjacent base pair profiles; and
[0267] (d) A step of generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information is assigned.
[0268] When the above method is executed by the above processor, the above program instruction is activated, causing the processor to execute the method of the present invention described above. The above program instruction may include the following instructions: it may include an instruction to receive a signal, and may include an instruction to set the reference value using the received signal.
[0269] Program instructions for executing a method for providing oligonucleotides may include the following instructions: (i) an instruction to specify a length range of oligonucleotides to be obtained; (ii) an instruction to generate one or more adjacent base pair profiles for a sequence of oligonucleotides satisfying the said length range; (iii) an instruction to assign predictive thermodynamic property information for each profile using adjacent base pair thermodynamic parameters to some or all of the generated adjacent base pair profiles; (iv) an instruction to generate oligonucleotides corresponding to adjacent base pair profiles to which predictive thermodynamic property information has been assigned; and (v) an instruction to provide the generated oligonucleotide sequence and its thermodynamic information (e.g., display on an output device).
[0270] The method of the present invention described above is executed in a processor, for example, in a data acquisition device such as a stand-alone computer, a network-attached computer, or a real-time PCR device.
[0271] The above computer-readable recording media include various recording media such as CD-R, CD-ROM, DVD, flash memory, floppy disk, hard drive, portable HDD, USB, magnetic tape, MINIDISC, non-volatile memory card, EEPROM, optical disc, optical recording media, RAM, ROM, system memory, and web server.
[0272] Data related to the above-mentioned oligonucleotides may be received through several devices. For example, the data may be collected by a processor in an oligonucleotide design data acquisition device used for the detection of target nucleic acid molecules within a sample. The data may be provided in real-time while data is being collected, or stored in a memory unit or buffer, and provided to the processor after the experiment is completed. Similarly, the data set may be provided to a separate system, such as a desktop computer system, via a network connection (e.g., LAN, VPN, Internet, and intranet) or direct connection (e.g., USB or other direct wired or wireless connection) with the acquisition device, or provided on portable media such as CDs, DVDs, floppy disks, portable HDDs, or standalone computer systems. Similarly, the data set may be provided to a server system via a network connection (e.g., LAN, VPN, Internet, intranet, and wireless communication networks) to a client, such as a laptop or desktop computer system.
[0273] Instructions for implementing a processor for executing the present invention may be included in a logic system. Although said instructions may be provided on any software recording medium such as a portable HDD, USB, floppy disk, CD, and DVD, they may be downloadable and stored in a memory module (e.g., a hard drive or other memory such as local or attached RAM or ROM). Computer code for executing the present invention may be executed in various coding languages such as C, C++, Java, Visual Basic, VBScript, JavaScript, Perl, and XML. Additionally, various languages and protocols may be used for external and internal storage and transmission of data and commands according to the present invention.
[0274]
[0275] The present invention will be described in more detail below through examples. These examples are intended to explain the invention more specifically, and it will be obvious to those skilled in the art that the scope of the invention as set forth in the appended claims is not limited by these examples.
[0276]
[0277] <Experimental Example>
[0278] Experiments were conducted to determine whether the adjacent base pair profile-based oligonucleotide sequence generation method of the present invention can effectively provide oligonucleotides of a specific length containing information on predicted thermodynamic properties.
[0279]
[0280] Hardware and software specifications
[0281] The generation of oligonucleotides in the Examples and Comparative Examples was implemented in Docker version 24.0.7 on the Ubuntu 20.04.6 LTS operating system equipped with an Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz and 754GB DDR4 DIMM RAM. To ensure the reproducibility of the experiment and to perform platform-independent testing, Docker containers were used, and the environment was set up using the following command: docker run -it --cpus="16" --memory="64g" ubuntu /bin/bash. In this environment, Python 3.12.4 and Anaconda version 24.5.0 were used along with the operating system. These settings were selected to support the effective execution of the methods according to the Examples and Comparative Examples.
[0282]
[0283] Adjacent base pair profiles and thermodynamic parameters
[0284] The adjacent base pair profiles used in the examples and comparative examples are nearest base pair profiles counted by the Nearest Neighbor method. The adjacent base pair thermodynamic parameters used to predict thermodynamic property information were referenced from the data table described in the 2004 paper by SantaLucia & Hicks (SantaLucia Jr J, Hicks D. The thermodynamics of DNA structural motifs. Annu Rev Biophys Biomol Struct. 2004;33:415-440). The known nearest base pair parameter values are as shown in Table 2 below.
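[0285] Table 2. Nearest-neighbor thermodynamic parameters (ΔS° in cal/(K·mol), ΔH° in kcal/mol)
| Pair | ΔS° | ΔH° |
| --- | --- | --- |
| AA/TT | -21.3 | -7.6 |
| AT/TA | -20.4 | -7.2 |
| AG/TC | -21.0 | -7.8 |
| AC/TG | -22.4 | -8.4 |
| TA/AT | -21.3 | -7.2 |
| TT/AA | -21.3 | -7.6 |
| TG/AC | -22.7 | -8.5 |
| TC/AG | -22.2 | -8.2 |
| GA/CT | -22.2 | -8.2 |
| GT/CA | -22.4 | -8.4 |
| GG/CC | -19.9 | -8.0 |
| GC/CG | -24.4 | -9.8 |
| CA/GT | -22.7 | -8.5 |
| CT/GA | -21.0 | -7.8 |
| CG/GC | -27.2 | -10.6 |
| CC/GG | -19.9 | -8.0 |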
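As an illustration of how the parameters of Table 2 attach to a profile rather than to an individual sequence, the following Python sketch counts the adjacent base pairs of a sequence and sums ΔH° and ΔS° over the counts. The names `NN_PARAMS`, `profile_of`, and `profile_totals` are hypothetical, and keying each pair by its top-strand dinucleotide (e.g., "AG" for AG/TC) is an assumption about how the profile is represented.

```python
from collections import Counter

# Table 2 values, keyed by the top-strand dinucleotide (assumed representation):
# pair -> (dH in kcal/mol, dS in cal/(K*mol))
NN_PARAMS = {
    "AA": (-7.6, -21.3), "AT": (-7.2, -20.4), "AG": (-7.8, -21.0), "AC": (-8.4, -22.4),
    "TA": (-7.2, -21.3), "TT": (-7.6, -21.3), "TG": (-8.5, -22.7), "TC": (-8.2, -22.2),
    "GA": (-8.2, -22.2), "GT": (-8.4, -22.4), "GG": (-8.0, -19.9), "GC": (-9.8, -24.4),
    "CA": (-8.5, -22.7), "CT": (-7.8, -21.0), "CG": (-10.6, -27.2), "CC": (-8.0, -19.9),
}


def profile_of(seq):
    """Count each adjacent base pair (dinucleotide) occurring in seq."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def profile_totals(profile):
    """Sum dH and dS over a profile; initiation and salt terms are not included."""
    dh = sum(NN_PARAMS[pair][0] * n for pair, n in profile.items())
    ds = sum(NN_PARAMS[pair][1] * n for pair, n in profile.items())
    return dh, ds


# Two different sequences with the same profile share the same totals.
print(profile_of("AATA") == profile_of("ATAA"))
print(profile_totals(profile_of("AATA")))
```

Because the totals depend only on the counts, they need to be computed once per profile and can then be reused for every sequence that shares it.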
[0286]
[0287] In addition, in the examples and comparative examples, the melting temperature was used as the predicted thermodynamic property of interest. Excluding the initiation parameter and salt correction from the table referenced above, the melting temperature was calculated using Equation 15 described in SantaLucia's 2007 paper, which corresponds to the case where the equilibrium constant (K) value is 1 (SantaLucia Jr J. Physical principles and visual-OMP software for optimal PCR design. Methods Mol Biol. 2007;402:3-34). This equation is indicated as <Equation I> in this specification, as follows.
[0288] <Equation I>
[0289]
[0290]
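The equation itself is not reproduced in this text. Purely as an illustration, the following Python sketch evaluates the commonly used two-state expression Tm = ΔH°·1000 / (ΔS° + R·ln(C_T/4)) − 273.15 from summed nearest-neighbor values. The function name, the rounded gas constant, and the `total_strand_conc` argument (total strand concentration in mol/L for a non-self-complementary duplex) are assumptions, and the exact form of <Equation I> may differ.

```python
import math

R = 1.987  # gas constant in cal/(K*mol), rounded


def melting_temperature(dh_kcal, ds_cal, total_strand_conc=None):
    """Two-state melting temperature in degrees Celsius.

    dh_kcal: summed dH in kcal/mol; ds_cal: summed dS in cal/(K*mol).
    total_strand_conc: assumed total strand concentration (mol/L); when None,
    the concentration term is omitted and Tm reduces to dH/dS.
    """
    denominator = ds_cal
    if total_strand_conc is not None:
        denominator += R * math.log(total_strand_conc / 4.0)
    return dh_kcal * 1000.0 / denominator - 273.15


# Example with the summed values of the pairs AA, AT and TA from Table 2.
print(melting_temperature(-22.0, -63.0))
print(melting_temperature(-22.0, -63.0, total_strand_conc=1e-6))
```

Including a concentration term lowers the computed value, so any comparison with the melting temperatures reported in the tables below would depend on the concentration actually assumed.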
[0291] <Example>
[0292] 1. Providing all sequences of a specified length together with their predicted thermodynamic properties
[0293] Target sequence lengths were specified from 5 to 14, and for a total of 10 cases, the time taken to simultaneously provide all sequence information and its predicted thermodynamic properties for each sequence length was compared.
[0294]
[0295] Example 1
[0296] After searching for adjacent base pair profiles for a target sequence length using the method of the present invention, an experiment was conducted to generate all possible sequences from all adjacent base pair profiles and simultaneously provide information on their thermodynamic properties.
[0297] The schematic flow of the method for providing oligonucleotides according to Example 1 is shown in FIG. 1. Specifically, all adjacent base pair profiles satisfying the target sequence length were searched using a depth-first search algorithm, thermodynamic parameters were applied to the resulting profiles to calculate the melting temperature of each profile, and all possible sequences corresponding to each profile were then generated using dynamic programming, each sequence inheriting the melting temperature of its profile.
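One possible reading of the depth-first search step is sketched below in Python. It walks states consisting of the last base and the pair counts so far, and skips states already visited, so that the work scales with the number of distinct profiles rather than the number of sequences. The function name `enumerate_profiles` and the 16-element count tuple are hypothetical choices for illustration, not the implementation used in Example 1.

```python
from itertools import product

BASES = "ACGT"
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]  # 16 adjacent base pairs
INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def enumerate_profiles(length):
    """Return every pair-count profile realised by at least one sequence of `length`."""
    if length < 2:
        raise ValueError("length must be at least 2")
    target = length - 1          # a sequence of n bases has n - 1 adjacent pairs
    visited = set()              # (last_base, counts) states already expanded
    profiles = set()
    stack = [(base, (0,) * len(PAIRS)) for base in BASES]
    while stack:                 # iterative depth-first search
        last, counts = stack.pop()
        if (last, counts) in visited:
            continue
        visited.add((last, counts))
        if sum(counts) == target:
            profiles.add(counts)
            continue
        for nxt in BASES:
            i = INDEX[last + nxt]
            stack.append((nxt, counts[:i] + (counts[i] + 1,) + counts[i + 1:]))
    return profiles


for n in (3, 5, 8):
    print(n, len(enumerate_profiles(n)), 4 ** n)
```

For length 3, for instance, the sequences ACA and CAC share the profile {AC, CA}, which is why the number of profiles is smaller than the number of sequences.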
[0298]
[0299] Comparative Example 1
[0300] Comparative Example 1 derived all sequences corresponding to the target sequence length using recursive backtracking, and calculated the melting temperature for all sequences to obtain final sequences containing each melting temperature information.
[0301]
[0302] According to the experimental example, the operating time was measured by repeating Example 1 and Comparative Example 1 three times in the same environment. The mean and standard deviation were calculated, and the results are shown in Table 3 below.
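[0303] Table 3. Operating time for providing all sequences with predicted thermodynamic properties
| Sequence length | Number of sequences generated | Time (min), Example 1 | Time (min), Comparative Example 1 |
| --- | --- | --- | --- |
| 5 | 1,024 | 0.000±0.000 | 0.001±0.000 |
| 6 | 4,096 | 0.001±0.000 | 0.002±0.000 |
| 7 | 16,384 | 0.004±0.000 | 0.006±0.000 |
| 8 | 65,536 | 0.012±0.001 | 0.022±0.001 |
| 9 | 262,144 | 0.034±0.002 | 0.084±0.001 |
| 10 | 1,048,576 | 0.106±0.012 | 0.317±0.003 |
| 11 | 4,194,304 | 0.278±0.006 | 1.313±0.009 |
| 12 | 16,777,216 | 0.703±0.004 | 5.465±0.031 |
| 13 | 67,108,864 | 1.898±0.023 | 23.050±0.241 |
| 14 | 268,435,456 | 5.433±0.069 | 92.326±0.496 |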
[0304]
[0305] As shown in Table 3, it was confirmed that the method of the present invention can generate sequences containing predicted thermodynamic property information more quickly. In particular, it was observed that the difference in operating time between Comparative Example 1 and Example 1 gradually increases as the length of the oligonucleotide to be generated increases. This is judged to be because, as shown in Table 1 of the specification, the amount of nearest neighbor base pair profile generated is significantly less than the amount of sequence generated. In Comparative Example 1, thermodynamic property information must be calculated individually for each sequence, whereas the method of the present invention calculates thermodynamic properties only for adjacent base pair profiles and derives sequences having the same properties. Through this, it was confirmed that time and computational resources can be significantly reduced.
[0306]
[0307] 2. Providing oligonucleotides of a specified length having the desired predicted thermodynamic property information
[0308] For a total of 10 cases with target sequence lengths ranging from 5 to 14, oligonucleotides were derived by setting the melting temperature range to 50°C to 60°C as the thermodynamic characteristic of interest. The time taken to provide sequence information for oligonucleotides with a predicted melting temperature of 50°C to 60°C for each sequence length was compared.
[0309]
[0310] Example 2
[0311] After searching for adjacent base pair profiles for a target sequence length using the method of the present invention, only adjacent base pair profiles having information on predicted thermodynamic properties of interest were selected, and all possible sequences were generated therefrom.
[0312] A flowchart of the method for providing oligonucleotides according to Example 2 is shown in FIG. 5a. Specifically, all adjacent base pair profiles satisfying the target sequence length were searched using a depth-first search algorithm (see FIG. 5b). Thermodynamic parameters were applied to the resulting profiles to calculate the melting temperature, and only the profiles within the target melting temperature range were retained. All possible sequences corresponding to the retained profiles were then generated using dynamic programming, thereby obtaining final sequences annotated with their melting temperature values (see FIG. 5c).
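The expansion of a retained profile into its sequences can be sketched as follows in Python. The generator consumes pair counts while extending a prefix, and a memoised counter illustrates how dynamic programming over the state (last base, remaining counts) avoids repeated work. The function names are hypothetical, and the sketch is not the implementation used in Example 2.

```python
from collections import Counter
from functools import lru_cache

BASES = "ACGT"


def profile_of(seq):
    """Count each adjacent base pair occurring in seq."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def sequences_from_profile(profile):
    """Yield every sequence whose adjacent base pair counts equal `profile`."""
    remaining = Counter(profile)
    n_pairs = sum(remaining.values())

    def extend(prefix):
        if len(prefix) == n_pairs + 1:       # all pairs consumed
            yield "".join(prefix)
            return
        for nxt in BASES:
            pair = prefix[-1] + nxt
            if remaining[pair] > 0:
                remaining[pair] -= 1
                prefix.append(nxt)
                yield from extend(prefix)
                prefix.pop()                 # backtrack
                remaining[pair] += 1

    for first in BASES:
        yield from extend([first])


def count_sequences(profile):
    """Count the sequences of a profile without listing them (memoised recursion)."""
    pairs = sorted(profile)

    @lru_cache(maxsize=None)
    def count(last, counts):
        if sum(counts) == 0:
            return 1
        return sum(
            count(pair[1], counts[:i] + (counts[i] - 1,) + counts[i + 1:])
            for i, pair in enumerate(pairs)
            if counts[i] and pair[0] == last
        )

    return sum(count(b, tuple(profile[p] for p in pairs)) for b in BASES)


profile = profile_of("AATA")
print(sorted(sequences_from_profile(profile)))  # ['AATA', 'ATAA', 'TAAT']
print(count_sequences(profile))                 # 3
```

Every sequence produced this way can be annotated with the melting temperature already computed for its profile, so no per-sequence thermodynamic calculation is required.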
[0313]
[0314] Comparative Example 2
[0315] The schematic flow of the method according to Comparative Example 2 is shown in Fig. 6. All sequences corresponding to the target sequence length were derived using recursive backtracking, and the melting temperature was calculated for all sequences. Only sequences corresponding to the target melting temperature range were filtered to obtain final sequences containing information on each melting temperature value.
[0316]
[0317] According to the experimental example, the operating time was measured by repeating Example 2 and Comparative Example 2 three times in the same environment. The mean and standard deviation were calculated, and the results are shown in Table 4 below.
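[0318] Table 4. Operating time for providing sequences with a predicted melting temperature of 50°C to 60°C
| Sequence length | Number of sequences generated | Time (min), Example 2 | Time (min), Comparative Example 2 |
| --- | --- | --- | --- |
| 5 | 0 | 0.0002±0.0000 | 0.0002±0.0000 |
| 6 | 0 | 0.0006±0.0000 | 0.0007±0.0000 |
| 7 | 0 | 0.0016±0.0000 | 0.0026±0.0000 |
| 8 | 321 | 0.0045±0.0001 | 0.0110±0.0002 |
| 9 | 11,974 | 0.0121±0.0008 | 0.0448±0.0001 |
| 10 | 175,046 | 0.0407±0.0002 | 0.1954±0.0004 |
| 11 | 1,467,346 | 0.1441±0.0035 | 0.7705±0.0012 |
| 12 | 8,427,288 | 0.4421±0.0174 | 3.1925±0.0093 |
| 13 | 36,745,484 | 1.2023±0.0260 | 12.9548±0.0105 |
| 14 | 132,083,035 | 3.1085±0.0788 | 54.2903±0.1658 |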
[0319]
[0320] As shown in Table 4, it was confirmed that the method of the present invention can generate sequences satisfying target conditions more quickly. In particular, a tendency was observed where the difference in operating time between Comparative Example 2 and Example 2 increased as the length of the oligonucleotide to be generated increased. This demonstrates that the method of the present invention can more effectively provide oligonucleotides containing target predicted thermodynamic property information when generating long sequences.
[0321]
[0322] 3-1. Single Adjacent Base Pair Profile-Based Oligonucleotide Sequence Generation - Constrained Random Sampling
[0323] Oligonucleotides were derived for four target sequence lengths (20, 22, 24, and 26), with a melting temperature range of 60°C to 70°C set as the thermodynamic characteristic of interest. The time taken to provide sequence information for oligonucleotides with a predicted melting temperature of 60°C to 70°C for each sequence length was compared.
[0324]
[0325] Example 3
[0326] A flowchart of the method for providing oligonucleotides according to Example 3 is shown in Fig. 7. Single adjacent base pair profiles satisfying the specified sequence length and the melting temperature range of interest were repeatedly searched using constrained random sampling. All possible sequences corresponding to each selected profile were then generated using dynamic programming to obtain final sequences annotated with melting temperature values.
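The specification does not detail the sampling procedure. As one simple reading, the Python sketch below draws random sequences of the requested length, reduces each to its adjacent base pair profile, and accepts the first profile whose predicted melting temperature falls within the window. The function names, the two-state melting temperature expression with a concentration term, and the default `total_strand_conc` of 1e-6 mol/L are all assumptions for illustration and are not taken from Example 3.

```python
import math
import random
from collections import Counter

# Table 2 values keyed by top-strand dinucleotide: (dH kcal/mol, dS cal/(K*mol))
NN_PARAMS = {
    "AA": (-7.6, -21.3), "AT": (-7.2, -20.4), "AG": (-7.8, -21.0), "AC": (-8.4, -22.4),
    "TA": (-7.2, -21.3), "TT": (-7.6, -21.3), "TG": (-8.5, -22.7), "TC": (-8.2, -22.2),
    "GA": (-8.2, -22.2), "GT": (-8.4, -22.4), "GG": (-8.0, -19.9), "GC": (-9.8, -24.4),
    "CA": (-8.5, -22.7), "CT": (-7.8, -21.0), "CG": (-10.6, -27.2), "CC": (-8.0, -19.9),
}
R = 1.987  # gas constant in cal/(K*mol), rounded


def predicted_tm(profile, total_strand_conc=1e-6):
    """Assumed two-state Tm (deg C) of a profile; the concentration is illustrative."""
    dh = sum(NN_PARAMS[p][0] * n for p, n in profile.items())
    ds = sum(NN_PARAMS[p][1] * n for p, n in profile.items())
    return dh * 1000.0 / (ds + R * math.log(total_strand_conc / 4.0)) - 273.15


def sample_profile(length, tm_window, max_tries=10000, rng=None):
    """Rejection-sample one profile of `length` whose Tm lies in tm_window.

    Returns (profile, tm), or None if no draw is accepted within max_tries.
    """
    rng = rng or random.Random()
    low, high = tm_window
    for _ in range(max_tries):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        profile = Counter(seq[i:i + 2] for i in range(length - 1))
        tm = predicted_tm(profile)
        if low <= tm <= high:
            return profile, tm
    return None


result = sample_profile(20, (60.0, 70.0), rng=random.Random(0))
print(result)
```

Because a random sequence always yields a realisable profile, every accepted profile can be expanded into at least one sequence; whether a given window is reachable depends on the assumed concentration.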
[0327]
[0328] Comparative Example 3
[0329] A flowchart of the method for providing oligonucleotides according to Comparative Example 3 is shown in FIG. 8. For each sequence length, the number of sequences to be generated was set equal to the number of sequences generated in Example 3, and final sequences annotated with melting temperature values were obtained by repeated search using Monte Carlo simulation.
[0330]
[0331] According to the experimental example, the operating time was measured by repeating the test of Example 3 and Comparative Example 3 three times in the same environment. The mean and standard deviation were calculated, and the results are shown in Table 5 below.
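[0332] Table 5. Operating time for providing sequences with a predicted melting temperature of 60°C to 70°C
| Sequence length | Number of sequences generated | Time (min), Example 3 | Time (min), Comparative Example 3 |
| --- | --- | --- | --- |
| 20 | 13,680 | 0.0004±0.0000 | 0.0124±0.0001 |
| 22 | 1,764,000 | 0.0455±0.0021 | 2.9988±0.0138 |
| 24 | 3,427,200 | 0.1033±0.0052 | 12.7302±0.0073 |
| 26 | 6,667,920 | 0.1652±0.0055 | 61.4145±0.1122 |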
[0333]
[0334] As shown in Table 5, it was confirmed that the method according to one embodiment of the present invention can generate an oligonucleotide sequence data set having target melting temperature characteristics much faster than Comparative Example 3. Furthermore, similar to Examples 1 and 2, as the length of the sequence to be generated increased, it exhibited significantly faster performance compared to the method of filtering by verifying thermodynamic characteristics after sequence generation.
[0335]
[0336] 3-2. Single Adjacent Base Pair Profile-Based Oligonucleotide Sequence Generation - Genetic Algorithm
[0337] The target sequence length was set to 20, 22, 24, or 26, and the melting temperature of interest was set to 60°C or 70°C, giving a total of 8 cases. Oligonucleotides were derived for each case, and the operating times were measured.
[0338]
[0339] Example 4
[0340] A flowchart of the oligonucleotide provision method according to Example 4 is shown in FIG. 9. A single adjacent base pair profile closest to the melting temperature of interest while satisfying the specified sequence length was iteratively searched using a genetic algorithm. The following settings were applied to the genetic algorithm:
[0341] - Population Size: 500. This represents the number of sequence combinations maintained in each generation and was set to 500 to ensure sufficient search space.
[0342] - Generations: 100. Represents the maximum number of generations repeated until the goal is reached, and was set as a sufficient number of iterations for optimizing adjacent base pair profiles.
[0343] - Crossover Rate: 0.8. The probability that crossover occurs during the process of generating offspring from parent individuals. A high value was applied to generate various combinations.
[0344] - Mutation Rate: 0.05. The probability of randomly modifying a portion of the population, set to prevent falling into a local optimum in the search space.
[0345] - Selection Method: The top 20% of individuals were used as the initial parents for the next generation, allowing for the retention of individuals with high fitness and the generation of various combinations.
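The settings listed above can be illustrated with the following minimal Python sketch of a genetic search. Several choices here are assumptions rather than details from Example 4: individuals are represented as sequences standing in for their profiles, crossover is single-point, mutation is applied per base, and the melting temperature uses a two-state expression with an illustrative strand concentration. All function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter

# Table 2 values keyed by top-strand dinucleotide: (dH kcal/mol, dS cal/(K*mol))
NN_PARAMS = {
    "AA": (-7.6, -21.3), "AT": (-7.2, -20.4), "AG": (-7.8, -21.0), "AC": (-8.4, -22.4),
    "TA": (-7.2, -21.3), "TT": (-7.6, -21.3), "TG": (-8.5, -22.7), "TC": (-8.2, -22.2),
    "GA": (-8.2, -22.2), "GT": (-8.4, -22.4), "GG": (-8.0, -19.9), "GC": (-9.8, -24.4),
    "CA": (-8.5, -22.7), "CT": (-7.8, -21.0), "CG": (-10.6, -27.2), "CC": (-8.0, -19.9),
}
R = 1.987  # gas constant in cal/(K*mol), rounded


def predict_tm(seq, total_strand_conc=1e-6):
    """Assumed two-state Tm (deg C); the concentration value is illustrative."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    dh = sum(NN_PARAMS[p][0] for p in pairs)
    ds = sum(NN_PARAMS[p][1] for p in pairs)
    return dh * 1000.0 / (ds + R * math.log(total_strand_conc / 4.0)) - 273.15


def genetic_search(length, target_tm, pop_size=500, generations=100,
                   crossover_rate=0.8, mutation_rate=0.05, elite_frac=0.2, rng=None):
    """Return (sequence, profile, tm) of the individual closest to target_tm."""
    rng = rng or random.Random()

    def error(seq):
        return abs(predict_tm(seq) - target_tm)

    population = ["".join(rng.choice("ACGT") for _ in range(length))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=error)
        parents = population[:max(2, int(pop_size * elite_frac))]  # top 20% by default
        children = list(parents)                                   # parents are retained
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, length)                     # single-point crossover
                child = a[:cut] + b[cut:]
            else:
                child = a
            child = "".join(rng.choice("ACGT") if rng.random() < mutation_rate else base
                            for base in child)                    # per-base mutation
            children.append(child)
        population = children
    best = min(population, key=error)
    profile = Counter(best[i:i + 2] for i in range(length - 1))
    return best, profile, predict_tm(best)


best, profile, tm = genetic_search(20, 60.0, pop_size=100, generations=20,
                                   rng=random.Random(1))
print(best, round(tm, 4))
```

Since the selected parents are carried over unchanged, the best error in the population cannot increase from one generation to the next under this sketch.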
[0346]
[0347] All possible sequences corresponding to the derived single adjacent base pair profile were generated using dynamic programming, finally yielding the sequences closest to the target melting temperature.
[0348]
[0349] According to the experimental example, Example 4 was run three times in the same environment to measure the operating time and the error between the set melting temperature and the predicted melting temperature of the selected adjacent base pair profile (or of the sequences derived therefrom). The mean and standard deviation were calculated, and the results are shown in Table 6 below.
[0350] Table 6. Results of the genetic algorithm-based search
| Sequence length | Target melting temperature (°C) | Predicted melting temperature (°C) | Melting temperature error (°C) | Number of sequences generated | Time (min) |
| --- | --- | --- | --- | --- | --- |
| 20 | 60 | 59.9927 | 0.0073 | 31,500 | 0.0163±0.0001 |
| 22 | 60 | 60.0038 | -0.0038 | 493,920 | 0.0279±0.0005 |
| 24 | 60 | 60.0176 | -0.0176 | 498,960 | 0.0268±0.0001 |
| 26 | 60 | 59.9747 | 0.0253 | 693,000 | 0.0339±0.0008 |
| 20 | 70 | 69.9993 | 0.0007 | 5,760 | 0.0168±0.0004 |
| 22 | 70 | 70.0001 | -0.0001 | 216,000 | 0.0234±0.0009 |
| 24 | 70 | 69.9994 | 0.0006 | 1,260,000 | 0.0545±0.0019 |
| 26 | 70 | 70.0001 | -0.0001 | 27,216,000 | 0.6849±0.0078 |
[0351]
[0352] As shown in Table 6 above, it was confirmed that adjacent base pair profiles close to thermodynamic property information of interest can be efficiently explored and a large amount of sequences can be rapidly generated from them.
[0353] As such, the adjacent base pair profile-based oligonucleotide sequence providing method of the present invention can generate sequences much more efficiently than conventional direct sequence generation methods. This is because it utilizes the characteristic that multiple sequences share the same adjacent base pair profile. This approach efficiently searches for various sequence combinations with the same profile and optimizes computational resources.
[0354] In particular, when given predicted thermodynamic property conditions, the method of the present invention can generate a target sequence more efficiently. This is made possible by first generating an adjacent base pair profile and calculating the predicted thermodynamic properties based thereon. This method allows for the rapid verification of thermodynamic property conditions during the adjacent base pair profile stage before generating the actual sequence, thereby avoiding the generation of unnecessary sequences that do not satisfy the conditions.
[0355] On the other hand, conventional direct sequence generation methods require generating each sequence individually and then verifying its thermodynamic properties, which leads to the problem of consuming excessive computational resources even for sequences that do not satisfy the conditions. In particular, there is a limitation in that processing time increases exponentially as the sequence length increases and the number of possible combinations increases.
[0356] In contrast, the method of the present invention has a distinct advantage of being able to rapidly provide oligonucleotides that satisfy target conditions by efficiently utilizing time and resources.
[0357]
[0358] In the foregoing, specific parts of the present invention have been described in detail. It is evident to those skilled in the art that these specific descriptions are merely preferred embodiments and do not limit the scope of the invention. Accordingly, the actual scope of the invention is defined by the appended claims and their equivalents.
Claims
1. A method for providing an oligonucleotide of a specific length containing predicted thermodynamic property information, comprising the following steps: (a) specifying a length range of oligonucleotides to be obtained; (b) generating one or more adjacent base pair profiles for oligonucleotide sequences satisfying the length range; (c) assigning predicted thermodynamic property information to each of some or all of the generated adjacent base pair profiles using adjacent base pair thermodynamic parameters; and (d) generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information is assigned.
2. The method of claim 1, wherein the thermodynamic property is an enthalpy change (ΔH), an entropy change (ΔS), a free energy change (ΔG), or a melting temperature (Tm).
3. The method of claim 1, wherein in step (b) the adjacent base pair profiles are generated by a search algorithm.
4. The method of claim 3, wherein the search algorithm is any one of depth-first search, breadth-first search, dynamic programming, recursive backtracking, and divide and conquer.
5. The method of claim 1, wherein in step (d) the oligonucleotide is derived from the generated profile by a search algorithm.
6. The method of claim 5, wherein the search algorithm is any one of depth-first search, breadth-first search, dynamic programming, recursive backtracking, and divide and conquer.
7. The method of claim 1, wherein in step (b), all adjacent base pair profiles that can be derived from oligonucleotide sequences satisfying the length range are generated; in step (c), predicted thermodynamic property information is assigned to all of the generated adjacent base pair profiles; and in step (d), oligonucleotides corresponding to all adjacent base pair profiles to which predicted thermodynamic property information is assigned are generated, whereby the sequences and predicted thermodynamic property information of all oligonucleotides satisfying the specified length range are provided.
8. The method of claim 1, wherein a range of predicted thermodynamic properties of the oligonucleotide to be obtained is further specified in step (a) or step (c).
9. The method of claim 8, wherein in step (b), all adjacent base pair profiles that can be derived from oligonucleotide sequences satisfying the length range are generated; and in step (c), adjacent base pair profiles satisfying the specified range of predicted thermodynamic properties are filtered.
10. The method of claim 9, wherein in step (d), oligonucleotides are generated only for adjacent base pair profiles satisfying a length arbitrarily selected within the specified length range.
11. The method of claim 8, further comprising selecting a single adjacent base pair profile by calculating the predicted thermodynamic properties of some adjacent base pair profiles through an optimization algorithm based on the specified length range and the specified range of predicted thermodynamic properties.
12. The method of claim 11, wherein in step (b), one or more adjacent base pair profiles that can be derived from oligonucleotide sequences satisfying the length range are generated; and in step (c), predicted thermodynamic property information is assigned to the adjacent base pair profiles, and adjacent base pair profiles satisfying the specified range of predicted thermodynamic properties are then filtered.
13. The method of claim 11, wherein the single adjacent base pair profile satisfies a length arbitrarily selected within the specified length range.
14. The method of claim 11, wherein the optimization algorithm is any one of a greedy algorithm, a branch-and-bound method, Monte Carlo simulation, a genetic algorithm, Bayesian optimization, and restricted random sampling.
15. The method of claim 1, wherein in step (c), the adjacent base pair thermodynamic parameters include nearest-neighbor (NN) thermodynamic parameters or next-nearest-neighbor (NNN) thermodynamic parameters.
16. The method of claim 1, wherein in step (c), the adjacent base pair thermodynamic parameters include enthalpy change (ΔH) and entropy change (ΔS) parameters.
17. The method of claim 1, wherein in step (c), the predicted thermodynamic properties are obtained using any one selected from the group consisting of an enthalpy change (ΔH) for the adjacent base pair profile, an entropy change (ΔS) for the adjacent base pair profile, a parameter for correcting the entropy change, a parameter for correcting the Tm contribution due to the length of the oligonucleotide, and combinations thereof.
18. The method of claim 17, wherein the thermodynamic property is Tm, calculated using the following Equation I: <Equation I> wherein Tm is the melting temperature of the oligonucleotide; ΔH° is the sum of enthalpy changes; ΔS° is the sum of entropy changes; and m and n are constants.
19. The method of claim 18, wherein m is 1000 and n is 273.
20. The method of claim 1, wherein the oligonucleotide is an artificial nucleotide sequence used in a method for detecting a target nucleic acid sequence and is a nucleotide sequence that does not hybridize to the target nucleic acid.
21. The method of claim 20, wherein the oligonucleotide comprises: a tagging site of a tagging oligonucleotide, the tagging site not hybridizing to the target nucleic acid; or a templating site that hybridizes to neither the target nucleic acid nor the tagging site, the templating site being included in an oligonucleotide comprising a capturing site to which the tagging site hybridizes.
22. A method for providing a tagging oligonucleotide used for detecting a target nucleic acid molecule in a sample, comprising the following steps: (a) specifying a length range of oligonucleotides to be obtained; (b) generating one or more adjacent base pair profiles for oligonucleotide sequences satisfying the length range; (c) assigning predicted thermodynamic property information to each of some or all of the generated adjacent base pair profiles using adjacent base pair thermodynamic parameters; (d) generating oligonucleotides corresponding to adjacent base pair profiles to which predicted thermodynamic property information is assigned; and (e) i) selecting a sequence containing any one of the generated oligonucleotides as a tagging site, ii) selecting a sequence containing an oligonucleotide that hybridizes to the target nucleic acid sequence as a targeting site, and providing a tagging oligonucleotide containing the tagging site and the targeting site.
23. A computer-readable recording medium storing instructions that cause a processor to execute a method for providing an oligonucleotide of a specific length containing predicted thermodynamic property information, the method comprising the following steps: (a) specifying a length range of oligonucleotides to be obtained; (b) generating one or more adjacent base pair profiles for oligonucleotide sequences satisfying the length range; (c) assigning predicted thermodynamic property information to each of some or all of the generated adjacent base pair profiles using adjacent base pair thermodynamic parameters; and (d) generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information is assigned.
24. A computer program stored on a computer-readable recording medium, the program causing a processor to execute a method for providing an oligonucleotide of a specific length containing predicted thermodynamic property information, the method comprising the following steps: (a) specifying a length range of oligonucleotides to be obtained; (b) generating one or more adjacent base pair profiles for oligonucleotide sequences satisfying the length range; (c) assigning predicted thermodynamic property information to each of some or all of the generated adjacent base pair profiles using adjacent base pair thermodynamic parameters; and (d) generating an oligonucleotide corresponding to an adjacent base pair profile to which predicted thermodynamic property information is assigned.