System for identifying coding regions and non-coding regions in DNA (deoxyribonucleic acid) gene sequences

The technology concerns DNA sequences and coding regions, and applies to systems for identifying coding and non-coding regions in DNA gene sequences. It addresses the cost of floating-point computation and excessive computation time, and achieves a small amount of computation with high precision.

Status: Inactive. Publication Date: 2015-03-11
NANJING INST OF TECH
5 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

Discrete Fourier transform (DFT) operations are all floating-point operations, which consume a lot of computing time.


Examples

Embodiment Construction

[0037] Preferred embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0038] Building on Ramanujan sums and Ramanujan coefficients, the embodiment proposes the concept of a discrete Ramanujan transform spectrum. Using the Voss numerical representation, the character DNA sequence is mapped to a numerical DNA sequence, from which the discrete Ramanujan spectrum of the numerical DNA sequence is obtained.
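The two ingredients named above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function names `voss_indicators`, `mobius`, and `ramanujan_sum` are hypothetical, the Voss representation is taken in its usual form (one binary indicator sequence per base), and the Ramanujan sum is computed with the standard identity c_q(n) = sum over d dividing gcd(n, q) of d * mu(q / d), which uses integer arithmetic only.

```python
from math import gcd


def voss_indicators(seq):
    """Map a DNA string to four binary indicator sequences (Voss representation)."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n) as an exact integer."""
    g = gcd(n, q)
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)


if __name__ == "__main__":
    print(voss_indicators("ATGC")["A"])              # [1, 0, 0, 0]
    print([ramanujan_sum(3, n) for n in range(6)])   # [2, -1, -1, 2, -1, -1]
```

Because c_3(n) takes only the values 2 and -1, projecting an indicator sequence onto it needs no multiplications by irrational numbers, which is the property the text relies on.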

[0039] The discrete Fourier transform spectrum of protein-coding sequences shows a pronounced 3-periodicity, which is the basis for using the discrete Fourier transform in DNA sequence analysis. That approach uses the signal-to-noise ratio of the sequence at frequency N/3 as its measure, where N is the length of the sequence. The examples indicate that 3-periodicity can likewise be identified by a prominent peak at 3 in the discrete Ramanujan spectrum of protein-coding sequences. As a numerical measure, the signal-...
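For comparison, the DFT-based measure described above can be sketched as follows. The excerpt does not give the exact signal-to-noise formula, so this sketch assumes a common form: the total power at index N/3, summed over the four Voss indicator sequences, divided by the mean power over indices 1 to N-1. The names `dft_power` and `snr_at_one_third` are hypothetical.

```python
import cmath


def dft_power(seq, k):
    """Total DFT power at index k, summed over the four Voss indicator sequences."""
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        coeff = sum(cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, s in enumerate(seq) if s == base)
        total += abs(coeff) ** 2
    return total


def snr_at_one_third(seq):
    """Power at N/3 divided by the mean power over k = 1..N-1 (assumed definition)."""
    n = len(seq)
    if n % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    peak = dft_power(seq, n // 3)
    mean_power = sum(dft_power(seq, k) for k in range(1, n)) / (n - 1)
    return peak / mean_power


if __name__ == "__main__":
    print(snr_at_one_third("ATG" * 20))   # strongly 3-periodic toy sequence
    print(snr_at_one_third("ACGT" * 15))  # 4-periodic toy sequence, no peak at N/3
```

Every term here goes through `cmath.exp`, so the whole computation is floating-point; this is the cost that the Ramanujan-based spectrum is meant to avoid.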

Abstract

The invention provides a system for identifying coding regions and non-coding regions in a DNA (deoxyribonucleic acid) gene sequence. The DRT (discrete Ramanujan transform) spectrum of a DNA sequence is calculated, and the sequence is judged to be an exon or an intron by comparing the spectral value at k=3 with the values elsewhere: if the value of the DRT spectrum at k=3 is higher than the values elsewhere, the sequence is an exon; otherwise, it is an intron. Protein-coding and non-coding regions are distinguished through the discrete Ramanujan spectrum of the numerical sequence and the signal-to-noise ratio. Test results show the reliability of the disclosed method. Compared with the Fourier transform, the discrete Ramanujan spectrum requires less computation and gives higher accuracy.
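The decision rule in the abstract can be illustrated with a short sketch. The excerpt does not reproduce the DRT spectrum formula, so the definition below is an assumption: for each k, each Voss indicator sequence is projected onto the Ramanujan sum c_k, normalized by N times Euler's totient of k, and the squared projections are summed over the four bases. Treating "other places" as k = 2..max_k (skipping the constant k = 1 term) is also an assumption, and all function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def ramanujan_sum(q, r):
    """c_q(r) for r = n mod q, from the identity: sum of c_d(n) over d | q is q if q | n, else 0."""
    total = q if r % q == 0 else 0
    for d in range(1, q):
        if q % d == 0:
            total -= ramanujan_sum(d, r % d)
    return total


def drt_spectrum(seq, max_k):
    """Assumed spectrum: squared normalized projections onto c_k, summed over the four bases."""
    n = len(seq)
    spectrum = {}
    for k in range(1, max_k + 1):
        phi_k = ramanujan_sum(k, 0)  # c_k(0) equals Euler's totient of k
        value = Fraction(0)
        for base in "ACGT":
            proj = sum(ramanujan_sum(k, i % k)
                       for i, s in enumerate(seq) if s == base)
            value += Fraction(proj, n * phi_k) ** 2
        spectrum[k] = value
    return spectrum


def looks_like_exon(seq, max_k=10):
    """True if the k = 3 value exceeds every other value for k = 2..max_k (assumed rule)."""
    spec = drt_spectrum(seq, max_k)
    return all(spec[3] > v for k, v in spec.items() if k not in (1, 3))


if __name__ == "__main__":
    print(looks_like_exon("ATG" * 20))   # 3-periodic toy sequence
    print(looks_like_exon("ACGT" * 15))  # 4-periodic toy sequence
```

The sketch uses `Fraction` so that every intermediate value stays exact; the toy inputs are synthetic periodic strings, not real gene data, and say nothing about accuracy on actual exons and introns.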

Description

Technical field

[0001] The present invention relates to a system for identifying coding regions and non-coding regions in DNA gene sequences.

Background

[0002] With the advancement of science and technology, modern biotechnology has flourished. More and more mathematical methods and signal processing techniques are being applied to research in the life sciences, forming the frontier discipline of bioinformatics.

[0003] The discrete Fourier transform (DFT) is currently the most commonly used method for identifying coding regions and non-coding regions in DNA gene sequences. Because this method uses floating-point operations and the calculation accuracy of computers is limited, it introduces calculation errors. Floating-point operations also consume a lot of computing time.

[0004] First, modern computers store real numbers with a finite number of bits, which leads to rounding errors. For the discrete Fourier transform (DFT), the basis functions It is approximate...
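The rounding issue raised in paragraph [0004] can be seen in a small experiment. This is an illustrative sketch only: it sums the three complex cube roots of unity in floating point, where the mathematically exact result is 0, and compares that with the same period sum taken over the integer-valued Ramanujan sum c_3(n). The helper name `c3` is hypothetical.

```python
import cmath


def c3(n):
    """Ramanujan sum c_3(n): 2 when 3 divides n, otherwise -1 (exact integers)."""
    return 2 if n % 3 == 0 else -1


# Sum of the DFT basis values exp(2*pi*i*k/3) over one period; exact value is 0.
float_total = sum(cmath.exp(2j * cmath.pi * k / 3) for k in range(3))

# Sum of c_3(n) over one period; computed with integers only.
int_total = sum(c3(n) for n in range(3))

if __name__ == "__main__":
    print(abs(float_total))  # typically a tiny residual rather than exactly 0
    print(int_total)         # exactly 0
```

The floating-point residual is small, but it accumulates over long sequences, whereas the integer sum carries no rounding error at all.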


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/18
Inventor: 滑伟
Owner: NANJING INST OF TECH