Method for gene identification signature (GIS) analysis

A gene identification signature (GIS) technology, applied in the field of gene expression, addresses three problems: the prohibitive expense of tagging every transcript in a transcriptome, the inability to generate longer tags to improve specificity, and the impracticality of complete sequencing analysis of all different transcriptomes. It achieves easy recognition of tag pairs, increased tag specificity, and enhanced sequencing efficiency.

Inactive Publication Date: 2006-04-20
AGENCY FOR SCI TECH & RES
14 Cites, 18 Cited by

AI Technical Summary

Benefits of technology

[0012] The present invention solves the problems mentioned above by providing two tags (a ditag) per nucleic acid molecule, thereby increasing the specificity with which the tags represent a nucleic acid molecule (for example, a gene). The two tags are extracted from the 5′ and 3′ ends of the same nucleic acid molecule, so ditags reflect the structure of the nucleic acid molecule more informatively. Critically, the invention provides a method to link the 5′ and 3′ tags of the same nucleic acid molecule into a single ditag unit. Therefore, the pairs of 5′ and 3′ tags that represent the nucleic acid molecule can be easily recognized by simple sequencing analysis. The invention can be used for identifying new genes, for measuring transcript abundance in transcriptomes, and for annotating genome sequences, while at the same time enhancing sequencing efficiency.
[0051] Such ditags can directly guide the process of recovering the full-length nucleic acid molecule corresponding to the newly identified genes.

Problems solved by technology

However, due to the complexity and immense volume of transcripts expressed in the various developmental stages of an organism's life cycle, complete sequencing analysis of all different transcriptomes still remains unrealistic.
Though EST is effective in identifying genes, it is prohibitively expensive to tag every transcript in a transcriptome.
Limited by the availability of type II restriction enzymes that can cut more than 21 bp away from their recognition site, the SAGE method currently cannot generate longer tags to improve specificity.
Further, SAGE and MPSS methods only produce a single signature per transcript in the middle of the gene.
In view of the “internal” nature of the tag in a transcript, these methods provide only limited tag information.
Therefore, despite their usefulness in sequencing efficiency, the utility of methods such as SAGE or MPSS is severely undermined by their lack of specificity and consequent inconclusiveness.

Examples

example 1

The Method

[0170] The experimental procedure of GIS ditag analysis has been carried out according to the following modules of cDNA library construction and analysis:
[0171] (1) The full-length cDNA library which introduces the MmeI sites flanking both ends of each cDNA insert;
[0172] (2) The GIS ditag library in which each clone contains a 5′ 18 bp signature and a 3′ 18 bp signature of a transcriptional unit;
[0173] (3) The GIS library for clones of concatenated GIS ditags;
[0174] (4) GIS sequencing analysis.
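The ditag described in module (2) can be modelled in silico as a minimal sketch. The function names `make_ditag` and `split_ditag` are hypothetical, and the code assumes an idealized full-length cDNA string; in the actual protocol the tags are produced enzymatically, not by string slicing.

```python
def make_ditag(cdna: str, tag_len: int = 18) -> str:
    """Join the first and last `tag_len` bases of a cDNA into one ditag.

    Idealized model only: tag_len=18 follows the 18 bp signatures
    mentioned in the text; everything else is an assumption.
    """
    cdna = cdna.upper()
    if len(cdna) < 2 * tag_len:
        raise ValueError("cDNA is shorter than two tags")
    return cdna[:tag_len] + cdna[-tag_len:]


def split_ditag(ditag: str, tag_len: int = 18) -> tuple[str, str]:
    """Split a ditag back into its 5' tag and 3' tag."""
    if len(ditag) != 2 * tag_len:
        raise ValueError("ditag length does not match 2 * tag_len")
    return ditag[:tag_len], ditag[tag_len:]
```

Under this model every ditag is 36 bases long regardless of the length of the transcript it came from, which is what allows ditags from many transcripts to be concatenated and read in a single sequencing pass.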

1. GIS Full-Length cDNA Library with Addition of MmeI Sites for Each cDNA Insert

[0175] The outline of the procedure of this section was as follows: starting from high quality mRNA, the first-strand cDNA was synthesized with a GsuI-oligo dT primer (SEQ ID NO:1).

[0176] The first-strand cDNA/RNA hybrids were subjected to a full-length enrichment procedure by the biotinylation-based cap-trapper approach. Any cap-trapper approach known in the art can be used, for example Carninci et al., 1...

example 2

2. GIS Ditag Library

[0224] The cDNA clones made from steps 1-1 to 1-8 contained an MmeI site (TCCGAC) at the 5′ side and another MmeI site (TCCAAC) in reverse orientation at the 3′ end. Note that these two MmeI recognition sites are two isoforms that can be recognized by MmeI (TCCRAC 20/18, where R = A/G). The sequence difference here will be useful later for directional indication. The MmeI restriction enzyme will cleave these clones 20 bp into the cDNA fragments from their 5′ and 3′ ends. Consequently, despite the variable sizes of the digested cDNA, the vector plus the 20 bp cDNA signature tags on each end of all clones will be of a constant size that can be easily recognized upon agarose gel electrophoresis, and can be easily purified from the unwanted cDNA fragments.

[0225] The gel-purified vector plus tags can then be self-ligated to give a tagged plasmid containing the 5′ and 3′ GIS signature tags.
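The site geometry described above can be illustrated with a short sketch. It assumes, as a simplification, that the cDNA begins immediately after each recognition site and that the construct is given as a single top-strand string; the function names are hypothetical, and only the site sequences (TCCGAC, TCCAAC, TCCRAC) and the 20 bp cut distance come from the text.

```python
import re

# MmeI recognition site as quoted in the text: TCCRAC, where R = A or G.
MMEI_SITE = re.compile(r"TCC[AG]AC")
TAG_LEN = 20  # the text states that MmeI cleaves 20 bp into the cDNA

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tags_downstream_of_sites(seq: str) -> list[tuple[str, str]]:
    """List (site, next 20 bases) for every MmeI site on the given strand."""
    seq = seq.upper()
    hits = []
    for match in MMEI_SITE.finditer(seq):
        tag = seq[match.end():match.end() + TAG_LEN]
        if len(tag) == TAG_LEN:
            hits.append((match.group(), tag))
    return hits


def signature_tags(construct: str) -> tuple[str, str]:
    """Return the (5' tag, 3' tag) of the insert, both in cDNA orientation.

    The TCCGAC isoform marks the 5' side; the TCCAAC isoform sits in
    reverse orientation at the 3' end, so it is found on the reverse
    complement. Raises KeyError if either isoform is absent.
    """
    forward = dict(tags_downstream_of_sites(construct))
    reverse = dict(tags_downstream_of_sites(reverse_complement(construct)))
    five_prime = forward["TCCGAC"]
    three_prime = reverse_complement(reverse["TCCAAC"])
    return five_prime, three_prime
```

Because the two isoforms differ by one base, the sketch can tell the 5′ tag from the 3′ tag by which site it follows, which mirrors the directional indication mentioned in the text.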

2-1. Plasmid Preparation

[0226] The GIS full-length cDNA library was amplified once by ...

example 3

[0294] The GIS analysis method according to any embodiment of the invention is a complete gene discovery platform. It combines full-length cDNA library construction, cDNA tag sequencing, genome mapping and annotation into one operation from the same starting materials. For example, to study the genes expressed in human stem cells, we start with the stem cell mRNA, construct a stem cell GIS full-length cDNA library, and then the GIS library. We will only need to sequence 50,000 clones of the GIS library to reveal over a million transcripts. Such deep sampling will allow us to capture nearly all unique transcripts expressed in the human stem cell transcriptome. Each of the GIS ditags can be specifically mapped to the genome and therefore define the structural regions of the corresponding genes on the chromosomes. Most of the GIS ditags map to known genes on chromosomes and the counts of the GIS ditags provide the measurement of expression activity. Some of the GIS ditags may map to de...
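The sequencing figures quoted in this example can be checked with a small sketch. The ratio of 20 ditags per clone is derived here from the two numbers in the text (50,000 clones, over a million transcripts) and is not stated in the source; the function names and the toy ditag strings are hypothetical.

```python
from collections import Counter


def implied_ditags_per_clone(transcripts: int, clones: int) -> float:
    """Average number of ditags each sequenced clone must carry."""
    return transcripts / clones


def ditag_expression_counts(ditags: list[str]) -> Counter:
    """Count how often each ditag was observed.

    The text treats these counts as a measure of expression activity.
    """
    return Counter(ditags)


# 1,000,000 transcripts from 50,000 clones implies 20 ditags per clone.
ratio = implied_ditags_per_clone(1_000_000, 50_000)

# Toy observations: the first ditag was seen twice, the second once.
counts = ditag_expression_counts(["AAGG", "AAGG", "CCTT"])
```

Since the text says "over a million", 20 ditags per clone should be read as a lower bound on the average under these assumptions.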

Abstract

An isolated oligonucleotide comprising at least one ditag, wherein the ditag comprises two joined first and second sequence tags, wherein the first tag comprises the 5′-terminus sequence and the second tag comprises the 3′-terminus sequence of a nucleic acid molecule or a fragment thereof. The ditag analysis is useful for gene discovery and genome mapping.

Description

[0001] This application is a divisional of U.S. Ser. No. 10/664,234, filed Sep. 17, 2003.

FIELD OF THE INVENTION

[0002] The present invention relates generally to the field of gene and transcript expression and specifically to a method for the serial analysis of a large number of transcripts by identification of a gene signature (GIS) corresponding to defined regions within a transcript.

BACKGROUND OF THE INVENTION

[0003] One of the most important goals of the human genome project is to provide complete lists of genes for the genomes of human and model organisms. Complete genome annotation of genes relies on comprehensive transcriptome analysis by experimental and computational approaches. Ab initio predictions of genes must be validated by experimental data. An ideal solution is to clone all full-length transcripts and completely sequence them. This approach has gained recognition recently (Strausberg, R. L., et al., 1999, Science, 286: 455-457) and progress has been made (Jongeneel...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): C40B40/08, C12Q1/68, C07H21/02, C12N15/09, C12N15/10
CPC: C12N15/1065, C12N15/1093, C12N15/1096, C12Q1/6809, C12Q1/6855, C12Q2525/191, C12Q2525/131, C12Q2521/313, C07H21/02
Inventors: RUAN, YIJUN; NG, PATRICK; WEI, CHIALIN
Owner: AGENCY FOR SCI TECH & RES