Linking sequence reads using paired code tags

A technology for linking sequence reads using paired code tags, applied in the fields of biology and genomics. It addresses the problem of the enormous amount of information generated from a single sequencing run and achieves the effect of reducing the number of target nucleic acid molecules.

Inactive Publication Date: 2015-09-17
ILLUMINA INC
1 Cites, 70 Cited by

AI Technical Summary

Benefits of technology

This approach enables the assembly of sequence representations from short sequencing reads without a reference genome, improving the assembly of repetitive sequences and maintaining order information, thus enhancing the efficiency of nucleic acid sequencing.

Problems solved by technology

The information generated from a single sequencing run can be enormous.


Examples

Example 1

Whole Genome Amplification Using Transposon Sequences

[0193] This example illustrates a method for uniform amplification of genomic DNA with random insertion of specific primer sites. Transposon sequences are prepared, each comprising a first transposase recognition site and a second transposase recognition site with a sequencing adaptor disposed therebetween, in which the sequencing adaptor comprises a first primer site and a second primer site. The transposon sequences are contacted with genomic DNA in the presence of MuA transposase under conditions sufficient for the transposon sequences to integrate into the genomic DNA. The genomic DNA is amplified using primers that hybridize to the first primer site or the second primer site.

Example 2

Landmark Sequencing Methods Using Genomes with Increased Complexity

[0194] This example illustrates an embodiment for providing additional markers in a genome. Additional markers can be useful in genomes that include repetitive sequences during subsequent assembly steps to generate a sequence representation of the genome. Transposon sequences are prepared, each comprising a different barcode. The transposon sequences are integrated into genomic DNA in a transposition reaction. The genomic DNA comprising the integrated transposon is amplified by whole genome amplification. A sequencing library is prepared from the amplified template nucleic acids. Sequencing data is obtained from the sequencing library. Sequencing reads can include representations of one or more nucleic acids with the same barcode on each nucleic acid. Such nucleic acids are identified as containing sequences that overlap in a sequence representation of the genomic DNA. The sequencing reads can be assembled by identify...
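The barcode-matching step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the patent's implementation: the read tuples `(read_id, barcode, sequence)`, the function names `group_reads_by_barcode` and `linked_pairs`, and the toy barcodes are all hypothetical, and the barcode is assumed to have been extracted from each read already.

```python
from collections import defaultdict
from itertools import combinations


def group_reads_by_barcode(reads):
    """Map each barcode to the list of read IDs that carry it.

    `reads` is an iterable of (read_id, barcode, sequence) tuples;
    this record layout is a hypothetical choice for illustration.
    """
    groups = defaultdict(list)
    for read_id, barcode, _sequence in reads:
        groups[barcode].append(read_id)
    return dict(groups)


def linked_pairs(groups):
    """Return pairs of read IDs that share a barcode.

    Reads sharing a barcode are treated as candidates for overlapping
    positions in the sequence representation of the genomic DNA.
    """
    pairs = set()
    for read_ids in groups.values():
        for a, b in combinations(sorted(read_ids), 2):
            pairs.add((a, b))
    return pairs


# Toy data: r1 and r2 carry the same (made-up) barcode, r3 does not.
reads = [
    ("r1", "ACGT", "TTGACCA"),
    ("r2", "ACGT", "CCAGGTA"),
    ("r3", "GGCA", "GTAACGT"),
]
groups = group_reads_by_barcode(reads)
pairs = linked_pairs(groups)
```

Grouping first and pairing second keeps the lookup linear in the number of reads; only reads within the same barcode group are compared with each other.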

Example 3

Predicted Average Coverage Using Linked Read Sequencing Strategy

[0195] Useable fragment lengths are modeled as a truncated exponential distribution. The mean useable fragment length is obtained by setting k = b/d, where d is the mean of the non-truncated exponential (the total fragment distribution) and b is the truncation value (180 for 100-nucleotide paired-end reads or 280 for 150-nucleotide paired-end reads), and then calculating the mean of the truncated exponential as

$$E(f) = \frac{d\,\bigl(1-(k+1)\,e^{-k}\bigr)}{1-e^{-k}}$$
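For readers who want to check the expression for the truncated mean, a short derivation follows. It assumes the truncation is an upper truncation, so that useable fragment lengths x lie in [0, b] and follow an exponential density with mean d renormalized over that interval; the density symbol g(x) is introduced here only for illustration.

```latex
\begin{align*}
g(x) &= \frac{\tfrac{1}{d}\,e^{-x/d}}{1-e^{-b/d}}, \qquad 0 \le x \le b, \qquad k = \frac{b}{d} \\
E(f) &= \int_0^b x\,g(x)\,dx
      = \frac{1}{1-e^{-k}} \int_0^b \frac{x}{d}\,e^{-x/d}\,dx \\
     &= \frac{d}{1-e^{-k}} \int_0^k u\,e^{-u}\,du \qquad (u = x/d) \\
     &= \frac{d\,\bigl(1-(k+1)\,e^{-k}\bigr)}{1-e^{-k}}
\end{align*}
```

As a sanity check, the result tends to d when b is much larger than d (almost no truncation) and to b/2 when b is much smaller than d (nearly uniform lengths on [0, b]).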

[0196] The proportion of useable reads is p = C(b) × (1 − D(0, T)), where C is the exponential cumulative distribution function; T is the average number of times each fragment is observed, that is, (number of clusters) / complexity; complexity is the genome size times the number of genome copies diluted to, divided by d; and D is the Poisson cumulative distribution function.

[0197] The expected length of a linked read is then (E(f) − 9) × 1 / (1 − p) + 9, where p is the proportion of useable reads: 9 is subtracted from each re...
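The three quantities in this example can be combined into a small calculation. The sketch below is a reading of the formulas as stated, not the authors' code. It assumes that C(b) is the exponential CDF with mean d evaluated at b, and that D(0, T) is the Poisson CDF at 0 with mean T (the probability of never observing a fragment). All function names, parameter names, and input values are hypothetical. The constant 9 is kept as a named parameter `offset` because the source text explaining it is truncated.

```python
import math


def mean_truncated_exponential(d, b):
    """Mean useable fragment length E(f) for truncation value b and mean d."""
    k = b / d
    return d * (1 - (k + 1) * math.exp(-k)) / (1 - math.exp(-k))


def proportion_useable(d, b, num_clusters, genome_size, genome_copies):
    """Proportion of useable reads, p = C(b) * (1 - D(0, T))."""
    complexity = genome_size * genome_copies / d
    t = num_clusters / complexity      # average observations per fragment
    c_b = 1 - math.exp(-b / d)         # exponential CDF at b (assumed form of C)
    d_0 = math.exp(-t)                 # Poisson CDF at 0 with mean t (assumed form of D)
    return c_b * (1 - d_0)


def expected_linked_length(d, b, p, offset=9):
    """Expected linked read length, (E(f) - offset) / (1 - p) + offset."""
    return (mean_truncated_exponential(d, b) - offset) / (1 - p) + offset


# Hypothetical inputs chosen only to exercise the functions.
d, b = 500.0, 180.0
p = proportion_useable(d, b, num_clusters=3e8,
                       genome_size=3e9, genome_copies=10)
length = expected_linked_length(d, b, p)
```

The factor 1 / (1 − p) is the mean number of fragments in a chain when each fragment links to a further useable one with probability p, which is why the linked length grows as p approaches 1.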

Abstract

Artificial transposon sequences having code tags and target nucleic acids containing such sequences. Methods for making artificial transposons and for using their properties to analyze target nucleic acids.

Description

RELATED APPLICATIONS

[0001] This application is a continuation of U.S. application Ser. No. 13/080,345 filed Apr. 5, 2011, which is a continuation-in-part of U.S. application Ser. No. 13/025,022, filed Feb. 10, 2011 entitled “LINKING SEQUENCE READS USING PAIRED CODE TAGS,” the contents of which are incorporated herein by reference in its entirety.

REFERENCE TO SEQUENCE LISTING

[0002] The present application is being filed along with a Sequence Listing in electronic format. The Sequence Listing associated with this application is provided as a file entitled ILLINC193P1SEQLIST.TXT, created Mar. 30, 2011, which is approximately 2 Kb in size, and was submitted electronically via EFS-Web on Apr. 5, 2011, concurrent with the filing of U.S. patent application Ser. No. 13/080,345. The information in the electronic format of the Sequence Listing is incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

[0003] Embodiments of the present invention relate to the fields of biology and ...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): C12Q1/68
CPC: C12Q1/6874, C12Q1/6869, C12Q2521/319, C12Q2521/507, C12Q2525/155, C12Q2525/161, C12Q2525/185, C12Q2525/191, C12Q2525/197, C12Q2525/313, C12Q2535/122, C12Q2565/514, C12N15/1082, C12Q2521/301, C12Q2521/327, C12Q2525/186, C12Q2563/179
Inventors: STEEMERS, FRANK J.; GUNDERSON, KEVIN L.; ROYCE, THOMAS; PIGNATELLI, NATASHA; GORYSHIN, IGOR; CARUCCIO, NICHOLAS
Owner: ILLUMINA INC