Typing and assembling discontinuous genomic elements
A genome sequencing technology for typing and assembling discontinuous genomic elements and for diploid sequencing. It addresses the lack of haplotype information and the difficulty of typing discontinuous genomic elements.
Examples
Embodiment 1
[0107] In this example, it was examined whether whole-exome haplotype phasing could be achieved using data sets that simulate genomic proximity ligation assays such as TCC, Hi-C, or in situ Hi-C combined with exome capture. More specifically, to show that whole-exome haplotype phasing is feasible, data were obtained from Hi-C proximity ligation experiments performed on GM12878 cells and restricted to chromosome 1. Only read pairs in which at least one of the two reads overlapped an exonic region were then retained. This dataset therefore represents a simulated whole-exome proximity ligation dataset.
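The retention rule described above (keep a read pair if at least one of its two reads falls in an exon) can be sketched as follows. This is a minimal illustration, not the pipeline used in the example: the interval representation, the function names `overlaps_any` and `keep_exonic_pairs`, and the toy coordinates are all assumptions made for this sketch.

```python
from typing import List, Tuple

Interval = Tuple[int, int]            # half-open [start, end) on one chromosome (assumed layout)
ReadPair = Tuple[Interval, Interval]  # the two mates of a proximity ligation read pair


def overlaps_any(read: Interval, exons: List[Interval]) -> bool:
    """Return True if the read overlaps at least one exon interval."""
    start, end = read
    return any(start < ex_end and ex_start < end for ex_start, ex_end in exons)


def keep_exonic_pairs(pairs: List[ReadPair], exons: List[Interval]) -> List[ReadPair]:
    """Keep read pairs in which at least one mate overlaps an exon."""
    return [p for p in pairs if overlaps_any(p[0], exons) or overlaps_any(p[1], exons)]


# Toy data (hypothetical coordinates, for illustration only).
exons = [(100, 200), (5000, 5200)]
pairs = [
    ((150, 250), (90000, 90100)),  # first mate overlaps an exon: kept
    ((300, 400), (5100, 5200)),    # second mate overlaps an exon: kept
    ((300, 400), (700, 800)),      # neither mate overlaps an exon: dropped
]
kept = keep_exonic_pairs(pairs, exons)
print(len(kept), "of", len(pairs), "read pairs retained")
```

The linear scan over exons keeps the sketch short; a real data set would more plausibly use an interval index or an existing interval-intersection tool.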
[0108] The simulated data were then processed with the algorithm described above to examine its ability to phase exonic SNVs into a single haplotype structure. To this end, two metrics were defined: completeness, the length of the haplotype block relative to the length of the chromosome, and resolution, the fraction of exonic variants on the chromosome that are phased. It was found that complete haplotypes cou...
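As a rough sketch of the two metrics, completeness is read here as the span of the haplotype block divided by the chromosome length, and resolution as the fraction of exonic variants that end up phased. The second reading is an assumption, since the source wording for resolution is unclear. The function names and the toy positions are hypothetical.

```python
from typing import Iterable


def completeness(block_start: int, block_end: int, chrom_length: int) -> float:
    """Span of the haplotype block as a fraction of the chromosome length."""
    return (block_end - block_start) / chrom_length


def resolution(phased: Iterable[int], all_variants: Iterable[int]) -> float:
    """Fraction of exonic variant positions that were phased (assumed definition)."""
    all_set = set(all_variants)
    if not all_set:
        return 0.0
    return len(set(phased) & all_set) / len(all_set)


# Toy chromosome of length 1000 with five exonic variants, three of them phased.
chrom_length = 1000
variants = [10, 200, 450, 700, 990]
phased = [10, 450, 990]

comp = completeness(min(phased), max(phased), chrom_length)
res = resolution(phased, variants)
print(f"completeness={comp:.2f} resolution={res:.2f}")
```

In this toy case the block spans nearly the whole chromosome while only three of five variants are phased, which is why the two metrics are reported separately.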
Embodiment 2
[0111] In this example, it was examined whether whole-exome haplotype phasing could be achieved using real datasets obtained from exome-captured proximity ligation libraries.
[0112] More specifically, exome capture was performed on proximity ligation libraries from GM12878 cells, followed by sequencing using the methods described above. The exome capture protocol was optimized in-house for fragment length, blocker primers, and oligonucleotide probe binding. As shown in Figure 4, three whole-exome proximity ligation libraries were generated. Two of these libraries were generated using a single enzyme (NcoI or XbaI), while the third was generated using a mixture of 6-base cutting enzymes (HindIII, NcoI, XbaI and BamHI, labeled "multienzyme"). Sequencing generated approximately 50-70 million read pairs per library, and the captured libraries showed a clear enrichment of exon sequences (Fig. 4b).
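To illustrate why a mixture of enzymes places more cut sites than a single enzyme, the following sketch counts recognition sites in a toy sequence. The recognition sequences are the standard ones for these four enzymes; the helper `site_positions` and the toy sequence are assumptions for illustration and say nothing about the actual library protocol.

```python
from typing import Iterable, List

# Standard recognition sequences; all four are palindromic, so one strand suffices.
SITES = {
    "HindIII": "AAGCTT",
    "NcoI": "CCATGG",
    "XbaI": "TCTAGA",
    "BamHI": "GGATCC",
}


def site_positions(seq: str, enzymes: Iterable[str]) -> List[int]:
    """Return sorted start positions of all recognition sites for the given enzymes."""
    positions = set()
    for name in enzymes:
        site = SITES[name]
        i = seq.find(site)
        while i != -1:
            positions.add(i)
            i = seq.find(site, i + 1)
    return sorted(positions)


# Toy sequence assembled from spacers and known sites (hypothetical, not genomic).
toy = "TT" + "CCATGG" + "AA" + "AAGCTT" + "CC" + "TCTAGA" + "GG" + "GGATCC" + "TT" + "CCATGG" + "AA"

single = site_positions(toy, ["NcoI"])
multi = site_positions(toy, ["HindIII", "NcoI", "XbaI", "BamHI"])
print("NcoI sites:", single)
print("multienzyme sites:", multi)
```

More cut sites mean shorter restriction fragments, and so more positions near which ligation junctions can form.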
[0113] These data sets were first us...
Embodiment 3
[0120] In this example, assays were performed to investigate the effect of restriction enzyme choice on the number of bases covered and phased. Briefly, three libraries were generated using the exome sequencing protocol and the whole-exome HaploSeq method described above. For this, NcoI (a 6-base cutter) and DpnI (a 4-base cutter) were used. The results are shown in Figure 6. In the whole-exome sequencing sample, 96% of bases were covered at >10x when each library was sequenced to an average coverage of 44x. With the 6-base cutter, however, only about 30% of bases were covered at 10x or greater. This increased to 50% with the 4-base cutter. These results again suggest that multi-enzyme data may be more useful than single-enzyme data sets for genotyping and potentially for haplotyping or de novo assembly applications.
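The coverage figures quoted above are fractions of target bases at or above a depth threshold. A minimal sketch of that calculation follows, assuming a per-base depth list is already available. The function name `fraction_covered` and the depth values are hypothetical and are not data from the example.

```python
from typing import Sequence


def fraction_covered(depths: Sequence[int], min_depth: int = 10) -> float:
    """Fraction of bases whose sequencing depth is at least min_depth."""
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)


# Hypothetical per-base depths over ten target bases.
depths = [0, 3, 12, 25, 10, 9, 44, 60, 1, 15]

frac = fraction_covered(depths, min_depth=10)
mean_depth = sum(depths) / len(depths)
print(f"mean depth={mean_depth:.1f}x, fraction at >=10x={frac:.0%}")
```

Reporting the fraction alongside the mean depth matters because, as the example shows, libraries sequenced to similar average depth can differ widely in how evenly that depth is spread across bases.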


