Compact next-generation sequencing datasets and efficient sequencing processing using them
A compact gene sequencing technology, applied in the field of gene analysis, that addresses the problems of increased cost and high computing cost while preserving compatibility.
Embodiment Construction
[0025] Disclosed herein is a method for formatting raw read data, including base quality scores, in a manner that allows for a substantial reduction in file size while preserving most of the useful information. As discussed earlier, in the regular FASTQ format a read occupies slightly more than 2L_seq (ASCII) characters, where L_seq is the number of bases. Other existing text-based storage formats that store base sequences and corresponding base quality scores also occupy a considerable amount of storage. For example, in the Qseq format, base sequences and quality scores are both stored, but arranged on a single line of text. The FASTA format is able to cut this storage roughly in half, but it does so by losing all base quality score information. Alternatively, one can convert a text-formatted read entry to a non-text format (e.g., a binary format where two bits encode a base and the Phred score is represented by a binary integer value). However, most downstream processing components...
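The storage arithmetic in this paragraph can be illustrated with a minimal Python sketch. It is not the format disclosed here; it only shows why a FASTQ record takes slightly more than 2L_seq characters and what the generic binary alternative mentioned above (two bits per base, one integer per Phred score) might look like. All function names are hypothetical, the Phred+33 ASCII offset is an assumption about the input encoding, and ambiguous bases such as N are not handled.

```python
# Hypothetical helpers for illustration only; not the patent's format.

_BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # 2-bit codes (N is not supported)


def fastq_record_size(header: str, seq: str, qual: str) -> int:
    """Count the ASCII characters in a four-line FASTQ record, newlines included."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    return len(f"@{header}\n{seq}\n+\n{qual}\n")


def pack_bases(seq: str) -> bytes:
    """Pack bases at two bits each, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        out[i // 4] |= _BASE_CODE[base] << (2 * (i % 4))
    return bytes(out)


def unpack_bases(packed: bytes, n_bases: int) -> str:
    """Invert pack_bases; n_bases is needed because the last byte may be padded."""
    return "".join(
        "ACGT"[(packed[i // 4] >> (2 * (i % 4))) & 3] for i in range(n_bases)
    )


def phred_to_bytes(qual: str, offset: int = 33) -> bytes:
    """Convert an ASCII quality string to one integer Phred score per byte.

    The default offset of 33 is an assumption (Sanger-style encoding).
    """
    return bytes(ord(c) - offset for c in qual)


if __name__ == "__main__":
    seq, qual = "ACGTACGTAC", "IIIIIHHHGG"
    print("FASTQ characters:", fastq_record_size("read1", seq, qual))
    print("packed base bytes:", len(pack_bases(seq)))
    print("Phred scores:", list(phred_to_bytes(qual)))
```

For a 10-base read, the FASTQ record holds the 2 x 10 sequence and quality characters plus the header, separator, and newline characters, while the packed bases fit in 3 bytes. Storing each Phred score as a full byte, as in this sketch, is only one possible choice of integer width.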



