Compression method for next generation sequencing data
A compression method for second-generation (next-generation) sequencing data, applicable to the fields of electrical digital data processing, special data processing applications, and instruments. It addresses the problem of low compression ratio, reducing storage space and improving processing speed.
Examples
Embodiment 1
[0034] This embodiment uses data from the 1000 Genomes Project as an example. Sample NA12345 is one of the more than one thousand samples in that project and is used here for convenience of illustration. The sample's second-generation sequencing data is stored in FASTQ format in a file named example.fastq. Steps S11 to S16 below are used to compress this data.
[0035] In this embodiment, step S11 generates the initial BSSL file. The details are as follows.
[0036] In step S11, the split command is first used to split example.fastq into multiple small files of 80000000 lines each (this is the aforementioned first preset length; other values can also be used). The system names the resulting small files automatically; for example, the first file is named exampleaa.fastq. The split comman...
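The splitting in step S11 can be sketched in Python as follows. This is only an illustration of fixed-line-count splitting, not the patent's actual implementation, which uses the split command. The function names `split_by_lines` and `two_letter_suffixes` are hypothetical, and the two-letter naming (aa, ab, ...) is inferred from the exampleaa.fastq file name given above. If the input uses the common four-lines-per-record FASTQ layout (an assumption), a chunk length of 80000000 lines is a multiple of 4, so no record is cut across two files.

```python
import itertools
import string
from pathlib import Path


def two_letter_suffixes():
    """Yield aa, ab, ..., zz (assumed naming, based on exampleaa.fastq)."""
    for first, second in itertools.product(string.ascii_lowercase, repeat=2):
        yield first + second


def split_by_lines(src, lines_per_chunk, out_dir=None):
    """Split src into files of at most lines_per_chunk lines each.

    Output files are named <stem><suffix><extension>, e.g. exampleaa.fastq.
    Returns the list of paths written, in order.
    This sketch supports at most 26 * 26 = 676 output files.
    """
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    src = Path(src)
    out_dir = Path(out_dir) if out_dir is not None else src.parent
    suffixes = two_letter_suffixes()
    written = []
    out = None
    try:
        with src.open("rb") as fin:
            # Stream line by line so a large input is never held in memory.
            for index, line in enumerate(fin):
                if index % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    path = out_dir / f"{src.stem}{next(suffixes)}{src.suffix}"
                    out = path.open("wb")
                    written.append(path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return written


# Hypothetical usage matching the embodiment's file name and chunk length:
# split_by_lines("example.fastq", 80_000_000)
```

Streaming the input one line at a time keeps memory use constant regardless of file size, which matters for sequencing files of this scale.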