Flow correction method for second-generation cancer genome high-throughput sequencing data

A flow correction method for sequencing data, applied in the field of data science. It addresses the problem that existing correction methods are not applicable to tumor genome data processing flows, and it saves storage space, improves accuracy, and improves computing efficiency.

Active Publication Date: 2017-05-31
北京吉因加医学检验实验室有限公司

AI Technical Summary

Problems solved by technology

However, with existing methods, read data based on single-mutation data can only be superimposed. The read data obtained this way follow multiple uniform distributions, which can at best approximate multiple Beta or multiple Dirichlet distributions and cannot fit the true situation.
[0011] In summary, because of the particular nature of tumor tissue, existing flow correction methods for second-generation genome high-throughput sequencing data are not suitable for tumor genome data processing flows.

Detailed Description of Embodiments

[0064] As shown in Figure 1, the present invention discloses a flow correction method for second-generation tumor genome high-throughput sequencing data. A series of 32-bit unsigned numbers is used as identification quantities to record germline (bloodline) variation or somatic variation data, and read data reflecting tumor purity and different subclone proportions are generated. Somatic variation correction data for each child subclone and its sibling subclones are then obtained from the variation relationship in which parent and child subclones are linked by inheritance and sibling subclones are mutually exclusive. These data are used to correct the processing flow of second-generation tumor genome high-throughput sequencing data.

[0065] The variation data algorithm for parent-child subclone inheritance and sibling subclone mutual exclusion is as follows:

[0066] S1. As mentioned above, tumor cells inherit bloodline v...
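The remaining steps are not available in this excerpt. As a rough illustration only, the idea in [0064] of recording variation data in 32-bit unsigned numbers, with inheritance from parent to child subclones and mutual exclusion between siblings, could be sketched in Python as below. The class `Subclone`, the helpers `add_variant`, `siblings_exclusive` and `somatic_only`, and the choice of one bit per variant are all assumptions made for this sketch, not the patent's actual encoding.

```python
from dataclasses import dataclass
from typing import Optional

MASK32 = 0xFFFFFFFF  # keep every flag word within 32 unsigned bits


@dataclass
class Subclone:
    """Hypothetical node in a subclone tree; not taken from the patent text."""
    name: str
    parent: Optional["Subclone"] = None
    own: int = 0  # bits for variants that first arise in this node

    def carried(self) -> int:
        """All variants carried here: own bits plus every ancestor's bits."""
        bits = self.own
        node = self.parent
        while node is not None:
            bits |= node.own
            node = node.parent
        return bits & MASK32


def add_variant(clone: Subclone, index: int) -> None:
    """Mark variant number `index` (0..31, an assumed layout) as arising in `clone`."""
    if not 0 <= index < 32:
        raise ValueError("variant index must fit in a 32-bit word")
    clone.own |= 1 << index


def siblings_exclusive(a: Subclone, b: Subclone) -> bool:
    """True if a and b share a parent and their newly arisen variants do not overlap."""
    return a.parent is b.parent and (a.own & b.own) == 0


def somatic_only(clone: Subclone, germline: Subclone) -> int:
    """Carried variants with the germline bits masked out."""
    return clone.carried() & ~germline.own & MASK32


# Example tree: germline root -> founder clone -> two sibling subclones.
germline = Subclone("germline")
add_variant(germline, 0)
add_variant(germline, 1)
founder = Subclone("founder", parent=germline)
add_variant(founder, 2)
sub_a = Subclone("sub_a", parent=founder)
add_variant(sub_a, 3)
sub_b = Subclone("sub_b", parent=founder)
add_variant(sub_b, 4)

print(f"{sub_a.carried():032b}")
print(siblings_exclusive(sub_a, sub_b))
```

Storing each node's new variants as bits in one word keeps the per-clone record compact, and the inherited set reduces to a bitwise OR along the path to the root.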

Abstract

The invention discloses a flow correction method for second-generation cancer genome high-throughput sequencing data. In the method, a series of 32-bit unsigned numbers is used as identification quantities to record the corresponding germline (bloodline) variation or somatic variation data, and read data reflecting purity and different subclone proportions are generated. Somatic variation correction data for each child subclone and its sibling subclones are acquired according to the variation relationship of parent-child subclone inheritance and sibling subclone mutual exclusion, and the data are used to correct the processing flow of second-generation cancer genome high-throughput sequencing data.

Description

[0001] Technical field

[0002] The invention belongs to the technical field of data science, with precision medicine as its application background, and is an auxiliary correction system for decision support systems for precise tumor diagnosis and treatment.

[0003] Background

[0004] In the past decade, thanks to the rapid development of high-throughput genome and transcriptome sequencing technologies, tumor genomics and precision tumor diagnosis and treatment have made remarkable progress in both the depth of medical research and the breadth of clinical application. Both cancer genomics research and cancer precision medicine rely on high-throughput tumor sequencing data. The genome and transcriptome sequencing data output by the sequencer are called read data; because of their short length and sequencing errors, they cannot be used directly by tumor researchers and clinicians. Some data processing procedures must be used ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/20
CPC: G16B25/00
Inventors: 赵仲孟, 王嘉寅, 耿彧
Owner: 北京吉因加医学检验实验室有限公司