
Genetic data analysis method based on the Spark platform

A technology for genetic data analysis, applied in fields such as genomics, sequence analysis, and instruments. It addresses the poor scalability and low efficiency of genetic data analysis and the difficulty of porting optimizations from one sequencing process to another, and achieves the effect of improving efficiency.

Active Publication Date: 2021-02-02
PHIL RIVERS TECH LTD

AI Technical Summary

Problems solved by technology

However, because the gene samples and algorithm parameters involved differ greatly between data processing pipelines, optimization work done for one specific sequencing process is difficult to port to another.
[0006] Therefore, the existing technology needs to be improved to address the low efficiency and poor scalability of the genetic data analysis process.



Embodiment Construction

[0035] In order to make the purpose, technical solution, design method and advantages of the present invention clearer, the present invention will be further described in detail through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0036] By combining next-generation sequencing gene data analysis with big data processing technology, the present invention proposes a Spark platform development framework for gene data analysis. The framework implements mainstream gene data processing algorithms and abstracts them as APIs (Application Programming Interfaces), so that users can build on it to develop gene sequencing pipelines further.

[0037] Spark is a scalable data analysis platform that uses resilient distributed datasets (RDDs) to provide a distributed in-memory parallel computing engine to ...
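To illustrate how sequencing data might be divided into parts before distributed processing, the following is a minimal plain-Python sketch rather than Spark code. It assumes the input is in the common four-line FASTQ layout, which the text above does not specify, and the names `parse_fastq` and `split_into_partitions` are hypothetical helpers introduced only for this illustration. In PySpark, a comparable split would typically come from `SparkContext.parallelize(reads, numSlices)`.

```python
from typing import List, Tuple

Read = Tuple[str, str]  # (read_id, bases); hypothetical record shape


def parse_fastq(text: str) -> List[Read]:
    """Parse four-line FASTQ records into (read_id, sequence) pairs."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    reads: List[Read] = []
    # Each record occupies four lines: header, bases, separator, qualities.
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        reads.append((header[1:], seq))
    return reads


def split_into_partitions(reads: List[Read], num_partitions: int) -> List[List[Read]]:
    """Split reads into at most num_partitions contiguous chunks."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    # Ceiling division; at least 1 so the range step below is never zero.
    size = max(1, -(-len(reads) // num_partitions))
    return [reads[i:i + size] for i in range(0, len(reads), size)]


SAMPLE = """@r1
ACGT
+
IIII
@r2
GGCA
+
IIII
@r3
TTAG
+
IIII
"""

reads = parse_fastq(SAMPLE)
partitions = split_into_partitions(reads, 2)
print(partitions)
```

Contiguous chunking is chosen here only for simplicity; an actual Spark job would let the platform decide partition boundaries.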


Abstract

The invention provides a genetic data analysis method based on the Spark platform. The method comprises: acquiring gene sequencing data; generating a resilient distributed dataset (RDD) from the acquired gene sequencing data using the Spark platform, wherein the RDD includes a plurality of parts; and aligning each part of the RDD against a reference gene to generate an RDD containing the alignment results. With this method, genetic data analysis algorithms can be implemented on the Spark platform, thereby improving the efficiency and flexibility of genetic data analysis.
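The per-part alignment step described in the abstract can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented method: exact substring search stands in for a real alignment algorithm, the partitions are ordinary lists rather than an RDD, and `align_partition`, `align_all`, and the sample data are hypothetical. In PySpark, the same per-partition function could plausibly be applied with `RDD.mapPartitions`.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str]       # (read_id, bases); hypothetical record shape
Alignment = Tuple[str, int]  # (read_id, 0-based position, or -1 if unaligned)


def align_partition(reads: Iterable[Read], reference: str) -> Iterator[Alignment]:
    """Align every read in one partition against the reference.

    Exact substring search is a toy stand-in for a real aligner.
    """
    for read_id, seq in reads:
        yield read_id, reference.find(seq)


def align_all(partitions: List[List[Read]], reference: str) -> List[List[Alignment]]:
    """Apply align_partition to each partition independently.

    Partitions do not depend on one another, which is what would allow
    a platform such as Spark to process them in parallel.
    """
    return [list(align_partition(part, reference)) for part in partitions]


REFERENCE = "TTACGTAGGCATT"
partitions = [
    [("r1", "ACGT"), ("r2", "GGCA")],
    [("r3", "CCCC")],
]

results = align_all(partitions, REFERENCE)
print(results)  # [[('r1', 2), ('r2', 7)], [('r3', -1)]]
```

Each output partition mirrors its input partition, so the collection of results has the same shape as the collection of reads, analogous to an RDD of reads being transformed into an RDD of alignment results.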

Description

Technical field

[0001] The invention relates to the technical field of gene data sequencing, in particular to a gene data analysis method based on the Spark platform.

Background

[0002] In recent years, gene sequencing technology has developed rapidly. In particular, the wide application of next-generation sequencing (NGS) technology has given gene sequencing an important role in disease monitoring, biomedicine and other fields, and medical products related to gene sequencing have gradually taken shape and show huge market potential.

[0003] However, with the explosive growth of next-generation sequencing data, traditional genetic data analysis tools and methods can no longer meet the processing needs of massive biological data, and the processing speed of genetic data has gradually become the bottleneck in the entire gene sequencing process. Although a lot of optimization work has been done on genetic data processing at ho...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B20/20, G16B30/10, G16B50/30
CPC: G16B20/00, G16B50/00, G16B30/00
Inventor: 谭光明, 张中海, 牛钢, 王炳琛, 张春明
Owner: PHIL RIVERS TECH LTD