
Intelligent structured storage and extraction method for gene sequencing data

A gene sequencing and intelligent structured storage technology, applied in genomics, instrumentation, proteomics, and related fields, that addresses database performance problems.

Pending Publication Date: 2022-01-07
上海烈冰生物医药科技有限公司

AI Technical Summary

Problems solved by technology

Traditional databases such as MySQL run into performance problems once they hold more than 2 million to 3 million records.



Examples


Embodiment

[0017] Example: As shown in Figure 1, the intelligent structured storage and extraction method for gene sequencing data of the present invention includes the following steps:

[0018] Step 1. Establish a columnar database and store each gene sequencing site in the database;

[0019] Step 2. According to the particular characteristics of bioinformatics data and gene sequencing mutation sites, optimize the mutation site data or modify its format so that it can be conveniently stored in a non-relational database;

[0020] Step 3. Using the mutation site data and other information stored in the database, compare it with known mutation site information and combine it with phenotype information to analyze biological significance;

[0021] Step 4. Set up a fast incremental statistical method to compute statistics on the data and enter the data into the database.
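The incremental statistics in Step 4 could be sketched roughly as follows. The source does not describe the actual statistical method, so this is only a minimal illustration under stated assumptions: the class `IncrementalSiteStats`, the site key format `chrom:pos:ref>alt`, and the choice of carrier frequency as the statistic are all hypothetical. The point it shows is that each new batch only updates running counters, so earlier data never has to be rescanned.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class IncrementalSiteStats:
    """Running per-site carrier counts, updated one batch at a time.

    Hypothetical helper for illustration; not the patent's actual method.
    """
    n_samples: int = 0
    carriers: Counter = field(default_factory=Counter)

    def add_batch(self, batch):
        """batch maps sample_id -> set of site keys mutated in that sample."""
        for sites in batch.values():
            self.n_samples += 1
            # A set contributes each site once, so this counts carriers.
            self.carriers.update(sites)

    def frequency(self, site):
        """Fraction of samples seen so far that carry a mutation at `site`."""
        if self.n_samples == 0:
            return 0.0
        return self.carriers[site] / self.n_samples


stats = IncrementalSiteStats()
# First batch of newly sequenced samples (made-up example data).
stats.add_batch({
    "S1": {"chr1:12345:A>G"},
    "S2": {"chr1:12345:A>G", "chr2:500:C>T"},
})
# A later batch only touches the counters; the first batch is not re-read.
stats.add_batch({"S3": {"chr2:500:C>T"}, "S4": set()})
print(stats.frequency("chr1:12345:A>G"))  # 2 carriers out of 4 samples
```

In a real deployment the counters would presumably live in the database itself rather than in memory, but the update pattern would be the same.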

[0022] In Step 2, the coverage and mutation information of a site are recorded at the same time, and the method of...
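One way to picture a record that carries coverage and mutation information together is sketched below. The source text is truncated before it describes the actual format, so every field name, the `SiteRecord` class, and the row key layout here are assumptions made purely for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class SiteRecord:
    """One sequenced site for one sample (hypothetical schema)."""
    sample_id: str
    chrom: str
    pos: int
    ref: str
    depth: int                 # coverage: number of reads over the site
    alt: Optional[str] = None  # mutated allele, or None if no mutation
    alt_reads: int = 0         # reads supporting the mutated allele

    def row_key(self) -> str:
        # Zero-padding the position makes string order follow genomic
        # order within a chromosome, which suits range scans.
        return f"{self.chrom}:{self.pos:010d}:{self.sample_id}"

    def to_document(self) -> str:
        # Serialize the whole record as a single JSON document.
        return json.dumps(asdict(self), sort_keys=True)


mutated = SiteRecord("S1", "chr1", 12345, "A", depth=40, alt="G", alt_reads=18)
covered_only = SiteRecord("S1", "chr1", 12346, "C", depth=35)

print(mutated.row_key())
print(mutated.to_document())
```

Keeping depth on every record, mutated or not, lets a later query distinguish "no mutation observed" from "site not covered".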


Abstract

The invention discloses an intelligent structured storage and extraction method for gene sequencing data, comprising the following steps: establishing a non-relational database and storing each gene sequencing site in the database; optimizing the mutation site data, or modifying its format, according to the particular characteristics of bioinformatics data and gene sequencing mutation sites, so that the data can be conveniently stored in a non-relational database; using the mutation site, expression profile, ChIP, methylation, and similar data stored in the database, comparing it with known mutation site, expression profile, ChIP, and methylation information, and using phenotype information to analyze biological significance; and setting up a rapid incremental statistical method so that newly added gene-related data is quickly entered into the database and batch statistics are computed. By adopting a non-relational database, the method can store large amounts of gene-related information such as gene mutations, expression profiles, ChIP, and methylation data without stalling, and computes statistics on newly entered gene-related data with the rapid incremental statistical method.
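The comparison against known mutation sites combined with phenotype information might look something like the sketch below. The source does not specify how the comparison is performed; the function `carriers_by_phenotype`, the site key format, and the sample data are hypothetical, and the "analysis" is reduced to counting carriers of known sites per phenotype group.

```python
from collections import defaultdict


def carriers_by_phenotype(observed, known, phenotype):
    """Count carriers of each known site, split by phenotype label.

    observed:  sample_id -> set of site keys found in that sample
    known:     site key  -> annotation for a previously reported site
    phenotype: sample_id -> phenotype label
    """
    table = defaultdict(lambda: defaultdict(int))
    for sample, sites in observed.items():
        label = phenotype.get(sample, "unknown")
        # Keep only sites that also appear in the known-site catalogue.
        for site in known.keys() & sites:
            table[site][label] += 1
    return {site: dict(counts) for site, counts in table.items()}


# Made-up example data.
observed = {
    "S1": {"chr1:12345:A>G"},
    "S2": {"chr1:12345:A>G", "chr2:500:C>T"},
    "S3": {"chr2:500:C>T"},
}
known = {"chr1:12345:A>G": "hypothetical annotation"}
phenotype = {"S1": "case", "S2": "case", "S3": "control"}

result = carriers_by_phenotype(observed, known, phenotype)
print(result)
```

A table of this shape is the kind of input a downstream association test would consume; the test itself is outside what the source describes.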

Description

Technical field

[0001] The invention relates to the field of gene data storage and extraction, in particular to a method for intelligent structured storage and extraction of gene sequencing data.

Background

[0002] In recent years, the analysis of genetic data has deepened, and the importance of studying genetic information is now widely recognized. Genetic data are huge, incomplete, and highly random: a single sample has tens of thousands of gene expression values, more than ten thousand methylation and transcription factor binding sites, and hundreds of thousands to millions of mutation sites. When the sample size grows to hundreds of thousands of individuals, the amount of data rises further, into the trillions. Getting this information into a database quickly and querying it at high speed is a challenge. For traditional databases such as MySQL, performance problems will occur if there are more th...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/30, G16B50/00
CPC: G16B20/30, G16B50/00
Inventor: 陈岱宗杰
Owner: 上海烈冰生物医药科技有限公司