Promoter query method, system and equipment for species genome

A promoter query technology, applied in the field of big data, that addresses the low prediction accuracy, heavy prediction workload, and narrow applicability of existing promoter prediction tools.

Active Publication Date: 2021-08-20
JINAN UNIVERSITY

AI Technical Summary

Problems solved by technology

[0006] The embodiment of the present invention provides a species genome promoter query method, system and equipment that solve the technical problems of existing gene promoter prediction tools: narrow applicability, the need for manual coordination, heavy workload and low prediction accuracy.



Examples


Embodiment 1

[0050] Figure 1 is a flow chart of the steps of the promoter query method for a species genome described in the embodiment of the present invention.

[0051] As shown in Figure 1, the embodiment of the present invention provides a promoter query method for a species genome, comprising the following steps:

[0052] S1. Obtain the genome file, genome annotation file, and fastq file of the high-throughput transcriptome of the species.

[0053] It should be noted that the genome files, genome annotation files, and fastq files of the corresponding high-throughput transcriptomes of several species are mainly obtained from the NCBI database, the JGI database and/or the Ensembl genome database. The high-throughput transcriptome data in the fastq file can be generated by Illumina-series sequencing instruments or downloaded from the NCBI database. In this embodiment, the genome file and genome annotation file of one of the species are used as a case illust...
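As an illustration of the fastq input named in step S1, the following is a minimal sketch of a reader for the common four-line FASTQ record layout (header, sequence, separator, quality). It is not part of the patented method; the names `FastqRecord` and `read_fastq` are hypothetical, and the sketch assumes uncompressed text with unwrapped sequence lines.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str   # header line without the leading "@"
    sequence: str  # base calls
    quality: str   # per-base quality string, same length as sequence


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-record FASTQ text stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


if __name__ == "__main__":
    # Hypothetical in-memory sample; a real run would open a downloaded file.
    sample = io.StringIO("@read1\nACGT\n+\nIIII\n")
    for record in read_fastq(sample):
        print(record.read_id, len(record.sequence))
```

Validating the record shape at this stage is a design choice of the sketch: it surfaces truncated downloads before the later conversion step consumes the file.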

Embodiment 2

[0081] Figure 2 is a frame diagram of the promoter query system for a species genome in the embodiment of the present invention.

[0082] As shown in Figure 2, the embodiment of the present invention also provides a species genome promoter query system, including a data acquisition module 10, a data processing module 20, a conversion module 30, a database construction module 40 and a query module 50;

[0083] The data acquisition module 10 is used to obtain the genome file of the species, the genome annotation file and the fastq file of the high-throughput transcriptome of the species;

[0084] The data processing module 20 is used to sort all the genes in the genome file according to the genome annotation file, and obtain a species promoter file containing the gene number and the starting coordinates, sequence, and length of the gene promoter;

[0085] The conversion module 30 is used to convert the fastq file, using hisat2 or Trinity software, into an FPKM file that c...
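The outputs of the data processing module 20 and the conversion module 30 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the annotation is assumed to be already parsed into simple records with 1-based inclusive coordinates, the promoter is assumed to be a fixed-length window immediately upstream of the gene (the 1000 bp default is a hypothetical value, since the source does not state one here), and `fpkm` only shows the standard definition of the value, which the source obtains through hisat2 or Trinity. All names are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


class Promoter(NamedTuple):
    gene_id: str
    chrom: str
    start: int     # 1-based start of the window on the forward strand
    length: int
    sequence: str  # reported 5'->3' relative to the gene


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def extract_promoters(genome: Dict[str, str], genes: List[Gene],
                      upstream: int = 1000) -> List[Promoter]:
    """Sort genes by position and cut an assumed upstream window for each."""
    promoters = []
    for gene in sorted(genes, key=lambda g: (g.chrom, g.start)):
        chrom_seq = genome[gene.chrom]
        if gene.strand == "+":
            lo = max(0, gene.start - 1 - upstream)  # 0-based, inclusive
            hi = gene.start - 1                      # 0-based, exclusive
            seq = chrom_seq[lo:hi]
        else:
            # On the minus strand, "upstream" lies after the gene end.
            lo = gene.end
            hi = min(len(chrom_seq), gene.end + upstream)
            seq = reverse_complement(chrom_seq[lo:hi])
        promoters.append(Promoter(gene.gene_id, gene.chrom, lo + 1, len(seq), seq))
    return promoters


def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)
```

Clamping the window at the chromosome ends means a gene near a contig boundary yields a shorter promoter, which is why the length is stored alongside the sequence.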

Embodiment 3

[0098] An embodiment of the present invention provides a species genome promoter query device, including a processor and a memory;

[0099] The memory is used to store program code and transmit the program code to the processor;

[0100] The processor is configured to execute the above-mentioned method for querying promoters of species genomes according to instructions in the program code.

[0101] It should be noted that the processor is configured to execute the steps in the above embodiment of the method for querying a promoter of a species genome according to the instructions in the program code. Alternatively, when the processor executes the computer program, the functions of the modules/units in the above system/device embodiments are realized.

[0102] Exemplarily, a computer program can be divided into one or more modules/units, and the one or more modules/units are stored in a memory and executed by a processor to complete the present application. One or more modules/uni...


Abstract

The invention relates to a promoter query method, system and equipment for species genomes. The method comprises the following steps: acquiring a genome file, a genome annotation file and a fastq file of a species; sorting all genes in the genome file according to the genome annotation file to obtain a species promoter file; converting the fastq file into an FPKM file containing highly expressed genes using hisat2 or Trinity software; and repeating steps S1 to S3 to obtain the species promoter files and FPKM files of multiple species, from which the promoter database is constructed. With this method, the promoter of a required gene can be queried in the promoter database, queries are not limited to the gene promoters of any particular species, no additional auxiliary tool is needed, and the accuracy of the queried promoter is high.
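The database construction and query steps summarized above can be sketched as follows. This is only a minimal sketch: the source does not name a storage technology, so the use of SQLite, the table and column names, and the `min_fpkm` threshold for "highly expressed" are all assumptions made for illustration.

```python
import sqlite3
from typing import Iterable, List, Tuple

PromoterRow = Tuple[str, str, int, int, str]  # species, gene_id, start, length, sequence
ExpressionRow = Tuple[str, str, float]        # species, gene_id, fpkm


def build_database(promoters: Iterable[PromoterRow],
                   expression: Iterable[ExpressionRow]) -> sqlite3.Connection:
    """Load per-species promoter and FPKM records into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE promoter (species TEXT, gene_id TEXT, start INTEGER, "
        "length INTEGER, sequence TEXT, PRIMARY KEY (species, gene_id))"
    )
    conn.execute(
        "CREATE TABLE expression (species TEXT, gene_id TEXT, fpkm REAL, "
        "PRIMARY KEY (species, gene_id))"
    )
    conn.executemany("INSERT INTO promoter VALUES (?, ?, ?, ?, ?)", promoters)
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", expression)
    conn.commit()
    return conn


def query_promoters(conn: sqlite3.Connection, species: str,
                    min_fpkm: float = 0.0) -> List[tuple]:
    """Return promoters of one species, most highly expressed genes first."""
    cursor = conn.execute(
        "SELECT p.gene_id, p.start, p.length, p.sequence, e.fpkm "
        "FROM promoter p JOIN expression e "
        "ON p.species = e.species AND p.gene_id = e.gene_id "
        "WHERE p.species = ? AND e.fpkm >= ? "
        "ORDER BY e.fpkm DESC",
        (species, min_fpkm),
    )
    return cursor.fetchall()
```

Keying both tables on the species and gene identifier lets files from additional species be appended without changing the schema, which matches the abstract's point that queries are not tied to one species.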

Description

Technical field

[0001] The present invention relates to the field of big data technology, in particular to a promoter query method, system and equipment for a species genome.

Background technique

[0002] With the rapid development of high-throughput sequencing, the genomes of more and more species have been sequenced. For the genetic transformation of species vectors, genome sequencing makes it possible to search for a basic element of the vector, the promoter. Studies have shown that transcription starts in the DNA region upstream of the gene, which is the assembly region for RNA polymerase II (PolII) and the related transcription factors required for transcription initiation. This key region of the gene is called the core promoter. The promoter is the "baton" that regulates gene expression: it controls the level, location and mode of gene expression.

[0003] The establishment of a genetic transformation system of a species is the basis for the study of the ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B30/10, G06F16/245
CPC: G06F16/245, G16B30/10
Inventor: 李宏业, 李达伟, 黄小龙, 黄丹, 杨维东
Owner: JINAN UNIVERSITY