Method for batch computing genome coding region SNP sites among related species

A technique for batch calculation of SNP sites in genome coding regions between closely related species. It addresses the difficulty of operating existing tools and replaces time-consuming, labor-intensive approaches with a fast method that gives good results.

Active Publication Date: 2019-01-18
JIANGSU ACAD OF AGRI SCI

AI Technical Summary

Problems solved by technology

[0004] Most existing software for SNP development computes SNPs within a single species. Software for SNP sites between closely related species, and for cSNP sites in particular, is scarce and is difficult for researchers without a bioinformatics background to operate.



Examples


Embodiment 1

[0036] Example 1. Establishment of a method for batch calculation of genomic cSNP sites among closely related species

[0037] The flow chart of the method provided by the present invention for batch calculation of genomic cSNP sites between closely related species is shown in Figure 1. The method includes the following steps:

[0038] (1) Use the InParanoid program, which clusters the results of Blast pairwise comparisons, to identify the orthologous genes between the Speci I and Speci II genomes under test (data sets A and B). Run the analysis on a Linux system with the default parameter settings, and follow the steps below to obtain the complete file of orthologous gene pair IDs and scores (data set C);

[0039] Steps to obtain the complete orthologous gene pair ID and score information file: open the folder containing the InParanoid script "inparanoid.pl" and run the command "perl inparanoid.pl XXX1 XXX2", where "XXX1" and "XXX2" Represe...
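As an illustration of step (1), the following Python sketch builds the command quoted in [0039] and reads a gene pair table into memory. It is a minimal sketch under stated assumptions, not the patent's own script: the helper names `inparanoid_command` and `read_pairs` are hypothetical, and the three-column, tab-separated layout (gene ID from data set A, gene ID from data set B, score) is an assumed form of data set C, which may differ from the file InParanoid actually writes.

```python
import csv
from typing import Iterator, List, NamedTuple


class OrthologPair(NamedTuple):
    gene_a: str   # gene ID from data set A
    gene_b: str   # gene ID from data set B
    score: float  # score reported for the pair


def inparanoid_command(fasta_a: str, fasta_b: str) -> List[str]:
    """Build the argument list for the command quoted in [0039].

    The command is only constructed here, not executed.
    """
    return ["perl", "inparanoid.pl", fasta_a, fasta_b]


def read_pairs(handle) -> Iterator[OrthologPair]:
    """Read an ASSUMED three-column, tab-separated pair table.

    Blank lines and lines starting with '#' are skipped.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        yield OrthologPair(row[0], row[1], float(row[2]))
```

The argument list returned by `inparanoid_command` could be passed to `subprocess.run` from the folder that holds "inparanoid.pl"; keeping construction separate from execution makes the sketch usable without InParanoid installed.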

Embodiment 2

[0056] Example 2. Using the method established in Example 1 for batch calculation of cSNP sites between the genomes of the closely related species Chinese cabbage (Brassica rapa) and cabbage (Brassica oleracea)

[0057] Go to the Chinese Cabbage Genome Project database (http://brassicadb.org/brad/) and the Brassica genome project database of the Institute of Oil Crops (http://www.ocri-genomics.org/bolbase/index.html) and download the genome sequence of Chinese cabbage (Brassica rapa) (10 chromosomes, 485 Mb) and the genome sequence of cabbage (Brassica oleracea) (9 chromosomes, 630 Mb). The cSNP sites between the Chinese cabbage and cabbage genomes were calculated on a Windows system or a local Linux computing server. The commonly used program names, operating environments, and addresses involved in the calculation are shown in Table 1. The specific steps of the calculation method are as follows:

[0058] 1) Proceed as in step (1) of Example 1. ...



Abstract

The invention discloses a method for batch computation of SNP sites in genome coding regions (coding SNPs, cSNPs) among related species. The method combines the InParanoid program, which finds orthologous genes by clustering Blast-based pairwise comparison results, with the Crossmatch vector-masking software, which compares two sets of DNA coding region sequences (cds sequences), and ties the two together with scripts written in the Perl language. Experiments show that the method is systematic, reproducible, and fast, and that it lends itself to batch, automated, pipelined detection of cSNPs among related species.
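To make the notion of a cSNP between an orthologous pair concrete, the sketch below lists single-base differences between two coding sequences that have already been aligned. It is an illustration only and does not reproduce the Crossmatch comparison used by the method: the function name `call_csnps`, the use of "-" as the gap character, and the choice to ignore gaps and ambiguous bases are all assumptions made for the example.

```python
from typing import List, Tuple


def call_csnps(aligned_a: str, aligned_b: str) -> List[Tuple[int, str, str]]:
    """List single-base differences between two pre-aligned CDS sequences.

    Each result is (1-based position in the ungapped sequence A,
    base in A, base in B). Columns containing a gap ('-') or a
    non-ACGT symbol in either sequence are not reported.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    snps = []
    pos_a = 0  # position counter in the ungapped sequence A
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a != "-":
            pos_a += 1
        if base_a in "ACGT" and base_b in "ACGT" and base_a != base_b:
            snps.append((pos_a, base_a, base_b))
    return snps
```

Reporting positions relative to the ungapped sequence A keeps each site traceable to a coordinate in that gene's cds record; a real pipeline would also need to record the coordinate in sequence B.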

Description

Technical field

[0001] The invention belongs to the field of biotechnology and relates to a method for batch calculation of SNP sites in genome coding regions (cSNPs) among closely related species.

Background art

[0002] Molecular markers are an important tool in molecular genetics research. They are widely used in map construction, molecular marker-assisted breeding, population association analysis, biological population diversity analysis, kinship research, and other fields. Molecular markers that are transferable between species can be used in research on interspecies evolutionary relationships, interspecies comparative mapping, and molecular marker-assisted breeding, and are of important biological significance. Early studies on the transferability of DNA molecular markers focused mostly on polymorphism-rich SSR markers. As a new type of molecular marker developed in recent years, SNP markers have the advantages of high genetic stability, co-dominance, rich c...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/20, G16B20/30, G16B30/10
Inventor: 郭月, 刘静, 杜建厂, 胡茂龙, 浦惠明, 张洁夫, 龙卫华, 张维, 周晓婴
Owner: JIANGSU ACAD OF AGRI SCI