A method for functional prediction of single nucleotide genomic variation in a non-coding region

A single nucleotide variation technology, applied in the field of genes, that achieves low time and economic cost

Active Publication Date: 2018-12-18
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

[0006] At present, no method systematically identifies the binding sites of multiple transcription factors in a single pass and evaluates how nucleotide variations within those binding sites affect transcription factor binding and downstream gene transcription.



Examples


Embodiment 1

[0031] Embodiment 1: Accuracy test of the prediction method of the present invention: calculating the effect of a single nucleotide polymorphism (SNP) located in a transcription factor binding site on transcription factor binding

[0032] The proto-oncogene c-MYC has a regulatory region (8q24) located 335 kb away from the gene. This region contains the SNP rs6983267 (genome assembly GRCh37.p13). In the East Asian population of the 1000 Genomes Project (Phase3_V1-EAS, sample size 1008), the allele frequencies at this site on the positive strand are guanine (G) = 0.388 and thymine (T) = 0.612. In the European population (1000 Genomes Project, sample size 1006), the frequencies are G = 0.499 and T = 0.501. The DNA sequence containing the SNP ("ATGAAAGGC") is a binding site of the transcription factor TCF4, whose regulated target gene is c-MYC.

[0033] In the European population, allele G has a slightly lower frequency (0.499) and allele T has a slightly higher frequency (0.501)...

Embodiment 2

[0037] Embodiment 2: Identifying transcription factors common to 8 cell lines and predicting the function of genomic variations in transcription factor binding sites

[0038] 1. Data source:

[0039] The high-throughput sequencing data (DNase-Seq) of the 8 cell lines GM12878, IMR90, MCF-7, K562, BJ, H7, HepG2 and M059J were obtained from the Gene Expression Omnibus (GEO) of the National Center for Biotechnology Information under accession number GSE32970 (Nature 2012 Sep 6;489(7414):75-82). The cell types of the eight cell lines are listed in Table 1.

[0040] Table 1

[0041]

[0042] The sources of the expression data (RNA high-throughput sequencing, RNA-Seq) for the eight cell lines are shown in Table 1. The variation data were obtained from the UCSC Genome Browser of the University of California, Santa Cruz: the simple variation data (dbSNP150) for the human genome assembly hg19 were downloaded with the Tables function. Single-base SNPs are selected according to the ...
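The selection criterion is cut off in the text above, so the following is only a minimal Python sketch of one plausible way to keep single-base SNPs and then retain those that fall inside transcription factor binding-site intervals. The column names (`chrom`, `chromStart`, `chromEnd`, `name`, `class`), the `class == "single"` filter, the 0-based half-open coordinates, and all records in `TOY_TABLE` (including the IDs `rsA`, `rsB`, `rsC`) are assumptions made for illustration, not values from the patent.

```python
import csv
import io


def single_base_snps(rows):
    """Yield (chrom, start, end, name) for rows that describe a one-base substitution.

    Assumption: each row is a dict with the hypothetical columns
    chrom, chromStart, chromEnd, name and class.
    """
    for row in rows:
        start, end = int(row["chromStart"]), int(row["chromEnd"])
        if row["class"] == "single" and end - start == 1:
            yield row["chrom"], start, end, row["name"]


def snps_in_sites(snps, sites):
    """Return SNPs whose 0-based position lies inside a half-open site interval."""
    hits = []
    for chrom, start, _end, name in snps:
        for s_chrom, s_start, s_end in sites:
            if chrom == s_chrom and s_start <= start < s_end:
                hits.append((name, s_chrom, s_start, s_end))
                break  # report each SNP once, with the first matching site
    return hits


# Toy, made-up variation table (tab-separated); not real dbSNP records.
TOY_TABLE = (
    "chrom\tchromStart\tchromEnd\tname\tclass\n"
    "chr1\t100\t101\trsA\tsingle\n"
    "chr1\t200\t203\trsB\tdeletion\n"
    "chr1\t500\t501\trsC\tsingle\n"
)

rows = list(csv.DictReader(io.StringIO(TOY_TABLE), delimiter="\t"))
snps = list(single_base_snps(rows))

# One hypothetical 9-bp binding site on chr1.
hits = snps_in_sites(snps, [("chr1", 95, 104)])
print(snps)
print(hits)
```

The nested loop is adequate for a toy table; for a genome-wide variation set, sorted intervals or an interval tree would be the usual choice.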


Abstract

The invention discloses a method for functional prediction of single nucleotide genomic variation in a non-coding region, which comprises the following steps: 1) identifying chromatin open regions; 2) identifying transcription factor binding sites; 3) assessing the role of single nucleotide variations: based on the site-specific frequency matrix of each transcription factor, the effect of single nucleotide variations in the transcription factor binding site region on transcription factor binding is calculated, and the single nucleotide variations that significantly change the binding capacity of transcription factors are identified; the role of these variations is further evaluated by examining the biological pathways of the target genes of the transcription factor. Using chromatin open region information and gene expression information, the method recognizes a variety of transcription factors and their binding sites at one time and provides functional annotation of genomic variation in non-coding regions.
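Step 3 can be illustrated with a minimal Python sketch of a common way to score a variant against a frequency matrix: convert the per-position base frequencies to log-odds scores, score the reference and variant sequences, and take the difference. The uniform background (0.25), the pseudocount, the function names, and the three-position matrix `TOY_FREQS` are assumptions for illustration; the excerpt does not give the patent's exact scoring formula or significance threshold.

```python
import math

BASES = "ACGT"


def log_odds_matrix(freq_matrix, background=0.25, pseudocount=0.01):
    """Convert per-position base frequencies into log2 odds versus a uniform background."""
    matrix = []
    for column in freq_matrix:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append(
            {b: math.log2(((column[b] + pseudocount) / total) / background) for b in BASES}
        )
    return matrix


def score(site, pwm):
    """Sum the log-odds of each base of `site` at its position in the matrix."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif length")
    return sum(column[base] for base, column in zip(site, pwm))


def variant_effect(ref_site, offset, alt_base, pwm):
    """Return score(variant) - score(reference); negative values suggest weaker binding."""
    alt_site = ref_site[:offset] + alt_base + ref_site[offset + 1:]
    return score(alt_site, pwm) - score(ref_site, pwm)


# Hypothetical three-position frequency matrix, for illustration only.
TOY_FREQS = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]

pwm = log_odds_matrix(TOY_FREQS)
delta_conserved = variant_effect("AGA", 1, "T", pwm)    # G>T at a conserved position
delta_degenerate = variant_effect("AGA", 2, "C", pwm)   # A>C at an uninformative position
print(delta_conserved, delta_degenerate)
```

A substitution at a strongly conserved motif position produces a large negative difference, while one at a degenerate position leaves the score unchanged; a cutoff on this difference would then flag variants that substantially alter predicted binding.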

Description

Technical field

[0001] The invention belongs to the field of gene technology, and in particular relates to a method for predicting the function of single nucleotide genomic variation in a non-coding region. Based on high-throughput sequencing information of open chromatin regions, the invention identifies transcription factors (TFs) and their binding sites in eukaryotic genomes, and assesses the effect of single nucleotide variations based on the motifs of transcription factor binding to DNA.

Background art

[0002] All biological functions and characteristics of cells are related to the transcriptional regulation of genes. Transcriptional regulation is cell-type specific and closely related to differentiation and carcinogenesis. It is key to understanding how cells function and to addressing cancer. To analyze the transcriptional regulation of genes, the first task is to identify the binding sites (TFBS) of various trans...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/20
Inventor: 刘宏德, 孙啸, 罗坤, 马伟恒
Owner: SOUTHEAST UNIV