Transcription factor binding site prediction algorithm and device across transcription factors

A transcription factor binding site prediction technology, applied in the field of bioinformatics, that addresses cases in which current methods cannot be used to predict transcription factor binding sites.

Active Publication Date: 2019-10-15
HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL
3 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

Therefore, current prediction methods cannot be used to predict the binding site of a transcription factor...

Examples

Embodiment 1

[0035] Referring to Figure 1 and Figure 2, Embodiment 1 of the present invention provides a method for predicting binding sites across transcription factors. The main steps are:

[0036] 1. Use a convolutional neural network model to predict the amino acids in all transcription factors that can bind to DNA, called DNA binding sites. The predicted DNA binding sites are mainly used to measure the contribution of the labeled data of different transcription factors during training of the target transcription factor model.

[0037] 2. Use a long short-term memory network model (LSTM) to learn representation vectors for transcription factors from sequences consisting of the predicted DNA binding sites.

[0038] 3. Use a convolutional neural network model to learn the high-order dependencies of DNA fragments from their histone modification signatures.

[0039] 4. Use a convolutional neural network model to learn the low-order dependencies of DNA fragments from the ...
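As an illustration of how the output of paragraph [0036] could feed the LSTM of paragraph [0037], the following is a minimal sketch, not the patented implementation. It assumes the first model emits one binding score per residue. The threshold `BINDING_THRESHOLD`, the helper names, the padding scheme, and the example scores are all hypothetical and do not come from the source.

```python
# Hypothetical cutoff: the source does not state how scores become binding calls.
BINDING_THRESHOLD = 0.5

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Index 0 is reserved for padding, so residues map to 1..20.
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def binding_site_sequence(protein, scores, threshold=BINDING_THRESHOLD):
    """Keep, in order, the residues whose predicted score reaches the threshold."""
    if len(protein) != len(scores):
        raise ValueError("one score per residue is required")
    return "".join(aa for aa, s in zip(protein, scores) if s >= threshold)


def encode_for_lstm(residues, max_len):
    """Map residues to integer indices, then truncate or right-pad with 0."""
    indices = [AA_TO_INDEX[aa] for aa in residues[:max_len]]
    return indices + [0] * (max_len - len(indices))


# Made-up example: a short protein and made-up per-residue binding scores.
protein = "MKRHSTA"
scores = [0.1, 0.9, 0.8, 0.7, 0.2, 0.3, 0.1]

sites = binding_site_sequence(protein, scores)   # residues K, R, H survive
encoded = encode_for_lstm(sites, max_len=5)      # fixed-length integer input
```

The integer sequence would then be passed through an embedding layer and an LSTM in a deep learning framework; that part is omitted here because the source does not specify layer sizes.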

Embodiment 2

[0046] Embodiment 2 of the present invention provides a device for predicting binding sites across transcription factors, which mainly includes the following modules:

[0047] A DNA binding site prediction module, which uses a convolutional neural network model to predict the amino acids in all transcription factors that can bind to DNA, called DNA binding sites. The predicted DNA binding sites are mainly used to measure the contribution of the labeled data of different transcription factors during training of the target transcription factor model.

[0048] A transcription factor representation learning module, which uses a long short-term memory network model (LSTM) to learn representation vectors for transcription factors from a sequence consisting of the predicted DNA binding sites.

[0049] A high-order dependency learning module, which learns the high-order dependencies of DNA fragments from the histone modification signatures of the DNA fragments using a c...
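To make the idea of learning from histone modification signatures with a convolutional model (paragraph [0038]) more concrete, here is a minimal sketch of a single convolutional filter applied to one signal track. It only assumes that a histone mark is available as a list of numbers binned along the DNA fragment. The signal values, the filter weights, and the function names are hypothetical; a trained model would learn many such filters rather than use a hand-written one.

```python
def conv1d_relu(signal, kernel, bias=0.0):
    """Slide the kernel over the signal (no padding) and apply ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(signal) - k + 1):
        window = signal[start:start + k]
        activation = sum(w * x for w, x in zip(kernel, window)) + bias
        out.append(max(0.0, activation))
    return out


def global_max_pool(feature_map):
    """Reduce a feature map to its strongest response."""
    return max(feature_map)


# Made-up signal for one histone mark, binned along a DNA fragment.
histone_signal = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
# Hypothetical filter that responds to a local peak in the signal.
peak_kernel = [-1.0, 2.0, -1.0]

feature_map = conv1d_relu(histone_signal, peak_kernel)
feature = global_max_pool(feature_map)
```

The filter fires only at the window centred on the peak, and max pooling keeps that response regardless of where the peak sits in the fragment.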

Abstract

The invention provides a transcription factor binding site prediction algorithm and device across transcription factors. The method comprises the following steps: 1, predicting amino acids capable of binding to DNA in all transcription factors, namely DNA binding sites, wherein the predicted DNA binding sites are mainly used for measuring contributions of labeled data of different transcription factors in a target transcription factor model training process; 2, learning a representation vector of transcription factors from a sequence composed of the predicted DNA binding sites; 3, learning the high-order dependency relationship of the DNA fragments from the histone modification characteristics of the DNA fragments; 4, learning the low-order dependency relationship of the DNA fragments from the sequence characteristics of the DNA fragments; 5, splicing the learned vector representation of the transcription factors, the high-order dependency relationship and the low-order dependency relationship of the DNA fragments into a feature vector, inputting the feature vector into a multilayer perceptron to classify the target DNA fragments, and judging whether the target DNA fragments are binding sites of the target transcription factor or not.
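Step 5 of the abstract (splicing three learned vectors and classifying with a multilayer perceptron) can be sketched as follows. This is an illustrative forward pass only, assuming one hidden layer with ReLU and a sigmoid output. The vector sizes, the weights in `params`, the 0.5 decision cutoff, and all function names are hypothetical and are not taken from the source.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_binding(tf_vector, high_order, low_order, params):
    """Concatenate the three representations and score them with an MLP."""
    features = tf_vector + high_order + low_order  # list concatenation (splicing)
    hidden = relu(dense(features, params["w1"], params["b1"]))
    logit = dense(hidden, params["w2"], params["b2"])[0]
    return sigmoid(logit)


# Toy inputs standing in for the outputs of steps 2, 3 and 4.
tf_vector = [1.0, 0.0]
high_order = [0.5]
low_order = [2.0]

# Hand-picked weights for illustration; a real model would learn these.
params = {
    "w1": [[1.0, 0.0, 0.0, 0.5], [0.0, 1.0, -1.0, 0.0]],
    "b1": [0.0, 0.0],
    "w2": [[1.0, 1.0]],
    "b2": [-1.0],
}

probability = predict_binding(tf_vector, high_order, low_order, params)
is_binding_site = probability >= 0.5  # hypothetical decision cutoff
```

Because the transcription factor vector is an input alongside the DNA fragment features, the same classifier can in principle be queried for different target transcription factors by swapping `tf_vector`.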

Description

Technical field

[0001] The invention relates to the technical field of bioinformatics, in particular to a transcription factor binding site prediction algorithm and device across transcription factors.

Background

[0002] A transcription factor binding site is a DNA segment that can be bound by a transcription factor. Because the interaction between transcription factors and DNA plays an important role in the regulation of gene expression, prediction of transcription factor binding sites is of great importance for understanding gene regulatory networks and fundamental cellular processes, including growth control, cell cycle progression, development, and differentiation.

[0003] Most methods in the prior art use a position weight matrix (PWM) to identify transcription factor binding sites, but the basic assumption of PWM is that the base pairs at all positions in the binding site are independently involved in the interacti...
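The independence assumption of PWM-based scoring mentioned in the background can be seen in a small example. The sketch below uses a standard log-odds formulation with a uniform background. The 3-bp motif probabilities in `PPM` and the function names are made up for illustration and are not from the source.

```python
import math

BACKGROUND = 0.25  # uniform background frequency for A, C, G, T

# Hypothetical position probability matrix for a 3-bp motif (one dict per position).
PPM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]


def pwm_score(site, ppm=PPM, background=BACKGROUND):
    """Sum of per-position log-odds; each position contributes independently."""
    if len(site) != len(ppm):
        raise ValueError("site length must match the motif width")
    return sum(math.log2(column[base] / background)
               for column, base in zip(ppm, site))


def best_window(sequence, ppm=PPM):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(ppm)
    scored = [(pwm_score(sequence[i:i + width], ppm), i)
              for i in range(len(sequence) - width + 1)]
    return max(scored)


score, start = best_window("TTAGCTT")

# Changing the first base shifts the score by the same amount
# no matter which bases occupy the other positions.
delta_1 = pwm_score("AGC") - pwm_score("TGC")
delta_2 = pwm_score("AGT") - pwm_score("TGT")
```

Since the score is a plain sum over positions, `delta_1` and `delta_2` are equal: a PWM cannot express that a base at one position matters more or less depending on its neighbours, which is the kind of dependency the convolutional models in this method are meant to capture.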

Application Information

IPC(8): G16B20/30
CPC: G16B20/30
Inventor: 徐睿峰, 周继云, 杜嘉晨, 陆勤
Owner: HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL