Hierarchical TADs differential analysis method in Hi-C contact matrix based on online machine learning

A machine learning and differential analysis technique, applied in the field of biotechnology, that addresses problems affecting the recognition rate of differential TADs.

Active Publication Date: 2019-08-06
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

[0004] Existing methods for differential analysis of TADs do not consider the hierarchical structure of TADs, which affects the recognition rate of differential TADs.

Examples

Detailed Description of Embodiments

[0039] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0040] As shown in Figure 1, the differential analysis of hierarchical TADs between two cell lines, based on TAD boundary region recognition, comprises the following steps:

[0041] Step 1. Standardize the Hi-C data to eliminate the systematic bias of the Hi-C experiment and improve comparability between datasets. The specific method is as follows:

[0042] First, multiHiCcompare, a cross-cell-line Hi-C data normalization method, is used to preprocess the Hi-C data from the different cell lines, eliminating systematic bias between cell lines as far as possible. Then the CPM (counts per million) standardization method is applied to the preprocessed Hi-C data to further improve the comparability of Hi-C data across cell lines.
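The CPM step in [0042] can be illustrated with a minimal Python sketch. It assumes the library size is the sum of all entries of the contact matrix, which this excerpt does not specify; the function name `cpm_normalize` is hypothetical. The multiHiCcompare preprocessing is an R/Bioconductor package and is not reproduced here.

```python
import numpy as np

def cpm_normalize(contact_matrix):
    """Scale a Hi-C contact matrix to counts per million (CPM).

    Assumption: the library size is the sum of all matrix entries.
    """
    counts = np.asarray(contact_matrix, dtype=float)
    total = counts.sum()
    if total == 0:
        raise ValueError("contact matrix contains no counts")
    # Each entry becomes its share of the total, expressed per million.
    return counts / total * 1e6

# Two toy matrices with the same structure but different sequencing depth.
shallow = np.array([[2, 1], [1, 0]])
deep = 10 * shallow
print(cpm_normalize(shallow))
print(np.allclose(cpm_normalize(shallow), cpm_normalize(deep)))
```

After scaling, the two toy matrices are identical, which is the sense in which CPM makes datasets of different depth comparable.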

[0043] Step 2. Calcul...

Abstract

The invention discloses a method for differential analysis of hierarchical TADs in Hi-C contact matrices based on online machine learning. The method comprises: standardizing the Hi-C data to eliminate systematic experimental bias and improve comparability between datasets; for each bin of the standardized data, calculating the mean interaction frequency between its upstream and downstream regions, denoted binSignal(i); fitting the binSignal sequence and applying a rank-sum test to obtain the boundary region points of the TADs; deriving all possible hierarchical TADs from the boundary region points and formulating a mathematical model relating the interaction frequencies in the Hi-C contact matrix to all possible hierarchical TADs; and determining the objective function of the model and, for the first time, using the online machine learning algorithm FTRL to solve the hierarchical TADs differential analysis model, thereby identifying the hierarchical TADs that vary between cell lines. The method thus provides a mathematical model relating interaction frequencies to hierarchical TADs in the Hi-C contact matrix, and uses FTRL to compute the weight coefficients of all hierarchical TADs, from which the TADs that differ between cell lines are identified.
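To give a feel for how FTRL computes weight coefficients online, here is a minimal Python sketch of the generic per-coordinate FTRL-Proximal update applied to a linear model with squared loss. The patent's actual model and objective function are not given in this excerpt, so the loss, the class name `FTRLProximal`, and the hyperparameters `alpha`, `beta`, `l1`, and `l2` are assumptions for illustration only.

```python
import math

class FTRLProximal:
    """Per-coordinate FTRL-Proximal for a linear model with squared loss.

    Assumption: squared loss and a plain linear predictor; the patent's
    objective is not shown in this excerpt.
    """

    def __init__(self, n_features, alpha=1.0, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * n_features  # accumulated adjusted gradients
        self.n = [0.0] * n_features  # accumulated squared gradients

    def weight(self, i):
        # Closed-form weight; L1 keeps it at exactly zero while |z| <= l1.
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def weights(self):
        return [self.weight(i) for i in range(len(self.z))]

    def predict(self, x):
        return sum(self.weight(i) * xi for i, xi in enumerate(x) if xi != 0.0)

    def update(self, x, y):
        # One online step on a single (features, target) example.
        error = self.predict(x) - y
        for i, xi in enumerate(x):
            if xi == 0.0:
                continue
            g = error * xi  # gradient of 0.5 * error**2 w.r.t. weight i
            w = self.weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g

# Toy usage: two indicator features with targets 2 and -1.
model = FTRLProximal(n_features=2)
for _ in range(500):
    model.update([1.0, 0.0], 2.0)
    model.update([0.0, 1.0], -1.0)
print(model.weights())
```

In a setting like the one the abstract describes, each feature could hypothetically indicate whether a bin pair falls inside a given candidate TAD, so that the learned weights play the role of per-TAD coefficients. A nonzero `l1` drives uninformative weights to exactly zero, which is one reason FTRL is often chosen when many candidate features are expected to be irrelevant.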

Description

Technical field

[0001] The invention belongs to the field of biotechnology and relates to differential analysis of hierarchical TADs across cell lines, in particular to a method for differential analysis of hierarchical TADs in Hi-C contact matrices based on online machine learning.

Background

[0002] Hi-C is a high-throughput chromatin conformation capture technology. Hi-C experiments yield interaction information between any pair of sites across the whole genome, and the data obtained from these experiments are called Hi-C data. Hi-C data generally take the form of a matrix called the contact matrix, which is a symmetric square matrix; each of its elements is called an interaction frequency. With the development of Hi-C technology, scientists found that each chromosome can be roughly divided into two compartments (A/B compartments), in which the chromatin state is active and inactive respectively, when studying Hi-...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/00, G16B25/00, G16B40/00, G16B45/00, G06F17/15, G06F17/16, G06N20/00
CPC: G06F17/15, G06F17/16, G06N20/00, G16B20/00, G16B25/00, G16B40/00, G16B45/00
Inventor: 吕红强, 刘聪毅, 韩九强
Owner: XI AN JIAOTONG UNIV