DNA methylation prediction method and system based on BERT framework

A DNA methylation prediction method in the field of bioinformatics that addresses existing methods' reliance on prior knowledge and their limited general applicability, and that improves prediction performance.

Pending Publication Date: 2021-12-03
SHANDONG UNIV
0 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0005] However, whether they use traditional machine learning or deep learning, most existing methods still rely heavily on handcrafted features as classifier input for training predictive models, and therefore on researchers' prior knowledge.
As a result, they are difficult to apply universally across species.
In addition, these methods each target only one methylation type, and some are applicable only to a specific species.


Examples

Embodiment 1

[0036] This embodiment provides a DNA methylation prediction method based on the BERT framework.

[0037] As shown in Figure 1, the DNA methylation prediction method based on the BERT framework includes:

[0038] S101: Obtain the DNA sequence to be predicted;

[0039] S102: Input the DNA sequence to be predicted into the trained deep learning model for predicting DNA methylation, obtain the predicted probability of methylation of the DNA sequence to be predicted, and obtain the final methylation prediction result according to the predicted probability;

[0040] The trained deep learning model for predicting DNA methylation is based on the model architecture of BERT (Pre-training of Deep Bidirectional Transformers for Language Understanding), a pre-trained deep bidirectional Transformer language model, and the cross-entropy loss function and the empirical weighted mutual information are combined and applied to the training process of the deep le...
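Steps S101 and S102 can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the patented implementation: the overlapping 3-mer tokenization, the 0.5 decision threshold, and the `toy_model` stand-in (which scores a sequence by the fraction of k-mers containing C) are all hypothetical choices made for illustration. A trained BERT-based model would take the place of `toy_model`.

```python
from typing import Callable, List


def kmer_tokenize(sequence: str, k: int = 3) -> List[str]:
    """Split a DNA sequence into overlapping k-mers (k=3 is an assumed setting)."""
    seq = sequence.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def toy_model(tokens: List[str]) -> float:
    """Hypothetical stand-in for the trained model; returns a value in [0, 1]."""
    if not tokens:
        return 0.0
    return sum("C" in t for t in tokens) / len(tokens)


def predict_methylation(sequence: str,
                        model: Callable[[List[str]], float] = toy_model,
                        threshold: float = 0.5) -> dict:
    """S101: accept the sequence. S102: get a probability, then threshold it."""
    tokens = kmer_tokenize(sequence)
    probability = model(tokens)
    return {"probability": probability, "methylated": probability >= threshold}


result = predict_methylation("ACGTCCGA")
```

Passing the model as a callable keeps the two steps separate: the acquisition and tokenization logic does not need to change when the scoring model is swapped.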

Embodiment 2

[0091] This embodiment provides a DNA methylation prediction system based on the BERT framework.

[0092] A DNA methylation prediction system based on the BERT framework, including:

[0093] An acquisition module configured to: acquire the DNA sequence to be predicted;

[0094] A prediction module configured to: input the DNA sequence to be predicted into the trained deep learning model for predicting DNA methylation, obtain the predicted probability of methylation of the DNA sequence to be predicted, and obtain the final methylation prediction result according to the predicted probability;

[0095] The trained deep learning model for predicting DNA methylation is based on the BERT model, and is obtained by combining the cross-entropy loss function with the empirical weighted mutual information during training of the deep learning model.
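The combination of cross-entropy and empirical weighted mutual information can be illustrated as follows. The text above does not give the formula, so this sketch assumes one common form of a transductive information maximization objective: the loss is `lam * CE - (H(marginal) - alpha * H(conditional))`. The weights `lam` and `alpha` and all function names are hypothetical.

```python
import math
from typing import List, Sequence

EPS = 1e-12  # avoids log(0)


def cross_entropy(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean negative log-likelihood of the true labels."""
    return -sum(math.log(p[y] + EPS) for p, y in zip(probs, labels)) / len(labels)


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(q * math.log(q + EPS) for q in p)


def weighted_mutual_information(probs: Sequence[Sequence[float]],
                                alpha: float = 1.0) -> float:
    """Empirical estimate: entropy of the mean prediction minus
    alpha times the mean per-sample entropy (assumed form)."""
    n = len(probs)
    num_classes = len(probs[0])
    marginal: List[float] = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    conditional = sum(entropy(p) for p in probs) / n
    return entropy(marginal) - alpha * conditional


def combined_loss(labeled_probs, labels, unlabeled_probs,
                  lam: float = 1.0, alpha: float = 1.0) -> float:
    """Cross-entropy on labeled data minus mutual information on unlabeled data."""
    return (lam * cross_entropy(labeled_probs, labels)
            - weighted_mutual_information(unlabeled_probs, alpha))
```

Under this assumed form, minimizing the loss rewards predictions that are individually confident (low conditional entropy) while staying balanced across classes overall (high marginal entropy).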

[0096] The specific network model structure of the deep learning model for predicting DNA methylation includes: sequentially connected input module, f...

Embodiment 3

[0112] This embodiment also provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; the processor is connected to the memory, and the one or more computer programs are stored in the memory. When the electronic device is running, the processor executes the one or more computer programs stored in the memory, so that the electronic device executes the method described in Embodiment 1 above.

[0113] It should be understood that in this embodiment the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, o...

Abstract

The invention discloses a DNA methylation prediction method and system based on the BERT framework. The method comprises: obtaining the DNA sequence data to be predicted; inputting the data into a trained BERT-based neural network model that employs a transductive (direct-push) information maximization loss; outputting the predicted probability of DNA methylation; and making the final prediction. The trained model first performs input processing on the original DNA sequence and extracts features with the BERT framework. A fully connected neural network then makes a prediction from the features, and DNA methylation is judged on the basis of the output probability. The transductive information maximization loss acts as a constraint during training so as to increase the confidence of the predictions. The method extracts features from the original DNA sequence automatically, so that the problems caused by a prediction tool are avoided.
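The final stage described in the abstract, a fully connected layer over extracted features followed by a probability-based decision, can be sketched as below. This is an illustrative sketch only: the feature vector, the weights, the class order (index 1 taken to mean "methylated"), and the 0.5 threshold are assumptions, and the BERT feature extractor itself is not shown.

```python
import math
from typing import List, Sequence, Tuple


def dense(features: Sequence[float],
          weights: Sequence[Sequence[float]],
          bias: Sequence[float]) -> List[float]:
    """One fully connected layer: one logit per output class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features: Sequence[float],
             weights: Sequence[Sequence[float]],
             bias: Sequence[float],
             threshold: float = 0.5) -> Tuple[List[float], bool]:
    """Return class probabilities and the methylation decision.
    Index 1 is assumed to be the 'methylated' class."""
    probs = softmax(dense(features, weights, bias))
    return probs, probs[1] >= threshold


# Hypothetical 2-dimensional feature vector and hand-picked weights.
probs, is_methylated = classify([1.0, 2.0], [[0.0, 0.0], [1.0, 1.0]], [0.0, 0.0])
```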

Description

Technical field

[0001] The present invention relates to the field of bioinformatics technology, and in particular to a method and system for predicting DNA methylation based on the BERT framework.

Background

[0002] The statements in this section merely describe background technology related to the present invention and do not necessarily constitute prior art.

[0003] DNA methylation plays an important role in epigenetic modifications that regulate transcription, thereby affecting gene expression. In addition, DNA methylation changes dynamically with environment, disease, age, and sex. Abnormal changes in DNA methylation content and patterns are therefore important factors in the development of diseases such as cancer. Currently, there are three types of DNA methylation: N6-methyladenine (6mA), 5-hydroxymethylcytosine (5hmC), and N4-methylcytosine (4mC). 4mC has different tasks in controlling DNA replication, distinguis...

Application Information

IPC(8): G16B30/00, G16B40/00
CPC: G16B30/00, G16B40/00
Inventors: 魏乐义, 郁莹莹
Owner: SHANDONG UNIV