Method for predicting splicing sites with paired two ends

A method for predicting double-ended paired splice sites. It is intended to facilitate research on such sites, make prediction more convenient, and improve prediction performance.

Pending Publication Date: 2022-05-31
GUILIN UNIV OF ELECTRONIC TECH

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to provide a double-ended paired splice site prediction method that addresses the shortcomings of existing splice site prediction approaches.

Examples

Embodiment

[0055] As shown in Figure 1, a double-ended paired splice site prediction method includes the following steps:

[0056] 1) Using the human reference genome as the source, collect splice site sequence data from the reference genome sequence file and the reference genome annotation file. Specifically, to build the human splice site data set, first download the human reference genome sequence from the NCBI database, then download the reference genome annotation file from the GENCODE database, and combine the sequence and annotation files to obtain the required information.
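The collection step can be illustrated with a minimal sketch. It assumes exon coordinates have already been parsed from the annotation file as 1-based inclusive (start, end) pairs for a single plus-strand transcript; the function names `paired_splice_sites` and `intron_dinucleotides` and the toy sequence are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def paired_splice_sites(exons: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (donor, acceptor) positions for each intron of a plus-strand transcript.

    exons: 1-based inclusive (start, end) coordinates (assumed already parsed).
    donor is the first intron base, acceptor is the last intron base.
    """
    ordered = sorted(exons)
    pairs = []
    for (_, end_a), (start_b, _) in zip(ordered, ordered[1:]):
        if start_b - end_a > 1:  # skip directly adjacent exons (no intron)
            pairs.append((end_a + 1, start_b - 1))
    return pairs


def intron_dinucleotides(seq: str, donor: int, acceptor: int) -> Tuple[str, str]:
    """Return the first two and the last two intron bases (1-based coordinates)."""
    return seq[donor - 1:donor + 1], seq[acceptor - 2:acceptor]


# Toy chromosome (hypothetical): exon 1 = 1-4, intron = 5-12, exon 2 = 13-16
toy = "ACGT" + "GTAAACAG" + "TTCC"
pairs = paired_splice_sites([(13, 16), (1, 4)])
ends = [intron_dinucleotides(toy, d, a) for d, a in pairs]
```

Each returned pair holds both ends of one intron, which is the sense in which the two sites are "paired" in this sketch. Minus-strand transcripts would additionally need reverse complementing, which is omitted here.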

[0057] The splice site sequence data includes canonical splice site sequences and non-canonical splice site sequences. As shown in Figures 2 and 3, data processing is performed on the collected splice site sequence data, covering the length of the data and the identification of intron and exon regions, and after po...
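As an illustration of the canonical versus non-canonical distinction, the sketch below assumes the commonly used definition in which a canonical intron begins with GT and ends with AG. The visible patent text does not state its exact criterion, so both this rule and the function name `splice_site_class` are assumptions.

```python
CANONICAL_ENDS = ("GT", "AG")  # assumed definition: GT at the 5' end, AG at the 3' end


def splice_site_class(intron: str) -> str:
    """Label an intron sequence by its terminal dinucleotides."""
    intron = intron.upper()
    if len(intron) < 4:
        raise ValueError("intron must contain at least four bases")
    ends = (intron[:2], intron[-2:])
    return "canonical" if ends == CANONICAL_ENDS else "non-canonical"


labels = [splice_site_class(s) for s in ("GTAAACAG", "gcaaacag", "ATAAACAC")]
```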

Abstract

The invention discloses a double-ended paired splice site prediction method. The method comprises the following steps: acquiring double-ended paired splice site sample sequences as a reference data set and an independent data set; encoding the base sequences with several feature extraction modes based on the sequence, physicochemical properties, and the like; combining the resulting features into a multi-channel, multi-dimensional vector representation; training a convolutional neural network model; and finally, evaluating the model. By combining multiple feature representations of each sample, the method helps the convolutional neural network learn the intrinsic patterns of the samples more fully and improves the accuracy of predicting double-ended paired splice sites.
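The feature-combination step can be sketched as follows. The abstract does not name the specific encodings, so this example assumes two widely used ones: one-hot encoding (sequence-based) and a three-bit nucleotide chemical property code (ring structure, functional group, hydrogen bond strength). The names `encode` and `encode_pair` and the layout of the output are hypothetical.

```python
from typing import List

# Sequence-based feature: one-hot code per base
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0), "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}
# Physicochemical feature (assumed): (purine, amino group, weak hydrogen bond)
CHEM = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}


def encode(seq: str) -> List[List[int]]:
    """Encode a sequence as one row per base with 4 + 3 = 7 feature columns.

    Unknown bases such as N are encoded as all zeros.
    """
    rows = []
    for base in seq.upper():
        one_hot = ONE_HOT.get(base, (0, 0, 0, 0))
        chem = CHEM.get(base, (0, 0, 0))
        rows.append(list(one_hot + chem))
    return rows


def encode_pair(donor_seq: str, acceptor_seq: str) -> List[List[List[int]]]:
    """Stack the two ends of a paired site as [end][position][feature]."""
    return [encode(donor_seq), encode(acceptor_seq)]


sample = encode_pair("CAGGTAAG", "TTTCAGGT")
```

The nested list has shape 2 x 8 x 7 for this sample and could be converted to an array and fed to a convolutional network; how the patent actually arranges channels is not visible in the text above.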

Description

Technical field

[0001] The invention relates to the technical field of gene splice site recognition and prediction, and in particular to a double-ended paired splice site prediction method.

Background

[0002] With the development of sequencing technology, researchers have obtained more and more raw sequencing data. However, at this stage, the splice site annotations on the reference genomes of organisms are not complete, and there are many new splice sites that have not yet been discovered. Splice sites not only mark the boundaries between exons and introns, but also play a key role in joining exons together. The sequence formed after the exons are joined is mature mRNA, and the mRNA is expressed as protein after translation and modification. If splicing occurs at the wrong position, it may lead to the erroneous expression of disease-causing proteins, resulting in the inability of the body to complete normal life activities, and may even caus...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/30, G16B40/00, G06N3/08, G06N3/04
CPC: G16B20/30, G16B40/00, G06N3/084, G06N3/045
Inventor 张艳菊许峻玮齐王璟王荣兴
Owner GUILIN UNIV OF ELECTRONIC TECH