Multi-species feature selection and unknown gene identification methods
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- TSINGHUA UNIV
- Publication Date
- 2017-02-22
Abstract
Description
Technical field
[0001] The invention relates to the field of life sciences, in particular to a method for multi-species feature selection and identification of unknown genes.
Background technique
[0002] A number of tools for predicting the probability that a transcript is protein-coding have been published, including CONC, CPC, PhyloCSF, RNAcode, PLEK, CNCI, CNCTDiscriminator, CPAT, HMMER, and lncRNA-ID (1-10), but the vast majority of these tools use only the sequence information of the transcripts. This sequence information includes, but is not limited to: open reading frame (ORF) features, such as ORF length and coverage (1,2,4,7,9); nucleotide frequency features, such as k-mer sequence patterns and codon usage (1,2,5,7-9); conservation score features, such as nucleotide sequence alignment or protein sequence alignment (1-4); evolution-related features, such as substitution rate and phylogenetic score (7,10) and in ...
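As an illustration of two of the sequence features named above (ORF length and coverage, and k-mer frequencies), the following is a minimal Python sketch. It is not the implementation used by any of the cited tools or by the invention. The function names `longest_orf`, `orf_features`, and `kmer_frequencies` are hypothetical, and the sketch assumes a forward-strand scan only, an ATG start codon, and the standard stop codons TAA, TAG, and TGA.

```python
from collections import Counter
from itertools import product

# Assumption: standard genetic code stop codons, DNA alphabet (T, not U).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG..stop ORF on the forward strand.

    `end` is exclusive and includes the stop codon. Returns (0, 0) if no
    complete ORF is found.
    """
    seq = seq.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best


def orf_features(seq):
    """Return (ORF length in nucleotides, ORF coverage of the transcript)."""
    start, end = longest_orf(seq)
    length = end - start
    coverage = length / len(seq) if seq else 0.0
    return length, coverage


def kmer_frequencies(seq, k=3):
    """Return the relative frequency of every k-mer over the ACGT alphabet.

    k-mers containing other symbols (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    valid = set(kmers)
    counts = Counter(
        seq[i:i + k] for i in range(len(seq) - k + 1) if seq[i:i + k] in valid
    )
    total = sum(counts.values())
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}
```

For a transcript sequence `s`, `orf_features(s)` gives the two ORF features and `kmer_frequencies(s, k=3)` gives a 64-entry vector. Concatenating such values into one feature vector per transcript is one common way to feed them to a classifier, though the cited tools differ in which features they combine.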