Protein family phylogenetic analysis method based on amino acid sequence alignment

A sequence alignment and protein family technology, applied in the fields of cluster analysis and biology. It addresses the excessive computation required by existing methods, giving accurate results while improving clustering speed.

Pending Publication Date: 2022-08-09
HUAZHONG AGRI UNIV

AI Technical Summary

Problems solved by technology

Compared with analysis methods based on the evolutionary distance between sequences, the maximum likelihood, Bayesian inference, and maximum parsimony methods retain more sequence information and can therefore give more accurate results. However, the maximum likelihood and Bayesian inference methods are too computationally intensive, while the maximum parsimony method is only applicable to closely related sequences and is not as widely applicable as the neighbor-joining method.



Examples


Embodiment 1

[0092] As shown in Figures 1-2, this embodiment provides a protein family phylogenetic analysis method based on amino acid sequence alignment, including:

[0093] 1. Amino acid sequence extraction for the plant HB protein family: we retrieved from the PlantTFDB database the amino acid sequences of the plant HB family full-length proteins (18147 in total) and homeodomains (HD) (15184 in total), together with their family annotation information.

[0094] 2. Obtain species annotation information for the plant HB protein superfamily genes: according to the degree of species evolution, species are divided into angiosperms (Angiospermae), conifers (Coniferophyta), lycophytes (Lycopodiophyta), mosses (Bryophyta), liverworts (Marchantiophyta), Charophyta, and Chlorophyta. At a finer level, the species information is divided into Asterids, Basal Magnoliophyta, Fabids, Malvids, Other Eudicots, Monocots, and Other plants.

[0095] 3. Obtain the annotat...
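The sequence extraction and species annotation steps could be sketched roughly as follows. This is only an illustration: the `id|species` header layout, the sample records, and the `SPECIES_TO_GROUP` lookup are hypothetical and are not taken from the patent or from the PlantTFDB file format.

```python
from collections import Counter

def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records, header, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records

# Hypothetical lookup: species -> (major group, finer group or None).
SPECIES_TO_GROUP = {
    "Arabidopsis thaliana": ("Angiospermae", "Malvids"),
    "Oryza sativa": ("Angiospermae", "Monocots"),
    "Physcomitrella patens": ("Bryophyta", None),
}

# Made-up records using an assumed "id|species" header layout.
fasta = """>HB_0001|Arabidopsis thaliana
MKRL
SSSD
>HB_0002|Oryza sativa
MGRA
>HB_0003|Physcomitrella patens
MEEL
"""

records = parse_fasta(fasta)
# Count sequences per major group using the species field of each header.
group_counts = Counter(
    SPECIES_TO_GROUP[header.split("|")[1]][0] for header in records
)
print(dict(group_counts))
```

A real run would replace the inline string with the downloaded sequence file and a complete species table.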


Abstract

The invention discloses a protein family phylogenetic analysis method based on amino acid sequence alignment, which comprises the following steps: obtaining a combined multiple sequence alignment result based on an amino acid sequence alignment fusion method; digitizing the combined multiple sequence alignment result and constructing a score matrix; performing dimension reduction and clustering on the score matrix to obtain the input sequences; identifying the specific sites and conserved sites of the input sequences; performing pseudotime analysis on the input sequences to obtain a trajectory ordering of the input sequences; and obtaining a development trajectory of the input sequences based on the trajectory ordering. By constructing the score matrix from sequence site features and applying dimension reduction analysis, the method deduces the clustering and evolutionary relationships between gene families, effectively increases the speed of sequence clustering while preserving clustering stability, and provides a new tool and method for gene phylogenetic analysis and development trajectory analysis.
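To make the digitization and conserved-site steps more concrete, here is a minimal sketch. It assumes a one-hot encoding of residues and a Shannon-entropy criterion for conservation; the excerpt does not state the patent's actual scoring scheme or thresholds, so `one_hot_encode`, `conserved_sites`, `max_entropy`, and `min_occupancy` are illustrative names and choices only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot_encode(alignment):
    """Turn aligned sequences into a numeric matrix: one row per sequence,
    20 columns per alignment site. Gaps and unknown residues stay all-zero."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = []
    for seq in alignment:
        row = [0] * (len(seq) * len(AMINO_ACIDS))
        for site, residue in enumerate(seq.upper()):
            if residue in index:
                row[site * len(AMINO_ACIDS) + index[residue]] = 1
        matrix.append(row)
    return matrix

def site_entropy(alignment, site):
    """Shannon entropy (in bits) of the residues at one site, ignoring gaps."""
    residues = [seq[site] for seq in alignment if seq[site] != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def conserved_sites(alignment, max_entropy=0.0, min_occupancy=0.5):
    """Sites that are mostly ungapped and whose entropy is at most max_entropy."""
    n_seqs = len(alignment)
    result = []
    for site in range(len(alignment[0])):
        occupied = sum(1 for seq in alignment if seq[site] != "-")
        if occupied / n_seqs >= min_occupancy and \
                site_entropy(alignment, site) <= max_entropy:
            result.append(site)
    return result

# Toy alignment (made-up sequences of equal length).
alignment = ["MKV-L", "MKI-L", "MRVAL"]
matrix = one_hot_encode(alignment)
print(len(matrix), len(matrix[0]))
print(conserved_sites(alignment))
```

The resulting matrix could then be passed to a dimension reduction routine such as PCA before clustering; that step is omitted here to keep the sketch dependency-free.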

Description

Technical field

[0001] The invention belongs to the field of cluster analysis and biotechnology, and in particular relates to a protein family phylogenetic analysis method based on amino acid sequence alignment.

Background

[0002] Phylogenetic analysis of a group of homologous protein sequences based on multiple sequence alignment and fusion methods, inferring the evolutionary relationships among these sequences, is the first step in protein function analysis. After the multiple sequence alignment results of homologous protein sequences have been obtained, there are usually two types of methods for phylogenetic analysis: methods based on sequence site features, including maximum likelihood, maximum parsimony, and Bayesian inference; and methods based on evolutionary distances between sequences, including neighbor-joining, minimum evolution, and the unweighted pair group method with arithmetic mean (UPGMA). The...
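As an illustration of the distance-based family mentioned in the background, the sketch below builds a tree by unweighted group averaging (UPGMA) from a simple p-distance. It is not the method of the invention; the p-distance choice, the function names, and the toy sequences are assumptions made for the example.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of differing residues over sites where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)

def upgma(names, seqs):
    """Build a nested-tuple tree by average-linkage (UPGMA) clustering."""
    sizes = {name: 1 for name in names}  # tree -> number of leaves
    dist = {}
    for (a, sa), (b, sb) in combinations(list(zip(names, seqs)), 2):
        dist[(a, b)] = p_distance(sa, sb)

    def get(a, b):
        # Distances are stored once per pair, in either order.
        return dist[(a, b)] if (a, b) in dist else dist[(b, a)]

    while len(sizes) > 1:
        a, b = min(dist, key=dist.get)  # closest pair of clusters
        merged = (a, b)
        others = [t for t in sizes if t not in (a, b)]
        # Size-weighted average distance from the merged cluster to each other one.
        new_d = {
            t: (sizes[a] * get(a, t) + sizes[b] * get(b, t)) / (sizes[a] + sizes[b])
            for t in others
        }
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        for t in others:
            dist[(merged, t)] = new_d[t]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))

# Toy aligned sequences (made up for illustration).
names = ["A", "B", "C", "D"]
seqs = ["MKVLA", "MKVLG", "MRILG", "QRIFG"]
tree = upgma(names, seqs)
print(tree)
```

Each merge recomputes distances for all remaining clusters, and the initial step needs every pairwise distance, which is part of why purely distance-based clustering becomes slow for families with thousands of sequences.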


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B30/00, G16B20/00, G06K9/62
CPC: G16B30/00, G16B20/00, G06F18/23
Inventors: 郑波, 张哲, 施雪萍, 朱苗苗, 谢琪
Owner: HUAZHONG AGRI UNIV