Method for digitally identifying biological sequences and deducing the genetic relationships between species
A digital identification technology for biological sequences, applied in the field of bioinformatics, which addresses the problem that organisms cannot be identified from a small amount of data.
Examples
Embodiment 1
[0048] For example, an identification code is compiled for the AIDS type 1 virus (HIV-1), ID number gi|2745742|. The virus genome sequence consists of 9290 bases. Because of the large amount of data, only part of the genome (980 bases) is shown for intuitive understanding, as in Figure 1.
[0049] Following the identification method described in the present invention, an identification code with 20 rows and 17 columns in total is compiled for the AIDS type 1 virus, as shown in Figure 2. The form of the identification code is given in Formula 4.
[0050] F A ( k 0 ) A / D k 0 ...
Embodiment 2
[0051] Example 2 Reconstruction of the well-known mammalian evolutionary tree
[0052] Statistical tools are used to screen for species-specific partial information associations: (i) take 36 mammals as sample species, and randomly select 100 sequences of length 1 kb from the genome of each sample species as sample sequences; (ii) calculate the information association and partial information association between each sample sequence and the mitochondrial genome sequence, for k ranging from 0 to 248; (iii) set up 50 different starting points k_0 and vectors with maximum dimension d = 8, and perform analysis of variance and multiple comparisons.
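The sampling in step (i) and the choice of starting points in step (iii) can be sketched as follows. This is a minimal illustration only, not the patented procedure. The function names `sample_sequences`, `draw_start_points` and `window` are hypothetical. The reading of a d-dimensional vector as d consecutive values of a per-k profile beginning at k_0 is an assumption, since this excerpt does not define how the vector is built or how the information association itself is computed.

```python
import random


def sample_sequences(genome, n_samples=100, length=1000, rng=None):
    """Draw n_samples random substrings of the given length from a genome string."""
    rng = rng or random.Random()
    if len(genome) < length:
        raise ValueError("genome is shorter than the requested sample length")
    starts = [rng.randrange(len(genome) - length + 1) for _ in range(n_samples)]
    return [genome[s:s + length] for s in starts]


def draw_start_points(k_max=248, d_max=8, n_points=50, rng=None):
    """Draw n_points distinct starting points k0 with k0 + d_max - 1 <= k_max."""
    rng = rng or random.Random()
    return sorted(rng.sample(range(k_max - d_max + 2), n_points))


def window(profile, k0, d):
    """ASSUMPTION: the d-dimensional vector is d consecutive profile values from k0."""
    return profile[k0:k0 + d]


rng = random.Random(0)
# Synthetic stand-in sequence; not real genome data.
genome = "".join(rng.choice("ACGT") for _ in range(16000))
samples = sample_sequences(genome, rng=rng)
starts = draw_start_points(rng=rng)
```

The default arguments mirror the numbers quoted in the paragraph (100 sequences of 1 kb, k from 0 to 248, 50 starting points, d up to 8). The seeded generator only makes the sketch reproducible.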
[0053] Table 1 shows the results of the above statistics. The table lists the average failure scores of the vector X for d = 2, 4, 6 and 8, where X represents the information association or the partial information association. The average failure score is the mean, over 50 random starting points k_0, of the failure score W of the corresponding d-dimensional vector X (k_0, d). Normalized t...
Embodiment 3
[0059] Example 3 Construction of a phylogenetic tree of parvoviruses (Parvoviruses)
[0060] Statistical tools are used to test the species specificity of the partial information association: (i) take all 32 viral genomes as sample genomes, and randomly select 50 sequences of length 1 kb from each sample genome as sample sequences; (ii) calculate the information association and partial information association between each sample sequence and the whole genome, for k ranging from 0 to 198; (iii) set up 50 different starting points k_0 and vectors X with maximum dimension d = 10, and perform analysis of variance and multiple comparisons.
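For readers unfamiliar with the analysis of variance mentioned in step (iii), the sketch below computes a plain one-way ANOVA F statistic in pure Python. It is an assumed, generic illustration: the excerpt does not state the ANOVA design or which multiple-comparison procedure was used. The name `one_way_anova_f` and the toy groups are hypothetical. Each group stands for the values of one vector component measured on the sample sequences of one genome.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")

    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data only; not taken from the patent.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom indicates that the component differs between genomes more than within them, which is the sense in which a quantity would be called species-specific here. The multiple-comparison step is not covered by this sketch.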
[0061] The results of the multiple comparisons are shown in Table 2, which lists the average failure scores of the vector X for d = 4, 6, 8 and 10, where X represents the information association or the partial information association. The average failure score is the mean, over 50 random starting points k_0, of the failure score W of the corresponding d-dimensional identification code ...



