Similarity analysis method for biological sequences based on negative sequential patterns, implementation system and medium

A similarity analysis technology for biological sequences, applied in the field of high-utility negative sequence rules, that addresses problems such as the lack of similarity measurement methods.
CN112182497A · Active · Publication Date: 2021-01-05 · 山东元竞信息科技有限公司

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
山东元竞信息科技有限公司
Publication Date
2021-01-05


Abstract

The invention relates to a similarity analysis method for biological sequences based on negative sequential patterns, an implementation system and a medium. The similarity analysis method comprises the following steps: (1) data preprocessing: representing the letters in a DNA sequence with numbers, dividing the data into a plurality of blocks, and taking the obtained blocks as the data set for frequent pattern mining; (2) frequent pattern mining: using an fNSP algorithm to mine the data set; (3) graphically representing the maximum frequent positive and negative sequential patterns, and converting them into a numerical sequence; (4) similarity analysis of the DNA sequences: computing the similarity between different DNA sequences, and selecting the DNA sequence corresponding to the minimum similarity as the DNA sequence to be studied. With this method, negative sequences can be effectively represented and analyzed, different analysis results can be obtained by selecting different combinations of maximum frequent patterns, and the memory and time consumption of the computer are thereby greatly reduced.
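Step (1) of the abstract, the data preprocessing, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not give the letter-to-number mapping or the block length, so the mapping `BASE_TO_NUMBER`, the default `block_size`, and all function names here are hypothetical. The later steps (fNSP mining, graphical representation, similarity computation) are not shown because the text above does not specify them.

```python
# Assumed mapping: the abstract only states that letters are
# represented with numbers, not which numbers are used.
BASE_TO_NUMBER = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode(dna):
    """Replace each nucleotide letter with its assumed numeric code."""
    return [BASE_TO_NUMBER[base] for base in dna.upper()]


def split_into_blocks(encoded, block_size):
    """Cut the encoded sequence into consecutive blocks.

    The last block may be shorter than block_size when the sequence
    length is not a multiple of it (an assumption of this sketch).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [encoded[i:i + block_size]
            for i in range(0, len(encoded), block_size)]


def preprocess(dna, block_size=4):
    """Encode a DNA string and return the list of blocks for mining."""
    return split_into_blocks(encode(dna), block_size)
```

Calling `preprocess("ACGTTGCA", 4)` yields two blocks of four numbers each; the resulting list of blocks plays the role of the data set passed to the frequent pattern mining step.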

Description

Technical field

[0001] The invention relates to a similarity analysis method for biological sequences based on negative sequential patterns, an implementation system and a medium, and belongs to the technical field of applying high-utility negative sequence rules to decision-making.

Background art

[0002] In recent years, a large amount of biological sequence data has been obtained. With the advancement of DNA and protein sequencing technology, it has become very important to interpret the information contained in biological sequence data, especially the genetic and regulatory information in DNA sequences. The demand for data analysis tools for the relationship between protein sequence structure and function is increasing, and sequence similarity analysis is widely used. Whenever a new DNA sequence is obtained, we hope to show through similarity analysis that it is similar to some known sequences. If it is homologous to known sequences, this greatly saves the effort of re-determining the function of the new...

Claims
