Metagenome fragment attribute reduction and classification method based on neighbourhood rough set

A technology combining neighborhood rough sets and metagenomics, applied to the attribute reduction and classification of metagenomic fragments. It addresses the inability to obtain genetic information through traditional genomics methods, and it improves classification accuracy or keeps it unchanged while reducing the amount of data.

Inactive Publication Date: 2017-12-01
JILIN UNIV
View PDF7 Cites 3 Cited by

AI Technical Summary

Problems solved by technology

[0002] Under current technical conditions, fewer than 1% of the microorganisms existing in nature can be cultivated; the genetic information of the remaining 99% cannot be obtained through traditional genomics methods.



Examples


Embodiment Construction

[0017] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0018] A. Download the whole genome sequences of 10 kinds of microorganisms from the US National Center for Biotechnology Information (NCBI). The downloaded microbial genome sequences are all linear sequences composed of the four nucleotides A, T, G, and C. From each microorganism's genome, 100 non-overlapping DNA fragments with a length of 1000 bp (base pairs) are randomly cut out, and the four nucleotides A, T, G, and C in each metagenomic fragment are converted into the numbers 0, 1, 2, and 3. The k-mer frequency is the number of occurrences of each run of k nucleotides in the gene sequence. Studies have shown that the k-mer frequency distributions of DNA fragments derived from the same species are similar, while those of DNA fragments derived from different species differ, so the ...
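The preprocessing in step A can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names (`cut_fragments`, `encode`, `kmer_counts`), the way non-overlapping start positions are drawn, and the use of overlapping sliding windows for counting k-mers are all assumptions made for illustration. Only the A, T, G, C to 0, 1, 2, 3 mapping and the fragment counts and lengths come from the text.

```python
import random

# Mapping taken from the text: A, T, G, C -> 0, 1, 2, 3.
NUC_TO_DIGIT = {"A": 0, "T": 1, "G": 2, "C": 3}


def cut_fragments(genome, n_fragments, length, rng):
    """Randomly cut n non-overlapping fragments of a fixed length.

    Assumed scheme: draw n sorted random gaps from the spare sequence
    length and shift the i-th start by i * length, so that consecutive
    starts are always at least `length` apart.
    Returns a list of (start, fragment) pairs.
    """
    slack = len(genome) - n_fragments * length
    if slack < 0:
        raise ValueError("genome too short for the requested fragments")
    gaps = sorted(rng.randint(0, slack) for _ in range(n_fragments))
    starts = [gap + i * length for i, gap in enumerate(gaps)]
    return [(s, genome[s:s + length]) for s in starts]


def encode(fragment):
    """Convert a nucleotide string into a list of digits 0..3."""
    return [NUC_TO_DIGIT[base] for base in fragment]


def kmer_counts(fragment, k):
    """Count occurrences of every k-mer in a fragment.

    Each k-mer is read as a base-4 number, so the result is a vector
    of length 4**k. Windows overlap (step of 1), which is an assumption;
    the fragment is assumed to contain only A, T, G, and C.
    """
    digits = encode(fragment)
    counts = [0] * (4 ** k)
    for i in range(len(digits) - k + 1):
        index = 0
        for d in digits[i:i + k]:
            index = index * 4 + d
        counts[index] += 1
    return counts
```

With the values stated in the text, a call such as `cut_fragments(genome, 100, 1000, random.Random(0))` would yield 100 fragments of 1000 bp per genome, and each fragment's `kmer_counts` vector could then serve as one row of condition attributes. Dividing each count by the number of windows would turn the counts into relative frequencies if a normalized feature is preferred.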



Abstract

The invention relates to a metagenome fragment attribute reduction and classification method based on a neighbourhood rough set. The method comprises the following steps. A: randomly obtain the whole genome sequences of a microbial flora, take the amount and the dimension of each DNA (deoxyribonucleic acid) fragment as condition attributes, and randomly select a line and a row as the corresponding identification representing each DNA fragment to serve as the decision attribute, so as to form a decision table. B: start with an empty reduction set and take the samples as the whole domain of discourse; in each round, calculate the attribute importance degree of every remaining attribute and add the attribute with the highest importance degree to the reduction set, until the importance degrees of all remaining attributes are smaller than the set lower limit. C: classify the low-dimensional metagenome DNA fragments obtained from the reduction in B. Without assembling the metagenome fragments, the reduction yields a classification accuracy that is the same as or higher than the accuracy before reduction.
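Step B describes a forward greedy selection, which can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's code: it uses the common neighbourhood rough set definitions in which a sample belongs to the positive region when every sample within distance `delta` of it shares its decision label, the dependency is the fraction of such samples, and the importance degree of an attribute is the gain in dependency from adding it. The Euclidean distance, the names `dependency` and `forward_reduction`, and the parameters `delta` and `min_significance` are hypothetical choices.

```python
import math


def dependency(X, y, attrs, delta):
    """Fraction of samples whose delta-neighbourhood is label-consistent.

    X is a list of numeric rows, y the decision labels, attrs the indices
    of the attributes in use. With no attributes the dependency is taken
    to be 0.0 (an assumption that makes the first round well defined).
    """
    if not attrs:
        return 0.0
    consistent = 0
    for i in range(len(X)):
        in_positive_region = True
        for j in range(len(X)):
            dist = math.sqrt(sum((X[i][a] - X[j][a]) ** 2 for a in attrs))
            if dist <= delta and y[j] != y[i]:
                in_positive_region = False
                break
        if in_positive_region:
            consistent += 1
    return consistent / len(X)


def forward_reduction(X, y, delta, min_significance):
    """Greedy forward attribute reduction.

    Starts from an empty reduction set and, in each round, adds the
    remaining attribute with the largest dependency gain. Stops when the
    best gain falls below min_significance.
    """
    remaining = list(range(len(X[0])))
    reduct = []
    current = 0.0
    while remaining:
        best_attr, best_gain = None, -math.inf
        for a in remaining:
            gain = dependency(X, y, reduct + [a], delta) - current
            if gain > best_gain:
                best_attr, best_gain = a, gain
        if best_gain < min_significance:
            break
        reduct.append(best_attr)
        remaining.remove(best_attr)
        current += best_gain
    return reduct
```

The columns returned by `forward_reduction` would then be the only features passed to the classifier in step C. The pairwise distance loop is quadratic in the number of samples, which is acceptable for a sketch but would likely need vectorizing for large k-mer feature tables.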

Description

Technical field

[0001] The invention belongs to the field of bioinformatics analysis technology, and specifically relates to a method for attribute reduction and classification of metagenomic fragments based on neighborhood rough sets.

Background technique

[0002] Under current technical conditions, fewer than 1% of the microorganisms existing in nature can be cultivated; the genetic information of the remaining 99% cannot be obtained through traditional genomics methods. The neighborhood rough set attribute reduction method overcomes a shortcoming of classical rough sets, which must discretize continuous data before processing it, and can therefore objectively reflect the data as it originally is. It has been effectively applied in big data analysis, knowledge dependency discovery, attribute subset selection, decision rule discovery, classification analysis and other fields, and has important theoretical research value and practical...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/18, G06F19/24
CPC: G16B20/00, G16B40/00
Inventor: 刘富, 薛健, 侯涛, 刘云, 姜守坤
Owner: JILIN UNIV