The invention relates to a gene classification method and system based on clustering and random forest algorithms, and belongs to the technical field of bioinformatics. The method comprises: a step of acquiring gene sample data, clustering the acquired data with a clustering algorithm to obtain cluster centers, and supplementing the training sample set with the resulting set of cluster centers; a step of replacing the fixed number of randomly selected descriptive attributes per decision tree used in the traditional random forest algorithm with a random value, so that, on the one hand, the strong decision trees in the decision tree set are retained and, on the other hand, the average number of randomly selected descriptive attributes across the set is reduced, thereby further reducing the correlation between the decision trees; and a step of predicting the gene data to be classified using each decision tree in the random forest model. According to the method and system, the cluster centers obtained by the clustering algorithm serve as artificial data that expand the training set of the random forest model, so that the model is fully trained, the resulting classification model has high precision, and the accuracy of gene data classification is improved.
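
The three steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation; it relies on several assumptions that the abstract does not specify: k-means is used as the clustering algorithm and is run separately within each class so that every cluster center inherits a class label; the per-tree number of attributes is drawn uniformly from 1 to the square root of the attribute count; and the individual tree predictions are combined by majority vote. The function names (`augment_with_cluster_centers`, `fit_random_attribute_forest`, `predict_majority`) are hypothetical, and the sketch uses NumPy and scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier


def augment_with_cluster_centers(X, y, n_clusters=3, seed=0):
    """Append per-class k-means centers to (X, y) as artificial samples.

    Assumption: clustering is done within each class so that each center
    can be given that class's label.
    """
    parts_X, parts_y = [X], [y]
    for label in np.unique(y):
        X_c = X[y == label]
        k = min(n_clusters, len(X_c))  # no more clusters than samples
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_c)
        parts_X.append(km.cluster_centers_)
        parts_y.append(np.full(k, label))
    return np.vstack(parts_X), np.concatenate(parts_y)


def fit_random_attribute_forest(X, y, n_trees=50, seed=0):
    """Train trees whose number of candidate attributes per split is random.

    Assumption: the count m is uniform on [1, floor(sqrt(n_features))], so
    some trees keep the traditional value while the average m is lower.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    max_m = max(1, int(np.sqrt(n_features)))
    trees = []
    for _ in range(n_trees):
        m = int(rng.integers(1, max_m + 1))               # random attribute count
        idx = rng.integers(0, n_samples, size=n_samples)  # bootstrap sample
        tree = DecisionTreeClassifier(
            max_features=m, random_state=int(rng.integers(2**31 - 1))
        )
        trees.append(tree.fit(X[idx], y[idx]))
    return trees


def predict_majority(trees, X):
    """Classify each row of X by majority vote over all trees (assumed rule)."""
    votes = np.stack([t.predict(X) for t in trees])  # shape: (n_trees, n_rows)
    result = []
    for column in votes.T:
        labels, counts = np.unique(column, return_counts=True)
        result.append(labels[np.argmax(counts)])
    return np.array(result)


if __name__ == "__main__":
    # Synthetic stand-in for gene expression data: 2 classes, 16 attributes.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (30, 16)), rng.normal(4, 1, (30, 16))])
    y = np.array([0] * 30 + [1] * 30)
    X_aug, y_aug = augment_with_cluster_centers(X, y)
    forest = fit_random_attribute_forest(X_aug, y_aug)
    print("training accuracy:", (predict_majority(forest, X) == y).mean())
```

The trees are built by hand from `DecisionTreeClassifier` because scikit-learn's `RandomForestClassifier` applies one `max_features` setting to every tree, whereas the step described above varies it from tree to tree.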