The invention discloses a method for identifying differentially expressed genes based on joint-constraint non-negative matrix factorization. The method comprises the following steps: (1) representing a cancer gene expression data set as a non-negative matrix X; (2) constructing a diagonal matrix Q and an all-ones matrix E; (3) introducing manifold learning into the classical non-negative matrix factorization method and imposing orthogonality and sparseness constraints on a coefficient matrix G, thereby obtaining a joint-constraint non-negative matrix factorization objective function; (4) solving the objective function to obtain iterative update formulas for a basis matrix F and the coefficient matrix G; (5) performing semi-supervised non-negative matrix factorization on the non-negative data set X and obtaining the basis matrix F and the coefficient matrix G after the iteration converges; (6) obtaining an evaluation vector (the formula is shown in the description) according to the basis matrix F, sorting the elements of the evaluation vector in descending order, and obtaining the differentially expressed genes; (7) testing and analyzing the identified differentially expressed genes with a Gene Ontology (GO) tool. The identification method can effectively extract the
differentially expressed genes in cancer data sets and can be applied to discovering differential features in human disease gene databases. The identification method has important clinical significance for the early diagnosis and targeted treatment of diseases.
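
The factorization and ranking steps can be illustrated with a minimal Python sketch. It is not the update scheme of the invention, whose objective function and formulas appear only in the description. As a stand-in it uses the well-known graph-regularized NMF multiplicative updates with an added L1 sparseness penalty on G, and it omits the orthogonality constraint and the matrices Q and E. The function names (`knn_graph`, `graph_sparse_nmf`, `rank_genes`), the parameters `lam`, `beta` and `p`, the binary nearest-neighbour graph, and the use of row sums of F as the evaluation vector are all assumptions made for illustration.

```python
import numpy as np


def knn_graph(X, p=3):
    """Binary, symmetric p-nearest-neighbour graph over the columns (samples) of X."""
    S = X.T                                   # samples x genes
    sq = np.sum(S ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (S @ S.T)   # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)              # a sample is not its own neighbour
    n = S.shape[0]
    nearest = np.argsort(d2, axis=1)[:, :p]   # indices of the p closest samples
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), p), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                 # symmetrize


def graph_sparse_nmf(X, k, lam=1.0, beta=0.1, n_iter=200, eps=1e-10, seed=0):
    """Approximate X (genes x samples) by F @ G.T with F, G >= 0.

    Assumed objective (a stand-in, not taken from the source):
        ||X - F G^T||_F^2 + lam * Tr(G^T (D - W) G) + beta * sum(G)
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    F = rng.random((m, k)) + eps              # basis matrix, genes x k
    G = rng.random((n, k)) + eps              # coefficient matrix, samples x k
    W = knn_graph(X, p=min(3, n - 1))         # manifold (sample similarity) graph
    D = np.diag(W.sum(axis=1))                # degree matrix of the graph
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        F *= (X @ G) / (F @ (G.T @ G) + eps)
        G *= (X.T @ F + lam * (W @ G)) / (
            G @ (F.T @ F) + lam * (D @ G) + 0.5 * beta + eps
        )
    return F, G


def rank_genes(F):
    """Assumed evaluation vector: row sums of F, sorted from large to small."""
    scores = F.sum(axis=1)
    order = np.argsort(scores)[::-1]          # gene indices, highest score first
    return order, scores


# Synthetic non-negative data: 20 genes x 8 samples (illustrative only).
X_demo = np.random.default_rng(1).random((20, 8))
F, G = graph_sparse_nmf(X_demo, k=2)
order, scores = rank_genes(F)
top_genes = order[:5]                         # candidate differentially expressed genes
```

In practice the top-ranked gene indices would be mapped back to gene identifiers and passed to a GO enrichment tool, as in the final step of the method.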