The invention provides a method for establishing a machine learning model that detects hidden links in webpages through domain identification and theme identification. The method comprises the steps of: collecting a large number of webpage source codes, each marked either as containing hidden links or as normal, and forming a training set from them; extracting, for each webpage source code, a risk text, a risk degree, a theme difference degree, a theme, a risk text vector, a risk text abnormal probability and a risk text length through suspicious domain identification, sensitive domain identification, secure domain identification, all-domain analysis and theme identification; training a model on the characteristic data of all webpage source codes in the training set using a machine learning algorithm, thereby obtaining a classification model; and importing the characteristic data of a to-be-predicted webpage source code into the classification model, thereby obtaining a result indicating whether the to-be-predicted webpage source code comprises hidden links. The method identifies highly obfuscated hidden-link code well, extracts a relatively complete set of features, and addresses the problem that traditional methods cannot accurately distinguish hidden links from page tampering.
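
The collect, extract, train and predict steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the invention: hidden content is detected only through inline `display:none` or `visibility:hidden` styles, the feature vector is reduced to three stand-in values (hidden anchor count, hidden anchors pointing to another host, and risk text length), and a nearest-centroid classifier stands in for the unspecified machine learning algorithm. The names `RiskTextExtractor`, `extract_features`, `train_centroids` and `predict` are hypothetical, and the theme, risk degree and risk text vector features are not modelled.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: inline styles that commonly hide content. Real pages may hide
# links in many other ways (CSS classes, scripts, off-screen positioning).
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class RiskTextExtractor(HTMLParser):
    """Collects text and link hosts found inside visually hidden elements.

    Simplification: assumes well-nested markup with explicit closing tags.
    """

    def __init__(self):
        super().__init__()
        self.stack = []       # one bool per open element: styled as hidden?
        self.risk_text = []   # text fragments found under a hidden element
        self.risk_hosts = []  # hosts of anchors found under a hidden element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        self.stack.append(bool(HIDDEN_STYLE.search(attrs.get("style") or "")))
        if tag == "a" and any(self.stack):
            self.risk_hosts.append(urlparse(attrs.get("href") or "").netloc)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if any(self.stack) and data.strip():
            self.risk_text.append(data.strip())


def extract_features(source, page_host):
    """Returns [hidden anchors, hidden external anchors, risk text length]."""
    parser = RiskTextExtractor()
    parser.feed(source)
    parser.close()
    external = [h for h in parser.risk_hosts if h and h != page_host]
    text = " ".join(parser.risk_text)
    return [float(len(parser.risk_hosts)), float(len(external)), float(len(text))]


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Returns the label whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


# Toy labelled training set (made-up pages, for illustration only).
HIDDEN_PAGE = ('<html><body><p>News</p><div style="display:none">'
               '<a href="http://casino.example/x">win big</a></div></body></html>')
NORMAL_PAGE = ('<html><body><p>News</p>'
               '<a href="http://site.example/about">About</a></body></html>')
SUSPECT = ('<html><body><span style="visibility: hidden">'
           '<a href="http://pills.example">cheap pills</a></span></body></html>')

model = train_centroids([
    (extract_features(HIDDEN_PAGE, "site.example"), "hidden_link"),
    (extract_features(NORMAL_PAGE, "site.example"), "normal"),
])
print(predict(model, extract_features(SUSPECT, "site.example")))
```

In practice the training set would contain many labelled pages per class, and the stand-in classifier would be replaced by whichever learning algorithm is chosen; the sketch only shows how labelled source code flows through feature extraction into training and prediction.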