The invention provides a microblog information filtering method based on multi-information fusion, belonging to the technical field of intelligent information processing. The method comprises the following steps: step 1, building a distributed crawler and crawling microblog data; step 2, preprocessing the microblog data; step 3, performing Chinese word segmentation on the microblog data, deleting stop words, and obtaining the word segmentation result and a word set VOC; step 4, extracting features from the perspective of microblog content; step 5, extracting microblog features from the perspective of the client; step 6, extracting features from the transmission path; step 7, building a classification model and screening out non-junk microblogs. The method combines two processes, duplicate removal of microblog information and a classification learning algorithm, so that both duplicate microblog information and junk microblog information are filtered out.
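
The two-stage idea (duplicate removal followed by classification) can be illustrated with a minimal Python sketch. It is not the claimed implementation. Everything named here is an assumption made for illustration: the `Post` fields, the tiny `STOP_WORDS` and `SPAM_WORDS` sets, the thresholds, and the rule in `is_junk`, which merely stands in for the trained classification model of step 7. The input text is assumed to be already segmented into space-separated words, since the abstract does not name a segmentation tool.

```python
from dataclasses import dataclass

# Hypothetical stop-word list; a real system would load a full Chinese list.
STOP_WORDS = {"the", "a", "of"}

# Hypothetical content cue words; a stand-in for learned content features.
SPAM_WORDS = {"free", "win", "click"}


@dataclass(frozen=True)
class Post:
    text: str       # assumed to be pre-segmented into space-separated words
    followers: int  # hypothetical client-side feature
    reposts: int    # hypothetical transmission-path feature


def tokens(post):
    """Lower-case the segmented text and drop stop words (step 3)."""
    return [w for w in post.text.lower().split() if w not in STOP_WORDS]


def build_voc(posts):
    """Collect the word set VOC over all posts (step 3)."""
    return sorted({w for p in posts for w in tokens(p)})


def deduplicate(posts):
    """Keep the first post for each distinct token sequence."""
    seen = set()
    unique = []
    for p in posts:
        key = tuple(tokens(p))
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique


def is_junk(post):
    """Toy rule fusing content, client, and transmission features.

    A placeholder for the classification model; thresholds are arbitrary.
    """
    toks = tokens(post)
    spam_ratio = sum(w in SPAM_WORDS for w in toks) / max(len(toks), 1)
    return spam_ratio >= 0.5 and post.followers < 10 and post.reposts == 0


def filter_posts(posts):
    """Stage 1: remove duplicates. Stage 2: screen out junk posts."""
    return [p for p in deduplicate(posts) if not is_junk(p)]


posts = [
    Post("the weather of Beijing today", followers=120, reposts=3),
    Post("The weather of Beijing today", followers=50, reposts=0),
    Post("click win free prize", followers=2, reposts=0),
]
kept = filter_posts(posts)
```

With the three sample posts, the second is dropped as a duplicate of the first and the third is dropped by the toy junk rule, so only the first post remains in `kept`. In practice, the hand-written rule would be replaced by a model trained on the fused feature vectors.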