Word distribution and document feature based automatic classification method for spam comments
An automatic spam comment classification technology based on word distribution and document features, applied in special data processing applications, instruments, electronic digital data processing, and related fields. It can solve problems such as reliance on a single feature, poor scalability, and failure to consider word distribution features and document features together.
Embodiment Construction
[0062] Figure 1 shows the overall framework of the automatic spam comment classification method based on word distribution and document features. The input of the method is a small number of labeled online comments (comments manually labeled as normal or spam, forming the labeled set) and a large number of unlabeled comments to be classified (forming the target set). The output of the method is a classification of each online comment: normal comments are marked 0 and spam comments are marked 1. The method of the present invention comprises the following four main steps: 1) collecting online comments and segmenting them into words to obtain a keyword set; 2) building a word distribution matrix, training a language model, and calculating the probability that each unlabeled comment belongs to the normal class and to the spam class; 3) extracting the document features of the online comments and training the Bayes classifier based on pro...
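Steps 1) and 2) above can be illustrated with a minimal sketch. This is not the patented implementation: the source text is truncated and does not specify the segmenter, the form of the word distribution matrix, or the language model. The sketch assumes whitespace tokenization in place of word segmentation, per-class word counts as the word distribution matrix, and a unigram language model with add-one smoothing. The function names `tokenize`, `word_distribution`, and `class_probabilities` are hypothetical.

```python
import math
from collections import Counter


def tokenize(comment):
    # Assumption: whitespace splitting stands in for the word segmentation step.
    return comment.lower().split()


def word_distribution(labeled):
    """Count keyword occurrences per class (0 = normal, 1 = spam)."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in labeled:
        counts[label].update(tokenize(text))
    return counts


def class_probabilities(comment, counts, labeled):
    """Return {0: P(normal | comment), 1: P(spam | comment)}.

    Assumption: a unigram language model per class with add-one smoothing.
    Both classes must appear in the labeled set, otherwise the prior is zero.
    """
    vocab = set(counts[0]) | set(counts[1])
    log_scores = {}
    for label in (0, 1):
        prior = sum(1 for _, y in labeled if y == label) / len(labeled)
        total = sum(counts[label].values())
        score = math.log(prior)
        for word in tokenize(comment):
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((counts[label][word] + 1) / (total + len(vocab)))
        log_scores[label] = score
    # Normalize in log space so the two probabilities sum to 1.
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp.values())
    return {k: v / norm for k, v in exp.items()}
```

Given a small labeled set, `word_distribution` is called once and `class_probabilities` is then applied to each comment in the target set. The later steps (document features and the Bayes classifier) are not sketched here because the source text is cut off before describing them.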