Similarity of email message characteristics is used to detect bulk and spam email. A determination of “sameness” for purposes of both bulk and spam classifications can use any number and type of evaluation modules. Each module can include one or more rules, tests, processes, algorithms, or other functionality. For example, one type of module may be a word count of email message text. Another module can use a weighting factor based on groups of multiple words and their perceived meanings. In general, any type of module that performs a similarity analysis can be used. A preferred embodiment of the invention uses statistical analysis, such as Bayesian analysis, to measure the performance of different modules against a known standard, such as human manual matching. Modules that are performing worse than other modules can be valued less than modules having better performance. In this manner, a high degree of reliability can be achieved. To improve performance, if a message is determined to be the same as a previous message, the previous computations and results for that previous message can be re-used. Users can be provided with options to customize or regulate bulk and spam classification and subsequent actions on how to handle the classified email messages.