Mythic Beasts

Spam

Introduction

Unsolicited commercial email (spam) is a significant problem for just about all Internet users. There are two anti-spam techniques in common use today:

Rules-based filtering is easier to set up, as it doesn't have to be taught. Unfortunately, the static nature and previous success of rules-based filters mean that spammers now actively craft their spam so that it doesn't get caught by popular rules-based anti-spam software, such as SpamAssassin.

Bayesian filtering can be extremely effective, but requires you to collect a large amount of mail that has been sorted accurately into spam and non-spam. To remain effective, it also requires that you continue to train the filter on the mail that you receive. For more information on how and why Bayesian filtering works, please see A Plan for Spam by Paul Graham.
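
As a rough illustration of the idea, the sketch below combines per-word spam probabilities into a single score using the formula given in A Plan for Spam. It is not necessarily what bfilter or SpamAssassin do internally; combine_probs is a hypothetical helper, and the input probabilities are made-up values that must lie strictly between 0 and 1.

```shell
# combine_probs: hypothetical helper, not part of any filter mentioned here.
# Takes per-word spam probabilities as arguments and prints the combined
# probability: prod(p) / (prod(p) + prod(1 - p)).
combine_probs() {
    awk 'BEGIN {
        spam = 1; ham = 1
        for (i = 1; i < ARGC; i++) {
            spam *= ARGV[i]        # product of the spam probabilities
            ham  *= 1 - ARGV[i]    # product of the complementary probabilities
        }
        printf "%.4f\n", spam / (spam + ham)
    }' "$@"
}

# Two words that each suggest spam reinforce one another.
combine_probs 0.9 0.9    # prints 0.9878
```

A filter of this kind learns the per-word probabilities from your sorted mail, which is why accurate and continued training matters.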

Our recommended approach is to use a combination of rules-based and Bayesian filtering. SpamAssassin provides both of these, but we recommend that it be used only for the rules-based filtering. SpamAssassin is not designed to work well in a multi-user environment, and this is particularly true of its Bayesian functionality.

SpamAssassin

All mail that is delivered to a Mythic Beasts shell account is scanned with SpamAssassin rules, and headers are added to indicate the results. For more information on how to filter your email based on these headers, please see our SpamAssassin page.

bfilter

Our recommended tool for Bayesian filtering is bfilter, written by Chris Lightfoot. In order to set up Bayesian filtering, you will need to have sorted a reasonable amount of mail into spam and non-spam folders. A good way to achieve this is to run only the SpamAssassin rules-based filtering for a while, and religiously move missed spam from your inbox to your spam folder (and vice versa, if you get any false positives). Let's assume that these two folders are mail/spam and mail/ham. To train bfilter, run the following commands:


bfilter isreal < mail/ham
bfilter isspam < mail/spam
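
Before training, it can help to check how much mail each folder actually holds. The sketch below assumes that the folders are single mbox files (as the redirections above suggest), in which every message starts with a "From " line. count_messages is a hypothetical helper and not part of bfilter.

```shell
# count_messages: hypothetical helper, not part of bfilter.
# Counts the messages in an mbox file by counting its "From " separator lines.
count_messages() {
    grep -c '^From ' "$1"
}

# Report the size of each training folder that exists.
for folder in mail/ham mail/spam; do
    if [ -f "$folder" ]; then
        echo "$folder: $(count_messages "$folder") messages"
    fi
done
```

If one folder is much smaller than the other, it may be worth collecting more mail before training.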

You must now arrange for your mail to be run through bfilter, and then filter your mail based on the result. You should do this by extending the procmail configuration set up for SpamAssassin. Adjust your .procmailrc file to contain the following:


# Pass mail through bfilter
:0 fw
| /software/bin/bfilter test

# Check for the bfilter verdict
:0:
* ^X-Spam-Probability: YES
mail/spam

# Check for the SpamAssassin verdict
:0:
* ^X-Spam-Status: Yes
mail/spam
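
To see how the two header checks behave without delivering any mail, the following sketch applies the same tests to a message read from standard input. It only approximates the recipes: spam_verdict is a hypothetical helper, and the sample message is invented for illustration. Like procmail's default matching, it looks only at the headers and ignores case.

```shell
# spam_verdict: hypothetical helper that mimics the two recipes above.
# Reads one message on stdin and prints "spam" if either verdict header
# is present in the header block, otherwise "ham".
spam_verdict() {
    # Keep only the headers (everything up to the first blank line),
    # then test for either header, ignoring case.
    if sed -n '1,/^$/p' | grep -Ei '^X-Spam-(Probability: YES|Status: Yes)' >/dev/null; then
        echo spam
    else
        echo ham
    fi
}

# Example: a message that bfilter has marked as spam.
printf 'Subject: test\nX-Spam-Probability: YES\n\nbody text\n' | spam_verdict
```

A matching line in the message body does not count, because only the header block is examined.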

This will save any mail that either filter deems to be spam into the mail/spam folder.

Copyright © 2000-2007 Mythic Beasts Ltd. All Rights Reserved.