Before SpamAssassin's Bayes classifier can tell the difference between spam and non-spam (ham), you need to train it. The more mail you train Bayes on, the better it classifies.

First, Bayes needs to have learned at least 200 hams and 200 spams before it can be turned on. To see the current totals, run the following command:


# sa-learn --dump magic

And then you should get something like this:


0.000 0 5752 0 non-token data: nspam
0.000 0 1702 0 non-token data: nham

As you can see from the above example, I have 5752 spams and 1702 hams.
The spam and ham totals must be at least 200 each.

The nspam total is the number of spams Bayes has learned.
The nham total is the number of hams Bayes has learned.
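
The check above can be scripted. The following is a minimal sketch, assuming the dump output has the layout shown above (the count in the third column and the nspam/nham label as the last word on the line). The function name bayes_ready is made up for illustration and is not part of SpamAssassin.

```shell
#!/bin/sh
# bayes_ready: read `sa-learn --dump magic` output on stdin and report
# whether both nspam and nham have reached the minimum (200 by default).
# Hypothetical helper; the optional first argument overrides the minimum.
bayes_ready() {
    awk -v min="${1:-200}" '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            # ok is 1 only when both totals meet the minimum
            ok = (nspam + 0 >= min + 0 && nham + 0 >= min + 0)
            exit !ok
        }'
}

# Usage (run as the user that owns the Bayes database):
#   sa-learn --dump magic | bayes_ready && echo "Bayes has enough data"
```

The function exits with status 0 when both totals are high enough and 1 otherwise, so it can be used directly in an if or && chain.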

Here is how to train SpamAssassin on hams and spams.

There are a few ways to feed spams and hams to sa-learn. The easiest way is
to run the command right from the console. Let's say that you have a spam
folder at ~vpopmail/domains/domain.ext/user/Maildir/.Spam. Run the sa-learn
command like so, replacing domain.ext with your domain and user with the actual user on your system:


# sa-learn --spam ~vpopmail/domains/domain.ext/user/Maildir/.Spam/new

To learn hams in ~vpopmail/domains/domain.ext/user/Maildir/new, run:


# sa-learn --ham ~vpopmail/domains/domain.ext/user/Maildir/new

You'll get output similar to the following in either case. The actual message counts will vary.


Learned from 30 message(s) (30 message(s) examined).

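This tells you that out of the 30 messages examined in the folder, 30 were learned. The first number can be lower than the second, because sa-learn skips messages it has already learned. If you learned the messages as spam, sa-learn --dump magic will now show an nspam total that is 30 higher; if you learned them as ham, nham goes up instead.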
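
The two commands can be wrapped into one step per mailbox. This is only a sketch under a few assumptions: spam sits in the .Spam/new subfolder and ham in new, as in the examples above, and everything in new really is ham. The function name train_maildir and the SA_LEARN variable are hypothetical, added here so the commands can be previewed before anything is learned.

```shell
#!/bin/sh
# SA_LEARN is the program to run; set SA_LEARN=echo to preview the
# commands without touching the Bayes database (hypothetical variable).
SA_LEARN="${SA_LEARN:-sa-learn}"

# train_maildir: learn one Maildir's spam folder as spam, then its
# inbox as ham. Stops if the spam run fails.
train_maildir() {
    maildir=$1
    "$SA_LEARN" --spam "$maildir/.Spam/new" &&
    "$SA_LEARN" --ham  "$maildir/new"
}

# Usage:
#   train_maildir ~vpopmail/domains/domain.ext/user/Maildir
```

Running it once with SA_LEARN=echo prints the two sa-learn command lines, which is a cheap way to confirm the paths before training for real.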

Once Bayes has learned at least 200 hams and 200 spams, you can enable Bayes and autolearning. Add or modify the following lines in your local.cf:

# The bayes_path line below needs to point into the home folder of the user that SpamAssassin runs as. In this case that user is qscand, whose home folder is /tmp.


bayes_path /tmp/.spamassassin/bayes
use_bayes 1
bayes_auto_learn 1
bayes_file_mode 0770

The first line tells Bayes where to store its database. The second line enables Bayes. The third line enables autolearning, and the last line forces mode 0770 on the Bayes database files for security reasons.

Restart spamd, and within a day or so you will see autolearn appear in your message headers. I am not sure why it takes so long to show up in the headers. It just does for some reason.
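
To see whether autolearn has started showing up, you can tally the autolearn values in a folder of delivered mail. A rough sketch, assuming the messages carry an autolearn=... field in their headers (as in SpamAssassin's usual X-Spam-Status line) and that your grep supports the -h and -o options. The function name autolearn_tally is made up for this example, and the match is not limited to headers, so body text containing autolearn= would be counted too.

```shell
#!/bin/sh
# autolearn_tally: count each autolearn=... value found in the files
# of one directory. Hypothetical helper, not part of SpamAssassin.
autolearn_tally() {
    # -h: suppress filenames, -o: print only the matching text
    grep -ho 'autolearn=[a-z_]*' "$1"/* 2>/dev/null |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage:
#   autolearn_tally ~vpopmail/domains/domain.ext/user/Maildir/new
```

Each output line is a value followed by how many times it was seen, for example a line for autolearn=ham and another for autolearn=no.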