Koozali.org: home of the SME Server

How to tweak SpamAssassin ?

Andy Berry

How to tweak SpamAssassin ?
« on: January 17, 2004, 01:44:06 PM »
Hello Everyone.
I'm running SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on 5.6, and it works great! It keeps my spam down to almost nothing (Thunderbird catches most of the rest on the desktop side).
My problem is that it's too good: I get some false positives, which means that sometimes I don't get legit mail and need to sort through the junkmail periodically and add good addresses to the whitelist.
I have SA set to Very High (hits required 2.0).  When I reduce it to High (hits required = 5.0), I get a lot of spam slipping through.

Is there a way to configure SA so it sits somewhere between these 2? Like, maybe hits req'd = 3.0?

Thanks!
-A

Jens Kruuse

Re: How to tweak SpamAssassin ?
« Reply #1 on: January 18, 2004, 12:24:50 PM »
My solution to the same problem was to let the Bayesian filter do its job. Save your spam in a folder (instead of deleting it), and when you have several hundred messages, teach SpamAssassin how to deal with them.

From the console you can type:
sa-learn --ham /home/e-smith/files/users/[your-user-name]/Maildir/cur/*
sa-learn --spam /home/e-smith/files/users/[your-user-name]/Mail/junkmail/cur/*
If you save real mail in other folders, you could also teach SA about those emails.  If you make a mistake with a folder or a few mails just rescan them with the right rule and they will be recategorized.

Once the Bayesian filter has something to work with, it works like magic. I receive lots of mail which is only sorted as junkmail due to the 5.4 points from the filter. Excellent. :-)
I have also made a small cronjob which scans a number of folders daily but that's just because I'm lazy.
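
A daily job along those lines could look roughly like the sketch below. It is not Jens's actual script. The variable names, the learn_folder helper and the skip-if-empty check are assumptions made for illustration, and the two folder paths are copied as-is from the commands above.

```shell
#!/bin/sh
# Sketch of a daily Bayes training job. All names here are illustrative.
USER_NAME="${USER_NAME:-your-user-name}"
MAIL_ROOT="${MAIL_ROOT:-/home/e-smith/files/users/$USER_NAME}"
SA_LEARN="${SA_LEARN:-sa-learn}"

# learn_folder KIND DIR: feed every message in DIR to sa-learn as KIND (ham or spam).
learn_folder() {
    kind="$1"
    dir="$2"
    # Skip missing or empty folders so an unexpanded glob never reaches sa-learn.
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        "$SA_LEARN" "--$kind" "$dir"/*
    else
        echo "skipping $dir (missing or empty)"
    fi
}

learn_folder ham  "$MAIL_ROOT/Maildir/cur"
learn_folder spam "$MAIL_ROOT/Mail/junkmail/cur"
```

Saved as an executable file, it could be run once a day from cron. A very large folder may exceed the shell's argument-length limit, in which case the messages would need to be fed in batches.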

/Jens

Andy Berry

Re: How to tweak SpamAssassin ?
« Reply #2 on: January 18, 2004, 01:23:52 PM »
Well, well...
Thanks, Jens.
I'll give that a shot.  
-A

patrickthickey

training Spam Assassin
« Reply #3 on: January 20, 2004, 05:42:54 PM »
In reading the docs at spamassassin.org, it appears a double-edged sword is associated with use of the Bayes filtering. In fact, the safest configuration is to leave the self-learning "off" and train the filter manually.

If used properly it can be powerful. If used improperly it can be atrocious.

The documents are pretty clear that the Bayes filter needs to "learn" from ideally 1000 or so spam messages and then, separately, from 1000 or so non-spam messages. This implies more or less equal numbers are needed for it to behave well when using the manual "training" method.

Thus I would be cautious about training the system with only a few hundred messages, according to the docs I read.
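
One way to check this before training is to count the messages in each set first. The sketch below is only an illustration: HAM_DIR, SPAM_DIR, MIN_EACH and the count_msgs helper are hypothetical names, and the 1000 threshold is simply the figure quoted above.

```shell
#!/bin/sh
# Hypothetical locations; point these at the folders you plan to train from.
HAM_DIR="${HAM_DIR:-./ham-samples}"
SPAM_DIR="${SPAM_DIR:-./spam-samples}"
MIN_EACH="${MIN_EACH:-1000}"

# count_msgs DIR: print the number of regular files under DIR (0 if DIR is missing).
count_msgs() {
    if [ -d "$1" ]; then
        find "$1" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

ham=$(count_msgs "$HAM_DIR")
spam=$(count_msgs "$SPAM_DIR")
echo "ham: $ham  spam: $spam"
if [ "$ham" -lt "$MIN_EACH" ] || [ "$spam" -lt "$MIN_EACH" ]; then
    echo "fewer than $MIN_EACH messages in at least one set; training may be premature"
fi
```

This assumes one message per file, as in a Maildir folder. It would not give a meaningful count for an mbox file.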

Your mileage may vary.

regards,

patrick

Jesper Knudsen

Did you install all the sub-modules (DCC, Pyzor, Razor2)?
« Reply #4 on: January 22, 2004, 11:08:16 AM »
Hi,

In order to get much better detection of Spam you should make sure to have DCC, Pyzor and Razor2 modules installed as well.
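
A quick way to see whether the command-line clients for these are on the PATH is sketched below. The check_tool helper is a hypothetical name. The binaries looked for (dccproc, pyzor, razor-check) are the usual client programs, and finding them is only a rough hint, since it does not show whether SpamAssassin is configured to call them.

```shell
#!/bin/sh
# check_tool NAME: report whether NAME can be found in PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found in PATH"
    fi
}

for tool in dccproc pyzor razor-check; do
    check_tool "$tool"
done
```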

I have made a script that installs SA and the modules for both 5.6 and 6.0 and this can be found at:

http://sme.swerts-knudsen.dk

andyberry

How to tweak SpamAssassin ?
« Reply #5 on: January 31, 2004, 01:55:39 AM »
Hey thanks Jesper.
I've been using your script for a while now, and it's cut my spam haul down dramatically.  Wonderful, useful thing, it is.

The ham / spam learning is tweaking it just that little bit more.

-A

bobk

Re: How to tweak SpamAssassin ?
« Reply #6 on: January 31, 2004, 03:34:47 PM »
Quote from: "Andy Berry"
Hello ...
Is there a way to configure SA so it sits somewhere between these 2? Like, maybe hits req'd = 3.0?


Global settings are located at:
/etc/e-smith/templates-custom/etc/mail/spamassassin/local.cf

User settings are located at:
/home/e-smith/files/users/$username/.spamassassin/user_prefs
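
For the in-between threshold asked about in the quote, the SpamAssassin 2.x setting is required_hits (renamed required_score in 3.x). The sketch below sets it to 3.0 in a preferences file. PREFS is a hypothetical variable that defaults to a scratch file in the current directory, so point it at the real per-user file only once you are happy with the result. If the target file is generated from a template, a direct edit may be overwritten the next time it is regenerated.

```shell
#!/bin/sh
# Hypothetical target file; defaults to a scratch copy rather than a live config.
PREFS="${PREFS:-./user_prefs.example}"
THRESHOLD="${THRESHOLD:-3.0}"

# Drop any existing required_hits line, then append the new value.
touch "$PREFS"
grep -v '^required_hits[[:space:]]' "$PREFS" > "$PREFS.tmp" || true
echo "required_hits $THRESHOLD" >> "$PREFS.tmp"
mv "$PREFS.tmp" "$PREFS"

cat "$PREFS"
```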

gbentley

How to tweak SpamAssassin ?
« Reply #7 on: February 02, 2004, 01:37:54 AM »
Can anyone help with the issue at the end of this post?

Thanks :)

http://forums.contribs.org/index.php?topic=19166.0
"If you don't know what you want, you end up with a lot you don't."

neill

Training SA
« Reply #8 on: February 05, 2004, 04:18:27 PM »
So I have the latest version of Jesper's install of SA (on 5.6).

I copied my mozilla-mail inbox and junk folders to the server in an attempt to "train" the bayesian filter.

Running sa-learn --ham Inbox ran for a while, then reported:
"interrupted at /usr/lib/perl5/site_perl/5.6.1/Mail/Spamassassin/CmdLearn.pm line 250, <INPUT> line 4070164"

Any clues to what's wrong?
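
One thing that might be worth checking, offered as a guess rather than a diagnosis: Mozilla mail folders are normally single mbox files, and sa-learn documents an --mbox switch for that format. The sketch below only tests whether a file looks like an mbox and prints the command one might then try. The is_mbox helper and the FOLDER variable are hypothetical names.

```shell
#!/bin/sh
# is_mbox FILE: succeed if FILE's first line looks like an mbox "From " separator.
is_mbox() {
    [ -f "$1" ] && head -n 1 "$1" | grep -q '^From '
}

FOLDER="${FOLDER:-Inbox}"
if is_mbox "$FOLDER"; then
    # Printed rather than executed, so nothing is learned by accident.
    echo "sa-learn --ham --mbox $FOLDER"
else
    echo "$FOLDER does not look like an mbox file (or is missing)"
fi
```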

Also, has anyone tried to set this up?
http://jousset.org/pub/sa-postfix.en.html ?
It sets up aliases in Postfix so that users can simply forward spam and ham from their mail client, and SA automagically trains the Bayes filter. It requires AMaVIS, so could it be used in conjunction with ClamAV?

Thanks to all!