Koozali.org: home of the SME Server
Legacy Forums => General Discussion (Legacy) => Topic started by: Andy Berry on January 17, 2004, 01:44:06 PM
-
Hello Everyone.
I'm running SpamAssassin on 5.6 (SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp)), and it works great! Keeps my spam down to almost nothing (Thunderbird catches most of the rest on the desktop side).
My problem is that it's too good... I get some false positives, which means that sometimes I don't get legit mail and need to sort through the junkmail periodically and add good addresses to the whitelist.
I have SA set to Very High (hits required 2.0). When I reduce it to High (hits required = 5.0), I get a lot of spam slipping through.
Is there a way to configure SA so it sits somewhere between these 2? Like, maybe hits req'd = 3.0?
Thanks!
-A
-
My solution to the same problem was to let the Bayesian filter do its job. Save your spam in a folder (instead of deleting it) and when you have several hundred messages, teach SpamAssassin how to deal with them.
From the console you can type:
sa-learn --ham /home/e-smith/files/users/[your-user-name]/Maildir/cur/*
sa-learn --spam /home/e-smith/files/users/[your-user-name]/Mail/junkmail/cur/*
If you save real mail in other folders, you could also teach SA about those emails. If you make a mistake with a folder or a few mails, just rescan them with the right option (--ham or --spam) and they will be recategorized.
Once the Bayesian filter has something to work with, it works like magic. I receive lots of mail which is only sorted as junkmail due to the 5.4 points from the filter. Excellent. :-)
I have also made a small cronjob which scans a number of folders daily but that's just because I'm lazy.
/Jens
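A daily cron job like the one Jens mentions might look roughly like the sketch below. This is not his script, which was not posted. The function name train_folder, the SA_LEARN override and the suggested file location are all assumptions made for illustration; the folder paths in the comments are the ones from his post.

```shell
#!/bin/sh
# Hypothetical daily training script, e.g. saved under /etc/cron.daily/.
# SA_LEARN may be overridden, for example with "echo" for a dry run.
SA_LEARN=${SA_LEARN:-sa-learn}

# train_folder KIND DIR
# Feed every message file under DIR to sa-learn as KIND ("ham" or "spam").
train_folder() {
    kind=$1
    dir=$2
    case $kind in
        ham|spam) ;;
        *) echo "train_folder: kind must be ham or spam" >&2; return 2 ;;
    esac
    # Silently skip folders that do not exist.
    [ -d "$dir" ] || return 0
    # find batches the file names, so large folders do not overflow the
    # command line, and empty folders never invoke sa-learn at all.
    find "$dir" -type f -exec "$SA_LEARN" "--$kind" {} +
}

# Example calls, using the folders from the post above:
#   USER_HOME="/home/e-smith/files/users/[your-user-name]"
#   train_folder ham  "$USER_HOME/Maildir/cur"
#   train_folder spam "$USER_HOME/Mail/junkmail/cur"
```

Setting SA_LEARN=echo first is a cheap way to see which files would be passed along before letting it touch the Bayes database.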
-
Well, well...
Thanks, Jens.
I'll give that a shot.
-A
-
In reading the docs at spamassassin.org, it appears the Bayes filtering is a double-edged sword. In fact, the safest configuration is to leave the self-learning "off" and train the filter manually.
If used properly it can be powerful. If used improperly it can be atrocious.
The docs are pretty clear that the Bayes filter needs to "learn" from ideally 1000 or so spam messages and then, separately, from 1000 or so non-spam messages. This implies more or less equal numbers of each for the thing to behave when using the manual "training" method.
Thus I would be cautious about training the system with only a few hundred, according to the docs I read.
Your mileage may vary.
regards,
patrick
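To make patrick's point concrete, here is a small sketch that counts the messages in a ham folder and a spam folder before any training is done. The function names count_messages and corpus_ready are made up for this example; the default threshold of 1000 is the figure quoted from the docs above, not a hard limit.

```shell
#!/bin/sh
# count_messages DIR
# Print the number of regular files under DIR (0 if DIR does not exist).
count_messages() {
    if [ -d "$1" ]; then
        find "$1" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# corpus_ready HAM_DIR SPAM_DIR [MINIMUM]
# Print both counts; succeed only if each folder holds at least MINIMUM
# messages (default 1000, the rough figure mentioned in the post above).
corpus_ready() {
    min=${3:-1000}
    ham=$(count_messages "$1")
    spam=$(count_messages "$2")
    echo "ham=$ham spam=$spam (want at least $min of each)"
    [ "$ham" -ge "$min" ] && [ "$spam" -ge "$min" ]
}

# Example call:
#   corpus_ready /path/to/ham/cur /path/to/spam/cur && echo "ready to train"
```

It only checks for a minimum of each kind; whether the two counts are close enough to "more or less equal" is left to the reader's judgement.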
-
Hi,
In order to get much better detection of Spam you should make sure to have DCC, Pyzor and Razor2 modules installed as well.
I have made a script that installs SA and the modules for both 5.6 and 6.0 and this can be found at:
http://sme.swerts-knudsen.dk
-
Hey thanks Jesper.
I've been using your script for a while now, and it's cut my spam haul down dramatically. Wonderful, useful thing, it is.
The ham / spam learning is tweaking it just that little bit more.
-A
-
Hello ...
Is there a way to configure SA so it sits somewhere between these 2? Like, maybe hits req'd = 3.0?
Global settings are located at:
/etc/e-smith/templates-custom/etc/mail/spamassassin/local.cf
User settings are located at:
/home/e-smith/files/users/$username/.spamassassin/user_prefs
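For the original question (a threshold of 3.0), the SpamAssassin 2.x setting is required_hits. A minimal sketch for a per-user file follows; the helper name set_required_hits is invented here, and it assumes spamd on your box is actually set up to read per-user preferences. SpamAssassin's usual name for that file is user_prefs, inside the user's .spamassassin directory.

```shell
#!/bin/sh
# set_required_hits FILE SCORE
# Make FILE contain exactly one "required_hits SCORE" line, replacing any
# earlier required_hits setting and leaving all other lines untouched.
set_required_hits() {
    prefs_file=$1
    score=$2
    mkdir -p "$(dirname "$prefs_file")"
    touch "$prefs_file"
    # Copy everything except an existing required_hits line ...
    grep -v '^[[:space:]]*required_hits[[:space:]]' "$prefs_file" > "$prefs_file.tmp"
    # ... then append the new threshold and swap the file into place.
    echo "required_hits $score" >> "$prefs_file.tmp"
    mv "$prefs_file.tmp" "$prefs_file"
}

# Example call (the path is an assumption, adjust it for your user):
#   set_required_hits "$HOME/.spamassassin/user_prefs" 3.0
```

The server-wide file is generated from templates on SME Server, so a hand edit to the generated file may not survive the next regeneration; the per-user file avoids that question.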
-
Can anyone help with the issue at the end of this post?
Thanks :)
http://forums.contribs.org/index.php?topic=19166.0
-
So I have the latest version of Jesper's install of SA (on 5.6).
I copied my mozilla-mail inbox and junk folders to the server in an attempt to "train" the Bayesian filter.
sa-learn --ham Inbox ran for a while, then reported:
"interrupted at /usr/lib/perl5/site_perl/5.6.1/Mail/Spamassassin/CmdLearn.pm line 250, <INPUT> line 4070164"
Any clues to what's wrong?
Also, has anyone tried to set this up?
http://jousset.org/pub/sa-postfix.en.html ?
It sets up aliases in Postfix so that users can simply forward Spam and Ham from their mail client and SA automagically trains the Bayes filter. Requires AMaVIS, so could be used in conjunction with ClamAV?
Thanks to all!
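On the sa-learn question above, one thing worth ruling out (a guess, not a confirmed diagnosis of that error): Mozilla stores each mail folder as a single mbox file, and sa-learn treats a plain file as one message unless it is given the --mbox switch. The sketch below picks the switch by looking at the first line of the file. The helper names is_mbox and learn_file are made up for this example.

```shell
#!/bin/sh
# SA_LEARN may be overridden, for example with "echo" for a dry run.
SA_LEARN=${SA_LEARN:-sa-learn}

# is_mbox FILE
# Succeed if FILE's first line starts with "From ", the mbox separator.
is_mbox() {
    [ -f "$1" ] && head -n 1 "$1" | grep -q '^From '
}

# learn_file KIND FILE
# Train on FILE as KIND ("ham" or "spam"), adding --mbox when FILE looks
# like an mbox folder rather than a single message.
learn_file() {
    if is_mbox "$2"; then
        "$SA_LEARN" --mbox "--$1" "$2"
    else
        "$SA_LEARN" "--$1" "$2"
    fi
}

# Example calls, using the folder names from the post above:
#   learn_file ham  Inbox
#   learn_file spam Junk
```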