Koozali.org: home of the SME Server

Spam filter training

Offline Troels

Spam filter training
« on: November 26, 2009, 10:58:33 AM »
Hello Everyone,

Is this still the preferred way to implement user spam filter training via IMAP?

http://wiki.contribs.org/Learn

Regards Troels

Offline mmccarn

Re: Spam filter training
« Reply #1 on: November 26, 2009, 04:11:11 PM »
I've always set up spam learning manually as described at http://bugs.contribs.org/show_bug.cgi?id=1701#c36

The 'Learn' contrib is supposed to automate and simplify much of the setup required, but I've never used it.

Offline kevinb

Re: Spam filter training
« Reply #2 on: November 26, 2009, 05:22:33 PM »
I have a script that I run after the nightly backup.

It is designed to work with Thunderbird using a "junkmail" folder rather than Thunderbird's default "Junk" folder, so that users do not have to look in two places for false positives. This will, however, force some junk emails to be "re-sent" to learn-as-spam. If you would prefer to use the TB default "Junk" folder or Outlook's default folder, just change the folder name in the script.

You must replace <hostname> with the host name of your server (not the FQDN, just the hostname).

This script will also teach "Ham" from the user's "Saved" folder and sub-folders. Users are instructed to move all saved emails to the "Saved" folder structure. This also helps email client performance, since the user should not collect tens of thousands of emails in the Inbox.

And it will delete mail older than 30 days from the "Trash" folder and send the log to the admin.



---

#!/bin/bash
# Nightly SpamAssassin training from each user's Maildir; the log is mailed to admin.
# Replace <hostname> with the short host name of your server (not the FQDN).

(
date
echo ''
echo ''
for userdir in $(ls -A1 /home/e-smith/files/users)
do
        maildir="/home/e-smith/files/users/$userdir/Maildir"
        echo "$userdir"

        echo "Find email in the junkmail folder that is less than 24 hours old and feed it to sa-learn"
        if test -d "$maildir/.junkmail"; then
                find "$maildir/.junkmail" -iname '*.<hostname>*' -type f -ctime 0 -exec sa-learn --spam --no-sync '{}' \;
        else
                echo "    No junkmail folder"
        fi

        echo "Find email in the Saved folder and sub-folders that is less than 24 hours old and feed it to sa-learn"
        if test -d "$maildir/.Saved"; then
                find "$maildir" -path '*/.Saved*cur*.<hostname>*' -type f -ctime 0 -exec sa-learn --ham --no-sync '{}' \;
        else
                echo "    No Saved mail folders"
        fi

        echo "Find email in the Trash folder that is more than 30 days old and delete it"
        if test -d "$maildir/.Trash"; then
                find "$maildir/.Trash" -type f -ctime +30 -exec rm -vf '{}' \;
        else
                echo "    No Trash folder"
        fi
        echo
done

# Messages were learned with --no-sync, so sync the Bayes database once at the end
sa-learn --sync
date
) 1>/var/log/teach_spam.log 2>&1

sleep 10
mail -s "Learn SPAM" admin </var/log/teach_spam.log


---
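Before letting the script train on real mail, it may help to do a dry run that only lists the files that would be handed to sa-learn. The sketch below is not part of the script above: `list_recent_mail` is a made-up helper name, and it assumes the Maildir file names contain the short host name, as the `<hostname>` pattern in the script implies.

```shell
#!/bin/bash
# Dry-run sketch: print the messages a training pass would pick up,
# without calling sa-learn. list_recent_mail is a hypothetical helper.

# $1 = Maildir folder (for example .../Maildir/.junkmail)
# $2 = short host name that appears in the Maildir file names
list_recent_mail() {
        local folder="$1" host="$2"
        # No such folder: report on stderr and return non-zero
        [ -d "$folder" ] || { echo "    No folder: $folder" >&2; return 1; }
        # Same tests as the training script: regular files whose name
        # contains ".<host>" and whose status changed in the last 24 hours
        find "$folder" -type f -iname "*.${host}*" -ctime 0 -print
}

# Example call, assuming the short name printed by "hostname -s" is the
# one used in the file names:
# list_recent_mail /home/e-smith/files/users/someuser/Maildir/.junkmail "$(hostname -s)"
```

If the list looks right, the same folder and host name can go into the real script.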
« Last Edit: November 26, 2009, 10:31:41 PM by kevinb »

Offline Troels

Re: Spam filter training
« Reply #3 on: November 26, 2009, 10:13:40 PM »
Thank you guys for your quick response. I have it up and running now; only time will tell how well it works.

Here is what I did to make it work:


First of all enable spamassassin:
 
#config setprop spamassassin status enabled

#config setprop spamassassin UseBayes 1
#config setprop spamassassin BayesAutoLearnThresholdSpam 12.00
#config setprop spamassassin BayesAutoLearnThresholdNonspam 0.10
#expand-template /etc/mail/spamassassin/local.cf
#sa-learn --sync --dbpath /var/spool/spamd/.spamassassin -u spamd
#chown spamd.spamd /var/spool/spamd/.spamassassin/bayes_*
#chmod 750 /var/spool/spamd/.spamassassin/bayes_*
#chown spamd.spamd /var/spool/spamd/.spamassassin/bayes.mutex
#config setprop spamassassin RejectLevel 12
#config setprop spamassassin TagLevel 5
#config setprop spamassassin Sensitivity custom

I like this manual approach because one is not dependent on a contrib being available.

#signal-event post-upgrade && signal-event reboot



Then I downloaded the Perl scripts and cron files:

curl -o /usr/bin/LearnAsSpam.pl 'http://bugs.contribs.org/attachment.cgi?id=1227'
curl -o /usr/bin/LearnAsHam.pl 'http://bugs.contribs.org/attachment.cgi?id=1229'
curl -o /etc/cron.d/LearnAsSpam.cron 'http://bugs.contribs.org/attachment.cgi?id=1231'
curl -o /etc/cron.d/LearnAsHam.cron 'http://bugs.contribs.org/attachment.cgi?id=1232'

I created the folders LearnAsHam and LearnAsSpam in the user's IMAP account and copied some mails to them.
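With many accounts, the two folders could also be created from the shell rather than from each mail client. This is only a sketch under assumptions: it presumes the Maildir++ layout seen in the script earlier in the thread (a folder named X lives in `Maildir/.X` with `cur`, `new` and `tmp` inside), and `make_learn_folders` is a made-up name. The new directories must end up owned by the user, and the mail client may still need to subscribe to them.

```shell
#!/bin/bash
# Sketch: create LearnAsSpam and LearnAsHam as Maildir++ folders.
# make_learn_folders is a hypothetical helper, not part of SME Server.

# $1 = path to a user's Maildir
make_learn_folders() {
        local maildir="$1" name
        [ -d "$maildir" ] || { echo "No Maildir at $maildir" >&2; return 1; }
        for name in LearnAsSpam LearnAsHam
        do
                # A Maildir++ folder is a dot-directory holding cur, new and tmp
                mkdir -p "$maildir/.$name/cur" "$maildir/.$name/new" "$maildir/.$name/tmp"
        done
}

# Example (run as root, then hand the folders to the user):
# make_learn_folders /home/e-smith/files/users/someuser/Maildir
# chown -R someuser /home/e-smith/files/users/someuser/Maildir/.LearnAs*
```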

I checked the admin's mailbox for status mails and voilà, it works: no mails when there is nothing to do and some when mails are processed. Just beautiful.
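One way to see whether training is accumulating is to look at the message counts in the Bayes database. `sa-learn --dump magic` prints them as `nspam` and `nham` lines, and with SpamAssassin's default settings the Bayes rules are not used until at least 200 spam and 200 ham messages have been learned. The helper below is a sketch: `bayes_counts` is a made-up name, and the parsing assumes the usual `non-token data: nspam` line format.

```shell
#!/bin/bash
# Sketch: summarise "sa-learn --dump magic" output read from stdin.
# bayes_counts is a hypothetical helper name.
bayes_counts() {
        awk '
                # The third column holds the value; the line ends with the counter name
                $NF == "nspam" { nspam = $3 }
                $NF == "nham"  { nham  = $3 }
                END {
                        printf "spam=%d ham=%d\n", nspam, nham
                        # 200 is the SpamAssassin default for bayes_min_spam_num and bayes_min_ham_num
                        if (nspam >= 200 && nham >= 200) {
                                print "Bayes has enough mail to score"
                        } else {
                                print "Bayes still needs more training"
                        }
                }
        '
}

# Example, using the database path and user from the commands above:
# sa-learn --dump magic --dbpath /var/spool/spamd/.spamassassin -u spamd | bayes_counts
```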

Regards Troels