Koozali.org: home of the SME Server

Using SpamAssassin...

spook

Using SpamAssassin...
« on: March 21, 2006, 08:22:45 PM »
When I do receive spam in spite of SpamAssassin being active, is there an easy way for me (and my users) to report it as spam? I was wondering whether you could just forward the mail to "spam@mydomain.com" and have it "sent" to SpamAssassin.

Offline idyll

  • ****
  • 113
  • +0/-0
yes
« Reply #1 on: March 21, 2006, 09:49:44 PM »
First, the system needs to have seen enough SPAM before auto-learning is turned on. Turning it on too soon makes it behave very poorly.

You can use fetchmail or a simple Perl script to poll a folder in each home directory where your users deposit email that should be learned as SPAM. This is the cleanest way to do it. It is also completely overlooked in most SA installations.

I used this on my previous 6.0.1 server (with Jesper's kind assistance) and it trained SA to a very high degree.

I am now using 7.0 RC1 and waiting for it to have seen enough SPAM to invoke auto-learn.

regards,

patrick
...

jsheets

Using SpamAssassin...
« Reply #2 on: March 23, 2006, 05:22:20 PM »
Does SA invoke autolearn automatically when it has seen enough spam?

Offline idyll

yes
« Reply #3 on: March 23, 2006, 05:43:44 PM »
This issue took me way further into the bowels of spamassassin than I ever imagined or hoped for.   Plus I can't seem to post anything about the issue without being pressed to use RBLs :-).

Note: I use RBLs, and always have. They are a given in my environment.

Auto-learn is indeed set to commence after 200 SPAM and 200 HAM and this can be changed - but there is no reason to fiddle with that parameter.

It is the Bayes filtering itself which is NOT "on" by default, and needs to be in my opinion. Others may not agree.

SA relies on a directory in each user's home, ~/.spamassassin. Run this command logged in as the specific user, NOT root, to see what the system knows for THAT user. If you are logged in as root it will fail, complaining about the lack of a toks database.

sa-learn -D --dump magic

You'll see at the end how many "nspam" and "nham" it has learned. By default both are zero.
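To pull just those two counters out of the dump, something like the following could be used. This is a minimal sketch: the function name `bayes_counts` is made up for illustration, it expects the `sa-learn --dump magic` output on stdin, and the default threshold of 200 is simply the figure quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `sa-learn --dump magic` output on stdin and
# print the learned spam/ham counts plus whether both reach a threshold.
# The default of 200 is the figure quoted in this thread.
bayes_counts() {
    awk -v min="${1:-200}" '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham  = $3 }
        END {
            ready = (nspam + 0 >= min && nham + 0 >= min) ? "yes" : "no"
            printf "nspam=%d nham=%d ready=%s\n", nspam, nham, ready
        }'
}
```

Run as the user in question, it would be fed like this: `sa-learn --dump magic | bayes_counts`.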

If you look at the spamd log files, you will see "autolearn:no" and "autolearn:failed" pretty much all of the time. This is perfectly normal as most of the SPAM does not need to be learned again. The "failed" can be ignored as the system uses a simple file locking mechanism and it just passes on the db lookup if it cannot get an answer. Only if it "fails" all of the time should you be concerned.
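To judge whether "failed" really happens all of the time or only occasionally, the results can be tallied. A rough sketch, assuming the log lines contain `autolearn:` or `autolearn=` followed by a lowercase word; the function name `autolearn_tally` is hypothetical and the log is read from stdin, so no log path is assumed.

```shell
#!/bin/sh
# Hypothetical helper: tally autolearn results from spamd log lines on stdin.
# Accepts either "autolearn:no" (as written above) or "autolearn=no".
autolearn_tally() {
    grep -oE 'autolearn[=:][a-z]+' |
        sed 's/^autolearn[=:]//' |
        sort | uniq -c |
        awk '{ printf "%s %d\n", $2, $1 }'
}
```

Feed it whichever spamd log file your system keeps, for example `autolearn_tally < logfile`. It prints one line per result with its count.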

So my advice is to turn it "on" and let the full system work over time. But if you run an extremely slow system, it may add too much load. I mean like 386 or a 486 with say 256mb of RAM, etc. I know they exist and I am not dismissing them as servers.

regards,

patrick
...

meb

How to enable?
« Reply #4 on: March 25, 2006, 12:08:27 AM »
Is RBL blocking enabled by default?  If not, how can I enable it?

How do I enable learning?

Offline idyll

use search
« Reply #5 on: March 25, 2006, 12:18:42 AM »
The SEARCH function serves a very good purpose. Did you try it?

To view the defaults for the RBL and DNSBL:

# config show qpsmtpd
qpsmtpd=service
Bcc=disabled
BccUser=maillog
DNSBL=disabled
LogLevel=6
MaxScannerSize=25000000
RBLList=sbl-xbl.spamhaus.org,whois.rfc-ignorant.org,dnsbl.njabl.org,relays.ordb.org
RHSBL=disabled
RequireResolvableFromHost=no
SBLList=dsn.rfc-ignorant.org
access=public
status=enabled

To enable both available lists:

# config setprop qpsmtpd DNSBL enabled RHSBL enabled
# signal-event email-update
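After changing the properties it is worth confirming what the database now holds. A small sketch, assuming the `config show qpsmtpd` output has the `Name=value` shape shown above; `qpsmtpd_prop` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `config show qpsmtpd` output on stdin and print
# the value of one property (for example DNSBL or RHSBL).
# Exits non-zero when the property is not present.
qpsmtpd_prop() {
    awk -v key="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }     # tolerate indentation
        index(line, key "=") == 1 { print substr(line, length(key) + 2); found = 1 }
        END { exit !found }'
}
```

Usage would be along the lines of `config show qpsmtpd | qpsmtpd_prop DNSBL`, which should print `enabled` once the change above has been made.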

To enable Bayes or auto-learn you need to educate yourself so you know what you are getting into. Go to the spamassassin wiki and read up a bit.

patrick
...

Offline CharlieBrady

  • *
  • 6,918
  • +3/-0
Re: yes
« Reply #6 on: March 25, 2006, 01:40:30 AM »
Quote from: "idyll"

SA relies upon directories in each home, or otherwise known as ~/.spamassassin.


SA in SME server runs as the unprivileged user spamd, and doesn't have access to any user's home directory or any ~/.spamassassin files.

Offline idyll

OK
« Reply #7 on: March 25, 2006, 02:03:23 AM »
The output I depicted, after I was asked, is still valid for that user's home and the SA status for that same user.

Agree?

patrick
...

Offline chris burnat

  • *****
  • 1,135
  • +2/-0
    • http://www.burnat.com
Using SpamAssassin...
« Reply #8 on: March 27, 2006, 11:26:41 AM »
# config setprop qpsmtpd DNSBL enabled RHSBL enabled
# signal-event email-update

Should you also run:  svc -t /service/qpsmtpd ?
Thanks.
- chris
If it does not work out of the box, please fill in a Bug Report @ Bugzilla (http://bugs.contribs.org)  - check: http://wiki.contribs.org/Bugzilla_Help .  Thanks.

spook

Using SpamAssassin...
« Reply #9 on: March 31, 2006, 08:02:46 AM »
Quite impressive, 8 answers, none of which seems to answer my question:

How do I report spam as being spam to spamassassin?

Rather simple, I thought, but I might be wrong...

Offline chris burnat

Using SpamAssassin...
« Reply #10 on: March 31, 2006, 08:40:58 AM »
Spook,  
"How do I report spam as being spam to spamassassin"
Let's try... It may be that your question is a little ambiguous.  There is no need to report anything to SpamAssassin.  SpamAssassin is a program designed to identify spam using a number of mechanisms, then tag the offending email to facilitate further processing.  In SME7, the default processing consists of delivering the offending email to the junkmail folder of the relevant user.

You could modify the standard settings of SME7 to change the way in which spam is processed.  One way is to use procmail.  Another way would be to modify 15sortspam to send all spam to a common email folder associated with a user created just for this purpose.  Is this what you are after?
- chris

Offline raem

  • *
  • 3,972
  • +4/-0
Using SpamAssassin...
« Reply #11 on: March 31, 2006, 09:45:26 AM »
burnat

>...Should you also run:  svc -t /service/qpsmtpd ?

Yes
...

Offline chris burnat

Using SpamAssassin...
« Reply #12 on: March 31, 2006, 09:57:46 AM »
Thank you Ray.
- chris

Offline brianr

  • *
  • 991
  • +2/-0
Using SpamAssassin...
« Reply #13 on: March 31, 2006, 12:09:40 PM »
I suspect that spook is looking for a facility to report false negatives to SA so that it is trained to spot them next time.

I have considered this on a number of occasions, but not followed through my thoughts.

My idea is that each user has another folder, "IsJunkMail", into which they drag and drop spam that has not been spotted by SA.  Every so often the folder is read, its contents are passed to SA for training, and then the contents are deleted. This could happen daily or weekly, depending on how much gets through.  It can be done by a script (Perl or shell), which I have not yet had the time to develop.  If anyone else wants to have a go I will be glad to help.
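The folder sweep described above can be sketched in shell roughly as follows. This is only an illustration under assumptions: `learn_folder` and the `LEARN_CMD` variable are invented names, the folder is assumed to be a Maildir folder with a `cur/` subdirectory, and which Bayes database ends up being updated depends on the user the command runs as.

```shell
#!/bin/sh
# Hypothetical sketch of the drag-and-drop folder idea: feed every message in
# a Maildir folder's cur/ directory to a learner command, then delete it.
# LEARN_CMD defaults to "sa-learn --spam"; a message is only removed when the
# learner exits successfully.
learn_folder() {
    folder=$1
    for msg in "$folder"/cur/*; do
        [ -f "$msg" ] || continue          # empty folder: the glob stays literal
        if ${LEARN_CMD:-sa-learn --spam} "$msg"; then
            rm -f -- "$msg"
        else
            echo "learning failed, keeping $msg" >&2
        fi
    done
}
```

Keeping a message when learning fails means a permissions problem does not silently throw away the users' reports.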
Brian j Read
(retired, for a second time, still got 2 installations though)
The instrument I am playing is my favourite Melodeon.
.........

Offline idyll

the script for the sa-learn process
« Reply #14 on: March 31, 2006, 04:50:24 PM »
Jesper wrote this Perl script a few months ago as we worked out such a "training" process for our 6.0 servers. He shared it with me, and I do the same with you. It is appropriate to leave his name, etc. intact; otherwise you are subverting the notion.

SA needs to see 200 such SPAM before autolearn will initiate. This script will help get to that number and then continue to feed it.

Jesper is quite busy with his new company and at the same time, quite happy with his 6.0 servers.  I have moved to the 7.0 series for my own reasons. However, this script works perfectly with the 7.0 servers. You just need to create your own cron job; I run it every 8 hours. We named it LearnAsSpam.pl (and the folders are named the same) but obviously you may do as you wish and modify the script to meet your naming needs. It works out of the box, BUT the users must be granted ssh access (a login shell) for it to run without issue, because the script uses su to run sa-learn as each user. If your users do NOT have ssh access you need to enter this line for each of them:

# chsh -s /bin/bash <user>
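For the eight-hour schedule mentioned above, an /etc/cron.d style entry could look like the line printed by this sketch. The helper name `cron_entry` is made up, and the script location /usr/bin/LearnAsSpam.pl is only an example; use wherever you saved the script.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/cron.d style entry that runs a script as
# root every 8 hours (at 00:00, 08:00 and 16:00).
cron_entry() {
    printf '0 */8 * * * root /usr/bin/perl %s\n' "$1"
}

cron_entry /usr/bin/LearnAsSpam.pl
# prints: 0 */8 * * * root /usr/bin/perl /usr/bin/LearnAsSpam.pl
```

The sixth field (root) is the user the job runs as, which is needed in /etc/cron.d files but not in a per-user crontab.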


patrick

---- snip -----------

#!/usr/bin/perl

#############################################################################
#
# This script has been developed
# by Jesper Knudsen at http://sme.swerts-knudsen.dk
#
# Revision History:
#
# January 18, 2006:      Initial version
#############################################################################

use Sys::Hostname;

use esmith::AccountsDB;

my $hostname = hostname();
 
my $adb = esmith::AccountsDB->open_ro()
        or die "Couldn't open AccountsDB\n";
 
my @users = $adb->users;

foreach my $user (@users)
{
  my $firstname = $user->prop('FirstName');
  my $lastname = $user->prop('LastName');
  my $key = $user->key;

  printf("Checking for user (%s): %s %s\n", $key,$firstname, $lastname);

  my $MailDir = "/home/e-smith/files/users/" . $key . "/Maildir";
  # Skip accounts whose Maildir cannot be read
  opendir(LOGDIR, $MailDir) or next;
  my $dirname = sprintf "LearnAsSpam";
  my @logdirs = sort grep { /$dirname/ } readdir(LOGDIR);
  closedir(LOGDIR);

  foreach my $logdir (@logdirs) {

    my $SpamDir = $MailDir . "/" . $logdir . "/cur/";

#    printf("Checking Dir: %s\n",$SpamDir);
    opendir(SPAMDIR, $SpamDir);
    my @spamfiles = sort grep { /$hostname/ } readdir(SPAMDIR);
    closedir(SPAMDIR);

    foreach my $spamfile (@spamfiles) {
      # $SpamDir already holds the full path to the folder's cur/ directory
      my $filetolearn = $SpamDir . $spamfile;

      $filetolearn =~ s/;/\\;/g;
      $filetolearn =~ s/:/\\:/g;

      printf("Learning Spammail: %s\n",$filetolearn);
      # Backticks run the command as the mailbox user and capture its output
      my $result = `su - $key -c "/usr/bin/sa-learn --spam $filetolearn"`;
      printf("Result of sa-learn: %s\n",$result);
      # Now delete the file after learning
      my $delete = `su - $key -c "rm -f $filetolearn"`;
#      printf("Result of delete: %s\n",$delete);
    }
  }
}

--------------- snip --------------------
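The two substitutions in the script put a backslash in front of every ";" and ":" in the file name before it is placed inside the command string handed to su. Maildir names end in a flag suffix such as ":2,S", and an unescaped ";" would end the command early. A shell counterpart, as a sketch only (the function name `escape_maildir_name` is hypothetical):

```shell
#!/bin/sh
# Hypothetical shell counterpart of the two substitutions in the script:
# put a backslash in front of every ';' and ':' in a Maildir file name.
escape_maildir_name() {
    printf '%s\n' "$1" | sed -e 's/;/\\;/g' -e 's/:/\\:/g'
}
```

In a pure shell script, passing the file name as a quoted argument ("$file") avoids the need for this kind of escaping altogether.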
...

Offline sonoracomm

  • *
  • 208
  • +0/-0
    • http://www.sonoracomm.com
Using SpamAssassin...
« Reply #15 on: June 15, 2006, 04:35:47 PM »
Hi Patrick,

I have implemented this script as you suggested.  It looks like just the ticket.

I created an IMAP folder called LearnAsSpam and I dropped a few spams in it.

I then went to the SME command line...as root...and ran this command to test:

Code: [Select]
perl /usr/bin/LearnAsSpam.pl

It seems to learn the spams, though I'm not really sure.  

Code: [Select]
Learning Spammail: /home/e-smith/files/users/gcooper/Maildir/.LearnAsSpam/cur/1150381198.P23845Q20M668252.mail\:2,S
Result of sa-learn:
Result of delete:


I looked for some evidence by running:

Code: [Select]
sa-learn -D --dump magic

But it terminated with this:

Code: [Select]
debug: bayes: no dbs present, cannot tie DB R/O: /root/.spamassassin/bayes_toks
ERROR: Bayes dump returned an error, please re-run with -D for more information
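The error names /root/.spamassassin/bayes_toks, so sa-learn looked in the home directory of the user it was run as. A quick way to see whether a given home directory has a token database at all is sketched below; `has_bayes_db` is a made-up name, and the file name is the one from the error message.

```shell
#!/bin/sh
# Hypothetical helper: report whether a per-user Bayes token database exists
# under the given home directory (defaults to $HOME).
has_bayes_db() {
    db="${1:-$HOME}/.spamassassin/bayes_toks"
    if [ -f "$db" ]; then
        echo "found: $db"
    else
        echo "missing: $db"
        return 1
    fi
}
```

For example, `has_bayes_db /home/e-smith/files/users/gcooper` would check the mailbox user from the output above rather than root.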


Do you have any idea how I could verify the learning is working or how I should tweak the script to delete the processed messages?

Thanks much,

G

Offline idyll

bayes
« Reply #16 on: June 15, 2006, 04:48:36 PM »
Yes, you need to be logged in as the specific user you wish to check. I get the same error string if I run the command as root. I log in as myself, the user, and issue the command.

Example...

-bash-3.00$ sa-learn -D --dump magic

<blah>
<blah>

(near the bottom is the one you seek, nspam)

0.000          0          3          0  non-token data: bayes db version
0.000          0         345          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0       1875          0  non-token data: ntokens
0.000          0 1142713049          0  non-token data: oldest atime
0.000          0 1143119655          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
debug: bayes: 6761 untie-ing
debug: bayes: 6761 untie-ing db_toks
debug: bayes: 6761 untie-ing db_seen
-bash-3.00$

I am not reading or posting here much anymore as I have had my fill of delusional disorders of the subtype grandiose. But your note and your postings are always positive and solid contributions.

I hope this helps.

regards,

patrick
...

Offline sonoracomm

Using SpamAssassin...
« Reply #17 on: June 16, 2006, 08:34:42 PM »
Thanks again, Patrick!

My problem was twofold.  I had not followed instructions exactly pertaining to the enabling of the bash shell...and running the script as the user.

Now if I run 'perl /usr/bin/LearnAsSpam.pl' as root, the messages are deleted.  And if I run 'sa-learn -D --dump magic' as the regular user, it spits out the proper results.

I had forgotten that I had set the shell to rssh in an attempt to make things more secure.  Didn't work.

Thanks much,

G

p.s.  I updated the howto I wrote to add optional Bayesian filtering.  Though it's more trouble, it seems worth it to me.

http://www.sonoracomm.com/index.php?option=com_content&task=view&id=49&Itemid=32

Offline idyll

great work
« Reply #18 on: June 16, 2006, 10:11:58 PM »
Solid addition to the end user documentation.

I am sure there will be hundreds of people who appreciate your efforts over time.

regards,

patrick
...

Offline sonoracomm

Using SpamAssassin...
« Reply #19 on: July 05, 2006, 06:52:52 AM »
Me again...

Sorry for dragging this out (or beating a dead horse...).

It seems the LearnAsSpam script is working fine and provides an important function.

I'm now wondering how difficult it might be to expand it to offer a whitelisting function.  Though fairly rare, I am finding that I need to be able to whitelist sender addresses/domains to eliminate problems with valid mail misclassified as spam.  I looked at the LearnAsSpam script, but I don't understand it.

It would be nice if we could have a separate IMAP folder where we could copy messages that would add to the whitelist.

I saw in another thread a db command for adding to the (global?) whitelist.  Maybe this could be integrated into the LearnAsSpam script?

http://forums.contribs.org/index.php?topic=32248.0

I suspect other folks more proficient with SA have already looked at this issue.  What have you all done for a whitelist?

Thanks,

G

Offline azche24

  • *
  • 163
  • +0/-0
    • http://az-law.de
Using SpamAssassin...
« Reply #20 on: July 07, 2006, 12:04:24 PM »
Hi,

I installed and modified according to your wonderful howto and now I get

Code: [Select]
Learning Spammail: /home/e-smith/files/users/alex/Maildir/.LearnAsSpam/cur/1152264515.P19159Q9M989657.castor\:2,a
bayes: expire_old_tokens: locker: safe_lock: cannot create lockfile /var/spool/spamd/.spamassassin/bayes.mutex: Keine Berechtigung

bayes: locker: safe_lock: cannot create lockfile /var/spool/spamd/.spamassassin/bayes.mutex: Keine Berechtigung
Result of sa-learn: Learned tokens from 0 message(s) (1 message(s) examined)


if I run "LearnAsSpam.pl" from the shell as root ("Keine Berechtigung" is German for "Permission denied"). The same error came with the original cron job, so I changed "root" to "spamd". It seems this did not help.

Spam is deleted, but when running
Code: [Select]
sa-learn -D --dump magic
I get the same "db not present" error as stated here.
Alexander Ziemann, Berlin - DE

Offline azche24

Using SpamAssassin...
« Reply #21 on: July 11, 2006, 01:43:55 PM »
Hi, again,

I investigated some further, and the LearnAsSpam.pl script mentioned here will not work.

For what reason?

When the script does su - <mailboxuser> at the end, it is not able to write to the databases in /var/spool/spamd/.spamassassin/... as it should.

There must be a wrong PATH statement in this script. Charlie Brady stated earlier that the .spamassassin/ directory is only writable by the unprivileged user spamd; I can confirm that. So only root checking the admin mailbox is able to learn spam in this kind of configuration, which makes a lot of sense to me for security reasons.
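Whether a given account is able to create the lock file can be checked before running the learning script. A minimal sketch, assuming the directory from the error messages above; `can_lock_bayes` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical check: can the current user create a lock file in the Bayes
# directory? The default path is the one named in the error messages above.
can_lock_bayes() {
    dir=${1:-/var/spool/spamd/.spamassassin}
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "$(id -un) can write to $dir"
    else
        echo "$(id -un) cannot write to $dir"
        return 1
    fi
}
```

Run as a mailbox user it would be expected to report "cannot write" in the configuration described here, which matches the "Keine Berechtigung" (permission denied) messages.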

I was able to login as root and do

Code: [Select]
sa-learn --spam *

in the admin's spam directory.

I cannot imagine how to get this script working without breaking security. Any ideas?
Alexander Ziemann, Berlin - DE

Offline judgej

  • *
  • 375
  • +0/-0
Using SpamAssassin...
« Reply #22 on: July 11, 2006, 02:56:51 PM »
Quote from: "burnat"
...Is this what you are after?


I don't think so. He is looking for a way to allow an end user to report a piece of spam to the server, so the server can start learning from its mistakes. The question is not about how SpamAssassin processes spam once it has caught it.

-- JJ
-- Jason

Offline brianr

Using SpamAssassin...
« Reply #23 on: July 14, 2006, 07:25:48 AM »
Quote from: "azche24"
Hi,

I installed and modified according to your wonderful howto and now I get

Code: [Select]
Learning Spammail: /home/e-smith/files/users/alex/Maildir/.LearnAsSpam/cur/1152264515.P19159Q9M989657.castor\:2,a
bayes: expire_old_tokens: locker: safe_lock: cannot create lockfile /var/spool/spamd/.spamassassin/bayes.mutex: Keine Berechtigung

bayes: locker: safe_lock: cannot create lockfile /var/spool/spamd/.spamassassin/bayes.mutex: Keine Berechtigung
Result of sa-learn: Learned tokens from 0 message(s) (1 message(s) examined)


if I run "LearnAsSpam.pl" from the shell as root ("Keine Berechtigung" is German for "Permission denied"). The same error came with the original cron job, so I changed "root" to "spamd". It seems this did not help.

Spam is deleted, but when running
Code: [Select]
sa-learn -D --dump magic
I get the same "db not present" error as stated here.


I had this for a few days. I deleted the offending bayes.mutex, and the script seems to run fine again now.
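Before deleting a lock file by hand it may help to check that it really is stale. The sketch below only lists a bayes.mutex that has not been modified for a while and leaves the deletion to you. The 60-minute default is arbitrary, `stale_bayes_lock` is a made-up name, and this thread does not establish why the lock is left behind or whether removing it while spamd is busy is safe.

```shell
#!/bin/sh
# Hypothetical helper: list a bayes.mutex lock file only if it has not been
# modified for more than the given number of minutes (default 60).
# It only prints the path; deleting is left to the operator.
stale_bayes_lock() {
    dir=${1:-/var/spool/spamd/.spamassassin}
    find "$dir" -maxdepth 1 -name bayes.mutex -mmin +"${2:-60}" -print
}
```

No output means there is either no lock file or it was touched recently.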

Offline sonoracomm

Using SpamAssassin...
« Reply #24 on: August 02, 2006, 07:55:51 PM »
Hi all,

I've now had the problem described above happen several times.  I keep deleting the bayes.mutex file, but the problem returns again in short order.

Is anyone else having this problem?

Does anyone know why it happens?

A Google search just seems to bring up Brian's posts.

Thanks in advance,

G

Offline brianr

Using SpamAssassin...
« Reply #25 on: August 06, 2006, 07:50:39 AM »
Brian j Read
(retired, for a second time, still got 2 installations though)
The instrument I am playing is my favourite Melodeon.
.........

Offline robwellesley

  • *
  • 92
  • +0/-0
Using SpamAssassin...
« Reply #26 on: September 21, 2006, 11:54:05 AM »
Quote from: "RayMitchell"
burnat

>...Should you also run:  svc -t /service/qpsmtpd ?

Yes


Ray,

If you look at the email-update event (#ll in the directory) you'll see that qpsmtpd is sent a sighup as part of the event.  This is the same as an svc -t, yes?

Rob

Offline raem

Using SpamAssassin...
« Reply #27 on: September 21, 2006, 06:35:36 PM »
robwellesley

> the email-update event ....
> ....qpsmtp is sent a sighup...same as an svc -t ?

It appears to achieve the same end result


abbreviated messages log file:

Sep 22 01:45:19 server3 esmith::event[4797]: Processing event: email-update  
Sep 22 01:45:19 server3 esmith::event[4797]: Running event handler: /etc/e-smith/events/actions/generic_template_expand

Sep 22 01:45:31 server3 esmith::event[4797]: expanding /var/service/sqpsmtpd/runenv  
Sep 22 01:45:31 server3 esmith::event[4797]: expanding /var/service/qpsmtpd/runenv  

Sep 22 01:46:27 server3 esmith::event[4797]: adjusting supervised qmail (sighup)  
Sep 22 01:46:27 server3 esmith::event[4797]: adjusting supervised qmail (up)  
Sep 22 01:46:27 server3 esmith::event[4797]: adjusting supervised sqpsmtpd (sighup)  
Sep 22 01:46:27 server3 esmith::event[4797]: adjusting supervised sqpsmtpd (up)  
Sep 22 01:46:28 server3 esmith::event[4797]: adjusting supervised qpsmtpd (sighup)  
Sep 22 01:46:28 server3 esmith::event[4797]: adjusting supervised qpsmtpd (up)  
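To list which supervised services an event sent a sighup to, the log lines above can be filtered. A small sketch, assuming the "adjusting supervised NAME (sighup)" wording shown in this excerpt; `sighup_services` is a hypothetical name and the log is read from stdin.

```shell
#!/bin/sh
# Hypothetical helper: read messages-log lines on stdin and print the name of
# every supervised service that was sent a sighup.
sighup_services() {
    sed -n 's/.*adjusting supervised \([^ ]*\) (sighup).*/\1/p'
}
```

For the excerpt above it would print qmail, sqpsmtpd and qpsmtpd, one per line.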




svc applies all the options to each service in turn. Here are the options:

-u: Up. If the service is not running, start it. If the service stops, restart it.
-d: Down. If the service is running, send it a TERM signal and then a CONT signal. After it stops, do not restart it.
-o: Once. If the service is not running, start it. Do not restart it if it stops.
-p: Pause. Send the service a STOP signal.
-c: Continue. Send the service a CONT signal.
-h: Hangup. Send the service a HUP signal.
-a: Alarm. Send the service an ALRM signal.
-i: Interrupt. Send the service an INT signal.
-t: Terminate. Send the service a TERM signal.
-k: Kill. Send the service a KILL signal.
-x: Exit. supervise will exit as soon as the service is down. If you use this option on a stable system, you're doing something wrong; supervise is designed to run forever.
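Going by the option list above, the sighup in the log corresponds to svc -h (HUP), while svc -t sends TERM, so the two are different signals even if the end result appears the same. The lookup below is a sketch built only from that list; `svc_signal` is a made-up name.

```shell
#!/bin/sh
# Hypothetical lookup built from the option list above: print the signal(s)
# that a given svc option sends to the service.
svc_signal() {
    case $1 in
        -d) echo "TERM CONT" ;;
        -p) echo STOP ;;
        -c) echo CONT ;;
        -h) echo HUP ;;
        -a) echo ALRM ;;
        -i) echo INT ;;
        -t) echo TERM ;;
        -k) echo KILL ;;
        -u|-o|-x) echo "none (changes supervise behaviour only)" ;;
        *)  echo "unknown option: $1" >&2; return 1 ;;
    esac
}
```

For example, `svc_signal -h` prints HUP and `svc_signal -t` prints TERM.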
...

Offline timn

  • *
  • 62
  • +0/-0
    • Nash CDL
Using SpamAssassin...
« Reply #28 on: September 30, 2006, 01:00:37 AM »
I have followed the howto and installed the cron jobs, Perl scripts, etc. as described. However, the first time the cron job for sa-update ran last night I got a Cron Daemon email to root which stated:

/etc/cron.daily/sa-update:

sa-update: importing default keyring to '/etc/mail/spamassassin/sa-update-keys'...
/etc/cron.daily/sa-update: line 1: svc: command not found

I also got an email from sa-update which indicated the updates went through OK and ends with

[28907] dbg: channel: applying changes to /var/lib/spamassassin/3.001003/updates_spamassassin_org...
[28907] dbg: channel: update complete
[28907] dbg: diag: updates complete, exiting with code 0

So why wasn't svc found? Should this be email-update in the script?

I can run the svc command as per the script from a root shell prompt.
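One common cause of "command not found" under cron is that cron jobs run with a shorter PATH than a root login shell, so a command that works at the prompt may not be found by the job. Whether that is what happened here is not established in this thread. The sketch below looks a command up on the current PATH and then in a few extra directories; `find_tool` and the default directory list are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: find a command by name, first on the current PATH and
# then in a list of extra directories, and print where it was found.
# The default directory list is an assumption, not taken from this thread.
find_tool() {
    name=$1; shift
    [ $# -gt 0 ] || set -- /usr/local/bin /usr/local/sbin /usr/sbin /sbin
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
        return 0
    fi
    for dir in "$@"; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    echo "$name not found" >&2
    return 1
}
```

Running `find_tool svc` from the root prompt shows the absolute path, which can then be written into the cron script in place of the bare command name.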

Offline brianr

Using SpamAssassin...
« Reply #29 on: September 30, 2006, 05:06:02 PM »
I haven't had a SpamAssassin update detected recently, in fact not since the recent set of SME7 updates, but I have had problems with cron and permissions (in /etc/cron.d), so I think perhaps cron has been updated and the rules have changed.  E.g. are /etc/cron.daily scripts no longer run under root?  This is purely speculative; perhaps someone who knows about the recent CentOS/SME updates can confirm or deny this.