Koozali.org: home of the SME Server
Obsolete Releases => SME Server 7.x => Topic started by: idyll on March 21, 2006, 09:39:09 PM
-
Can one of the developers advise if a new installation of 7.0 PR1 uses the default of needing to see 200 pieces of SPAM before auto-learning is enabled?
Auto-learning is clearly disabled on my new server.
I used a perl script (thanks Jesper!) and ran it against a LearnAsSpam folder in each home directory to teach my 6.0.1 server. The cron/script is currently failing as it reports auto-learning is disabled.
I can wait, as I know the system needs a minimum amount of SPAM before this is invoked; I am just asking for confirmation.
thanks
patrick
-
idyll
Depending on how the spam filter is configured in sme7, spam messages are rejected by the server and therefore will not be moved to the junkmail folder, as was the case in sme6 spam filter by knuddi.
If you enable RBL list rejection, you will be receiving a greatly reduced amount of spam anyway and therefore not see very many spam messages in the junkmail folder.
-
UPDATE - I found this inside the system logged in as root. Enter this while in /etc/mail/spamassassin
perldoc Mail::SpamAssassin::Conf
which pretty much lays it all out. I think this should be referenced for any and all SA questions. It clearly answered mine ;-)
patrick
---------------------------------------------
I understand those issues very well. I did not make mention of RBLs or rejection thresholds.
The sa-learn features of the Bayes filters are OFF until they reach a threshold of a number of SPAM received. My site receives close to 1000 SPAM per day.
I have been using PR1 for five days and I am confused why the sa-learn is still disabled. A new installation will not be training the Bayes filters until this numeric threshold is reached.
When I enter this
sa-learn -D --dump data, the output is this...
------------ snip ---------------------
[root@galadriel ~]# sa-learn --dump data
ERROR: Bayes dump returned an error, please re-run with -D for more information
[root@galadriel ~]# sa-learn -D --dump data
debug: SpamAssassin version 3.0.5
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/sbin/e-smith', keeping.
debug: PATH included '/usr/local/sbin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/X11R6/bin', keeping.
debug: PATH included '/root/bin', which doesn't exist, dropping.
debug: Final PATH set to: /sbin/e-smith:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin
debug: using "/etc/mail/spamassassin/init.pre" for site rules init.pre
debug: config: read file /etc/mail/spamassassin/init.pre
debug: using "/usr/share/spamassassin" for default rules dir
debug: config: read file /usr/share/spamassassin/10_misc.cf
debug: config: read file /usr/share/spamassassin/20_anti_ratware.cf
debug: config: read file /usr/share/spamassassin/20_body_tests.cf
debug: config: read file /usr/share/spamassassin/20_compensate.cf
debug: config: read file /usr/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /usr/share/spamassassin/20_drugs.cf
debug: config: read file /usr/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /usr/share/spamassassin/20_head_tests.cf
debug: config: read file /usr/share/spamassassin/20_html_tests.cf
debug: config: read file /usr/share/spamassassin/20_meta_tests.cf
debug: config: read file /usr/share/spamassassin/20_phrases.cf
debug: config: read file /usr/share/spamassassin/20_porn.cf
debug: config: read file /usr/share/spamassassin/20_ratware.cf
debug: config: read file /usr/share/spamassassin/20_uri_tests.cf
debug: config: read file /usr/share/spamassassin/23_bayes.cf
debug: config: read file /usr/share/spamassassin/25_body_tests_es.cf
debug: config: read file /usr/share/spamassassin/25_hashcash.cf
debug: config: read file /usr/share/spamassassin/25_spf.cf
debug: config: read file /usr/share/spamassassin/25_uribl.cf
debug: config: read file /usr/share/spamassassin/30_text_de.cf
debug: config: read file /usr/share/spamassassin/30_text_fr.cf
debug: config: read file /usr/share/spamassassin/30_text_nl.cf
debug: config: read file /usr/share/spamassassin/30_text_pl.cf
debug: config: read file /usr/share/spamassassin/50_scores.cf
debug: config: read file /usr/share/spamassassin/60_whitelist.cf
debug: using "/etc/mail/spamassassin" for site rules dir
debug: config: read file /etc/mail/spamassassin/local.cf
debug: config: read file /etc/mail/spamassassin/whitelist.cf
debug: using "/root/.spamassassin/user_prefs" for user prefs file
debug: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x90168a4)
debug: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::Hashcash=HASH(0x999f734)
debug: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::SPF=HASH(0x996cbb8)
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x90168a4) implements 'parse_config'
debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0x999f734) implements 'parse_config'
debug: Score set 0 chosen.
debug: bayes: no dbs present, cannot tie DB R/O: /root/.spamassassin/bayes_toks
ERROR: Bayes dump returned an error, please re-run with -D for more information
[root@galadriel ~]#
--------- snip ---------------
The debug output is saying "no dbs present"? This is due to sa-learn not being enabled.
So - developers - was this by design? I realize the Bayes mechanism can be a CPU hog on very feeble systems but overall it is the key to long-term effectiveness.
At the very least, I think you will agree, this is a grey area and it would be awesome to have some details spelling out the many options in the documentation. The trade-offs, maybe some setting templates depicting the pros and cons of using this feature/filter or that one, etc., relative to the scale of a person's site and the horsepower of their server. It is certainly one of the most oblique features provided by this splendid server.
regards,
patrick
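A hedged aside on the debug output above: the line "bayes: no dbs present" means SpamAssassin found no Bayes database files for the user running sa-learn (root here), which is what you would see before anything has been learned. Separately, SpamAssassin's stock settings bayes_min_spam_num and bayes_min_ham_num (both default to 200) keep the Bayes rules from scoring until that many spam and ham messages have been learned. The sketch below only checks whether the token file exists. The function name bayes_db_present is made up for illustration, and the default directory is simply the one shown in the debug output.

```shell
#!/bin/sh
# Sketch: report whether a SpamAssassin Bayes token file exists.
# bayes_db_present is a hypothetical helper name, not part of SpamAssassin.
# The default directory is taken from the debug output (/root/.spamassassin).
bayes_db_present() {
    dir="${1:-/root/.spamassassin}"
    if [ -f "$dir/bayes_toks" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Example: bayes_db_present /root/.spamassassin
```

Once the file is present, sa-learn --dump magic prints the database counters, including the number of spam and ham messages learned so far.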
-
idyll
>... I did not make mention of RBLs or rejection thresholds.
That's why I mentioned them.
>....My site receives close to 1000 SPAM per day.
Here's the spam filter report for the last 24hrs from my sme6 server (which uses the same techniques as sme7) which shows that RBL rejection takes care of the majority of spam messages, very few get to the junkmail folders.
Total spam rejected : 764 ( 67.85%)
RBL rejected : 674 ( 59.86%)
Score above 15 : 30 ( 3.93%)
Total ham accepted : 362 ( 32.15%)
-------------------
Total emails processed: 1126 ( 47/hr)
Here is the virus scanner report, as you can see it has very little to do, also mainly due to RBL & pattern matching rejection.
Total emails rejected : 2 ( 0.40%)
Problems : 0 ( 0.00%)
Quarantined : 2 (100.00%)
Total emails accepted : 494 ( 99.60%)
-------------------
Total emails processed: 496 ( 21/hr)
> ...I realize the Bayes mechanism can be a CPU hog on very feeble
> systems but overall it is the key to long-term effectiveness.
If you want to be effective against spam (& viruses) I think RBL rejection is essential on any server (and also the use of the spamassassin custom rejection setting).
In that case, I was pointing out that you will have (significantly) less spam actually getting into the system for Bayes analysis to do its work on.
Perhaps the developers feel the same way so that may be part of the reason they did not enable Bayes by default.
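For anyone checking the figures in the spam report quoted above, here is a minimal sketch of the arithmetic. It assumes the percentages are computed the way the spamfilter-stats.pl script in this thread does it: every figure is a share of all mail processed, except "Score above 15", which is a share of the spam count. The helper name pct is made up for illustration.

```shell
#!/bin/sh
# Sketch: reproduce the percentages in the spam report from its raw counts.
# pct is a hypothetical helper: pct PART WHOLE prints a percentage with
# two decimals, or 0.00 when WHOLE is zero.
pct() {
    awk -v part="$1" -v whole="$2" \
        'BEGIN { if (whole > 0) printf "%.2f\n", part / whole * 100; else print "0.00" }'
}

total=1126   # total emails processed
spam=764     # total spam rejected
rbl=674      # rejected by RBL
above15=30   # spam with a score above 15
ham=362      # ham accepted

echo "spam    : $(pct "$spam" "$total")%"     # share of all mail
echo "rbl     : $(pct "$rbl" "$total")%"      # share of all mail
echo "above15 : $(pct "$above15" "$spam")%"   # share of spam, not of all mail
echo "ham     : $(pct "$ham" "$total")%"      # share of all mail
```

Run as written, the four lines come out at 67.85%, 59.86%, 3.93% and 32.15%, matching the report.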
-
Ray, do you know a way on SME7 to get those concise spam & virus reports? Thanks!
-
MSmith
I made necessary changes to both those scripts and configured appropriate entries for antivirus in the configuration db, and for conf.global in the spamassassin db, and both scripts run OK.
I don't have any mail messages on my test sme7 server, so I can't see if the reports are picking up the data correctly (all fields show 0).
-
deleted original post
Ray
-
MSmith
>...do you know a way on SME7 to get those concise spam & virus reports?
As my earlier post has disappeared (maybe I accidentally edited it?), I am reposting the info.
The reports are generated by scripts in spamfilter and antivirus for sme6 from Jesper Knudsen.
The scripts are
/usr/bin/antivirus-stats.pl
and
/usr/bin/spamfilter-stats.pl
These scripts are run by cron jobs in /etc/cron.d
These scripts can be copied to sme7 and modified to suit the changes in sme7,
i.e. different db locations and log files.
For the databases, use
/home/e-smith/db/spamassassin
and
/home/e-smith/db/configuration
Also refer to /var/log/qpsmtpd/current rather than smtpfront-qmail/current
You will also need to put entries into the spamassassin db for conf.global
and into the configuration db for antivirus.
The important thing is to enable reports or when you run the scripts they will do nothing.
See the corresponding entries in sme6 db's for what is required.
Perhaps the scripts need to be rewritten for sme7.
My test sme7 server does not have enough email entries in the log files to see if the reports that are being generated by the scripts are valid, but the reports do appear to run OK (after modifying the scripts).
-
Ray - could you at least publish the contents of the modified scripts?
I have been looking at doing a similar thing, but do not want to duplicate your effort.
-
brianr
You will also need to add a few of the missing folders to your sme7 server to prevent script errors.
The changes I made were not extensive and depending how you implement it into sme7, there may be more work required.
Two scripts follow:
spamfilter-stats.pl
#!/usr/bin/perl
#############################################################################
#
# This script provides daily SpamFilter statistics and deletes all users
# junkmails. Configuration of the script is done by the Spam Filter
# Server-Manager module
#
# This script has been developed
# by Jesper Knudsen at http://sme.swerts-knudsen.dk
#
# Revision History:
#
# August 13, 2003: Initial version
# August 25, 2004: fixed problem when hostname had no-ASCII chars
# March 23, 2006 Revised for sme7 RM
#############################################################################
# internal modules (part of core perl distribution)
use Getopt::Long;
use Pod::Usage;
use POSIX qw/strftime floor/;
use Time::Local;
use Date::Manip;
use strict;
use esmith::ConfigDB;
use Sys::Hostname;
my $hostname = hostname();
#Configuration section
my %opt = ();
$opt{'logfile'} = '/var/log/maillog'; # Log file
$opt{'sendmail'} = '/usr/sbin/sendmail'; # Path to sendmail stub
$opt{'from'} = 'Admin'; # Who is the mail from
$opt{'end'} = "";
$opt{'start'} = "yesterday";
$opt{'mail'} = "admin";
chomp($opt{'timezone'} = `date +%z`); # backticks run the shell command; chomp drops the trailing newline
Date_Init("TZ=$opt{'timezone'}");
# Parameters for the Delete Junkmail functionality
my $file1 = "/home/e-smith/db/accounts"; #E-SMITH ACCOUNTS DATABASE
my $path = "/home/e-smith/files/users/"; #PATH TO USER DIRECTORIES
my $end_path_cur = "/Maildir/.junkmail/cur"; #END OF PATH STRING
my $end_path_new = "/Maildir/.junkmail/new"; #END OF PATH STRING
# end
my $sa_dbase = '/home/e-smith/db/spamassassin';
my $dbh = esmith::ConfigDB->open($sa_dbase) || die "Unable to open spamassassin configuration dbase.";
my %sa_conf = $dbh->get('conf.global')->props;
my $disabled = 1;
my $days_to_keep = 0; #How many days to keep junkmail
my $parameter = "";
my $value = "";
while (($parameter,$value) = each(%sa_conf)) {
if ($parameter eq 'daily_report' && $value eq '1') {
$disabled = 0;
}
if ($parameter eq 'delete_after') {
$days_to_keep = $value;
}
}
my $tstart = time;
# efficiency; don't rebuild the (constant) hash every loop iteration
my %month_list = ('Jan' => 0,
'Feb' => 1,
'Mar' => 2,
'Apr' => 3,
'May' => 4,
'Jun' => 5,
'Jul' => 6,
'Aug' => 7,
'Sep' => 8,
'Oct' => 9,
'Nov' => 10,
'Dec' => 11);
#Local variables
my $YEAR = (localtime(time))[5]; # this is years since 1900
my $total = 0;
my $spamcount = 0;
my $spamavg = 0;
my $hamcount = 0;
my $hamavg = 0;
my $threshtotal = 0;
my $above15 = 0;
my $RBLcount = 0;
my %spambyhour = ();
my %hambyhour = ();
my ($start, $end) = parse_arg($opt{'start'}, $opt{'end'});
#---------------------------------------
# First scan the maillog file
#---------------------------------------
#Open log file
open(LOG, "< $opt{'logfile'}") or die "Can't open $opt{'logfile'}: $!\n";
LINE: while (<LOG>) {
# Agh... this is ugly.
if (m/
^(\w{3})\s+ # Month
(\d+)\s+ # Day
(\d\d):(\d\d):(\d\d)\s+ # HH:MM:SS
(\S+)\s+ # Hostname?
spamd\[\d+\]:\s+ # spamd[PID]
(clean\smessage|identified\sspam)\s # Status
\(([-0-9.]+)\/([-0-9.]+)\)\s # Score, Threshold
for\s
\w+:\d+\s # for daf:1000
in\s
[0-9.]+\sseconds,\s+
[0-9]+\sbytes\./x) { # There's an extra space at the end for some reason.
#Split line into components
my $mon = $1;
my $day = $2;
my $hour = $3;
my $min = $4;
my $sec = $5;
my $status = $7;
my $score = $8;
my $threshold = $9;
# Convert to absolute time
my $abstime = timelocal($sec, $min, $hour, $day, $month_list{$mon}, $YEAR);
my $abshour = floor ($abstime / 3600); # Hours since the epoch
#If date specified, only process lines matching date
next LINE if ($abstime < $start);
# We can assume that logs are chronological
last if ($abstime > $end);
#Total score
$total++;
if ($status eq "identified spam") {
#Spam scores
$spamcount++;
$spamavg += $score;
$spambyhour{$abshour}++;
if ($score > 15 ){
$above15++;
}
} elsif ($status eq "clean message") {
#Nonspam scores
$hamcount++;
$hamavg += $score;
$hambyhour{$abshour}++;
} else {
die "Strange error in regexp";
}
$threshtotal += $threshold;
}
# SpamAssassin version 3.1x has changed the log file output (why???)
if (m/
^(\w{3})\s+ # Month
(\d+)\s+ # Day
(\d\d):(\d\d):(\d\d)\s+ # HH:MM:SS
(\S+)\s+ # Hostname?
spamd\[\d+\]:\s+ # spamd[PID]
spamd:\s+ # SPAMASSASSIN 3.1 LOGGING
(clean\smessage|identified\sspam)\s # Status
\(([-0-9.]+)\/([-0-9.]+)\)\s # Score, Threshold
for\s
\w+:\d+\s # for daf:1000
in\s
[0-9.]+\sseconds,\s+
[0-9]+\sbytes\./x) { # There's an extra space at the end for some reason.
#Split line into components
my $mon = $1;
my $day = $2;
my $hour = $3;
my $min = $4;
my $sec = $5;
my $status = $7;
my $score = $8;
my $threshold = $9;
# Convert to absolute time
my $abstime = timelocal($sec, $min, $hour, $day, $month_list{$mon}, $YEAR);
my $abshour = floor ($abstime / 3600); # Hours since the epoch
#If date specified, only process lines matching date
next LINE if ($abstime < $start);
# We can assume that logs are chronological
last if ($abstime > $end);
#Total score
$total++;
if ($status eq "identified spam") {
#Spam scores
$spamcount++;
$spamavg += $score;
$spambyhour{$abshour}++;
if ($score > 15 ){
$above15++;
}
} elsif ($status eq "clean message") {
#Nonspam scores
$hamcount++;
$hamavg += $score;
$hambyhour{$abshour}++;
} else {
die "Strange error in regexp";
}
$threshtotal += $threshold;
}
}
#Done reading file
close(LOG);
#---------------------------------------
# Then scan the qpsmtpd log file
#---------------------------------------
system ("cat /var/log/qpsmtpd/current | /usr/local/bin/tai64nlocal > /var/tmp/sme-spamfilter.rbl.out");
open(LOG, "/var/tmp/sme-spamfilter.rbl.out") or die "Can't open qpsmtpd logfile\n";
LINE: while (<LOG>) {
# Agh... this is ugly.
if (m/
^(\w{4})\-+ # Year
(\d+)\-+ # Month
(\d+)\s+ # Day
(\d\d):(\d\d):(\d\d)\.(\w{9})\s+ # HH:MM:SS.fraction (9 characters)
rblsmtpd\:\s+ # rblsmtpd
\w+/x) { # There's an extra space at the end for some reason.
#Split line into components
my $year =$1;
my $mon = $2;
my $day = $3;
my $hour = $4;
my $min = $5;
my $sec = $6;
# Convert to absolute time
my $abstime = timelocal($sec, $min, $hour, $day, $mon-1, $year-1900); # use the year from the log line
my $abshour = floor ($abstime / 3600); # Hours since the epoch
#If date specified, only process lines matching date
next LINE if ($abstime < $start);
# We can assume that logs are chronological
last if ($abstime > $end);
#Total score
$total++;
#RBL score
$RBLcount++;
#Spam scores
$spamcount++;
$spambyhour{$abshour}++;
}
}
#Done reading file
close(LOG);
#Calculate some numbers
# Note: "my $x = ... if COND" has undefined behaviour in Perl, so use ?: instead
$spamavg=$spamavg/$spamcount if $spamcount;
$hamavg=$hamavg/$hamcount if $hamcount;
my $threshavg = $total ? $threshtotal/$total : 0;
my $spampercent = $total ? ($spamcount/$total) * 100 : 0;
my $rblpercent = $total ? ($RBLcount/$total) * 100 : 0;
my $hampercent = $total ? ($hamcount/$total) * 100 : 0;
my $hrsinperiod=(($end-$start) / 3600);
my $emailperhour = ($total && $hrsinperiod) ? $total/$hrsinperiod : 0;
my $above15percent = $spamcount ? ($above15/$spamcount) * 100 : 0;
my $oldfh;
#Open Sendmail if we are mailing it
if ($opt{'mail'} && !$disabled) {
open (SENDMAIL, "|$opt{'sendmail'} -oi -t -odq") or die "Can't open sendmail: $!\n";
print SENDMAIL "From: $opt{'from'}\n";
print SENDMAIL "To: $opt{'mail'}\n";
print SENDMAIL "Subject: Spam Filter Statistics from $hostname - ",strftime("%F", localtime($start)), "\n\n";
$oldfh = select SENDMAIL;
}
my $telapsed = time - $tstart;
if (!$disabled) {
#Output results
print "Period Beginning : ", strftime("%c", localtime($start)), "\n";
print "Period Ending : ", strftime("%c", localtime($end)), "\n";
print "SpamAssassin Version : ", `spamassassin -V`;
print "\n";
printf "Reporting Period : %.2f hrs\n", $hrsinperiod;
print "--------------------------------------------------\n";
print "\n";
printf "Total spam rejected : %8d (%6.2f%%)\n", $spamcount, $spampercent || 0;
printf " RBL rejected : %8d (%6.2f%%)\n", $RBLcount, $rblpercent || 0;
printf " Score above 15 : %8d (%6.2f%%)\n", $above15, $above15percent || 0;
printf "Total ham accepted : %8d (%6.2f%%)\n", $hamcount, $hampercent || 0;
print " -------------------\n";
printf "Total emails processed: %8d (%5.f/hr)\n", $total, $emailperhour || 0;
print "\n";
printf "Average spam threshold : %11.2f\n", $threshavg || 0;
printf "Average spam score : %11.2f\n", $spamavg || 0;
printf "Average ham score : %11.2f\n", $hamavg || 0;
print "\n";
print "Statistics by Hour\n";
print "-------------------------------------\n";
print "Hour Spam Ham\n";
print "------------- -------- --------\n";
my $hour = floor($start/3600);
while ($hour < $end/3600) {
printf("%s %8d %8d\n",
strftime("%F, %H", localtime($hour*3600)),
$spambyhour{$hour} || 0, $hambyhour{$hour} || 0);
$hour++;
}
print "\n";
} # not disabled
if ($days_to_keep > 0) {
Delete_Junkmail();
}
if (!$disabled) {
print "\nDone. Report generated in $telapsed sec.\n\n";
#Close Sendmail if it was opened
if ($opt{'mail'}) {
select $oldfh;
close (SENDMAIL);
}
}
#All done
exit 0;
#############################################################################
# Subroutines ###############################################################
#############################################################################
########################################
# Process parms #
########################################
sub parse_arg {
my $startdate = shift;
my $enddate = shift;
my $secsinday = 86400;
my $time = 0;
my $start = UnixDate($startdate,"%s");
my $end = UnixDate($enddate, "%s");
if(!$start && !$end) {
$end = time;
$start = $end - $secsinday;
return ($start, $end);
}
if(!$start) {
$start = $end - $secsinday;
return ($start, $end);
}
if(!$end) {
$end = $start + $secsinday;
return ($start, $end);
}
if($start > $end) {
return ($end, $start);
}
return ($start, $end);
}
sub dbg {
my $msg = shift;
if ($opt{debug}) {
print STDERR $msg;
}
}
sub Delete_Junkmail {
my $deleted;
my $found;
my $junkmail_dir;
my $entry;
my $syscommand;
my $x;
open (ORIGINAL, "< $file1") or die "Can't open $file1: $!\n"; #OPEN FILE FOR READING
my @original = <ORIGINAL>; #READ FILE INTO AN ARRAY
#PROCESS THE ARRAY
foreach $x (@original) {
#SPLIT THE RECORD TO RETRIEVE USER INFO
my @users_original = split /\|/, $x ;
#SPLIT THE FIRST ENTRY TO RETRIEVE USERNAME AND TYPE (users/pseudonym/system)
my @users = split /\=/, $users_original[0];
#PROCESS THE RECORDS THAT ARE ACTUAL USERS
if ($users[1] eq 'user') {
$deleted = 0;
$found = 0;
#Set path to the new mail folder
$junkmail_dir = "$path$users[0]$end_path_new";
# Now get the content list for the directory.
opendir(QDIR, "$junkmail_dir") or die "Couldn't read directory $junkmail_dir";
# Loop through this list looking for any *file* which hasn't been
# modified in the last $days_to_keep days.
while($entry = readdir(QDIR)) {
next if $entry =~ /^\./;
$entry = $junkmail_dir . '/' . $entry;
$syscommand = ("rm -f \"$entry\"");
$found++;
if (-f $entry && (-M $entry > $days_to_keep)) {
$deleted++;
$found--;
system("$syscommand");
}
}
closedir(QDIR);
#Set path to the cur (already seen) mail folder
$junkmail_dir = "$path$users[0]$end_path_cur";
# Now get the content list for the directory.
opendir(QDIR, "$junkmail_dir") or die "Couldn't read directory $junkmail_dir";
# Loop through this list looking for any *file* which hasn't been
# modified in the last $days_to_keep days.
while($entry = readdir(QDIR)) {
next if $entry =~ /^\./;
$entry = $junkmail_dir . '/' . $entry;
$syscommand = ("rm -f \"$entry\"");
$found++;
if (-f $entry && (-M $entry > $days_to_keep)) {
$deleted++;
$found--;
system("$syscommand");
}
}
closedir(QDIR);
if (!$disabled) {
printf "Deleted %d old spam email(s) from user \"%s\" ", $deleted, $users[0];
printf "- %d email(s) left in junkmail folder\n", $found ;
}
}
}
close ORIGINAL;
}
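As a side note on the Delete_Junkmail routine above: before letting anything delete mail, it can help to list what an age-based clean-up would select. The sketch below does only that, using find instead of Perl. The helper name list_old_junk and the example path are assumptions for illustration, and find's -mtime +N works in whole days, so its cut-off is close to but not identical with the script's fractional -M test.

```shell
#!/bin/sh
# Sketch: list regular, non-hidden files directly inside one folder that are
# older than N days. Nothing is deleted.
# list_old_junk is a hypothetical helper name: list_old_junk DIR DAYS
list_old_junk() {
    dir="$1"
    days="$2"
    # -maxdepth 1 : do not descend into subfolders
    # ! -name '.*': skip dot files, as the Perl loop does
    # -mtime +N   : modified more than N whole days ago
    find "$dir" -maxdepth 1 -type f ! -name '.*' -mtime +"$days" -print
}

# Example (path is a placeholder):
# list_old_junk /path/to/Maildir/.junkmail/cur 30
```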
antivirus-stats.pl
#!/usr/bin/perl
#############################################################################
#
# This script provides daily Antivirus statistics and deletes all old
# Quarantined and Problems emails. Configuration of the script is done by the
# Antivirus Server-Manager module
#
# This script has been developed
# by Jesper Knudsen at http://sme.swerts-knudsen.dk
#
# Revision History:
#
# August 13, 2003: Initial version
# March 23, 2006 modified for sme7 RM
#############################################################################
# internal modules (part of core perl distribution)
use Getopt::Long;
use Pod::Usage;
use POSIX qw/strftime floor/;
use Time::Local;
use Date::Manip;
use strict;
use esmith::ConfigDB;
use Sys::Hostname;
my $hostname = hostname();
#Configuration section
my %opt = ();
$opt{'logfile'} = '/var/log/amavis-ng/amavis-ng.log'; # Log file
$opt{'sendmail'} = '/usr/sbin/sendmail'; # Path to sendmail stub
$opt{'from'} = 'Admin'; # Who is the mail from
$opt{'end'} = "";
$opt{'start'} = "yesterday";
$opt{'mail'} = "admin";
chomp($opt{'timezone'} = `date +%z`); # backticks run the shell command; chomp drops the trailing newline
Date_Init("TZ=$opt{'timezone'}");
my $disabled = 1;
my $days_to_keep = 0; #How many days to keep junkmail
my $lastupdate = 'never';
our $db = esmith::ConfigDB->open
or die "Couldn't open SME configuration database (permissions problems?)";
# Read the antivirus record once; the report stays disabled if it is missing
my $antivirus = $db->get('antivirus');
if ($antivirus)
{
$days_to_keep = $antivirus->prop('AutoDelete') || 0;
if (($antivirus->prop('StatusReport') || '') eq 'yes')
{
$disabled = 0;
}
}
my $tstart = time;
# efficiency; don't rebuild the (constant) hash every loop iteration
my %month_list = ('Jan' => 0,
'Feb' => 1,
'Mar' => 2,
'Apr' => 3,
'May' => 4,
'Jun' => 5,
'Jul' => 6,
'Aug' => 7,
'Sep' => 8,
'Oct' => 9,
'Nov' => 10,
'Dec' => 11);
#Local variables
my $YEAR = (localtime(time))[5]; # this is years since 1900
my $total = 0;
my $spamcount = 0;
my $hamcount = 0;
my $infectedcount = 0;
my $problemscount = 0;
my %infectedbyhour = ();
my %problemsbyhour = ();
my %spambyhour = ();
my %hambyhour = ();
my ($start, $end) = parse_arg($opt{'start'}, $opt{'end'});
#---------------------------------------
# First scan the maillog file
#---------------------------------------
#Open log file
open(LOG, "< $opt{'logfile'}") or die "Can't open $opt{'logfile'}: $!\n";
LINE: while (<LOG>) {
# Agh... this is ugly.
if (m/
^(\w{3})\s+ # Month
(\d+)\s+ # Day
(\d\d):(\d\d):(\d\d)\s+ # HH:MM:SS
(\S+)\s+ # Hostname?
amavis\[\d+\]:\s+ # amavis[PID]
(AMAVIS::MTA::Qmail:|Quarantining\sinfected\smessage\sto)\s+ # Status
(\S+)\s+
/x) { # There's an extra space at the end for some reason.
#Split line into components
my $mon = $1;
my $day = $2;
my $hour = $3;
my $min = $4;
my $sec = $5;
my $status = $7;
my $issue = $8;
# Convert to absolute time
my $abstime = timelocal($sec, $min, $hour, $day, $month_list{$mon}, $YEAR);
my $abshour = floor ($abstime / 3600); # Hours since the epoch
#If date specified, only process lines matching date
next LINE if ($abstime < $start);
# We can assume that logs are chronological
last if ($abstime > $end);
if ($status eq "Quarantining infected message to") {
$total++;
if ($issue =~ m/problems/) {
$problemscount++;
$infectedbyhour{$abshour}++;
}
else {
$infectedcount++;
$infectedbyhour{$abshour}++;
}
} elsif ($status eq "AMAVIS::MTA::Qmail:") {
if ($issue =~ m/Accepting/) {
$total++;
$hamcount++;
$hambyhour{$abshour}++;
}
}
}
}
#Done reading file
close(LOG);
#Calculate some numbers
my $totalissues = $infectedcount+$problemscount;
# Note: "my $x = ... if COND" has undefined behaviour in Perl, so use ?: instead
my $spampercent = $total ? ($totalissues/$total) * 100 : 0;
my $problemspercent = $totalissues ? ($problemscount/$totalissues) * 100 : 0;
my $infectedpercent = $totalissues ? ($infectedcount/$totalissues) * 100 : 0;
my $hampercent = $total ? ($hamcount/$total) * 100 : 0;
my $hrsinperiod=(($end-$start) / 3600);
my $emailperhour = ($total && $hrsinperiod) ? $total/$hrsinperiod : 0;
my $oldfh;
#Open Sendmail if we are mailing it
if ($opt{'mail'} && !$disabled) {
open (SENDMAIL, "|$opt{'sendmail'} -oi -t -odq") or die "Can't open sendmail: $!\n";
print SENDMAIL "From: $opt{'from'}\n";
print SENDMAIL "To: $opt{'mail'}\n";
print SENDMAIL "Subject: Antivirus Statistics from $hostname - ",strftime("%F", localtime($start)), "\n\n";
$oldfh = select SENDMAIL;
}
my $telapsed = time - $tstart;
show_last_freshclam_update();
if (!$disabled) {
#Output results
print "Period Beginning : ", strftime("%c", localtime($start)), "\n";
print "Period Ending : ", strftime("%c", localtime($end)), "\n";
print "Clam Version : ", `freshclam -V`;
print "\n";
printf "Reporting Period : %.2f hrs\n", $hrsinperiod;
print "--------------------------------------------------\n";
print "\n";
printf "Total emails rejected : %8d (%6.2f%%)\n", $infectedcount+$problemscount, $spampercent || 0;
printf " Problems : %8d (%6.2f%%)\n", $problemscount, $problemspercent || 0;
printf " Quarantined : %8d (%6.2f%%)\n", $infectedcount, $infectedpercent || 0;
printf "Total emails accepted : %8d (%6.2f%%)\n", $hamcount, $hampercent || 0;
print " -------------------\n";
printf "Total emails processed: %8d (%5.f/hr)\n", $total, $emailperhour || 0;
print "\n";
show_virus_variants();
print "Statistics by Hour\n";
print "--------------------------------------\n";
print "Hour rejected accepted\n";
print "------------- -------- --------\n";
my $hour = floor($start/3600);
while ($hour < $end/3600) {
printf("%s %6d %6d\n",
strftime("%F, %H", localtime($hour*3600)),
$infectedbyhour{$hour} || 0, $hambyhour{$hour} || 0);
$hour++;
}
print "\n";
} # not disabled
Delete_Junkmail();
if (!$disabled) {
print "\nDone. Report generated in $telapsed sec.\n\n";
#Close Sendmail if it was opened
if ($opt{'mail'}) {
select $oldfh;
close (SENDMAIL);
}
}
#All done
exit 0;
#############################################################################
# Subroutines ###############################################################
#############################################################################
sub show_virus_variants
{
my ($month, $day) = UnixDate("yesterday", "%b", "%e");
my $mydate = "$month $day";
# print($mydate);
my $command = 'cat /var/log/amavis-ng/amavis-ng.log | grep "' . $mydate . '" | grep -B 2 "Quarantining infected" | grep -v "AMAVIS" | grep -v "Quarantin" | grep -v !\-\-! | sed "s/^.*://" | sort | uniq -c | sort -r -g > /var/tmp/sme-antivirus.out';
system ($command);
#Open log file
my $LOG;
if (open(LOG, "/var/tmp/sme-antivirus.out")) {
print("Virus Statistics by name:\n");
print("---------------------------------------------\n");
LINE: while (my $line=<LOG>) {
if (not $line=~ m/--/) {
print("Rejected " .$line);
}
}
close(LOG);
print("\n");
system("rm -rf /var/tmp/sme-antivirus.out");
}
}
sub show_last_freshclam_update
{
my $updatefile = '/usr/share/clamav/last_update';
if(-f "$updatefile")
{
if(open(FILE, "$updatefile"))
{
$lastupdate = <FILE>;
if($lastupdate =~ /([A-Za-z0-9\:\,\-\_\+\s]+)/) # A-z would also match [ \ ] ^ ` between Z and a
{
$lastupdate = $1;
}
else
{
$lastupdate = 'unknown';
}
close(FILE);
}
else
{
warn "Cannot open last_update file (Permission problems?)";
}
}
# print ("Clam Database last updated:\t" . $lastupdate);
return '';
}
########################################
# Process parms #
########################################
sub parse_arg {
my $startdate = shift;
my $enddate = shift;
my $secsinday = 86400;
my $time = 0;
my $start = UnixDate($startdate,"%s");
my $end = UnixDate($enddate, "%s");
if(!$start && !$end) {
$end = time;
$start = $end - $secsinday;
return ($start, $end);
}
if(!$start) {
$start = $end - $secsinday;
return ($start, $end);
}
if(!$end) {
$end = $start + $secsinday;
return ($start, $end);
}
if($start > $end) {
return ($end, $start);
}
return ($start, $end);
}
sub dbg {
my $msg = shift;
if ($opt{debug}) {
print STDERR $msg;
}
}
sub Delete_Junkmail {
my $problems_dir = '/var/spool/amavis-ng/problems';
my $quarantine_dir = '/var/spool/amavis-ng/quarantine';
my $deleted_problems = 0;
my $found_problems = 0;
my $deleted_quarantine = 0;
my $found_quarantine = 0;
my $entry;
my $report;
# Standardise the format of the directory name
die 'Path for quarantine_dir must be absolute' unless $quarantine_dir =~ /^\//;
$quarantine_dir =~ s/\/$//; # Delete trailing slash
# Now get the content list for the directory.
opendir(QDIR, $quarantine_dir) or die "Couldn't read directory $quarantine_dir";
# Loop through this list looking for any *directory* which hasn't been
# modified in the last $days_to_keep days.
# Unfortunately this will do nothing if the filesystem is backed up using tar.
while($entry = readdir(QDIR)) {
next if $entry =~ /^\./;
$entry = $quarantine_dir . '/' . $entry;
if (-f $entry) {
$found_quarantine++;
}
if (-f $entry && (-M $entry > $days_to_keep) && $days_to_keep > 0) {
$deleted_quarantine++;
$found_quarantine--;
system("rm -f $entry");
}
}
closedir(QDIR);
$found_quarantine = $found_quarantine/2;
if (!$disabled) {
print "Deleted $deleted_quarantine old Quarantined email(s) - $found_quarantine email(s) left in Quarantine folder\n";
}
# Standardise the format of the directory name
die 'Path for problems_dir must be absolute' unless $problems_dir =~ /^\//;
$problems_dir =~ s/\/$//; # Delete trailing slash
# Now get the content list for the directory.
opendir(QDIR, $problems_dir) or die "Couldn't read directory $problems_dir";
# Loop through this list looking for any *directory* which hasn't been
# modified in the last $days_to_keep days.
# Unfortunately this will do nothing if the filesystem is backed up using tar.
while($entry = readdir(QDIR)) {
next if $entry =~ /^\./;
$entry = $problems_dir . '/' . $entry;
if (-f $entry) {
$found_problems++;
}
if (-f $entry && (-M $entry > $days_to_keep) && $days_to_keep > 0) {
$deleted_problems++;
$found_problems--;
system("rm -f $entry");
}
}
closedir(QDIR);
$found_problems = $found_problems/2;
if (!$disabled) {
print "Deleted $deleted_problems old Problem email(s) - $found_problems email(s) left in Problems folder\n";
}
}
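The show_virus_variants routine in the script above builds its "by name" table with a sort | uniq -c | sort pipeline. As a rough illustration of just that counting step, here is a small sketch. The helper name tally_names is made up, the sample names in the comment are placeholders rather than real scanner output, and it uses sort -rn where the script uses sort -r -g.

```shell
#!/bin/sh
# Sketch: count how often each line occurs on standard input and print
# "COUNT NAME", most frequent first.
# tally_names is a hypothetical helper name.
tally_names() {
    # uniq -c pads the count with leading blanks; awk tidies that up
    sort | uniq -c | sort -rn |
        awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n, $0 }'
}

# Example with made-up names:
# printf '%s\n' Alpha Beta Alpha | tally_names
```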
-
Ray
ok, I am making some progress with this, but please don't hold your breath!!
I had already done some work for Jesper on these scripts, so I understand them a bit already.
It only needs to analyse the /var/log/qpsmtpd/current as the /var/log/maillog does not seem to get much info about individual mails anymore.
I am thinking of merging the two so that the analysis covers Virus and Spam info in one table. Also the junkmail deletion is already done in SME7, so I thought it would only need to give a count of junkmails left in the folders.
I am only running a "minimal" SME7 test mail server at the moment, and it does not receive any spam or viruses, so could someone email me, or give me access to, a /var/log/qpsmtpd/current which has some spam and virus emails detected?
All suggestions and comments gratefully received.
-
I have got a new script for people to try, the announcement is here:
http://forums.contribs.org/index.php?topic=31357.0
I am still looking for some logs to test it on.