Koozali.org: home of the SME Server

archiving e-mail: how to do

Offline Jáder

  • *
  • 1,099
  • +0/-0
    • LinuxFacil
archiving e-mail: how to do
« on: January 08, 2023, 10:44:10 AM »
Hi

TARGET: I'd like to move all e-mail older than a given date (5+ years) into a mailbox file and store it outside the server.

I have one SME9 server (I know it is old and unsupported, but that is exactly why I need archivemail!) holding more than 200GB of e-mail.
I'd like to archive all e-mail older than 5 years OUTSIDE the server.

So I read https://forums.koozali.org/index.php/topic,53584.msg278409.html#msg278409
and went to try.
That is how I discovered that the archivemail utility does not scan subdirectories recursively, as you can see below:

[root@andorinha users]# ./archivemail -n  -d 1 escravo/Maildir/.Trash/
escravo/Maildir/.Trash:
    I would have archived 87 of 87 message(s) (15.3MB of 15.3MB) in 1.4 seconds

[root@andorinha users]# ./archivemail -n  -d 1 escravo/Maildir/.Sent/
escravo/Maildir/.Sent:
    I would have archived 15 of 15 message(s) (5.0kB of 5.0kB) in 0.4 seconds

[root@andorinha users]# ./archivemail -n  -d 1 escravo/Maildir
escravo/Maildir:
    I would have archived 7 of 7 message(s) (35.0kB of 35.0kB) in 1.8 seconds



When I scan the Maildir (the main source), I'd like it to archive all e-mail recursively.
I'm aware I could write a script that scans each directory and uses the directory name as the suffix of the stored archive, but then I'd have several (too many!) archives for one account!

Are there any other ways to do this?

Regards,

Jáder
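
Since archivemail only processes the folder it is given, one way to cover a whole account is to call it once per Maildir folder. This is a minimal sketch under assumptions, not a tested procedure: the function names and the ARCHIVEMAIL variable are made up for illustration, and only the -n and -d options shown in the post are used. It still produces one archive per folder, which is the drawback already noted above.

```shell
#!/bin/bash
# Sketch: run archivemail once per Maildir folder, since it does not recurse.
# ARCHIVEMAIL is a hypothetical override so a different binary (e.g. ./archivemail) can be used.
ARCHIVEMAIL=${ARCHIVEMAIL:-archivemail}

# Print the top-level Maildir and every subfolder that contains a cur/ directory.
list_maildir_folders() {
    local maildir=${1%/}
    find "$maildir" -type d -name cur | sed 's|/cur$||' | sort
}

# Dry-run archivemail (-n) on every folder, for messages older than $2 days.
archive_all_folders() {
    local maildir=$1 days=$2 folder
    list_maildir_folders "$maildir" | while IFS= read -r folder; do
        $ARCHIVEMAIL -n -d "$days" "$folder"
    done
}

# Example (dry run only):
#   archive_all_folders escravo/Maildir 1825
```

Dropping -n would make the run real, so it is worth checking the dry-run output first.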
...

Offline Jean-Philippe Pialasse

  • *
  • 2,743
  • +11/-0
  • aka Unnilennium
    • http://smeserver.pialasse.com
Re: archiving e-mail: how to do
« Reply #1 on: January 08, 2023, 04:33:08 PM »
using find?

Offline Jáder

Re: archiving e-mail: how to do
« Reply #2 on: January 08, 2023, 05:21:42 PM »
But wouldn't using find generate one file for each iteration? One for Inbox, one for Sent, plus one for each existing folder?
Can you explain your suggestion?
...

Offline Jean-Philippe Pialasse

Re: archiving e-mail: how to do
« Reply #3 on: January 08, 2023, 08:53:22 PM »
Use find to mv all the files older than x days into a similar tree structure under a new path.
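
A rough sketch of that idea, under assumptions: move_old_mail is a hypothetical helper name, age is taken from the file modification time (not the message Date header), and only files under cur/ and new/ are touched so that index files stay in place.

```shell
#!/bin/bash
# Sketch: move files older than N days from a Maildir into a mirrored tree.
# move_old_mail is a hypothetical helper name, not an existing tool.
move_old_mail() {
    local src=${1%/} dest=$2 days=$3 file rel
    # -print0 with read -d '' keeps unusual filenames (spaces, newlines) intact.
    find "$src" -type f -mtime +"$days" \( -path '*/cur/*' -o -path '*/new/*' \) -print0 |
    while IFS= read -r -d '' file; do
        rel=${file#"$src"/}                  # path relative to the source Maildir
        mkdir -p "$dest/$(dirname "$rel")"   # recreate the folder under the new path
        mv "$file" "$dest/$rel"
    done
}

# Example:
#   move_old_mail /home/e-smith/files/users/someuser/Maildir /mnt/archive/someuser 1825
```

Replacing mv with cp -p would give a copy instead of a move, which is safer for a first run.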

Offline Jáder

Re: archiving e-mail: how to do
« Reply #4 on: January 09, 2023, 12:29:06 PM »
That was already done by Charlie:  https://forums.koozali.org/index.php/topic,31684.msg133393.html#msg133393

Quote
You can script removal of old read mail using something like:

cd ~user
find Maildir -type f  -mtime +100 -name \*R\*  | xargs rm -f

If we want to copy rather than delete, change the "xargs rm -f" bit of the script.

If you can't write the script your self, pay someone to do it. Remember that most "free" software is written by people who write code for a living.

I even found another script (not sure where in the forums), and there are other solutions:
1) A script/tool to convert from Maildir to mbox (one big file for each folder),
like https://github.com/bluebird75/maildir2mbox/blob/master/maildir2mbox.py

2) Use MailPiler to import (and later delete): mailpiler.org

3) Use a Thunderbird extension to export to any of the supported formats, such as:
https://github.com/thundernest/import-export-tools-ng

I was just wondering how older/smarter guys (you!) make this happen.
Do you delete e-mails older than XX, or archive using .PST, mbox, or a TB extension,
or do you have a more enterprise solution using MailPiler?

I'm seriously thinking about MailPiler, but it has its own problems/limitations (everyone/everything has!).
My problem is that right now I have more than 10 accounts with 40GB+ of e-mail.

Let me show you a test made over the weekend on the Sent folder of one of those e-mail accounts:


[root@xxxxxxx files]# archivemail --dry-run --copy --date=2018-01-01 -o /home/e-smith/files/ArquivamentoEmails --suffix=before2018 /home/e-smith/files/users/axxxxxo2/Maildir/.Sent/
archivemail: Warning - changing effective user id: this automatic feature is deprecated and will be removed from later versions.
/home/e-smith/files/users/axxxxxxo2/Maildir/.Sent:
    I would have archived 15471 of 55560 message(s) (6161.5MB of 26952.7MB) in 1078.1 seconds



Regards,

Jáder
« Last Edit: January 09, 2023, 12:32:33 PM by Jáder »
...

Offline Stefano

  • *
  • 10,836
  • +2/-0
Re: archiving e-mail: how to do
« Reply #5 on: January 10, 2023, 12:33:30 PM »
Another alternative is using imapsync.

Just set up a user with isAdmin in dovecot.

This is a script I'm using on an old SME9:

Code:
# Loop over every user account in the accounts db, skipping the mailarchive user itself
echo Looping on account credentials found in users db
echo
for i in $( /sbin/e-smith/db accounts show | grep '=user' | grep -v mailarchive | sed 's/=user$//' ); do

        echo "==== Starting imapsync from host1 $i ===="
        # Log in to the source mailbox as "$i*mailarchive" and copy mail newer than
        # 15 days (--maxage 15) into the mailarchive account, under a per-user subfolder
        imapsync --host1 localhost --user1 "$i*mailarchive" --password1 "M@ilarch1ve" \
                 --host2 localhost --user2 "mailarchive" --password2 "M@ilarch1ve" \
                 --nofoldersizes --nofoldersizesatend --no-modules_version \
                 --subfolder2 "$i" --prefix2 "Archive." --subscribeall --skipemptyfolders "$@" --maxage 15 > /dev/null
        echo "==== Ended imapsync from host1  ===="

        echo
done


/bin/find /home/e-smith/files/users/mailarchive/Maildir/ -type f  -mtime +30 -exec rm {} \;

Offline Jean-Philippe Pialasse

Re: archiving e-mail: how to do
« Reply #6 on: January 10, 2023, 07:16:56 PM »
You guys are about to write a whole wiki page on the subject !

Offline ReetP

  • *
  • 3,722
  • +5/-0
Re: archiving e-mail: how to do
« Reply #7 on: January 10, 2023, 08:48:56 PM »
Quote
You guys are about to write a whole wiki page on the subject !

:lol:

It would be excellent.

I'd love a cron to run every month and archive ours!

I keep them separated by year and Inbox/Sent eg

2019
-> Inbox
-> Sent

2020
-> Inbox
-> Sent

I usually do it manually once or twice a year. I keep meaning to script it with find blah -exec mv {} \; type thing.
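
The "find blah -exec mv" idea for a per-year layout could look roughly like this. It is only a sketch: archive_year and its arguments are hypothetical names, the year is taken from the file modification time rather than the message Date header, and -newermt needs a find that supports it (GNU find does).

```shell
#!/bin/bash
# Sketch: move one folder's messages from a given year into <dest>/<year>/<name>.
# archive_year and its arguments are hypothetical names for illustration.
archive_year() {
    local folder=${1%/} dest=$2 year=$3 name=$4 sub
    for sub in cur new; do
        [ -d "$folder/$sub" ] || continue
        mkdir -p "$dest/$year/$name/$sub"
        # Files modified after the start of $year but not after the start of the next year.
        find "$folder/$sub" -type f \
            -newermt "$year-01-01 00:00:00" ! -newermt "$((year + 1))-01-01 00:00:00" \
            -exec mv {} "$dest/$year/$name/$sub/" \;
    done
}

# Example:
#   archive_year /home/e-smith/files/users/someuser/Maildir /mnt/archive 2019 Inbox
#   archive_year /home/e-smith/files/users/someuser/Maildir/.Sent /mnt/archive 2019 Sent
```

Called once per year and folder from a monthly cron job, this would give the 2019/Inbox, 2019/Sent layout described above.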

...
1. Read the Manual
2. Read the Wiki
3. Don't ask for support on Unsupported versions of software
4. I have a job, wife, and kids and do this in my spare time. If you want something fixed, please help.

Bugs are easier than you think: http://wiki.contribs.org/Bugzilla_Help

If you love SME and don't want to lose it, join in: http://wiki.contribs.org/Koozali_Foundation

Offline Jáder

Re: archiving e-mail: how to do
« Reply #8 on: January 15, 2023, 10:15:11 AM »
I've taught my users to create folders in Thunderbird for years, so now I have dozens of subfolders.
I've created a bash script to search each of them and process them year by year.

I think it would be easier to make a single file (mbox format?) for mail older than 5 years (before2018) and one for each later year (2018, 2019, 2020, 2021, 2022).
Or just archive using MailPiler.
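
One possible way to get a "before2018" file plus one file per later year is to run archivemail repeatedly with increasing cutoff dates. This is a sketch under assumptions: print_yearly_commands is a hypothetical helper that only prints commands, it uses the --date, -o and --suffix options seen earlier in the thread, and it assumes each run (without --copy) removes what it archives, so the next run only picks up the following year.

```shell
#!/bin/bash
# Sketch: print one archivemail command per cutoff date, oldest first.
# print_yearly_commands is a hypothetical helper; it only prints, it runs nothing.
print_yearly_commands() {
    local folder=$1 outdir=$2 first=$3 last=$4 year
    # Everything before the first year goes into a single "before<first>" archive.
    printf 'archivemail --date=%s-01-01 -o %s --suffix=before%s %s\n' \
        "$first" "$outdir" "$first" "$folder"
    # Each later run archives what is left below the next January 1st, i.e. one year.
    for ((year = first; year <= last; year++)); do
        printf 'archivemail --date=%s-01-01 -o %s --suffix=%s %s\n' \
            "$((year + 1))" "$outdir" "$year" "$folder"
    done
}

# Example:
#   print_yearly_commands /home/e-smith/files/users/someuser/Maildir/.Sent \
#       /home/e-smith/files/ArquivamentoEmails 2018 2022
```

The commands have to be run in the printed order, and adding --dry-run to each is a sensible first pass.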
...

Offline ReetP

Re: archiving e-mail: how to do
« Reply #9 on: January 15, 2023, 12:20:59 PM »
Quote
I've taught my users to create folders in Thunderbird for years, so now I have dozens of subfolders.

Yeah, mine have the same. But they don't archive by themselves.

Quote
I've created a bash script to search each of them and process them year by year.

Yup - I need to do something similar.

Quote
I think it would be easier to make a single file (mbox format?) for mail older than 5 years (before2018) and one for each later year (2018, 2019, 2020, 2021, 2022).

I'm happy if they just go to the respective years - they can still search/read them (we still have really slow internet at one office so have to account for that)

Quote
Or just archive using MailPiler.

Seems to duplicate every mail? Not sure I need that.

imapsync seems OK, but it has to duplicate before it deletes, so you need some space first, and I am not sure it will use my preferred directory layout.

If I ever get time I might sit down and write myself a bit of perl or bash to do it the way I want it! Time - that is the issue :-)




Offline willdoicu

  • 4
  • +0/-0
Re: archiving e-mail: how to do
« Reply #10 on: January 17, 2023, 11:06:56 AM »
I will share my script. Create a directory called clean_older_mails in /root, or modify the paths as you wish (e.g. remove the 1GB limit when finding users). The daysbefore variable is the number of days of mail you want to leave in place. After it runs, you will have an archive that you can copy anywhere else. I copy the archive to a mounted share just before the "# remove files stored in filestoberemoved" step.

#!/bin/bash
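# begin, remember start time, show date
start=$(date +%s)
echo "start clean older mails"
date
# stop service qpsmtpd
sv d /service/qpsmtpd
echo "service qpsmtpd stopped"
# number of days of mail to keep; older files are archived and removed
daysbefore=60
# remember date
date=$(date +%y-%m-%d-%H-%M-%S)
# remove any files left over from a previous run (absolute paths, so a stale
# filestoberemoved.txt is cleared no matter where the script is started from)
rm -f /root/clean_older_mails/dirs.txt /root/clean_older_mails/filestoberemoved.txt /root/clean_older_mails/users.txt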
# change directory to users
cd /home/e-smith/files/users/
# find dirs greater than 1000 MB into text
du -sm * | awk '$1 > 1000' >user1.txt
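# remove column with sizes and the maillog user (if it exists) from user1.txt into the final list users.txt
# (two separate commands: piping cut into sed could let sed read user2.txt before it is written)
cut -f2 user1.txt > user2.txt
sed -n '/maillog/!p' user2.txt > users.txt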

rm -f user1.txt
rm -f user2.txt

# move users.txt into working directory
cp users.txt /root/clean_older_mails/
rm -f users.txt

# go to working directory
cd /root/clean_older_mails/

# begin loop of users
while read line;
do
user="$line";
# go to user dir
cd /home/e-smith/files/users/$user/Maildir;
# find and list hidden directories into dirs.tmp.txt
ls -1d .* > dirs.tmp.txt;
# move dirs.tmp.txt to default working directory
mv dirs.tmp.txt /root/clean_older_mails/;
# change back to default working directory
cd /root/clean_older_mails/;
# remove unnecessary lines (dots escaped so they match literally)
sed -n '/\.Junk E-mail/!p' dirs.tmp.txt > dirs.tmp2.txt;
sed -n '/\.txt/!p' dirs.tmp2.txt > dirs.tmp3.txt;
# remove lines "." and ".."
sed '1,2d' dirs.tmp3.txt > dirs.txt;
# remove temporary files
rm -f dirs.tmp.txt dirs.tmp2.txt dirs.tmp3.txt;
# find files older than $daysbefore to be removed within hidden directories
while read line; do find /home/e-smith/files/users/$user/Maildir/"$line"/cur -type f -mtime +$daysbefore >>filestoberemoved.txt; done <dirs.txt;
while read line; do find /home/e-smith/files/users/$user/Maildir/"$line"/new -type f -mtime +$daysbefore >>filestoberemoved.txt; done <dirs.txt;
# find files older than $daysbefore to be removed within the top-level cur and new directories
find /home/e-smith/files/users/$user/Maildir/cur -type f -mtime +$daysbefore >>filestoberemoved.txt;
find /home/e-smith/files/users/$user/Maildir/new -type f -mtime +$daysbefore >>filestoberemoved.txt;
# end loop of users
done < users.txt
number=$(cat filestoberemoved.txt | wc -l)
echo "number of files to be removed = $number"
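# archive files for backup; the originals are only deleted if tar succeeded
if tar -czf "$date.all_cleaned_users.tar.gz" -T filestoberemoved.txt; then
date
# remove files stored in filestoberemoved
while read -r line; do rm -f "$line"; done <filestoberemoved.txt
else
echo "tar failed, no files were removed"
fi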
# remove temporary used files
rm -f dirs.txt filestoberemoved.txt users.txt
end=$(date +%s)   
runtime=$(printf '%u:%02u' $(( (end - start) / 60 )) $(( (end - start) % 60 )))
echo "end clean older mails"
echo "Runtime was $runtime"
date
# start service
sv u /service/qpsmtpd
echo "service qpsmtpd started"
exit 0

Offline ReetP

Re: archiving e-mail: how to do
« Reply #11 on: January 17, 2023, 11:47:28 AM »
Quote
I will share my script

The great thing about open source :-)

Quote
# stop service qpsmtpd
sv d /service/qpsmtpd
echo "service qpsmtpd stopped"

Note that we use systemd on v10.

e.g.

Code:
systemctl status qpsmtpd

Also, if you want to play with Perl you can use this:

https://wiki.koozali.org/Systemd-control
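
For a script that has to run on both versions, the stop/start step could be wrapped along these lines. A sketch only: service_cmd is a hypothetical helper that prints the command it would use, the sv lines are the ones from the script above, and it assumes that the presence of systemctl means a systemd host.

```shell
#!/bin/bash
# Sketch: choose the service command depending on what the host provides.
# service_cmd is a hypothetical helper; it prints the command instead of running it.
service_cmd() {
    local action=$1 service=$2
    if command -v systemctl >/dev/null 2>&1; then
        echo "systemctl $action $service"              # systemd host, e.g. v10
    else
        case $action in
            stop)  echo "sv d /service/$service" ;;    # as used in the SME9 script
            start) echo "sv u /service/$service" ;;
            *)     echo "unsupported action: $action" >&2; return 1 ;;
        esac
    fi
}

# Example:
#   $(service_cmd stop qpsmtpd)    # runs the printed command
```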

