Koozali.org: home of the SME Server

do_backupwk Lack of memory to achieve the operation

Mirfster
« on: February 10, 2016, 03:06:27 PM »
Hello all,

I have a weird situation where backups have started to fail.  The system is running an older version of SME (8.2) and will eventually be updated.  For now they want to hold off, for unknown reasons...

Anyways, for the past couple of days backups have failed with the following messages:
Quote
Subject: Cron <root@omegasme> /sbin/e-smith/do_backupwk
Lack of memory to achieve the operation, aborting operation
Failed to add set /mnt/smb/omegasme.omega-mep.local/set2/inc-021-201602032303 to catalog.
Backup terminated: backup failed - status: 7424

Quote
Subject: Daily Backup Failed: omegasme.omega-mep.local
==================================
DAILY BACKUP TO WORKSTATION REPORT
==================================
Backup of omegasme.omega-mep.local started at Wed Feb 10 07:57:02 2016
Backup of mysql databases has been done
Mounting backup shared directory <10.20.100.164:SMEBackupsCIFS>
Using set number 2 of 12
Attempt incremental backup number 21 of 30
Backup temp directory /mnt/smb/tmp_dir/omegasme.omega-mep.local is mounted and is writable
Backup base file name is inc-021-201602100757
Starting the backup with a timeout of : 82770 seconds


 --------------------------------------------
 1701 inode(s) saved
   including 0 hard link(s) treated
 0 inode(s) changed at the moment of the backup and could not be saved properly
 0 byte(s) have been wasted in the archive to resave changing files
 424017 inode(s) not saved (no inode/file change)
 0 inode(s) failed to be saved (filesystem error)
 254 inode(s) ignored (excluded by filters)
 32 inode(s) recorded as deleted from reference backup
 --------------------------------------------
 Total number of inode(s) considered: 426004
 --------------------------------------------
 EA saved for 16 inode(s)
 --------------------------------------------
Moving backup files to target directory /mnt/smb/omegasme.omega-mep.local/set2
Updating catalog
*** No backup allowed or error during backup ***
Failed to add set /mnt/smb/omegasme.omega-mep.local/set2/inc-021-201602032303 to catalog.

Last successful backup occurred on 02/03/2016:
Quote
--------------------------------------------
Moving backup files to target directory /mnt/smb/omegasme.omega-mep.local/set2
Updating catalog
Disk usage 1.2T, 17% full, 5.9T available
Updating backup configuration data
Backup successfully terminated at : Wed Feb  3 00:12:49 2016

There is plenty of available space.  All available updates were applied last night, and the system was reconfigured and restarted.

I did notice this message on 02/03/2016 @ 04:05am:
Quote
Subject: Cron <root@omegasme> sleep $[ $RANDOM % 3600 ]; /sbin/e-smith/check4contribsupdates -m
Existing lock /var/run/yum.pid: another copy is running as pid 9649.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  12 M RSS ( 22 MB VSZ)
    Started: Wed Feb  3 04:04:53 2016 - 00:02 ago
    State  : Running, pid: 9649
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  13 M RSS ( 53 MB VSZ)
    Started: Wed Feb  3 04:04:53 2016 - 00:04 ago
    State  : Uninteruptable, pid: 9649
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  14 M RSS ( 53 MB VSZ)
    Started: Wed Feb  3 04:04:53 2016 - 00:06 ago
    State  : Sleeping, pid: 9649
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  22 M RSS ( 63 MB VSZ)
    Started: Wed Feb  3 04:04:53 2016 - 00:08 ago
    State  : Running, pid: 9649
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  23 M RSS ( 64 MB VSZ)
    Started: Wed Feb  3 04:04:53 2016 - 00:10 ago
    State  : Running, pid: 9649
Existing lock /var/run/yum.pid: another copy is running as pid 9677.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  13 M RSS ( 22 MB VSZ)
    Started: Wed Feb  3 04:05:05 2016 - 00:00 ago
    State  : Sleeping, pid: 9677
Existing lock /var/run/yum.pid: another copy is running as pid 9769.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  17 M RSS ( 26 MB VSZ)
    Started: Wed Feb  3 04:05:07 2016 - 00:00 ago
    State  : Running, pid: 9769
Existing lock /var/run/yum.pid: another copy is running as pid 9790.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory :  23 M RSS ( 63 MB VSZ)
    Started: Wed Feb  3 04:05:08 2016 - 00:01 ago
    State  : Running, pid: 9790

Lastly, I did check system memory (but may have to see how much RAM was in there originally, which may be the issue):
Code: [Select]
[root@omegasme ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          4040        804       3236          0         29        254
-/+ buffers/cache:        519       3521
Swap:         5023          0       5023

Please let me know if there is any additional info needed.

Thanks.
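
For what it's worth, the `status: 7424` in the first cron mail looks like a raw wait status rather than a plain exit code. A minimal sketch of how to decode it, assuming do_backupwk prints the value that Perl's system() returned (an assumption, not checked against the script). `decode_wait_status` is just an illustrative helper name:

```shell
# decode_wait_status: split a raw wait status into exit code and signal.
# Assumption: the number is a wait status (exit code in the high byte,
# terminating signal in the low 7 bits), as Perl's system() returns.
decode_wait_status() {
    echo "exit_code=$(( ($1 >> 8) & 255 )) signal=$(( $1 & 127 ))"
}

decode_wait_status 7424   # prints: exit_code=29 signal=0
```

If that assumption holds, the child process exited by itself with a non-zero code and was not killed by a signal.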

DanB35
« Reply #1 on: February 10, 2016, 03:13:38 PM »
If the issue truly is one of running out of memory (i.e., RAM), a temporary, stopgap measure could be to add a swap file.  You could do that like this:
Code: [Select]
# dd if=/dev/zero of=/home/swap bs=1M count=4k
# mkswap /home/swap
# swapon /home/swap

This will add 4 GB of swap space, which I'd think should be plenty.
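
To confirm the swap file is really in use, compare the `Swap:` total before and after `swapon`. A minimal sketch: `swap_total_mb` is a made-up helper name, and it only assumes that `free -m` prints a `Swap:` line like the one in the output posted above.

```shell
# swap_total_mb: read `free -m` output on stdin and print the Swap total in MB.
# (Hypothetical helper name, not an SME Server command.)
swap_total_mb() {
    awk '$1 == "Swap:" { print $2 }'
}

# Sample copied from the `free -m` output earlier in this thread,
# taken before the swap file was added.
sample='             total       used       free     shared    buffers     cached
Mem:          4040        804       3236          0         29        254
-/+ buffers/cache:        519       3521
Swap:         5023          0       5023'

before=$(printf '%s\n' "$sample" | swap_total_mb)
echo "swap before: ${before} MB"

# On the live box:  free -m | swap_total_mb
# After `swapon /home/swap` the total should be roughly 4096 MB higher.
```

Note that `swapon` on its own does not survive a reboot, which is fine for a stopgap. `swapoff /home/swap` followed by `rm /home/swap` undoes it.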

Mirfster
« Reply #2 on: February 10, 2016, 03:50:37 PM »
I can try that.  Would adding more system memory be a good route as well?

DanB35
« Reply #3 on: February 10, 2016, 03:53:29 PM »
If I'm reading the output of free correctly, you already have 4 GB of RAM.  This isn't a huge amount, but it should generally be adequate, so I'm a little skeptical that the issue is actually running out of RAM.  Expanding swap is an easy and free way to determine if that's actually the issue.  If that fixes it, adding RAM would be a good idea.

Are you backing up to your FreeNAS box?

Mirfster
« Reply #4 on: February 10, 2016, 04:14:55 PM »
Not at the moment. :sad:... 

The FreeNAS box is actually a "loaner" from me.  They are moving offices in November and want to wait until then to implement my design, which will include a dedicated FreeNAS server, off-site backups of FreeNAS (rsync) and SME (affa to a VM or physical).  Of course, I have warned them, but it seems to fall on deaf ears.

For now, I will try out expanding swap and see how things go.  I don't believe a system restart is needed, but if so please let me know and I can at least get that scheduled.

Thanks.

Edited: I misread the question.  Yes, backups are going to a FreeNAS server via CIFS.
« Last Edit: February 10, 2016, 04:17:44 PM by Mirfster »

DanB35
« Reply #5 on: February 10, 2016, 04:19:46 PM »
No, a system restart shouldn't be needed after adding swap.  FWIW, I had some trouble backing up to my FreeNAS box via CIFS, but it's been fine using NFS.  Nothing dealing with memory though.

Mirfster
« Reply #6 on: February 10, 2016, 04:26:58 PM »
Okay, done:
Quote
[root@omegasme ~]# dd if=/dev/zero of=/home/swap bs=1M count=4k
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 33.3002 seconds, 129 MB/s
[root@omegasme ~]# mkswap /home/swap
Setting up swapspace version 1, size = 4294963 kB
[root@omegasme ~]# swapon /home/swap
[root@omegasme ~]#

I will post back later with any results.

Yeah, NFS is for sure on the table as a possible replacement for CIFS.

Thanks for your quick assistance
« Last Edit: February 10, 2016, 04:31:00 PM by Mirfster »

Mirfster
« Reply #7 on: February 10, 2016, 04:33:50 PM »
Lol, I was wondering why you asked about FreeNAS, then I just realized that you're over on the FreeNAS forums as well....  Small world.   :-D

warren
« Reply #8 on: February 10, 2016, 05:16:01 PM »
Are you running 32-bit or 64-bit SME 8?

What are your backup settings:
Number of rotating backup sets?
Number of daily backups contained in each set?

I have seen the same issue on an SME 8.2 box with 8 GB of RAM (with number of sets = 2 and number of incrementals = 28). Watching memory use while dar was running, I found that with the very large amount of data being backed up (roughly 1 TB), dar would exceed the 2 GB memory limit per process.

I eventually upgraded to SME 9 64-bit and have not had any issues since.


Mirfster
« Reply #9 on: February 10, 2016, 07:18:26 PM »
Backups are configured as:
Code: [Select]
Number of rotating backup sets is 12
Number of daily backups contained in each set is 31
Compression level (0-9) of backup is 6
Daily backup occurs at 10:30
Each daily backup session is cleanly timed out after 23 hours.except full backups which are cleanly timed out after 24 hours
Full backup sessions (new backup set) are allowed everyday

They wanted to be able to go back a year...
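
A quick check of that retention arithmetic, as a sketch. It assumes each set is one full backup followed by incrementals, which is what the "incremental backup number 21 of 30" line in the report suggests for 31 dailies per set:

```shell
# Retention covered by the configured rotation.
sets=12              # number of rotating backup sets
dailies_per_set=31   # daily backups contained in each set

total=$(( sets * dailies_per_set ))
fulls=$sets                           # assumption: one full backup per set
incrementals=$(( total - fulls ))

echo "$total daily archives: $fulls full + $incrementals incremental"
# prints: 372 daily archives: 12 full + 360 incremental
```

So the schedule covers a little over a year of dailies, at the cost of a catalog that has to track up to 372 archives.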

Uname results:
Code: [Select]
[root@omegasme ~]# uname -i
i386
[root@omegasme ~]# uname -r
2.6.18-408.el5PAE
[root@omegasme ~]# uname -m
i686
[root@omegasme ~]# uname -p
i686
[root@omegasme ~]# uname -a
Linux omegasme 2.6.18-408.el5PAE #1 SMP Tue Jan 19 09:54:24 EST 2016 i686 i686 i386 GNU/Linux
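
The uname output says this is a 32-bit (i686) install on a PAE kernel. PAE lets the kernel address more than 4 GB of RAM, but each process still gets a 32-bit address space, so a single dar process cannot grow past a few GB no matter how much RAM or swap is added. A small sketch for checking this on any box. `arch_bits` is an illustrative helper name, not an SME Server command:

```shell
# arch_bits: map a `uname -m` machine name to 32 or 64.
arch_bits() {
    case "$1" in
        x86_64|amd64|aarch64) echo 64 ;;
        i?86)                 echo 32 ;;
        *)                    echo unknown ;;
    esac
}

arch_bits i686            # the machine name reported above, prints: 32
arch_bits "$(uname -m)"   # the box this is run on
```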

warren
« Reply #10 on: February 10, 2016, 08:18:42 PM »
So you're running 32-bit.

How much data are you backing up?

You can download and run dar_rqck.bash (http://dar.sourceforge.net/doc/samples/dar_rqck.bash), a bash script for Linux users that gives a rough estimate of the amount of virtual memory needed to save the whole system.

Try monitoring the virtual memory usage while dar is running.
Code: [Select]
ps aux | grep dar
On my system (x86_64, with a backup of 1 set and 30 incrementals), backing up 1 TB of data, I'm seeing dar use about 5748M:
Quote
root     18048  100 72.5 5886924 5757988 ?     RL   19:45  75:52 /usr/bin/dar_manager -Q -ai -B /media/usbBackup/keeper.xxxx.xx.xx/dar-catalog -A /media/usbBackup/keeper.xxxx.xx.xx/set1/inc-020-20160210180004
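
To watch that figure over time rather than eyeballing `ps aux`, something like the following could be left running during the backup. A sketch only: `vsz_mib` and `dar_vsz` are made-up helper names, and `dar_vsz` assumes a procps-style `ps` that accepts `-eo pid=,vsz=,comm=`.

```shell
# vsz_mib: convert a VSZ value in KiB (as printed by ps) to whole MiB.
vsz_mib() {
    echo $(( $1 / 1024 ))
}

# dar_vsz: print PID, VSZ in MiB and command name for every process whose
# name starts with "dar" (dar, dar_manager). Assumes procps `ps`.
dar_vsz() {
    ps -eo pid=,vsz=,comm= |
        awk '$3 ~ /^dar/ { printf "%s %d MiB %s\n", $1, $2 / 1024, $3 }'
}

vsz_mib 5886924   # VSZ column from the ps line quoted above, prints: 5748

# To sample once a minute while the backup runs:
#   while sleep 60; do date; dar_vsz; done
```

On a 32-bit box, a value that keeps climbing toward the per-process limit just before the "Lack of memory" error would support the address-space explanation.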


Mirfster
« Reply #11 on: February 11, 2016, 12:04:33 AM »
Sorry, had to step out for a while.  Will give this a try tonight.  Might see if I can just get them to expedite moving up to 9.x x64.  :)

DanB35
« Reply #12 on: February 11, 2016, 03:02:36 AM »
The problem is that you need a good backup to upgrade to SME 9, as there's no way to do an upgrade in place from 8 to 9.  Hopefully this backup runs OK.

Mirfster
« Reply #13 on: February 11, 2016, 03:07:32 PM »
Agreed.  Still going through logs to see what may have changed on 2/3, since that was the last successful backup.  Will post once I find out more.

warren
« Reply #14 on: February 11, 2016, 03:50:26 PM »
Quote
Number of rotating backup sets is 12
Number of daily backups contained in each set is 31
Compression level (0-9) of backup is 6
Daily backup occurs at 10:30
Each daily backup session is cleanly timed out after 23 hours.except full backups which are cleanly timed out after 24 hours
Full backup sessions (new backup set) are allowed everyday

Have the backups with this setup (i.e. 12 sets of 31 increments) ever reached set12, incremental 31?

Does set0 run successfully with all its 31 increments? Then set1 with its 31 increments?

From the documentation ( http://dar.linux.free.fr/doc/Limitations.html ), and if I understood it correctly, when doing differential / incremental backups dar will have 3 catalogues in memory, in your case [set0, set1, set2] ( ...dar starts with the catalogue of the archive of reference which is needed to know which files to save and which not to save, and in another hand, builds the catalogue of the new archive all along the process. Now, for catalogue extraction, the process is equivalent to making a differential backup just after a full backup. )

So it is able to handle the initial full backup (set0) and its 31 incrementals, then set1 and its 31 incrementals, but in set2 it only gets up to incremental 21 before it fails.

This is the same issue that I ran into: a full backup using sets = 1, increments = 30 would run fine, but the moment I tried using 2 sets, the backup would fail with "*** No backup allowed or error during backup *** Failed to add set xxxxxxxxxxxx to catalog" when it got to inc-018.


Quote
The problem is that you need a good backup to upgrade to SME 9, as there's no way to do an upgrade in place from 8 to 9.  Hopefully this backup runs OK.

That can be handled by running a backup schedule consisting of 1 set (i.e. a full backup) and, say, 3 incrementals.