Koozali.org: home of the SME Server

Obsolete Releases => SME Server 9.x => Topic started by: DanB35 on July 08, 2014, 11:24:41 PM

Title: Backup design/implementation
Post by: DanB35 on July 08, 2014, 11:24:41 PM
I spent some time debating with myself whether to post this, because I'm concerned that it will come across as simply complaining.  I haven't been able to think these issues through to the point of offering concrete suggestions, but I'm hopeful that raising them will get some discussion going toward implementing some improvements.  The necessity of backing up and restoring in order to upgrade to SME 9.0 makes these issues somewhat more timely, I believe.

I've been using SME Server since the days of e-smith 3.something.  Thankfully, in that time, I've never lost any data due to a hardware failure, even without running backups (or RAID for most of that time), but I know I'm on borrowed time in that regard.  Recently, I've had enough storage space available to start doing backups, and after a bit of a rocky start (addressed in other threads), I have daily backups running.  It appears, though, that the backup system in SME Server is an afterthought, at best.  There are several areas where it just doesn't seem to be particularly well thought through.  Here are some that I've noticed:

There are probably some other issues, but I'm sure I've complained more than enough already.  I understand that there's a historical reason for some of this--like that flexbackup was part of the original system, while dar was a contrib that was later rolled into the distro.  But right now, it results in a mish-mash of confusingly incompatible options with a further-confusing interface, where some of the functions just don't work at all (like NFS).  Here are some suggestions that I think would improve the situation:

I hope this is helpful, and can lead to some improvements in this area.
Title: Re: Backup design/implementation
Post by: ReetP on July 09, 2014, 01:20:30 AM
Your comments are not necessarily complaints but worthy criticism :-)

Helping to identify problem areas, and where we should be doing better, is very important. Constructive criticism is always a good thing.

Without wishing to teach you how to suck eggs, the old adage applies: if it doesn't work as expected, or as you want it, the first thing is to look on the bug tracker for backup-related bugs, both old and new (do an advanced search and include 'closed' bugs).

Some of these points have been raised before (I raised a few myself), so you may find your answer there, or they may need to be reinvestigated.

Go through them and if you see one of your points either reopen it, or create a new one. Don't try and bunch all of your points together. Each point is a separate bug even though they all come under backup/restore.

If in any doubt, ask on the dev mailing list at http://lists.contribs.org/mailman/listinfo/ for further guidance and someone will help.

At the end of the day, if you join in and give us a hand you are much more likely to get the system that you want :-) And we need all the help we can get.

B. Rgds
John
Title: Re: Backup design/implementation
Post by: DanB35 on July 09, 2014, 06:31:40 PM
I have no problem with filing one or more bugs (already done in the case of NFS, which is a clear bug IMO), but I was kind of hoping to spark some discussion to help me and others think through some of these issues first.  I'll need to ponder these some more and see what's already in bugzilla.
Title: Re: Backup design/implementation
Post by: Stefano on July 09, 2014, 08:09:46 PM
I agree..

we need to find a way to give the admin user:
- the possibility to restore (in an easy and fast way) a single file
- the possibility to create an offline full backup for disaster recovery
- the possibility to open a backup set using standard tools, in windows/whatever O.S.

the first thought that comes to my mind is that dar is not the right tool.. it has almost no client for windows, and is not "standard".. tar, instead, is widely used and known

we should/could/might think about backuppc.. this is my opinion and maybe this is not the right place to discuss..
Title: Re: Backup design/implementation
Post by: janet on July 09, 2014, 08:10:43 PM
DanB35

Some additional useful info is available under B for Backup ........
See
http://wiki.contribs.org/Category:Howto
Title: Re: Backup design/implementation
Post by: DanB35 on July 09, 2014, 08:49:46 PM
I don't feel that I know a whole lot about the various backend tools available.  Certainly tar is ubiquitous, but dar seems to support chunking in a fairly straightforward manner--nobody wants to deal with a 300 GB tarball, and many filesystems won't even support one, so the ability to split the backup into manageable-size files is important.  Dar also indexes the backup, unlike tar.  There are clients available for most Unix flavors and Windows, so I'm not sure the choice of dar as a backend is so problematic.
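To make the chunking point concrete, here is a minimal Python sketch of what splitting a tar archive into fixed-size slices involves. It is only an illustration: the function names (`make_tar`, `split_into_slices`, `join_slices`, `list_members`) are hypothetical, everything happens in memory, and it is not how dar or the SME Server scripts do it (dar produces slices natively). It also shows the indexing difference mentioned above: with plain tar, the slices have to be joined back into one stream before the contents can even be listed.

```python
import io
import tarfile


def make_tar(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed tar archive in memory from name -> content."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def split_into_slices(data: bytes, slice_size: int) -> list[bytes]:
    """Cut the archive into pieces of at most slice_size bytes each."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [data[i:i + slice_size] for i in range(0, len(data), slice_size)]


def join_slices(slices: list[bytes]) -> bytes:
    """Reassemble the slices, in order, into the original archive."""
    return b"".join(slices)


def list_members(archive: bytes) -> list[str]:
    """List member names; needs the whole joined archive, not one slice."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return tar.getnames()
```

In real use the slices would be written to separate files sized to suit the destination filesystem, and every slice would have to be present and in order before a restore could start.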

Obviously, the critical capability of a backup is the ability to restore the data, completely, reliably, and easily--otherwise we might as well write the backup to /dev/null.  As near as I can tell, we have that ability today.  Selective file/directory restore is, IMO, a step below this in importance, but it's a small step, and the current system allows for this, albeit with a cumbersome UI (enter a regex for what you want to restore?).  I hadn't tried this before, which is why I didn't address it in my post.

Disaster recovery would be very nice to have.  Early versions of e-smith had the option to create a reinstall floppy that would set the basic configuration and then (IIRC) restore a backup.  At least, that was the idea--I don't believe it ever worked well, if at all.  It would seem like the basics of this idea are present already--you can do a basic installation, reboot, and then restore from a backup.  The problem here is that this only works if you did the right kind of backup.  Otherwise, you have to go through a fair bit of server setup and configuration before you restore your backup (that replaces all that configuration).

One good thing to come out of this is that by the time I have SME 9.0 up and running, I will have validated my backup/restore process.
Title: Re: Backup design/implementation
Post by: DanB35 on July 09, 2014, 10:19:10 PM
Bugs 8490 and 8491 submitted.
Title: Re: Backup design/implementation
Post by: brianr on July 09, 2014, 11:01:26 PM
http://bugs.contribs.org/show_bug.cgi?id=8490

http://bugs.contribs.org/show_bug.cgi?id=8491

Title: Re: Backup design/implementation
Post by: CharlieBrady on July 10, 2014, 07:10:38 PM
Quote
Disaster recovery would be very nice to have.  Early versions of e-smith had the option to create a reinstall floppy that would set the basic configuration and then (IIRC) restore a backup.  At least, that was the idea--I don't believe it ever worked well, if at all.

Really? I don't recall many (any?) bug reports. As you know, we do take restore from backup bug reports seriously.

Quote
It would seem like the basics of this idea are present already--you can do a basic installation, reboot, and then restore from a backup.  The problem here is that this only works if you did the right kind of backup.

It was designed (a long time ago) specifically to work with tape backup and backup to desktop. Those were the only options available at the time.
Title: Re: Backup design/implementation
Post by: DanB35 on July 10, 2014, 07:40:09 PM
I just happened to be browsing some old devinfo emails the other day and saw some discussion from 2002 involving you, and mentioning continuing problems with reinstall floppies.  It was probably an overstatement to say "I don't believe it ever worked well, if at all"--I do recall having had trouble with it, but it's been long enough ago that I don't remember any details, which are moot in any event at this point.

I lived through enough of the history to understand at least the broad strokes of how we got where we are.  I remember Darrell May working on the "workstation backup" using dar, though I have to admit I don't clearly recall its being integrated into the base install.  But however reasonable or understandable the road that got us here, "here" is a state where we have two mutually-incompatible backup systems in place.  This is confusing to the end user, and I don't see that it's helpful to the developers' workloads either, since both systems have to be maintained.

Unfortunately, I can't code in any of the relevant languages, so I don't believe I can be of any help in the implementation.  I'll do whatever else I can to assist, though.
Title: Re: Backup design/implementation
Post by: CharlieBrady on July 10, 2014, 08:44:58 PM
Quote
... though I have to admit I don't clearly recall its being integrated into the base install.

I don't either. I don't think it was ever adequate quality, and the issues of incompatibility with the other systems should have been addressed before it was "integrated".

How different would your suggestions be if that "contrib" were removed?
Title: Re: Backup design/implementation
Post by: janet on July 10, 2014, 09:22:56 PM
DanB35

Quote
I remember Darrell May working on the "workstation backup" using dar, though I have to admit I don't clearly recall its being integrated into the base install.....

Darrell May developed the DAR2 contrib.

I believe this thread  http://forums.contribs.org/index.php?topic=37922.0  covers the early days of the subsequent backup with dar contrib development by jpl, & subsequent integration into sme server as a replacement for e-smith-backup

Quote
But however reasonable or understandable the road that got us here, "here" is a state where we have two mutually-incompatible backup systems in place.  This is confusing to the end user, and I don't see that it's helpful to the developers' workloads either, since both systems have to be maintained.

Is it really that hard to understand? The Manual clearly states which backup to use when & where etc, despite your comments about documentation issues.

The different backup methods are there due to historical reasons, but so is most of sme server, in the form it is currently in.

If you create a backup using the admin console (tgz), then that is the backup you use when you are asked on first boot after installing the OS if you want to restore.

If you create a backup in server manager backup or restore panel (ie the dar workstation backup), then you need to install the OS, configure it for your network, configure backup in server manager, then restore using the server manager backup or restore panel.

I agree that the terminology could be improved, but backup & restore does work, & you have 2 choices (in a default install).
If you do not like either of those, then you have numerous other contribs available.

While it may take developer effort to maintain the two backup methods, it will probably take a lot more developer effort to create a new backup system, or otherwise integrate backup into one system etc.
Your ideas are good, but motivated developers with spare time are needed to do the work, & typically if something works OK in sme server, then it is usually left as is, unless someone is motivated enough to do something about it.

Now that sme9 is released, perhaps attention can be given to tidying up some aspects.
Ian Wells has already started to address issues with backups & make some improvements, so hopefully this can continue, with suitable assistance.
Title: Re: Backup design/implementation
Post by: DanB35 on July 10, 2014, 10:23:16 PM
Charlie, interesting hypothetical.  I'll probably be a little "stream of consciousness" in my reply at the moment.  Removing the "workstation backup" would, of course, remove the issue of incompatible USB disk backups.  It would also, if I'm not mistaken, remove the only "stock" facility to run regular scheduled backups, as well as the only way to back up to a network share, both of which are (IMO) pretty important capabilities.

I don't think the cosmetic cleanup suggestion is really affected, but that's also the least specific of my suggestions (and there's already a long-standing bug, #4710, on the subject).  I also don't think my suggestion of removing the "backup to web browser" is really affected--I just don't believe that's viable any more except for the smallest of installations, and the ability to restore from such a backup has long since been removed.

If that specific module were removed, I think its capabilities would need to be replaced--scheduled backups, in particular, are just too important to not have in the base installation.  Network backups might be less critical, but I'd still rate them as pretty important.  Of course, my network backup destination is a redundant ZFS pool, which I'd consider much more reliable than a single USB drive, so that's going to be a factor in my opinion.

As I see it, there are several independent pieces involved in how SME does backups.  There is the web manager panel, the server console, the scripts that run the actual jobs, the backend tools that package and write the data, and the devices on which the data is stored.  In the typical Unix-y way of things, any of these pieces can be replaced with something else, without great effect on the others.  If we decide that we're going to get rid of the workstation backup module, that leaves the question of what backend tool set we're going to use--tar, dar, or something else.  No doubt there are lots of technical pros and cons to the various tools available; the only one I know of (which may be a matter of implementation rather than inherent to the tool) is file chunking.
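The "replaceable pieces" idea can be sketched in a few lines of Python. This is a hedged illustration, not SME Server code: the `Backend` interface, the `TarBackend` and `ZipBackend` classes, and `run_backup`/`run_restore` are all hypothetical names, a plain dict stands in for the storage device, and zip is used only as a stand-in for "something else", not as a recommendation. The point is that the job script talks to the backend through one small interface, so the backend can be swapped without touching the job.

```python
import io
import tarfile
import zipfile
from typing import Dict, Protocol


class Backend(Protocol):
    """Hypothetical interface for the tool that packages the data."""
    name: str

    def pack(self, files: Dict[str, bytes]) -> bytes: ...
    def unpack(self, archive: bytes) -> Dict[str, bytes]: ...


class TarBackend:
    name = "tar"

    def pack(self, files: Dict[str, bytes]) -> bytes:
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w:gz") as tar:
            for path, content in files.items():
                info = tarfile.TarInfo(name=path)
                info.size = len(content)
                tar.addfile(info, io.BytesIO(content))
        return buf.getvalue()

    def unpack(self, archive: bytes) -> Dict[str, bytes]:
        out: Dict[str, bytes] = {}
        with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
            for member in tar.getmembers():
                if member.isfile():
                    out[member.name] = tar.extractfile(member).read()
        return out


class ZipBackend:
    name = "zip"

    def pack(self, files: Dict[str, bytes]) -> bytes:
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
            for path, content in files.items():
                zf.writestr(path, content)
        return buf.getvalue()

    def unpack(self, archive: bytes) -> Dict[str, bytes]:
        with zipfile.ZipFile(io.BytesIO(archive)) as zf:
            return {n: zf.read(n) for n in zf.namelist()}


def run_backup(backend: Backend, files: Dict[str, bytes],
               destination: Dict[str, bytes]) -> str:
    """The 'job script': it never needs to know which tool packs the data."""
    key = f"backup.{backend.name}"
    destination[key] = backend.pack(files)
    return key


def run_restore(backend: Backend, destination: Dict[str, bytes],
                key: str) -> Dict[str, bytes]:
    """Restore through the same interface that was used for the backup."""
    return backend.unpack(destination[key])
```

The same separation would apply to the other pieces: the web panel and the console would both call the job layer, and the storage device (USB disk, network share, tape) would sit behind the destination.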