Backups, the sheer joy (not)

I was working on a nice complicated backup strategy where every server would do regular backups (full and partial) to its local filesystem, and every XXmins my master backup server would suck them up using SCP and then remove the local copies to free up space on the local server for the next backup. It also handled user file restore requests: those were queued in the backup directory too, sucked back as part of the backup file retrieval, processed, and the restored files shoved back out to the requesting server, where they were placed into the requesting user's home directory filetree.
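The master-side pull amounts to something like the sketch below; the hostnames, paths, backup user, and the staging-directory convention are all made up for illustration, not lifted from my actual scripts.

    #!/bin/sh
    # Hypothetical master-side pull loop. Servers stage their backups
    # (and restore requests) in /var/backups/outgoing; the master sucks
    # them down and clears the staging area.
    SERVERS="web1 web2 db1"
    MASTER_STORE=/backups

    for host in $SERVERS; do
        # Pull everything the server has staged (backups plus any queued
        # restore requests), then, only if the copy succeeded, clear the
        # remote staging area to free local disk for the next backup run.
        if scp -q "backup@$host:/var/backups/outgoing/*" "$MASTER_STORE/$host/"; then
            ssh "backup@$host" 'rm -f /var/backups/outgoing/*'
        fi
    done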

But I was only doing that because I could not figure out how to get NFS filesystems through my firewall, as the NFS daemons kept using random port numbers.

Anyway, I did eventually find out how to get NFS to use specific port numbers and completely changed my backup strategy. All server backups (dump/tar/mkcdrec) were happily being written to server-specific slices on my master server using NFS mounts (the scripts coded so the web server could mount/umount on demand for backup/restore). Restores too, yes; I had it all working so I could run 'restore filename' and it would just restore the file from the NFS backups.
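For the record, pinning the ports came down to something like the following. The port numbers here are arbitrary, and exactly where you set these flags varies by distro (Red Hat-ish systems have /etc/sysconfig/nfs for them), so treat it as a sketch:

    # Pin the NFS helper daemons to fixed ports so the firewall can
    # allow them; left to themselves they grab random ports from portmap.
    rpc.mountd -p 4002          # mountd on a fixed port
    rpc.statd -p 4000 -o 4001   # statd's listening and outgoing ports

    # lockd lives in the kernel; its ports are module options, e.g. in
    # /etc/modprobe.conf:
    #   options lockd nlm_udpport=4003 nlm_tcpport=4003

    # nfsd itself is already fixed on 2049 and portmap on 111, so the
    # firewall just needs 111, 2049, and the pinned ports above opened.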

BUT… I have now started getting 'stale NFS handle' errors midway through backups, randomly. No big issue on restores, but when it's 20GB into a backup it is a real pain, as it means there is no usable backup. It's a bigger pain because I had really started adding bells and whistles to the restore features of my scripts (backups being backups, they either work or they don't, no bells and whistles needed there, but the restores were getting really user friendly, with file pick-lists etc.).
Anyway, stale NFS handle errors I can't work around, so that backup solution has been scrapped; a backup that only works 90% of the time isn't a good backup solution. So, no more NFS; lock up those ports in the firewalls again.

So I've restarted work on my SCP backup/restore scripts, using only tar. While I'm aware I could pipe the tar output directly to the ssh stream and avoid needing to store the backup on the local servers at all, I have decided to enforce a local backup and SCP the complete backups down, so I know I have a good backup set and the transfer is retryable without needing to redo the backup itself.
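The trade-off between the two, roughly (paths, hostnames and the backup user are illustrative):

    # The option I didn't take: stream the archive straight to the master,
    # nothing stored locally. One hiccup mid-stream and the whole backup
    # has to be rerun from scratch.
    tar -czf - /home /etc | ssh backup@master 'cat > /backups/web1/home-etc.tar.gz'

    # The option I did take: write a complete local backup first, then let
    # the master SCP it down. The local file is a known-good set, and the
    # transfer can be retried without redoing the backup itself.
    tar -czf /var/backups/outgoing/home-etc.tar.gz /home /etc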

It's probably a safer design anyway. Using SCP, the master server can pull backups down and push restored files out using SSH keys, in a one-way fashion: the master server can see all the other servers, but the other servers can never contact the master (as noted above, restores are done by storing requests in the local server's backup directory, which the master pulls down, processes, and pushes the files back from). That is possibly a little safer than every server being able to see the master server's NFS mount (although they could only ever see their own tiny little slice of it).
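Setting up that one-way trust is just standard SSH key handling; a sketch, with the IP address, key name and paths made up:

    # On the master: a passphrase-less key used only for backup pulls.
    ssh-keygen -t rsa -f /root/.ssh/backup_key -N ''

    # On each server: install the master's public key for the backup user,
    # restricted to the master's address so a leaked key is less useful.
    echo "from=\"192.168.1.10\" $(cat backup_key.pub)" >> ~backup/.ssh/authorized_keys

    # The servers hold no key for the master, so the trust is one-way:
    # the master can scp to and from the servers, never the reverse.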
So I’ll be spending a lot of time getting that set of SCP scripts working now.

Shouldn't be too hard; I've already got the local server backup and restore scripts (with bells and whistles on the restore), just never tested the master server bits… well, I probably did, but it was a while ago. I'll put them in CVS this time.
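Getting them into CVS is a one-liner anyway; the directory, module name and tags below are placeholders, and it assumes CVSROOT is already set:

    # Hypothetical import of the scripts into an existing CVS repository.
    cd /usr/local/backup-scripts
    cvs import -m "Initial import of backup/restore scripts" backup-scripts mark start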
