The Essentials of Obsessive Backups
Rounding out a small diversion down the path of personal data backup, I thought I would document my backup philosophy and scheme. Now granted, most would think I’m absolutely over the top for the intricate plan I’ve devised over the years. Suffice it to say, I’ve thought about these details a lot and finally feel like I’m at the sweet spot between data availability and data security.
That last point is important. Your data could be replicated across every machine on the planet, which would make it very available but obviously very insecure. I take the challenge of finding the correct balance very seriously.
The first pillar of the philosophy is to isolate the data that should be backed up from the data that doesn’t need to be backed up. Typically the first thing I do when I get a new machine is partition the disk into 3 or 4 drives. The C drive is left to anything that was pre-installed (operating system, shareware, etc.). I leave some extra space as a buffer here because some apps insist on being installed on C or create temporary files that live in the C drive. The D drive is for applications I’ve installed, with the exception of games. And all data, regardless of what application it’s from, goes to the E drive. Usually games and pictures (12 gigs and counting) go to the F drive.
Over the years this isolation has worked in my favor a couple of times. There were times that I had to re-install the OS and was thrilled to find my E drive with all data still intact. There were times a bad game install hosed the F drive but left the others untouched. In short, drive partitioning is a must. In ancient times, the process was a little harrowing and not to be done carelessly. It’s gotten a lot easier and safer now, so there’s no excuse.
The next pillar is that backups must be automated. A backup that is not automated is almost useless, as you’ll probably do it for the first couple of weeks and then quickly lose interest. There are a ton of applications that can help with this task. I rely on a mix of SyncBack and rsync, depending on the target of the backups (more on this below).
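To make the automation pillar concrete, here is a minimal sketch of a scripted rsync run. It assumes rsync is installed and on the path (on Windows that usually means something like Cygwin), and the function names, paths, and host in it are hypothetical placeholders, not my actual setup.

```python
import subprocess
from typing import List


def build_rsync_command(source: str, destination: str, delete: bool = False) -> List[str]:
    """Return the rsync argument list for copying source to destination."""
    cmd = ["rsync", "-az"]  # -a: archive mode, -z: compress data in transit
    if delete:
        # Also remove files on the destination that no longer exist in source.
        cmd.append("--delete")
    cmd += [source, destination]
    return cmd


def run_backup(source: str, destination: str, delete: bool = False) -> None:
    """Run one backup pass; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(source, destination, delete), check=True)
```

A call such as `run_backup("/data/", "user@backup.example.com:backups/data/")` (a made-up account) could then be triggered by cron or the Windows Task Scheduler, so the backup happens whether or not I remember it.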
The third pillar is having a reliable, simple, accessible offsite backup. It must be reliable for obvious reasons. It must be simple because a complicated interface or API (I’m looking at you, A3) only makes it less likely that I’ll work through the frustrations when things go wrong. It must be accessible so I can get my data from any machine at any time. And it must be offsite because a fire or theft could easily compromise my home machine. I found all 4 of these with rsync.net. I could write endlessly about the majesty of rsync.net, but I’ll summarize it in these short points:
- I don’t have to install any proprietary client-side apps, such as the ones iBackup or others make you install. This is one obstacle to data accessibility that is removed.
- Since it supports SFTP, SCP, rsync, unison, and subversion, it will work on a PC, Mac, or *nix machine. Another obstacle removed.
- It’s cheap. Not as cheap as A3, but pretty cheap ($1.60/gig)
- They have great customer support, with a privacy policy that puts the customer first
- Since they support rsync (and the others listed above), they are very developer-friendly. Since it supports SFTP, I can use a client like WinSCP if I want a GUI
Obviously this isn’t for everyone. I wouldn’t suggest it for my Aunt Millie, but for me it’s about as good as it gets.
With those pillars in place, I’ve set up the following backup scheme:
- Core data, including Quicken files, Word docs, and source code, gets backed up to rsync.net every night. Additionally, the Quicken file is encrypted using TrueCrypt for additional security.
- Pictures get backed up to a Dreamhost account, which gives me plenty of space to spread out. Additionally, I’ve hacked Plogger to display the photos, making this account double as a photo gallery for friends and family. Since this data isn’t critical, it’s not important to me if it gets compromised for some reason.
- Core data from rsync.net is also backed up to a USB key I keep on my keychain. This provides additional data accessibility while incurring no additional security risk since the entire set of data is encrypted with TrueCrypt.
- Most recently I purchased a $60 USB hard drive that is connected to my home machine. This backs up all data and photos every hour. The reason is that, after a data loss, restoring from the USB drive would be a lot easier than downloading everything from rsync.net or Dreamhost. Also, it provides a clear data transfer path when the time comes to move to a new machine.
- All the data on my E drive is also kept in a Subversion repository. Data versioning is a little different from backup. The goal here is to make sure that if some file becomes corrupted I can roll back to a previous state. This is not ensured by most backup schemes, where only 1 version of each file is kept. The Subversion repository also happens to be backed up to rsync.net, the USB key, and the USB hard drive. Again, just in case.
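The difference between versioning and plain backup in that last point can be shown with a toy model. This is only an in-memory sketch with hypothetical class and file names. It does not use Subversion or any real backup tool, but it shows why a single-copy backup cannot undo corruption while a versioned store can.

```python
from typing import Dict, List


class MirrorBackup:
    """Keeps only the latest copy of each file, like a plain sync."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def backup(self, name: str, content: bytes) -> None:
        self.files[name] = content  # overwrites the previous copy


class VersionedStore:
    """Keeps every saved revision, like a version-control repository."""

    def __init__(self) -> None:
        self.history: Dict[str, List[bytes]] = {}

    def commit(self, name: str, content: bytes) -> int:
        revisions = self.history.setdefault(name, [])
        revisions.append(content)
        return len(revisions)  # 1-based revision number

    def checkout(self, name: str, revision: int) -> bytes:
        return self.history[name][revision - 1]


mirror, repo = MirrorBackup(), VersionedStore()
# Save a good version, then a corrupted one, to both stores.
for content in (b"good data", b"corrupted"):
    mirror.backup("budget.txt", content)
    repo.commit("budget.txt", content)

print(mirror.files["budget.txt"])      # the mirror holds only the corrupted copy
print(repo.checkout("budget.txt", 1))  # revision 1 is still the good data
```

If the corrupted file is what gets synced at the next scheduled run, the mirror faithfully copies the damage, which is why the repository itself is worth backing up as well.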
I feel good about the logic here, but I’m constantly thinking about whether I’ve done too much or not enough. Admittedly, that’s obsessive.