Automatically Back Up Your Data from Online Services (Part I)

I am fanatical about backups. It borders on obsession. It didn’t stem from any major data loss; it stems from the fear of data loss, which I guess is about the same but with more paranoia. This posed a unique problem with the advent of Web 2.0 applications, where the data is frequently stored on somebody else’s server. It took some time to work out a system I was happy with, but I’ve gotten it down to something of a science now and thought it’d be worthwhile to share.

There are a ton of useful services out there, but keep in mind that it’s your responsibility, not theirs, to make sure you have your data backed up. Services go out of business, change owners, suffer downtime, go premium, etc. A little thought up front saves you from a frantic weekend of cutting and pasting screenfuls of data from an old service into a new one.

Below are a few rules I live by. As a preface, if you don’t have a hosted account, get one. They’re dirt cheap in most cases, and are quickly becoming nearly essential. Bang for the buck, my hosted account is the most useful service I pay for monthly. I’ve used A Small Orange for a number of years now and can highly recommend them. (If you happen to decide to use them, please consider thanking me by putting “” in the referrer box on the order form 🙂)

  1. Always prefer a cloned or good-enough version of software that can be installed in a hosted web account. For instance, Basecamp is a great application. But did you know there’s a pretty good knock-off called ActiveCollab that was free until version 0.7.1? You can probably still scrounge up a copy of that version (wink wink nudge nudge). Even if you have to pay the $99 for a perpetual license for version 1.0.4, in my mind it’s still better to have access to and control over all your data.
  2. If you can’t find a hosted version, make sure the online service you select provides a means to export your data. Most of the big players like Google and Yahoo allow you to get backups of your data from inside the web application. If you know what you’re doing, you might want to make sure their service is compatible with something like curl or wget so you can call it from a script, which leads me to…
  3. Create a script to automatically pull all your data from each service. I’m a big believer in the motto that backups should run automatically; otherwise they’re probably useless. It just so happens that I’ve written a Perl script to back up my data from the various online applications. The script runs every night on my hosted account and emails me the results. From there the possibilities are endless. For instance, if I sent the backups to my Google account, I could keep them indefinitely and have the implicit ability to search for a particular version. I choose to just copy them to my hard drive and use Subversion to keep them versioned. The important thing is collating the data from the various services in one place in an automated fashion.
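
To make rule 3 a bit more concrete, here is a minimal sketch in Python of the general idea. It is not my Perl script, just an illustration under some assumptions: the export URLs and file names in SERVICES are made-up placeholders, each service is assumed to offer a plain URL that returns your data, and authentication is left out entirely. Each export is written to a fixed file name so that a version control tool like Subversion can track the changes from night to night.

```python
import pathlib
import urllib.request

# Hypothetical export endpoints, keyed by the local file name to save to.
# Replace these placeholders with the real export URLs of your services.
SERVICES = {
    "bookmarks.xml": "https://example.com/export/bookmarks",
    "todo.csv": "https://example.org/export/todo",
}


def fetch_url(url, timeout=30):
    """Download one export and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def backup_all(services, dest_dir, fetch=fetch_url):
    """Save every export into dest_dir and return one report line per service."""
    dest = pathlib.Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    report = []
    for filename, url in sorted(services.items()):
        try:
            data = fetch(url)
            # Fixed file names: each run overwrites the previous export,
            # leaving the history to the version control system.
            (dest / filename).write_bytes(data)
            report.append(f"OK     {filename} ({len(data)} bytes)")
        except Exception as exc:
            # One failing service should not stop the remaining backups.
            report.append(f"FAILED {filename}: {exc}")
    return report
```

Calling print("\n".join(backup_all(SERVICES, "backups"))) from a nightly cron job would collect everything in one directory, and since cron normally mails a job's output to its owner, the report lines could double as the nightly status email.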

This post turned out a little longer than I expected, so I plan to cover the actual script in the next post.
