My Backup Strategy
After having three hard drives fail on me in the last 6 months, I have become much more serious about data security. I have been photographing digitally for 14 years now, which amounts to a whole lot of irreplaceable photos and memories. The same goes for my design work. Much design work is not reusable beyond the initial project, but it’s still nice to have for portfolio and posterity. I started researching other creatives’ backup strategies and came up with a mixed bag, but with some central themes.
What others are doing:
Scott Kelby had a bad experience with Drobo and moved on to a set of G-Tech G-Speed RAID drives paired with the online backup service CrashPlan. In the comments on that post, he took heat for picking an outdated piece of tech, and for going with G-Tech. It seems G-Tech used to be a great company. Then in 2009, G-Tech was purchased by HGST (aka Hitachi). In early 2012, HGST was in turn purchased by Western Digital. Under Hitachi’s ownership, innovation at G-Tech apparently stagnated, prices climbed well above the competition, customer service became terrible, and quality dropped to the point where the brand is taking a serious beating in professional circles. Apparently Kelby didn’t get the memo.
Chase Jarvis, video extraordinaire, posted his studio’s very complex server and backup plan in 2010. Like Kelby, Jarvis uses G-Tech drives, but he’s got a whole menagerie of portable, RAID, and SAN setups. Unlike Kelby, though, it sounds like he gets his G-Tech equipment for free as a “tester.” He mates all of this with a massive Apple Xserve. His system is overly complex for my needs, but it seems to work for him. I wonder if he’s still using this setup now, three years later.
The American Society of Media Photographers (ASMP) has a well-regarded “backup best practices” page full of great info. They recommend having at least 3 copies of your files (1 working copy, 2 backups), on at least 2 different types of media (e.g., hard drive, online, tape, DVD), with at least one copy stored off-site. I found in my research how important off-site backup really is, especially after witnessing the incredibly awful flooding we had here in Colorado a few weeks ago. A natural disaster, fire, flood, or burglary could render your 10 backups useless if you keep them all in one place.
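The 3-copies, 2-media, 1-off-site rule is simple enough to check mechanically. Here is a minimal Python sketch of that check. The `Copy` class, the `meets_3_2_1` function, and the sample plan are names and data I made up for illustration; they are not part of the ASMP guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data (hypothetical structure for illustration)."""
    name: str
    media: str      # e.g. "hard drive", "online", "tape", "DVD"
    offsite: bool


def meets_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 off-site."""
    media_types = {c.media for c in copies}
    return (
        len(copies) >= 3
        and len(media_types) >= 2
        and any(c.offsite for c in copies)
    )


# Hypothetical plan: a working copy, a local backup, and an online backup.
plan = [
    Copy("working drive", "hard drive", offsite=False),
    Copy("desk mirror", "hard drive", offsite=False),
    Copy("online backup", "online", offsite=True),
]
print(meets_3_2_1(plan))      # the full plan satisfies all three conditions
print(meets_3_2_1(plan[:2]))  # two local hard drives alone do not
```

Two hard drives sitting on the same desk fail the check on all three counts, which is exactly the situation many of us start from.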
PSD Tuts has a great article with all sorts of recommendations for online services and drive options. “My Hard Drive Crashed…” (And What I Learned From It), by Smashing Magazine, is a great article on their tests with online backup sites and strategy.
What I learned:
Off-site backup is very important. Keep at least one copy of your files geographically away from your home or office in a safe location.
RAID 5 and 6 are complex and prone to failing in a big beautiful ball of fire. With larger single drives coming out, they are also a little less necessary. RAID is a method of combining multiple hard drives into an array, and there are several variations. RAID 5 and 6 combine 3 or more drives (4 or more for RAID 6) into a single huge storage space by splitting your data, plus parity information used for recovery, across the drives in a very complicated way. The theory is that reading your data back from 3+ drives is faster than reading it back from 1 drive, you get a single larger storage “drive” than is possible from a single physical drive, and a drive can fail (one in RAID 5, two in RAID 6) without losing your data. If a drive in a RAID 5 setup were to fail, you would simply replace that drive and the RAID controller would rebuild your data. I say in theory because there are many horror stories on the internet of drives failing and the RAID controller being unable to rebuild the data and, more commonly, of the RAID controller itself failing and corrupting the entire drive system and all your data. RAID 5 and 6 also suffer from the same drawbacks as RAID 1, below. This automatically ruled out hardware like the Promise Pegasus and Drobo.
RAID 1 would be the most applicable in my situation. RAID 1 uses two drives and mirrors your data between them constantly and instantly. The bonus is that if one drive fails, the other is, in theory, still functioning and contains a good copy of your data. The problem with that, and why I didn’t choose that route, is that if your computer has a glitch, gets a virus, corrupts the drives, experiences a power surge, or the drive enclosure malfunctions, both drives will be affected (and damaged) in the same way. Similarly, if you accidentally overwrite or delete an important file, it is instantaneously ruined on both drives.
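For the curious, the rebuild trick in RAID 5 comes down to XOR parity. The Python sketch below is a toy model of a single stripe on a three-drive array, not how any particular controller is implemented. The block contents and the `xor_blocks` helper are made up for illustration. Real RAID 5 also rotates the parity block across the drives, and RAID 6 adds a second, different parity calculation.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# One stripe: two data blocks plus one parity block (illustrative contents).
drive_a = b"PHOTO-01"
drive_b = b"DESIGN-2"
parity = xor_blocks(drive_a, drive_b)

# Simulate losing drive B, then rebuild it from the two surviving blocks.
rebuilt_b = xor_blocks(drive_a, parity)
print(rebuilt_b == drive_b)
```

The same math shows the weak spot: the rebuild needs every surviving block to read back correctly, so a second failure or a confused controller during the rebuild leaves nothing to recover from.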
Drobo seems like it had a really strong start, and still has a strong feature set (battery backup, advanced RAID setups, easy drive replacement, network-attached or direct-attached, etc.). Unfortunately, it seems plagued by bugs, reliability issues, and premature hardware failures. You can still find pockets of enthusiastic Drobo users that haven’t had any problems, but overall, most professionals warn against using a Drobo.
Network-attached storage (NAS) is a concept I looked into extensively. It seems the way of the future is through using a NAS. The concept is that you access all of your files through your office or home network, not through a USB or FireWire cable to the drive on your desk. The bonus is that you can access your data from anywhere in your home, office, or on the road, with any device you may have. You can also get higher quality RAID hardware from makers like Synology and QNAP that seems more reliable than cheaper desktop RAID enclosures. The downsides, for me, were too plentiful to make it a viable option, however. The first downside is that NAS devices are significantly more expensive than a standalone drive, especially considering you usually have to fill them with 4-8 hard drives. The second downside is what looks like a huge headache in the complexity of setting up these devices. A NAS is infinitely more complex than plugging in a USB hard drive and dropping some files on it. They come with all sorts of bells and whistles: FTP servers, image and music software, link aggregation requirements, social and mobile apps, IP configurations up to your neck… no thanks. Lastly, they are generally much slower than your average direct-attached hard drive. Some of the Synology units can approach 100MBps real-world speeds (good, but still not USB3 or Thunderbolt speed), but only if you spend almost $1k on the empty enclosure, get a dedicated GigE switch, and hook it up to your computer via multiple parallel wired Ethernet cables (link aggregation).
What I decided to do:
I ended up with a 5-layer system that is fast, inexpensive, scalable, and extremely safe (that was the whole point, right?).
I have four Western Digital 4TB 7200RPM Black drives, each in a separate StarTech USB 3.0 enclosure. Two of the drives are my primary working drives for my design and photography files. The other two drives are exact mirrors of the working drives. I chose separate enclosures/drives because part of my plan involves only attaching/powering on the mirror drives at the end of each day to sync with the working drives. By slightly delaying and controlling when I sync the two sets, I can protect my data against accidental deletions, viruses, computer glitches or sudden shutdowns, drive enclosure failure, transfer corruption, human error, lightning strike, etc. The worst case scenario if one of my working drives fails is that I’ve lost a few hours of work since the last time I synced the mirror drive.
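To make the end-of-day sync concrete, here is a minimal Python sketch of a one-way copy from a working folder to a mirror folder. It is an illustration under assumptions, not the tool I actually use: the `sync_to_mirror` function and the folder names are hypothetical, and it decides what to copy from file size and modification time only. It never deletes anything from the mirror, so a file removed by accident from the working drive survives there.

```python
import shutil
import tempfile
from pathlib import Path


def sync_to_mirror(working: Path, mirror: Path) -> list[Path]:
    """Copy new or changed files from working to mirror; never delete from mirror."""
    copied = []
    for src in sorted(working.rglob("*")):
        if not src.is_file():
            continue
        dest = mirror / src.relative_to(working)
        if dest.exists():
            s, d = src.stat(), dest.stat()
            # Skip files whose size matches and whose source copy is not newer.
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves the modification time
        copied.append(dest)
    return copied


# Demo in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    working = Path(tmp) / "working"
    mirror = Path(tmp) / "mirror"
    (working / "photos").mkdir(parents=True)
    (working / "photos" / "img001.raw").write_bytes(b"raw data")
    print(len(sync_to_mirror(working, mirror)))  # first run copies the file
    print(len(sync_to_mirror(working, mirror)))  # second run finds nothing new
```

A size-and-time comparison will not catch silent corruption on either drive. Comparing checksums would, at the cost of reading every file on every sync.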
I then have two sets of two Western Digital 2TB My Passport portable drives. One set of two drives gets a backup from the working drives once a week and is stored on-site in my fire- and water-proof safe. The other set of two drives also gets a backup copy of the working drives, but only once every 2-3 weeks, and this set is kept safely off-site. The 2-3 week delay lets me catch any errant file corruption, deletion, or virus activity I didn’t catch in the first couple of days.
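With three different intervals in play, it helps to have something tell you which set is due. The Python sketch below is one way to do that. The intervals come from the schedule described above, using the upper end of the 2-3 week range for the off-site set; the `backups_due` function and all of the dates are hypothetical.

```python
from datetime import date, timedelta

# Intervals from the schedule above; the off-site value assumes 3 weeks.
SCHEDULE = {
    "mirror drives": timedelta(days=1),
    "on-site safe set": timedelta(weeks=1),
    "off-site set": timedelta(weeks=3),
}


def backups_due(last_run: dict[str, date], today: date) -> list[str]:
    """Return the names of the backup sets whose interval has elapsed."""
    return [
        name
        for name, interval in SCHEDULE.items()
        if today - last_run[name] >= interval
    ]


# Hypothetical dates for illustration only.
last_run = {
    "mirror drives": date(2013, 10, 1),
    "on-site safe set": date(2013, 9, 27),
    "off-site set": date(2013, 9, 20),
}
print(backups_due(last_run, date(2013, 10, 4)))
# -> ['mirror drives', 'on-site safe set']
```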
My fifth layer of protection is an online backup service called OpenDrive. Like Backblaze and CrashPlan, OpenDrive has unlimited space, automatic scheduled backups, and will back up external drives. They also offer a drive seeding service, where you send your physical drive(s) to them to put on the server instead of waiting months while your terabytes of data upload, and drive shipping if you lose all your data and need your online backup data fast.
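The "waiting months" part is easy to estimate. The Python sketch below works out the upload time for 8 TB, which is what two full 4TB working drives would hold. The upload speeds are hypothetical home-connection figures, and the math assumes the link runs flat out around the clock with no overhead, so real uploads would take longer.

```python
def upload_days(data_terabytes: float, upload_megabits_per_sec: float) -> float:
    """Days needed to upload the data at a constant, fully used upload speed."""
    bits = data_terabytes * 1e12 * 8             # decimal terabytes to bits
    seconds = bits / (upload_megabits_per_sec * 1e6)
    return seconds / 86_400                      # seconds in a day


# Hypothetical upload speeds in megabits per second.
for mbps in (2, 5, 10):
    print(f"{mbps} Mbit/s: {upload_days(8, mbps):.0f} days")
# -> 2 Mbit/s: 370 days
# -> 5 Mbit/s: 148 days
# -> 10 Mbit/s: 74 days
```

Even at the fastest of those speeds the first upload runs well over two months, which is why mailing in a drive is so attractive.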
Left: WD Passport 2TB Portable (5200RPM) via USB 3.0
Right: WD Black 4TB (7200RPM) via USB 3.0
Previously I was using a set of WD Passport portable drives as my working drives, with a hodgepodge of older drives crammed away. As a side benefit of moving to the WD 7200RPM Black drives, I have now also nearly doubled my data read speed, from 88MBps on the portable drive to 167MBps. Write speeds are up more than 70%, from 80MBps to 138MBps. That is huge! Lightroom is going to fly now, as will saving and opening huge 5GB tradeshow graphics files. And by using the portable drives as a backup, I can easily take them with me on the road if I want to work and not be tethered to my desk. Bonus!
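To put those benchmark numbers in everyday terms, the Python sketch below estimates how long a 100 GB transfer would take at each measured speed. The 100 GB size is a hypothetical example, and the math assumes the drive sustains its benchmark speed for the whole transfer, that MBps means megabytes per second, and that 1 GB is 1000 MB.

```python
# Measured speeds from the benchmarks above, in megabytes per second.
SPEEDS_MB_PER_S = {
    "WD Passport read": 88,
    "WD Black read": 167,
    "WD Passport write": 80,
    "WD Black write": 138,
}


def transfer_seconds(size_gb: float, speed_mb_per_s: float) -> float:
    """Seconds to move size_gb gigabytes at a constant speed (1 GB = 1000 MB)."""
    return size_gb * 1000 / speed_mb_per_s


for label, speed in SPEEDS_MB_PER_S.items():
    minutes = transfer_seconds(100, speed) / 60
    print(f"{label}: {minutes:.1f} min for 100 GB")
```

Real transfers made up of many small files will be slower than this on both drives, since benchmarks like these measure large sequential reads and writes.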
So there you have it, 5 layers of protection. One in a safe, one off-site, one online, and two types of media. It would be statistically close to impossible for me to lose data now, as long as I don’t screw something up.
What do your backup strategy and storage system look like?