I have a pet peeve and it is people who don't back their shit up. To quote Scott:
Apreche: Y U NO BACKUP?
In the spirit of not just telling you to back up, here is how to do it, and how to do it well.
1) Back up to the cloud. This costs about $50-$60 per year; choose any of the many providers. I use JungleDisk, which is a nice front end to Amazon S3, and CrashPlan, which has its own data centers. Amazon S3 is great because it is huge and reliable. Also, your storage is accessible via public APIs, and in an emergency you can write a three-line Python script to get it all back (JungleDisk actually provides a free open source program for this on their site). Finally, Amazon S3 is cheap; for small amounts of data (hundreds of MB) it is literally pennies a month. CrashPlan is free unless you use their own cloud storage. They have a neat built-in system where you can give space on your machine for a friend's backup, or just back up all your machines to each other. I use their unlimited plan for $36/year (you have to buy a 4-year plan), which covers one computer plus attached drives.
2a) Local backup. If you have a desktop, permanently attach an external drive to it. If you have a laptop, attach the external disk to your WiFi router (if your router does not support this, get yourself a modern router).
2b) Use some software to do the backup for you. This will cost some amount of money, possibly 0 since there are some free programs out there (e.g. CrashPlan, Carbon Copy Cloner). If you need to do it yourself, you will forget. I use ChronoSync (Mac only) because it has customizations out the wazoo. I back up all my machines to a RAID disk that is attached to a server, and that server backs itself, including the external disk, up to CrashPlan. This is in addition to each machine individually having an automated JungleDisk backup running.
You need to do both 1) and 2). If your house burns down and you lose all of your data, then it was not a backup. If Amazon flubs big and your data is gone, then it was not a backup. Don't call it a backup if it is not a backup.
A couple more general points
Never, ever, mirror your data. Mirroring, coupled with automatic backups, can result in a glitch where your backup gets corrupted, or where backing up corrupts your data. Always have primary backups be one way only and keep copies of deleted/modified files. After that you can have a secondary backup which is, e.g., a cloned image of your boot drive.
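To make "one way only, and keep copies of deleted/modified files" concrete, here is a minimal Python sketch. It is not how ChronoSync, CrashPlan or JungleDisk work internally; the function name `one_way_backup` and the `source`, `dest` and `archive` folders are names made up for illustration, and it assumes the archive folder lives outside the backup folder.

```python
import filecmp
import shutil
import time
from pathlib import Path


def one_way_backup(source: Path, dest: Path, archive: Path) -> list[str]:
    """Copy new and changed files from source to dest, never the other way.

    Before a changed file is overwritten, and when a file has vanished from
    source, the old backup copy is moved into a timestamped folder under
    archive instead of being destroyed. Assumes archive is not inside dest.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    actions = []

    # Pass 1: source -> dest. Nothing is ever written back to source.
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = dest / rel
        if dst.exists():
            if filecmp.cmp(src, dst, shallow=False):
                continue  # byte-for-byte identical, nothing to do
            _archive(dst, archive / stamp / rel)  # keep the old version
            actions.append(f"updated {rel.as_posix()}")
        else:
            actions.append(f"added {rel.as_posix()}")
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)

    # Pass 2: files deleted from source are archived, not erased.
    for dst in sorted(p for p in dest.rglob("*") if p.is_file()):
        rel = dst.relative_to(dest)
        if not (source / rel).exists():
            _archive(dst, archive / stamp / rel)
            actions.append(f"archived deleted {rel.as_posix()}")
    return actions


def _archive(path: Path, target: Path) -> None:
    """Move an old backup copy into the archive tree."""
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(target))
```

If a corrupted file gets copied over a healthy backup copy, the healthy version is still sitting in the archive folder, which is exactly what a plain mirror would not give you.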
Always have reporting. JungleDisk had an RSS feature that reports every backup, successful or not. ChronoSync has customizable emails, and I have my mail filters set up such that warnings get flagged. CrashPlan is the best, since their server end will actually email you if your backup hasn't happened within a specified timeframe.
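The useful kind of report is the one that complains when nothing has happened. A rough Python sketch of that check follows; `backup_status` and the two-day default are assumptions chosen for illustration, not CrashPlan's actual logic. Ideally it runs on a different machine from the one being backed up.

```python
from datetime import datetime, timedelta
from typing import Optional


def backup_status(
    last_success: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(days=2),  # assumed threshold, pick your own
) -> str:
    """Return "OK" or a warning if the last good backup is missing or too old."""
    if last_success is None:
        return "WARNING: no successful backup on record"
    age = now - last_success
    if age > max_age:
        return f"WARNING: last successful backup was {age.days} days ago"
    return "OK"
```

Feed the result into whatever already gets your attention (an email, an RSS item, a flagged mail filter), so that silence is never mistaken for success.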
A backup is never a backup if you can't get your files back. So after setting everything up, test the setup by retrieving some files. Note where you need to know passwords and try to figure out if you could do the retrieval without them (e.g. by requesting new ones). If there are passwords which are absolutely necessary, write them down on a piece of paper and store it safely somewhere other than your house.
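One way to run that restore test without eyeballing files: restore a folder to a scratch location and compare checksums against the originals. This is only a sketch; `verify_restore` and `sha256_of` are made-up helper names, and it assumes the originals have not changed since the backup ran.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MB chunks so large files do not fill up memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original: Path, restored: Path) -> list[str]:
    """List every file under original that is missing or different in restored."""
    problems = []
    for src in sorted(p for p in original.rglob("*") if p.is_file()):
        rel = src.relative_to(original)
        copy = restored / rel
        if not copy.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif sha256_of(src) != sha256_of(copy):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list means every file came back intact; anything else tells you which files to worry about.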
This last one is to protect your strategy from biological memory loss and, in case you are backing up not just your own data, from the possible biological loss of yourself. Write down your backup strategy with specific instructions on how to retrieve the backup. Give it to a person you trust.
So, how do you back up? Feel free to suggest improvements to my setup.
Comments
1) Storage that is super safe and far away. S3 is perfect for this.
2) A second hard drive that is nearby that you back up to once in a while. Apple Time Machine does this. It will save you if you accidentally delete a file or something.
3) A RAID 1 mirror. This will save you if one of the drives crashes, but it won't save you if you accidentally make a mistake and delete or corrupt something. That's why you need the other two kinds.
Remember, you only need to back up data that cannot be replaced. I do not back up anything that can be downloaded again. For example, I only back up music that is weird and rare that I downloaded back in the Napster/WinMX college days. I know if I didn't back it up, it might be lost forever. I don't back up NES roms or Madoka fansubs, because I know I can torrent them again when I need them.
For me, I'm really struggling to think of something on my home computer or work computer that I can't possibly go on without. I mean maybe my resumes, but that would only be annoying to lose completely.
In regards to video games, of course there is no need to back up Steam games, but I do back up my save game directories if saves aren't stored to the cloud.
I do have some weird music lying around, but I have so much music that I really won't care about losing those few rare songs.
This may sound ideal, since if something goes wrong with your laptop, you'll get it back bit for bit the way it was at the last backup, with minimal effort. Also, the external drive only needs to be as large as the internal drive in your laptop. However, if there is something wrong with your drive in such a way that non-critical files just get slowly corrupted, you may not notice the problem before those corruptions have been backed up and replaced the healthy files on the external drive.
It gets worse if you use two-way mirroring (and yes, some backup programs actually let you choose this feature!), because then trouble that originates with your external drive can propagate back to your laptop. You have effectively doubled the chances that your data will get corrupted.
This is true, but with the current plans the cloud backup companies have, it costs you little to nothing to back up everything. This is also, again, a precaution against edge cases: if you set up this and that directory to be excluded from the backup, or just back up that directory and those files, there will come a time when you accidentally have an important file in the wrong place. For some people, who are awesome at having their shit together, this may never happen, but as a general rule, just back everything up.
Before cloud-based backups became common and cheap, I would sometimes burn some DVDs of my files and mail them to my mom, just to give me an off-site backup strategy. If you don't want to use a cloud service for some reason, this may still be a valid solution for you, if somewhat annoying.
At the moment I'm making an effort to sort through all my data going back to 1998, since I have it on an assortment of internal hard drives, external hard drives, DVDs and CDs. I'm copying everything onto two 2TB hard drives (except for video rushes and RAW image files). I'll keep every old hard drive too, and all the old media, in case things don't copy across. The only things I'm having trouble recovering are a few folders of photos that I know are on an old laptop. But the laptop's power adapter is broken, and the hard drive is an ATA type thing, and I don't have an adapter for that either. The photos that I know are on this drive, but that I can't find backed up anywhere else, are those my girlfriend took when we broke through the ceiling of our old apartment into the apartment above. We cut a hole and just kept going. It was pretty crazy, with both of us pushing the other on. As it was totally illegal we never shared the photos with anyone. I really want those photos! And there are some of the original recordings of the SFBRP on there too. It would be nice to have the uncompressed data for every one, not just the 64k mp3 files. And a few other folders of this and that that I can't find elsewhere.
Once I get everything on to one hard drive I'll see if there is a way to easily de-duplicate them, as I know there is a lot of repetition.
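For the de-duplication step, one common approach is to group files by size first and only hash the ones that share a size. A minimal Python sketch, with `find_duplicates` as a made-up name; it only reports exact byte-for-byte duplicates and deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def _sha256(path: Path) -> str:
    """Hash a file in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose contents are identical."""
    # Cheap first pass: files of different sizes cannot be duplicates.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    # Expensive second pass: hash only the files that share a size.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for p in paths:
            by_hash[(size, _sha256(p))].append(p)

    return sorted(sorted(group) for group in by_hash.values() if len(group) > 1)
```

Review the groups by hand before removing anything; the same file sitting in two folders is sometimes there on purpose.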
My ultimate goal is to scour the internet, across as many forums and Usenet groups as I've posted in, and collect together everything I've ever posted online. Usenet shouldn't be a problem, but a few forums probably won't be on the Wayback Machine or anywhere else. I'm not sure how I'll go about this, but it should be a fun research project.
Scott, would it be possible to list every post I've ever made to this forum in a single page? With the link to the original thread for context? I think there could be a lot of fun stuff there, things I wouldn't even remember I've posted.
Anyway, to address the first post, Time Machine is awesome. I always have at least two Time Machine disks up to date. And every other file I have on an external hard drive is on at least one other. And I use ftp for a lot of stuff. And Dropbox is cool too. I don't use any other online backup, because I travel so much, and it's not practical for me to do so.
Maybe I can charge money to do these things by hand for the time being.
And the main reason that I wouldn't be able to keep ahead of the upload is that I travel so much. I've not been home for over three weeks now, and my main internet connection is this satellite connection on a cruise ship. It's slightly faster than dialup, and costs about 10 cents per minute each time I log on.
In that same three weeks I've created 53GB of new files that I'm going to archive. Am I even going to try to back this up to the cloud in the 10 days I have at home before I leave to Portland and SF for another two weeks? Nope!
No you wouldn't, you'd just have to do it for me. Alternatively I'll see if my Python kung fu is good enough for me to work out how to scrape it myself.
Also, at your rate of data creation, this problem is not going to get better for you. Sure, data speeds increase with time, but so do pixel counts in DSLRs :-). You better start backing up now because later it will be an even greater pain in the butt. Do what I did: get an old-ass computer, set it up next to your router at home, attach all drives and back up for three months.
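For a rough sense of scale, here is the arithmetic behind the 53GB-in-10-days question as a small Python sketch. It assumes decimal gigabytes and an uplink that runs around the clock; the function names are made up, and the 1 Mbit/s figure in the usage note is a hypothetical example, not anyone's actual connection.

```python
def required_mbit_per_s(gigabytes: float, days: float) -> float:
    """Sustained upload rate needed to move the data within the given days."""
    bits = gigabytes * 1e9 * 8      # decimal GB -> bits
    seconds = days * 86400          # days -> seconds
    return bits / seconds / 1e6     # bits/s -> Mbit/s


def upload_days(gigabytes: float, mbit_per_s: float) -> float:
    """Days of continuous uploading needed at the given rate."""
    seconds = gigabytes * 1e9 * 8 / (mbit_per_s * 1e6)
    return seconds / 86400
```

`required_mbit_per_s(53, 10)` comes out at about 0.49 Mbit/s sustained, and `upload_days(53, 1.0)` at about 4.9 days on a hypothetical 1 Mbit/s uplink, which is why leaving a machine uploading at home for weeks or months is the realistic way to catch up.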
Some results!
A history of the movies I have seen since December 2009 (or Every Post Luke Made To The Movies Thread).
Luke's dating history since 2009 (or Every Post Luke Made To The Dating Thread).