Monday, November 24, 2008

data preservation

Computer problems are inevitable, but digital data can be endlessly, perfectly copied. Therefore a computer problem need not result in lost data.

Step 1: Accept reality

Hard drives die. They aren't solid state; they are mechanical. They spin at thousands of RPM and have bearings and physical arms, like a turntable's, that read data at microscopic scales. A hard drive is either dead or dying. The only way to protect your data is to back up religiously. If you wait until your drive dies, it will be too late. You will lose data and work. Accept this.

Step 2: Backup

Linux: g4l, aka Ghost 4 Linux, is a bootable CD. Get a USB drive of equal or greater size, boot off the CD, type g4l at the terminal, select click-n-clone, set source and target very carefully, and clone. Because it is a bootable CD that copies raw data from drives, g4l can be used to clone any drive, whatever its format or operating system.

Mac: OSX includes rsync on the command line, or you can use Disk Utility's "restore" facility to clone drives or make an img/dmg file backup of your master drive. Super Duper! is a friendly frontend for rsync. Get a FireWire drive of equal or greater size and clone onto it. Buy the software for faster update backups. Newer versions of OSX include Time Machine as a friendly built-in backup solution.

Win: g4l or the Windows built-in backup/restore facility.
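
For the rsync route mentioned above, here is a minimal Python sketch of a folder mirror. It is an illustration, not a tested backup tool. The function names and the example paths are made up for this post; the rsync options (--archive, --delete, --dry-run) are standard ones. It defaults to a dry run because --delete removes files from the target that no longer exist in the source.

```python
import subprocess


def build_rsync_command(source, target, dry_run=True):
    """Build an rsync command that mirrors the contents of `source` into `target`.

    Hypothetical helper for illustration; the paths are whatever you pass in.
    """
    cmd = ["rsync", "--archive", "--delete"]
    if dry_run:
        # Only report what would be copied or deleted; change nothing.
        cmd.append("--dry-run")
    # A trailing slash on the source tells rsync to copy the folder's
    # contents rather than the folder itself.
    cmd += [source.rstrip("/") + "/", target]
    return cmd


def run_backup(source, target, dry_run=True):
    """Run the mirror and return rsync's exit code (0 means success)."""
    return subprocess.run(build_rsync_command(source, target, dry_run)).returncode


if __name__ == "__main__":
    # Example paths are assumptions; substitute your own source and backup drive.
    print(" ".join(build_rsync_command("/Users/me/Documents",
                                       "/Volumes/Backup/Documents")))
```

Look over the dry run output first, and only call run_backup with dry_run=False once the source and target are definitely the right way round.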

Step 3: Monitor Health

Modern hard drives track their own health using a technology called SMART. At any time it is possible to query a hard drive and get info such as error counts, dead sectors, and uptime. This can give you earlier warning as a drive starts to die. If the "reallocated sector" count is increasing, beware! SMART is generally available on all internal drives. USB and FireWire enclosures may report SMART data, although depending on your OS and specific drive you may have to use a third-party app or a utility from the manufacturer.

Linux: smartmontools (the smartctl command), with its -i, --attributes, and --all options. There are some GUIs for your desktop.
smartmontools article
short version
good tips in comments
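
As a rough sketch of what you can do with the smartctl output, the Python below pulls the raw values out of the attribute table that `smartctl --attributes` prints. The sample text is an illustrative example of the usual table layout, not output from a real drive, and the function names are made up for this post. Attribute names and the set of attributes vary by manufacturer, and smartctl normally needs root.

```python
import subprocess

# Illustrative sample of the attribute table layout (not from a real drive).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4521
194 Temperature_Celsius     0x0022   035   045   000    Old_age   Always       -       35 (Min/Max 20/45)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from smartctl attribute table text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            # Column ten is the raw value; ignore trailing notes like "(Min/Max ...)".
            attrs[fields[1]] = int(fields[9])
        except ValueError:
            continue  # skip raw values that are not plain integers
    return attrs


def read_smart_attributes(device):
    """Run smartctl on `device` (e.g. /dev/sda, an assumed name) and parse it."""
    # smartctl's exit status is a bitmask and can be non-zero even when it
    # prints a table, so the return code is not treated as an error here.
    result = subprocess.run(["smartctl", "--attributes", device],
                            capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)


if __name__ == "__main__":
    print(parse_smart_attributes(SAMPLE)["Reallocated_Sector_Ct"])
```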

Mac: Disk Utility -> First Aid will display a yes/no health status for internal drives, but you should immediately install SMARTReporter, a free app that sits in your menu bar and pops up a warning when a drive crosses the failure threshold. smartmontools in the Terminal gives you detailed info on error counts even before the "bad health" threshold is crossed. Currently a limitation of OSX means you can't check the SMART status of FireWire drives!

Win: HD Tune free version lets you check the SMART data of all drives.

Strategy: the first time you use a drive, do a data wipe or full format. The goal is that every sector is written to. Drives come from the factory with some bad sectors; the first time you write to a bad sector it is noted as such. So, touch every sector as soon as possible. Then, do a SMART scan and note the relevant numbers. If the numbers change in the future, that indicates that sectors that used to be good are dying: a sure sign that you need to back up your data and retire the drive in question. Put the new drive's SMART data on a sticker stuck to the drive for posterity, and email the SMART data to yourself so you can reference the baseline without needing to open your PC case to read the sticker.
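
The baseline idea can also be kept in a small file instead of (or as well as) a sticker. Below is a minimal Python sketch, assuming you have already collected the SMART raw values into a name-to-number dictionary. The function names, the file format, and the list of watched attributes are my own choices for illustration; the attribute names follow common smartctl naming but differ between manufacturers.

```python
import json

# Attributes whose raw counts should not grow on a healthy drive.
# Assumed names; check what your own drive actually reports.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def save_baseline(path, attrs):
    """Write the baseline values to a JSON file."""
    with open(path, "w") as f:
        json.dump(attrs, f, indent=2)


def load_baseline(path):
    """Read baseline values previously written by save_baseline."""
    with open(path) as f:
        return json.load(f)


def changed_since_baseline(baseline, current, watched=WATCHED):
    """Return {name: (old, new)} for watched attributes that have increased."""
    changes = {}
    for name in watched:
        if name in baseline and name in current and current[name] > baseline[name]:
            changes[name] = (baseline[name], current[name])
    return changes


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    baseline = {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 0}
    current = {"Reallocated_Sector_Ct": 8, "Current_Pending_Sector": 0}
    for name, (old, new) in changed_since_baseline(baseline, current).items():
        print(f"{name}: {old} -> {new}. Back up now and retire this drive.")
```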


Step 4: Data Recovery

Step 4 is avoidable. Don't get to Step 4! If you're at Step 4, then you didn't follow Steps 1-3 correctly. "Repairing" a broken drive/filesystem is appropriate for a healthy drive which was rudely unplugged or corrupted by a computer crash. But for failing drives, read-only data recovery is safer: the repair process might encounter errors and leave your drive in even worse condition.

Mac: Data Rescue. It is read-only, works even if the drive won't mount, and can scan drives in various states of decay to discover lost and deleted files. If possible it will recover the filenames and locations of files. The free trial will scan your drive, list files, and let you recover one small file. If it lists your important files, buy the software and recover what you can to another drive.