Quick Lessons in Computer Maintenance, Disaster Recovery, and Data Backup: Part 2
Text copyright Trevor Ash. All rights reserved.
Software tools for backing up your data come in a few general categories:
- Software which mirrors the contents of a drive or directory between two separate locations (makes an identical copy).
- Software which makes an image (a file) representing the contents of a hard drive or partition.
- Software which supports the backup process or makes it easier.
There are other types as well but these are the most commonly used. I also wanted to limit my discussions to those that I am most familiar with.
In short, we have mirroring software, imaging software, and backup software. I’ll be using these terms from now on when I talk about these types of software tools.
Mirroring software is a lot easier to understand than imaging software so I’ll start with a brief explanation of how mirroring software works.
How does mirroring software work?
Every file on your computer has an associated date/time used to determine when the file was last modified. Mirroring software uses the last modified information when performing its task.
Let’s assume that you want to back up everything in your “My Documents” folder to an external hard drive using mirroring software. The steps are pretty standard:
- Specify the files that you want mirrored. In this case, we want everything in the My Documents folder.
- Specify the location where a mirror of the files should be stored.
- Specify when and how often the mirroring should be performed.
When the mirroring software needs to perform the mirroring it works pretty much like this:
- For each file that needs to be mirrored it checks to see if a file with the same file name exists in the mirror location.
- If the file name doesn’t exist in the mirror location then the file is simply copied over.
- If the file name already exists, and the “last modified” date of the subject file is newer than the mirrored file, then the mirrored file is replaced.
- If any files exist in the mirror location that don’t exist in the original location then they are deleted. This is usually an optional feature in the software.
That’s pretty much all there is to it. Of course not every mirroring tool is created equal, but they all have these standard features and operate in the same fundamental way.
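The mirroring steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular mirroring product is written. The function name mirror and the delete_extras option are made up for this example, and it assumes the mirror location is not inside the folder being mirrored.

```python
import os
import shutil


def mirror(source_dir, mirror_dir, delete_extras=False):
    """Mirror source_dir into mirror_dir using last-modified times.

    Returns two lists of paths relative to mirror_dir: files copied
    and files deleted.
    """
    copied = []
    for root, _dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        target_root = os.path.normpath(os.path.join(mirror_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            # Copy when the file is missing from the mirror, or when the
            # original has a newer "last modified" time than the mirrored copy.
            if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
                shutil.copy2(src, dst)  # copy2 also carries over the modified time
                copied.append(os.path.relpath(dst, mirror_dir))

    deleted = []
    if delete_extras:
        # Optional step: remove mirrored files that no longer exist in the original.
        for root, _dirs, files in os.walk(mirror_dir):
            rel = os.path.relpath(root, mirror_dir)
            for name in files:
                if not os.path.exists(os.path.join(source_dir, rel, name)):
                    os.remove(os.path.join(root, name))
                    deleted.append(os.path.normpath(os.path.join(rel, name)))
    return copied, deleted
```

Running mirror twice in a row on an unchanged folder should copy nothing the second time, because the mirrored copies carry the same last-modified times as the originals. Note that this sketch leaves empty folders behind in the mirror; a real tool would probably tidy those up as well.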
There are a few things to be careful about with mirroring software. Since mirroring software usually relies entirely on the “last modified” field, you will want to make sure that the regular software you use to edit your files doesn’t do anything strange or abnormal with that field. It’s probably a good idea to test things out a bit before fully committing to a solution. Usually the worst thing that happens is that the mirrored copy is updated unnecessarily. This isn’t much of a worry for most of us.
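One way to test how a program treats the “last modified” field is to note the time before you save a file and check it again afterwards. Here is a small Python sketch of that check. The function names are made up for this example.

```python
import os
from datetime import datetime


def last_modified(path):
    """Return the file's last-modified time as a datetime."""
    return datetime.fromtimestamp(os.path.getmtime(path))


def was_modified_since(path, earlier):
    """Return True if the file's last-modified time is later than `earlier`."""
    return last_modified(path) > earlier
```

To use it, record before = last_modified(path) for one of your documents, edit and save the document in your usual program, and then call was_modified_since(path, before). If it returns False after a real save, your mirroring software may not notice the change.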
Many operating systems include mirroring type tools that you can run at the command line which do just as good a job as the expensive tools you would have to purchase. If you’re running Windows, take a look and see if the “xcopy” program is available on your computer. The easiest way to see if you have it is to bring up a command prompt window and type “xcopy /?” If you see a big list of program options, then you know you have it.
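If you would rather check from a script, the Python sketch below looks for xcopy on the system path and builds a command line for a simple mirror. The function names are made up for this example, and the switches shown are the ones I would expect a typical Windows xcopy to accept. Compare them against the output of “xcopy /?” on your own system before relying on them.

```python
import shutil


def has_xcopy():
    """Return True if a program named xcopy can be found on the PATH."""
    return shutil.which("xcopy") is not None


def xcopy_mirror_command(source_dir, mirror_dir):
    """Build (but do not run) an xcopy command line for a simple mirror."""
    return [
        "xcopy", source_dir, mirror_dir,
        "/D",  # copy only files whose source is newer than the destination copy
        "/E",  # include subdirectories, even empty ones
        "/I",  # if the destination does not exist, assume it is a directory
        "/Y",  # overwrite existing files without prompting
    ]
```

On Windows the resulting list could be passed to subprocess.run to perform the copy. Note that xcopy used this way does not delete files from the mirror that have been removed from the original.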
Limitations with Mirroring Software
You might wonder why you can’t just use xcopy or similar mirroring software tools to mirror your entire computer or hard drive to another drive. Well, unfortunately you need to be very careful with that assumption because in certain cases it can be a dangerous one. The easiest way to explain why you shouldn’t rely on mirroring software to back up your entire computer goes like this:
Let’s assume that you want to update your resume on your computer using WordPad. Your resume is stored in the “My Documents” folder. When you open up the file with WordPad, WordPad makes a copy of the file and places it in memory (RAM). If you don’t understand memory and RAM, it’s okay. You should at least understand that RAM is NOT the same as, nor does it reside on, the hard drive. RAM and the hard drive are entirely separate and the mirroring software only reads from the hard drive. So, you have the file open with WordPad and make a few changes but you don’t actually save it yet. In the meantime, your mirroring software executes and successfully completes a mirror of the My Documents folder including your resume document. Since the changes you made to your resume haven’t been saved from memory to the hard drive the mirror that was just made doesn’t contain your updated resume. You’ll need to save the resume to the hard drive by choosing “File->Save” and run the mirroring software again for it to update the mirror with your changes.
Sorry for the long explanation! But now instead of your resume, imagine a file that you don’t have opened yourself but a program you have running does. Many programs read and write to files which store the state of the program and the program settings. If you restore from these setting files which were backed up during the kind of scenario described above then you could be in for a lot of trouble. All it takes is one incompatible setting in these files to break a software application or worse, your whole system.
In all honesty, that’s the only limitation that I’ve found which matters to me. I’ve used multiple mirroring tools in my past and they do seem to keep getting better with time, even for such deceptively simple tools.
What’s especially cool about most mirroring tools is that you can easily set up multiple mirrors that run on different schedules. I use this capability myself to have a daily mirror and a weekly mirror of my important files.
Professionally, I work with a large group of highly intelligent individuals in the software industry. During my career as a quality assurance engineer I’ve had to explain countless times to very intelligent people what an “image” is and how to go about making them. Most people never seemed to truly understand as was evidenced by their future questions. In fact, most of them would transform into a wall of blank stares after about 30 seconds. I guess what I’m trying to say is that this is a complicated topic that is difficult to explain.
How does imaging software work?
First, we need to understand a little bit about hard drives and how they are organized into usable data, which means understanding what a partition is. A partition can be thought of as a section of a hard drive that has been set aside and formatted according to a standard (the partition type) which tells the operating system how to physically store files in it. When you buy a hard drive, say a 40 gigabyte drive, you have to create a partition on it before you can begin reading and writing data to it. When you make a partition you specify the type of partition and how large (in bytes) you want the partition to be.
Most people only make one partition per drive using all available space from the drive. In the case of our 40 gigabyte drive that would be a 40 gigabyte partition. However, there’s nothing preventing you from creating more than one partition. You can split up the 40 gigabytes into as many partitions as you want. For example, you could create two 20 gigabyte partitions.
In Windows, hard drive partitions are normally represented by a drive letter. So that “C:” actually represents a partition on your hard drive. If you have a “D:” which isn’t your CD-ROM, then chances are it’s either the main partition on a second hard drive, or it’s the second partition on the main drive.
So what does imaging software do? It makes a copy of the partitions on a hard drive and all of the data within the partitions and stores that copy as a single file or set of files. These image files can then be accessed just like other files on your computer. This means that you can burn them to CD-ROM or copy them to other locations.
How does imaging software compare with mirroring software?
You might be asking how this differs from mirroring software. It’s convenient to think of imaging software as being more “low level” than mirroring software. The biggest convenience of imaging software is being able to copy the image of your hard drive to anywhere you want for later restoration. The biggest drawback as opposed to mirroring software is that if you want to restore individual files, you’ll need to use the imaging program to “browse” the image in order to restore the files. Images and imaging applications really weren’t designed to restore single files on a frequent basis. Rather, they were designed to allow complete, quick, and convenient hard drive restoration in case of hard drive failure. Mirroring software is definitely much better at frequently restoring individual files.
Unlike mirroring software, imaging applications are the perfect choice for quick disaster recovery. Most imaging applications provide for some method of booting a computer from floppy or CD-ROM and allowing you to restore your image completely outside of Windows. If your hard drive breaks and your imaging software won’t let you recover an image from outside Windows, then you’ll have to completely reinstall the operating system and the imaging program onto your new hard drive before you can restore the image. This is just plain ridiculous to me and I wouldn’t buy any imaging application that didn’t allow me to restore an image from outside of Windows. Fortunately, most of them do. Most imaging software will also automatically recreate the partitions for you during recovery which is a huge plus for people that aren’t comfortable creating partitions themselves. This is one of the “low level” advantages.
The imaging applications achieve the ability to boot from alternative devices (usually floppies or CD-ROMs) by putting an operating system on the device and a version of the imaging application which runs under that operating system. Make sure your computer can correctly boot off of the device before relying on it. As usual, you’ll want to test things out a bit before committing to the solution.
One other important point to note is that today’s modern imaging software only backs up “used” data on the hard drive. That’s a good thing! Otherwise (as with first generation imaging tools) even if you had 39 gigabytes of free space on your 40 gigabyte drive it would still back up all 40 gigabytes. That’s a pretty big waste of space, isn’t it? Today’s imaging tools will only need to back up that 1 gigabyte of used space instead of all 40.
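To get a feel for the difference on your own computer, the Python sketch below compares the used space on a drive with its total size. The function name is made up for this example, and the figures are only a rough guide to what an imaging tool would need to read, not what any particular tool reports.

```python
import shutil


def used_vs_total(path):
    """Compare used space with total space for the drive that holds `path`."""
    usage = shutil.disk_usage(path)
    return {
        "total_bytes": usage.total,  # what an all-sectors image would cover
        "used_bytes": usage.used,    # what a used-data-only image would cover
        "skipped_bytes": usage.total - usage.used,
    }
```

On Windows you might call used_vs_total("C:\\") to look at the C: partition. On the 40 gigabyte drive from the example, the used figure would be roughly 1 gigabyte and the skipped figure roughly 39.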
What features make for a good imaging tool?
Today’s imaging tools are very space efficient and include algorithms which can compress the image to roughly half the size of the used hard drive space. This is an additional benefit over mirroring software. You could always compress the mirror location if you wanted, but it would greatly lengthen the mirroring process (because it often has to decompress to do the comparison) and therefore isn’t usually a preferred option.
Unlike mirroring software, the features included in imaging software can make or break it for you. There is a huge variety of imaging tools and applications available, at very different quality levels. Some are very limited and only support specific partition types. Look for software which supports the partition types you use or plan on using in the near future (for Windows that would be FAT or FAT16, FAT32, and NTFS). Some imaging software doesn’t compress the images. Having no compression means that instead of needing 20 GB to back up 40 GB of data you need 40 GB to back up 40 GB. Others don’t allow disk spanning (saving the image across multiple disks). Does the imaging tool allow images to be scheduled and created automatically? If not, you’re going to rely on your own memory to manually initiate the backups. I’ve never met anyone who consistently kept up a backup process that wasn’t mostly automated. Does the imaging software allow you to span an image at a later date than when it was created? There have been many times when I made an image in 2 gigabyte chunks but later wanted to store the entire image on CD-ROM. As we all know, you can’t fit 2 gigabytes into 650 megabytes. Fortunately, the imaging software allowed me to take that old backup and split it up into smaller pieces.
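The arithmetic behind spanning is simple enough to sketch in Python. The example below works out how many pieces an image needs and splits a file into numbered pieces. It is only an illustration: the function names and the “.001” naming scheme are made up, and real imaging tools use their own formats for spanned images.

```python
def pieces_needed(image_bytes, piece_bytes):
    """Number of pieces needed to hold an image (division rounded up)."""
    return -(-image_bytes // piece_bytes)


def split_file(path, piece_bytes, buffer_bytes=1024 * 1024):
    """Split `path` into numbered pieces of at most piece_bytes each.

    Pieces are written next to the original as path.001, path.002, ...
    and the list of piece paths is returned.
    """
    piece_paths = []
    with open(path, "rb") as source:
        index = 1
        while True:
            remaining = piece_bytes
            chunk = source.read(min(buffer_bytes, remaining))
            if not chunk:
                break  # nothing left to split
            piece_path = f"{path}.{index:03d}"
            with open(piece_path, "wb") as piece:
                while chunk:
                    piece.write(chunk)
                    remaining -= len(chunk)
                    if remaining == 0:
                        break  # this piece is full
                    chunk = source.read(min(buffer_bytes, remaining))
            piece_paths.append(piece_path)
            index += 1
    return piece_paths
```

For the case above, pieces_needed(2 * 1024**3, 650 * 1024**2) returns 4, so a single 2 gigabyte chunk would have to be spread across four 650 megabyte CD-ROMs.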
Things get very tricky very quickly with needed features. All of the mentioned features are important ones and I use them as examples of make or break features. You’ll definitely find your own list of make or break features based on your own specific requirements. Shopping for disk imaging tools is probably one of the more difficult things to do when thinking about backup solutions.
Limitations with Imaging Software
There is one major limitation that should be noted. With imaging software you cannot choose specific files on a partition to be backed up. It’s simply all or nothing. This is quite unlike mirroring applications which allow you to be very specific as to which files are backed up. A work-around for this limitation is to carefully design the way you store files on your hard drives. I’ll get into this a little later.
Which one is better and why? Mirroring or Imaging?
Neither is better. They both have certain uses which are good for one and not as good for the other. As an example, you wouldn’t want to use mirroring software to back up your entire operating system partition because of the possibility for files to be out of sync. And you wouldn’t want to use imaging software to back up a partition which contains files which don’t need to be backed up. After a little bit of your own research, you’ll quickly learn when to use which method and you’ll probably find that you end up using both methods.
A Quick Tip on Partitioning
Now that you know a little bit about partitions and two major types of backup software tools, I can share a small piece of wisdom commonly known amongst IT and computer professionals: keep your data files on a separate partition from the operating system partition. In other words, your operating system and programs go onto the C: and all your data files or photographs go on a different partition.
The reason is simple. It makes most sense to use an imaging application to back up the operating system partition. If your data and photographs all reside on the same partition as the operating system then every time you want to back up just the OS or just the data you always have to back up both. Having the operating system and data on one partition is not usually an optimal scenario, especially when you want to have multiple simultaneous backups where the operating system needs to be backed up more or less frequently than the data partition. In short, try and keep a separate partition to store all of your important data files. Even if you can see no benefit by doing it today, you’ll be allowing yourself a lot more options in the future if you take the advice.
Don’t interpret this recommendation to mean that you should also install your programs onto the data partition. Don’t do that unless you have specific instruction and assurance from the software manufacturer. Install the programs into whatever the default location they provide is unless you have good reason to do otherwise. There are too many software packages out there that are buggy or untested when installed in a location other than the default location (usually c:\program files\). Also, you’ll find that some programs only install “some” of their files on the specified partition but still install other files on the system partition. It’s just not worth the trouble and risk associated with it. Trust me on this one. I’ve worked on projects where we’ve knowingly shipped software that wouldn’t function if installed somewhere other than the default location. Oh, don’t tell!
I’ve purposely saved software which supports the backup process for last. It is the most generic form of backup tool, but it can also provide a number of unique functions.
One example of backup software most notable to photographers is a program called Archive Creator by PictureFlow. This one deserves special attention because many photographers find it useful. I personally like it because this tool really does nothing more than make it easy for a photographer to burn their photographs onto CD-ROM or DVD and thus have them backed up.
The biggest reason that people’s computers aren’t properly backed up is the hassle and inconvenience of setting up a good process for getting it done and keeping up with the maintenance. Any tool which successfully makes the task easier will always get the thumbs up from me. Archive Creator is that kind of tool.
Always keep an eye out for tools that make the job of backing up easier. The easier you make it on yourself the more successful you will be. I guarantee it.
What’s coming next?
In the next issue of this series, we’ll begin looking at required hardware and storage media including RAID, external hard drives, and DVDs. We’ll also look at a few examples of common real world backup strategies which you can adapt to your specific needs.
Editor's Note - Trevor Ash is employed by a large test and measurement company as a Software Quality Assurance Engineer. Visit Trevor's website at www.persistinglight.com.
TA - NPN 0830