Unlike disk separation, which we discussed yesterday as a way to optimize database performance, RAID combines multiple disks for a single task.
RAID can be used solely to provide hard disk redundancy and avoid data loss. However, it can also be configured so that the data you write is spread across multiple disks, resulting in faster reads and writes. If you have three disks working as one, then in the ideal case the time to write or read is divided by three. This assumes that all three disks are storing data, and that none of the capacity is spent on parity. This is known as a striped set without parity (RAID 0), which is the fastest method, though it provides no redundancy on its own.
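To make the "divided by three" arithmetic concrete, here is a minimal sketch of how a striped set deals data out across disks. The function names, the tiny chunk size, and the assumption of perfectly parallel disks are illustrative only; real controllers use much larger stripe sizes and never hit the ideal figure exactly.

```python
from typing import List


def stripe(data: bytes, disks: int, chunk_size: int = 4) -> List[List[bytes]]:
    """Deal fixed-size chunks out to the disks round-robin (no parity)."""
    if disks < 1 or chunk_size < 1:
        raise ValueError("disks and chunk_size must be at least 1")
    layout: List[List[bytes]] = [[] for _ in range(disks)]
    for offset in range(0, len(data), chunk_size):
        disk_index = (offset // chunk_size) % disks
        layout[disk_index].append(data[offset:offset + chunk_size])
    return layout


def ideal_striped_time(single_disk_seconds: float, data_disks: int) -> float:
    """Best-case transfer time, assuming every disk works fully in parallel."""
    if data_disks < 1:
        raise ValueError("data_disks must be at least 1")
    return single_disk_seconds / data_disks


if __name__ == "__main__":
    for i, chunks in enumerate(stripe(b"ABCDEFGHIJKL", disks=3)):
        print("disk", i, chunks)
    print("ideal seconds:", ideal_striped_time(9.0, 3))
```

Each disk ends up holding one third of the chunks, which is where the best-case speedup comes from.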
There are many different RAID configurations, and covering them all is far beyond the scope of this editorial. Here are some common configurations.
You can mirror a disk (RAID 1), meaning you have two disks that hold identical copies of the data. When one dies, the other remains.
You can use three or more disks to form a RAID group, and include parity. The parity information lets the contents of any one disk be rebuilt from the remaining disks, so that any single disk may be lost and the volume still remains intact. In this configuration (RAID 5) you give up the equivalent of one disk's capacity to store parity. So, the more disks in the RAID, the smaller the share of disk space lost to parity. With only three disks, you lose 33%. With four disks, you lose 25%, etc.
RAID 5 is the slowest performing of these configurations, particularly for writes, because parity has to be calculated and written along with the data. It is not meant to provide speed. It provides redundancy, just like the mirroring above.
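Parity is easier to picture with a small example. The sketch below assumes simple XOR parity over equal-sized blocks, which is the usual way single-parity RAID is described; the function names are made up for illustration, and a real RAID 5 array also rotates the parity blocks across the disks rather than keeping them on one.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks or len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_overhead(disks: int) -> float:
    """Fraction of raw capacity given up to parity in a single-parity set."""
    if disks < 3:
        raise ValueError("a parity set needs at least three disks")
    return 1 / disks


if __name__ == "__main__":
    disk_a = b"\x01\x02\x03\x04"
    disk_b = b"\x10\x20\x30\x40"
    parity = xor_blocks([disk_a, disk_b])

    # Pretend disk_a died: rebuild it from the surviving disk plus parity.
    rebuilt_a = xor_blocks([disk_b, parity])
    print("rebuilt matches:", rebuilt_a == disk_a)
    print("overhead with 4 disks:", parity_overhead(4))
```

Because XOR is its own inverse, the same function that computes the parity also rebuilds a missing block.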
The fastest RAID is a striped set without parity. That means every disk is used to write data. So, if you create a number of mirrored pairs, and then add the mirrors to a striped set without parity, you get both performance and redundancy. This implementation (RAID 10) is not often used because 50% of your disk space goes to the mirrored copies. However, it is the fastest performing configuration with failover.
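The capacity trade-offs between these configurations come down to simple arithmetic. The helper below is a rough sketch that assumes identical disks and two-way mirrors; the function name and the string labels for the levels are hypothetical, chosen only for this example.

```python
def usable_capacity_gb(level: str, disks: int, disk_size_gb: float) -> float:
    """Usable space for the configurations discussed, assuming identical disks."""
    if level == "0":  # striped set without parity: every disk holds data
        if disks < 2:
            raise ValueError("striping needs at least two disks")
        return disks * disk_size_gb
    if level == "1":  # a single mirrored pair
        if disks != 2:
            raise ValueError("this sketch treats a mirror as exactly two disks")
        return disk_size_gb
    if level == "5":  # one disk's worth of capacity goes to parity
        if disks < 3:
            raise ValueError("a parity set needs at least three disks")
        return (disks - 1) * disk_size_gb
    if level == "10":  # stripe across mirrored pairs: half the raw space
        if disks < 4 or disks % 2:
            raise ValueError("striped mirrors need an even count of four or more")
        return (disks // 2) * disk_size_gb
    raise ValueError("unknown level: " + level)


if __name__ == "__main__":
    for level, count in [("0", 4), ("5", 4), ("10", 4)]:
        print("RAID", level, "with", count, "x 500 GB:",
              usable_capacity_gb(level, count, 500), "GB usable")
```

With four 500 GB disks, the striped mirrors leave half the raw space usable, while the parity set leaves three quarters.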
When you use RAID, in most cases, you won’t have the separate spindles I demonstrated yesterday. It is difficult enough to get four or five different physical disks. Accomplishing the same separation with a RAID set behind each one is impractical in most implementations. However, it can be done, and the performance is amazing.
More likely, once you are dealing with a lot of disks and RAID, you are going to be moving to SAN-based storage. We’ll take a brief look at that tomorrow.
Cheers,
Ben