RAID: Redundant Array of Inexpensive Disks

Malo Le Goff
8 min read · Oct 29, 2022


Nope, it’s not about Raid: Shadow Legends.

RAID is a technology that combines several disk drives (SSDs, HDDs, or even a mix of both) into one or more virtual disks in order to improve performance, redundancy, and/or capacity. These disks are managed by what is called a RAID controller, which lets external entities interact with the set of drives as a whole.
There are different combinations of disk drives with different characteristics. These combinations are called RAID levels, and we’re going to see what they consist of.


I. Standard RAID levels

As mentioned above, each different RAID level provides a different combination of reliability, performance, availability, and capacity. Let’s see how this mix is created from normal drives:

RAID 0

Figure 1: RAID 0 disks

RAID 0 splits data into blocks and distributes them round-robin across the available disks, a process called striping. In the example above, we only have 2 disks: disk 0 ends up holding {A1, A3, A5, A7} and disk 1 holds {A2, A4, A6, A8}.
But there is no effort on redundancy, so if one of the drives fails, the data is lost. The benefit is the performance increase, as reads and writes can be split across all the available disks of the array.
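
To make the striping concrete, here is a minimal Python sketch of the round-robin block placement (the `stripe` helper is made up for illustration, not a real controller API):

```python
# Minimal sketch of RAID 0 block striping (illustrative, not a real controller).
# Block i goes to disk i % n, round-robin.

def stripe(blocks, n_disks):
    """Distribute logical blocks across n_disks in round-robin order."""
    disks = [[] for _ in range(n_disks)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)
    return disks

# Eight blocks over two disks reproduces Figure 1:
# disk 0 gets [A1, A3, A5, A7], disk 1 gets [A2, A4, A6, A8].
blocks = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
for d, contents in enumerate(stripe(blocks, 2)):
    print(f"disk {d}: {contents}")
```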

So RAID 0 is used for applications that need high performance but can tolerate low reliability. To solve the redundancy problem, many RAID levels use parity or mirroring, as we’ll see soon.

RAID 1

Figure 2: RAID 1 disks

RAID 1 is about copying the entire dataset onto another drive (also called mirroring). It improves read performance and reliability, but write performance suffers since the same modifications must be written to every mirrored drive.

With RAID 1, you can tolerate many failures: as long as 1 mirrored drive is still working, your application will keep running properly.
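
As a rough illustration, here is a toy Python mirror (the `Mirror` class is hypothetical, not a real driver) showing why writes pay a penalty while reads and failures are cheap to absorb:

```python
# Toy sketch of RAID 1 mirroring (illustrative only).
class Mirror:
    def __init__(self, n_disks):
        self.disks = [dict() for _ in range(n_disks)]  # block address -> data
        self.alive = [True] * n_disks

    def write(self, addr, data):
        # Every write must hit all surviving mirrors: the write penalty.
        for d, disk in enumerate(self.disks):
            if self.alive[d]:
                disk[addr] = data

    def read(self, addr):
        # Any surviving mirror can serve the read: the read speed-up.
        for d, disk in enumerate(self.disks):
            if self.alive[d]:
                return disk[addr]
        raise IOError("all mirrors have failed, data lost")

m = Mirror(3)
m.write("A1", b"hello")
m.alive[0] = m.alive[1] = False   # two failures...
print(m.read("A1"))               # ...but the data is still readable
```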

RAID 2

Figure 3: RAID 2 disks

RAID 2 also involves striping like RAID 0, but striping by bits rather than by blocks. Each piece of data is split into 1-bit parts, and each sequential bit is placed on a different drive. In contrast, with RAID 0 the data was split into blocks containing several bits. If you compare the figures, A1 in RAID 2 is a single bit, whereas it is a block in RAID 0.

But beyond striping, RAID 2 also implements a reliability feature. In the figure above, you can see there are 2 groups of disks: one stores the data being striped (from disk 0 to disk 3) and the other stores an error-correcting code (from disk 4 to disk 6). This error-correcting code is a sequence of bits stored to check the integrity of the data and to correct it if it is flawed. But this is expensive to implement in the RAID controller, so this configuration is not used anymore.

NB: The error-correcting code in RAID 2 is generated using the Hamming code.
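
For the curious, here is a small Python sketch of the Hamming(7,4) arithmetic a 4-data-disk / 3-ECC-disk layout like Figure 3 relies on (function names are made up for illustration; a real controller works on whole data words, not single bits):

```python
# Hamming(7,4): 4 data bits (one per data disk) get 3 parity bits
# (one per ECC disk), enough to locate and correct any single-bit error.

def hamming74_parity(d1, d2, d3, d4):
    """Parity bits covering the standard Hamming(7,4) positions."""
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return p1, p2, p3

def hamming74_syndrome(p1, p2, d1, p3, d2, d3, d4):
    """A non-zero syndrome is the 1-based position of the flipped bit
    in the codeword (p1, p2, d1, p3, d2, d3, d4)."""
    s1 = p1 ^ d1 ^ d2 ^ d4
    s2 = p2 ^ d1 ^ d3 ^ d4
    s3 = p3 ^ d2 ^ d3 ^ d4
    return s1 + 2 * s2 + 4 * s3

p1, p2, p3 = hamming74_parity(1, 0, 1, 1)
print(hamming74_syndrome(p1, p2, 1, p3, 0, 1, 1))  # 0: no error
print(hamming74_syndrome(p1, p2, 1, p3, 1, 1, 1))  # 5: the bit in position 5 (d2) flipped
```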

RAID 3

Figure 4: RAID 3 disks

The principle is the same as with RAID 2, except that the data is now striped by byte and not by bit. And instead of using Hamming error-correction data stored on separate disks, it uses a single separate disk to store parity data.

We’ll take some time to explain the concept of parity and how it can be used for reliability purposes.

It basically means we store on one disk a piece of data that can be used to retrieve a piece of data stored on another disk. If a drive in the array fails, the remaining data on the other drives can be combined with the parity data (using the Boolean XOR function) to reconstruct the missing data.

Let’s take an example: we have 3 drives and we want to store 2 sequences of bits, 01101101 and 11010100. For read and write performance, we can store these sequences on drive 1 and drive 2. We still have one disk left that we can use for reliability. Indeed, if we apply the XOR function to the 2 sequences of bits:

XOR(01101101, 11010100) = 10111001

This result is now stored on disk 3, and if disk 2 fails, you can recover the stored sequence by applying the XOR function to the data stored on disk 1 and disk 3:

XOR(01101101, 10111001) = 11010100

The principle is the same with RAID 3, but with any number of drives (as long as there are at least 3). Note that it can only survive the failure of a single drive; beyond that, it can’t rebuild the data that has been lost.
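
Here is the same worked example as a short Python sketch (the helper names are made up for illustration):

```python
# The RAID 3 parity example from above, in code.
from functools import reduce

disk1 = 0b01101101
disk2 = 0b11010100

parity = disk1 ^ disk2            # stored on the dedicated parity disk
print(f"{parity:08b}")            # -> 10111001

# Disk 2 fails: XOR the survivors (data + parity) to rebuild it.
rebuilt = disk1 ^ parity
print(f"{rebuilt:08b}")           # -> 11010100, the lost sequence

# More generally, with any number of data disks:
def parity_of(blocks):
    return reduce(lambda a, b: a ^ b, blocks)

def rebuild(surviving_blocks, parity):
    return parity_of(surviving_blocks) ^ parity
```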

Note that with RAID 2 and RAID 3, the controller synchronizes the disks to spin at the same angular orientation, so the array generally cannot service multiple requests simultaneously.

RAID 4

Figure 5: RAID 4 disks

Same as RAID 3, but with block-level striping (like in RAID 0) instead of byte-level striping; the parity still lives on a single dedicated disk.

RAID 5

Figure 6: RAID 5 disks

RAID 5 is again very similar to RAID 4, but this time the parity data is distributed over all the disks instead of being stored on one disk alone.

So with RAID 5, write performance is increased since all RAID members participate in serving write requests. It will still not be as efficient as a pure striping (RAID 0) setup, though, because the parity must be re-written whenever one of the arguments of the XOR operation changes.
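
Here is a toy Python sketch of one possible parity rotation (real controllers use several different layouts; this one is only illustrative):

```python
# One "left-asymmetric"-style RAID 5 layout: the parity block P moves
# to a different disk on each stripe.

def raid5_layout(n_disks, n_stripes):
    rows = []
    for stripe in range(n_stripes):
        parity_disk = (n_disks - 1 - stripe) % n_disks
        row, data_index = [], 0
        for disk in range(n_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{stripe}.{data_index}")
                data_index += 1
        rows.append(row)
    return rows

for row in raid5_layout(4, 4):
    print(row)
# Each stripe's parity lands on a different disk, so parity writes
# (and the read-modify-write cycles they imply) are spread over all members.
```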

RAID 6

Figure 7: RAID 6 disks

It’s the same principle as RAID 5, but this time with double parity, so it can tolerate the failure of up to 2 drives instead of only 1. In other words, for each piece of data (A, B, C, D, or E), 2 parity blocks are distributed among the drives, compared to 1 for RAID 5.
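
As a hedged sketch of how a second, independent parity can work: one common scheme (used, for example, by Linux md) keeps the plain XOR parity P plus a second parity Q computed in the Galois field GF(2^8). The code below uses single bytes standing in for whole blocks:

```python
def gf_mul(a, b, poly=0x11D):
    """Carry-less "peasant" multiplication in GF(2^8), reduced by the
    polynomial conventionally used for RAID 6."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def pq_parity(data):
    """P is the RAID 5-style XOR parity; Q weights each data byte by a
    distinct power of the generator 2, giving a second independent equation."""
    p = q = 0
    g = 1                          # g = 2**i in GF(2^8) for disk i
    for d in data:
        p ^= d
        q ^= gf_mul(g, d)
        g = gf_mul(g, 2)
    return p, q

data = [0x6D, 0xD4, 0x3A]          # one byte per data disk
p, q = pq_parity(data)

# A single lost data disk is rebuilt from P exactly as in RAID 5; losing
# two disks yields two equations (P and Q) in two unknowns, which is what
# the second parity buys.
assert p ^ data[1] ^ data[2] == data[0]
```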

NB: There are also non-standard RAID implementations, like RAID 1E, that I did not cover here for the sake of brevity and clarity. See the Wikipedia article on RAID [1] if you are curious.

Now, we’re going to go beyond the standard RAID levels and see what we can do by combining these levels into what is called Nested RAID levels.

II. Nested RAID levels

RAID 01

Figure 8: RAID 01 disks

As the name hints, RAID 01 is a combination of RAID 0 and RAID 1. To give an example, the original data could be split across 2 disks (RAID 0’s striping), and the drives the data is striped across are then mirrored onto a second set of drives (RAID 1’s contribution).

RAID 10

Figure 9: RAID 10 disks

Another combination of RAID 1 and RAID 0, but the other way around: the RAID array is a stripe of mirrors, not a mirror of stripes like in RAID 01.
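
Here is a toy Python sketch of the two groupings (purely illustrative; the same blocks land on disk either way, but the failure domains differ):

```python
# Four disks, four blocks, under both nested schemes (grouping only, no I/O).
blocks = ["A1", "A2", "A3", "A4"]
half0, half1 = blocks[0::2], blocks[1::2]   # the two halves of the stripe

# RAID 01 (mirror of stripes): disks 0-1 form one striped set,
# disks 2-3 hold a mirror of that whole set.
raid01 = {"disk0": half0, "disk1": half1,
          "disk2": half0, "disk3": half1}

# RAID 10 (stripe of mirrors): disks 0-1 are one mirror pair, disks 2-3
# are another, and the stripe runs across the two pairs.
raid10 = {"disk0": half0, "disk1": half0,
          "disk2": half1, "disk3": half1}

# With RAID 10, losing one disk from each mirror pair is survivable;
# with RAID 01, one failure in each striped set breaks both copies.
print("RAID 01:", raid01)
print("RAID 10:", raid10)
```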

As this article is already long, we’ll stop here for nested architectures, since RAID 10 and RAID 01 are the main ones in use. But keep in mind you also have RAID 03, RAID 05, … See [5] if you want to go into further detail.

NB: We talked a lot about drive failure in the previous sections; if you want to learn more about it, check out this article: https://malolegoff.medium.com/hard-disk-drive-failures-c5515d1adf3c

III. Non-RAID Architectures

Although RAID is the most widespread architecture for configuring multiple hard disk drives, non-RAID drive architectures also exist:

  • JBOD (Just a Bunch Of Disks): an architecture using multiple hard drives exposed as individual devices. Hard drives may be treated independently or may be combined into one or more logical volumes using a volume manager like LVM or mdadm; such volumes are usually called “spanned” volumes.
  • SPAN or BIG: architectures based on spanning methods for combining multiple physical disk drives into a single logical disk. They provide no data redundancy: drives are merely concatenated together, end to beginning, so they appear to be a single large disk. What makes a SPAN or BIG different from RAID configurations is the selection of drives. While RAID usually requires all drives to be of similar capacity, and it is preferred that the same or similar drive models are used for performance reasons (otherwise the slowest disk becomes a bottleneck), a spanned volume has no such requirements (a small address-mapping sketch follows this list).
  • MAID (Massive Array of Idle Disks): an architecture using hundreds to thousands of hard disk drives to provide nearline storage of data, primarily designed for “Write Once, Read Occasionally” workloads, i.e. for storing data that is accessed with a frequency between online and archive/offline.
  • SLED (Single Large Expensive Disk): the opposite approach to RAID: a single large disk instead of an array of smaller ones. As you may have guessed, SLED doesn’t offer the reliability, flexibility, or even the performance (in terms of read/write speeds) that RAID can. Indeed, even though failures rise in proportion to the number of drives, by configuring for redundancy, the reliability of an array of drives can far exceed that of any single large drive.
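
As referenced in the SPAN item above, here is a minimal Python sketch of the address mapping a spanned volume performs (the drive sizes are hypothetical, in blocks):

```python
# Spanned (concatenated) volume: drives are glued end to beginning,
# so their capacities may differ freely.
drive_sizes = [100, 250, 80]     # three mismatched drives

def locate(lba):
    """Translate a logical block address into (drive index, local offset)."""
    for drive, size in enumerate(drive_sizes):
        if lba < size:
            return drive, lba
        lba -= size
    raise ValueError("address beyond end of spanned volume")

print(locate(50))    # (0, 50)  -> still on the first drive
print(locate(120))   # (1, 20)  -> spilled over onto the second drive
print(locate(420))   # (2, 70)  -> near the end of the third drive
```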

Conclusion

In this article, we have seen the standard RAID architectures combining several drives to get a mix of performance, reliability, and capacity. All of these architectures have specific use cases depending on the needs of the application. Yet, despite RAID being the de facto solution, it’s not the only one. You could also go for non-RAID solutions like JBOD or SPAN. But again, it depends on what you need your array of disks for.

References

[1] : https://en.wikipedia.org/wiki/RAID

[2] : https://en.wikipedia.org/wiki/Standard_RAID_levels

[3] : https://en.wikipedia.org/wiki/Parity_bit#RAID_array

[4] : https://www.techopedia.com/definition/12364/single-large-expensive-disk-sled

[5] : https://en.wikipedia.org/wiki/Nested_RAID_levels
