Storage Spaces Direct: Parallel rebuild

Parallel rebuild is a Storage Spaces features that enables to repair a storage pool even if the failed disk is not replaced. This feature is not new to Storage Spaces Direct because it exists also since Windows Server 2012 with Storage Spaces. This is an automatic process which occurs if you have enough free space in the storage pool. This is why Microsoft recommends to leave some free space in the storage pool to allow the parallel rebuild. This amount of free space is often forgotten when designing Storage Spaces Direct solution, this is why I wanted to write this theoretical topic.

How works parallel rebuild

Parallel rebuild needs some free spaces to work. It’s like spare free space. When you create a RAID6 volume, a disk is in spare in case of failure. In Storage Spaces (Direct), instead of spare disk, we have spare free space. Parallel rebuild occurs when a disk fails. If enough of capacity is available, parallel rebuild runs automatically and immediately to restore the resiliency of the volumes. In fact, Storage Spaces Direct creates a new copy of the data that were hosted by the failed disk.

When you receive the new disk (4h later because you took a +4h support :p), you can replace the failed disk. The disk is automatically added to the storage pool if the auto pool option is enabled. Once the disk is added to the storage pool, an automatic rebalance process is run to spread data across all disks to get the best efficiency.

How to calculate the amount of free spaces

Microsoft recommends to leave free space equal to one capacity disk per node until 4 drives:

  • 2-node configuration: leave free the capacity of 2 capacity devices
  • 3-node configuration: leave free the capacity of 3 capacity devices
  • 4-node and more configuration: leave free the capacity of 4 capacity devices

Let’s think about a 4-node S2D cluster with the following storage configuration. I plan to deploy 3-Way Mirroring:

  • 3x SSD of 800GB (Cache)
  • 6x HDD of 2TB (Capacity). Total: 48TB of raw storage.

Because, I deploy a 4-node configuration, I should leave free space equivalent to four capacity drives. So, in this example 8TB should be the amount of free space for parallel rebuild. So, 40TB are available. Because I want to implement 3-Way Mirroring, I divide the available capacity per 3. So 13.3TB is the useable storage.

Now I choose to add a node to this cluster. I don’t need to reserve space for parallel rebuild (regarding the Microsoft recommendation). So I add 12TB capacity (6x HDD of 2TB) in the available capacity for a total of 52TB.

Conclusion

Parallel rebuild is an interesting feature because it enables to restore the resiliency even if the failed disk is not yet replaced. But parallel rebuild has a cost regarding the storage usage. Don’t forget the reserved capacity when you are planning the capacity.

About Romain Serre

Romain Serre works in Lyon as a Senior Consultant. He is focused on Microsoft Technology, especially on Hyper-V, System Center, Storage, networking and Cloud OS technology as Microsoft Azure or Azure Stack. He is a MVP and he is certified Microsoft Certified Solution Expert (MCSE Server Infrastructure & Private Cloud), on Hyper-V and on Microsoft Azure (Implementing a Microsoft Azure Solution).

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

x

Check Also

Deploy a Software-Defined Storage solution with StarWind Virtual SAN

StarWind Virtual SAN is a Software-Defined Storage solution which enables to replicate data across several ...

Don’t worry, Storage Spaces Direct is not dead !

Usually, I’m not speaking about news but only about technical details. But with the release ...

Get Storage Spaces Direct insights from StarWind Manager

Earlier in the week, I published a blog post about Honolulu project and how in ...