Real case: Storage Spaces Direct physical disk replacement

This week I was in Stockholm to build a Storage Spaces Direct cluster (hyperconverged model). While implementing the cluster, I noticed that a physical disk was failing. I've written this post to show you how I replaced this disk.

Identify the failed physical disk

I was deploying VMFleet when I saw that both virtual disks were in a degraded state. So, I checked the running storage jobs:

Get-StorageSubSystem *Cluster* | Get-StorageJob

Then I opened the Storage Pool view and saw the following:

So, it seemed that this physical disk was not healthy, and I decided to replace it. First, I ran the following cmdlet, because my trust in the Failover Cluster Manager is limited:

Get-StoragePool *S2D* | Get-PhysicalDisk

Then I stored the physical disk object in a PowerShell variable (called $Disk) so I could manipulate the disk. You can adjust the OperationalStatus filter as needed, as long as it selects the right disk.

$Disk = Get-PhysicalDisk |? OperationalStatus -Notlike ok

Retire and physically identify storage device

Next, I set the usage of this disk to Retired to stop writes to it and avoid data loss.

Set-PhysicalDisk -InputObject $Disk -Usage Retired

Next, I tried to remove the physical disk from the Storage Pool, but the disk was in such a bad state that it couldn't be removed. So, I decided to replace it anyway.
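For reference, here is a sketch of the removal attempt with Remove-PhysicalDisk, assuming the pool matched earlier by Get-StoragePool *S2D* and the $Disk variable set above:

Get-StoragePool *S2D* | Remove-PhysicalDisk -PhysicalDisks $Disk

In my case this failed because of the disk's bad state, which is why I went ahead and swapped the hardware anyway.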

I ran the following cmdlet to turn on the storage device LED to identify it easily in the datacenter:

Get-PhysicalDisk |? OperationalStatus -Notlike OK | Enable-PhysicalDiskIdentification

Next, I moved to the server room and, as you can see in the photo below, the LED was turned on. So, I replaced this disk.

Once the disk is replaced, you can turn off the LED:

Get-PhysicalDisk |? OperationalStatus -like OK | Disable-PhysicalDiskIdentification

Add physical disk to storage pool

Until the server is rebooted, the physical disk can't report its enclosure name. The disk automatically joined the Storage Pool, but without the enclosure information. So, you have to reboot the server to get the right information.

Storage Spaces Direct automatically spreads the data across the new disk. This process took almost 30 minutes.
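You can follow the progress of this rebuild with the same storage job cmdlet used at the beginning of this post:

Get-StorageSubSystem *Cluster* | Get-StorageJob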

Sometimes the physical disk doesn't join the Storage Pool automatically. In that case, you can add it to the Storage Pool manually with Add-PhysicalDisk.
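Here is a sketch of the manual add, assuming the pool matched earlier by Get-StoragePool *S2D* and that the new disk is reported as poolable (CanPool = True):

$NewDisk = Get-PhysicalDisk -CanPool $True
Get-StoragePool *S2D* | Add-PhysicalDisk -PhysicalDisks $NewDisk

Check the output of Get-PhysicalDisk -CanPool $True first to make sure it returns only the replacement disk.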

Conclusion

With any storage solution, you can be sure that a physical disk, whether SSD or HDD, will fail eventually. With Storage Spaces Direct, Microsoft provides all the tools required to replace a failed disk properly and easily. Just set the physical disk to Retired, then remove it from the storage pool (if you can). Finally, you can replace the disk.

About Romain Serre

Romain Serre works in Lyon as a Senior Consultant. He is focused on Microsoft technology, especially Hyper-V, System Center, storage, networking, and Cloud OS technology such as Microsoft Azure and Azure Stack. He is an MVP and a Microsoft Certified Solution Expert (MCSE Server Infrastructure & Private Cloud), certified on Hyper-V and Microsoft Azure (Implementing a Microsoft Azure Solution).

8 comments

  1. Why drive not detected without rebooting the server? It is only in this case, or it is a normal for s2d? Windows?

  2. Hi Romain, I am thinking about using s2d in our prod vmware environment, is that fully supported?

    Regards Johan

  3. I paused one node, rebooted it after draining the roles and I’m in a situation now by which the 2 SSD Cache drives reported loss communication in cluster failover manager on the node i rebooted, I can see the disks fine in the BIOS, but not in the OS. Is it worth treating them like a failure and kill them off, reseat them and add them back in. Open to suggestions.
