Software-Defined – Tech-Coffee
https://www.tech-coffee.net
Fri, 09 Aug 2019 11:34:26 +0000

Don’t do it: enable performance history in an Azure Stack HCI mixed mode cluster
https://www.tech-coffee.net/dont-do-it-enable-performance-history-in-an-azure-stack-hci-mixed-mode-cluster/
Fri, 09 Aug 2019 11:34:16 +0000

The post Don’t do it: enable performance history in an Azure Stack HCI mixed mode cluster appeared first on Tech-Coffee.

Recently I worked with a customer to add two nodes to an existing 2-node Storage Spaces Direct cluster. The existing nodes run Windows Server 2016 while the new ones run Windows Server 2019. So, when I added the new nodes, the cluster was in mixed operating system mode because two different versions of Windows Server were in the cluster. For further information about this process, you can read this topic.

After adding the new nodes, I left the customer site while the data in the cluster was being replicated and spread across them. During this period, the customer ran this command:

Start-ClusterPerformanceHistory

This command starts performance history in a Storage Spaces Direct cluster. In a native Windows Server 2019 cluster, a cluster shared volume (CSV) called ClusterPerformanceHistory is created to store performance metrics. Because the cluster was in mixed operating system mode and not in native Windows Server 2019 mode, the command resulted in unexpected behavior: several ClusterPerformanceHistory CSVs were created. Even when they were deleted, new ClusterPerformanceHistory CSVs kept being created indefinitely.
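A minimal sketch of a guard that could avoid this situation: it starts performance history only once the cluster is no longer in mixed mode. The functional level values (9 for Windows Server 2016, 10 for Windows Server 2019) and the warning text are my assumptions for illustration, not something the customer ran.

```powershell
# Sketch: refuse to start performance history while the cluster is in mixed mode.
# Assumption: ClusterFunctionalLevel 9 = Windows Server 2016, 10 = Windows Server 2019.
$cluster = Get-Cluster

if ($cluster.ClusterFunctionalLevel -ge 10) {
    Start-ClusterPerformanceHistory
}
else {
    # The level only changes after all nodes run Windows Server 2019
    # and Update-ClusterFunctionalLevel has been executed.
    Write-Warning ("Cluster '{0}' is at functional level {1}; finish the upgrade first." -f $cluster.Name, $cluster.ClusterFunctionalLevel)
}
```

Run it from one of the cluster nodes. It changes nothing while the cluster is still in mixed mode.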


The customer tried to run the following cmdlet without success:

Stop-ClusterPerformanceHistory -DeleteHistory

How to resolve the performance history issue

To solve this issue, the customer ran these cmdlets:

$StorageSubSystem = Get-StorageSubSystem Cluster*
$StorageSubSystem | Set-StorageHealthSetting -Name "System.PerformanceHistory.AutoProvision.Enabled" -Value "False"

The option System.PerformanceHistory.AutoProvision.Enabled is set to True when the cmdlet Start-ClusterPerformanceHistory is run. However, the cmdlet Stop-ClusterPerformanceHistory doesn’t disable this setting.
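To check the result, something like the following sketch can be used. It reads the health setting back and lists any remaining performance history virtual disks. The wildcard on the virtual disk name is an assumption based on the volume name mentioned above.

```powershell
# Read back the auto-provision setting; the value should be False after the fix.
Get-StorageSubSystem Cluster* |
    Get-StorageHealthSetting -Name "System.PerformanceHistory.AutoProvision.Enabled"

# List leftover performance history virtual disks, if any.
# Assumption: their friendly name starts with ClusterPerformanceHistory.
Get-VirtualDisk |
    Where-Object FriendlyName -like "ClusterPerformanceHistory*" |
    Select-Object FriendlyName, HealthStatus, Size
```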

Keep Dell Azure Stack HCI hardware up to date with WSSD Catalog
https://www.tech-coffee.net/keep-dell-azure-stack-hci-hardware-up-to-date-with-wssd-catalog/
Fri, 03 May 2019 10:17:53 +0000

Managing firmware and drivers can be a pain during the lifecycle of an Azure Stack HCI solution. Some firmware versions are not supported, while others must be installed to solve an issue. In the case of Dell hardware, a support matrix is available here. If you look at that matrix, you’ll see firmware and drivers for storage devices, Host Bus Adapters, switches, network adapters and so on. It’s nice to have that support matrix, but should I find and download each driver or firmware manually? Of course not.

For a few months now, Dell has provided a WSSD catalog that lets you download only the latest supported firmware and drivers for Azure Stack HCI and for your hardware. You can use this catalog from OpenManage (OME) or from Dell EMC Repository Manager. I prefer the second option because not all my customers have deployed OME. Dell EMC Repository Manager can be downloaded from this link.

Download the WSSD Catalog

The best place to download the WSSD Catalog is this webpage. Download the file and unzip it. You should get two files: the catalog and the signature file.

Add the catalog to Dell EMC Repository Manager

Now that you have the WSSD catalog file, you can add it to Dell EMC Repository Manager. When you open it, just click on Add Repository.

Specify a repository name and click on Choose File in base Catalog. Then select the WSSD catalog file.

Then you have to choose the Repository Type: either Manual or Integration. Integration is nice because you can specify an iDRAC name or IP address, so that only the firmware and drivers for that specific hardware are downloaded. You can also choose Manual to prepare the deployment of a new infrastructure. In this example, I choose Manual and select the 740XD model and Windows Server 2019. When you have finished, click on Add.

Create a custom SUU

Once the repository is added, you should see the firmware and drivers. Select the repository and click on Export.

Then select the SUU ISO tab and choose the location where the SUU file will be exported.

Once the export job is finished, you get an SUU image file to update your Azure Stack HCI servers. You just have to copy it to each server, mount the ISO and run suu.cmd -e. Alternatively, you can write a script that packages and deploys the firmware and drivers automatically.
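The manual step (mount the ISO, run suu.cmd -e) can be sketched as follows. The ISO path is hypothetical, and I assume that suu.cmd sits at the root of the image. Treat it as a starting point rather than a finished deployment script.

```powershell
# Hypothetical location of the SUU ISO copied to the server.
$isoPath = "C:\Temp\SUU.iso"

# Mount the ISO and find the drive letter Windows assigned to it.
$image = Mount-DiskImage -ImagePath $isoPath -PassThru
$driveLetter = ($image | Get-Volume).DriveLetter

# Assumption: suu.cmd is at the root of the ISO. The -e switch comes from the step above.
& "${driveLetter}:\suu.cmd" -e

# Unmount the image once the update has finished.
Dismount-DiskImage -ImagePath $isoPath
```

Firmware updates usually require a reboot, so in a cluster I would run this on one node at a time, after putting the node in maintenance.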

Conclusion

The WSSD Catalog provided by Dell eases the management of firmware and drivers in an Azure Stack HCI solution. They have to be updated several times a year, and before this catalog that was time-consuming. Now it’s straightforward and you have no excuse not to update your platform.

Design the network for a Storage Spaces Direct cluster
https://www.tech-coffee.net/design-the-network-for-a-storage-spaces-direct-cluster/
Mon, 07 Jan 2019 12:57:57 +0000

In a Storage Spaces Direct cluster, the network is the most important part. If the network is not well designed or implemented, you can expect poor performance and high latency. All software-defined storage solutions rely on a healthy network, whether it is Nutanix, VMware vSAN or Microsoft S2D. When I audit an S2D configuration, most of the time the issue comes from the network. This is why I wrote this topic: how to design the network for a Storage Spaces Direct cluster.

Network requirements

The following statements come from the Microsoft documentation:

Minimum (for small scale 2-3 node)

  • 10 Gbps network interface
  • Direct-connect (switchless) is supported with 2-nodes

Recommended (for high performance, at scale, or deployments of 4+ nodes)

  • NICs that are remote-direct memory access (RDMA) capable, iWARP (recommended) or RoCE
  • Two or more NICs for redundancy and performance
  • 25 Gbps network interface or higher

As you can see, for an S2D cluster of four nodes or more, Microsoft recommends a 25 Gbps network. I think it is a good recommendation, especially for an all-flash configuration or when NVMe devices are used. Because S2D uses SMB for communication between nodes, RDMA can be leveraged (SMB Direct).

RDMA: iWARP and RoCE

Do you remember DMA (Direct Memory Access)? This feature allows a device attached to a computer (like an SSD) to access memory without going through the CPU, which improves performance and reduces CPU usage. RDMA (Remote Direct Memory Access) is the same thing but across the network: it allows a remote device to access local memory directly. Thanks to RDMA, CPU usage and latency are reduced while throughput is increased. RDMA is not mandatory for S2D but it is recommended. Last year Microsoft stated that RDMA increases S2D performance by about 15% on average. So, I strongly recommend implementing it if you deploy an S2D cluster.

Microsoft supports two RDMA implementations: iWARP (Internet Wide-area RDMA Protocol) and RoCE (RDMA over Converged Ethernet). And I can tell you one thing about these implementations: this is war! Microsoft recommends iWARP while a lot of consultants prefer RoCE. In fact, Microsoft recommends iWARP because it requires less configuration than RoCE. Because of RoCE, the number of Microsoft support cases was high. But consultants prefer RoCE because Mellanox is behind this implementation. Mellanox provides good switches and network adapters with great firmware and drivers. Each time a new Windows Server build is released, a supported Mellanox driver and firmware are released as well.

If you want more information about RoCE and iWARP, I suggest this series of topics from Didier Van Hoye.

Switch Embedded Teaming

Before choosing the right switches, cables and network adapters, it’s important to understand the software side. In Windows Server 2012 R2 and earlier, you had to create a NIC team. When the team was created, a team NIC (tNIC) was created as well. The tNIC is a sort of virtual NIC connected to the team. Then you could create the virtual switch connected to the tNIC. After that, the virtual NICs for management, storage, VMs and so on were added.

In addition to its complexity, this solution prevents the use of RDMA on virtual network adapters (vNICs). This is why Microsoft improved this part in Windows Server 2016. Now you can implement Switch Embedded Teaming (SET):

This solution reduces network complexity, and vNICs can support RDMA. However, there are some limitations with SET:

  • All physical network adapters (pNICs) must be identical (same model, same firmware, same drivers)
  • Maximum of 8 pNICs in a SET
  • Only the following load balancing modes are supported: Hyper-V Port (for specific cases) and Dynamic. This limitation is a good thing because Dynamic is the appropriate choice in most cases.

For more information about load balancing modes, Switch Embedded Teaming and its limitations, you can read this documentation. Switch Embedded Teaming brings another great advantage: you can create an affinity between a vNIC and a pNIC. Consider a SET with two pNICs as team members. On this vSwitch, you create two vNICs for storage. You can create an affinity between the first vNIC and the first pNIC, and another between the second vNIC and the second pNIC. This ensures that every pNIC is used.

The designs presented below are based on Switch Embedded Teaming.
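As an illustration, a SET can be created with a few Hyper-V cmdlets. This is only a sketch: the switch name and adapter names are hypothetical, and you should check the team members against your own hardware.

```powershell
# Hypothetical names: replace them with your own physical adapter and switch names.
$pNics = "pNIC1", "pNIC2"
$switchName = "SETswitch"

# Create the virtual switch with Switch Embedded Teaming across both pNICs.
# No management vNIC is created here; vNICs are added separately.
New-VMSwitch -Name $switchName -NetAdapterName $pNics -EnableEmbeddedTeaming $true -AllowManagementOS $false

# Set the load balancing mode explicitly (Dynamic or HyperVPort are the supported values).
Set-VMSwitchTeam -Name $switchName -LoadBalancingAlgorithm Dynamic

# Check the team members and the load balancing mode.
Get-VMSwitchTeam -Name $switchName | Format-List
```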

Network design: VM and storage traffic separated

Some customers want to separate VM traffic from storage traffic. The first reason is that they want to connect VMs to a 1 Gbps network. Because the storage network requires 10 Gbps, the two must be separated. The second reason is that they want to dedicate devices, such as switches, to storage. The following diagram shows this kind of design:

If you have 1 Gbps network ports for VMs, you can connect them to 1 Gbps switches while the network adapters for storage are connected to 10 Gbps switches.

Whatever you choose, the VMs are connected to the Switch Embedded Teaming (SET) and you have to create a vNIC for management on top of it. So, when you connect to the nodes through RDP, you go through the SET. The physical NICs (pNICs) dedicated to storage (those on the right of the diagram) are not teamed. Instead, we leverage SMB Multichannel, which uses multiple network connections simultaneously. So, both network adapters are used to establish SMB sessions.

Thanks to Simplified SMB Multichannel, both pNICs can belong to the same subnet and VLAN. Live Migration is configured to use this subnet and to leverage SMB.
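A short sketch to verify this on a node: it lists the RDMA-enabled adapters, shows the SMB Multichannel connections currently in use, and sets Live Migration to use SMB. The property selection is my choice for readability. Nothing here comes from a specific customer configuration.

```powershell
# List the adapters on which RDMA is enabled.
Get-NetAdapterRdma |
    Where-Object Enabled |
    Select-Object Name, InterfaceDescription

# Show the SMB Multichannel connections and whether the client side is RDMA capable.
# Connections only appear while SMB traffic exists between the nodes.
Get-SmbMultichannelConnection |
    Select-Object ServerName, ClientIpAddress, ServerIpAddress, ClientRdmaCapable

# Configure Live Migration on this host to use SMB.
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB
```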

Network Design: Converged topology

The following picture shows my favorite design: a fully converged network. For this kind of topology, I recommend at least a 25 Gbps network, especially with NVMe or all-flash. In this case, only one SET is created with two or more pNICs. Then we create the following vNICs:

  • 1x vNIC for host management (RDP, AD and so on)
  • 2x vNIC for Storage (SMB, S2D and Live-Migration)

The storage vNICs can belong to the same subnet and VLAN thanks to Simplified SMB Multichannel. Live Migration is configured to use this network and the SMB protocol. RDMA is enabled on these vNICs as well as on the pNICs if they support it. Then an affinity is created between vNICs and pNICs.
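The vNIC part of this design can be sketched as follows. I assume that a SET switch already exists. The switch, vNIC and pNIC names are hypothetical, and IP addressing and VLANs are left out.

```powershell
# Hypothetical names; the SET switch is assumed to exist already.
$switchName = "SETswitch"

# One vNIC for host management and two vNICs for storage (SMB, S2D, Live Migration).
Add-VMNetworkAdapter -ManagementOS -SwitchName $switchName -Name "Management"
Add-VMNetworkAdapter -ManagementOS -SwitchName $switchName -Name "SMB01"
Add-VMNetworkAdapter -ManagementOS -SwitchName $switchName -Name "SMB02"

# Enable RDMA on the storage vNICs (host vNICs appear as "vEthernet (<name>)").
Enable-NetAdapterRdma -Name "vEthernet (SMB01)", "vEthernet (SMB02)"

# Create the affinity: each storage vNIC is mapped to one pNIC of the team.
Set-VMNetworkAdapterTeamMapping -ManagementOS -VMNetworkAdapterName "SMB01" -PhysicalNetAdapterName "pNIC1"
Set-VMNetworkAdapterTeamMapping -ManagementOS -VMNetworkAdapterName "SMB02" -PhysicalNetAdapterName "pNIC2"
```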

I love this design because it is really simple. You have one network adapter for the BMC (iDRAC, iLO, etc.) and only two network adapters for S2D and VMs. So, the physical installation in the datacenter and the software configuration are easy.

Network Design: 2-node S2D cluster

Because the two nodes can be directly attached in a 2-node configuration, you don’t need switches for storage. However, virtual machines and the host management vNIC still require connectivity, so switches are needed for these uses. But they can be 1 Gbps switches, which drastically reduces the cost of the solution.

S2D Real case: detect a lack of cache
https://www.tech-coffee.net/s2d-real-case-detect-a-lack-of-cache/
Tue, 04 Dec 2018 18:11:39 +0000

Last week I worked for a customer who was experiencing a performance issue on an S2D cluster. The customer’s infrastructure is composed of one compute cluster (Hyper-V) and one 4-node S2D cluster. First, I checked whether the issue was related to the network, and then whether a hardware failure was causing the performance drop. Then I ran the watch-cluster.ps1 script from VMFleet.

The following screenshot comes from the watch-cluster.ps1 script. As you can see, one CSV has almost 25 ms of latency. High latency impacts overall performance, especially when IO-intensive applications are hosted. If we look at the cache, a lot of misses per second are registered, especially on the high-latency CSV. But why would a high Miss/sec rate produce high latency?

What happens in case of lack of cache?

The solution I troubleshot is composed of 2 SSDs and 8 HDDs per node. The cache ratio is 1:4 and the cache capacity is almost 6.5% of the raw capacity. The IO path in normal operation is depicted in the following diagram:

In the current situation, I have a lot of Miss/sec, which means that the SSDs cannot handle these IOs because there is not enough cache. The diagram below depicts the IO path for cache misses:

You can see that in case of a miss, the IO goes to the HDDs directly without being cached on the SSDs. HDDs are really slow compared to SSDs, and each time an IO hits this kind of storage device directly, latency increases. When latency increases, overall performance decreases.

How to resolve that?

To resolve this issue, I told the customer to add two SSDs to each node. These SSDs should be equivalent (or nearly so) to those already installed in the nodes. By adding SSDs, I improve the cache ratio to 1:2 and the cache capacity to 10% of the raw capacity.
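The sizing arithmetic can be sketched with a small helper. The function name and the drive sizes are hypothetical (the customer's drive sizes are not given here), so the percentages it prints will not match the figures above. Only the drive counts come from this case.

```powershell
# Hypothetical helper: cache-to-capacity drive ratio and cache size as a share of raw capacity.
function Get-CacheSizing {
    param(
        [int]$CacheDrives,
        [double]$CacheDriveTB,
        [int]$CapacityDrives,
        [double]$CapacityDriveTB
    )
    $cacheTB = $CacheDrives * $CacheDriveTB
    $rawTB = $CapacityDrives * $CapacityDriveTB
    [pscustomobject]@{
        DriveRatio        = "1:{0}" -f ($CapacityDrives / $CacheDrives)
        CachePercentOfRaw = [math]::Round(100 * $cacheTB / $rawTB, 1)
    }
}

# Per node, before and after adding two SSDs (drive sizes are illustrative assumptions).
Get-CacheSizing -CacheDrives 2 -CacheDriveTB 0.8 -CapacityDrives 8 -CapacityDriveTB 4
Get-CacheSizing -CacheDrives 4 -CacheDriveTB 0.8 -CapacityDrives 8 -CapacityDriveTB 4
```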

It’s really important to size the cache tier properly when you design your solution to avoid this issue. As a fellow MVP said: storage is cheap, downtime is expensive.

Storage Spaces Direct: performance tests between 2-Way Mirroring and Nested Resiliency
https://www.tech-coffee.net/storage-spaces-direct-performance-tests-between-2-way-mirroring-and-nested-resiliency/
Wed, 17 Oct 2018 09:38:52 +0000

Microsoft has released Windows Server 2019 with a new resiliency mode called nested resiliency. This mode makes it possible to handle two failures in a two-node S2D cluster. Nested resiliency comes in two flavors: nested two-way mirroring and nested mirror-accelerated parity. I’m certain that nested two-way mirroring is faster than nested mirror-accelerated parity, but the first one provides only 25% of usable capacity while the second one provides 40%. The customers I have discussed this with prefer more usable capacity over performance. Therefore, I expect to deploy nested mirror-accelerated parity more often than nested two-way mirroring.

Before Windows Server 2019, two-way mirroring (which provides 50% of usable capacity) was mandatory in a two-node S2D cluster. Now with Windows Server 2019, we have the choice. So, I wanted to compare performance between two-way mirroring and nested mirror-accelerated parity. Moreover, I wanted to know whether compression and deduplication have an impact on performance and CPU usage.
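To put these percentages in perspective, here is a small worked example. It applies the efficiency figures quoted above to the raw capacity of my lab (two nodes with six 200 GB SSDs each). The helper name is hypothetical, and the calculation deliberately ignores reserve capacity and metadata, so read the results as rough orders of magnitude.

```powershell
# Hypothetical helper: usable capacity for a given raw capacity and efficiency.
function Get-UsableCapacityGB {
    param([double]$RawGB, [double]$Efficiency)
    [math]::Round($RawGB * $Efficiency, 0)
}

# Raw capacity of the lab: 2 nodes x 6 SSDs x 200 GB.
$rawGB = 2 * 6 * 200

# Efficiency figures quoted in this post.
$efficiency = [ordered]@{
    "Two-way mirroring"                = 0.50
    "Nested two-way mirroring"         = 0.25
    "Nested mirror-accelerated parity" = 0.40
}

foreach ($mode in $efficiency.Keys) {
    "{0}: {1} GB usable out of {2} GB raw" -f $mode, (Get-UsableCapacityGB -RawGB $rawGB -Efficiency $efficiency[$mode]), $rawGB
}
```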

N.B.: I ran the tests in my lab, which is composed of do-it-yourself servers. What I want to show is a “trend”: what the bottleneck could be in some cases, and whether nested resiliency has an impact on performance. So please, don’t blame me in the comment section 🙂

Test platform

I run my tests on the following platform composed of two nodes:

  • CPU: 1x Xeon 2620v2
  • Memory: 64GB of DDR3 ECC Registered
  • Storage:
    • OS: 1x Intel SSD 530 128GB
    • S2D HBA: Lenovo N2215
    • S2D storage: 6x SSD Intel S3610 200GB
  • NIC: Mellanox Connectx 3-Pro (Firmware 5.50)
  • OS: Windows Server 2019 GA build

Both servers are connected to two Ubiquiti ES-16-XG switches. Even though these switches don’t support PFC/ETS and so on, RDMA is working (I tested it with the Test-RDMA script). I don’t have enough traffic in my lab to disturb RDMA, even without a proper configuration. Although I implemented it this way in my lab, it is not supported and you should not configure a production environment like this. On the Windows Server side, I added both Mellanox network adapters to a SET and I created three virtual network adapters:

  • 1x Management vNIC for RDP, AD and so on (routed)
  • 2x SMB vNICs for Live Migration and SMB traffic (not routed). Each vNIC is mapped to a pNIC.

To test the solution, I used VMFleet. First, I created volumes in two-way mirroring without compression, then I enabled deduplication. Afterwards, I deleted the volumes and recreated them in nested mirror-accelerated parity without deduplication. Finally, I enabled compression and deduplication.
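For reference, enabling deduplication on a volume can be sketched like this. The volume path and the HyperV usage type are assumptions for illustration. They are not necessarily the exact settings used for these tests.

```powershell
# Hypothetical CSV mount point; the Data Deduplication feature must be installed on the nodes.
$volumePath = "C:\ClusterStorage\CSV-01"

# Enable deduplication (and compression) on the volume.
# Assumption: the HyperV usage type, since the volume hosts running VMs.
Enable-DedupVolume -Volume $volumePath -UsageType HyperV

# Start an optimization job now instead of waiting for the schedule.
Start-DedupJob -Volume $volumePath -Type Optimization

# Check the savings once the job has completed.
Get-DedupVolume -Volume $volumePath | Select-Object Volume, SavingsRate, SavedSpace
```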

I ran VMFleet with a block size of 4 KB, 30 outstanding IOs and 2 threads per VM.

Two-Way Mirroring without deduplication results

First, I ran the test without write workloads to see the “maximum” performance I can get. My cluster is able to deliver 140K IOPS with a CPU workload of 82%.

In the following test, I added 30% of write workloads. The total IOPS is almost 97K for 87% of CPU usage.

As you can see, RSS and VMMQ are configured correctly because all cores are used.

Two-Way Mirroring with deduplication

First, you can see that deduplication is efficient because I saved 70% of total storage.

Then I ran a VMFleet test and, as you can see, there is a huge drop in performance. Looking closely at the screenshot below, you can see it’s because my CPU reaches almost 97%. I’m sure that with a better CPU, I would get better performance. So, first trend: deduplication has an impact on CPU usage, and if you plan to use this feature, don’t choose a low-end CPU.

With 30% of writes added, I can’t expect better performance. The CPU still limits the overall cluster performance.

Nested Mirror-Accelerated Parity without deduplication

After recreating the volumes, I ran a test with 100% reads. Compared to two-way mirroring, there is a slight drop: I lost “only” 17K IOPS, reaching 123K IOPS. The CPU usage is 82%. You can also see that the latency is great (2 ms).

Then I added 30% of writes, and we can see the performance drop compared to two-way mirroring. My CPU usage reached 95%, which limits performance (but the latency stays contained, at 6 ms on average). So nested mirror-accelerated parity requires more CPU than two-way mirroring.

Nested Mirror-Accelerated Parity with deduplication

First, deduplication also works great on nested mirror-accelerated parity volumes. I saved 75% of storage.

As with two-way mirroring with deduplication, I get poor performance because of my CPU (97% usage).

Conclusion

First, deduplication works great if you need to save space, at the cost of higher CPU usage. Secondly, nested mirror-accelerated parity requires more CPU than 2-way mirroring, especially when there are write workloads. The following charts illustrate the CPU bottleneck. With deduplication, the latency always increases, and I think this is because of the CPU bottleneck. This is why I recommend being careful about the CPU choice.

Another interesting thing is that nested mirror-accelerated parity produces only a slight performance drop compared to 2-way mirroring, but brings the ability to support two failures in the cluster. With deduplication enabled, we can save space and thus increase the usable capacity. In a two-node configuration, I’ll recommend nested mirror-accelerated parity to customers, while paying attention to the CPU.

Support two failures in 2-node S2D cluster with nested resiliency
https://www.tech-coffee.net/support-two-failures-in-2-node-s2d-cluster-with-nested-resiliency/
Mon, 08 Oct 2018 09:19:01 +0000

Microsoft just released Windows Server 2019 with a lot of improvements for Storage Spaces Direct. One of these improvements is nested resiliency, which is specific to 2-node S2D clusters. Thanks to this feature, a 2-node S2D cluster can now support two failures, at the cost of more storage dedicated to resiliency. Nested resiliency comes in two flavors:

  • Nested two-way mirroring: it’s more or less a 4-way mirror that provides 25% of usable storage
  • Nested mirror-accelerated parity: it’s a volume with a mirror tier and a parity tier

The following slide comes from a deck presented at Ignite.

To support two failures, a huge amount of storage is consumed by resiliency. Fortunately, Windows Server 2019 allows deduplication on ReFS volumes. But be careful about CPU usage and storage device performance. I’ll talk about that in a future topic.

Create a Nested Two-Way Mirror volume

To create a Nested 2-Way Mirroring volume, you have to create a storage tier and a volume. Below you can find an example from my lab (all-flash solution) where the storage pool is called VMPool:

New-StorageTier -StoragePoolFriendlyName VMPool -FriendlyName Nested2wMirroringTier -ResiliencySettingName Mirror -NumberOfDataCopies 4 -MediaType SSD

New-Volume -StoragePoolFriendlyName VMPool -FriendlyName CSV-01 -StorageTierFriendlyNames Nested2wMirroringTier -StorageTierSizes 500GB

Create a Nested Mirror-Accelerated Parity volume

To create a nested mirror-accelerated parity volume, you need to create two tiers and a volume composed of these tiers. In the example below, I create two nested mirror-accelerated parity volumes:

New-StorageTier -StoragePoolFriendlyName VMPool -FriendlyName Nested2wMirroringTier -MediaType SSD -ResiliencySettingName Mirror -NumberOfDataCopies 4

New-StorageTier -StoragePoolFriendlyName VMPool -FriendlyName NestedSParityTier -ResiliencySettingName Parity -NumberOfDataCopies 2 -PhysicalDiskRedundancy 1 -NumberOfGroups 1 -FaultDomainAwareness StorageScaleUnit -ColumnIsolation PhysicalDisk -MediaType SSD

New-Volume -StoragePoolFriendlyName VMPool -FriendlyName PYHYV01 -StorageTierFriendlyNames Nested2wMirroringTier,NestedSParityTier -StorageTierSizes 80GB, 150GB

New-Volume -StoragePoolFriendlyName VMPool -FriendlyName PYHYV02 -StorageTierFriendlyNames Nested2wMirroringTier,NestedSParityTier -StorageTierSizes 80GB, 150GB
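Once the tiers and volumes exist, a quick check such as the following sketch shows whether they were created with the expected resiliency settings. The wildcards are assumptions based on the names used in this example.

```powershell
# List the nested tiers (templates and the per-volume tiers derived from them).
Get-StorageTier |
    Where-Object FriendlyName -like "*Nested*" |
    Select-Object FriendlyName, ResiliencySettingName, NumberOfDataCopies, PhysicalDiskRedundancy, MediaType

# Compare the volume size with the space actually consumed in the pool.
Get-VirtualDisk -FriendlyName "PYHYV0*" |
    Select-Object FriendlyName, HealthStatus, Size, FootprintOnPool
```

The difference between Size and FootprintOnPool makes the cost of nested resiliency visible.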

Conclusion

Some customers didn’t want to deploy a 2-node S2D cluster in branch offices because it could not support two failures. Thanks to nested resiliency, we can now support two failures in a 2-node cluster. However, be careful about the storage consumed by resiliency and about the overall cluster performance if you enable deduplication.

Storage Spaces Direct: Parallel rebuild
https://www.tech-coffee.net/storage-spaces-direct-parallel-rebuild/
Fri, 22 Jun 2018 15:39:59 +0000

Parallel rebuild is a Storage Spaces feature that repairs a storage pool even if the failed disk has not been replaced. This feature is not new in Storage Spaces Direct: it has existed since Windows Server 2012 with Storage Spaces. It is an automatic process that occurs if you have enough free space in the storage pool. This is why Microsoft recommends leaving some free space in the storage pool to allow parallel rebuild. This amount of free space is often forgotten when designing a Storage Spaces Direct solution, which is why I wanted to write this theoretical topic.

How parallel rebuild works

Parallel rebuild needs some free space to work. It’s like spare free space. When you create a RAID 6 volume, a disk is kept as a spare in case of failure. In Storage Spaces (Direct), instead of a spare disk, we have spare free space. Parallel rebuild occurs when a disk fails. If enough capacity is available, parallel rebuild runs automatically and immediately to restore the resiliency of the volumes. In fact, Storage Spaces Direct creates a new copy of the data that was hosted on the failed disk.

When you receive the new disk (4h later because you took a 4-hour support contract :p), you can replace the failed disk. The disk is automatically added to the storage pool if the auto pool option is enabled. Once the disk is added to the storage pool, an automatic rebalance process runs to spread data across all disks for the best efficiency.
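To follow a rebuild or a rebalance, the standard storage cmdlets are enough. A minimal sketch (the property selection is only my preference):

```powershell
# Find the disks that are not healthy (the failed disk should show up here).
Get-PhysicalDisk |
    Where-Object HealthStatus -ne "Healthy" |
    Select-Object FriendlyName, SerialNumber, HealthStatus, OperationalStatus

# Follow the repair and rebalance jobs while they run.
Get-StorageJob |
    Select-Object Name, JobState, PercentComplete, BytesProcessed, BytesTotal

# Check that the volumes are healthy again once the jobs have finished.
Get-VirtualDisk |
    Select-Object FriendlyName, HealthStatus, OperationalStatus
```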

How to calculate the amount of free space

Microsoft recommends leaving free space equal to one capacity drive per node, up to a maximum of four drives:

  • 2-node configuration: leave free the capacity of 2 capacity devices
  • 3-node configuration: leave free the capacity of 3 capacity devices
  • 4-node and more configuration: leave free the capacity of 4 capacity devices

Consider a 4-node S2D cluster with the following storage configuration per node. I plan to deploy 3-way mirroring:

  • 3x SSD of 800GB (Cache)
  • 6x HDD of 2TB (Capacity). Total for the cluster: 48TB of raw storage.

Because I deploy a 4-node configuration, I should leave free space equivalent to four capacity drives. So, in this example, 8TB should be reserved for parallel rebuild, which leaves 40TB available. Because I want to implement 3-way mirroring, I divide the available capacity by 3. So 13.3TB is the usable storage.

Now I choose to add a node to this cluster. I don’t need to reserve more space for parallel rebuild (according to the Microsoft recommendation). So I add 12TB of capacity (6x HDD of 2TB) to the available capacity, for a total of 52TB.
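The calculation above can be written as a small function. The function and parameter names are hypothetical. The rule it applies (one capacity drive per node, capped at four drives) and the sample values come from this example.

```powershell
# Hypothetical helper: raw, reserved, available and usable capacity for mirrored volumes.
function Get-S2DUsableCapacity {
    param(
        [int]$Nodes,
        [int]$CapacityDrivesPerNode,
        [double]$DriveSizeTB,
        [int]$DataCopies = 3   # 3-way mirroring keeps three copies of the data
    )
    $rawTB = $Nodes * $CapacityDrivesPerNode * $DriveSizeTB
    # Reserve one capacity drive per node, up to a maximum of four drives.
    $reserveTB = [math]::Min($Nodes, 4) * $DriveSizeTB
    $availableTB = $rawTB - $reserveTB
    [pscustomobject]@{
        RawTB       = $rawTB
        ReserveTB   = $reserveTB
        AvailableTB = $availableTB
        UsableTB    = [math]::Round($availableTB / $DataCopies, 1)
    }
}

# 4-node cluster from the example, then the same cluster with a fifth node.
Get-S2DUsableCapacity -Nodes 4 -CapacityDrivesPerNode 6 -DriveSizeTB 2
Get-S2DUsableCapacity -Nodes 5 -CapacityDrivesPerNode 6 -DriveSizeTB 2
```

With the fifth node, the reserve stays at 8TB, so the whole 12TB of the new node goes to the available capacity.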

Conclusion

Parallel rebuild is an interesting feature because it restores resiliency even if the failed disk has not yet been replaced. But parallel rebuild has a cost in terms of storage usage. Don’t forget the reserved capacity when you plan your capacity.

Convert VMs with StarWind V2V converter
https://www.tech-coffee.net/convert-vms-with-starwind-v2v-converter/
Fri, 15 Jun 2018 13:47:37 +0000

StarWind V2V Converter is a free tool provided by StarWind to convert virtual hard drives. You can convert a Hyper-V virtual hard drive (VHDX) to a VMware ESXi virtual hard drive (VMDK) and vice versa. Other virtual hard drive formats, such as qcow2, are also supported. Because StarWind V2V converts only the virtual hard drive, you can’t automate the migration between hypervisors. So, this tool is not appropriate if you have hundreds or thousands of VMs. Usually, with that number of virtual machines, a smarter (and therefore paid) product is required. But if you have a small number of VMs, these smarter products are overkill and StarWind V2V can help you. In this topic, we’ll see how to convert a Hyper-V VM to a VMware VM.

Convert a Hyper-V VM with StarWind V2V

You can download StarWind V2V Converter from this link. Once you have downloaded and installed the product, you can launch it. To convert a Hyper-V VM, select Microsoft Hyper-V Server.

Then specify the name of the Hyper-V Host and credentials. Unfortunately, you can’t specify a cluster name.

Then select the virtual hard drive you want to convert and click on Next.

Next, select VMware direct conversion to ESXi. The description says that only ESXi 5.0, 5.5 and 6.0 are supported, but I have successfully converted a VHDX to ESXi 6.7.

In the next window, specify the IP address and credentials of the target ESXi server.

Next select the datastore where you want to store the converted virtual hard drive.

While the VM is being converted, a progress bar is displayed.

When the migration is finished, you can connect to your ESXi and create a new VM with the same settings. Remove the default hard disk, then add an existing hard disk and select the disk you’ve just converted.

Now you can start the VM. As you can see, the VM is working (the VM I converted was also sitting on the License terms screen). Once you are logged in to the operating system, you can install the VMware Tools.

Convert a VMware VM to a Hyper-V VM

This time we want to convert a VMware VM to a Hyper-V VM. So I choose VMware ESXi Server.

Then specify the IP address and credentials of the source VMware ESXi server.

N.B.: Migration from VMware ESXi 6.7 doesn’t work with StarWind V2V Converter. I had to use VMware ESXi 6.5 to take the screenshots.

Next, select the VMDK you want to convert and click on Next. As you can see in the following screenshot, you can’t convert two VMDKs at the same time. It’s a shame.

In the next window, choose Microsoft VHDX image.

Specify the hostname and credentials of the Hyper-V host. You can’t specify a cluster.

To finish, choose the destination folder and click on Next to start the conversion process.

Conclusion

StarWind V2V is not the smartest converter on the market, and some features are missing. But if you have a small number of VMs and you don’t want to pay for a conversion product, StarWind V2V can help you. You can migrate VMs one by one from many hypervisors. Thanks to this tool, you can plan a migration from Hyper-V to VMware or vice versa. However, if you have hundreds of VMs, don’t use this tool: it is not made for that.

The post Convert VMs with StarWind V2V converter appeared first on Tech-Coffee.

]]>
https://www.tech-coffee.net/convert-vms-with-starwind-v2v-converter/feed/ 2 6409
Storage Spaces Direct and deduplication in Windows Server 2019 https://www.tech-coffee.net/storage-spaces-direct-and-deduplication-in-windows-server-2019/ https://www.tech-coffee.net/storage-spaces-direct-and-deduplication-in-windows-server-2019/#respond Tue, 05 Jun 2018 14:58:14 +0000 https://www.tech-coffee.net/?p=6382
When Windows Server 2016 was released, data deduplication was not available for the ReFS file system. With Storage Spaces Direct, volumes should be formatted in ReFS to get the latest features (accelerated VHDX operations) and the best performance. So data deduplication was not available for Storage Spaces Direct. Data deduplication reduces storage usage by removing duplicated blocks and replacing them with metadata.

Since Windows Server, version 1709, data deduplication is supported on ReFS volumes. That means it will also be available in Windows Server 2019. I have updated my S2D lab to Windows Server 2019 to show you how easy it will be to enable deduplication on your S2D volumes.

Requirements

To implement data deduplication on S2D volume in Windows Server 2019 you need the following:

  • An up and running S2D cluster on Windows Server 2019
  • (Optional) Windows Admin Center: it will help to implement deduplication
  • Install the deduplication feature on each node: Install-WindowsFeature FS-Data-Deduplication

Enable deduplication

Storage Spaces Direct in Windows Server 2019 will be fully manageable from Windows Admin Center (WAC). That means that you can also enable deduplication and compression from WAC. Connect to your hyperconverged cluster from WAC and navigate to Volumes. Select the volume you want and enable Deduplication and compression.

WAC displays an information pop-up that explains what deduplication and compression are. Click on Start.

Select Hyper-V as deduplication mode and click on Enable deduplication.

Once it is activated, WAC should show you the percentage of space saved. Currently, this does not work in WAC 1804.

Get information by using PowerShell

To get deduplication information on an S2D node, open a PowerShell prompt. You can get the list of data deduplication commands by running Get-Command *Dedup*.

If you run Get-DedupStatus, you should get the following data deduplication summary. As you can see in the following screenshot, I have saved some space on my CSV volume.

By running Get-DedupVolume, you can get the saving rate of the data deduplication. In my lab, data deduplication helps me to save almost 50% of storage space. Not bad.

Conclusion

Data deduplication on S2D has been awaited by many customers. With Windows Server 2019, the feature will be available. Currently, when you deploy 3-way mirroring for VMs, only 33% of the raw storage is usable. With data deduplication, we can expect 50%. Thanks to this feature, the average cost of an S2D solution will be reduced.
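To see how the saving rate translates into usable capacity, here is a minimal Python sketch. It rests on a simplified model that is my assumption, not something reported by the dedup cmdlets: the saving rate applies uniformly to all data, and metadata overhead and reserved capacity are ignored. The function name is hypothetical.

```python
def effective_efficiency(copies, dedup_saving_rate):
    """Logical data that can be stored per unit of raw capacity.

    copies: number of data copies kept by the mirror (3 for 3-way mirroring).
    dedup_saving_rate: fraction of logical data removed by deduplication,
    between 0 (no saving) and just under 1.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if not 0 <= dedup_saving_rate < 1:
        raise ValueError("dedup_saving_rate must be in [0, 1)")
    mirror_efficiency = 1 / copies
    # Deduplicated data occupies (1 - rate) of its logical size on disk.
    return mirror_efficiency / (1 - dedup_saving_rate)


for rate in (0.0, 1 / 3, 0.5):
    print(f"saving rate {rate:.0%}: {effective_efficiency(3, rate):.0%} of raw capacity")
```

Under this model, a saving rate of about one third is enough to reach 50% of the raw capacity with 3-way mirroring, and a saving rate close to 50% would give roughly two thirds.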

The post Storage Spaces Direct and deduplication in Windows Server 2019 appeared first on Tech-Coffee.

]]>
https://www.tech-coffee.net/storage-spaces-direct-and-deduplication-in-windows-server-2019/feed/ 0 6382
Migrate VMs from VMware to Nutanix AHV with Nutanix Xtract https://www.tech-coffee.net/migrate-vms-from-vmware-to-nutanix-ahv-with-nutanix-xtract/ https://www.tech-coffee.net/migrate-vms-from-vmware-to-nutanix-ahv-with-nutanix-xtract/#comments Fri, 18 May 2018 08:57:48 +0000 https://www.tech-coffee.net/?p=6361
Nutanix AHV is a custom KVM hypervisor integrated into the Nutanix ecosystem, including Prism. It is an enterprise-class hypervisor and an alternative to VMware ESXi or Microsoft Hyper-V when deploying Nutanix. Nutanix AHV is fully integrated with Nutanix Prism and there is no other GUI to manage this hypervisor. To ease the migration from VMware to the Nutanix hypervisor, Nutanix has released a web appliance called Nutanix Xtract. In this topic, we’ll see how to deploy the appliance and how to migrate virtual machines from VMware vSphere to Nutanix AHV.

Requirements

The VMs with the following configurations are not supported by Nutanix Xtract.

  • Guest OSes not supported by AHV (see Supported Guest VM Types for AHV in the Nutanix Support Portal)
  • VM names with non-English characters
  • Custom vCenter ports
  • Selecting individual ESXi hosts as source of VMs
  • PCIE pass-through (only certain devices)
  • Independent disks
  • Physical RDM based disks
  • VMs with multi-writer disks attached
  • VMs with 2 GB sparse disk attached
  • VMs with SCSI controllers with a SCSI bus sharing attached

The following operating systems are fully supported:

  • Windows 2016 Standard, 2016 Datacenter
  • Windows 7, 8, 8.1, 10
  • Windows Server 2008 R2, 2012, 2012 R2, 2016
  • CentOS 6.4, 6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3
  • Ubuntu 12.04.5, 14.04.x, 16.04.x, 16.10, Server, Desktop (32-bit and 64-bit)
  • FreeBSD 9.3, 10.0, 10.1, 10.2, 10.3, 11.0
  • SUSE Linux Enterprise Server 11 SP3 / SP4
  • SUSE Linux Enterprise Server 12
  • Oracle Linux 6.x, 7.x
  • RHEL 6.4, 6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3

The following operating systems are partially supported:

  • Windows 32-bit operating systems
  • Windows with UAC enabled
  • RHEL 4.0, 5.x
  • CentOS Linux 4.0, 5.x
  • Ubuntu 12.x or lower
  • VMs using UEFI
  • VMs requiring PCI or IDE bus

The following configurations are required by Nutanix Xtract:

  • Supported browsers: Google Chrome
  • VMware Tools must be installed and up to date on the guest VMs for migration
  • Virtual hardware version running on a VM must be 7.0 minimum.
  • Source VMs must support Changed Block Tracking (CBT). See https://kb.vmware.com/kb/1020128
  • CBT-based snapshots are supported for certain VMs.
  • Disks must be either sparse or flat format and must have a minimum version of 2.
  • ESXi version must be 5.5 minimum.
  • Hosts must not be in maintenance mode.
  • vCenter reachable from Xtract appliance on Port TCP 443.
  • ESXi hosts reachable from Xtract appliance on Ports TCP 443 and TCP 902.
  • Every VM must have a UUID.
  • ESXi hosts must have complete configuration details of the VMs.
  • Complete VM configuration details in ESXi.
  • VMs must have multiple compatible snapshots.
  • Allow port 2049 and port 111 between the Xtract for VM network and the AHV cluster network (CVMs).
  • Accounts used for performing in-guest operations require Login as Batch Job rights in the local security policy on Windows or within the group policy, see https://technet.microsoft.com/en-us/library/cc957131.aspx. Administrator users do not have sufficient rights.
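The network requirements in this list can be verified before starting a migration. Here is a minimal Python sketch, under several assumptions: the script runs from a machine on the same network as the Xtract appliance, ports 111 and 2049 are tested over TCP only (the list above does not state the protocol), and the role names and function names are hypothetical.

```python
import socket

# Ports taken from the requirement list above, grouped by target role.
REQUIRED_PORTS = {
    "vcenter": [443],
    "esxi": [443, 902],
    "cvm": [111, 2049],
}


def is_tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_targets(targets, timeout=3.0):
    """targets maps a role ("vcenter", "esxi", "cvm") to a list of host names.

    Returns a dict keyed by (host, port) with the reachability result.
    """
    results = {}
    for role, hosts in targets.items():
        for host in hosts:
            for port in REQUIRED_PORTS[role]:
                results[(host, port)] = is_tcp_port_open(host, port, timeout)
    return results
```

For example, check_targets({"vcenter": ["vcenter.mydomain.local"], "esxi": ["esxi01.mydomain.local"]}) returns one entry per host and port, where the host names are placeholders to replace with your own.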

Before a migration, VMware Tools must be running and the snapshots must be deleted.
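These checks can be turned into a small pre-flight routine. The following Python sketch is only an illustration: the dictionary describing the VM is a hypothetical data model that you would fill from your own inventory, not an Xtract or vSphere API, and it covers only a few of the requirements listed above.

```python
MIN_HARDWARE_VERSION = 7      # virtual hardware version must be 7 minimum
MIN_ESXI_VERSION = (5, 5)     # ESXi version must be 5.5 minimum


def preflight_problems(vm):
    """Return a list of problems found for one VM; an empty list means ready.

    vm is a plain dict (hypothetical model) with the keys used below.
    """
    problems = []
    if not vm.get("tools_running", False):
        problems.append("VMware Tools is not running")
    if vm.get("snapshot_count", 0) > 0:
        problems.append("snapshots must be deleted before the migration")
    if vm.get("hardware_version", 0) < MIN_HARDWARE_VERSION:
        problems.append("virtual hardware version is below 7")
    # Compare only major.minor, e.g. "6.5.0" becomes (6, 5).
    esxi = tuple(int(part) for part in vm.get("esxi_version", "0.0").split(".")[:2])
    if esxi < MIN_ESXI_VERSION:
        problems.append("ESXi host is older than 5.5")
    if vm.get("host_in_maintenance", False):
        problems.append("ESXi host is in maintenance mode")
    return problems


ready_vm = {
    "tools_running": True,
    "snapshot_count": 0,
    "hardware_version": 11,
    "esxi_version": "6.5.0",
    "host_in_maintenance": False,
}
print(preflight_problems(ready_vm))
```

Running the routine over every VM of a migration plan gives a quick list of what to fix before the data copy starts.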

Deploy Nutanix Xtract

First of all, download the appliance image from the Nutanix portal. Then log on to Nutanix Prism and navigate to Home | VM.

Next, click on the gear icon and select Image Configuration.

Then click on Create Image and specify a name and an annotation. Choose the Disk image type and upload the qcow2 file from the Nutanix Xtract archive that you have previously downloaded.

The image upload takes a moment; you can check its progress in the task menu.

Then create a VM with the following settings:

  • 2 vCPUs
  • 2 Cores per vCPU
  • 4GB of Memory

If you scroll down to Disks setting, you’ll get this message. Click on Add New Disk.

Configure the disk as follows and select the Nutanix Xtract image you’ve just uploaded. Then click on Add.

In the Network Adapters section, specify the VLAN to which Nutanix Xtract will be connected.

To finish, enable Custom Script and upload the script called xtract-vm-cloudinit-script, located in the Nutanix Xtract archive that you have previously downloaded from the Nutanix portal.

Then start the VM, connect to the console and wait a while. From my side, the appliance was ready after 30 minutes.

Configure the appliance

When the appliance is ready, you can enter admin credentials (admin / nutanix/4u).

When you are logged in as admin, run the rs command and type the admin password again.

Edit the file /etc/sysconfig/network-scripts/ifcfg-eth0 and specify a static IP configuration.

Then restart the network service by running service network restart.

Next, edit the file /etc/resolv.conf and specify your DNS suffix and DNS server(s):

  • search mydomain.local
  • nameserver 10.10.201.2
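As an illustration of what these two files can contain, here is a minimal Python sketch that renders them. Several things are assumptions on my part: the keys are the usual ones for RHEL/CentOS-style network-scripts and may differ from what the appliance ships with, the IP address, netmask and gateway are placeholder values, and the function names are hypothetical. Only the DNS suffix and the DNS server come from the example above.

```python
import ipaddress


def render_ifcfg(device, ip, netmask, gateway):
    """Render a static ifcfg file body; raises ValueError on bad addresses."""
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError("gateway is not in the same subnet as the IP address")
    lines = [
        f"DEVICE={device}",
        "BOOTPROTO=none",   # static addressing, no DHCP
        "ONBOOT=yes",       # bring the interface up at boot
        f"IPADDR={ip}",
        f"NETMASK={netmask}",
        f"GATEWAY={gateway}",
    ]
    return "\n".join(lines) + "\n"


def render_resolv_conf(search_domain, nameservers):
    """Render /etc/resolv.conf with one search line and one line per server."""
    lines = [f"search {search_domain}"]
    lines += [f"nameserver {server}" for server in nameservers]
    return "\n".join(lines) + "\n"


# Placeholder addresses: replace them with values from your own network.
print(render_ifcfg("eth0", "10.10.201.50", "255.255.255.0", "10.10.201.1"))
print(render_resolv_conf("mydomain.local", ["10.10.201.2"]))
```

The subnet check on the gateway is there to catch the most common typo before the network service is restarted and the console becomes the only way back in.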

Restart the Nutanix Xtract appliance. Now connect through HTTPS to the appliance by using the static IP you have set previously. Accept the license agreement and click on Continue.

Next specify a password for the nutanix account.

Now you can log on to Nutanix Xtract with the nutanix account.

Configure Nutanix Xtract

Now that you are connected to the appliance, you have to add the source and the target environment. First click on Add Source environment.

Then enter the source name, the vCenter Server address and admin credentials.

Next click on Add target environment and specify your Nutanix Prism.

Now you have the source and the target environments. You are ready to migrate VMware VMs to Nutanix AHV.

Migrate a VMware VM to Nutanix AHV

Now to migrate VMs, we have to create a migration plan. To create it, click on Create a Migration Plan.

Provide a name for the migration plan and click on OK.

Next select the target environment and the target container where you want to store VMs.

Next, you can look for the VMs you want to migrate by using the search field. Then click on the “+” button to add VMs to the migration plan.

The guest credentials are used to run guest operations on the source VMs, such as installing the VirtIO tools. I recommend that you do not bypass Guest Operations on Source VMs, so that VirtIO is installed automatically. Many of the VMs I migrated without these operations didn’t boot. You can also map the source network to the target network.

Next, check the migration plan summary and click on Save And Start to run the migration immediately. The data will be copied but the cutover will be done manually later.

Then you can monitor the migration progress.

When you are ready to cut over the VMs, click on Cutover. The source VMs will be shut down, a final incremental data copy is executed and the target VMs will be started.

When the copy is finished, the migration status should be completed. Congratulations, you have migrated VMware VMs to Nutanix AHV easily :).

Conclusion

Nutanix provides a powerful tool to migrate VMware VMs to Nutanix AHV. Everything is included to plan the migration and you can schedule the failover. I had some issues with Microsoft UAC but overall the tool works great.

The post Migrate VMs from VMware to Nutanix AHV with Nutanix Xtract appeared first on Tech-Coffee.

]]>
https://www.tech-coffee.net/migrate-vms-from-vmware-to-nutanix-ahv-with-nutanix-xtract/feed/ 4 6361