# Configure Dell S4048 switches for Storage Spaces Direct

Tech-Coffee (https://www.tech-coffee.net), Thu, 26 Apr 2018

When we deploy Storage Spaces Direct (S2D), either hyperconverged or disaggregated, we have to configure the networking part. We usually work with Dell hardware to deploy Storage Spaces Direct, and one of the switches supported by the Dell reference architectures is the Dell S4048 (Force10). In this topic, we will see how to configure this switch from scratch.

This topic has been co-written with Frederic Stefani, Dell solution architect.

## Stack or not

Usually, customers know the stack feature, which is common to all network vendors such as Cisco, Dell, HP and so on. This feature lets you join several identical switches into a single configuration managed by a master switch. Because all switches share the same configuration, network administrators see them as a single switch: they connect to the master switch and edit the configuration of every member of the stack from there.

Even if stacking looks attractive on paper, it has a major drawback, especially with a storage solution such as S2D. With an S4048 stack, when you run an update, all switches reload at the same time. Because S2D relies heavily on the network, your storage solution will go down. This is why the Dell reference architecture for S2D recommends deploying VLT (Virtual Link Trunking) instead.

With stacking you have a single control plane (you configure all switches from a single switch) and a single data plane in a loop-free topology. In a VLT configuration, you still have a single data plane in a loop-free topology, but several control planes, which allows you to reboot the switches one by one.

For this reason, the VLT (or MLAG) technology is the preferred way for Storage Spaces Direct.

## S4048 overview

An S4048 switch has 48x 10Gb/s SFP+ ports, 6x 40Gb/s QSFP+ ports, a management port (1Gb/s) and a serial port. The management and serial ports are located on the back. In the below diagram, there are three kinds of connections:

• Connection for S2D (in this example from port 1 to 16, but you can use ports up to 48)
• VLTi connection
• Core connection: the uplink to connect to core switches

In the below architecture schema, you can find both S4048 switches interconnected through the VLTi ports, and several S2D nodes (hyperconverged or disaggregated, it doesn't matter) connected to ports 1 to 16. In this topic, we will configure the switches according to this design.

## Switches initial configuration

When you start the switch for the first time, you have to configure the initial settings such as the switch name, IP address and so on. Plug a serial cable from the switch to your computer and connect with a terminal emulator (such as PuTTY) using the following settings:

• Baud Rate: 115200
• No Parity
• 8 data bits
• 1 stop bit
• No flow control

Then you can run the following configuration:

enable
configure

# Configure the hostname
hostname SwitchName-01

# Set an IP address on the management port, to connect to the switch through IP
# (example address; use your own management subnet)
interface ManagementEthernet 1/1
ip address 192.168.1.1/24
no shutdown
exit

# Set the default gateway
ip route 0.0.0.0/0 192.168.1.254

# Enable SSH
ip ssh server enable

# Create a user and a password to connect to the switch
# (replace the example credentials with your own)
username admin password MyP@ssw0rd privilege 15

# Disable Telnet through IP
no ip telnet server enable

# We leave Rapid Spanning Tree Protocol enabled.
protocol spanning-tree rstp
no disable
exit

exit

# Write the configuration in memory
copy running-config startup-config


After this configuration is applied, you can connect to the switch through SSH. Apply the same configuration to the other switch (except for the name and IP address).
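Before continuing, you can quickly check the applied settings from your SSH session. The commands below are a sketch; the exact output depends on the FTOS version:

# Display the running configuration
show running-config

# Check that the management interface is up with the expected IP address
show interfaces ManagementEthernet 1/1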

## Configure switches for RDMA (RoCEv2)

N.B.: For this part, we assume that you know how RoCE v2 works, especially DCB, PFC and ETS.

Because we implement the switches for S2D, we have to configure them for RDMA (RDMA over Converged Ethernet v2). Don't forget that with RoCE v2, you have to configure DCB and PFC end to end (on both the server side and the switch side). In this configuration, we assume that you use priority ID 3 for SMB traffic.

# By default the queue value is 0 for all dot1p (QoS) traffic. We enable
# the service-class dynamic dot1p command globally to change this behavior.
service-class dynamic dot1p

# Enable Data Center Bridging. This makes it possible to handle lossless and
# latency-sensitive traffic in a Priority Flow Control (PFC) queue.
dcb enable

# Provide a name to the DCB buffer threshold
dcb-buffer-threshold RDMA
priority 3 buffer-size 100 pause-threshold 50 resume-offset 35
exit

# Create a DCB map to configure the PFC and ETS (Enhanced Transmission Selection) rules
dcb-map RDMA

# For priority group 0, we allocate 50% of the bandwidth and PFC is disabled
priority-group 0 bandwidth 50 pfc off

# For priority group 3, we allocate 50% of the bandwidth and PFC is enabled
priority-group 3 bandwidth 50 pfc on

# Priority group 3 contains traffic with dot1p priority 3.
priority-pgid 0 0 0 3 0 0 0 0

exit

exit
copy running-config startup-config


Repeat this configuration on the other switch.
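To check that DCB and PFC are actually applied, you can use the following show commands (a sketch; command availability and output vary with the FTOS version):

# Verify that DCB is globally enabled
show dcb

# Display the DCB map with the per-priority-group bandwidth and PFC settings
show qos dcb-map RDMA

# Check PFC state and counters on the interfaces
show interfaces pfc summary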

## VLT domain implementation

First of all, we have to create a port channel with two QSFP+ ports (ports 1/49 and 1/50):

enable
configure

# Configure port-channel 100 (make sure it is not already used)
interface Port-channel 100

# Provide a description
description VLTi

# Do not apply an IP address to this port channel
no ip address

# Set the maximum MTU to 9216
mtu 9216

# Add ports 1/49 and 1/50
channel-member fortyGigE 1/49,1/50

# Enable the port channel
no shutdown

exit

exit
copy running-config startup-config


Repeat this configuration on the second switch. Then we have to create the VLT domain and use this port-channel. Below is the configuration of the first switch:

# Configure the VLT domain 1
vlt domain 1

# Specify the port-channel which will be used as the VLT interconnect
peer-link port-channel 100

# Specify the IP address of the other switch
back-up destination 192.168.1.2

# Specify the priority of each switch
primary-priority 1

# Assign an unused MAC address to the VLT domain
# (example value; use a locally administered MAC that is unused on your network)
system-mac mac-address 02:01:e8:00:00:01

# Give an ID to each switch
unit-id 0

# Wait 10s before the saved configuration is applied after the switch reloads or the peer link is restored
delay-restore 10

exit

exit
copy running-config startup-config


On the second switch, the configuration looks like this:

vlt domain 1
peer-link port-channel 100
back-up destination 192.168.1.1
primary-priority 2
system-mac mac-address 02:01:e8:00:00:01
unit-id 1
delay-restore 10

exit

exit
copy running-config startup-config


Now the VLT is working. You don't have to specify VLAN IDs on this link: the VLT interconnect carries tagged and untagged traffic by itself.
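You can verify the VLT status with the commands below (a sketch; the labels in the output differ between FTOS versions):

# Summary of the VLT domain: role, ICL status and peer status
show vlt brief

# Detailed status of the VLT port-channels
show vlt detail

# Status of the backup destination link between the two switches
show vlt backup-link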

## S2D port configuration

To finish the switch configuration, we have to configure the ports and VLANs for the S2D nodes:

enable
configure
interface range tengigabitethernet 1/1-1/16

# No IP address assigned to these ports
no ip address

# Set the maximum MTU to 9216
mtu 9216

# Enable the management of untagged and tagged traffic
portmode hybrid

# Enable Layer 2 switching; the port is added to the default VLAN to carry untagged traffic
switchport

# Configure the port as an edge port
spanning-tree 0 portfast

# Enable BPDU guard on these ports
spanning-tree rstp edge-port bpduguard

# Apply the DCB buffer-threshold policy to these ports
dcb-policy buffer-threshold RDMA

# Apply the DCB map to these ports
dcb-map RDMA

# Enable the ports
no shutdown

exit

exit
copy running-config startup-config


You can copy this configuration to the other switch. Now only the VLANs are missing. To create the VLANs and assign them to ports, you can run the following configuration:

interface vlan 10
description "Management"
name "VLAN-10"
untagged tengigabitethernet 1/1-1/16
exit

interface vlan 20
description "SMB"
name "VLAN-20"
tagged tengigabitethernet 1/1-1/16
exit

[etc.]
exit
copy running-config startup-config


Once you have finished, copy this configuration to the second switch.
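Finally, you can check the VLAN membership of the S2D ports (a sketch; in the output, U marks untagged and T marks tagged member ports):

# List all VLANs with their tagged and untagged member ports
show vlan

# Check the switchport configuration of a node-facing port
show interfaces switchport tengigabitethernet 1/1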

# Use Honolulu to manage your Microsoft hyperconverged cluster

Tech-Coffee, Wed, 21 Feb 2018
A few months ago, I wrote a topic about the next-gen Microsoft management tool called Project Honolulu. Honolulu provides management for standalone Windows Server, failover clusters and hyperconverged clusters. Currently, hyperconverged management works only on Windows Server Semi-Annual Channel (SAC) versions (I cross my fingers for Honolulu support on Windows Server LTSC). I have upgraded my lab to the latest technical preview of Windows Server SAC to show you how to use Honolulu to manage your Microsoft hyperconverged cluster.

As part of my job, I have deployed dozens of Microsoft hyperconverged clusters and, to be honest, the main disadvantage of this solution is the management. The Failover Clustering console is archaic and you have to use PowerShell to manage the infrastructure. Even if the Microsoft solution provides high-end performance and good reliability, the day-to-day management is tricky.

Thanks to Project Honolulu, we now have a modern management tool which can compete with other solutions on the market. Currently Honolulu is still a preview version and some features are not yet available, but it is going in the right direction. Moreover, Project Honolulu is free and can be installed on your laptop or on a dedicated server, as you wish!

## Honolulu dashboard for hyperconverged cluster

Once you have added the cluster connection to Honolulu, you get a new line with the type Hyper-Converged Cluster. By clicking on it, you can access a dashboard.

This dashboard provides a lot of useful information such as the latest alerts raised by the Health Service, the overall performance of the cluster, the resource usage, and information about servers, virtual machines, volumes and drives. You can see that currently the cluster performance charts indicate No data available: this is because the preview of Windows Server that I have installed doesn't provide this information yet.

From my point of view, this dashboard is pretty clear and provides global information about the cluster. At a glance, you get the overall health of the cluster.

N.B.: the memory usage indicates -35.6% because of a custom motherboard which does not report the memory installed on the node.

## Manage Drives

By clicking on Drives, you get information about the raw storage of your cluster and your storage devices. You get the total number of drives (I know I don't follow the requirements because I have 5 drives on one node and 4 on another, but it is a lab). Honolulu also provides the drive health and the raw capacity of the cluster.

By clicking on Inventory, you get detailed information about your drives such as the model, the size, the type, the storage usage and so on. At a glance, you know if you have to run Optimize-StoragePool.

By clicking on a drive, you get further information about it. Moreover, you can act on it: for example, you can turn the light on, retire the disk or update the firmware. For each drive you get performance and capacity charts.

## Manage volumes

By clicking on Volumes, you can get information about your Cluster Shared Volumes. At a glance you get the health, the overall performance and the number of volumes.

In the inventory, you get further information about the volumes such as the status, the file system, the resiliency, the size and the storage usage. You can also create a volume.

By clicking on create a new volume, you get this:

By clicking on a volume, you get more information about it and you can take actions such as open, resize, offline and delete.

## Manage virtual machines

From Honolulu, you can also manage virtual machines. When you click on Virtual Machines | Inventory, you get the following information. You can also manage the VMs (start, stop, turn off, create a new one, etc.). All chart values are updated in real time.

## vSwitches management

From the Hyper-Converged Cluster pane, you have information about virtual switches. You can create a new one, or delete, rename and change the settings of an existing one.

## Node management

Honolulu also provides information about your nodes in the Servers pane. At a glance you get the overall health of all your nodes and their resource usage.

If you click on a node, you can pause it for updates or hardware maintenance. You also have detailed information such as performance charts, drives connected to the node and so on.

## Conclusion

Project Honolulu is the future of Windows Server in terms of management. This product provides great information about Windows Server, failover clusters and hyperconverged clusters in a web-based form. From my point of view, Honolulu eases the management of the Microsoft hyperconverged solution and can help administrators. Some features are missing, but Microsoft listens to the community. Honolulu is modular because it is based on extensions; without a doubt, Microsoft will add features regularly. I just cross my fingers for Honolulu support on Windows Server 2016 (released in October 2016), but I am optimistic.

# Re-image Nutanix nodes to VMware ESXi 6.5u1

Tech-Coffee, Thu, 07 Dec 2017
This week I deployed a Nutanix cluster based on VMware ESXi 6.5u1, and I wanted to share with you how to re-image the Nutanix nodes to VMware ESXi. Usually Nutanix blocks are shipped with nodes imaged with AHV, so you have to re-image the nodes with the desired hypervisor (Hyper-V, ESXi or KVM). In this topic we'll see how to re-image Nutanix nodes to VMware ESXi.

The following procedure applies to Nutanix blocks. If you have a branded block (such as Dell), the procedure may differ, especially regarding which network adapter to plug.

I'm sorry, the screenshots are blurred. I promise I'll do better next time.

## Requirements

To re-image Nutanix nodes, you need the Nutanix Foundation VM, which you can download from the Nutanix portal. This is a VM provided by Nutanix which contains the tools to re-image nodes. This VM should run on your laptop, so you need a hypervisor which can run it, such as VMware Workstation or VirtualBox.

I heavily recommend bringing a simple switch (non-manageable, 8 ports) to plug your laptop and the Nutanix nodes into. This way, nothing will disturb the deployment. The node discovery is based on IPv6 link-local addresses (IPv6LL).

You also need the latest AOS release (5.1.3 at the time of writing this topic) and the VMware ESXi 6.5u1 ISO. To summarize, you need:

• A laptop with VMware Workstation/Fusion or VirtualBox
• A non-manageable 8-port switch
• The latest AOS release and the VMware ESXi 6.5u1 ISO

Once you have downloaded the Foundation VM, you can unzip it and add it to the hypervisor on your laptop. The VM must be connected to a bridged network; be sure about this configuration. The default credentials are nutanix / nutanix/4u.

The Nutanix nodes must be connected to the switch where your laptop is plugged. From my side, I plugged the lower integrated network adapter port (Nutanix-branded block). This configuration can change depending on the block brand. For example, for Lenovo blocks, I read that both the 1GbE and 10GbE network adapters must be plugged. For others, you also need to plug the IPMI. Please read the documentation of your branded block for the right connections for re-imaging.

Each Nutanix node requires four (three mandatory) IP addresses:

• One for IPMI
• One for Hypervisor (management)
• One for Controller VM (CVM)
• One for vMotion (optional but recommended)

It is recommended that the CVM network and the hypervisor network be on the same subnet. An additional IP is required for the cluster address.

## Re-image Nutanix Node

Once the Foundation VM is started and you are authenticated, you should get the following desktop. Run the script set_foundation_ip_address.

Choose Device configuration. Then set an IP address on the hypervisor network, so that the VM can reach the cluster and the nodes.

Next, I use a tool like PSCP to copy the AOS and VMware images inside the VM. To copy the files, run the following command:

pscp.exe c:\path\to\my\file nutanix@<foundation VM IP address>:/tmp

Then run the Foundation applet to open the following GUI. The interface shows you the discovered nodes. In this example, only two nodes are discovered instead of three; I don't know why. I chose to click on a node and launch Foundation.

As if by magic, all three nodes are discovered in Foundation. If your nodes are not discovered, you can specify the IPMI MAC addresses to discover them manually. A label with the IPMI MAC address is stuck on the back of each node.

On the next screen, specify the cluster information: cluster name and IP address, NTP server, DNS server and time zone. I activated the checkbox Configure IPMI IP to configure the IPMI of each node. Next, I chose a netmask and a gateway for the CVM and hypervisor networks. Finally, I changed the CVM memory to 32GB, which is the recommended value when you enable deduplication.

Next for each node, specify the node name and an IP for each network.

In this example, I choose Single Hypervisor because I want to deploy the same hypervisor in each node.

On the next screen, click on Manage in the AOS section and upload the AOS image that you copied previously to /tmp. Do the same for ESXi.

After upload is finished, you should have something like that:

PS: You can update the hypervisor whitelist to install the latest VMware ESXi version. For that, connect to the Nutanix portal and download the latest ISO_WhiteList.json. Then use PSCP to copy the file into the Foundation VM. Next, click on View Whitelist and update the whitelist with the file that you have just copied.

Then the re-imaging process runs. In my case, it took 2 hours to finish.

Once the re-imaging process is finished, the cluster creation starts.

If the cluster is well created, you should get the following screen:

Connect to Prism from a web browser (http://<Cluster IP>) and complete the requested information. You also have to change the default password.

Next choose to activate or not Pulse.

After the wizard, you should get the Prism dashboard with the cluster information. Now you can configure Prism and deploy vCenter. Have fun!

# Get Storage Spaces Direct insights from StarWind Manager

Tech-Coffee, Wed, 27 Sep 2017
Earlier in the week, I published a blog post about Project Honolulu and how, in the future, this tool can ease Windows Server management. Today I introduce another management tool for Storage Spaces Direct (hyperconverged or disaggregated). This tool is called StarWind Manager and it is developed by … StarWind.

StarWind Manager is currently a preview version and, for the moment, it is free. You can download it from this link. This tool provides real-time metrics such as bandwidth, IOPS, CPU usage and so on. You can also get insights about Storage Spaces Direct, such as the physical disks, the Cluster Shared Volumes, the running jobs, etc. In this topic we'll see how to deploy StarWind Manager and what kind of information you can retrieve.

## StarWind Manager roles

StarWind Manager comes with two roles: the StarWind Manager Core and the StarWind Manager Agent. The agent must be deployed on the Storage Spaces Direct (S2D) cluster nodes, while the core can be deployed in a VM. The core role provides a web interface to display information about your cluster, which it collects from the agents. Currently StarWind Manager only lets you add individual nodes; you can't add an entire cluster with a single click.

## Deploy StarWind Manager Core role

After you have downloaded StarWind Manager, copy the executable to your VM. I created a VM with 2 vCPUs and 4GB of dynamic memory for this. Then run the executable to start the setup wizard. The install process is quick because little information is asked for.

Select to install StarWind Manager Core and do not install StarWind Manager agent.

That's all: once the wizard completes, StarWind Manager Core is installed and ready to use.

## Deploy StarWind Manager Agent role

To install the StarWind Manager Agent on your S2D cluster nodes, copy the installer to the servers and run the wizard. It works on Windows Server 2016 Core: I have deployed the agents on Core edition in my lab. In the wizard, select the StarWind Manager Agent and do not install the StarWind Manager Core.

Repeat the agent installation for each S2D cluster node you have.

## Connect to StarWind Manager

To connect to StarWind Manager, open a browser and go to http://<VM hostname>:8100/client. The default credentials are root / Starwind.

For the moment, StarWind Manager provides only the ability to add S2D cluster nodes. To add nodes, click on … Add New Node.

After you've added your nodes, you can retrieve information about them on the dashboard pane. You get the status, the IP, the name, the uptime, and information about software and hardware.

On the Performance tab, you can retrieve real-time metrics about your node such as CPU utilization, memory usage, IOPS and bandwidth.

On the Storage Spaces Direct tab you get information about S2D. This pane provides a cluster overview: the nodes in the cluster, the storage capacity, the space allocation and the health.

In the same tab, information about Storage Pools and virtual volumes are provided.

You can also get information about physical disks and running jobs.

## Conclusion

I'm more than happy that a lot of GUIs are under development for Storage Spaces Direct. The main disadvantage of the Microsoft solution compared to VMware vSAN or Nutanix is the user experience. But currently Microsoft is working on Honolulu and StarWind is working on this product. Even if both products are under development, they provide clear information about S2D. Now I hope both products will soon provide easy access to complex day-to-day S2D operations, such as a physical disk replacement (place the disk in retired mode, enable the LED on the front of the disk, then change the disk and disable the LED). From my point of view, this kind of product can heavily help the adoption of Storage Spaces Direct, in the hyperconverged or disaggregated model.

# Next gen Microsoft management tool: Honolulu

Tech-Coffee, Mon, 25 Sep 2017
Since the beginning of the year, Microsoft has been working on a new management tool based on modern web technologies such as HTML5, Angular and so on. This tool is called Honolulu. Honolulu is a user-friendly web interface that enables you to manage Windows Server, failover clusters and hyperconverged clusters. Currently, to manage a hyperconverged cluster, Honolulu requires a Semi-Annual Channel Windows Server release.

Honolulu is currently a public preview release, which means that the product is under construction :). Honolulu is built in a modular way where you can add or remove extensions: each management feature is included in an extension. Microsoft expects vendors to develop third-party extensions later. To be honest with you, this is the set of tools I have been waiting for for a while. Microsoft was late in management tools compared to other companies such as VMware. I hope that Honolulu will close the gap with VMware vCenter and Nutanix Prism.

Microsoft listens to customers and their feedback to improve this product, so you can download the product here and report feedback in this UserVoice.

In this topic, we will see an overview of Honolulu. I'll dedicate a topic to Honolulu and the Microsoft hyperconverged solution, because Honolulu requires the Windows Server RS3 release (in the Semi-Annual Channel) to work with hyperconverged clusters and I have not yet upgraded my lab.

## Getting started with Honolulu

In the below screenshot, you can see the Honolulu home page. You get all your connections (and their type) and you can add more of them.

By clicking on the arrow next to Project Honolulu, you can filter the connection type: Server Manager, Failover Cluster Manager and Hyper-Converged Cluster Manager.

By clicking on the wheel (top right), you can access the extension manager, which lists the installed extensions. For example, there are extensions for firewall management, Hyper-V, failover clustering and so on. You can remove the extensions you don't want.

## Server Manager

As you have seen before, you can manage a single server from Honolulu. I will not show you all the management tools, just an overview of Honolulu. After adding and connecting to a server, you get the following dashboard. In this dashboard you can retrieve real-time metrics (CPU, memory and network) and information; you can restart or shut down the system, or edit RDP access and environment variables. For the moment you can't resize columns and tables; I think Microsoft will add this feature in the near future.

An interesting module is Events. In this pane, you get the same thing as the good old Event Viewer: you can retrieve all the events of your system and filter them. Maybe a checkbox enabling real-time events could be interesting :).

The Devices pane is also available. In a single view, you have all the hardware installed in the system. If Microsoft adds the ability to install drivers from there, Honolulu could replace DevCon for Core servers.

You can also browse the system and manage files and folders.

Another pane enables you to manage the network adapters, as you can see below. For the moment this pane is limited because it doesn't allow you to manage advanced features such as RDMA, RSS, VMMQ and so on.

You can also add or remove roles and features from Honolulu. It is really cool that you can manage this from a web interface.

If you use Hyper-V, you can manage VMs from Honolulu. The dashboard is also really nice because there are counters about the VMs and the last events.

Another super cool feature is the ability to manage updates from Honolulu. I hope Microsoft will add WSUS configuration to this pane, with some scheduling.

## Failover Cluster management

To add a failover cluster, create a new connection and specify the cluster name. Honolulu asks whether you also want to add the servers that are members of the cluster.

Once it is added, you can select it and you get this dashboard. You get the cluster core resource states and some information about the cluster, such as the number of roles, networks and disks.

By clicking on disks, you can get a list of Cluster Shared Volumes in the cluster and information about them.

If your cluster hosts Hyper-V VMs (not in a hyperconverged way), you can manage the VMs from there. You get the same pane as in the Honolulu server manager. The VMs and related metrics are shown and you can create or delete virtual machines. A limited set of options is currently available.

You can also see the vSwitches deployed on each node. It's a pity that Switch Embedded Teaming is not yet supported, but I think the support will be added later.

## Hyperconverged cluster management

As I said earlier, hyperconverged clusters are supported, but only on the Windows Server Semi-Annual Channel (for the moment). I'll dedicate a topic to Honolulu and hyperconverged clusters once I have upgraded my lab.

## Update Honolulu

When a Honolulu update is released, you are notified by an Update Available mention. Currently, the update process is not really user-friendly: when you click on Update Available, an executable is downloaded and you have to run the Honolulu installation again (specify the installation path, certificate thumbprint, etc.). I hope that in the future the update process will be a self-update.

When I downloaded the executable, I checked the package size and it is amazing: only 31MB.

## Conclusion

Finally, they did it! A true modern management tool. I have been trying this tool for Microsoft for 3 months and I can tell you that the developers work really quickly and do a great job. Features are added quickly and Microsoft listens to customers. I recommend you post the features you want in the UserVoice. The tool is currently not perfect and some features are missing, but Honolulu is still a preview release! Microsoft is going in the right direction with Honolulu and I hope this tool will be massively used. I also hope that Honolulu will help to install more Windows Server in Core edition, especially for Hyper-V and storage servers.

The post Next gen Microsoft management tool: Honolulu appeared first on Tech-Coffee.

]]>
https://www.tech-coffee.net/next-gen-microsoft-management-tool-honolulu/feed/ 0 5791
Storage Spaces Direct: plan the storage for hyperconverged https://www.tech-coffee.net/storage-spaces-direct-plan-the-storage-for-hyperconverged/ https://www.tech-coffee.net/storage-spaces-direct-plan-the-storage-for-hyperconverged/#comments Mon, 17 Jul 2017 09:49:52 +0000 https://www.tech-coffee.net/?p=5625 When a customer calls me to design or validate the hardware configuration for hyperconverged infrastructure with Storage Spaces Direct, there is often a misunderstanding about the remaining useable capacity, the required cache capacity and ratio, and the different mode of resilience. With this topic, I’ll try to help you to plan the storage for hyperconverged ...

The post Storage Spaces Direct: plan the storage for hyperconverged appeared first on Tech-Coffee.

]]>
When a customer calls me to design or validate a hardware configuration for a hyperconverged infrastructure with Storage Spaces Direct, there is often a misunderstanding about the remaining usable capacity, the required cache capacity and ratio, and the different resilience modes. With this topic, I'll try to help you plan the storage for hyperconverged deployments and clarify some points.

## Hardware consideration

Before sizing the storage devices, you should be aware of some limitations. First, you can't exceed 26 storage devices per node: Windows Server 2016 can't handle more than 26 storage devices, so if you deploy your operating system on two storage devices, 24 remain available for Storage Spaces Direct. However, storage devices keep getting bigger, so 24 storage devices per node is enough (I have never seen a Storage Spaces Direct deployment with more than 16 storage devices per node).

Secondly, you have to pay attention to your HBA (Host Bus Adapter). With Storage Spaces Direct, the operating system is in charge of resilience and cache. This is a software-defined solution after all, so there is no reason for the HBA to manage RAID and cache. In the Storage Spaces Direct case, the HBA is mainly used to add more SAS ports. So, don't buy an HBA with RAID and cache, because you will not use these features: Storage Spaces Direct storage devices are configured in JBOD mode. If you buy Lenovo servers, you can choose the N2215 HBA; if you choose Dell, the HBA330. The HBA must provide the following features:

• Simple pass-through SAS HBA for both SAS and SATA drives
• SCSI Enclosure Services (SES) for SAS and SATA drives
• Any direct-attached storage enclosures must present Unique ID
• Not Supported: RAID HBA controllers or SAN (Fibre Channel, iSCSI, FCoE) devices

Thirdly, there are requirements regarding the storage devices themselves. Only NVMe, SAS and SATA devices are supported. If you have old SCSI storage devices, you can drop them :). These storage devices must be physically attached to only one server (local-attached devices). If you choose to implement SSDs, they must be enterprise-grade with power-loss protection. So please, don't build a hyperconverged solution with Samsung 850 Pro drives. If you plan to install cache storage devices, these SSDs must have at least 3 DWPD (Drive Writes Per Day), meaning the device can be entirely written at least three times per day.

To finish, you have to respect a minimum number of storage devices. You must implement at least 4 capacity storage devices per node. If you plan to install cache storage devices, you have to deploy at least two of them per node. Each node in the cluster must have the same kinds of storage devices: if you deploy NVMe in one server, all servers must have NVMe. As much as possible, keep the same configuration across all nodes. The table below gives the minimum number of storage devices per node for each configuration:

| Drive types present | Minimum number required |
|---|---|
| All NVMe (same model) | 4 NVMe |
| All SSD (same model) | 4 SSD |
| NVMe + SSD | 2 NVMe + 4 SSD |
| NVMe + HDD | 2 NVMe + 4 HDD |
| SSD + HDD | 2 SSD + 4 HDD |
| NVMe + SSD + HDD | 2 NVMe + 4 others |

## Cache ratio and capacity

The cache ratio and capacity are an important part of the design when you choose to deploy a cache mechanism. I have seen a lot of wrong designs because of the cache mechanism. The first thing to know is that the cache is not mandatory: as shown in the above table, you can implement an all-flash configuration without a cache mechanism. However, if you choose to deploy a solution based on HDDs, you must implement a cache mechanism. When the storage devices behind the cache are HDDs, the cache is set to read/write mode; otherwise, it is set to write-only mode.

The cache capacity must be at least 10% of the raw capacity: if each node has 10TB of raw capacity, you need at least 1TB of cache. Moreover, if you deploy a cache mechanism, you need at least two cache storage devices; this ensures the high availability of the cache. When Storage Spaces Direct is enabled, capacity devices are bound to cache devices in a round-robin manner. If a cache storage device fails, its capacity devices are bound to another cache storage device.

To finish, you must respect a ratio between the number of cache devices and capacity devices: the number of capacity devices must be a multiple of the number of cache devices. This ensures that each cache device serves the same number of capacity devices.
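To make these rules concrete, here is a small Python sketch that checks whether a node's cache/capacity layout respects them. The helper name is mine for illustration, not part of any Storage Spaces Direct API:

```python
def cache_layout_ok(cache_devices, capacity_devices):
    """Check the S2D cache rules described above:
    at least two cache devices per node, and the capacity
    device count must be a multiple of the cache device count."""
    if cache_devices < 2:
        return False
    return capacity_devices % cache_devices == 0

# 3 cache SSDs serving 9 capacity HDDs: a valid 1:3 ratio
print(cache_layout_ok(3, 9))   # True
# 2 cache SSDs with 5 capacity HDDs: no even binding possible
print(cache_layout_ok(2, 5))   # False
```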

## Reserved capacity

When you design the storage pool capacity and choose the number of storage devices, keep in mind that you need some unused capacity in the storage pool. This is the reserved capacity for the repair process: if a capacity device fails, the storage pool duplicates the blocks that were written on this device to restore the resilience mode, and this process requires free space. Microsoft recommends leaving empty the capacity of one drive per node, up to four drives.

For example, with 6 nodes and 4x 4TB HDDs per node, I leave 4x 4TB empty (one per node, up to four drives) in the storage pool as reserved capacity.
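This rule is simple enough to express as a one-line sketch (again with a hypothetical helper name):

```python
def reserved_capacity_tb(nodes, device_tb):
    """Reserve the capacity of one drive per node, up to four
    drives total, as recommended for the repair process."""
    return min(nodes, 4) * device_tb

# 6 nodes with 4TB capacity HDDs: leave 4x 4TB = 16TB unallocated
print(reserved_capacity_tb(6, 4))  # 16
```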

## Example of storage design

You should know that in a hyperconverged infrastructure, storage and compute are related because both components reside in the same box. So, before calculating the required raw capacity, you should have evaluated two things: the number of nodes you plan to deploy and the required usable storage capacity. For this example, let's say that we need four nodes and 20TB of usable capacity.

First, you have to choose a resilience mode. In hyperconverged deployments, usually 2-way mirroring or 3-way mirroring is implemented. If you choose 2-way mirroring (tolerates 1 fault), you get 50% usable capacity. If you choose 3-way mirroring (recommended, tolerates 2 faults), you get only 33% usable capacity.

PS: At the time of writing, Microsoft has announced deduplication for ReFS volumes in the next Windows Server release.

So, if you need 20TB of usable capacity and you choose 3-way mirroring, you need at least 60TB (20 x 3) of raw storage capacity. That means each of the four nodes needs 15TB of raw capacity.
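The arithmetic above can be checked with a few lines of plain Python (no S2D API involved):

```python
usable_tb = 20
mirror_copies = 3            # 3-way mirroring keeps three copies of each block
nodes = 4

raw_total_tb = usable_tb * mirror_copies   # 60TB of raw capacity for the cluster
raw_per_node_tb = raw_total_tb / nodes     # 15TB of raw capacity per node
print(raw_total_tb, raw_per_node_tb)       # 60 15.0
```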

Now that you know you need 15TB of raw storage per node, you need to define the number of capacity storage devices. If you need maximum performance, you can choose only NVMe devices, but this solution is very expensive. For this example, I choose SSDs for the cache and HDDs for the capacity.

Next, I need to define which kind of HDD to select. If I choose 4x 4TB HDDs per node, I get 16TB of raw capacity per node, and I need to add an additional 4TB HDD for the reserved capacity. But this solution is not good regarding the cache ratio: no cache ratio can be respected with five capacity devices. In this case I need to add another 4TB HDD to reach 6x 4TB HDDs per node (24TB raw capacity), which respects a cache ratio of 1:2 or 1:3.

The other solution is to select 2TB HDDs: I need 8x 2TB HDDs to reach the required raw capacity, plus an additional 2TB HDD for the reserved capacity. I get 9x 2TB HDDs and can respect a 1:3 cache ratio. I prefer this solution because it is closest to the specifications.

Now we need to design the cache devices. For our solution, we need 3 cache devices with a total capacity of at least 1.8TB (10% of the raw capacity per node). So I choose to buy 800GB SSDs (because my favorite cache SSD, the Intel S3710, exists in 400GB or 800GB :)). 800GB x 3 = 2.4TB of cache capacity per node.

So, each node will have 3x 800GB SSDs and 9x 2TB HDDs, with a cache ratio of 1:3. The total raw capacity is 72TB and the reserved capacity is 8TB. The usable capacity will be 21.12TB ((72-8) x 0.33).
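Putting the whole design together, here is a quick sanity check of these numbers in Python:

```python
nodes = 4
hdd_per_node, hdd_tb = 9, 2          # 9x 2TB capacity HDDs per node
cache_per_node = 3                   # 3x 800GB cache SSDs per node

raw_total = nodes * hdd_per_node * hdd_tb        # 72TB raw capacity
reserved = min(nodes, 4) * hdd_tb                # one drive per node, up to four: 8TB
usable = round((raw_total - reserved) * 0.33, 2) # 3-way mirroring, ~33% efficiency
ratio = hdd_per_node // cache_per_node           # capacity devices per cache device

print(raw_total, reserved, usable, ratio)  # 72 8 21.12 3
```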

I have made this demonstration with Fault Domain Awareness at the node level. If you configure Fault Domain Awareness at the chassis or rack level, the calculation is different. For example, at the rack level, you divide the total raw capacity across the number of racks, and you also need exactly the same number of nodes per rack. With this configuration and the above case, you need 15TB of raw capacity per rack.

The post Storage Spaces Direct: plan the storage for hyperconverged appeared first on Tech-Coffee.

]]>
https://www.tech-coffee.net/storage-spaces-direct-plan-the-storage-for-hyperconverged/feed/ 4 5625
Choose properly your CPU for Storage Spaces Direct https://www.tech-coffee.net/choose-properly-your-cpu-for-storage-spaces-direct/ https://www.tech-coffee.net/choose-properly-your-cpu-for-storage-spaces-direct/#respond Mon, 09 Jan 2017 21:54:17 +0000 https://www.tech-coffee.net/?p=4989 When building a hyperconverged solution, the CPU is the most important part. This component defines the number of vCPU supported by your infrastructure and so, defined the number of servers to deploy. Today we can install hundreds of RAM in GB, and dozens of storage in TB. Compared to these components the number of CPU ...

The post Choose properly your CPU for Storage Spaces Direct appeared first on Tech-Coffee.

]]>
When building a hyperconverged solution, the CPU is the most important part. This component defines the number of vCPUs supported by your infrastructure and therefore the number of servers to deploy. Today we can install hundreds of GB of RAM and dozens of TB of storage; compared to these components, the number of CPU cores is small. So, a bigger CPU (in terms of number of cores) almost always reduces the number of servers to buy. On the other hand, with the new Windows Server 2016 license model, a bigger CPU means a higher license cost.

This topic introduces a way to choose the right CPU to balance infrastructure cost (hardware and software) and performance. It is based on feedback from a real case study I have done.

## Theoretical case study values

For this topic, I'll use theoretical values to show you the cost difference between several CPUs. The study uses the following requirements:

• 500 vCPUs
• 3TB of RAM
• 20TB of Storage

For this topic, I'll focus on the Windows Server 2016 Datacenter license, because Storage Spaces Direct requires this edition. Windows Server 2016 is licensed on a per-core basis. The Windows Server 2016 license covers up to two sockets of 8 physical cores (16 cores per server). Beyond these 16 cores, you have to buy additional license packs, each covering up to two physical cores.

For example, if you have two CPUs of 14 cores each, the server has 28 cores. So, you buy the Windows Server 2016 Datacenter license, which covers up to 16 cores, and for the 12 remaining cores you have to buy 6 additional license packs.
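The license math can be sketched like this, using the list prices quoted in this article (check current Microsoft pricing before quoting a customer):

```python
import math

BASE_PRICE = 6155      # Windows Server 2016 Datacenter, covers 16 cores
PACK_PRICE = 760       # each additional pack covers 2 cores

def ws2016_datacenter_cost(total_cores):
    """Return (extra packs needed, total license cost in $)."""
    extra_cores = max(0, total_cores - 16)
    packs = math.ceil(extra_cores / 2)
    return packs, BASE_PRICE + packs * PACK_PRICE

# Two 14-core CPUs: 28 cores, so 6 extra packs
print(ws2016_datacenter_cost(28))  # (6, 10715)
```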

The public cost of Windows Server 2016 Datacenter is:

• 6155$ for Windows Server 2016 Datacenter
• 760$ for a license pack

## Selected CPU

For this topic, I have selected two CPUs:

• Intel Xeon E5-2630v4 (667$): 10 cores, 25MB cache, 2.20GHz
• Intel Xeon E5-2683v4 (1846$): 16 cores, 40MB cache, 2.10GHz

The Intel Xeon E5-2683v4 has more cache but runs 100MHz slower than the Intel Xeon E5-2630v4. However, its higher core count supports more vCPUs per node.

## Calculate the required hardware

For this infrastructure, 500 vCPUs are required. Because we will host server workloads, we can set the consolidation ratio to 4 (four vCPUs per physical core):

Number of cores = 500 vCPUs / 4 (consolidation ratio)

So, we need 125 cores. For the calculation, I never take Hyper-Threading into account. The table below shows the number of servers required depending on the CPU. The #Nodes column is the raw node count rounded up, with an additional node for N+1. To find the number of nodes, I divided the required cores by the number of cores per node. As you can see, with the E5-2630v4, you need two additional nodes.

| Required cores | Processor | #Cores / node | #Nodes (raw) | #Nodes |
|---|---|---|---|---|
| 125 | E5-2630v4 | 20 | 6.25 | 7 |
| 125 | E5-2683v4 | 32 | 3.91 | 5 |

Now that we have the number of nodes, we can calculate the required memory. To size the memory per node, I divide by the number of nodes minus one, to support a node failure. With the E5-2630v4, you need 500GB per node, for 3.5TB of RAM in the cluster. With the E5-2683v4, you need 750GB per node, for 3.75TB in the cluster. You need more RAM per cluster with the E5-2683v4 because you have fewer nodes.

| Required RAM (GB) | CPU | #Nodes | #Nodes - 1 | RAM per node (GB) | Total RAM (GB) |
|---|---|---|---|---|---|
| 3000 | E5-2630v4 | 7 | 6 | 500 | 3500 |
| 3000 | E5-2683v4 | 5 | 4 | 750 | 3750 |
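The N-1 memory sizing in the table can be reproduced with a short sketch:

```python
def ram_per_node_gb(required_gb, nodes):
    """Size RAM per node so the workload still fits with one node down."""
    return required_gb / (nodes - 1)

print(ram_per_node_gb(3000, 7))  # 500.0 -> 3.5TB across 7 nodes
print(ram_per_node_gb(3000, 5))  # 750.0 -> 3.75TB across 5 nodes
```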

Finally, for the storage, we need 20TB. We use Storage Spaces Direct and, to host virtual machines, 3-way mirroring. I have selected the following configurations:

• Intel Xeon E5-2630v4 (7 nodes) – Total 22TB with 4TB of reservation
  • 4x SSD 400GB (Cache)
  • 6x HDD 2TB (Capacity)
• Intel Xeon E5-2683v4 (5 nodes) – Total 24TB with 8TB of reservation
  • 4x SSD 400GB (Cache)
  • 4x HDD 4TB (Capacity)

To make the price comparison related to the selected hardware, I have used Dell hardware (Dell R730xd):

| Node E5-2630v4 | Node E5-2683v4 |
|---|---|
| 2x E5-2630v4 | 2x E5-2683v4 |
| 512GB of RAM | 750GB of RAM |
| 2x SSD 120GB (OS) | 2x SSD 120GB (OS) |
| 4x SSD 400GB (Cache) | 4x SSD 400GB (Cache) |
| 6x HDD 2TB (Capacity) | 4x HDD 4TB (Capacity) |
| 1x Mellanox 25GB dual controller | 1x Mellanox 25GB dual controller |
| 27K$ per node | 35K$ per node |
| 189K$ for 7 nodes | 175K$ for 5 nodes |
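A quick check of the cluster totals, using the per-node prices quoted above (in k$):

```python
configs = {
    "E5-2630v4": {"nodes": 7, "price_per_node_k": 27},
    "E5-2683v4": {"nodes": 5, "price_per_node_k": 35},
}

# Fewer, bigger nodes end up cheaper overall despite the pricier CPU
for name, c in configs.items():
    total_k = c["nodes"] * c["price_per_node_k"]
    print(name, total_k)  # E5-2630v4 189, then E5-2683v4 175
```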

## Conclusion

I have written this topic to show you that a bigger CPU doesn't mean a more expensive solution. So, for your hyperconverged solution with Storage Spaces Direct, you should evaluate several types of CPU. By taking time in the planning phase, you can save money and implement a more scalable solution.

The post Choose properly your CPU for Storage Spaces Direct appeared first on Tech-Coffee.

]]>
https://www.tech-coffee.net/choose-properly-your-cpu-for-storage-spaces-direct/feed/ 0 4989
Create a VMware vSAN cluster step-by-step https://www.tech-coffee.net/create-a-vmware-vsan-cluster-step-by-step/ https://www.tech-coffee.net/create-a-vmware-vsan-cluster-step-by-step/#comments Thu, 25 Aug 2016 09:57:06 +0000 https://www.tech-coffee.net/?p=4754 As Microsoft, VMware has a Software-Defined Storage solution called vSAN which is currently in version 6.2. This solution enables to aggregate local device storages as mechanical disks or SSD and create a highly available datastore. There are two deployment models: hybrid solution and full-flash solution. In hybrid solution, you must have flash cache devices and ...

The post Create a VMware vSAN cluster step-by-step appeared first on Tech-Coffee.

]]>
Like Microsoft, VMware has a Software-Defined Storage solution, called vSAN, which is currently in version 6.2. This solution aggregates local storage devices, such as mechanical disks or SSDs, to create a highly available datastore. There are two deployment models: the hybrid solution and the full-flash solution.

In the hybrid solution, you must have flash cache devices and mechanical capacity devices (SAS or SATA). In the full-flash solution, you have only flash devices for both cache and capacity. The disks, whether cache or capacity, are aggregated into disk groups. Each disk group can have 1 cache device and up to 7 capacity devices, and each host can handle at most 5 disk groups (35 capacity devices per host).
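These vSAN 6.2 limits can be sketched as a small validation helper (a hypothetical function of mine, not a VMware API):

```python
def vsan_host_layout_ok(disk_groups):
    """disk_groups: list of (cache_count, capacity_count) tuples.
    vSAN 6.2: 1 cache device and up to 7 capacity devices per disk
    group, and at most 5 disk groups per host."""
    if not disk_groups or len(disk_groups) > 5:
        return False
    return all(c == 1 and 1 <= cap <= 7 for c, cap in disk_groups)

# Three disk groups of 1 flash cache + 4 capacity disks each
print(vsan_host_layout_ok([(1, 4)] * 3))   # True
# Six disk groups exceed the per-host limit
print(vsan_host_layout_ok([(1, 7)] * 6))   # False
```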

In this topic I will describe how to implement a hybrid solution on a three-node cluster. For the demonstration, the ESXi nodes are located in virtual machines hosted by VMware Workstation. Unfortunately, Hyper-V on Windows Server 2016 does not handle ESXi very well: only IDE controllers and legacy network adapters are supported. So I can't use my Storage Spaces Direct lab to host a vSAN cluster.

## VMware vSAN lab overview

To run this lab, I have installed VMware Workstation 12.x Pro on a traditional machine (a gaming computer) running Windows 10 version 1607. Each ESXi virtual machine is configured as below:

• ESXi 6.0 update 2
• 2x CPU with 2x Cores each
• 16GB of memory (6GB required, more than 8GB recommended)
• 1x OS disk (40GB)
• 15x hard disks (10GB each)

Then I have deployed the vCenter server 6.0 update 2 in a single Windows Server 2012 R2 virtual machine.

I have deployed the following networks:

• Management: 10.10.0.0/24 (VLAN ID: 10) – Native VLAN
• vSAN traffic: 10.10.101.0/24 (VLAN ID: 12)
• vMotion traffic: 10.10.102.0/24 (VLAN ID: 13)

In this topic, I assume that you have already installed your ESXi and vCenter server. I assume also that each server is reachable on the network and that you have created at least one datacenter in the inventory. All the screenshots have been taken from the vSphere Web Client.

## Add ESXi host to the inventory

First of all, connect to your vSphere Web Client and navigate to Hosts and Clusters. As you can see in the following screenshot, I have already created several datacenters and folders. To add the host to the inventory, right click on a folder and select Add Host.

Next specify the host name or IP address of the ESXi node.

Then specify the credentials to connect to the host. Once the connection is made, a permanent account is created and used for management instead of the specified account.

Then select the license to assign to the ESXi node.

On the next screen, choose whether you want to prevent users from logging in directly to this host.

To finish, choose the VM location.

Repeat these steps to add more ESXi nodes to the inventory. For the vSAN usage, I will add two additional nodes.

## Create and configure the distributed switch

When you buy a vSAN license, support for a single distributed switch is included. To carry the vSAN, vMotion and management traffic, I'm going to create a distributed switch with three VMKernel adapters. To create the distributed switch, navigate to Networking, right click on VM Network in a datacenter and choose New Distributed Switch as below.

Specify a distributed switch name and click on Next.

Choose a distributed switch version. Because I only have ESXi version 6.0, I choose the latest version of the distributed switch.

Next change the number of uplinks as needed and specify the name of the port group. This port group will contain VMKernel adapters for vMotion, vSAN and management traffic.

Once the distributed switch is created, click on it and navigate to Manage and Topology. Click on the button circled in red in the screenshot below to add physical NICs to the uplink port group and to create VMKernel adapters.

In the first screen of the wizard, select Add hosts.

Specify each host name and click on Next.

Leave the default selection and click on Next. With these tasks selected, I'll add physical adapters to the uplink port group and create VMKernel adapters.

In the next screen, assign the physical adapter (vmnic0) to the uplink port group of the distributed switch which has just been created. Once you have assigned all physical adapters, click on Next.

On the next screen, I’ll create the VMKernel adapters. To create them, just click on New adapter.

Select the port group associated to the distributed switch and click on Next.

Then select the purpose of the VMKernel adapter. For this one I choose Virtual SAN traffic.

Then specify an IP address for this virtual adapter. Click on Next to finish the creation of VMKernel adapter.

I create again a new VMKernel adapter for vMotion traffic.

Repeat the creation of VMKernel adapters for each ESXi host. At the end, you should have something like below:

Before making the configuration, the wizard analyzes the impact. Once everything is OK, click on Next.

When the distributed switch is configured, it looks like this:

## Create the cluster

Now that the distributed switch and virtual network adapters are set, we can create the cluster. Come back to Hosts and Clusters in the navigator, right click on your folder and select New Cluster.

Give a name to your cluster and, for the moment, just turn on Virtual SAN. I choose manual disk claiming because I have to set manually which disks are flash and which are HDDs: since the ESXi nodes are in VMs, all hard disks are detected as flash.

Next, move the nodes into the cluster (drag and drop). Once all nodes are in the cluster, you should see an alert saying that there is no capacity. This is because we selected manual claiming and no disks are suitable for vSAN yet.

## Claim storage devices into vSAN

To claim a disk, select the cluster where vSAN is enabled and navigate to Disk Management. Click on the button encircled in red in the below screenshot:

As you can see in the screenshot below, all the disks are marked as flash. In this topic I want to implement a hybrid solution, and the vSphere Web Client offers the opportunity to manually mark a disk as HDD. This is possible because in production, some hardware is not well detected; in that case, you can set it manually. For this lab, I leave three disks as flash and set 12 disks as HDD on each node. With this configuration, I will create three disk groups, each composed of one cache device and four capacity devices.

Then you have to claim the disks. For each node, select the three flash disks and claim them for the cache tier. All the disks that you marked as HDD can be claimed for the capacity tier.

Once the claiming wizard is finished, you should have three disk groups per node.

If you want to assign the license to your vSAN, navigate to Licensing and select the license.

## Final configuration

Now that vSAN is enabled, you can turn on vSphere HA and vSphere DRS to distribute virtual machines across the nodes.

Some vSphere HA settings must be changed in vSAN environment. You can read these recommendations in this post.

## VM Storage policy

vSAN relies on VM Storage Policies to configure storage capabilities. This configuration is applied on a per-VM basis through the VM Storage Policy. We will discuss VM Storage Policies in another topic; for the moment, just verify that the Virtual SAN Default Storage Policy exists in the VM Storage Policies store.

## Conclusion

In this topic we have seen how to create a vSAN cluster. There is no challenge in this, but it is just the beginning: to use vSAN you have to create VM Storage Policies, and some of the capacity concepts are not easy. We will discuss VM Storage Policies later. If you are interested in the equivalent Microsoft solution, you can read this whitepaper.

The post Create a VMware vSAN cluster step-by-step appeared first on Tech-Coffee.

]]>
https://www.tech-coffee.net/create-a-vmware-vsan-cluster-step-by-step/feed/ 4 4754