Notes From the Field: VSAN Design

With the official release of VMware VSAN a little over a month ago on March 11th, when ESXi 5.5 U1 dropped, I have been having more conversations with customers around the product and designing solutions. While with some customers it has been more of an inquisitive peek at the technology, I have had the chance to work on a few designs (OK, two) with customers looking to deploy VSAN over a “traditional” storage array for their storage needs.

For both configurations we went with the “roll your own” solution over configurations available via the Ready Node program. For this reason I leaned heavily on three key resources for the builds:

*Dell documentation is listed because the server/compute/storage components are based on Dell platforms

I am not going to provide a deep-dive review of VSAN, as there are plenty of resources available on the internet/blogs, as well as in the documentation listed above, that will provide the needed details. But what I will give is a quick breakdown of the storage requirements laid out by VMware for a VSAN deployment.
Artifact                              Minimum                 Maximum
Disk Groups                           One per host            Five per host
Flash Devices (SAS, SATA, PCIe SSD)   One per disk group      One per disk group
Magnetic Disk Devices                 One HDD per disk group  Seven HDDs per disk group
Disk Formatting Overhead              750MB per HDD           750MB per HDD

*Table from page 3 of the “VMware Virtual SAN Design and Sizing Guide”

For our specific use case and customer requirements we will be deploying a three-node cluster (the minimum for VSAN) with the default settings of Number of Failures to Tolerate set to 1 and Number of Disk Stripes per Object set to 1 as well. We are aiming for around twelve usable terabytes of space to start.

      • Number of Failures to Tolerate – This setting controls the number of replica copies of the virtual machine VMDK(s) created across the cluster. With the default value set to 1, two replicas of the VMDK(s) will be created. As you increase this value you will provide additional redundancy for the virtual machine at the cost of using additional storage for the replica copies. The maximum value is 3.
      • Number of Disk Stripes per Object – The number of HDDs across which each replica of a virtual machine object is striped. A value higher than 1 might result in better performance, but also results in higher use of system resources.
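The effect of the Failures to Tolerate policy on raw capacity can be sketched in a few lines of Python (the function name and values here are my own illustration, not from the VMware documentation):

```python
def raw_vmdk_consumption(vmdk_size_gb, ftt=1):
    """Raw cluster capacity consumed by one VMDK under a given
    Number of Failures to Tolerate (FTT) policy.

    With FTT set to n, VSAN keeps n + 1 replicas of the object, so
    raw usage is (ftt + 1) times the VMDK size.  Stripe width spreads
    each replica across HDDs but does not change capacity consumed.
    """
    return vmdk_size_gb * (ftt + 1)

print(raw_vmdk_consumption(100))         # 100GB VMDK at the default FTT=1 -> 200GB raw
print(raw_vmdk_consumption(100, ftt=3))  # same VMDK at the maximum FTT=3 -> 400GB raw
```

This is why the usable-capacity math later in the post divides by (ftt + 1).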


The Build

As mentioned above, I will be leveraging servers from Dell for this configuration. To meet the minimum requirements defined by the customer we went with Dell R720’s as the host servers, with the capability to hold 16 x 2.5 inch drives in a single chassis. Utilizing the 16-drive chassis gives us the ability to create at least two fully populated VSAN disk groups (7+1) per host for future growth/expansion (one now, one down the road). To allow for the use of all 16 slots we will be leveraging redundant SD cards for the ESXi installation (Note – Remember to redirect the scratch partition!). Again, since we are building our own solution, I checked and rechecked the VMware VSAN compatibility guide for IO devices (controllers/HDD/SSD) as well as the ESXi compatibility guide for supported servers and components.


For the actual drive configuration I took to the VSAN HCL to verify the supported drives from Dell. As stated above, each disk group needs to have one flash device and at least one magnetic device. To meet the overall storage requirement (calculation below) of twelve usable terabytes, the first disk group will be made up of 7 x 1.2TB 10K SAS drives. The flash device used for the read/write buffering will be the 400GB SSD SATA Value drive. Connectivity of all the drives will be provided by an LSI SAS 9207-8i controller. This controller was chosen as it allows for true pass-through or JBOD mode to present the drives to VMware for the VSAN Datastore creation.

Some might ask why we decided to go with 10K SAS drives over going “cheap and deep” with NL-SAS drives. The largest 2.5 inch NL-SAS drive offering from Dell is 1.0TB, while the largest 2.5 inch SAS drive comes in at 1.2TB for a 10K spindle. Going with the 10K drives provided two design advantages: additional capacity per disk, and additional IOPS for when IO needs to come from spinning disk.

Now for the capacity. The VMware documentation breaks down the math needed to come up with sizing calculations around capacity, objects, components, etc. What I am going to show below is how the chosen configuration gets us to the target number for the customer. On page 9 of the VMware Virtual SAN Design and Sizing Guide the following formula is provided for Cluster Capacity:

Formula: Host x NumDskGrpPerHst x NumDskPerDskGrp x SzHDD = y

My Configuration: 3 x 1 x 7 x 1.2TB = 25.2TB

But that is only one step in the process. After calculating the Cluster Capacity I need to get to the number I really care about, Usable Capacity. Again from the VMware documentation on page 10 I get the following:

Formula: (DiskCapacity – DskGrp x DskPerDskGrp x Hst x VSANoverhead)/(ftt+1)

My Configuration: (25,804GB – 1 x 7 x 3 x 1GB)/(1+1) = (25,804 – 21)/2 = 25,783/2 ≈ 12,891GB, or roughly 12.8TB. (The 25.2TB of raw capacity converts to roughly 25,804GB, and the ~750MB per-HDD formatting overhead is rounded up to 1GB per drive.)
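As a quick sanity check, the two formulas above can be sketched in Python. The function names are mine, and the per-HDD formatting overhead is rounded up to 1GB to match the arithmetic above (the sizing guide states ~750MB):

```python
def cluster_raw_capacity_gb(hosts, dskgrp_per_host, hdd_per_dskgrp, hdd_size_tb):
    # Host x NumDskGrpPerHst x NumDskPerDskGrp x SzHDD, converted from TB to GB
    return hosts * dskgrp_per_host * hdd_per_dskgrp * hdd_size_tb * 1024

def usable_capacity_gb(raw_gb, hosts, dskgrp_per_host, hdd_per_dskgrp,
                       ftt=1, overhead_gb_per_hdd=1):
    # (DiskCapacity - DskGrp x DskPerDskGrp x Hst x VSANoverhead) / (ftt + 1)
    overhead = dskgrp_per_host * hdd_per_dskgrp * hosts * overhead_gb_per_hdd
    return (raw_gb - overhead) / (ftt + 1)

# 3 hosts, 1 disk group each, 7 x 1.2TB HDDs per group, FTT=1
raw = cluster_raw_capacity_gb(3, 1, 7, 1.2)    # ~25,804 GB (25.2TB raw)
usable = usable_capacity_gb(raw, 3, 1, 7)      # ~12,891 GB (~12.8TB usable)
print(round(raw), round(usable))
```

Adding the second disk group per host down the road (dskgrp_per_host=2) would roughly double both numbers.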

Now a word on flash devices. One thing to make note of: in your VSAN capacity calculations the flash devices DO NOT participate. Since they are only used for read/write buffering, they don’t contribute to the overall storage pool. Also, you may be wondering how/why I chose the 400GB SSDs for my flash tier. Stated on page 7 of the VMware documentation:

The general recommendation for sizing flash capacity for Virtual SAN is to use 10 percent of the anticipated consumed storage capacity before the number of failures to tolerate is considered.

By that statement I have oversized my flash tier, as initially the customer will be using only a percentage of the twelve terabytes of capacity. But I like to play things a little safer and sized my flash tier based on ten percent of the usable capacity (VMware’s original sizing guideline), as the difference in price between a 200GB and a 400GB SSD is nominal. In addition, we have sized for the future utilization of the usable capacity to fall in line with VMware’s statement above.
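The flash sizing I landed on can be shown with the same kind of back-of-the-napkin math (the ten-percent figure is from the sizing guide; sizing against the full usable capacity rather than anticipated consumed capacity is my conservative assumption):

```python
usable_gb = 12891   # usable capacity from the calculation above
hosts = 3

# 10% of capacity as flash, spread evenly across the three hosts
flash_total_gb = usable_gb * 0.10
flash_per_host_gb = flash_total_gb / hosts
print(round(flash_total_gb), round(flash_per_host_gb))  # ~1289 GB total, ~430 GB per host
```

At roughly 430GB of flash per host, a single 400GB SSD per disk group comes in close to target, while a 200GB SSD would fall well short once the datastore fills out.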

Comments

  1. Great post! Another reason I’m finding to oversize flash is the policy for Read Cache Reservation. I’ve still not seen anything official on this; however, some within VMware are saying don’t use it, or use it with caution, while VMware themselves showed the feature in the product demo at launch.

    Either way, if you have an opportunity to provide additional cache performance to a workload that requires it, I would say you should always size your flash larger than the 10% anyway to take advantage of this policy if needed.
