Capacity Expansion & Disk Group Design Decisions–All Flash vSAN

One of the things I enjoy most in my job as a consultant is working with people through the design process to come up with a solution that solves a specific customer challenge or meets the design requirements. On these projects there is usually more than one solution or configuration that fits the stated needs, so it comes down to weighing the pros and cons and matching each option against the project requirements.

This was evident recently when I was working on a VMware vSAN design for a customer. A conversation occurred around the design and layout of the Disk Group construct that vSAN leverages to create the underlying Datastore. These considerations are typically straightforward, but with the release of vSAN 6.2 and the inclusion of deduplication/compression for All Flash (AF) implementations there are both technical and operational decisions to take into account. Before we get into that, here is a quick primer on vSAN configurations.

vSAN Disk Groups – The Basics

I won’t go too far into this as there exists an extreme amount of information about vSAN and AF-vSAN in the blogosphere and in official VMware documentation. To get your learn on jump over to this LINK and review the Design and Sizing Guide along with Space Efficiency Technologies documents. Well worth the read.

So, let’s get on with it. vSAN works in the constructs of Disk Groups to provide the performance and capacity needs for a vSphere environment. In the Hybrid model a flash device (SSD/NVMe) fronts a given disk group and is leveraged as a read cache and write buffer tier. Behind the flash device traditional magnetic media is used to create the capacity tier. A given disk group can only have a single flash cache device, with a minimum of one and a maximum of seven capacity disks behind it. In AF-vSAN configurations a single flash device still sits in front of the disk group but is used solely as a write buffer tier. All read I/O operations are served from the capacity tier, as it is made up of flash media. Finally, a given host needs at least one Disk Group and can have a maximum of five.
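To make those limits concrete, here is a minimal sketch that checks a proposed host layout against them. The `DiskGroup` class and `validate_host` function are hypothetical names made up for illustration; this is not a VMware API, just the rules from the paragraph above written down as code.

```python
from dataclasses import dataclass
from typing import List

# Limits as described in the text above.
MIN_CAPACITY_DISKS = 1
MAX_CAPACITY_DISKS = 7
MIN_DISK_GROUPS_PER_HOST = 1
MAX_DISK_GROUPS_PER_HOST = 5


@dataclass
class DiskGroup:
    """Hypothetical model: one flash cache device fronting N capacity disks."""
    capacity_disks: int


def validate_host(disk_groups: List[DiskGroup]) -> List[str]:
    """Return a list of problems; an empty list means the layout fits the limits."""
    problems = []
    count = len(disk_groups)
    if not MIN_DISK_GROUPS_PER_HOST <= count <= MAX_DISK_GROUPS_PER_HOST:
        problems.append(f"host has {count} disk groups; expected 1 to 5")
    for index, group in enumerate(disk_groups):
        if not MIN_CAPACITY_DISKS <= group.capacity_disks <= MAX_CAPACITY_DISKS:
            problems.append(
                f"disk group {index} has {group.capacity_disks} capacity disks; expected 1 to 7"
            )
    return problems
```

For example, `validate_host([DiskGroup(7), DiskGroup(7)])` returns an empty list, while a single group with eight capacity disks is flagged.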

vSAN Data, Operations, and Disk Failures

With the basic Disk Group construct broken out, let’s dive into how vSAN distributes/lands data on the capacity tier. Note, this is true for both Hybrid and AF configurations with Dedupe/Compression disabled. In the default/out of the box configuration vSAN “stripes” data across only a single disk (this can be altered with Storage Policies) in 255GB objects. If an object is larger than 255GB (think large VMDK), it is broken into multiple parts that are “striped” across one or more disks.
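As a quick bit of arithmetic on that 255GB limit, the sketch below estimates how many parts a single copy of an object is broken into. The function name is hypothetical, and the calculation deliberately ignores replica copies (FTT), witnesses, and any stripe width set through Storage Policies.

```python
import math

MAX_PART_GB = 255  # size limit mentioned in the text above


def parts_for_object(size_gb: float) -> int:
    """Estimate the number of parts for one copy of an object of the given size."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    return math.ceil(size_gb / MAX_PART_GB)
```

Under these assumptions a 100GB VMDK stays in one part, while a 600GB VMDK is split into three.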

Operationally, increasing the performance or capacity of your vSAN deployment is pretty easy to accomplish and handled via the Web UI:

· If your Disk Group(s) are not fully populated with capacity drives, just slide more drives into each host, then add them to the established Disk Group(s)

· If your Disk Group(s) are fully populated (and you have open slots/bays in your server) create a new Disk Group by adding a flash caching device and the needed capacity drive media

· If your Disk Groups and hosts are full, purchase additional host(s) and add them to the cluster

With any of the above methods, you will want to issue a proactive rebalance to redistribute vSAN data across the added capacity.

As you can imagine, a vSAN Disk Group represents a failure domain that needs to be considered. In the example thus far (again no Dedupe/Compression enabled) the following holds true:

· If the flash buffer/caching device fronting a Disk Group fails, the corresponding Disk Group will be marked as “Degraded” and go offline

· If a capacity disk fails, the individual disk will be marked as “Degraded”, but the VM(s) will still be accessible from the replica copy of data (Again, assuming vSAN defaults with FTT=1)

vSAN Data, Operations, & Disk Failures – AF-vSAN w/Dedupe & Compression

Alright, with the groundwork set let’s start getting to the point. When enabling Dedupe/Compression on your vSAN cluster, the data layout and disk failure behavior change a bit. The first difference is that instead of objects being written to a single capacity drive, the data is now “wide striped” across ALL capacity disks in the Disk Group.

With this striping of data across all drives in a Disk Group, the operational tasks change a bit when trying to add capacity or performance to the cluster:

· If your Disk Group(s) are not fully populated with capacity drives, additional drives can be added. But, you will have to do a complete data migration from the Disk Group as the Disk Group will have to be deleted and recreated with the additional disk(s). Consider the operational overhead of re-syncing data across your vSAN Datastore as well as having to accomplish this across each host in your cluster.

· If your Disk Group(s) are fully populated (and you have open slots/bays in your server) create a new Disk Group by adding a flash caching device and the needed capacity drive media

· If your Disk Groups and hosts are full, purchase additional host(s) and add them to the cluster
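The two expansion lists (with and without Dedupe/Compression) boil down to a small decision tree. Here is a rough sketch of it; the function name and the returned strings are hypothetical and simply restate the bullets above, checked in the same order.

```python
def expansion_path(group_has_room: bool, host_has_free_bays: bool,
                   dedupe_enabled: bool) -> str:
    """Pick an expansion approach following the bullets above, in order."""
    if group_has_room:
        if dedupe_enabled:
            # With Dedupe/Compression the Disk Group must be rebuilt to take new disks.
            return "evacuate data, delete and recreate the Disk Group with the added disks"
        return "add capacity disks to the existing Disk Group"
    if host_has_free_bays:
        return "create a new Disk Group (flash cache device plus capacity disks)"
    return "add hosts to the cluster"
```

The only branch that changes when Dedupe/Compression is turned on is the first one, which is exactly where the operational overhead comes from.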

The drawback to this striping of data leads to the following failure considerations:

· If the flash buffer device fronting a Disk Group fails, the corresponding Disk Group will be marked as “Degraded” and go offline (no change here)

· If a flash capacity disk fails, the entire Disk Group will be marked as “Degraded”, but the VM(s) will still be accessible from the replica copy of data
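Putting the two sets of failure bullets side by side, the failure domain for a given device can be sketched like this. The function and its string values are hypothetical and only encode the behavior described in this post.

```python
def failure_scope(failed_device: str, dedupe_enabled: bool) -> str:
    """Return what is marked Degraded: 'disk' or 'disk_group'."""
    if failed_device == "cache":
        # A failed cache/buffer device takes the whole Disk Group offline either way.
        return "disk_group"
    if failed_device == "capacity":
        # With Dedupe/Compression, data is striped across all capacity disks,
        # so a single capacity disk failure affects the entire Disk Group.
        return "disk_group" if dedupe_enabled else "disk"
    raise ValueError(f"unknown device type: {failed_device!r}")
```

In both cases the VM(s) remain accessible from the replica copy, assuming the default FTT=1 policy mentioned earlier.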

And finally, dedupe and compression are handled at the per Disk Group level; they are not “Global” in a vSAN deployment. This means there is a potential that fewer/larger Disk Groups achieve a higher dedupe/compression ratio than more/smaller Disk Groups across your hosts.
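A toy model helps show why the dedupe domain matters. In the sketch below each Disk Group is just a list of block labels, and duplicates are only removed within a group. This is an illustrative assumption, not how vSAN fingerprints or places data, and it ignores compression entirely.

```python
from typing import List


def dedupe_ratio(disk_groups: List[List[str]]) -> float:
    """Logical blocks divided by stored blocks, deduplicating per Disk Group only."""
    logical = sum(len(group) for group in disk_groups)
    stored = sum(len(set(group)) for group in disk_groups)
    if stored == 0:
        raise ValueError("no blocks to deduplicate")
    return logical / stored


# The same eight blocks, landed on one large group versus two smaller groups.
blocks = ["A", "B", "A", "B", "A", "B", "C", "C"]
one_large_group = dedupe_ratio([blocks])
two_small_groups = dedupe_ratio([blocks[:4], blocks[4:]])
```

With this made-up data the single group stores three unique blocks, while the two smaller groups store five between them because "A" and "B" are kept once in each group, so the ratio drops.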

Get to the Point Already

Hopefully you have stayed with me thus far, as I am finally getting to the point. For the vSAN designs I typically build for customers, I try to keep them streamlined and as easy as possible to administer and manage. When it comes to AF-vSAN leveraging dedupe and compression my “Best Practice” is:

· Fully populate a given Disk Group

· If the server form factor allows it, create two Disk Groups per host

So, with all of this laid out, what is the best approach? Well, like I get to say in my job, it depends. Based on the details and ideas listed above, each configuration has its pros and cons. You will need to weigh the operational impact/overhead of creating a single/larger Disk Group against the capacity/data savings drawback of multiple/smaller Disk Groups.

Single/Larger Disk Group

Pros:

· Potential for increased dedupe/compression ratio

· Greater per host density

Cons:

· Operational overhead to expand (if not already fully populated)

· Higher rebuild/resync times in case of device failure

· Potential for less performance

Multiple/Smaller Disk Group

Pros:

· Incremental capacity/performance expansion

· Potential for increased performance (more drives)

· Rebuild/resync of data will be faster in case of device failure (less data per drive)

· Ease of operational overhead for Disk Group expansion

Cons:

· Higher cost per Disk Group

· Per host density is lower

· Lower dedupe/compression ratio

VSAN Accepted Everywhere Stickers & T-Shirts

Well, another VMworld is in the books, and like years past it didn’t disappoint. It was great to catch up with new and old friends in the VMware community as well as take in a few sessions (and even sit in on a panel). Like last year, some of the most popular sessions and Hands on Labs (HoL) focused on VMware’s Virtual SAN, or VSAN technology.

Like VMware, I took the opportunity this year to showcase VSAN and created a sticker and t-shirt to hand out. I came prepared with a couple hundred stickers and three dozen shirts to hand out to conference attendees. Needless to say they were a huge success and I quickly went through both. If I missed you, here is your chance to get your hands on some stickers. Just fill out the form below and I will drop a pair of stickers to you in the mail (for US and Canada only please). If you feel so inclined, click on the “Donate” button below to help cover cost and postage.



If you are interested in ordering a t-shirt (the form above is for stickers ONLY), send me an email or direct message on twitter and I will pass along ordering details:



Notes From the Field – Hybrid or All Flash VSAN?

Over the last several months I have been involved in a few customer meetings where the customer was looking to migrate away from a “traditional” storage architecture/array, and the subject of Hyper Converged Infrastructure (HCI) came up. Usually the topic covers one of the “big three” in the HCI space, be it Nutanix (XC for us Dell partners), Simplivity, or VMware’s VSAN product. Recently though I have come across two projects where the customer initially started out looking at VMware VSAN in its Hybrid configuration (SSD for cache/buffer and magnetic disk for capacity), but then the conversation switched to the All Flash version and its corresponding data reduction technologies in VSAN 6.2 (Deduplication/Compression/Erasure Coding).

For these two scenarios (actual customer scenarios), I was interested in seeing if going All Flash could pretty much be the “standard” deployment model based on VSAN 6.2 enhancements, or if cost could still be a limiting factor. For the pricing models that follow, I used list pricing from both VMware and Dell for the software and hardware. I would not expect anyone to pay these prices on the street; they were used strictly for a cost comparison.

[Read more…]



Over the last 18 months or so I have put together several posts around configuring/designing/implementing VMware’s VSAN hyper-converged product, both in my lab and working with customers. Almost a year ago, when VSAN gained support for an All-Flash (AF) configuration, I updated my lab to ditch the spinning disks and move to an all flash model. I thought I was set and good to go, but like most things you can’t leave well enough alone. Over the last few months I made a few tweaks and changes to the lab: I added Intel PCIe Flash devices for the write cache tier and moved from using USB drives for the ESXi install to SATADOMs on the hosts.

I Feel the Need…The Need for Speed…

First things first, everyone seems to care about IOPS numbers, so we will start with PCIe flash. After doing some research/digging on PCIe cards I settled on using the Intel 750 series card. In an effort not to break the bank, and since I did not need a large write tier, I went with the 400GB cards for each of my three VSAN hosts. While only the more expensive big brother of the 750 series (Intel P3xxx series) is on the VSAN HCL, these cards worked without issue right out of the box. One thing of note, I did update the inbox driver to a driver provided by Intel that Pete Koehler (blog / twitter) recommended for overall performance gains.

With the drivers updated and a quick reconfiguration of the VSAN Datastore it was time to do some testing. For a testing model I leveraged three VMs, one on each ESXi host in the cluster, and IOmeter to generate a workload. While synthetic workloads are not the best method for truly getting “real world” performance numbers, IOmeter met my needs for the details I wanted to capture. For a workload metric I leveraged a configuration file based on a 32K block size with 50% read and 50% write. I ran the workload three times on each VM at the same time and the table below details the averages:

[Read more…]