Capacity Expansion & Disk Group Design Decisions – All Flash vSAN

One of the things I enjoy most in my job as a consultant is working with people to help in the design process and come up with a solution that solves a specific customer challenge or meets the design requirements. While working on these projects there is usually more than one solution or configuration that fits the stated needs, and it comes down to a process of weighing the pros/cons of each and matching them against the project requirements.

This was evident recently when I was working on a VMware vSAN design for a customer. A conversation occurred around the design and layout of the Disk Group(s) construct that vSAN leverages to create the underlying Datastore. These considerations are typically straightforward, but with the release of vSAN 6.2 and the inclusion of deduplication/compression for All Flash (AF) implementations there are both technical and operational decisions to take into account. But before we get into that, here is a quick primer on vSAN configurations.

vSAN Disk Groups – The Basics

I won’t go too far into this as there is a wealth of information about vSAN and AF-vSAN in the blogosphere and in official VMware documentation. To get your learn on, jump over to this LINK and review the Design and Sizing Guide along with the Space Efficiency Technologies document. Well worth the read.

So, let’s get on with it. vSAN uses the construct of Disk Groups to provide the performance and capacity needs of a vSphere environment. In the Hybrid model a flash device (SSD/NVMe) fronts a given disk group and is leveraged as a read cache and write buffer tier. Behind the flash device, traditional magnetic media is used to create the capacity tier. A given disk group can only have a single flash device, with a minimum of one and a maximum of seven capacity disks behind it. In AF-vSAN configurations a single flash device still sits in front of the disk group but is used solely as a write buffer; all read I/O operations are served from the capacity tier, since it is made up of flash media. Finally, a given host needs at least one Disk Group and can have a maximum of five Disk Groups.
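As a rough illustration, the disk-group rules above can be expressed as a small validation check. This is just a sketch — the function names are mine, not any VMware API:

```python
# Sketch of the vSAN disk-group rules described above.
# Names (validate_disk_group, validate_host) are illustrative, not a VMware API.

MAX_CAPACITY_DISKS = 7   # 1-7 capacity devices behind one flash cache device
MAX_DISK_GROUPS = 5      # per-host disk group limit

def validate_disk_group(cache_devices, capacity_disks):
    """A disk group needs exactly one flash cache device and 1-7 capacity disks."""
    return cache_devices == 1 and 1 <= capacity_disks <= MAX_CAPACITY_DISKS

def validate_host(disk_groups):
    """A host needs at least one disk group and at most five, all valid."""
    return (1 <= len(disk_groups) <= MAX_DISK_GROUPS
            and all(validate_disk_group(c, d) for c, d in disk_groups))

# A host with two fully populated disk groups (1 cache + 7 capacity each):
print(validate_host([(1, 7), (1, 7)]))   # True
print(validate_disk_group(1, 8))         # False - too many capacity disks
```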

vSAN Data, Operations, and Disk Failures

With the basic Disk Group construct broken out, let’s dive into how vSAN distributes/lands data on the capacity tier. Note, this holds true for both Hybrid and AF configurations with Dedupe/Compression disabled. In the default, out-of-the-box configuration vSAN “stripes” data across only a single disk (this can be altered with Storage Policies) in 255 GB objects. If an object is larger than 255 GB (think large VMDK), the object is broken into multiple parts and “striped” across one or more disks.
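To make the 255 GB split concrete, here is a quick back-of-the-envelope sketch (my own helper, not anything vSAN exposes) of how many parts a given object would be broken into:

```python
import math

OBJECT_SIZE_GB = 255  # vSAN splits objects larger than 255 GB into parts

def parts_for_object(size_gb):
    """Number of <=255 GB parts a single object (e.g. a large VMDK) splits into."""
    return max(1, math.ceil(size_gb / OBJECT_SIZE_GB))

print(parts_for_object(100))   # 1 - fits within a single 255 GB object
print(parts_for_object(600))   # 3 - broken into multiple parts
```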

Operationally, increasing the performance or capacity of your vSAN deployment is pretty easy to accomplish and handled via the Web UI:

· If your Disk Group(s) are not fully populated with capacity drives, just slide more drives into each given host, then add them to the established Disk Group(s)

· If your Disk Group(s) are fully populated (and you have open slots/bays in your server) create a new Disk Group by adding a flash caching device and the needed capacity drive media

· If your Disk Groups and hosts are full, purchase additional host(s) and add them to the cluster

With any of the above methods, you will want to issue a proactive rebalance to redistribute vSAN data across the added capacity.
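Conceptually, a proactive rebalance evens out consumed capacity across the disks, including the freshly added empty ones. The sketch below is only an illustration of that idea, not the actual vSAN rebalance algorithm:

```python
# Illustrative only: rebalancing conceptually levels consumed capacity
# across all disks. This is NOT the real vSAN rebalance implementation.

def rebalance_deltas(used_gb_per_disk):
    """GB each disk should give up (+) or receive (-) to even out usage."""
    target = sum(used_gb_per_disk) / len(used_gb_per_disk)
    return [round(used - target, 1) for used in used_gb_per_disk]

# Two nearly full disks alongside two freshly added empty ones:
print(rebalance_deltas([900, 900, 0, 0]))  # [450.0, 450.0, -450.0, -450.0]
```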

As you can imagine, a vSAN Disk Group represents a failure domain that needs to be considered. In the example thus far (again no Dedupe/Compression enabled) the following holds true:

· If the flash buffer/caching device fronting a Disk Group fails, the corresponding Disk Group will be marked as “Degraded” and go offline

· If a capacity disk fails, the individual disk will be marked as “Degraded”, but the VM(s) will still be accessible from the replica copy of the data (again, assuming vSAN defaults with FTT=1)
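Those two failure behaviors can be sketched out as follows — a hypothetical helper of mine, purely to show the difference in blast radius between a cache-device failure and a capacity-disk failure when dedupe/compression is off:

```python
# Sketch of the failure scope described above (Dedupe/Compression disabled).
# Helper name and device labels are hypothetical, not a VMware API.

def degraded_disks(failed_device, capacity_disks):
    """Which capacity disks are impacted when a device in the group fails."""
    if failed_device == "cache":
        return set(capacity_disks)   # whole Disk Group goes offline
    return {failed_device}           # only the single failed capacity disk

disks = ["ssd1", "ssd2", "ssd3", "ssd4"]
print(degraded_disks("cache", disks))  # all four disks - group offline
print(degraded_disks("ssd2", disks))   # {'ssd2'} only
```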

vSAN Data, Operations, & Disk Failures – AF-vSAN w/Dedupe & Compression

Alright, with the groundwork set, let’s start getting to the point. When you enable Dedupe/Compression on your vSAN cluster, the data layout and disk failure behavior change a bit. The first difference is that, instead of objects being written to a single capacity drive, the data is now “wide striped” across ALL capacity disks in the Disk Group.

With this striping of data across all drives in a Disk Group, the operational tasks change a bit when trying to add capacity or performance to the cluster:

· If your Disk Group(s) are not fully populated with capacity drives, additional drives can be added. But you will have to do a complete data migration off the Disk Group, as the Disk Group will have to be deleted and recreated with the additional disk(s). Consider the operational overhead of re-syncing data across your vSAN Datastore, as well as having to repeat this on each host in your cluster.

· If your Disk Group(s) are fully populated (and you have open slots/bays in your server) create a new Disk Group by adding a flash caching device and the needed capacity drive media

· If your Disk Groups and hosts are full, purchase additional host(s) and add them to the cluster
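The first option above carries a cost that is easy to underestimate: with dedupe/compression enabled, growing an existing Disk Group means evacuating and recreating it, so the data moved equals everything stored in that group. A rough sketch of the estimate (illustrative numbers and function name are mine):

```python
# Back-of-the-envelope sketch: with dedupe/compression enabled, adding a
# capacity disk to an existing Disk Group requires a full evacuate + recreate,
# so the migration cost is everything currently stored in the group.

def resync_gb_to_add_disk(group_used_gb, dedupe_enabled):
    """Estimated data migration cost of growing a Disk Group by one disk."""
    if not dedupe_enabled:
        return 0                 # the disk can simply be added in place
    return group_used_gb         # whole group evacuated and recreated

print(resync_gb_to_add_disk(5000, dedupe_enabled=False))  # 0
print(resync_gb_to_add_disk(5000, dedupe_enabled=True))   # 5000
```

Multiply that by every host in the cluster and the operational overhead becomes clear.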

This striping of data leads to the following failure considerations:

· If the flash buffer device fronting a Disk Group fails, the corresponding Disk Group will be marked as “Degraded” and go offline (no change here)

· If a flash capacity disk fails, the entire Disk Group will be marked as “Degraded”, but the VM(s) will still be accessible from the replica copy of data
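The second bullet is the big change: one capacity-disk failure now degrades the entire Disk Group, so the amount of data to resync grows accordingly. A quick sketch contrasting the two modes (my own illustration, not a VMware tool):

```python
# Sketch contrasting resync scope for a single capacity-disk failure,
# per the behavior described above. Numbers and names are illustrative.

def resync_gb_on_disk_failure(per_disk_used_gb, failed_index, dedupe_enabled):
    """Data that must be rebuilt when one capacity disk fails."""
    if dedupe_enabled:
        return sum(per_disk_used_gb)       # entire Disk Group is degraded
    return per_disk_used_gb[failed_index]  # only the failed disk's data

usage = [800, 800, 800, 800]  # four capacity disks, GB used on each
print(resync_gb_on_disk_failure(usage, 0, dedupe_enabled=False))  # 800
print(resync_gb_on_disk_failure(usage, 0, dedupe_enabled=True))   # 3200
```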

Finally, dedupe and compression are handled at the per-Disk-Group level; they are not “global” across a vSAN deployment. This means fewer/larger Disk Groups can potentially achieve a higher dedupe/compression ratio compared to more/smaller Disk Groups across your hosts.
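Here is a toy sketch of why the per-Disk-Group scope matters: identical blocks only dedupe within a group, so duplicates that land in different groups are stored twice. The block values are made up purely for illustration:

```python
# Toy sketch: dedupe runs independently per Disk Group, so duplicate
# blocks split across groups each get stored. Illustrative data only.

def deduped_block_count(groups):
    """Unique blocks stored when dedupe is applied per Disk Group."""
    return sum(len(set(group)) for group in groups)

blocks = ["a", "a", "b", "b", "c", "c"]           # six logical blocks, heavy duplication
one_group = [blocks]                               # fewer/larger Disk Groups
two_groups = [["a", "b", "c"], ["a", "b", "c"]]    # duplicates split across groups

print(deduped_block_count(one_group))   # 3 unique blocks stored
print(deduped_block_count(two_groups))  # 6 - duplicates survive in each group
```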

Get to the Point Already

Hopefully you have stayed with me thus far, as I am finally getting to the point. For the vSAN designs I typically build for customers, I try to keep them as streamlined and easy to administer and manage as possible. When it comes to AF-vSAN leveraging dedupe and compression, my “Best Practice” is:

· Fully populate a given Disk Group

· If possible based on the server form factor, create two Disk Groups per host

So, with all of this laid out, what is the best approach? Well, like I get to say in my job, it depends. Based on the details and ideas listed above, each configuration has its pros and cons. You will need to weigh the operational impact/overhead of a single/larger-sized Disk Group against the capacity/data-savings drawback of multiple/smaller-sized Disk Groups.

Single/Larger Disk Group

Pros:
  • Potential for increased dedupe/compression ratio
  • Greater per host density

Cons:
  • Operational overhead to expand (if not already fully populated)
  • Higher rebuild/resync times in case of device failure
  • Potential for less performance

Multiple/Smaller Disk Group

Pros:
  • Incremental capacity/performance expansion
  • Potential for increased performance (more drives)
  • Rebuild/resync of data will be faster in case of device failure (less data per drive)
  • Ease of operational overhead for Disk Group expansion

Cons:
  • Higher cost per Disk Group
  • Per host density is lower
  • Lower dedupe/compression ratio