Those of you who have followed me over the years know that I’m not shy when it comes to a good competitive dust-up. I’m OK with the usual puffery and slightly exaggerated claims. All part of the fun.
I’m not OK when I believe the claims are misleading.
One startup is working very hard to convince everyone that they (and they alone) are leading the current trend in HCI — hyperconverged infrastructure. One of their spokespeople even published a thoughtful piece listing the ten reasons why they thought they deserved the “leader” mantle.
While I admire their bravado, I felt the piece did a disservice to both the industry and to customers. I thought it grossly misrepresented both the current and future state of the market.
Perhaps most importantly, there was little talk about what mattered most to customers.
So — while staying positive — I’d like to share my "ten reasons" why I think VMware is leading — and will continue to lead — the hyperconverged marketplace.
If we’re going to have a polite argument, we ought to at least define what we’re discussing.
The first wave was “converged” infrastructure: traditional compute, storage and network products engineered to be consumed and supported as a single block of infrastructure.
The fundamental appeal was a drastically simplified customer experience, which gave IT time and resources to go do other things. VCE Vblocks established the market and validated the model, with several others following suit. As we stand today, converged infrastructure is a successful and proven model that continues to grow.
Meanwhile, a few enterprising startups created a software-based storage layer that eliminated the need for an external storage array, and dubbed themselves “hyperconverged”.
Hence our discussion today ...
#1 — Hyperconverged Is About Software, Not Hardware
Hyperconverged solutions derive their value from the hypervisor being able to support all infrastructure functions in software, and without the need for separate dedicated hardware, such as a storage array or fibre channel switch. All players in this segment would mostly agree with this statement.
If hyperconverged is really about software (and not hardware), what’s the core software technology in just about every hyperconverged product available today?
VMware vSphere. It's ubiquitous in the data center, which explains why it's ubiquitous in the hyperconverged market. A key part of the story: vSphere implements the most popular infrastructure management APIs in use today.
The harsh market reality is that there’s just not a lot of demand for non-vSphere-based hyperconverged solutions.
IT professionals know vSphere -- it's tried, tested and proven -- and that's what they want.
If we could convince a few industry analysts to focus on hyperconverged software vs. counting largely similar boxes with different vendor labels on them, their picture of the landscape would be quite different.
As far as claims to "market leadership" — without the power and presence of the VMware vSphere platform, there wouldn't be a converged or hyperconverged market to argue about.
#2 — Built-In Is Better Than Bolted-On
If the value proposition of hyperconverged derives from integrating infrastructure in software, it’s reasonable to argue that deeper, structural integration will be more valuable than various software assemblages that lack this key attribute.
There shouldn’t be a need for a separate management interface.
There shouldn’t be a need for a separate storage layer that runs as a guest VM with dedicated resources, consuming precious resources and demanding attention.
There shouldn’t be a need for multiple installation / patching / upgrade processes.
There shouldn’t be a need to get support from two or more vendors.
And so on.
Within VMware, we use the term “hypervisor converged” to differentiate this important architectural difference between built-in vs. bolted-on.
I'll use vSphere + VSAN as my all-software example here. One simple, integrated environment. One management experience. One upgrade process. One source of support.
If our discussion of "market leadership" includes any notion of creating a simpler experience for users, I would argue that it’s hard to compete with features that are simple extensions of the hypervisor.
#3 — Having Lots Of Hardware Choices Is A Really Good Thing
If hyperconverged is really about software, why are many so paradoxically focused on “the box”? It’s nothing more than a convenient consumption option for someone who wants a fast time-to-value over other considerations.
Ideally, hyperconverged -- as a concept -- shouldn’t be welded to specific hardware.
For those that want a convenient consumption option such as a prefab appliance with a locked-down config, great! That's certainly useful to a certain segment of the market.
But others might want a bit more flexibility, with a well-defined starting point. Yet another useful option.
And for those that really want to roll their own, there's a list of what is supported, augmented by tools to help you design and size a config that's right for you.
There's a vast list of reasons why more hardware choice is a good thing ...
Maybe there’s an existing investment that’s already been made.
Maybe there are requirements that aren’t satisfied well by the static configs available.
Maybe you've got a great sourcing arrangement for servers.
Maybe there’s a desire to use the latest-greatest technology, without waiting for an appliance vendor to offer it. Etc. etc.
Whatever the reason, increased hardware choice makes hyperconverged more compelling and more attractive for more people.
EVO:RAIL currently has 9 qualified partners. Virtual SAN (as part of vSphere) has dozens and dozens of ReadyNodes from server partners that can be ordered as a single SKU. And for everyone else, there's an extended HCL that allows for literally millions of potential configurations - plus the tools to figure out what's right for you.
If market leadership includes any notion of hardware choice, VMware stands apart from the rest of the hyperconverged crowd. Because, after all, it's software ...
#4 — There’s More To Enterprise IT Than Just Hyperconverged
Yes, there’s that old joke that when all you have is a hammer, everything looks like a nail :)
But there’s a more serious consideration here: when it comes to even modestly-sized IT functions, hyperconverged is only one part of a broader landscape that needs to be managed.
There’s inevitably a strong desire for common tools, processes and workflows that support the entire environment, and not just an isolated portion of it.
From an enterprise IT perspective, it's highly desirable to use the same operational framework for both virtualized and hyperconverged environments.
Going back to that controversial “market leadership” thing, how about the need for enterprise-scale management tools that aren’t limited to a single hyperconverged offering?
#5 — Customer Support Matters
If extreme simplicity is an integral part of the hyperconverged value proposition, customer support has to figure in prominently.
But there’s a structural problem here.
Not all of the hyperconverged appliance vendors have elected to be vSphere OEMs. That means that they don’t have the right to distribute the VMware software used in their product. It also means that they are not entitled to provide support for VMware software.
This arrangement has the potential to put their customers in an awkward position.
While I’m very sure all us vendors use our collective best efforts to support mutual customers, this state of affairs certainly isn’t ideal. Since all of these vendors provide a critical storage software layer, it may not be obvious where a problem actually lies.
Let’s say you have a performance problem with your hyperconverged appliance — who do you call?
The appliance vendor? VMware? Ghostbusters?
When it comes to providing customer support, VMware is typically ranked at the top (or near the top) in customer satisfaction — even though there are always potential areas for improvement. One call.
No argument: the customer support model and execution should factor into our notion of “market leadership”.
#6 — Useful Things Should Just Work
Most shops have gotten accustomed to using all the cool functionality in vSphere. And, presumably, they’d like to continue doing the same in their hyperconverged environment.
But that’s not always the case. Here’s one example ...
You’re probably familiar with vSphere HA — a great feature that automatically restarts workloads on a surviving host if there’s a failure.
In a shared storage environment, vSphere HA uses the management network to ascertain the state of the cluster, and coordinate restarts if necessary. HA assumes that external storage is always available, and all hosts can see essentially the same thing.
But what if there’s no external storage, and we’re using a distributed cluster storage layer?
While it’s true that many of the newer hyperconverged appliances set up their own logical network (primarily for storage traffic), you can see the potential problem: vSphere HA doesn’t know about the vendor's storage network, and vice versa.
Imagine if, for example, the storage network partitions and the management network doesn’t. Or if they partition differently. Sure, that’s not going to happen every day, but when it does — what exactly happens?
In the case of vSphere and VSAN, vSphere HA has been redesigned to use VSAN’s network, so there is zero chance of an inconsistent state between the two.
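To make the partition scenario a bit more concrete, here is a minimal sketch in Python. It's a toy model, not vSphere HA or any vendor's code: the host names, the link lists and the `reachable_groups` helper are all hypothetical. It only shows that when cluster membership is derived from two separate networks, the two views can disagree, while a single shared network gives you one view by construction.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Link = Tuple[str, str]


def reachable_groups(hosts: Iterable[str], links: Iterable[Link]) -> Set[FrozenSet[str]]:
    """Return the groups of hosts that can still reach each other over the given links."""
    neighbors: Dict[str, Set[str]] = {h: set() for h in hosts}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    groups: Set[FrozenSet[str]] = set()
    seen: Set[str] = set()
    for start in neighbors:
        if start in seen:
            continue
        # Walk the graph from this host to collect one connected group.
        stack: List[str] = [start]
        group: Set[str] = set()
        while stack:
            host = stack.pop()
            if host in group:
                continue
            group.add(host)
            stack.extend(neighbors[host] - group)
        seen |= group
        groups.add(frozenset(group))
    return groups


# Hypothetical four-host cluster.
hosts = ["esx1", "esx2", "esx3", "esx4"]

# The management network is healthy: every host can reach every other host.
mgmt_links = [("esx1", "esx2"), ("esx2", "esx3"), ("esx3", "esx4")]

# The storage network has split in two: the esx2-esx3 link is down.
storage_links = [("esx1", "esx2"), ("esx3", "esx4")]

mgmt_view = reachable_groups(hosts, mgmt_links)
storage_view = reachable_groups(hosts, storage_links)

# Two networks, two opinions about which hosts form the cluster.
views_disagree = mgmt_view != storage_view

# If availability decisions ride on the storage network, there is only one view.
ha_view_on_storage_network = reachable_groups(hosts, storage_links)
shared_network_consistent = ha_view_on_storage_network == storage_view
```

In this toy setup the management network sees one healthy cluster while the storage network sees two islands, which is exactly the kind of disagreement that leaves a restart decision ambiguous.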
Let’s go for two examples, shall we?
Using vMotion to balance clusters is just about ubiquitous. You’d like to be able to move things around without screwing up application performance due to slow storage.
Well, one vendor’s attempt at “data locality” didn’t help so much. Move a VM, and performance degrades due to a design decision they made. Try and move it back, more degradation.
So another cool and useful vSphere feature now has sharp edges on it.
Not to pile on, but let's consider maintenance mode.
VMware admins routinely want to put a host in maintenance mode to work on it. All the workloads are conveniently moved to other servers, and nothing gets disrupted. But in our hyperconverged world, there's now storage to be considered.
VSAN has an elegant solution as part of the standard vSphere maintenance mode workflow -- the administrator gets a choice as to what they'd like to do with the affected data, and proceeds.
All other approaches require a separate workflow to detect and evacuate potentially affected data -- which creates not only a bit more complexity, but also that special opportunity to have a really bad day.
I’m sharing these annoying nits just to illustrate a point: a good hyperconverged environment should reasonably support the same everyday virtualization functionality and workflows you already use.
And, hopefully, with a minimum of “gotcha!”
Let’s factor that into our notion of “market leadership” as well …
#7 — Don’t Forget About Networking …
If we’re *seriously* discussing hyperconverged software, we have to ultimately consider software-defined networking in addition to compute and storage.
Otherwise, our stool only has two legs :)
The ultimate goal should be to give customers the option of running all three infrastructure functions (compute, storage, network) as an integrated stack running on their hardware of choice.
No, we’re not there yet today, but …
Converge server virtualization with both SDS and SDN, and the potential exists for even more efficiency, simplicity and effectiveness. Not to mention, a whole new set of important security-related use cases, like micro-segmentation.
But to integrate SDN, you’ve got to have the technology asset.
Within the VMware world, that key asset is NSX. And while no vendor can offer a seamless integration between the three disciplines today, VMware has a clear leg up in this direction.
Dig deep into VSAN internals, and you can see progress to date. For example, VSAN works closely with NIOC to be a well-behaved citizen over shared 10Gb links. More to come.
Should hyperconverged vendors who claim market leadership have a plan for SDN and security use cases? I think so.
#8 — Is There A Compatible Hybrid Cloud Option?
Not all infrastructure wants to live in a traditional data center. There are many good reasons to want an operationally compatible hybrid cloud option like vCloud Air: cost arbitrage, disaster recovery, short-term demands, etc.
Ideally, customers could have access to a hyperconverged experience that works the same -- using the same tools, workflows and behaviors -- whether the hardware assets are in the data center, in a public cloud, or both.
It’d be great if the industry pundits factored this into their definition of “market leadership”. I’m not hopeful, though.
#9 — Is It Efficient?
One of the big arguments in favor of virtualization and hyperconverged approaches is efficiency: doing more with less.
Not to belabor an old argument, but there’s a certain economic appeal to hyperconverged software that uses compute and memory resources judiciously. The big motivator here for customers is better consolidation ratios for server and VDI farms. Better consolidation ratios = less money spent.
A hyperconverged storage solution that demands a monster 32 GB VM and potentially a few dedicated physical cores on each and every server gets in the way of that.
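A quick back-of-the-envelope calculation shows why this matters. The 32 GB figure comes from the paragraph above; the host memory (256 GB), guest VM size (8 GB) and cluster size (16 hosts) are assumptions I picked purely for illustration. The sketch looks at memory only, and compares against a baseline with no dedicated storage VM, so it ignores CPU and whatever overhead a hypervisor-integrated storage layer has of its own.

```python
def guests_per_host(host_ram_gb: int, storage_vm_ram_gb: int, guest_ram_gb: int) -> int:
    """Guest VMs that fit in host memory after the storage VM takes its share."""
    return (host_ram_gb - storage_vm_ram_gb) // guest_ram_gb


# Assumed values, for illustration only.
HOST_RAM_GB = 256
GUEST_RAM_GB = 8
HOSTS = 16

# Figure quoted in the text above.
STORAGE_VM_RAM_GB = 32

without_storage_vm = guests_per_host(HOST_RAM_GB, 0, GUEST_RAM_GB)
with_storage_vm = guests_per_host(HOST_RAM_GB, STORAGE_VM_RAM_GB, GUEST_RAM_GB)

# Guest slots given up across the whole cluster, and the RAM the storage VMs hold.
lost_guest_slots = (without_storage_vm - with_storage_vm) * HOSTS
ram_held_by_storage_vms_gb = STORAGE_VM_RAM_GB * HOSTS
```

Under these assumptions, each host goes from 32 guest slots to 28, and the cluster as a whole gives up 64 slots (512 GB of RAM) to the storage layer.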
#10 — Where Do You Go From Here?
I remember clearly a meeting with a customer who introduced the purpose of the meeting: “we’re here to decide what to buy for tomorrow’s legacy”. I couldn't stop smiling :)
But it's an interesting perspective: one that reflects that IT strategy is often the result of many tactical decisions made along the way.
At one level, I’ve met IT pros who have an immediate need, and want an immediate solution. They want a handful of boxes racked up ASAP, and aren’t that concerned with what happens down the road.
Trust me, I can fully appreciate that mindset.
But there are many other IT pros who see each and every technology decision as a stepping stone to bigger and better things.
There are over a half-million IT shops who have built their data center strategy around VMware and vSphere. Every one of them already owns many of the key ingredients needed for a hyperconverged solution.
More importantly, they trust VMware to take them forward into the brave new world of IT: virtualized, converged, hyperconverged, hybrid cloud and ultimately a software-defined data center.
And that’s a promise we intend to keep.
Looking for a great disruption story in enterprise IT tech? I think what VSAN is doing to the established storage industry deserves to be a strong candidate.
I've seen disruptions -- small and large -- come and go. If you're into IT infrastructure, this is one worth watching.
A few years ago, I moved from EMC to VMware on the power of that prediction. So far, it’s played out pretty much as I had hoped it would. There’s now clearly a new dynamic in the ~$35B storage industry, and VMware’s Virtual SAN is very emblematic of the changes that are now afoot.
There’s a lot going on here, so it’s worth sharing. In each case, you’ll see a long-held tenet around The Way Things Have Always Been Done clearly up for grabs.
See if you agree.
I began this post by making a list of changes — deep, fundamental changes — that VSAN is starting to bring about in the storage world.
To be clear, I’m not talking so much about specific technologies, or how this vendor stacks up against that other one.
I’m really far more interested in the big-picture changes around fundamental assumptions as to “how storage is done” in IT shops around the globe: how it's acquired, how it's consumed, how it's managed.
If you’re not familiar with Virtual SAN, here’s what you need to know: it’s storage software built into the vSphere hypervisor. It takes the flash and disk drives inside of servers, and turns them into a shared, resilient enterprise-grade storage service that’s fast as heck. Along the way, it takes just about every assumption we've made about enterprise storage in the last 20 years and basically turns it on its head.
Storage Shouldn’t Have To Be About Big Boxes
Most of today’s enterprise storage market is served by external storage arrays, essentially big, purpose-built hardware boxes running specialized software. Very sophisticated, but at a cost.
If your organization needs a non-trivial amount of storage, you usually start by determining your requirements, evaluating vendors, selecting one, designing a specific configuration, putting your order in, taking delivery some time later, installing it and preparing it for use.
Big fun, right?
The fundamental act of simply making capacity ready to consume — from “I need” to “I have” — is usually a long, complex and often difficult process: best measured in months. I think the most challenging part is that IT shops have to figure out what they need well before actual demand shows up. Of course, this approach causes all manner of friction and inefficiency.
We’ve all just gotten used to it — that’s just the way it is, isn’t it? Sort of like endlessly sitting in morning commute traffic. We forget that there might be a better way.
The VSAN model is completely different. Going from “I need” to “I have” can be measured in days — or sometimes less.
For starters, VSAN is software — you simply license the CPUs where you want to use it. Or use it in evaluation mode for a while. The licensing model is not capacity-based, which is quite refreshing. That makes it as easy to consume as vSphere itself.
The hardware beneath VSAN is entirely up to you, within reason. Build a VSAN environment from hand-selected components if that’s your desire. Grab a ReadyNode if you’re in a hurry. Or go for something that’s packaged as the ultimate in a simplified experience: EVO:RAIL. Choice is good.
Depending on your hardware model, getting more storage capacity is about as simple as ordering some new parts for your servers. Faster, easier, smaller chunks, less drama, etc. No more big boxes.
Yes, there is a short learning curve the first time someone goes about putting together a VSAN hardware configuration (sorry!), but — after that — there’s not much to talk about.
There are some obvious and not-so-obvious consequences from this storage model.
Yes, people can save money (sometimes really big $$$) by going this way. Parts is parts. We’ve seen plenty of head-to-head quotes, and sometimes the differences are substantial.
But there’s more that should be considered …
Consider, for example, that storage technologies are getting faster/better/cheaper all the time.
Let’s say a cool new flash drive comes out — and it looks amazing. Now, compare the time elapsed between getting that drive supported with VSAN, and getting it supported in the storage arrays you currently own.
There's a big difference in time-to-usability for any newer storage tech. And that really matters to some people.
One customer told us he likes the “fungibility” of the VSAN approach, given that clusters seem to be coming and going a lot in his world. He has an inventory of parts, and can quickly build a new cluster w/storage from his stash, tear down a cluster that isn’t being used for more parts, mix and match, etc.
Sort of like LEGOs.
Just try that with a traditional storage array.
More Performance (Or Capacity) Shouldn’t Mean A Bigger Box
A large part of storage performance comes down to the storage controllers inside the array: how many, how fast.
Add more servers that drive more workload, and you’re often looking at the next-bigger box — and all the fun that entails: acquiring the new array, migrating all your applications, figuring out what to do with the old array, etc.
Yuck. But that’s the way it’s always been, right?
VSAN works differently.
As you add servers to support more virtualized applications, at the same time you’re also adding the potential for more storage performance and capacity. A maxed-out 64-node VSAN cluster can deliver ~7 million cached 4K read IOPS.
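As rough arithmetic on that figure: roughly 7 million IOPS spread across 64 nodes works out to a bit over 100K per node. The sketch below does the division and then, assuming perfectly linear scaling (a simplification on my part; real results will vary with workload and hardware), estimates what a smaller cluster might deliver. The function name is just for illustration.

```python
# Figures quoted in the text above.
CLUSTER_IOPS = 7_000_000
CLUSTER_NODES = 64

per_node_iops = CLUSTER_IOPS / CLUSTER_NODES


def estimated_cluster_iops(nodes: int) -> float:
    """Estimate cached 4K read IOPS for a cluster, assuming linear scaling per node."""
    if not 1 <= nodes <= CLUSTER_NODES:
        raise ValueError(f"nodes must be between 1 and {CLUSTER_NODES}")
    return per_node_iops * nodes


# A hypothetical 16-node cluster under the linear-scaling assumption.
sixteen_node_estimate = estimated_cluster_iops(16)
```

That puts the per-node contribution at about 109K IOPS, and a 16-node cluster at about 1.75 million under the same assumption.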
Want more performance without adding more servers? Just add another disk controller and disk group to your existing servers, or perhaps just bigger flash devices, and you’ll get one heck of a performance bump.
Without having to call your storage vendor :)
Storage Shouldn’t Need To Be Done By Storage Professionals
I suppose an argument could be made about it being best to have your taxes done by tax professionals, but an awful lot of people seem to do just fine by using TurboTax software.
There certainly are parts of the storage landscape that are difficult and arcane — and that’s where you need storage professionals. There are also an awful lot of places where a simple, easy-to-use solution will suffice quite nicely, and that’s what VSAN brings to the table.
With VSAN, storage just becomes part of what a vSphere administrator does day-to-day. No special skills required. Need a VM? Here you go: compute, network and storage. Policies drive provisioning. Nothing could really be simpler.
No real need to interact with a storage team — unless there’s something special going on.
Can't We All Just Work Together?
Any time you get a team greater than a handful of people, people split up into different roles. The classic pattern in enterprise IT infrastructure has a dedicated server team, a dedicated network team, a storage team, etc.
The vSphere admins are usually dependent on the others to do basic things like provision, troubleshoot, etc. For some reason, I’ve observed particular friction between the virtualization team and the storage team. As in people on both sides pulling their hair out.
Many virtualization environments move quickly: spinning up new apps and workloads, reconfiguring things based on new requirements — every day (or every hour!) brings something new.
That’s what virtualization is supposed to do: make things far more flexible and liquid.
When that world bumps up against a traditional storage shop that thinks in terms of long planning horizons and careful change management — well, worlds collide.
With VSAN, vSphere admins can be self-sufficient for most of their day-to-day requirements. No storage expertise required. Of course, there will always be applications that can justify an external array, and the team that manages it.
It’s just that there will be less of that.
Storage Software Is Now Not Just Another Application
The idea of doing storage in software is not new. The idea of building a rich storage subsystem into a hypervisor is new. And, when you go looking, there are plenty of software storage products that run as an application, also known as a VSA or virtual storage appliance.
In this VSA world, your precious storage subsystem is now just another application. It competes for memory and CPU like all other applications, but with one exception: when it gets slow, everything that uses it also gets slow.
We’re talking about storage, remember?
And the resource requirements needed to ensure adequate storage performance using a VSA approach can be considerable. Very healthy amounts of RAM, lots of CPU. Nom, nom -- a monster VM? That approach makes your servers bigger, your virtualization consolidation ratios poorer, or both.
Once again, VSAN does things differently.
Because it’s built into the hypervisor, its resource requirements are quite reasonable. It doesn’t have to compete with other applications, because it isn’t a standalone application like a VSA is. Your servers can be smaller, your virtualization consolidation ratios better — or both.
Why do I think this will change things going forward?
Because VSAN now establishes the baseline for what you should expect to get with your hypervisor. Any vendor selling a VSA storage product as an add-on has to make a clear case as to why their storage thingie is better than what already comes built into vSphere.
Not only in justifying the extra price, but also the extra resources as well as the extra management complexity. Clearly, there are cases where this can be done, but there aren’t as many as before.
And that’s going to put a lot of pressure on the vendors who use a VSA-based approach.
The Vendor Pecking Order Changes
The last wave of storage hardware vendors were all array manufacturers — they got all the attention. In this wave, the storage component vendors are finding some new love.
As a good example, the flash vendors such as SanDisk and Micron are starting to do a great job marketing directly to VSAN customers. Why? A decent proportion of a VSAN config goes into flash, and how these devices perform affects the entire proposition.
This new-found stardom is not lost on them — especially as we start with all-flash configurations.
At one time, there was a dogfight between FC HBA vendors who wanted to attach to all the SANs that were being built. In this world, it’s the storage IO controller vendors. Avago (formerly LSI) as well as some of their newer competitors are aware that there’s a new market forming here, and are realizing they can reach end users directly vs. being buried in an OEM server configuration.
There’s A Lot Going On In Storage Right Now …
We’ve seen one shift already from disk to flash — that much is clear. Interesting, but — at the end of the day, all we were really doing was replacing one kind of storage media with another.
What I’m seeing now has the potential to be far more structural and significant. Now up for grabs is the fundamental model of "how storage is done" in IT shops large and small.
An attractive alternative to the familiar big box arrays of yesterday.
Storage being specified, acquired, consumed, delivered and managed by the virtualization team, with far less dependence on the traditional storage team.
Storage being consumed far more conveniently than before.
Storage software embedded in the hypervisor having strong architectural advantages over other approaches.
Storage being able to pick up all the advances in commodity-oriented server tech far faster than the array vendors.
Component vendors becoming far more important than before.
And probably a few things I forgot as well :)
Yes, I work for VMware. And VSAN is my baby.
But there’s a reason I chose this gig — I thought VMware and VSAN were going to be responsible for a lot of healthy disruptive changes in the storage business. Customers would win as a result.
And, so far, that’s been exactly the case.
From the time enterprise data centers sprang into existence, we’ve had this burning desire to automate the heck out of them.
From early mainframe roots to today’s hybrid cloud, the compulsion to progressively automate each and every aspect of operations never wanes.
The motivations have been compelling: use fewer people, respond faster, be more efficient, make outcomes more predictable, and make services resilient.
But the obstacles have also been considerable: both technological and operational.
With the arrival of vSphere 6.0, a nice chunk of new technology has been introduced to help automate perhaps the most difficult part of the data center – storage.
It's worth digging into these new storage automation features: why they are needed, how they work, and why they should be seriously considered.
Automating storage in enterprise data centers is most certainly not a new topic.
Heck, it's been around at least as long as I have, and that's a long time :)
Despite decades of effort by both vendors and enterprise IT users, effective storage automation still is an elusive goal for so many IT teams.
When I'm asked "why is this so darn hard?", here's what I point to:
- Storage devices had very limited knowledge of applications: their requirements, and their data boundaries. Arrays had to be explicitly told what to do, when to do it and where it needed to be done.
- Cross-vendor standards failed to emerge that facilitated basic communications between the application’s requirements and the storage array’s capabilities.
- Storage arrays (and their vendors) presented a storage-centric view of their operations, making it difficult for non-storage groups to request new services and to ascertain whether end-to-end application requirements were being met.
Here's the message: the new storage capabilities available in vSphere 6.0 show strong progress towards addressing each of these long-standing challenges.
Towards Application Centricity
Data centers exist solely to deliver application services: capacity, performance, availability, security, etc.
To the extent that each aspect of infrastructure can be made programmatically aware of individual application requirements, far better automation can be achieved.
However, when it comes to storage, there have been significant architectural challenges in achieving this.
The first challenge is that applications themselves typically don’t provide specific instructions on their individual infrastructure requirements. And asking application developers to take on this responsibility can lead to all sorts of unwanted outcomes.
At a high level, what is needed is a convenient place to specify application policies that can be bound to individual applications, instruct the infrastructure as to what is required, and be conveniently changed when needed.
The argument is simple: the hypervisor is in a uniquely privileged position to play this role. It not only hosts all application logic, but abstracts that application from all of the infrastructure: compute, network and storage.
While these policy concepts have been in vSphere for a while, in vSphere 6.0 a new layer of storage policy based management (SPBM) is introduced. This enables administrators to describe specific storage policies, associate them with groups of applications, and change them if needed.
But more is needed here.
Historically, storage containers have not aligned with application boundaries. External storage arrays have historically presented LUNs or file systems - large chunks of storage shared by many applications.
Storage services (capacity, performance, protection, etc.) were specified at the large container level, with no awareness of individual application boundaries.
This mismatch has resulted in both increased operational effort and reduced efficiency.
Application and infrastructure teams need to go continually back and forth with the storage team regarding application requirements. And storage teams are forced to compromise by creating storage service buckets specified in excess of what is actually required by applications. Better to err on the side of safety, right?
No longer. vSphere 6.0 introduces a new storage container – Virtual Volumes, or VVOLs – that precisely aligns application boundaries and the storage containers they use. Storage services can now be specified on a per-application, per-container basis.
We now have two key pieces of the puzzle: the ability to conveniently specify per-application storage policy (as part of overall application requirements), and the ability to create individualized storage containers that can precisely deliver the requested services without affecting other applications.
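The difference between container-level and per-application services can be illustrated with a small sketch. The application names and their requirements are made up for illustration, and `ServiceLevel` is a hypothetical stand-in rather than anything in vSphere. The point is only that a shared container has to be specified for its most demanding tenant, while per-application containers can each carry exactly what was asked for.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class ServiceLevel:
    replicas: int           # copies of the data kept for protection
    snapshots_per_day: int  # how often a point-in-time copy is taken


# Hypothetical applications and what each one actually needs.
apps: Dict[str, ServiceLevel] = {
    "oltp-db": ServiceLevel(replicas=3, snapshots_per_day=24),
    "file-share": ServiceLevel(replicas=2, snapshots_per_day=4),
    "test-vm": ServiceLevel(replicas=1, snapshots_per_day=0),
}


def shared_container_level(levels: Iterable[ServiceLevel]) -> ServiceLevel:
    """A shared LUN-style container must meet the highest requirement of any tenant."""
    levels = list(levels)
    return ServiceLevel(
        replicas=max(level.replicas for level in levels),
        snapshots_per_day=max(level.snapshots_per_day for level in levels),
    )


# Old model: one big container, one service level for everyone in it.
shared_level = shared_container_level(apps.values())
over_served: Set[str] = {name for name, need in apps.items() if need != shared_level}

# Per-application model: each container carries exactly the requested level.
per_app_levels = dict(apps)
over_served_per_app: Set[str] = {
    name for name, need in apps.items() if per_app_levels[name] != need
}
```

In the shared model, the file share and the test VM both end up with the database's service level whether they need it or not; in the per-application model, nobody is over-served.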
So far, so good.
Solving The Standards Problem
Periodically, the storage industry attempts to define meaningful, cross-vendor standards that facilitate external control of storage arrays. However, practical success has been difficult to come by.
Every storage product speaks a language of one: not only in the exact set of APIs it supports, but how it assigns meaning to specific requests, and communicates results. Standard definitions of what exactly a snap means, for example, are hard to come by.
The net result is that significant automation of multi-vendor storage environments has been extremely difficult for most IT organizations to achieve.
To be clear, the need for heterogeneous storage appears to be increasing, not decreasing. Enterprise data centers continue to be responsible for supporting an ever-widening range of application requirements, from transaction processing to big data to third platform applications. No one storage product can be expected to meet every application requirement (despite vendors' best intents), so multiple types are frequently needed.
De-facto standards can be driven by products that are themselves de-facto standards in the data center, and here vSphere stands alone with regard to hypervisor adoption. When VMware defines a new standard for interacting with the infrastructure (and customers adopt it), vendors typically respond well.
vSphere 6.0 introduces a new set of storage APIs (VASA 2.0) that facilitate a standard method of application-centric communication with external storage arrays. VMware’s storage partners have embraced this standard enthusiastically, with several implementations available today and more coming.
Considering VASA 2.0 together with SPBM and VVOLs, one can see that many of the technology enabling pieces are now in place for an entirely new storage automation approach. Administrators can now specify application-centric storage policies via SPBM, communicate them to arrays via VASA 2.0, and receive a perfectly aligned storage container – a VVOL. Nice and neat.
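To make the idea concrete, here is a minimal sketch of the matching step: an application-centric policy is compared against the capabilities each array advertises, and only arrays that cover every requirement are candidates for the container. Everything here is an assumption for illustration. The class names, capability strings and function are hypothetical and are not part of the SPBM, VASA 2.0 or VVOL interfaces.

```python
from dataclasses import dataclass

# Hypothetical names throughout: these classes are illustrative only and are
# NOT part of the SPBM, VASA 2.0, or VVOL APIs.


@dataclass(frozen=True)
class StoragePolicy:
    name: str
    required: frozenset  # capabilities the application needs


@dataclass(frozen=True)
class ArrayProfile:
    name: str
    advertised: frozenset  # capabilities the array reports about itself


def compatible_arrays(policy, arrays):
    """Return the names of arrays whose advertised capabilities cover the policy."""
    return [a.name for a in arrays if policy.required <= a.advertised]


arrays = [
    ArrayProfile("array-a", frozenset({"snapshots", "replication"})),
    ArrayProfile("array-b", frozenset({"snapshots"})),
]
policy = StoragePolicy("db-tier", frozenset({"snapshots", "replication"}))
print(compatible_arrays(policy, arrays))
```

The subset test is the whole trick: the policy states what the application needs, the array states what it can do, and placement becomes a comparison rather than a negotiation between teams.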
Who Should (Ideally) Manage Storage?
It’s one thing to conveniently specify application requirements. It’s another to ensure that the requested service levels are being met and, more importantly, to fix things quickly when they aren’t.
Historically, the storage management model in many IT organizations has evolved into a largely self-contained organizational “black box”. Requests and trouble tickets are submitted with poor visibility for the other teams who depend greatly on the storage team’s services.
Although this silo model routinely causes unneeded friction and inefficiency (not to mention frustration all around), it can be particularly painful when resolving urgent performance problems: is the problem in the application logic, the server, the network – or storage?
The storage management model created by vSphere 6.0 is distinctly different than traditional models: storage teams are still important, but more information (and responsibility) is given to the application and infrastructure teams in controlling their destiny.
Virtual administrators now see “their” abstracted storage resources: what’s available, what it can do, how it’s being used, etc. There should be no need to directly interact with the storage team for most day-to-day provisioning requirements. Policies are defined, VVOLs are consumed, storage services are delivered.
Through vCenter and the vRealize suite, virtual administrators now have enough storage-related information to ascertain the health and efficiency of their entire environments, and have very focused conversations with their storage teams if there’s an observed issue.
Storage teams still have an important role, although somewhat different than in the past. They now must ensure sufficient storage services are available (capacity, performance, protection, etc.), and resolve problems if the services aren’t working as advertised.
However, operational and organizational models can be highly resistant to change. That's the way the world works -- unless there is a forcing function that makes the case compelling to all parties.
And VSAN shows every sign of being a potential change accelerator.
How Virtual SAN Accelerates Change
As part of vSphere 5.5U1, VMware introduced Virtual SAN, or VSAN. Storage services can now be delivered entirely from local server resources -- compute, flash and disk -- using native hypervisor capabilities. There is no need for an external storage array when using VSAN, nor a need for a dedicated storage team, for that matter.
VSAN is designed to be installed and managed entirely by virtual administrators independently of interaction with the storage team. These virtualization teams can now quickly configure storage resources, create policies, tie them to applications, monitor the results and speedily resolve potential problems – all without leaving the vSphere world.
As an initial release, VSAN 5.5 had limited data services, and thus limited use cases. VSAN 6.0 is an entirely different proposition: more performance (both using a mix of flash and disk, or using all-flash), new enterprise-class features, and new data services that can significantly encroach on the turf held by traditional storage arrays.
Empowered virtualization teams now have an interesting choice with regards to storage: continue to use external arrays (and the storage team), use self-contained VSAN, or most likely an integrated combination depending on requirements.
Many are starting to introduce VSAN alongside traditional arrays, and have thus seen the power of a converged, application-centric operational model. And it’s very hard to go back to the old way of doing things when the new way is so much better -- and readily at hand.
The rapid initial growth of VSAN shows the potential of putting a bit of pressure on traditional storage organizations to work towards a new operational model, with improved division of responsibilities between application teams, infrastructure teams and storage teams. And they'll need the powerful combination of SPBM, VASA 2.0 and VVOLs to make that happen.
Change Is Good -- Unless It's Happening To You
I have spent many, many years working with enterprise storage teams. They have a difficult, thankless job in most situations. And there is no bad day in IT quite like a bad storage day.
Enterprise IT storage teams have very specific ways of doing things, arguably built on the scar tissue of past experiences and very bad days. You would too, if you were them.
That being said, there is no denying the power of newer, converged operational models and the powerful automation that makes them so compelling. The way work gets done can -- and will -- change.
Enterprise storage teams can view these new automation models as either a threat, or an opportunity.
I know which side of that debate I'd be on.
In the IT biz, all forms of converged infrastructure are now the rage.
Rightfully so: their pre-integrated nature and single-support model eliminate much of the expensive IT drudgery that doesn’t usually create significant value: selecting individual components, integrating them, supporting them, upgrading them, etc.
How much easier is it to order a block, brick, node, etc. of IT infrastructure as a single supportable product, and move on to more important matters?
A lot easier, it seems ...
Reference architectures have been around for ages. I think of them as blueprints for building a car, rather than buying one. Some assembly required. Useful, yes, but there’s room for more.
VCE got the party started years back with Vblocks: pre-integrated virtualized infrastructure, sold and supported as a single product — with their success to be quickly followed by other vendors who saw the same opportunity.
A group of smaller vendors took the same idea, but did storage in software vs. requiring an external array, dubbing themselves “hyper-converged”: Nutanix, SimpliVity and others. They, too, have seen some success.
Last August, VMware got into this market in a big way by introducing EVO:RAIL — an integrated software product that — when combined with a specific hardware reference platform from an OEM partner — delivered an attractive new improvement over the first round of hyper-converged solutions.
While EVO:RAIL had several partners who offered immediate availability, EMC decided to take their time, and do something more than simply package EVO:RAIL with the reference hardware platform.
Today, we get to see what they’ve been working on — VSPEX BLUE. It’s not just another EVO:RAIL variant, it’s something more.
And, from where I sit, it’s certainly been worth the wait …
The Lure Of Convergence
Think of IT as a black box: money and people go in one side, IT services come out the other side. To the people who actually pay for IT organizations, this is not an entirely inaccurate representation.
If you’re responsible for that black box, you continually ask yourself the question: what should I have my people be spending time on? That’s the expensive bit, after all.
While there are many technical staffers out there who thoroughly enjoy hand-crafting IT infrastructure with the “best” components, it’s hard to point to situations where that activity actually creates meaningful value. Worse, those hand-crafted environments create a long integration and support shadow, lasting many years after the decision has been made.
More and more IT leaders are coming around to the perspective that they want their teams to be focused on using IT infrastructure to deliver IT services, and not excessively bound up in design, integration, support and maintenance.
If it doesn't add value, don't do it.
This is the fundamental appeal of all forms of converged infrastructure: spend less time and effort on the things that don’t matter much, so your team can spend more time on the things that do matter. The approach is now broadly understood, and the market continues to grow and expand.
Of course, IT infrastructure isn’t one-size-fits-all. Any debate around what’s “best” really should be “what’s best for you?”.
I think of EVO:RAIL as a fully-integrated vSphere stack with value-added management tools, packaged for use as an appliance. The idea is to make the user experience as simple and predictable as possible.
Software is only as good as its hardware, so the EVO:RAIL program provides a narrow specification for hardware partners: 4-node bricks, prescribed processors, memory and controllers, disk bays, etc. Partner value-add and differentiation comes in the form of additional software and services above and beyond a stock EVO:RAIL configuration.
Strategically, I think of EVO:RAIL as well within the “software defined” category: compute, network and storage. In particular, EVO:RAIL is built on VSAN — the product I’ve been deeply involved with recently.
The hardware is an EMC-specific design, and not a rebadge. If you've been following EVO:RAIL, the design and specs will look very familiar.
VSPEX BLUE will be sold through distributors, who set street pricing. For those of you who are interested in pricing, you'll have to contact a reselling partner.
When evaluating any new EVO:RAIL-based offering, the key question becomes — what’s unique? After all, the base hardware/software is specified to be near-identical across different partner offerings.
In EMC’s case, there is quite a list of differentiated features to consider, so let’s get started, shall we?
VSPEX BLUE Manager
While EVO:RAIL itself has a great user-centric manager to simplify things, EMC has taken it one step further by making a significant investment in their own VSPEX BLUE manager that complements and extends the native EVO:RAIL one in important ways.
First, the VSPEX BLUE Manager provides a portal to any and all support services: automatic and non-disruptive software upgrades, online chat with EMC support, knowledge base, documentation and community.
All in one place.
Second, the VSPEX BLUE Manager is “hardware aware”. It can display the physical view of your VSPEX BLUE appliances, and indicate — for example — which specific component might need to be replaced.
It also directly monitors low-level hardware status information, and feeds it upwards: to vCenter and LogInsight, as well as to EMC support if needed.
More importantly, this feature is tied into EMC’s legendary proactive “phone home” support (ESRS) which means you might be notified of a problem directly by EMC, even though you haven’t checked the console in a while :)
The VSPEX BLUE Manager also manages the software inventory. It discovers available updates, and non-disruptively installs them if you direct it to. More intriguing, there’s the beginnings of a “software marketplace” with additional software options from EMC, with presumably more coming from other vendors as well.
Part of any potential EVO:RAIL value-add is additional software, and here EMC has been very aggressive. Three fully functional and supported software packages are included with every VSPEX BLUE appliance, but with a few restrictions.
First up is EMC’s RecoverPoint for VMs. Those of you who have followed me for a while know that I’m a huge fan of RecoverPoint: great technology and functionality, very robust and now runs nicely in vSphere (and on top of VSAN) protecting on a per-VM basis.
The limitation is 15 protected VMs per four-node appliance. More appliances, more protected VMs. Since full functionality is provided, your choice of protection targets is wide open: another VSPEX BLUE appliance, or something else entirely.
Next up is the CloudArray cloud storage extender, based on EMC’s recent TwinStrata acquisition. CloudArray can present file shares or iSCSI to applications requiring extra capacity, or potentially a file share between multiple applications, something VSAN doesn’t do today.
The back-end store can be any compatible object storage: your choice of cloud, an on-prem object store, etc. The included license is for 10TB of external capacity per appliance, not including the actual physical capacity.
And finally, VMware’s VDP backup software (based on DataDomain/Avamar technology) is included. The upgrade is to the full-boat VDP-A. Stay tuned, though.
What I Like
For starters, EMC’s offering is entirely based on VSAN for storage. There is no packaging with an external array, as NetApp did. Since my world is very VSAN-centric these days, that’s a huge statement, coming from the industry’s largest and most successful storage vendor.
VSPEX BLUE Manager is a great piece of work, and adds significant value. The fact that EMC supports the entire environment online (just like their arrays and other products) is a big differentiator in the real world. The software bundle is attractive as well; demonstrating EMC’s commitment to the product and making it stand out in the market.
And then there’s the fact that EMC is obviously jumping into the hyper-converged marketplace with both feet. Thousands of trained field people and a hugely influential partner network. Global distribution and support. An army of skilled professional services experts. A very proficient marketing machine. A large and successful VSPEX channel. The proven ability to move markets.
Those that just focus on the product itself will miss the bigger context here.
What’s Next For The Hyper-Converged Market?
If we’re being honest, the smaller startups had the nascent hyper-converged market to themselves in the early days.
Good for them.
But now the big boys are starting to jump in with vigor: first VMware with EVO:RAIL, and now EMC itself with VSPEX BLUE.
One thing is for sure, the future won’t be like the past :)
As part of the vSphere 6.0 announcement festivities, there’s a substantially updated new version of Virtual SAN, 6.0, to consider and evaluate.
Big news in the storage world, I think.
I have been completely immersed in VSAN for the last six months. It's been great. And now I get to publicly share what’s new and — more importantly — what it means for IT organizations and the broader industry.
If there was a prize for the most-controversial storage product of 2014, VSAN would win. In addition to garnering multiple industry awards, it’s significantly changed the industry's storage discussion in so many ways.
Before VSAN, shared storage usually meant an external storage array. Now there’s an attractive alternative — using commodity components in your servers, with software built into the world’s most popular hypervisor.
While the inevitable “which is better?” debate will continue for many years, one thing is clear: VSAN is now mainstream.
This post is a summary of the bigger topics: key concepts, what’s new in 6.0, and a recap of how customer and industry perspectives have changed. Over time, I’ll unpack each one in more depth, as there is a *lot* to cover here.
The Big Idea
Virtual SAN extends the vSphere hypervisor to create a cluster-wide shared storage service using flash and disk components resident in the server.
It does not run “on top of” vSphere, it’s an integral component. All sorts of design advantages result.
In addition to a new hardware and software model, VSAN introduced a new management model: policy-based storage management. Administrators specify the per-VM storage policy they’d like (availability, performance, thin provisioning, etc.), and VSAN figures out the rest. Change the policy, VSAN adapts if the resources are there.
The marketing literature describes it as “radically simple”, and I’d have to agree.
During 2014, VSAN 5.5 exceeded each and every internal goal we set out: performance, availability, reliability, customer adoption, roadmap, etc. A big congratulations to a stellar team!
Of course, now we need to set much more aggressive goals :)
Quite a bit, really. The engineering team is accelerating their roadmap, and I couldn’t be more pleased. All of this should be available when vSphere 6.0 goes GA.
VSAN 6.0 now supports all-flash configurations using a two-tiered approach. The cache tier must be a write-endurant (i.e. not cheap) flash device; capacity can be more cost-effective (and less write-endurant) flash.
With all-flash configurations, cache is not there to accelerate performance; it’s there to minimize write activity to the capacity layer, extending its life.
Note: IOPS quoted here are 4K, 70r/30w mixes. As always, your mileage may vary.
Performance is extreme, and utterly predictable as you’d expect. Sorry, no dedupe or compression quite yet. However -- and this is a big however -- the write-caching scheme permits the use of very cost-effective capacity flash without burning it out prematurely. Everyone's numbers are different, but it's a close call as to which approach is more cost-effective: (a) more expensive capacity flash with dedupe/compression, or (b) more inexpensive capacity flash without dedupe/compression.
Please note that all-flash support is a separate license for VSAN.
New file system, new snaps
Using the native snapshots in vSphere 5.5 was, well, challenging. A new on-disk filesystem format is introduced in VSAN 6.0 that’s faster and more efficient, derived from the Virsto acquisition. Snaps are now much faster, more space-efficient and perform better when used -- and you can take many more of them. Up to 32 snaps per VM are now supported, if you have the resources.
VSAN 5.5 users can upgrade to 6.0 bits without migrating the file format, but won’t get access to some of the new features until they do. The disk format upgrade is rolling and non-disruptive, one disk group at a time. Rollbacks and partial migrations are supported. Inconvenient, but unavoidable.
Bigger and better — faster, too!
Support is included for VMDKs up to 62TB, same as the vSphere 6.0 max. VSAN clusters can have as many as 64 nodes, same as vSphere 6.0.
The maximum number of VMs on a VSAN node is now 200 for both hybrid and all-flash configs (twice that of VSAN 5.5), with a new maximum of 6400 VMs per VSAN cluster.
More nodes means more potential capacity, up to 64 nodes, 5 disk groups per node and 7 devices per disk group, or 2240 capacity devices per cluster. Using 4TB drives, that’s a humble ~9 petabytes raw or so.
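For anyone who wants to check the math or plug in their own drive size, here is a small sketch of the arithmetic. The maximums come straight from the numbers above; the function name and the decimal TB-to-PB conversion are my own choices for illustration.

```python
# Cluster maximums as quoted in the text for VSAN 6.0.
NODES_PER_CLUSTER = 64
DISK_GROUPS_PER_NODE = 5
CAPACITY_DEVICES_PER_GROUP = 7


def raw_capacity_tb(drive_tb, nodes=NODES_PER_CLUSTER):
    """Return (capacity device count, raw capacity in TB) for a cluster."""
    devices = nodes * DISK_GROUPS_PER_NODE * CAPACITY_DEVICES_PER_GROUP
    return devices, devices * drive_tb


devices, raw_tb = raw_capacity_tb(4)
# Decimal conversion (1 PB = 1000 TB); this is raw capacity, before any
# protection overhead from the storage policy is taken into account.
print(devices, raw_tb, raw_tb / 1000)
```

Keep in mind this is raw capacity: whatever redundancy your storage policy asks for comes out of that number.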
The marketing statement is that VSAN 6.0 in a hybrid configuration (using flash as cache and magnetic disk for capacity) offers twice the performance of VSAN 5.5. I want to come back later and unpack that assertion in more detail, but VSAN 6.0 is noticeably faster and more efficient for most workloads. Keep in mind that VSAN 5.5 was surprisingly fast as well.
And, of course, the all-flash configurations are just stupidly fast.
Go nuts, folks.
Fault domains now supported
VSAN 5.5 was not rack-aware, VSAN 6.0 is. When configuring, you define a minimum of 3 fault domains to represent which server is in which rack. After this step, VSAN will be smart enough to distribute redundancy components across racks (rather than within racks) if you tell it to.
Note: this is not stretched clusters — yet.
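As a rough illustration of what rack awareness buys you, here is a toy placement sketch: each redundancy component (two data copies and a witness, in the simplest case) lands in a different fault domain, so losing a whole rack costs at most one component. This is not VSAN's actual placement logic, which isn't described here; the function, the component names and the host names are all hypothetical.

```python
def place_components(hosts_by_rack, components=("replica-1", "replica-2", "witness")):
    """Assign each component to a host in a different rack (fault domain).

    Toy logic for illustration only: one component per rack, using the first
    host listed for that rack. Assumes every rack lists at least one host.
    """
    racks = sorted(hosts_by_rack)
    if len(racks) < len(components):
        raise ValueError("need at least as many fault domains as components")
    return {comp: (rack, hosts_by_rack[rack][0]) for comp, rack in zip(components, racks)}


racks = {
    "rack-1": ["esx-01", "esx-02"],
    "rack-2": ["esx-03", "esx-04"],
    "rack-3": ["esx-05", "esx-06"],
}
placement = place_components(racks)
for component, (rack, host) in placement.items():
    print(component, rack, host)
```

With only two racks defined, the sketch refuses to place anything, which mirrors the three-fault-domain minimum mentioned above.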
New VSAN Health Services
Not surprisingly, VSAN is dependent on having the correct components (hardware, driver, firmware) as well as a properly functioning network. The majority of our support requests to date have been related to these two external issues.
VSAN 6.0 now includes support for a brand new Health Services tool that can be used to diagnose most (but not all) of the external environmental issues we've encountered to date. Log collection is also simplified in the event you need VMware customer service support.
A must for any serious VSAN user.
Lots and lots of little things now make day-to-day life with VSAN easier.
The UI now does a better job of showing you the complete capacity picture at one go. There’s a nifty screen that helps you map logical devices to physical servers. You can blink drive LEDs to find the one you’re interested in. Individual capacity drives or disk groups can be evacuated for maintenance or reconfiguration. There’s a new proactive rebalancing feature.
A default storage policy is now standard, which of course can be edited to your preferences. There’s a new screen that shows resynchronization progress of rebuilding objects. There’s a new “what if” feature that helps you decide the impact of a new storage policy ahead of time.
For larger environments, vRealize Automation and vRealize Operations integration has been substantially improved — VSAN is now pre-configured to raise selected status alarms (VOBs) to these tools if present.
And much more.
Behind the scenes
There’s been a whole raft of behind-the-scenes improvements that aren’t directly visible, but are still important.
VSAN is now even more resource-efficient (memory, CPU, disk overhead) than before. That efficiency is a big factor for anyone looking at software-delivered storage solutions as part of their virtualized environment, helping folks achieve even higher consolidation ratios.
The rebuild prioritization could be a bit aggressive in VSAN 5.5; it now plays much more nicely with other performance-intensive applications. Per-node component counts have been bumped up from 3000 to 9000, and there’s a new quorum voting algorithm that uses far fewer witnesses than before. As a result, there’s much less need to keep an eye on component usage.
VSAN 6.0 requires a minimum of vCenter 6.0 and ESXi 6.0 on all hosts. As mentioned before, you can defer the file system format conversion until later, but no mixing and matching other than that.
If you’d like to shortcut the build-your-own experience, that’s what VSAN ReadyNodes are for. Many new ones will be announced shortly, some with support for hardware checksums and/or encryption.
ReadyNodes can either be ordered directly from the vendor using a single SKU, or used as a handy reference in creating your own configurations.
Or skip all the fun, and go directly to EVO:RAIL.
See something missing?
Inevitably, people will find a favorite feature or two that's not in this release. I have my own list. But don't be discouraged ...
I can't disclose futures, but what I can point to is the pace of the roadmap. VSAN 5.5 has been out for less than a year, and now we have a significant functionality upgrade in 6.0. Just goes to show how quickly VMware can get new VSAN features into customers' hands.
The Customer Experience
By the end of 2014, there were well over 1000 paying VSAN 5.5 customers. Wow. Better yet, they were a broad cross-section of the IT universe: different sizes, different industries, different use cases and different geographies.
For the most part, we succeeded in exceeding their expectations: radically simple, cost-effective, blazing performance, reliable and resilient, etc.
One area we took some heat on with VSAN 5.5 was being a bit too conservative on the proposed initial use cases: test and dev, VDI, etc.
Customers wondered why we were holding back, since they were having such great experiences in their environment.
OK, call us cautious :)
With VSAN 6.0, there are almost no caveats: bring on your favorite business-critical workloads with no reservations.
Another area where we’re working to improve? In the VSAN model, the customer is responsible for sizing their environment appropriately, and sourcing components/drivers/firmware that are listed on the VMware Compatibility Guide (VCG) for VSAN.
Yes, we had a few people who didn’t follow the guidelines and had a bad experience as a result. But a small number of folks did their best, and still had some unsupported component inadvertently slip into their configuration. Not good. Ideally, we’d be automatically checking for that sort of thing, but that’s not there — yet.
So the admonishments to religiously follow the VSAN VCG continue.
If I’m being critical, we probably didn’t do a great job explaining some of the characteristics of three-node configurations, which are surprisingly popular. In a nutshell, a three-node config can protect against one node failure, but not two. If one node fails, there are insufficient resources to re-protect (two copies of data plus a witness on the third node) until the problem is resolved. This also means you are unprotected from a failure during maintenance mode when only two nodes are available.
Some folks are OK with this, some not.
Four-node configs (e.g. EVO:RAIL) have none of these constraints. Highly recommended :)
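The three-node versus four-node difference comes down to simple counting. Here is a minimal sketch, assuming the usual rule of thumb that tolerating n failures takes n+1 data copies plus n witnesses, each on its own host (so 2n+1 hosts). The function names are mine, not anything in the product.

```python
def min_hosts(failures_to_tolerate):
    """Hosts needed for full protection: n+1 data copies plus n witnesses."""
    return 2 * failures_to_tolerate + 1


def can_reprotect(healthy_hosts, failures_to_tolerate=1):
    """True if enough healthy hosts remain to rebuild full protection."""
    return healthy_hosts >= min_hosts(failures_to_tolerate)


for cluster_size in (3, 4):
    after_one_failure = cluster_size - 1
    print(cluster_size, can_reprotect(after_one_failure))
```

Lose one node out of three and only two remain, which is one short of the two copies plus witness. Lose one out of four and there is still a spare host to rebuild onto.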
The Industry Experience
Storage — and storage technology — has been around for a long time, so there’s an established orthodoxy that some people adhere to. VSAN doesn’t necessarily follow that orthodoxy, which is why it’s disruptive — and controversial.
There was a lot of head-scratching and skepticism when VSAN was introduced, but I think by now most of the industry types have gotten their head wrapped around the concept.
Yes, it’s highly available. Yes, it offers great performance. No, the world won’t end because people are now using server components and hypervisor software to replace the familiar external storage array. And there is plenty of real-world evidence that it works as advertised.
However, a few red herrings did pop up during 2014 that are worth mentioning.
One thread was around why people couldn’t use any hardware that might be handy to build a VSAN cluster. The rationale is the same as why array vendors won’t let you put anything unsupported in their storage arrays — the stuff might not work as expected, or — in some cases — is already known not to work properly.
If you don’t follow the VCG (VMware Compatibility Guide), we’re awfully limited in the help we can provide. And there are some truly shoddy components out there that people have tried to use unsuccessfully.
Another thread was from a competitor around the attractiveness of data locality.
The assertion was that it made performance sense to keep data and application together on the same node, with absolutely no evidence to support the claim.
Keep in mind that, even with this scheme, writes still have to go to a separate node, and any DRS or vMotion will need its data to follow. And that’s before you consider the data management headaches that could result from trying to hand-place the right data on the right node all the time.
Hogwash, in my book.
VSAN creates a cluster resource that is uniformly accessible by all VMs. DRS and/or vMotions don’t affect application performance one bit. Thankfully, the competitor dropped that particular red herring and went on to other things.
A related thread was the potential attractiveness of client-side caching vs. VSAN’s server-side caching. A 10Gb network is plenty fast, and by caching only one copy of read data cluster-wide, it’s far more space efficient and thus there’s a much greater likelihood that a read request will come from cache vs. disk. Our internal benchmarks continually validate this design decision.
A more recent thread came from the networking crowd, who weren’t happy that VSAN uses multicast for cluster coordination, e.g. for all nodes to stay informed on current state. It’s not used to move data.
Do we have cases where customers haven’t set up multicast correctly? Yes, for sure. Once it gets set up correctly, does it usually run without a problem? Yes, for sure.
There was also the predictable warning that VSAN 5.5 was a “1.0” product, which was essentially a true statement. That being said, I’ve been responsible for bringing dozens of storage products to market over the decades and, from my perspective, there was very little that was “1.0” about it.
And I’ve got the evidence to back it up.
Perhaps the most perplexing thread was the specter of the dreaded “lock in” resulting from VSAN usage. To be fair, most external storage arrays support all sorts of hypervisors and physical hosts, and Virtual SAN software is for vSphere, pure and simple.
Enterprise IT buyers are quite familiar with products that work with some things, but not others. This is not a new discussion, folks. And it seems like the vSphere-only restriction is AOK for many, many people.
The Big Picture?
Using software and commodity server components to deliver shared storage services is nothing new in the industry. We’ve been talking about this in various forms for many, many years.
But if I look back, this was never an idea that caught on — it was something you saw people experimenting with here and there, with most of the storage business going to the established storage array vendors.
With VSAN’s success, I’d offer this has started to change.
During 2014, VSAN has proven itself to be an attractive, mainstream concept that more and more IT shops are embracing with great success.
And with VSAN 6.0, it just gets even more compelling.