Considering The Next Wave Of Storage Automation and more...


Considering The Next Wave Of Storage Automation

From the time enterprise data centers sprang into existence, we've had this burning desire to automate the heck out of them.

From early mainframe roots to today's hybrid cloud, the compulsion to progressively automate each and every aspect of operations never wanes.

The motivations have been compelling: use fewer people, respond faster, operate more efficiently, make outcomes more predictable, and make services more resilient.

But the obstacles have also been considerable: both technological and operational.

With the arrival of vSphere 6.0, a nice chunk of new technology has been introduced to help automate perhaps the most difficult part of the data center – storage.

It's worth digging into these new storage automation features: why they are needed, how they work, and why they should be seriously considered.


Background

Automating storage in enterprise data centers is most certainly not a new topic.

Heck, it's been around at least as long as I have, and that's a long time :)

Despite decades of effort by both vendors and enterprise IT users, effective storage automation is still an elusive goal for so many IT teams.

When I'm asked "why is this so darn hard?", here's what I point to:

  • Storage devices had very limited knowledge of applications: their requirements, and their data boundaries. Arrays had to be explicitly told what to do, when to do it and where it needed to be done.
     
  • Cross-vendor standards that could facilitate basic communication between application requirements and storage array capabilities failed to emerge.

     
  • Storage arrays (and their vendors) present a storage-centric view of their operations, making it difficult for non-storage groups to easily request new services, or to ascertain whether end-to-end application requirements are being met.

Here's the message: the new storage capabilities available in vSphere 6.0 show strong progress towards addressing each of these long-standing challenges.

Towards Application Centricity

Data centers exist solely to deliver application services: capacity, performance, availability, security, etc.

To the extent that each aspect of the infrastructure can be made programmatically aware of individual application requirements, far better automation can be achieved.

However, when it comes to storage, there have been significant architectural challenges in achieving this.

The first challenge is that applications themselves typically don’t provide specific instructions on their individual infrastructure requirements.  And asking application developers to take on this responsibility can lead to all sorts of unwanted outcomes.

At a high level, what is needed is a convenient place to specify application policies that can be bound to individual applications, instruct the infrastructure as to what is required, and be conveniently changed when needed.

The argument is simple: the hypervisor is in a uniquely privileged position to play this role. It not only hosts all application logic, but abstracts that application from all of the infrastructure: compute, network and storage.

While these policy concepts have been in vSphere for a while, vSphere 6.0 introduces a new layer of storage policy-based management (SPBM). This enables administrators to describe specific storage policies, associate them with groups of applications, and change them if needed.
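To make that concrete, here's a minimal sketch in plain Python (not the actual vSphere or SPBM API; the capability names and values are invented for illustration) of what a storage policy boils down to: a named set of rules, bound to groups of VMs, and editable in one place.

```python
from dataclasses import dataclass, field

@dataclass
class StoragePolicy:
    """A named set of capability rules, defined once and reused across many VMs."""
    name: str
    rules: dict = field(default_factory=dict)   # capability name -> required value

# Hypothetical policies an administrator might define.
gold = StoragePolicy("gold", {"failuresToTolerate": 1,
                              "readCacheReservationPct": 10,
                              "thinProvisioned": True})
bronze = StoragePolicy("bronze", {"failuresToTolerate": 0,
                                  "thinProvisioned": True})

# Policies are associated with groups of VMs rather than baked into each one.
vm_policies = {"erp-db-01": gold, "erp-db-02": gold, "test-web-07": bronze}

# Changing a requirement means editing the policy once, not touching every VM.
gold.rules["failuresToTolerate"] = 2
```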

But more is needed here.

Historically, storage containers have not aligned with application boundaries.  External storage arrays present LUNs or file systems - large chunks of storage shared by many applications.

Storage services (capacity, performance, protection, etc.) were specified at the large container level, with no awareness of individual application boundaries.

This mismatch has resulted in both increased operational effort and reduced efficiency.

Application and infrastructure teams need to go continually back and forth with the storage team regarding application requirements. And storage teams are forced to compromise by creating storage service buckets specified in excess of what is actually required by applications.  Better to err on the side of safety, right?

No longer. vSphere 6.0 introduces a new storage container – Virtual Volumes, or VVOLs – that precisely aligns application boundaries and the storage containers they use. Storage services can now be specified on a per-application, per-container basis.

We now have two key pieces of the puzzle: the ability to conveniently specify per-application storage policy (as part of overall application requirements), and the ability to create individualized storage containers that can precisely deliver the requested services without affecting other applications.

So far, so good.

Solving The Standards Problem

Periodically, the storage industry attempts to define meaningful, cross-vendor standards that facilitate external control of storage arrays. However, practical success has been difficult to come by.

Every storage product speaks a language of one: not only in the exact set of APIs it supports, but in how it assigns meaning to specific requests and communicates results.  Standard definitions of what exactly a snapshot means, for example, are hard to come by.

The net result is that significant automation of multi-vendor storage environments has been extremely difficult for most IT organizations to achieve.

To be clear, the need for heterogeneous storage appears to be increasing, not decreasing: enterprise data centers continue to be responsible for supporting an ever-widening range of application requirements, from transaction processing to big data to third-platform applications. No one storage product can be expected to meet every application requirement (despite vendors' best intentions), so multiple types are frequently needed.

De-facto standards can be driven by products that are themselves de-facto standards in the data center, and here vSphere stands alone with regard to hypervisor adoption.  When VMware defines a new standard for interacting with the infrastructure (and customers adopt it), vendors typically respond well.

vSphere 6.0 introduces a new set of storage APIs (VASA 2.0) that facilitate a standard method of application-centric communication with external storage arrays. VMware’s storage partners have embraced this standard enthusiastically, with several implementations available today and more coming.

Considering VASA 2.0 together with SPBM and VVOLs, one can see that many of the technology enabling pieces are now in place for an entirely new storage automation approach. Administrators can now specify application-centric storage policies via SPBM, communicate them to arrays via VASA 2.0, and receive a perfectly aligned storage container – a VVOL.  Nice and neat.
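Here's a rough sketch of that end-to-end flow, again in plain illustrative Python rather than the real SPBM or VASA 2.0 interfaces (all class and function names are invented): the policy travels to the array, the array confirms it can comply, and the result is a virtual volume sized and serviced for exactly one VM.

```python
# Illustrative sketch only: it models the conversation between SPBM and a
# VASA 2.0 provider, not VMware's actual interfaces. All names are invented.

class ArrayProvider:
    """Stand-in for a storage array exposing its capabilities via VASA."""
    capabilities = {"failuresToTolerate": {0, 1, 2},
                    "thinProvisioned": {True, False}}

    def can_satisfy(self, rules):
        # Step 2: the array reports whether it can honor the requested policy.
        return all(key in self.capabilities and value in self.capabilities[key]
                   for key, value in rules.items())

    def create_vvol(self, vm_name, rules, size_gb):
        # Step 3: a per-VM virtual volume is created with exactly those services.
        return {"vvol_for": vm_name, "size_gb": size_gb, "services": dict(rules)}

def provision(vm_name, policy_rules, size_gb, array):
    # Step 1: SPBM hands the VM's policy rules to the array.
    if not array.can_satisfy(policy_rules):
        raise RuntimeError("array cannot meet the requested policy")
    return array.create_vvol(vm_name, policy_rules, size_gb)

gold_rules = {"failuresToTolerate": 1, "thinProvisioned": True}
vvol = provision("erp-db-01", gold_rules, size_gb=500, array=ArrayProvider())
print(vvol)
```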

Who Should (Ideally) Manage Storage?

It’s one thing to conveniently specify application requirements, it’s another thing to ensure that the requested service levels are being met, and – more importantly – how to fix things quickly when that’s not happening.

Historically, the storage management model in many IT organizations has evolved into a largely self-contained organizational "black box". Requests and trouble tickets are submitted with little visibility for the other teams who depend greatly on the storage team's services.

Although this silo model routinely causes unneeded friction and inefficiency (not to mention frustration all around), where it can be particularly painful is in resolving urgent performance problems: is the problem in the application logic, the server, the network – or storage?

The storage management model created by vSphere 6.0 is distinctly different than traditional models: storage teams are still important, but more information (and responsibility) is given to the application and infrastructure teams in controlling their destiny.

Virtual administrators now see "their" abstracted storage resources: what's available, what it can do, how it's being used, etc. There should be no need to directly interact with the storage team for most day-to-day provisioning requirements. Policies are defined, VVOLs are consumed, storage services are delivered.

Through vCenter and the vRealize suite, virtual administrators now have enough storage-related information to ascertain the health and efficiency of their entire environments, and have very focused conversations with their storage teams if there’s an observed issue.  

Storage teams still have an important role, although somewhat different than in the past. They now must ensure sufficient storage services are available (capacity, performance, protection, etc.), and resolve problems if the services aren’t working as advertised.

However, operational and organizational models can be highly resistant to change.  That's the way the world works -- unless there is a forcing function that makes the case compelling to all parties.  

And VSAN shows every sign of being a potential change accelerator.

How Virtual SAN Accelerates Change

As part of vSphere 5.5U1, VMware introduced Virtual SAN, or VSAN. Storage services can now be delivered entirely using local server resources -- compute, flash and disk – using native hypervisor capabilities. There is no need for an external storage array when using VSAN – nor a need for a dedicated storage team, for that matter.

VSAN is designed to be installed and managed entirely by virtual administrators, without requiring interaction with the storage team. These virtualization teams can now quickly configure storage resources, create policies, tie them to applications, monitor the results and speedily resolve potential problems – all without leaving the vSphere world.

As an initial release, VSAN 5.5 had limited data services, and thus limited use cases. VSAN 6.0 is an entirely different proposition: more performance (both using a mix of flash and disk, or using all-flash), new enterprise-class features, and new data services that can significantly encroach on the turf held by traditional storage arrays.

Empowered virtualization teams now have an interesting choice with regards to storage: continue to use external arrays (and the storage team), use self-contained VSAN, or most likely an integrated combination depending on requirements.  

Many are starting to introduce VSAN alongside traditional arrays, and have thus seen the power of a converged, application-centric operational model. And it’s very hard to go back to the old way of doing things when the new way is so much better -- and readily at hand.

The rapid initial growth of VSAN shows the potential of putting a bit of pressure on traditional storage organizations to work towards a new operational model, with improved division of responsibilities between application teams, infrastructure teams and storage teams.  And they'll need the powerful combination of SPBM, VASA 2.0 and VVOLs to make that happen.

Change Is Good -- Unless It's Happening To You

I have spent many, many years working with enterprise storage teams.  They have a difficult, thankless job in most situations.  And there is no bad day in IT quite like a bad storage day.

Enterprise IT storage teams have very specific ways of doing things, arguably built on the scar tissue of past experiences and very bad days.  You would too, if you were them.

That being said, there is no denying the power of newer, converged operational models and the powerful automation that makes them so compelling.  The way work gets done can -- and will -- change.

Enterprise storage teams can view these new automation models as either a threat, or an opportunity. 

I know which side of that debate I'd be on. 



Enter VSPEX BLUE

In the IT biz, all forms of converged infrastructure are now the rage.

Rightfully so: their pre-integrated nature and single-support model eliminate much of the expensive IT drudgery that doesn't usually create significant value: selecting individual components, integrating them, supporting them, upgrading them, etc.

How much easier is it to order a block, brick, node, etc. of IT infrastructure as a single supportable product, and move on to more important matters?  

A lot easier, it seems ...

Reference architectures have been around for ages.  I think of them as blueprints for building a car, rather than buying one.  Some assembly required.  Useful, yes, but there's room for more.

VCE got the party started years back with Vblocks: pre-integrated virtualized infrastructure, sold and supported as a single product — with their success to be quickly followed by other vendors who saw the same opportunity.

A group of smaller vendors took the same idea, but did storage in software vs. requiring an external array, dubbing themselves "hyper-converged": Nutanix, Simplivity and others.  They, too, have seen some success.

Last August, VMware got into this market in a big way by introducing EVO:RAIL — an integrated software product that — when combined with a specific hardware reference platform from an OEM partner — delivered an attractive new improvement over the first round of hyper-converged solutions.

While EVO:RAIL had several partners who offered immediate availability, EMC decided to take their time, and do something more than simply package EVO:RAIL with the reference hardware platform.

Today, we get to see what they’ve been working on — VSPEX BLUE.  It’s not just another EVO:RAIL variant, it’s something more.

And, from where I sit, it’s certainly been worth the wait …

The Lure Of Convergence

Think of IT as a black box: money and people go in one side, IT services come out the other side.  To the people who actually pay for IT organizations, this is not an entirely inaccurate representation.

If you're responsible for that black box, you continually ask yourself the question: what should I have my people be spending time on?  That's the expensive bit, after all.

While there are many technical staffers out there who thoroughly enjoy hand-crafting IT infrastructure with the “best” components, it’s hard to point to situations where that activity actually creates meaningful value.  Worse, those hand-crafted environments create a long integration and support shadow, lasting many years after the decision has been made.

More and more IT leaders are coming around to the perspective that they want their teams to be focused on using IT infrastructure to deliver IT services, and not excessively bound up in design, integration, support and maintenance.  

If it doesn't add value, don't do it.

This is the fundamental appeal of all forms of converged infrastructure: spend less time and effort on the things that don't matter much, so your team can spend more time on the things that do matter.  The approach is now broadly understood, and the market continues to grow and expand.

Of course, IT infrastructure isn’t one-size-fits-all.  Any debate around what’s “best” really should be “what’s best for you?”.

EVO:RAIL

I think of EVO:RAIL as a fully-integrated vSphere stack with value-added management tools, packaged for use as an appliance.  The idea is to make the user experience as simple and predictable as possible.

Software is only as good as its hardware, so the EVO:RAIL program provides a narrow specification for hardware partners: 4-node bricks, prescribed processors, memory and controllers, disk bays, etc.  Partner value-add and differentiation comes in the form of additional software and services above and beyond a stock EVO:RAIL configuration.

Strategically, I think of EVO:RAIL as well within the “software defined” category: compute, network and storage.  In particular, EVO:RAIL is built on VSAN — the product I’ve been deeply involved with recently.

VSPEX BLUE

The hardware is an EMC-specific design, and not a rebadge.  If you've been following EVO:RAIL, the design and specs will look very familiar.

VSPEX BLUE will be sold through distributors, who set street pricing.  For those of you who are interested in pricing, you'll have to contact a reselling partner.

When evaluating any new EVO:RAIL-based offering, the key question becomes — what’s unique?  After all, the base hardware/software is specified to be near-identical across different partner offerings.

In EMC’s case, there is quite a list of differentiated features to consider, so let’s get started, shall we?

VSPEX BLUE Manager

While EVO:RAIL itself has a great user-centric manager to simplify things, EMC has taken it one step further by making a significant investment in their own VSPEX BLUE Manager that complements and extends the native EVO:RAIL one in important ways.

First, the VSPEX BLUE Manager provides a portal to any and all support services: automatic and non-disruptive software upgrades, online chat with EMC support, knowledge base, documentation and community.  

All in one place.

Second, the VSPEX BLUE Manager is "hardware aware".  It can display the physical view of your VSPEX BLUE appliances, and indicate — for example — which specific component might need to be replaced.

It also directly monitors low-level hardware status information, and feeds it upwards: to vCenter and LogInsight, as well as to EMC support if needed. 

More importantly, this feature is tied into EMC’s legendary proactive “phone home” support (ESRS) which means you might be notified of a problem directly by EMC, even though you haven’t checked the console in a while :)

The VSPEX BLUE Manager also manages the software inventory.  It discovers available updates, and non-disruptively installs them if you direct it to.  More intriguing, there’s the beginnings of a “software marketplace” with additional software options from EMC, with presumably more coming from other vendors as well.

Bundled Software

Part of any potential EVO:RAIL value-add is additional software, and here EMC has been very aggressive.  Three fully functional and supported software packages are included with every VSPEX BLUE appliance, but with a few restrictions.

First up is EMC’s RecoverPoint for VMs.  Those of you who have followed me for a while know that I’m a huge fan of RecoverPoint: great technology and functionality, very robust and now runs nicely in vSphere (and on top of VSAN) protecting on a per-VM basis.

The limitation is 15 protected VMs per four-node appliance.  More appliances, more protected VMs.  Since full functionality is provided, your choice of protection targets is wide open: another VSPEX BLUE appliance, or something else entirely.

Next up is the CloudArray cloud storage extender, based on EMC's recent TwinStrata acquisition.  CloudArray can present file shares or iSCSI to applications requiring extra capacity, or potentially a file share between multiple applications — something VSAN doesn't do today.

The back-end store can be any compatible object storage: your choice of cloud, an on-prem object store, etc.  The included license covers 10TB of external capacity per appliance; the backing physical capacity itself is not included.

And finally, VMware's VDP backup software (based on DataDomain/Avamar technology) is included.  An upgrade to the full-boat VDP-A is available.  Stay tuned, though.

What I Like

For starters, EMC's offering is entirely based on VSAN for storage.  There is no packaging with an external array, as NetApp did.  Since my world is very VSAN-centric these days, that's a huge statement, coming from the industry's largest and most successful storage vendor.

VSPEX BLUE Manager is a great piece of work, and adds significant value.  The fact that EMC supports the entire environment online (just like their arrays and other products) is a big differentiator in the real world.  The software bundle is attractive as well, demonstrating EMC's commitment to the product and making it stand out in the market.

And then there’s the fact that EMC is obviously jumping into the hyper-converged marketplace with both feet.  Thousands of trained field people and a hugely influential partner network.  Global distribution and support.   An army of skilled professional services experts.  A very proficient marketing machine.  A large and successful VSPEX channel. The proven ability to move markets.

Those that just focus on the product itself will miss the bigger context here.

What’s Next For The Hyper-Converged Market?

If we're being honest, the smaller startups had the nascent hyper-converged market to themselves in the early days.

Good for them.

But now the big boys are starting to jump in with vigor: first VMware with EVO:RAIL, and now EMC itself with VSPEX BLUE.

One thing is for sure, the future won’t be like the past :)

 


VSAN 6.0 — The Second Chapter Begins

As part of the vSphere 6.0 announcement festivities, there's a substantially updated version, Virtual SAN 6.0, to consider and evaluate.

Big news in the storage world, I think.

I have been completely immersed in VSAN for the last six months.  It's been great.  And now I get to publicly share what’s new and — more importantly — what it means for IT organizations and the broader industry.

If there was a prize for the most-controversial storage product of 2014, VSAN would win.  In addition to garnering multiple industry awards, it’s significantly changed the industry's storage discussion in so many ways.

Before VSAN, shared storage usually meant an external storage array.  Now there's an attractive alternative — using commodity components in your servers, with software built into the world's most popular hypervisor.

While the inevitable “which is better?” debate will continue for many years, one thing is clear: VSAN is now mainstream.

This post is a summary of the bigger topics: key concepts, what’s new in 6.0, and a recap of how customers and industry perspective has changed.  Over time, I’ll unpack each one in more depth — as there is a *lot* to cover here.

The Big Idea

Virtual SAN extends the vSphere hypervisor to create a cluster-wide shared storage service using flash and disk components resident in the server.

It does not run “on top of” vSphere, it’s an integral component.  All sorts of design advantages result.

In addition to a new hardware and software model, VSAN introduced a new management model: policy-based storage management.  Administrators specify the per-VM storage policy they’d like (availability, performance, thin provisioning, etc.), and VSAN figures out the rest.  Change the policy, VSAN adapts if the resources are there.  
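As a rough illustration of what "VSAN figures out the rest" means (simplified numbers, not VSAN's actual placement logic): tolerating n host failures implies n+1 data replicas plus witness components, spread across at least 2n+1 hosts.

```python
def layout_for(failures_to_tolerate: int, stripe_width: int = 1) -> dict:
    """Back-of-the-envelope object layout implied by a per-VM policy.
    Simplified: real placement also weighs capacity, witnesses, cache, etc."""
    replicas = failures_to_tolerate + 1          # full copies of the data
    min_hosts = 2 * failures_to_tolerate + 1     # needed to keep quorum
    return {"data_replicas": replicas,
            "stripes_per_replica": stripe_width,
            "minimum_hosts": min_hosts}

print(layout_for(failures_to_tolerate=1))   # 2 replicas across at least 3 hosts
print(layout_for(failures_to_tolerate=2))   # 3 replicas across at least 5 hosts
```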

The marketing literature describes it as “radically simple”, and I’d have to agree.

During 2014, VSAN 5.5 exceeded each and every internal goal we set: performance, availability, reliability, customer adoption, roadmap, etc.  A big congratulations to a stellar team!

Of course, now we need to set much more aggressive goals :)

What’s New

Quite a bit, really.  The engineering team is accelerating their roadmap, and I couldn’t be more pleased.  All of this should be available when vSphere 6.0 goes GA.

All-flash

VSAN 6.0 now supports all-flash configurations using a two-tiered approach.  The cache tier must be a write-endurant (e.g. not cheap) flash device; capacity can be more cost-effective (and less write-endurant) flash.

With all-flash configurations, cache is not there to accelerate performance; it’s there to minimize write activity to the capacity layer, extending its life.

Note: the IOPS figures quoted for VSAN are based on 4K blocks with a 70% read / 30% write mix.  As always, your mileage may vary.

Performance is extreme, and utterly predictable as you'd expect.  Sorry, no dedupe or compression quite yet.   However -- and this is a big however -- the write-caching scheme permits the use of very cost-effective capacity flash without burning it out prematurely.   Everyone's numbers are different, but it's a close call as to which approach is more cost-effective: (a) more expensive capacity flash with dedupe/compression, or (b) less expensive capacity flash without dedupe/compression.
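Since everyone's numbers really are different, here's that back-of-the-envelope comparison in sketch form. The prices and data-reduction ratio below are placeholders, not quotes, and the model ignores everything except raw cost per terabyte.

```python
def cost_per_usable_tb(price_per_raw_tb: float, data_reduction: float = 1.0) -> float:
    """Effective cost per usable TB once dedupe/compression (if any) is applied."""
    return price_per_raw_tb / data_reduction

# Placeholder numbers -- plug in your own quotes and expected reduction ratios.
premium = cost_per_usable_tb(price_per_raw_tb=2000, data_reduction=3.0)  # with dedupe/compression
budget  = cost_per_usable_tb(price_per_raw_tb=800)                       # without

print(f"premium flash + dedupe/compression: ${premium:.0f} per usable TB")
print(f"budget flash, no data reduction:    ${budget:.0f} per usable TB")
# Depending on street prices and achieved ratios, either column can come out ahead.
```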

Please note that all-flash support is a separate license for VSAN.

New file system, new snaps

Using the native snapshots in vSphere 5.5 was, well, challenging.  A new on-disk filesystem format is introduced in VSAN 6.0 that's faster and more efficient, derived from the Virsto acquisition.  Snaps are now much faster, more space-efficient and perform better when used -- and you can take many more of them. Up to 32 snaps per VM are now supported, if you have the resources.

VSAN 5.5 users can upgrade to the 6.0 bits without migrating the file format, but won't get access to some of the new features until they do.  The disk format upgrade is rolling and non-disruptive, one disk group at a time.  Rollbacks and partial migrations aren't supported.  Inconvenient, but unavoidable.

Bigger and better — faster, too!

Support is included for VMDKs up to 62TB, same as the vSphere 6.0 max.  VSAN clusters can have as many as 64 nodes, same as vSphere 6.0.

The maximum number of VMs on a VSAN node is now 200 for both hybrid and all-flash configs (twice that of VSAN 5.5), with a new maximum of 6400 VMs per VSAN cluster.

More nodes means more potential capacity, up to 64 nodes, 5 disk groups per node and 7 devices per disk group, or 2240 capacity devices per cluster. Using 4TB drives, that’s a humble ~9 petabytes raw or so.
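For the curious, here's the arithmetic behind those figures:

```python
# Per-cluster maximums quoted above for VSAN 6.0.
nodes = 64
disk_groups_per_node = 5
capacity_devices_per_group = 7
drive_size_tb = 4

capacity_devices = nodes * disk_groups_per_node * capacity_devices_per_group
raw_capacity_pb = capacity_devices * drive_size_tb / 1000

print(capacity_devices)              # 2240 capacity devices per cluster
print(f"{raw_capacity_pb:.2f} PB")   # ~8.96 PB raw with 4TB drives
```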

The marketing statement is that VSAN 6.0 in a hybrid configuration (using flash as cache and magnetic disk for capacity) offers twice the performance of VSAN 5.5.  I want to come back later and unpack that assertion in more detail, but VSAN 6.0 is noticeably faster and more efficient for most workloads.  Keep in mind that VSAN 5.5 was surprisingly fast as well.

And, of course, the all-flash configurations are just stupidly fast.

Go nuts, folks.

Fault domains now supported

VSAN 5.5 was not rack-aware; VSAN 6.0 is. When configuring, you define a minimum of 3 fault domains to represent which server is in which rack.  After this step, VSAN will be smart enough to distribute redundancy components across racks (rather than within racks) if you tell it to.
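A simplified sketch of what "smart enough to distribute redundancy components across racks" means in practice: given the fault domains you define, replicas of the same object land in different domains. This is an illustration of the idea, not VSAN's actual placement algorithm.

```python
def place_replicas(object_name: str, replica_count: int, fault_domains: list) -> dict:
    """Put each replica of an object in a distinct fault domain (rack).
    Illustrative only -- the real algorithm also balances capacity and witnesses."""
    if replica_count > len(fault_domains):
        raise ValueError("need at least as many fault domains as replicas")
    return {f"{object_name}/replica-{i}": fault_domains[i]
            for i in range(replica_count)}

racks = ["rack-A", "rack-B", "rack-C"]            # the 3+ fault domains you define
print(place_replicas("vmdk-web01", 2, racks))
# {'vmdk-web01/replica-0': 'rack-A', 'vmdk-web01/replica-1': 'rack-B'}
```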

Note: this is not stretched clusters — yet.

New VSAN Health Services

Not surprisingly, VSAN is dependent on having the correct components (hardware, driver, firmware) as well as a properly functioning network.   The majority of our support requests to date have been related to these two external issues.

VSAN 6.0 now includes support for a brand new Health Services tool that can be used to diagnose most (but not all) of the external environmental issues we've encountered to date.  Log collection is also simplified in the event you need VMware customer service support.  

A must for any serious VSAN user.

Operational improvements

Lots and lots of little things now make day-to-day life with VSAN easier.

The UI now does a better job of showing you the complete capacity picture at one go.  There’s a nifty screen that helps you map logical devices to physical servers.  You can blink drive LEDs to find the one you’re interested in.  Individual capacity drives or disk groups can be evacuated for maintenance or reconfiguration.  There’s a new proactive rebalancing feature.  

A default storage policy is now standard, which of course can be edited to your preferences. There’s a new screen that shows resynchronization progress of rebuilding objects. There’s a new “what if” feature that helps you decide the impact of a new storage policy ahead of time.

For larger environments, vRealize Automation and vRealize Operations integration has been substantially improved — VSAN is now pre-configured to raise selected status alarms (VOBs) to these tools if present.

And much more.

Behind the scenes

There's been a whole raft of behind-the-scenes improvements that aren't directly visible, but are still important.

VSAN is now even more resource-efficient (memory, CPU, disk overhead) than before.  That efficiency is a big factor for anyone looking at software-delivered storage solutions as part of their virtualized environment, since it allows even higher consolidation ratios, among other things.

The rebuild prioritization could be a bit aggressive in VSAN 5.5; it now plays much more nicely with other performance-intensive applications. Per-node component counts have been bumped up from 3000 to 9000, and there’s a new quorum voting algorithm that uses far fewer witnesses than before. As a result, there’s much less need to keep an eye on component usage.

VSAN 6.0 requires a minimum of vCenter 6.0 and ESXi 6.0 on all hosts.  As mentioned before, you can defer the file system format conversion until later, but no mix and matching other than that.

New ReadyNodes

If you'd like to shortcut the build-your-own experience, that's what VSAN ReadyNodes are for. Many new ones will be announced shortly, some with support for hardware checksums and/or encryption.

ReadyNodes can either be ordered directly from the vendor using a single SKU, or use them as a handy reference in creating your own configurations.

Or skip all the fun, and go directly to EVO:RAIL.

See something missing?

Inevitably, people will find a favorite feature or two that's not in this release.  I have my own list.  But don't be discouraged ...

I can't disclose futures, but what I can point to is the pace of the roadmap.  VSAN 5.5 has been out for less than a year, and now we have a significant functionality upgrade in 6.0.  Just goes to show how quickly VMware can get new VSAN features into customers' hands.  

The Customer Experience

By the end of 2014, there were well over 1000 paying VSAN 5.5 customers.  Wow.  Better yet, they were a broad cross-sample of the IT universe: different sizes, different industries, different use cases and different geographies.  

For the most part, we succeeded in exceeding their expectations: radically simple, cost-effective, blazing performance, reliable and resilient, etc.

One area we took some heat on with VSAN 5.5 was being a bit too conservative on the proposed initial use cases: test and dev, VDI, etc.

Customers wondered why we were holding back, since they were having such great experiences in their environment.  

OK, call us cautious :)

With VSAN 6.0, there are almost no caveats: bring on your favorite business-critical workloads with no reservations.

Another area where we’re working to improve?  In the VSAN model, the customer is responsible for sizing their environment appropriately, and sourcing components/drivers/firmware that are listed on the VMware Compatibility Guide (VCG) for VSAN.

Yes, we had a few people who didn't follow the guidelines and had a bad experience as a result.  But a small number of folks did their best, and still had some unsupported component inadvertently slip into their configuration.  Not good.  Ideally, we'd be automatically checking for that sort of thing, but that's not there — yet.

So the admonishments to religiously follow the VSAN VCG continue.

If I’m being critical, we probably didn’t do a great job explaining some of the characteristics of three-node configurations, which are surprisingly popular.  In a nutshell, a three node config can protect against one node failure, but not two.  If one node fails, there are insufficient resources to re-protect (two copies of data plus a witness on the third node) until the problem is resolved.  This also means you are unprotected from a failure during maintenance mode when only two nodes are available.  
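Here's a worked version of that constraint, assuming the usual FTT=1 layout (two data replicas plus a witness, each on a different node). The arithmetic is the whole argument.

```python
def can_reprotect(total_nodes: int, failed_nodes: int, placements_needed: int = 3) -> bool:
    """With FTT=1, an object needs 2 replicas + 1 witness on 3 distinct nodes.
    Rebuilding full protection after a failure needs 3 healthy nodes to land on."""
    return (total_nodes - failed_nodes) >= placements_needed

print(can_reprotect(total_nodes=3, failed_nodes=1))  # False: data is still available,
                                                     # but there's nowhere to rebuild
print(can_reprotect(total_nodes=4, failed_nodes=1))  # True: the spare node absorbs it
```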

Some folks are OK with this, some not.

Four-node configs (e.g. EVO: RAIL) have none of these constraints.  Highly recommended :)

The Industry Experience

Storage — and storage technology — has been around for a long time, so there’s an established orthodoxy that some people adhere to.  VSAN doesn’t necessarily follow that orthodoxy, which is why it’s disruptive — and controversial.

There was a lot of head-scratching and skepticism when VSAN was introduced, but I think by now most of the industry types have gotten their head wrapped around the concept.

Yes, it’s highly available.  Yes, it offers great performance.  No, the world won’t end because people are now using server components and hypervisor software to replace the familiar external storage array.  And there is plenty of real-world evidence that it works as advertised.

However, a few red herrings did pop up during 2014 that are worth mentioning.

One thread was around why people couldn’t use any hardware that might be handy to build a VSAN cluster. The rationale is the same as why array vendors won’t let you put anything unsupported in their storage arrays — the stuff might not work as expected, or — in some cases — is already known not to work properly.

If you don’t follow the VCG (vSphere Compatibility Guide), we’re awfully limited in the help we can provide.  And there are some truly shoddy components out there that people have tried to use unsuccessfully.

Another thread was from a competitor around the attractiveness of data locality.  

The assertion was that it made performance sense to keep data and application together on the same node, with absolutely no evidence to support the claim.

Keep in mind that, even with this scheme, writes still have to go to a separate node, and any DRS or vMotion will need its data to follow.  And that’s before you consider the data management headaches that could result by trying to hand place the right data on the right node all the time.

Hogwash, in my book.

VSAN creates a cluster resource that is uniformly accessible by all VMs.  DRS and/or vMotions don’t affect application performance one bit.  Thankfully, the competitor dropped that particular red herring and went on to other things.

A related thread was the potential attractiveness of client-side caching vs. VSAN's server-side caching.  A 10Gb network is plenty fast, and by caching only one copy of read data cluster-wide, the server-side approach is far more space-efficient, and thus there's a much greater likelihood that a read request will be served from cache rather than disk.   Our internal benchmarks continually validate this design decision.
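Here's the space-efficiency argument in rough numbers, using a deliberately crude model and placeholder figures: one cluster-wide cached copy of each hot block covers far more of the working set than per-client caches that each hold their own duplicates.

```python
def working_set_coverage(total_cache_gb: float, working_set_gb: float,
                         copies_per_block: int) -> float:
    """Fraction of the hot working set that fits in cache when every cached
    block is held 'copies_per_block' times across the cluster. Crude model."""
    effective_cache_gb = total_cache_gb / copies_per_block
    return min(1.0, effective_cache_gb / working_set_gb)

total_cache = 8 * 400        # e.g. 8 nodes x 400 GB of read cache (placeholder)
working_set = 4000           # 4 TB of hot data (placeholder)

print(working_set_coverage(total_cache, working_set, copies_per_block=1))  # 0.8 (one shared copy)
print(working_set_coverage(total_cache, working_set, copies_per_block=4))  # 0.2 (duplicated per client)
```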

A more recent thread came from the networking crowd, who weren't happy that VSAN uses multicast for cluster coordination, e.g. for all nodes to stay informed of current state.  It's not used to move data.

Do we have cases where customers haven’t set up multicast correctly?  Yes, for sure.  Once it gets set up correctly, does it usually run without a problem?  Yes, for sure.  

There was also the predictable warning that VSAN 5.5 was a "1.0" product, which was essentially a true statement.  That being said, I've been responsible for bringing dozens of storage products to market over the decades, and — from my perspective — there was very little that was "1.0" about it.

And I’ve got the evidence to back it up.

Perhaps the most perplexing thread was the specter of the dreaded “lock in” resulting from VSAN usage.  To be fair, most external storage arrays support all sorts of hypervisors and physical hosts, and Virtual SAN software is for vSphere, pure and simple.

Enterprise IT buyers are quite familiar with products that work with some things, but not others.  This is not a new discussion, folks.  And it seems like the vSphere-only restriction is AOK for many, many people.

The Big Picture?

Using software and commodity server components to deliver shared storage services is nothing new in the industry.  We've been talking about this in various forms for many, many years.

But if I look back, this was never an idea that caught on — it was something you saw people experimenting with here and there, with most of the storage business going to the established storage array vendors.

With VSAN’s success, I’d offer this has started to change. 

During 2014, VSAN has proven itself to be an attractive, mainstream concept that more and more IT shops are embracing with great success.

And with VSAN 6.0, it just gets even more compelling.


The New Cloud Supply Chain

I admit I had my obsession with All Things Cloud back in the day.

Like so many others, I found the industry move to cloud fascinating on so many levels: new technology models, new operational models, new application models, new consumption models, etc.  

I wrote endless, lengthy blog posts attempting to explore every nook and cranny.  Even to this day, the topic continues to intrigue me.

One of the things I spent much time considering was what I dubbed the "cloud supply chain".  

As an example, supply chains in the physical world are responsible for transforming raw materials into finished goods we all consume.   Every company along the way specializes in what it does best, and cooperates with others who are good at other things.  It's rare to see a single company responsible for everything from raw materials to customer service.

Cloud services should be no different, I thought.

Specialized players -- each with different strengths -- could and should combine into supply chains to create more value than any single player alone.

Today, VMware's vCloud Air service announced a strategic partnership with Google's Cloud Platform Services.  Customers of vCloud Air can now use select Google Cloud Platform services relatively transparently: with a single contract, and a single point of support.

It's a great example of a cloud supply chain -- but will we see more? 

And the bigger question, still not answered: what will be the dominant industry model that serves enterprise IT?

The News

You can go read all about it here, here and here -- but if you're pressed for time: vCloud Air customers can now use selected Google Cloud Platform services directly from vCloud Air, using their existing contract and support agreements.

The services announced under this agreement include:

• Google Cloud Storage – a distributed, low-cost object storage service
• Google BigQuery – a real-time analytics service suitable for ad-hoc business intelligence queries across billions of rows of data in seconds
• Google Cloud Datastore – a schema-less NoSQL database service
• Google Cloud DNS – a globally-distributed, low-latency DNS service

As VMware and Google typically sell to different customers, both parties clearly benefit.  VMware gets the benefit of Google's services for vCloud Air, and Google gets a convenient route to market to enterprise IT.   Customers get the convenience of a single provider selling and supporting an expanded set of services.

Why This Is Interesting To Me

In one sense, I see this as a battle between strategic alternatives.

Amazon's basic message is "we can do it all for you".  There is no notion of a cloud services supply chain here: Amazon produces, sells and delivers all of the cloud services in its portfolio.  And, historically, this has done well for them.

On the other hand, there are cloud service providers such as VMware and Google, whose message is "here is what we do well, here is what our partners do well, and we'll make it easy for you to use it all together".

Without getting into a debate as to which approach is "better", an argument can be made for the most likely long-term successful model: a cloud services supply chain.  Why?  As in the physical world, specialization and focus matters.  It's very hard for any company -- regardless of size -- to service every potential customer with everything they might need -- and to be really good at every aspect.

Why should cloud services be any different?

Where You Are In The Supply Chain Matters

If you look at supply chains in the physical world, you'll quickly realize the most influential position is being closest to the ultimate consumer or customer.

For example, big retailers (e.g. Walmart) have largely dictated supply chain terms to consumer goods manufacturers for decades.

When it comes to delivering enterprise IT services, being close to the customer really matters.  So much of enterprise IT is "high touch" -- skilled people on both sides of the table working towards a common goal.

In the case of VMware, there's more than that.  VMware products define a data center computing experience: the technology as well as the operational model.  The design goal for vCloud Air is to replicate and improve that familiar experience using a cloud services model.  Put differently, VMware is not only close to the customer, they're close to the operational model that much of IT is using today.

And, thus, I would argue occupying a fortunate spot in the cloud services supply chain market that is starting to emerge.

What We Might See In The Future

I recently succumbed to upgrading my iPhone from a battle-scarred 4S to a new iPhone 6.  The migration?  Completely painless.  The user experience?  Utterly familiar but with a few new cool features.

Just what I wanted -- exactly what I was familiar with, but better.

I never really considered an Android-based phone, or any other alternative.  Too much hassle, and I already liked what I had.  Based on Apple's recent quarterly results, it seems I was not alone in my thinking.

I can make a strong argument that as enterprise IT groups start to adopt more cloud services, many will be looking for the same thing: what they're familiar with, only better.  

And an awful lot of IT professionals are familiar with VMware products and the experience they deliver.


2015 — The 3rd Platform Gets Real For IT

I used to regularly do my list of New Year predictions.  My success rate has been reasonable, but this year is different.

Why?  Because this year, I believe there is one vastly important trend that will begin to drive more change across the IT landscape than all the other possible candidates combined.

And that driving force is the 3rd platform — and the new breed of applications it supports.

We’ve all been talking about it for a few years.  It’s not a contentious discussion, although it's been rather abstract for many.  But, in 2015, it shows every sign of getting very real for many more IT groups.

The required ingredients are now in place.  The spark is beginning to ignite the mixture.  And the changes should come very quickly as a result.

Like a meteor hitting the earth — the IT world is going to look very different before too long.

IT Reality 101

Those of us buried in the trenches can lose sight of a fundamental truth: IT exists solely for the purpose of delivering the applications people want to use.

When the desired application model changes, the IT world is forced to change around it.

Quite often, you read a rant from an infrastructure person (network, storage, security, etc.) about how darn inconvenient it can be when applications demand certain things outside the established historical norm.

My advice?  Now would be a good time to fasten your seat belts …

The Last Wave

Once you’ve seen an industry tsunami, you have an idea of what to look for.  I had the privilege of seeing one up close and personal.

I entered the workforce when mainframes and minicomputers were king.  My newly-minted UNIX and C skills were not in demand yet -- but COBOL, JCL and MVS were certainly marketable.

Most applications were monolithic — green screens anyone?  Networks existed primarily to connect terminals to servers.  Storage was all direct-attached.

The first ingredient behind the demise of the first platform was a new form of low-cost computing — UNIX running on microprocessors.  The combination delivered an entirely different cost structure than what came before.  And with it, a new management and operations style that was more agile and responsive than the previous mainframe world.  

All you needed was the root password :)

At roughly the same time, the preferred user experience was starting to change as well — PCs were quickly finding their way onto most knowledge workers' desktops.

These two ingredients set the stage for a dramatic shift in the desired application model — from traditional monolithic to a more progressive client-server approach.  

Once client-server applications became the preferred approach, the resulting industry changes started to come fast and furiously.

An entirely new IT ecosystem emerged, with new players.  People needed local area networks and routers, and Cisco ended up doing quite well.  Mainframe databases gave way to UNIX-based (and later NT-based) products, with Oracle coming out on top.  Microsoft owned the user experience, and leveraged that to establish a valuable position in server operating systems, quickly ending NetWare's reign.

And much more.

Looking back, it can be hard to appreciate just how dramatically things changed during this period.  Long gone are most of the mainframe and minicomputer titans of yore.  The application vendors who thrived in the monolithic world are largely no longer with us.  While some of the legacy skills are still in demand, the market continues to dwindle.

When a new application paradigm hits, the structural changes come quickly.

And now a new one is upon us.

The New Wave

At a high level, the two required ingredients we saw previously are clearly evident, although slightly different.

A new form of efficient and cost-effective computing is becoming popular.  Combine commodity components with a cloud operational model, and you're playing at an entirely new level of efficiency and flexibility compared to what came before.

Whether you decide to label it public cloud, private cloud, hybrid cloud, converged, hyper-converged, etc. is up to you — the central thought is there is a new model associated with infrastructure and operations, and it’s very different than the old one.

Needless to say, the preferred user experience has now permanently changed to mobile for many use cases. “It doesn’t work well on my iPhone” can be a death knell for so many applications these days.

Yes, we’ll still have desktops and web browsers and all that, but that’s not the dominant investment model any more.  Slick mobile experiences can trump all alternatives, so — in the long term — this is what people will generally prefer.

Digging deeper, there's a fascinating world of data fabrics, analytical tools, app frameworks, etc. that gather, analyze and help people act on data in entirely new ways.

It is important to note that the majority of 3rd platform applications are revenue-generating, strategic — or both.  So they typically come with a lot of firepower, and can thus drive a lot of change very quickly.

The Spark That Ignites?

An air-gasoline mixture can be quite stable — until a spark is present.  The ingredients for the 3rd platform have been around for a while: cloud, mobile, big data, tools, frameworks, etc. — so it is fair to ask: what will ignite it?  And why now?

For me, the answer is simple yet subtle: a fundamental change in business strategy.

Most company leadership teams are now frantically re-imagining themselves as digital entities rather than physical ones.  Every new business process, offering, service, etc. is now thought of digitally.

And this new digital business model demands a new application factory to create an endless stream of new, modern software widgets.

It’s important to note: this is not an IT thing, this is a business thing.  Actually, a digital business thing.

If I look at 2014, the pace of customers I met who were clearly shifting to a digital business model picked up rapidly.  As 2014 came to a close, it became commonplace.  I would only expect this to accelerate through 2015.  Once a critical mass is achieved, the tipping point is passed, and the ball starts to roll downhill very quickly.

The Prediction

So I’m willing to go out on a limb and flatly state that — in 2015 — the 3rd platform starts to clearly move from powerpoint to production — driven by a new generation of applications that are needed to power modern digital business models.

Few will be left untouched: IT vendors, IT professionals — just about everyone in our cozy world. And by the end of 2015, it should be clear to most everyone involved that "that was then, this is now".

When critical applications are king, all others must obey.

We in the IT infrastructure world sometimes have difficulty internalizing this harsh reality, especially if it’s inconvenient to our traditional ways of doing things.  For example, I remember clearly the mainframe guys thinking UNIX was absolute heresy, not robust enough, inefficient, etc.

To be clear, the legacy world doesn’t disappear overnight.  But all the serious attention will turn to the new 3rd platform and its new requirements.  The 2nd platform soldiers on, (as does the 1st platform) but it won’t be the investment center any more.

Why Software Defined Really Matters Now

Much as a digital new business model drives the need for the 3rd platform, those 3rd platform applications will drive the need for software-defined infrastructure and services in all of its flavors.

One of the central premises of software-defined anything (networks, storage, security, etc.) is that — by abstracting functionality from hardware — it can place all aspects of infrastructure services under consistent programmatic control if desired.

Services can be dynamically composed without manual human involvement.

Based on observations to date, 3rd platform applications are sprawling, dynamic beasts with multiple morphing components and associated hard-to-predict behavior.  Things move fast, and change all the time.

Whether supporting infrastructure services are directly under application control, or controlled by a next-generation management framework (e.g. devops), there’s no room for the previous technology management model.  Every aspect must be under software control, and automated as much as possible.

Just like mainframe management techniques didn't work for the new world of UNIX-based client-server, I'm arguing that our familiar silos of manual-intensive service delivery and management just won't work for the new world of 3rd platform applications.  They are too slow, too inflexible and too expensive.

As a result, the growing wave of (revenue-generating and strategic) 3rd platform applications will be a forcing function for the rapid adoption of software-defined management techniques.

Consider modern aviation: highly automated, with humans providing exception management as needed.  Every aspect of a modern airliner is entirely under software control.  Yes, the model is not without its occasional shortcomings, but there's no going back.

How This Typically Plays Out

I’ve been in meetings that have gone something like this …

On one side of the table is the application lead with their business partner.  They're on a mission to stand up an Important New Thing.  Maybe it's a big data thing, or a new mobile application, or perhaps something similar.

They are using new tools and new agile techniques to iterate fast and learn as they go. They have adequate funding and executive support.  You don’t want to say “no” to these people.

On the other side of the table is the infrastructure team(s): server, network, storage, security.  They have their established ways of doing things, aimed at predictable delivery of services for all applications, and not just the new thing.

They are justifiably concerned that there are no well-defined requirements, things can change at a moment’s notice, most of their questions can’t be answered, what they want to do isn’t compatible with existing processes, etc.

Anxious looks all around.

A compromise is typically reached: a dedicated environment defined by the app team, supported by the infrastructure team.  Call it a sandbox, a private cloud — whatever.

Here is where the software-defined technology is finding its beachhead today.  Here is where application developers and devops people will collaborate to continually improve the effectiveness of operations.

The presumption — of course — is that every aspect of IT infrastructure delivery is under software control: dynamically composed and abstracted from specific hardware implementations.

History Does Repeat Itself

When UNIX and client-server first appeared in data centers, they were managed off to the side by a dedicated team. That team had a handful of applications it was responsible for, and everyone worked closely together to deliver the results.

Application and business owners liked what they saw.  Things got done faster, more flexibly — and more cost-effectively — than working with the mainframe team.  Relatively quickly, the UNIX footprint expanded, and the mainframe footprint declined.

One of the historical turning points: when SAP changed the way that companies ran their business operations, it usually ended up on a UNIX platform, and not a mainframe.

As we enter 2015, it’s happening again.  Many private clouds are already up and running in one form or another.  New 3rd platform applications are starting to arrive fast and furiously, and they generally end up on a platform tailored for their business requirements: fast, flexible and very automatable.

Their footprint will grow, and rapidly.  They will be thought of separately at first, but — before long — they will become the new core of the application landscape.  And they will drive the software-defined model as an integral component.

I suppose we’ll have to wait until 2016 to learn whether this is the year that we all feel the shift in our industry.
