Hyper-Converged Infrastructure

Hyper-converged infrastructure is a software-centric architecture that tightly integrates compute, storage and virtualization resources in a single system that usually consists of x86 hardware.

Modern businesses rely on the data center to provide the computing, storage, networking and management resources that are necessary to host vital enterprise workloads and data. But data centers can be notoriously complex places where a multitude of vendors compete to deliver myriad different devices, systems and software. This heterogeneous mix often struggles to interoperate -- and rarely delivers peak performance for the business without careful, time-consuming optimizations. Today, IT teams simply don't have the time to wrestle with the deployment, integration and data center management challenges posed by traditional heterogeneous environments.

The notion of convergence originally arose as a means of addressing the challenges of heterogeneity. Early on, a single vendor would gather the systems and software of different vendors into a single pre-configured and optimized set of equipment and tools that was sold as a package. This was known as converged infrastructure, or CI. Later, convergence vendors took the next step to design and produce their own line of prepackaged and highly integrated compute, storage and network gear for the data center. It was an evolutionary step now called hyper-converged infrastructure, or HCI.

Converged and hyper-converged infrastructures are possible through a combination of virtualization technology and unified management. Virtualization allows compute, storage and networking resources to be treated as pooled resources. Unified management allows all those resources to be discovered, organized into pools, divided into performance tiers and then seamlessly provisioned to workloads regardless of where those resources are physically located. Unified management is a major improvement over traditional heterogeneous data center environments that might rely on multiple disparate management tools, which often don't discover or manage all resources.
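
The discover, pool, tier and provision cycle described above can be illustrated with a short Python sketch. It is not any vendor's management API: the Resource and ResourcePool classes, the device names and the capacity units are all hypothetical, and real discovery is automatic rather than a manual call.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str        # hypothetical device identifier
    kind: str        # "compute", "storage" or "network"
    capacity: int    # abstract capacity units
    tier: str        # performance tier, e.g. "fast" or "standard"
    location: str    # physical location, never used when provisioning
    allocated: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.allocated


class ResourcePool:
    """Pools discovered resources and provisions them by kind and tier."""

    def __init__(self) -> None:
        self._resources = []

    def discover(self, resource: Resource) -> None:
        # Stand-in for automatic discovery: register one device.
        self._resources.append(resource)

    def free_capacity(self, kind: str, tier: str) -> int:
        return sum(r.free for r in self._resources
                   if r.kind == kind and r.tier == tier)

    def provision(self, kind: str, tier: str, amount: int) -> dict:
        """Allocate `amount` units from any matching devices."""
        if amount > self.free_capacity(kind, tier):
            raise ValueError(f"not enough {tier} {kind} capacity")
        grants = {}
        remaining = amount
        for r in self._resources:
            if remaining == 0:
                break
            if r.kind == kind and r.tier == tier and r.free > 0:
                take = min(r.free, remaining)
                r.allocated += take
                grants[r.name] = take
                remaining -= take
        return grants


pool = ResourcePool()
pool.discover(Resource("node-a-ssd", "storage", 100, "fast", "rack 1"))
pool.discover(Resource("node-b-ssd", "storage", 100, "fast", "rack 7"))
pool.discover(Resource("node-a-hdd", "storage", 500, "standard", "rack 1"))

# The request names a kind and a tier, not a device or a rack.
grants = pool.provision("storage", "fast", 150)
print(grants)
```

The request is satisfied from two devices in different racks, which is the point of pooling: the workload asks for a kind and a tier of resource, and the management layer decides where it physically comes from.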

Today, the combination of virtualized hardware and associated management tooling is often treated as a standalone appliance that can operate as a single, complete subsystem in the data center, or be combined with other HCI appliances to quickly and easily scale up a hyper-converged infrastructure deployment.

Let's take a closer look at hyper-converged infrastructure technology, consider its use cases and implementation, evaluate its tradeoffs, examine some current vendors and product offerings, and look ahead to the future of the technology.

How does hyper-converged infrastructure work?

Too often, eclectic mixes of hardware from varied vendors are tied together with inadequate networking gear and prove impossible to provision and manage through a single tool. The result is almost always a hodgepodge of diverse gear and software that causes confusion, oversights, needless firefighting and wasted time for IT administrators.

Hyper-converged infrastructure is founded on the two essential premises of integration and management, which arose as a means of solving two of the most perplexing problems of traditional heterogeneous data centers: suboptimal performance and fractured, problematic systems management. The goal of HCI is to deliver virtualized, scalable compute, storage and network resources that are all discoverable and managed through a single platform.

Evaluate hyper-converged infrastructure use cases, benefits and challenges before adoption.

Beyond that basic premise, however, there are numerous variations and options available for hyper-converged infrastructure. It's important to understand the most common considerations found in HCI technology.

Hardware or software deployment. Hyper-converged infrastructure can be implemented through hardware or software.

Hardware deployment. HCI technology arose as a hardware platform that puts compute, storage -- and sometimes network -- resources into a dedicated device often referred to as an appliance. Hardware-based HCI enables high levels of integration and optimization, which can vastly enhance performance in vital areas, such as storage-to-CPU data transfers. Hardware HCI can be ideal when high performance is important for workloads -- such as real-time data analytics tasks -- and when hardware modularity and scalability are important for the business. For example, hardware HCI allows a business to easily add new HCI appliances as needed without concern over management software compatibility and support. However, hardware-based HCI tends to be proprietary to the vendor, raising costs and risking some amount of vendor lock-in.
Software deployment. HCI can also be implemented as a software layer that is intended to discover, virtualize and manage existing hardware components. The software approach allows a business to gain HCI benefits without the need for extensive new hardware investments. The disadvantage here is that existing hardware devices -- including servers and storage subsystems -- won't benefit from the tight integration and optimizations found in hardware-based HCI. The software approach can also require architectural modifications and impose new monitoring demands that the HCI software layer might not meet. Thus, software-based HCI can be more complicated for businesses to implement and maintain.

Integrated or disaggregated. Hyper-converged infrastructure can follow two different approaches in terms of hardware design.

Integrated HCI. The integrated approach is more traditional, where an HCI appliance contains a balanced mix of compute resources, including processors, memory and storage. Each appliance is termed a node, and an HCI deployment can be scaled up to include numerous nodes. Integrated HCI hardware is easy to understand and offers high performance -- everything is in the same box. But resources are finite, and workloads usually don't utilize resources evenly. When a node runs short of a resource -- such as CPUs -- a business would need to add an entirely new node, even though the other resources in the new node -- such as the memory and storage in this example -- might not yet be used or needed. This poses the potential for wasted investment.
Disaggregated HCI. Hyper-converged infrastructure has more recently started adopting a disaggregated architecture. Rather than putting CPUs, memory and storage all together in the same box -- appliance -- the idea is to provide different resource components in different modules. Thus, a disaggregated HCI hardware deployment would put CPUs and memory in one compute box, and storage in a separate storage box. All the different resource modules are tied together across a network. Although it's technically possible to also separate CPUs and memory for greater disaggregation, it's better for performance to keep CPUs and memory together in the same device, as CPUs and memory are tightly coupled in a compute environment. Disaggregated approaches promise additional flexibility, allowing the business to focus HCI investments on the resources that are most beneficial. Several HCI vendors have disaggregated HCI offerings.
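
The tradeoff between the two designs comes down to simple arithmetic, sketched here in Python. The workload demand and the node sizes are hypothetical figures chosen only to show the effect; they don't describe any real product.

```python
import math


def integrated_nodes(need_cores, need_tb, node_cores, node_tb):
    """Nodes required when every node carries both CPU and storage."""
    return max(math.ceil(need_cores / node_cores),
               math.ceil(need_tb / node_tb))


def disaggregated_nodes(need_cores, need_tb, compute_cores, storage_tb):
    """(compute nodes, storage nodes) when each type scales on its own."""
    return (math.ceil(need_cores / compute_cores),
            math.ceil(need_tb / storage_tb))


# Hypothetical CPU-heavy workload: 400 cores but only 60 TB of storage.
need_cores, need_tb = 400, 60

# Hypothetical node sizes: 32 cores and 20 TB per integrated node,
# or 32 cores per compute node and 20 TB per storage node.
n_integrated = integrated_nodes(need_cores, need_tb, 32, 20)
stranded_tb = n_integrated * 20 - need_tb
n_compute, n_storage = disaggregated_nodes(need_cores, need_tb, 32, 20)

print(f"integrated: {n_integrated} nodes, {stranded_tb} TB of storage unused")
print(f"disaggregated: {n_compute} compute nodes + {n_storage} storage nodes")
```

With these numbers, the integrated design needs 13 nodes to satisfy the CPU demand and leaves 200 TB of purchased storage idle, while the disaggregated design needs the same 13 compute nodes but only 3 storage nodes.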

Compare and contrast traditional IT vs. HCI vs. dHCI vs. composable infrastructure.

Deployment. Hyper-converged infrastructure is usually regarded as a disruptive technology -- it typically displaces existing data center hardware. Anytime that HCI is being introduced to the data center, it's important to consider how that technology will be implemented or operated. There are basically three ways to add HCI to a traditional heterogeneous data center.

Full replacement HCI deployment. The first option is a complete replacement of the traditional environment with a hyper-converged infrastructure product. In practice, this is probably the least-desirable option because it poses the maximum possible displacement of hardware -- as well as the highest potential costs. Few organizations have the capital or technological need for such an undertaking. It's more likely that a new HCI deployment will be adopted for greenfield projects, such as a second -- backup or remote -- data center construction or other new build, using a reference architecture where equipment capital can be invested in HCI without displacing existing hardware.
Side-by-side HCI deployment. The second -- and far more palatable -- approach is a side-by-side deployment where an HCI platform is deployed in the existing data center along with traditional heterogeneous infrastructure. This approach allows businesses to migrate workloads to HCI over time and can be performed over the long term. Displaced hardware can be repurposed or decommissioned in smaller, more manageable portions. It is likely that HCI will run in tandem with traditional infrastructures over the long term, and the two can easily coexist.
Per-application HCI deployment. The third approach also brings hyper-converged infrastructure into the existing data center environment. But rather than migrate existing workloads to the new infrastructure, the HCI is intended only to support specific new applications or computing initiatives, such as a new virtual desktop infrastructure deployment or a new big data processing cluster; previous workloads are left intact on the existing infrastructure.

Why is hyper-convergence important?

Heterogeneous data centers evolved because every enterprise has computing problems that must be solved, but the answer is rarely the same for every company or every problem. The promise of heterogeneity frees an enterprise to choose between low-cost product options and best-of-breed ones -- and everything in between -- all while keeping the data center largely free of vendor lock-in.

But heterogeneity has a price. Constructing an effective heterogeneous data center infrastructure requires time and effort. Hardware and software must be individually procured, integrated, configured, optimized -- if possible -- and then managed through tools that are often unique to a vendor's own products. Thus, managing a diverse infrastructure usually requires expertise in multiple tools that IT staff must master and maintain. This adds time and integration challenges whenever the infrastructure has to be changed or scaled up. Traditional heterogeneous IT simply isn't all that agile.

Today, business changes at a much faster pace. IT must respond to the demands of business much faster, provisioning new resources for emerging workloads on demand and adding new resources often just in time to keep enterprise applications running and secure. Yet IT must also eliminate systems management errors and oversights that might leave critical systems vulnerable. And all of this must be accomplished with ever-shrinking IT budgets and staff. Hyper-converged infrastructure is all about deployment speed and agility.

HCI draws on the same benefits that made homogeneous data center environments popular: single-vendor platforms that ensured compatibility, interoperability and consistent management, while providing "one vendor's throat to choke" when something went wrong. But HCI goes deeper to deliver compute, storage and network resources that are organized using software-defined and virtualization technologies. The resources are tightly integrated and pre-optimized, and the hardware and software are packaged into convenient appliances -- or nodes -- which can be deployed singly to start and then quickly and easily scaled out as resource demands increase.

In short, HCI products are basically data centers in a box. If a business needs more data center, just add more boxes. But the appeal of HCI extends beyond the data center. The compact, highly integrated offerings are easily installed and can be managed remotely, and HCI technology has become important for remote office/branch office (ROBO) and edge computing deployments.

As an example, consider a typical big data installation where petabytes of data arrive from an army of IoT devices. Rather than rely on a network to send raw data back to a data center for processing, the data can be collected and stored locally at the edge -- where the data originates -- and an HCI deployment can readily be installed at the edge to process and analyze the raw data there, reducing network traffic congestion by sending only the resulting analysis to the main data center.
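
A minimal Python sketch shows why processing at the edge reduces traffic. The sensor names, the readings and the choice of summary statistics are all made up for illustration; a real pipeline would choose its own aggregation.

```python
import json
import statistics


def summarize(readings):
    """Reduce raw sensor readings to a compact per-sensor summary."""
    by_sensor = {}
    for reading in readings:
        by_sensor.setdefault(reading["sensor"], []).append(reading["value"])
    return {
        sensor: {
            "count": len(values),
            "mean": statistics.fmean(values),
            "max": max(values),
        }
        for sensor, values in by_sensor.items()
    }


# Hypothetical raw data collected at the edge: 3 sensors, 1,000 readings each.
raw = [{"sensor": f"s{i % 3}", "value": float(i % 50)} for i in range(3000)]

summary = summarize(raw)

# Compare what would cross the network in each case.
raw_bytes = len(json.dumps(raw).encode())
summary_bytes = len(json.dumps(summary).encode())
print(f"raw payload: {raw_bytes} bytes, summary payload: {summary_bytes} bytes")
```

Only the summary needs to travel to the main data center. The raw readings can stay on the edge system for as long as local retention policy requires.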

Why companies are moving to hyper-convergence

In 2020, the hyper-converged infrastructure market was generating about $2 billion in sales per quarter. This tremendous investment has taken HCI from a niche or SMB platform to a viable enterprise alternative. Although HCI might not be ideal for all workloads, HCI is able to tackle a greater range of applications and use cases than ever before.

HCI started as a point platform -- a means of simplifying and accelerating modest IT deployments in ROBO, as well as a limited number of enterprise-class environments, such as VDI. Early on, large businesses used HCI to support mission-specific goals separate from the main production environment, which could be left alone to continue doing the heavy lifting.

Today, HCI offerings benefit from the radical improvements that have taken place in processors, memory and storage devices, as well as dramatic advances in software-defined technologies that redefine how businesses perceive and handle resources and workloads. Examples include the following:

HCI support for container clusters. Vendors such as Dell EMC/VMware, Nutanix and Cisco now offer HCI configurations optimized for popular container software, such as Kubernetes.
Support for machine learning and deep learning algorithms. The ability to support a huge volume of scalable containers makes HCI a natural fit for machine learning and artificial intelligence workloads, which demand enormous numbers of compute instances.
The emergence of streaming data analytics. Streaming analytics is an expression of big data, allowing an HCI system to ingest, process and report on data and metrics collected from a wide array of sources in real time. Such analytics can be used to yield valuable business insights and predict impending problems or faults.

Hyper-converged infrastructure has had a profound effect on edge computing. Today's unparalleled proliferation of IoT devices, sensors, remote sites and mobile accessibility is demanding that organizations reconsider the gathering, storing and processing of enormous data volumes. In most cases, this requires the business to move data processing and analysis out of the central data center and relocate those resources closer to the source of the data: the edge. The ease and versatility provided by HCI offerings make remote deployment and management far easier than with traditional IT infrastructures.

Finally, the speed and flexibility of HCI have made it well-suited to rapid deployment, and even rapid repurposing. The emergence of the COVID-19 pandemic forced a vast number of users to suddenly work from home, which in turn forced organizations to quickly deploy additional resources and infrastructure to support the business computing needs of those remote users. HCI systems have played a notable role in such rapid infrastructure adjustments.

There are several business drivers pushing IT shops to adopt HCI technology.

Hyper-converged infrastructure vs. converged infrastructure

Today's hyper-converged infrastructure technologies didn't spring into being overnight. The HCI products available today are the result of decades of data center -- and use case -- evolution. To appreciate the journey, it's important to start with traditional data center approaches where compute, storage and network equipment were all selected, deployed and usually managed individually. The approach was tried and true, but it required careful integration to ensure that all of the gear would interoperate and perform adequately -- optimization, if possible at all, was often limited.

As the pace of business accelerated, organizations recognized the deployment and performance benefits of integration and optimization. If it were possible to skip the challenges of integrating, configuring and optimizing new gear, deployments could be accomplished faster and with fewer problems.

This gave rise to the notion of convergence, enabling vendors to create sets of server, storage and network gear that had already been prepackaged and pre-integrated, and were already validated to function well together. Although converged infrastructure was basically packaged gear from several different vendors, the time-consuming integration and optimization work had already been accomplished. In most cases, a software layer was also included, which could manage the converged infrastructure products collectively -- basically providing a single pane of glass for the CI package.

Eventually, vendors realized that convergence could provide even greater levels of integration and performance by forgoing multiple vendors' products in favor of a single-vendor approach that combined compute, storage and network components into a single product. The concept was dubbed hyper-convergence and led to the rise of hyper-converged infrastructure.

HCI products are often characterized by a modular architecture in which compute, storage and network components are built as modules that are installed into a specialized rack. The physical blade form factor has proved extremely popular for HCI modules and racks, enabling rapid installation and hot swap capabilities for modules. More compute, storage and network blades can be added to the blade rack. When the rack is filled, a new rack can be installed to hold more modules -- further scaling up the deployment. From a software perspective, the HCI environment is fully virtualized and includes unified management tools to configure, pool, provision and troubleshoot HCI resources.

HCI 1.0 vs. HCI 2.0

Hyper-converged infrastructure continues to evolve, expressing new features and capabilities while working to overcome perceived limitations and expand potential use cases. Today, there is no commonly accepted terminology to define the evolution of HCI, but the technology is colloquially termed HCI 1.0 and HCI 2.0. The principal difference in these designations is the use of disaggregation.

The original premise of HCI was to provide tightly integrated and optimized sets of virtualized CPU, memory, storage and network connectivity in prepackaged nodes. When more resources were needed, it was a simple matter to just add more nodes. Unified management software discovered, pooled, configured, provisioned and managed all the virtualized resources. The point here was that hyper-converged infrastructure relied on the use of aggregation, putting everything in the same box, which could be deployed easily and quickly. It's this underlying use of aggregation that made HCI 1.0 products so appealing for rapid deployment in ROBO and edge use cases.

The major complaint about HCI 1.0 products concerns workload resource use and the potential for resource waste. A typical HCI product provides a finite and fixed amount of CPU, memory and storage. The proportion of those resources generally reflects more traditional, balanced workloads. But workloads that place uneven or disproportionate demands on resources can ultimately exhaust some resources quickly -- forcing the business to add more costly nodes to cover resource shortages, yet leave the remaining resources underutilized.

Disaggregation is increasingly seen as a potential answer to the problem of HCI resource waste. The introduction of disaggregated hyper-converged infrastructure -- dHCI or HCI 2.0 -- essentially separates compute resources from storage. HCI 2.0 puts CPU and memory in one device, and storage in another device, and both devices can be added separately as needed. This approach helps businesses target the HCI investment in order to support less-traditional workloads that might pose more specific resource demands.

But the evolution of HCI has not stopped with disaggregation, and the HCI industry is starting to embrace the notion of composable infrastructure. In theory, a composable infrastructure separates all resources into independently scalable components, which can be added as needed and interconnected with a specialized network fabric that supports fast, low-latency communication between components. At the same time, unified management software continues to discover, pool, configure, tier, provision and manage all of the resources available.

Today, composable infrastructure is still far from such an ideal scenario, but vendors are starting to deliver HCI devices with greater versatility in hardware selection and deployment -- such as allowing other non-vendor storage to be added. The key to success in any composable infrastructure is management software that must focus on resource discovery, pooling, tiering -- organizing resources based on their relative level of performance -- and almost total dependence on software-defined behaviors to provision resources.

Hyper-converged infrastructure and the cloud

It's easy to confuse HCI and cloud technology. Both rely on virtualization and a mature software management layer that can define, organize, provision and manage a proliferation of hardware resources that enterprise workloads can operate within. Although HCI and cloud can interoperate well together, they aren't the same thing, and there are subtle but important differences to consider.

HCI is fundamentally a centralized, software-driven approach to deploying and using a data center infrastructure. The underlying hardware is clearly defined, and the amount of hardware resources is finite. Virtualization abstracts the resources, while software organizes and defines the ways resources are provisioned to workloads.

A cloud is intended to provide computing as a utility, shrouding vast amounts of virtualized resources that users can provision and release as desired through software tools. The cloud not only provides a vast reservoir of resources, but also a staggering array of predefined services -- such as load balancers, databases and monitoring tools -- that users can choose to implement.

Essentially, the difference between HCI and cloud is the difference between hardware and software. HCI is merely one implementation of hardware that can be deployed in a data center. A cloud is really the software and constituent services -- the cloud stack -- built to run atop the available hardware. Thus, an HCI deployment can be used to support a cloud, typically a private cloud or a private cloud integrated as part of a hybrid cloud, aiding in digital transformation. Conversely, a cloud software stack will run on an HCI deployment within the data center.

HCI benefits and drawbacks

Hyper-converged infrastructure might not be appropriate for every IT project or deployment. Organizations must take the time to evaluate the technology, perform proof-of-concept testing and carefully evaluate the tradeoffs involved before committing to HCI.

The main advantage of HCI is simplicity, and the notion of simplicity expresses itself in several different ways. For example, the modular nature of an HCI offering simplifies deployment. Installation time is vastly reduced, and the time needed to configure the installed system and make resources available to workloads is also potentially shorter than traditional hardware deployments. When additional resources are needed, it's a simple matter to install another HCI node, and HCI vendors often offer several different node types -- such as nodes with additional storage or additional compute -- to suit different workload needs. Typical integration problems and optimization challenges are significantly reduced because the HCI system is designed from the ground up for interoperability and optimization.

Simplicity also expresses itself in terms of faster and more efficient HCI management. The use of a unified management platform ensures that all HCI resources are discovered, pooled, configured properly -- following the organization's preferred guidelines for security and business process -- and provisioned efficiently, with all resources visible and monitored. HCI systems lend themselves well to automation in provisioning and maintenance, which is one reason that HCI has become popular with private cloud, VDI and other types of IT projects that benefit from automation. This can be more effective than heterogeneous environments, which might demand multiple management tools and can inadvertently overlook resources that are not properly discovered.

Ultimately, HCI deployment is easier and time to production is faster than with traditional heterogeneous deployments, allowing HCI to be deployed and supported by smaller businesses with a smaller IT staff and fewer technical skills. Such simplicity also translates into ongoing service and support. Because HCI comes from a single vendor, support is also part of the package, ensuring that one call to technical support should yield tangible solutions to pressing problems; there is no other vendor to blame. This also helps smaller and less skilled IT staff. Taken together, the simplicity and ease promised by HCI can lower the costs of deploying and operating an HCI.

But HCI also poses several potential drawbacks that deserve careful consideration. For example, scalability comes at the cost of vendor lock-in. It's a simple matter to add another node, but only if the new node comes from the same vendor. There are no open standards for physical or logical interoperability between nodes of different HCI systems or vendors. Conventional HCI -- HCI 1.0 -- systems also aggregate CPU, memory and storage into the same node, so buying more of one resource almost always means buying more of all resources -- even if other resources aren't currently needed -- which can be a waste of resources and capital investment. The push toward disaggregation -- HCI 2.0 or dHCI -- promises to help ease such waste.

HCI can pose problems with power density. Packing huge amounts of hardware into such relatively tight physical spaces can wreak havoc with power distribution systems and demand point cooling in data centers that have long eased power and cooling density concerns. This means HCI installation is simple only if power and cooling demands are met.

The scale of HCI is still small. Although this isn't an issue for many workloads, larger deployments that involve many compute servers and tens of terabytes of storage might be better served with separate compute and storage subsystems. This traditional approach more closely resembles disaggregation and allows demanding subsystems to be scaled and optimized separately. Large volumes of server or storage purchases can also benefit from economies of scale that HCI products can't provide.

Some high-end features of HCI -- such as high availability -- might not be available without additional purchases. For example, an HCI node isn't really redundant unless there's at least a second HCI node running parallel workloads alongside the first one. This represents additional investment that the organization might not be able to make, at least initially.
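
The redundancy point can be checked with a small Python sketch. The capacity units are abstract, and the function assumes that any workload can restart on any surviving node, which real clusters might not guarantee.

```python
def survives_node_loss(node_capacities, demand, failures=1):
    """Return True if the surviving nodes can still carry the full demand.

    Assumes the worst case (the largest nodes fail) and that workloads
    can restart on any surviving node. Both are simplifying assumptions.
    """
    if failures >= len(node_capacities):
        return False
    keep = len(node_capacities) - failures
    survivors = sorted(node_capacities)[:keep]  # drop the largest nodes
    return sum(survivors) >= demand


# Hypothetical capacities in abstract units.
print(survives_node_loss([100], demand=60))            # single node
print(survives_node_loss([100, 100], demand=60))       # second node added
print(survives_node_loss([100, 100], demand=150))      # two nodes, too busy
print(survives_node_loss([100, 100, 100], demand=150)) # third node added
```

A single node can never survive its own failure, and a two-node cluster is redundant only while total demand fits on one node. That spare headroom is the additional investment described above.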

One of the recurring drawback themes is HCI costs. HCI products are vendor-centric and typically carry a premium price tag because there is no unified interoperability between vendor offerings. Aggregated nodes can impose unwanted capital expenses for resources that might not all be needed. Software licensing and maintenance contract costs can drive up recurring HCI costs. And costs might also fluctuate with node types. For example, HCI vendors typically provide several different node options to offer various combinations of CPU, memory, storage and network connectivity to accommodate varied workload needs. Selecting nodes with high-end CPUs, non-volatile storage, 10 Gigabit Ethernet and other options can drive up the price tag for an HCI deployment.

HCI management and implementation

Although HCI brings an array of powerful benefits to the enterprise, there are also numerous management and implementation considerations that must be carefully evaluated and understood before an HCI investment is ever made.

One critical issue is resiliency. HCI simplifies deployments and operation, but the simplicity that users see hides tremendous complexity. Errors, faults and failures can all conspire to threaten critical business data. And while HCI offerings can support resiliency, the feature is never automatic, and it can require a detailed understanding of the HCI system's inner workings, such as write acknowledgement, RAID levels or other storage techniques used for resiliency.

Key hyper-converged appliance features include security and interoperability.

To understand the ways that an HCI offering actually handles data resiliency, IT leaders must evaluate the offering and consider how the system handles node or hardware failures, how data resiliency operates by default, how resiliency options affect workload performance and how much overhead resource capacity is used to provide resiliency.
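
The capacity overhead of resiliency can be estimated with simple arithmetic. The Python sketch below uses two generic schemes, full replication and erasure coding, as assumptions for illustration. The schemes a given HCI product actually offers, and their real overheads, must come from the vendor's documentation.

```python
def usable_with_replication(raw_tb, copies):
    """Usable capacity when every block is stored `copies` times."""
    return raw_tb / copies


def usable_with_erasure_coding(raw_tb, data_fragments, parity_fragments):
    """Usable capacity when data is split into data plus parity fragments."""
    return raw_tb * data_fragments / (data_fragments + parity_fragments)


raw_tb = 120  # hypothetical raw capacity across the cluster

# Each entry: (label, usable TB, simultaneous failures tolerated).
options = [
    ("2 copies", usable_with_replication(raw_tb, 2), 1),
    ("3 copies", usable_with_replication(raw_tb, 3), 2),
    ("4+2 erasure coding", usable_with_erasure_coding(raw_tb, 4, 2), 2),
]

for label, usable, tolerated in options:
    overhead = raw_tb - usable
    print(f"{label}: {usable:.0f} TB usable, {overhead:.0f} TB overhead, "
          f"tolerates {tolerated} failure(s)")
```

The calculation covers capacity only. It says nothing about the performance cost of each scheme, which is a separate item on the evaluation list and is best measured during proof-of-concept testing.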

When it comes to HCI systems management, users can often benefit from third-party management tools -- such as DataOn HCI systems' support for Windows Admin Center. When an HCI system exposes APIs, third-party tools and services can connect to it to provide a wider array of services or integrations.

HCI deployments and management also benefit from a clear use case, so it's important to understand any specific or tailored roles that an HCI system plays in the environment and manage those roles accordingly. For example, an HCI deployment intended for backup and DR is likely to be managed and supported differently than an HCI deployment used for everyday production workloads.

Take advantage of any automated policy management and/or enforcement that the HCI system provides. Emerging software-defined tools can enable network and policy management that helps to speed provisioning while enforcing best practices for the business, such as configuration settings adjustment and application security implementation. Such capabilities are often adept at reporting and alerting variations from policy, allowing businesses to maintain careful control over resource use and configurations.
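
A rough idea of what automated policy enforcement does can be sketched in Python. The policy keys, the limits and the workload configurations are hypothetical and don't correspond to any particular HCI product's settings.

```python
# Hypothetical policy; keys and limits are illustrative only.
POLICY = {"encryption_required": True, "min_replicas": 2, "max_vcpus": 16}


def find_violations(name, config, policy=POLICY):
    """Return human-readable policy deviations for one workload config."""
    problems = []
    if policy["encryption_required"] and not config.get("encrypted", False):
        problems.append(f"{name}: storage encryption is disabled")
    if config.get("replicas", 1) < policy["min_replicas"]:
        problems.append(f"{name}: fewer than {policy['min_replicas']} replicas")
    if config.get("vcpus", 0) > policy["max_vcpus"]:
        problems.append(f"{name}: more than {policy['max_vcpus']} vCPUs")
    return problems


# Hypothetical workload configurations to audit.
workloads = {
    "vdi-pool": {"encrypted": True, "replicas": 2, "vcpus": 8},
    "analytics": {"encrypted": False, "replicas": 1, "vcpus": 32},
}

report = [problem
          for name, config in workloads.items()
          for problem in find_violations(name, config)]

for line in report:
    print(line)
```

A check like this can run at provisioning time to block a non-compliant request, or on a schedule to report and alert on drift from policy.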

Finally, use care in choosing management tools. The tools that accompany HCI systems are usually proprietary and might not interoperate easily -- if at all -- with other servers, storage and network elements across the data center. Organizations that must monitor and manage broader environments without the silos that accompany multiple management tools might benefit from adopting third-party tools.