Converged infrastructure is designed to be easy to buy, deploy and operate. In practice, however, operating a large estate of it is not as simple as we might like.
A colleague works at a large multinational manufacturing business that is replacing dozens of converged infrastructure (CI) deployments with hyper-converged infrastructure (HCI) to reduce its operating costs. The company is not replacing CI completely, but most of its locations are moving to hyper-converged computing to take advantage of HCI's benefits.
The company has IT infrastructure in more than half of the countries around the world. It has a mix of manufacturing plants and corporate offices, all with their own local IT operations. The business operates in a regulated industry, which severely limits its ability to adopt public cloud technologies, so on-premises IT will remain for the foreseeable future.
The CI product was easy to buy and deploy, but has proven to be complicated to update. Updating a CI rack to a new hypervisor release, or even just deploying hypervisor security patches, is a week-long process. Not only does the process take time, but it also requires on-site technical expertise.
Often, the upgrade needs to be performed by the professional services arm of the CI vendor. The result is that these procedures are hard to schedule and manage.
While the CI racks are easy to buy, they are not cheap. The smallest unit is a SAN with four server nodes, which fills nearly half a rack once the supporting switches are included. For a manufacturing site that requires only a couple dozen virtual machines (VMs), that is a lot of expensive infrastructure.
Remote management means less IT overhead
One of the benefits of hyper-converged computing is that the whole system is designed for remote management. Only the initial deployment needs any specialist IT knowledge on site. Once there is an HCI cluster, all of the ongoing operations are managed remotely.
Even when additional nodes are deployed, the on-site element is simply to rack and cable the new node. The new nodes are added to the cluster using remote management tools. The clusters are easy to upgrade remotely, a process that typically takes under 30 minutes and that does not involve any downtime for the cluster and its workload VMs.
A single engineer might be able to upgrade 20 HCI clusters in a week. Compare that to a week of on-site professional services to update a single CI cluster.
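To put that comparison in numbers, here is a rough sketch of the arithmetic. The two throughput figures are the ones quoted above; the estate size of 60 sites is purely hypothetical, chosen only to make the example concrete.

```python
import math

# Throughput figures quoted in the article.
HCI_CLUSTERS_PER_ENGINEER_WEEK = 20  # one engineer, working remotely
CI_CLUSTERS_PER_ENGINEER_WEEK = 1    # a week of on-site professional services each


def engineer_weeks(cluster_count: int, clusters_per_week: int) -> int:
    """Whole engineer-weeks needed to upgrade every cluster once."""
    return math.ceil(cluster_count / clusters_per_week)


# Hypothetical estate size, not a figure from the article.
SITES = 60

hci_weeks = engineer_weeks(SITES, HCI_CLUSTERS_PER_ENGINEER_WEEK)
ci_weeks = engineer_weeks(SITES, CI_CLUSTERS_PER_ENGINEER_WEEK)

print(f"HCI: {hci_weeks} engineer-weeks, CI: {ci_weeks} engineer-weeks")
```

Under these assumptions, one round of patching takes about three engineer-weeks for HCI against 60 for CI, and the CI figure does not include travel or scheduling with the vendor.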
A side benefit of an HCI deployment is that hyper-converged computing is designed to be failure-tolerant. A failed disk is not something that requires urgent attention. The HCI cluster will rebuild its data protection using spare capacity on the remaining disks. Over time, the HCI cluster could experience multiple disk failures and still operate.
By comparison, the SAN within the CI product has a limited supply of hot standby disks; when the supply is exhausted, a further failure can result in data loss. The SAN requires a disk failure to be treated as an urgent problem, because it is designed to tolerate only a very limited number of failures. An urgent problem in a factory in another country, or even another time zone, is well worth avoiding.
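The difference between the two failure models can be sketched with a deliberately simplified calculation. This is an illustration, not any vendor's actual rebuild logic: it assumes disks fail one at a time, that each rebuild finishes before the next failure, and that the HCI cluster can re-protect its data as long as the surviving disks have enough raw capacity. All the sizes and counts are hypothetical.

```python
import math


def hci_failures_tolerated(disk_count: int, disk_tb: float, stored_tb: float) -> int:
    """Sequential disk failures an HCI cluster can absorb (simplified model).

    stored_tb is the raw space consumed, including protection copies.
    The cluster keeps rebuilding while the surviving disks can hold it all.
    """
    disks_needed = math.ceil(stored_tb / disk_tb)
    return max(disk_count - disks_needed, 0)


def san_failures_absorbed(hot_spares: int) -> int:
    """Disk failures a SAN absorbs before its hot standby disks run out."""
    return hot_spares


# Hypothetical sizing: 24 disks of 2 TB each, 30 TB consumed, 2 hot spares.
hci_margin = hci_failures_tolerated(disk_count=24, disk_tb=2.0, stored_tb=30.0)
san_margin = san_failures_absorbed(hot_spares=2)

print(f"HCI absorbs {hci_margin} failures; SAN absorbs {san_margin} before it is at risk")
```

In this model the HCI margin grows with free capacity in the cluster, while the SAN margin is fixed by the number of standby disks. That is why a failed disk can wait for a scheduled visit in one case and needs urgent attention in the other.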
The main data centers at the manufacturing business have retained and upgraded their CI infrastructure. These data centers have workloads that suit the isolation of resources that CI offers. The business has trained staff on site to handle the more complex management of the CI systems.
The higher capacity servers in the CI implementation also suit applications that must scale up, as they can have more than two CPU sockets and, consequently, more RAM than an HCI node. For this company, hyper-converged computing is not a silver bullet to its infrastructure problems; it is a useful solution for some of its requirements.
The thorns in the roses
A hyper-converged computing deployment is not without its own challenges. After a year in operation, there are HCI clusters that need expansion.
In theory, adding new nodes to an existing HCI cluster is a simple process. However, it is a best practice to have some uniformity of nodes within an HCI cluster.
In particular, having servers from different vendors in the same cluster is not recommended. There can also be challenges with the balance of compute and storage in the HCI nodes not matching the requirements of the workload. Adding nodes for storage capacity without adding compute capacity can lead to operational complexity and can affect performance. Usually, a different hypervisor is used for the storage nodes, leading to new operational processes to support that hypervisor.
This multinational manufacturing business did not see the operational simplification that CI promised. To gain operational simplicity, it replaced some of its converged infrastructure with HCI.
The benefits from hyper-converged computing have been greatest at smaller sites where there are no staff with specialist CI skills. The new HCI infrastructure is supported by specialist staff in central locations, which simplifies the operation of the remote sites.