What are the benefits of a multicloud strategy?

When hosting services in a public cloud, all organisations want their data to be protected in the event of any disaster. Hosting all the applications with a single cloud service provider (CSP) may invite trouble even if distinct data centres, availability zones or regions are chosen to host these applications with a specific CSP. Hence, multicloud.

If a central component fails within a CSP's infrastructure, all or most of its regions can be affected. This would be a major disaster for an organisation that has hosted all its applications with that single CSP. Regulatory compliance can also come into play: rules may prohibit companies from hosting services in a specific geography or region, in which case they may prefer to host those applications with a different CSP.

My colleague, Mohit, has previously written about the key criteria to adopt multicloud.

How to choose a cloud service provider?

  • Hosting all production applications with one CSP
  • Hosting non-prod/dev applications with another CSP
  • Splitting both prod and non-prod applications evenly between the two CSPs

Competitors

There are a number of CSPs in the market, with AWS (Amazon Web Services) and Microsoft Azure leading the pack in offering cutting-edge infrastructure. Google Cloud and Alibaba are also top players with competitive offerings.

Costs

Costs for the hosting component and for other elements such as network connections, bandwidth, public IPs, load balancers, VMs and gateways vary between CSPs.

While one provider may be more cost-effective for a particular service, another CSP can be more cost-effective for other types of services. Before deciding which CSP to go with, it is important to carry out a detailed cost-benefit analysis across the candidate CSPs, covering the number of applications, the VMs needed, the network connectivity required, the bandwidth requirement and all other critical components.

When it comes to cost, a number of hidden factors are often overlooked when making a cloud hosting decision. One example is bandwidth cost, which usually differs for inbound and outbound traffic, so the total bandwidth requirement should be estimated separately for each direction. Another is VM running time: using schedulers to shut down VMs during non-business hours, or whenever they are not required, and to bring them up only when needed keeps costs down.
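The two hidden factors above (direction-dependent bandwidth pricing and VM running hours) can be folded into a rough comparison. The sketch below is only illustrative: the provider names, the price figures and the `monthly_cost` helper are all made-up assumptions, not real CSP rates.

```python
from dataclasses import dataclass


@dataclass
class CspRates:
    """Hypothetical price list for one CSP (all figures are invented)."""
    name: str
    vm_per_hour: float      # price of one VM-hour
    inbound_per_gb: float   # price per GB of inbound traffic
    outbound_per_gb: float  # price per GB of outbound traffic


def monthly_cost(rates, vm_count, vm_hours_per_day, inbound_gb, outbound_gb, days=30):
    """Estimate a monthly bill from VM running hours and traffic per direction."""
    compute = rates.vm_per_hour * vm_count * vm_hours_per_day * days
    bandwidth = rates.inbound_per_gb * inbound_gb + rates.outbound_per_gb * outbound_gb
    return compute + bandwidth


# Two hypothetical providers with different compute and outbound pricing.
csp_a = CspRates("csp-a", vm_per_hour=0.10, inbound_per_gb=0.0, outbound_per_gb=0.09)
csp_b = CspRates("csp-b", vm_per_hour=0.12, inbound_per_gb=0.0, outbound_per_gb=0.05)

# Same workload, with VMs running all day versus 12 scheduled hours a day.
always_on = monthly_cost(csp_a, vm_count=10, vm_hours_per_day=24, inbound_gb=500, outbound_gb=2000)
scheduled = monthly_cost(csp_a, vm_count=10, vm_hours_per_day=12, inbound_gb=500, outbound_gb=2000)
always_on_b = monthly_cost(csp_b, vm_count=10, vm_hours_per_day=24, inbound_gb=500, outbound_gb=2000)

print(f"{csp_a.name} always on: {always_on:.2f}")
print(f"{csp_a.name} scheduled: {scheduled:.2f}")
print(f"{csp_b.name} always on: {always_on_b:.2f}")
```

A real analysis would add the other components mentioned above (public IPs, load balancers, gateways) as further terms in the same function.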

High Availability

When designing for cloud hosting, it is imperative to factor in high availability so that business functions continue when one or more infrastructure components fail.

A CSP generally has different regions and availability zones. Regions are geographically dispersed locations, usually several miles apart from each other. Availability zones are discrete datacentres within a region, usually closer to each other but still physically separate, with their own redundant power supplies, cooling systems, etc. When planning for high availability and fault tolerance, greater consideration should be given to hosting applications in different availability zones within a region. It is also important to note that a CSP may not provide every service in every availability zone, so a thorough analysis should be done before choosing a service provider to host your applications.
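As a minimal sketch of that analysis, the snippet below spreads instances round-robin across only those availability zones that offer a required service. The zone names, service names and the `place_instances` helper are hypothetical and do not correspond to any CSP's API.

```python
from itertools import cycle


def place_instances(instance_names, zone_services, required_service):
    """Spread instances round-robin across zones that offer the required service."""
    eligible = sorted(
        zone for zone, services in zone_services.items()
        if required_service in services
    )
    if len(eligible) < 2:
        # With fewer than two zones there is no zone-level fault tolerance.
        raise ValueError("need at least two eligible zones for high availability")
    zones = cycle(eligible)
    return {name: next(zones) for name in instance_names}


# Hypothetical inventory: not every zone offers every service.
zone_services = {
    "zone-1": {"vm", "managed-db"},
    "zone-2": {"vm"},
    "zone-3": {"vm", "managed-db"},
}

placement = place_instances(["db-1", "db-2", "db-3"], zone_services, "managed-db")
print(placement)
```

Here zone-2 is skipped because it does not offer the assumed "managed-db" service, which is exactly the kind of gap worth finding before committing to a provider.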

Scalability

When a company decides to migrate to or host services in the cloud, it should look at the scalability options for the various infrastructure components so that they don't become a roadblock to future expansion.

For example, bandwidth ceilings differ between providers and circuit types. AWS Direct Connect offers hosted connections in sub-1Gbps sizes (such as 500Mbps) as well as dedicated connections of 1Gbps and above, while Azure ExpressRoute circuits go up to 10Gbps and ExpressRoute Direct ports are offered at 10Gbps and 100Gbps. These limits change over time, so check each provider's current documentation. One should also check whether a specific region or availability zone has any resource constraints. In 2018, the UK South region of Microsoft Azure hit capacity issues and couldn't host any further VMs for specific series of processors. This affected go-live events for several companies. Microsoft later added resources to expand capacity in the region.
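One way to check that circuit bandwidth will not become a roadblock is to project demand forward and see how many circuits of a given size it would take. The sketch below assumes simple compound growth; the growth rate, the circuit sizes and both helper functions are illustrative assumptions rather than provider figures.

```python
import math


def projected_mbps(current_mbps, annual_growth, years):
    """Project bandwidth demand with compound annual growth (an assumed model)."""
    return current_mbps * (1 + annual_growth) ** years


def circuits_needed(required_mbps, circuit_mbps, spare_circuits=1):
    """Circuits needed to carry the load, plus spares for resilience."""
    if required_mbps <= 0 or circuit_mbps <= 0:
        raise ValueError("bandwidth values must be positive")
    return math.ceil(required_mbps / circuit_mbps) + spare_circuits


# Hypothetical workload: 800 Mbps today, growing 30% a year for three years.
demand = projected_mbps(800, annual_growth=0.30, years=3)

small = circuits_needed(demand, circuit_mbps=500)    # many small circuits
large = circuits_needed(demand, circuit_mbps=10000)  # one large circuit plus a spare
print(f"projected demand: {demand:.0f} Mbps, 500M circuits: {small}, 10G circuits: {large}")
```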

Performance

A key deciding factor when choosing a CSP is performance. Application performance suffers if a hosted application is far from the on-prem datacentres it communicates with, because of the added network latency.

Connectivity

When choosing a CSP, note all the options and offerings: do you want circuits to be terminated directly at your on-prem DCs from a CSP, or do you want them to be terminated at a central location such as a CNF (Carrier Neutral Facility)?

Of course, not all CSPs provide direct termination to on-prem DCs, in which case routing via an authorised telco provider such as Colt or BT is a must. Terminating the circuits of different CSPs in a CNF can give greater flexibility by removing the dependency on a specific telco provider. For instance, if AWS and Azure both terminate at a single CNF and you later want to move from either of them to Google or Alibaba, the switchover will be less complicated.

When designing for connectivity, also look at what gateway services (such as VPN) the CSPs provide. Some VPN gateways may be incompatible with on-prem VPN devices; for example, on-prem VPN gateways that only support IKEv1 will not be compatible with CSP VPN gateways that require IKEv2.
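A compatibility check of this kind can be run against a device inventory before a CSP is chosen. The following is a minimal sketch in which the device names, the inventory structure and the `incompatible_devices` helper are all hypothetical; the only fact it relies on is that a tunnel needs an IKE version supported at both ends.

```python
def incompatible_devices(on_prem_devices, csp_ike_versions):
    """Return on-prem devices that share no IKE version with the CSP gateway."""
    csp_versions = set(csp_ike_versions)
    return sorted(
        name for name, versions in on_prem_devices.items()
        if not set(versions) & csp_versions
    )


# Hypothetical inventory of on-prem VPN devices and the IKE versions they support.
on_prem_devices = {
    "fw-london": {"IKEv1"},
    "fw-paris": {"IKEv1", "IKEv2"},
}

# Assume the candidate CSP gateway only accepts IKEv2.
blocked = incompatible_devices(on_prem_devices, {"IKEv2"})
print(blocked)
```

Any device in the resulting list would need an upgrade, a replacement, or a different gateway option before a tunnel to that CSP could be established.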

Azure provides ExpressRoute circuits to connect its public cloud to your on-prem DCs. Each ExpressRoute circuit is highly available by design, as it always consists of two separate physical connections, a primary and a secondary. An AWS Direct Connect connection, by contrast, is a single link, so redundancy requires ordering more than one.

Security

Application security should definitely be considered when hosting in a public cloud. Traffic traversing public links should be secured to prevent intrusion. Options for securing network transmission include IPsec VPN tunnels and MACsec. Note that these may not be supported by every CSP or on every type of circuit. For example, Azure supports MACsec only on ExpressRoute Direct circuits, and AWS supports it only on certain dedicated Direct Connect connections.

There should be firewalls on the connections between the public cloud and on-prem, and also on egress/ingress connections to the cloud. It is good practice to have separate sets of firewalls in the cloud for egress and ingress.

SaaS and Public Connections

When accessing SaaS (Software-as-a-Service) applications from your own public cloud environment, those SaaS applications may be hosted by the same CSP, in which case you could benefit from private connectivity between your own cloud network and the SaaS provider.

For example, a MongoDB SaaS service can be privately peered with your own AWS VPC within the same AWS region. This improves the performance of your application and also improves security, because traffic is no longer routed over the internet through multiple hops.

Tools

There are a number of tools to monitor the health of cloud infrastructure and to track application failures proactively. Companies should use a combination of the free tools provided by the CSP and tools sold by other companies for such health checks and monitoring.

Some good tools are Amazon CloudWatch, Splunk, All-In-One-Cloud monitoring, AppDynamics, BMC TrueSight Pulse, SolarWinds, Azure Network Watcher, etc.

Management and Operation

IaaS infrastructure can be managed remotely by offshore and onshore teams skilled in technologies such as VMs, databases, networking, security and storage. The team should also be skilled in DevOps for day-to-day work, and good, resilient connectivity from the public cloud to the network operations centre is needed for continued support. Cloud-hosted applications can also be managed over public internet connections.

Inter-Cloud

When companies host their applications with multiple CSPs, connections between the CSPs are necessary. Inter-CSP traffic can be routed via your central locations, such as a CNF or your on-prem DCs.
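Choosing between those transit points can be framed as comparing the two-hop paths through each of them. The sketch below picks the hub with the lowest summed latency; the site names, the latency figures and the `best_transit` helper are invented for illustration, and a real design would also weigh cost and resilience.

```python
def best_transit(links, source, destination, hubs):
    """Return (hub, total_latency_ms) for the lowest-latency path via one hub."""
    candidates = {}
    for hub in hubs:
        first = links.get(frozenset((source, hub)))
        second = links.get(frozenset((hub, destination)))
        if first is not None and second is not None:
            candidates[hub] = first + second
    if not candidates:
        raise ValueError("no hub connects both CSPs")
    hub = min(candidates, key=candidates.get)
    return hub, candidates[hub]


# Hypothetical one-way latencies in milliseconds between sites.
links = {
    frozenset(("csp-a", "cnf")): 2,
    frozenset(("cnf", "csp-b")): 3,
    frozenset(("csp-a", "on-prem-dc")): 8,
    frozenset(("on-prem-dc", "csp-b")): 9,
}

route = best_transit(links, "csp-a", "csp-b", hubs=["cnf", "on-prem-dc"])
print(route)
```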