High Performance Computing on Microsoft Azure

Our guest blogger, Ilai Bavati, explains the differences between High Performance Computing and Traditional Cloud Computing.

Written by Ilai Bavati • Cloud

High Performance Computing (HPC) is the practice of pooling computing power to process data and run calculations at very high speed. HPC is typically achieved through the use of supercomputers. The performance of supercomputers is measured in floating-point operations per second (FLOPS) and can reach over a hundred quadrillion FLOPS.
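
To make the FLOPS figure concrete, theoretical peak performance is commonly estimated as nodes × cores per node × clock rate × floating-point operations per cycle. The short sketch below applies that formula. The cluster dimensions are hypothetical and do not describe any specific machine.

```python
def peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak FLOPS; real workloads sustain less than this."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


# Hypothetical cluster: 1,000 nodes, 64 cores each, 2.5 GHz,
# 16 floating-point operations per cycle per core.
cluster = peak_flops(nodes=1_000, cores_per_node=64, clock_hz=2.5e9, flops_per_cycle=16)
print(f"{cluster / 1e15:.2f} petaFLOPS")  # 1 petaFLOPS = one quadrillion FLOPS
```

With these made-up numbers the estimate comes to 2.56 petaFLOPS, so a machine delivering a hundred quadrillion FLOPS is roughly forty times larger.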

In the past, supercomputers were mythological creatures available only to the highest bidder. Today, cloud vendors are making supercomputers accessible and affordable through cloud-based HPC. Read on to learn the difference between traditional cloud computing and HPC, and how cloud-based HPC is achieved on Microsoft Azure.

What Is the Difference Between HPC and Traditional Cloud Computing?

Companies usually turn to cloud-based HPC resources when their IT department’s capacity is fully utilized. Rather than expanding on-premises infrastructure, they can pay for HPC resources on an on-demand basis.

The costs of cloud-based HPC resources are much higher than those of conventional virtual machines, so organizations need to learn how to use these resources efficiently to get a return on them. Three fundamental aspects determine when you should use conventional cloud resources and when you need specific HPC technologies: volume of processed data, available time, and complexity of the process. A high degree of all three parameters indicates a need for specialized HPC resources.
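
One rough way to reason about the first two parameters is to estimate how many machines a job would need in order to finish before its deadline. The sketch below is a back-of-the-envelope calculation, not a sizing tool: the function name, the throughput figure, and the assumption of perfectly linear scaling are all illustrative.

```python
import math


def nodes_needed(data_tb, tb_per_node_hour, deadline_hours):
    """Minimum node count to process data_tb before the deadline.

    Assumes the work splits evenly across nodes (ideal linear scaling),
    which complex, tightly coupled workloads rarely achieve.
    """
    node_hours = data_tb / tb_per_node_hour
    return math.ceil(node_hours / deadline_hours)


# Hypothetical job: 500 TB of data, 0.5 TB processed per node per hour.
print(nodes_needed(500, 0.5, deadline_hours=168))  # one-week window
print(nodes_needed(500, 0.5, deadline_hours=8))    # overnight window
```

With these invented numbers, a one-week window needs 6 nodes while an 8-hour window needs 125. A large node count under a tight deadline, combined with a process too complex to split cleanly, is the kind of signal that points toward specialized HPC resources.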

Unfortunately, there is no simple way to compare the effectiveness of conventional public cloud vs high performance computing tools. Public cloud providers can help you choose the most suitable option according to their expertise. If needed, they can allow you to set up a proof of concept to test performance and ROI of a traditional vs HPC approach.

Benefits of HPC

High Performance Computing has several key advantages:

  • Speed: more powerful processors take less time to run experiments or work through massive data collections.
  • Performance: distributed, pooled resources enable near-100% resource utilization.
  • Volume: enhanced storage and memory hardware enables processing larger quantities of information and running more complex analyses on larger data volumes.
  • Price: bulk purchasing and cloud usage models can reduce prices. Higher performance and increased processing efficiency can provide a positive ROI.

HPC systems were created to permit organizations with limited resources to access computing power comparable to a supercomputer. On the Microsoft Azure cloud, organizations can actually gain direct access to a managed Cray supercomputer.

HPC in the Public Cloud

HPC is commonly run in public or private clouds. All large cloud suppliers provide HPC solutions, either as a bundle of services or as individual components you can use to construct your own solution. Azure and other clouds provide options for hybrid HPC implementations that combine on-premises and public cloud resources.

Here are components commonly used to operate HPC in the cloud:

  • Batch scheduling
  • Optimized userspace communication
  • Bare-metal service
  • Clustered virtual machines
  • High-speed interconnects
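
Batch scheduling, the first item above, means queuing jobs and assigning them to nodes as capacity becomes available. The following is a toy sketch of one common heuristic: take the longest job first and place it on the least-loaded node. It does not represent how Azure Batch or any particular scheduler is implemented, and the job names and durations are invented.

```python
import heapq


def schedule(jobs, node_count):
    """Greedy longest-job-first assignment.

    jobs maps a job name to its estimated runtime in hours.
    Returns (plan, makespan): plan maps node id to its job names, and
    makespan is the runtime of the busiest node.
    """
    # Min-heap of (accumulated_runtime, node_id): least-loaded node pops first.
    heap = [(0.0, node) for node in range(node_count)]
    heapq.heapify(heap)
    plan = {node: [] for node in range(node_count)}
    for name, hours in sorted(jobs.items(), key=lambda item: item[1], reverse=True):
        load, node = heapq.heappop(heap)
        plan[node].append(name)
        heapq.heappush(heap, (load + hours, node))
    makespan = max(load for load, _ in heap)
    return plan, makespan


# Hypothetical job durations in hours.
jobs = {"render": 4.0, "simulate": 3.0, "train": 3.0, "index": 2.0}
plan, makespan = schedule(jobs, node_count=2)
print(plan, makespan)
```

Here the 12 hours of total work finish in 6 hours on two nodes. Production schedulers also account for priorities, dependencies, and node failures, which this sketch ignores.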

Azure HPC Platform Services

Azure provides a complete HPC platform including the following services:

  • Compute: Azure provides compute resources at virtually unlimited scale. H-series VMs are available for memory-intensive workloads, while N-series VMs use Graphics Processing Units (GPUs) for rich media processing and CUDA/OpenCL.
  • Network: you can use Azure ExpressRoute to create protected tunnels for high-performance hybrid cloud connectivity. Azure also supports Linux RDMA with InfiniBand for MPI workloads within your data center.
  • Storage: Azure supports direct, fast access to data stored in on-premises NAS devices via the HPC Cache service. Several providers offer high-performance file storage on Azure, which enables you to achieve very high I/O with sub-millisecond latency.
  • Workflow Services: Azure Batch lets you manage large numbers of compute nodes, install the software you want to run, and schedule tasks. Azure CycleCloud sets up HPC clusters on Azure and orchestrates data and jobs for cloud-based and hybrid workflows. The HPC platform also integrates with the Azure Kubernetes Service (AKS).
  • Analytics Services: Azure Data Lake Analytics lets you run sophisticated analyses and computations on massive HPC data, with access to a portfolio of Machine Learning and Deep Learning algorithms.

Azure Managed Cray Supercomputers

Microsoft Azure has partnered with Cray, the veteran supercomputer manufacturer, to provide very high levels of scalability and elasticity for the most demanding high-performance computing workloads.

Azure provides a Cray supercomputer for your exclusive use, delivered as a managed service that integrates with other Azure services. The offering consists of dedicated Cray® XC™ or Cray® CS™ supercomputers with attached Cray® ClusterStor™ storage, hosted in an Azure datacenter.

Cray managed supercomputers are the natural growth path after using H-series and N-series virtual machines. A Cray supercomputer runs multi-stage workflows, eliminating the waiting periods incurred when data moves between on-premises data centers and the cloud.

Conclusion

HPC is often mentioned as a solution to big data and Artificial Intelligence (AI) challenges. The more big data you need to work with, the more speed you need. AI relies on big data and high speeds for its continual learning process. As AI and big data penetrate more fields and markets, the need for supercomputers increases. 

Hopefully, this article has helped you understand the importance of cloud-based HPC, and also provided you with key information about HPC on Azure. If you’re new to Azure, you can set up a free account. You’ll get a few credits, which you can use to experiment and ensure that this is the right solution for you.

Are you utilizing High Performance Computing in the cloud? Post your comments below and let's discuss.


I'm a technology writer and editor based in Tel Aviv. I cover topics ranging from machine learning and cybersecurity to cloud computing and the Internet of Things. I'm interested in the real-world application of emerging technologies, and I see our increasingly connected reality as both disruptive and potentially life-saving.