Axiom Demonstrates HAFS Cloud Performance on AWS Platform

Axiom Consultants, Inc., through the AWS Partnership Program, has successfully ported the Unified Forecast System (UFS) Hurricane Analysis and Forecast System (HAFS) to the cloud. This proof of concept was undertaken to understand the cost and effort required to move the UFS application from traditional high-performance computing environments, such as NOAA’s on-premises, bare-metal Research and Development High Performance Computing Systems (RDHPCS), to a cloud environment. Such a transition could break down barriers to research and development for many UFS Community contributors outside NOAA.

We began with a cost assessment for running HAFS on AWS. We ran a basic HAFS forecast test case on AWS ParallelCluster, choosing instance types suited to each component of the HAFS system (preprocessing, atmospheric forecast, and postprocessing).

We compared an Intel-based ParallelCluster configuration to an alternative AMD-based configuration:

  • Intel: memory-optimized instances with Intel Skylake-SP processors (r5.24xlarge) for the memory-intensive pre- and post-processing components, and compute-optimized instances (c5n.18xlarge) with Elastic Fabric Adapter (EFA) enabled for the forecast component.
  • AMD: hpc6a.48xlarge instances featuring EFA-enabled 3rd generation AMD EPYC 7003 series processors for all HAFS components.

We followed up with a preliminary assessment of containerizing HAFS with Singularity and running the workflow on the Intel- and AMD-configured clusters.
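
The instance choices described above can be summarized as a small lookup structure. The following Python sketch is illustrative only: the `ClusterLayout` class, its field names, and the `distinct_instance_types` helper are hypothetical and are not part of HAFS or AWS ParallelCluster. Only the instance type names come from the text.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ClusterLayout:
    """Hypothetical record: one instance type per HAFS workflow component."""
    preprocessing: str
    forecast: str
    postprocessing: str


# Intel layout: memory-optimized r5 for pre/post, EFA-enabled c5n for the forecast.
INTEL = ClusterLayout(
    preprocessing="r5.24xlarge",
    forecast="c5n.18xlarge",
    postprocessing="r5.24xlarge",
)

# AMD layout: a single instance type for every component.
AMD = ClusterLayout(
    preprocessing="hpc6a.48xlarge",
    forecast="hpc6a.48xlarge",
    postprocessing="hpc6a.48xlarge",
)


def distinct_instance_types(layout: ClusterLayout) -> list[str]:
    """Return the sorted set of instance types a layout needs."""
    return sorted(set(astuple(layout)))


print(distinct_instance_types(INTEL))  # ['c5n.18xlarge', 'r5.24xlarge']
print(distinct_instance_types(AMD))    # ['hpc6a.48xlarge']
```

Seen this way, the AMD configuration needs only one instance type across the whole workflow, while the Intel configuration mixes two.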

Preliminary results:

  • AMD processors sped up the pre-processing steps by a factor of 2.6 to 3, with a cost savings factor of 5.5 to 6.3 over the memory-optimized r5 Intel Skylake-SP instances.
  • For forecasts run over 10 instances on the hpc6a and c5n.18xlarge clusters, the AMD platform was 3x faster, at one-fifth the computational cost of the c5n instances.
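
To see how a speedup and a cost savings factor relate, the sketch below assumes a simple cost model that the article does not state: cost = hourly price per instance × number of instances × wall-clock time. It also assumes both clusters in each comparison used the same number of instances. Under those assumptions, the cost savings factor is the speedup multiplied by the ratio of hourly prices, so the reported figures imply a price ratio. The function names are hypothetical, and no actual AWS prices are used.

```python
def cost_savings_factor(speedup: float, price_ratio: float,
                        instance_ratio: float = 1.0) -> float:
    """Baseline cost divided by new cost under the assumed model.

    speedup        = baseline wall time / new wall time
    price_ratio    = baseline hourly price / new hourly price (per instance)
    instance_ratio = baseline instance count / new instance count
    """
    return speedup * price_ratio * instance_ratio


def implied_price_ratio(speedup: float, savings: float,
                        instance_ratio: float = 1.0) -> float:
    """Invert the model: the hourly price ratio implied by reported figures."""
    return savings / (speedup * instance_ratio)


# Forecast component: 3x faster, 5x cost savings (figures from the results).
print(round(implied_price_ratio(3.0, 5.0), 2))   # 1.67

# Pre-processing: both ends of the reported ranges.
print(round(implied_price_ratio(2.6, 5.5), 2))   # 2.12
print(round(implied_price_ratio(3.0, 6.3), 2))   # 2.1
```

In other words, under this model the reported savings come partly from shorter run times and partly from a lower price per instance-hour. The implied ratios are a consistency check on the reported numbers, not measured prices.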

Future work:

  • Extend the work to assess the costs associated with more complicated HAFS configurations
  • Port to additional cloud platforms
  • Perform scaling studies for HAFS components
  • Contribute portability changes and container recipes to the UFS Community

We recognized a challenge at NOAA, so we prototyped an innovative solution by partnering with AWS. Now we’ve positioned ourselves to contribute portability changes and container recipes back to the UFS Community and the NWS repository. — “We care about the mission…”