Amii HPC Documentation

Welcome to Amii’s HPC documentation. This resource provides the core concepts, procedures, and best practices necessary to execute computation-intensive workloads efficiently on HPC clusters, specifically those managed by Amii and its partners.

What this documentation covers

  • Access & Setup: Configuring your account with the Digital Research Alliance of Canada (DRAC).

  • Architecture: An overview of HPC cluster design and infrastructure.

  • Workload Management: Mastering the Slurm scheduler for job submission and resource allocation.

  • Optimization: Techniques for monitoring and tuning performance on HPC systems.

  • Practical Resources: Reusable templates, proven workflows, and examples developed by Amii’s Engineering and Performance team.

Throughout these guides, we use Amii’s Vulcan cluster as the primary reference environment for examples.

Target Audience

While primarily designed for researchers and students with access to Amii-managed clusters, this documentation serves as a broad knowledge base. Anyone interested in the following areas will find these resources valuable:

  • Distributed and parallel computing

  • Slurm workload management

  • GPU optimization for AI workflows

  • DevOps and MLOps integration

Note

A significant portion of this content covers general HPC concepts applicable to systems beyond those managed by Amii.

Table of Contents

Useful Resources