.. _install_hpc:

:fas:`server` HPC
=========================

This page provides reference information for compiling and running Alamo on high-performance computing (HPC) clusters.

.. WARNING::

   These instructions are for reference only and may not always work.
   The Alamo developers do not manage the software on these clusters, and configurations may change over time.
   If you encounter outdated instructions, please `open an issue on GitHub `_.

Managing dependencies on an HPC cluster
---------------------------------------

Compiling and running Alamo on an HPC cluster differs slightly from doing so on a local machine, primarily due to dependency management.
HPC clusters often provide multiple versions of software dependencies for users.
To manage these dependencies efficiently, HPC clusters commonly use `Environment Modules `_ (or simply *modules*), which help users to load and unload software as needed.
Most HPC clusters provide the modules required for compiling Alamo.

To load an environment module:

.. code-block:: bash

   module load module_1 module_2 ...

To unload an environment module:

.. code-block:: bash

   module unload module_1 module_2 ...

To unload all modules:

.. code-block:: bash

   module purge

While Environment Modules are widely used, another tool called `Spack `_ was developed specifically for dependency management in shared computing environments.
You may encounter either system while working on an HPC cluster.
The instructions on this page assume the use of Environment Modules.

Configuring Alamo on an HPC cluster
-----------------------------------

Configuring Alamo on an HPC cluster is similar to configuring it on a local machine.
However, you must ensure that a compiler, :code:`mpich`, and Python 3 are available via modules or other means.
Additionally, Alamo relies on the `Eigen library `_, which can either be loaded as a module or installed during configuration (preferred) with the :code:`--get-eigen` flag:

.. code-block:: bash

   ./configure --get-eigen ...

Compiling Alamo on an HPC cluster
---------------------------------

Compiling code can be resource-intensive and time-consuming.
Running large, multithreaded operations on the `login node `_ of an HPC cluster is generally discouraged.
To avoid this, compile within an interactive job or by submitting a batch job.

If the cluster uses the `Slurm Workload Manager `_, an interactive job can be started with `salloc `_:

.. code-block:: bash

   salloc --nodes=1 --cpus-per-task=16 --mem=16G --time=10:00

This command requests a single node with 16 cores and 16 GB of memory for 10 minutes.
Once the resources are allocated, your shell will reload, and you can compile with:

.. code-block:: bash

   make -j16   # Replace 16 with the number of cores requested

Alternatively, if interactivity is not required, submit a non-interactive compilation job using `srun `_:

.. code-block:: bash

   srun --nodes=1 --cpus-per-task=16 --mem=16G --time=10:00 make -j16

Adjust resource requests as needed.
To reduce wait times, specify a shorter duration using the :code:`--time` flag.

Running Alamo on an HPC cluster
-------------------------------

To verify that a simulation starts correctly, an interactive job may suffice.
However, for full simulations, you should submit a batch job to the cluster's workload manager.
For Slurm-based clusters, use the `sbatch `_ command to submit a job script:

.. code-block:: bash

   sbatch /path/to/job_script
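For the quick interactive check mentioned above, a minimal session might look like the following sketch.
The resource requests and paths are placeholders, and the :code:`mpich` module name assumes your cluster provides MPI as in the other examples on this page; adjust both for your cluster and build.

.. code-block:: bash

   # Request a small, short-lived allocation for a smoke test
   salloc --nodes=1 --ntasks-per-node=4 --mem=4G --time=10:00

   # Once the allocation is granted, load the MPI module and launch Alamo
   module load mpich
   srun ./path/to/alamo/executable /path/to/input/file

If the simulation initializes and begins writing output, interrupt the run (Ctrl+C), release the allocation with :code:`exit`, and submit the full simulation as a batch job.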
The sections below provide example job scripts that can be modified to suit your needs.

Reference scripts for Nova
--------------------------

.. NOTE::

   For more information on Nova or HPC in general, Iowa State University provides an `extensive online guide `_.

By default, Python and GCC modules are loaded when you log in to Nova.
However, additional modules are required for certain tasks:

+------------------------+--------------------+
| Task                   | Required Module(s) |
+========================+====================+
| configuring w/ g++     | mpich              |
+------------------------+--------------------+
| configuring w/ clang++ | mpich llvm         |
+------------------------+--------------------+
| compiling w/ g++       | mpich              |
+------------------------+--------------------+
| compiling w/ clang++   | mpich llvm         |
+------------------------+--------------------+
| running Alamo          | mpich              |
+------------------------+--------------------+

The scripts below handle module management automatically.
The configuration and compilation scripts can be run line by line or saved as a Bash script.
If you do the latter, remember to make the file executable (:code:`chmod +x /path/to/file`).

gcc Configure and Compile Script
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   #!/usr/bin/env bash
   module purge
   module load mpich
   ./configure --get-eigen
   srun --nodes=1 --cpus-per-task=16 --mem=16G --time=10:00 make -j16

clang++ Configure and Compile Script
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   #!/usr/bin/env bash
   module purge
   module load mpich llvm
   ./configure --get-eigen --comp clang++
   srun --nodes=1 --cpus-per-task=16 --mem=16G --time=10:00 make -j16

Alamo Simulation Slurm Job Script
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   #!/usr/bin/env bash
   #SBATCH --time=24:00:00
   #SBATCH --nodes=1
   #SBATCH --ntasks-per-node=36
   #SBATCH --mem-per-cpu=1000
   #SBATCH --job-name="alamo"
   #SBATCH --output="%x-%j-log.txt"
   #SBATCH --mail-user=your_email@iastate.edu
   #SBATCH --mail-type=BEGIN
   #SBATCH --mail-type=END
   #SBATCH --mail-type=FAIL

   module purge
   module load mpich

   srun --mpi=pmi2 ./path/to/alamo/executable /path/to/input/file

The script starts a parallel job on Nova.
Modify these parameters as needed:

* :code:`--time`: The *wall clock time limit* (maximum job duration)
* :code:`--nodes`: Number of nodes requested
* :code:`--ntasks-per-node`: Number of tasks per node
* :code:`--mem-per-cpu`: Memory allocated per core (MB)
* :code:`--job-name`: The job name displayed when running :code:`squeue`
* :code:`--output`: The name of the output log file (see the `Slurm documentation `_ for filename pattern specifications)
* :code:`--mail-user`: Email address for notifications (remove if not needed)
* **Executable path**: For example, :code:`./bin/alamo-2d-clang++`
* **Input file path**: The input file for Alamo

.. NOTE::

   Iowa State University provides a `Slurm job script generator for Nova `_, which can help generate job scripts.

Nova uses a *fair-share scheduling system* to prioritize job execution based on requested resources and past usage.
To reduce wait times, request only the resources you need and set reasonable time limits.
Slurm automatically determines the number of cores from the :code:`--nodes` and :code:`--ntasks-per-node` values.
Refer to the `Nova hardware guide `_ for appropriate values.

Once modifications are made, submit the job with `sbatch `_:

.. code-block:: bash

   sbatch /path/to/job_script.sh
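After submission, the job can be monitored with standard Slurm commands.
The sketch below is illustrative only: the job ID is hypothetical, and the log file name follows the :code:`--output` pattern used in the job script above (job name, then job ID).

.. code-block:: bash

   squeue -u $USER               # list your pending and running jobs
   tail -f alamo-123456-log.txt  # follow the log of a running job (hypothetical job ID)
   scancel 123456                # cancel the job if something goes wrong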