Compare commits

...

2 Commits

Author SHA1 Message Date
Anthony Berg 1056ecea67 feat: add containr Slurm job and docs 2025-04-01 14:52:30 +02:00
Anthony Berg 22563df94f fix: removed unused variable in directory 2025-04-01 14:52:13 +02:00
3 changed files with 47 additions and 6 deletions

Jobs/job_apptainer_lumi.slurm View File

@@ -0,0 +1,26 @@
#!/bin/bash -l
#SBATCH --job-name=lumi
#SBATCH --account=project_4650000xx
#SBATCH --time=00:10:00
#SBATCH --partition=dev-g
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --gpus-per-node=8
#SBATCH --output=%x-%j.out
#SBATCH --exclusive
N=$SLURM_JOB_NUM_NODES
echo "--nbr of nodes:", $N
echo "--total nbr of gpus:", $SLURM_NTASKS
MyDir=/project/project_4650000xx
MyApplication=${MyDir}/FiniteVolumeGPU_HIP/mpiTesting.py
Container=${MyDir}/FiniteVolumeGPU_HIP/my_container.sif
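# Recommended LUMI-G binding: pin each of the 8 ranks to the CPU core closest to its GPU (GCD)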
CPU_BIND="map_cpu:49,57,17,25,1,9,33,41"
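# Enable GPU-aware MPI in Cray MPICH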
export MPICH_GPU_SUPPORT_ENABLED=1
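# Launch 8 ranks (one per GPU) and run the simulator inside the Apptainer/Singularity container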
srun --cpu-bind=${CPU_BIND} --mpi=pmi2 \
apptainer exec "${Container}" \
python ${MyApplication} -nx 1024 -ny 1024 --profile

Jobs/job_lumi.slurm View File

@@ -13,9 +13,9 @@ N=$SLURM_JOB_NUM_NODES
echo "--nbr of nodes:", $N
echo "--total nbr of gpus:", $SLURM_NTASKS
Mydir=/project/${project}
Myapplication=${Mydir}/FiniteVolumeGPU_HIP/mpiTesting.py
CondaEnv=${Mydir}/FiniteVolumeGPU_HIP/MyCondaEnv/bin
MyDir=/project/project_4650000xx
MyApplication=${MyDir}/FiniteVolumeGPU_HIP/mpiTesting.py
CondaEnv=${MyDir}/FiniteVolumeGPU_HIP/MyCondaEnv/bin
export PATH="${CondaEnv}:$PATH"
@@ -24,4 +24,4 @@ CPU_BIND="map_cpu:49,57,17,25,1,9,33,41"
export MPICH_GPU_SUPPORT_ENABLED=1
srun --cpu-bind=${CPU_BIND} --mpi=pmi2 \
python ${Myapplication} -nx 1024 -ny 1024 --profile
python ${MyApplication} -nx 1024 -ny 1024 --profile

View File

@@ -17,19 +17,34 @@ conda-containerize new --prefix MyCondaEnv conda_environment_lumi.yml
where the file `conda_environment_lumi.yml` contains packages to be installed.
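For reference, a hypothetical minimal sketch of such an environment file is shown below; the actual `conda_environment_lumi.yml` shipped with this repository may list different packages and versions.
```yaml
# Hypothetical sketch only; see conda_environment_lumi.yml in the repository for the real list.
name: FiniteVolumeGPU_HIP
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - mpi4py
```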
### Step 1 alternative: Convert to a Singularity container with cotainr
Load the required modules first:
```shell
ml CrayEnv
ml cotainr
```
Then build the Singularity/Apptainer container:
```shell
cotainr build my_container.sif --system=lumi-g --conda-env=conda_environment_lumi.yml
```
### Step 2: Modify Slurm Job file
Update the contents of [`Jobs/job_lumi.slurm`](Jobs/job_lumi.slurm) to match your project allocation,
and the directories of where the simulator and Conda container is stored.
Depending on your build method, update [`Jobs/job_lumi.slurm`](Jobs/job_lumi.slurm) if `conda-containerize` was used, or [`Jobs/job_apptainer_lumi.slurm`](Jobs/job_apptainer_lumi.slurm) if `cotainr` was used.
In the job file, the required changes are the project allocation
and the directories where the simulator and the container are stored.
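As an illustration, these are the kinds of lines to adjust (shown with the placeholder values used in this repository; substitute your own project number and paths):
```shell
#SBATCH --account=project_4650000xx                       # your LUMI project allocation
MyDir=/project/project_4650000xx                          # directory containing FiniteVolumeGPU_HIP
Container=${MyDir}/FiniteVolumeGPU_HIP/my_container.sif   # container path (Apptainer job only)
```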
### Step 3: Run the Slurm Job
If `conda-containerize` was used for building:
```shell
sbatch Jobs/job_lumi.slurm
```
Otherwise, if `cotainr` was used for building:
```shell
sbatch Jobs/job_apptainer_lumi.slurm
```
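After submission, the job can be monitored as follows; the scripts write their output to `%x-%j.out`, i.e. the job name (`lumi`) followed by the job ID.
```shell
squeue -u $USER            # list your queued and running jobs
tail -f lumi-<jobid>.out   # follow the job output once it starts
```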
### Troubleshooting
#### Error when running MPI.