Using Anaconda for python environments
Python is a commonly used programming language for scientific workloads. Among the ways to use Python libraries, the Anaconda binary distribution is a very popular means for managing python packages and environments. On Mines’ HPC system users, we recommend they use our centrally managed anaconda installation for setting up Python enivronments.
We will go through how to setup several common anaconda environments.
Setting up base environment
For all environments listed below, we will need to setup a base environment. There are a few different ways to do this, with several advanced options that can be viewed here.
Loading the Python module
Wendian and Mio
module load apps/python3
Creating and activating a conda environment under your home directory
To create a conda environment using a environment name of your choice requires the following command:
conda create --name name_of_your_environment
You then will be asked to confirm by typing ‘y’ and press enter.
This will create your new environment and will be saved under under $HOME/.conda/envs/name_of_your_environment
To activate the environment, type the command
conda activate name_of_your_environment
In your terminal, you should now see the following to the left of your username
(name_of_your_environment) username@hpc$
Creating conda environment in a custom location
To create a conda environment in a custom location, use the —prefix
flag when using conda create
:
conda create --prefix=/your/custom/path/
You will be asked to confirm by typing ‘y’ like above. To activate the environment, type the command
conda activate /your/custom/path
Appliciation-specific conda environment
All instructions here we be assumed to be created under `$HOME/.conda
NumPy + SciPy + Matplotlib
For this environment, you can set it up in one command when initially creating the environment:
conda create --name scientific_python numpy scipy matplotlib
Tensorflow (GPU)
For Tensorflow with NVIDIA GPU support, GPU support is included with Tensorflow:
create create --name tf_env tensorflow
Tensorflow (CPU-only)
If you want tensorflow with CPU support only, use the following command when creating your environment:
conda create --name tf_cpu_only_env tensorflow-cpu
Cleaning up conda packages
After installing packages into a conda environment, there will many packages left as cache under $HOME/.conda/pkgs
by default. These cached packages can be reused when creating new environments that depend on packages that were previously installed. However, this cache can become quite large and often cause you to reach the 20 GB limit of $HOME
. There are few different options for cleaning up the package cache.
Remove all cached packages
conda clean --all
This will remove all cached packages under $HOME/.conda/pkgs
Proper mpi4py setup to use system MPI on Wendian/Mio
Load the required modules
module load apps/python3
module load compilers/gcc
module load mpi/openmpi/gcc-cuda
Set the environment variable CC
to mpicc
:
export CC=$(which mpicc)
You can double check it was set correctly by typing:
echo $CC
And you should see something like:
/sw/mpi/openmpi/4.1.4/gnu-cuda/bin/mpicc
Next, create your conda environment:
conda create -n mpi4py_test python=3.11 -y
and activate it:
conda activate mpi4py_test
Now install mpi4py
using pip not conda:
pip install --no-cache-dir mpi4py numpy
You can double check it works by launching python and importing the module:
$ python
>>> from mpi4py import MPI
>>> MPI.Comm
<class 'mpi4py.MPI.Comm'>