Workflow on GPU Clusters with Python and PyCharm

Setting up Python on a GPU cluster

Benjamin Hou


Note: [in progress of updating… 28 May 2019]

Part 1: Network Python

DoC machines run Ubuntu Linux, so python should already be installed. The most likely location is the /usr/bin/ folder:

bh1511@kythnos:~$ which python
/usr/bin/python
bh1511@kythnos:~$ which pip
/usr/local/bin/pip

This is the system python. Running python without a path calls the binary /usr/bin/python on whichever machine you are logged in to.

For an introduction to virtual environments, see https://realpython.com/python-virtual-environments-a-primer/ or https://www.geeksforgeeks.org/python-virtual-environment/

Problems start to arise when virtual environments are created on the network (e.g. on /homes/$USER or /vol) using the system python. There is no guarantee that the system python is the same on every machine: dependencies may be missing, or compiled libraries may differ slightly. The cleanest way around this is to install a network python that is machine agnostic, and use that as the base python for creating further virtual environments.
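To see how much the system python differs between machines, one option is to print a few facts about the interpreter on each host and compare the results. The sketch below uses only the standard library; the function name interpreter_fingerprint is made up for this post.

```python
import platform
import sys


def interpreter_fingerprint():
    """Collect a few facts that commonly differ between machines."""
    return {
        "host": platform.node(),                 # machine the script ran on
        "executable": sys.executable,            # path of the running python binary
        "version": platform.python_version(),    # version string of the interpreter
        "compiler": platform.python_compiler(),  # compiler used to build python
        "prefix": sys.prefix,                    # install (or virtualenv) root
    }


if __name__ == "__main__":
    for key, value in interpreter_fingerprint().items():
        print(f"{key:>10}: {value}")
```

Running this with the system python on two cluster machines and comparing the output is a quick way to spot a mismatch before it shows up as a broken virtual environment.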

Any python will do, but intelpython2 or intelpython3 is recommended. To install, download the intelpython Linux package, extract it and run setup_intel_python.sh. It installs in place, so the directory where the extracted package lives is where it is installed. This should be a networked location (e.g. /vol/medic01) so that it is accessible from any cluster machine.

Source the intelpython environment. This overrides the system python so that the networked intelpython becomes the default python.

For example:

bh1511@kythnos:~$ cd /vol/medic01/users/bh1511/_opt/intelpython3/
bh1511@kythnos:/vol/medic01/users/bh1511/_opt/intelpython3$ source bin/activate 
(base) bh1511@kythnos:/vol/medic01/users/bh1511/_opt/intelpython3$ 
(base) bh1511@kythnos:/vol/medic01/users/bh1511/_opt/intelpython3$ which python
/vol/medic01/users/bh1511/_opt/intelpython3/bin/python
(base) bh1511@kythnos:/vol/medic01/users/bh1511/_opt/intelpython3$ which pip
/vol/medic01/users/bh1511/_opt/intelpython3/bin/pip
(base) bh1511@kythnos:/vol/medic01/users/bh1511/_opt/intelpython3$ which conda
/vol/medic01/users/bh1511/_opt/intelpython3/bin/conda
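The same check can be made from inside a script, which is useful for jobs that should refuse to run on the system python. This is a minimal sketch, assuming the network install lives under a root such as /vol/medic01. The helper name lives_under is hypothetical, and the comparison is purely lexical (symlinks are not followed).

```python
import os
import sys
from pathlib import Path


def lives_under(executable, root):
    """Return True if `executable` is located inside the directory `root`.

    Paths are made absolute and normalised, but symlinks are not resolved.
    """
    exe = Path(os.path.abspath(executable))
    base = Path(os.path.abspath(root))
    return base in exe.parents


if __name__ == "__main__":
    # "/vol/medic01" is the example network location used in this post.
    print(sys.executable, lives_under(sys.executable, "/vol/medic01"))
```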

Part 2: Creating virtual environments with intelpython

Creating a virtual environment with intelpython is a four-step process:

  1. Source the intelpython root environment
  2. Create the virtual environment
  3. Deactivate the intelpython root environment
  4. Source the newly created virtualenv

NOTE: the virtualenv binary isn’t packaged with intelpython by default. To install it, source the root intelpython environment and run conda install virtualenv (conda is intelpython’s default package manager).
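Before step 2 it can be worth confirming that virtualenv is actually available to the activated root environment. Here is a small sketch using the standard library's importlib; has_module is an illustrative helper, not part of intelpython or conda.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("virtualenv"):
        print("virtualenv is available")
    else:
        print("virtualenv is missing; install it with: conda install virtualenv")
```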

# Sourcing the intelpython base virtual environment
bh1511@kythnos:~$ source /vol/medic01/users/bh1511/_opt/intelpython3/bin/activate 

# Creating a virtual environment "myenv"
(base) bh1511@kythnos:~$ virtualenv myenv
Using base prefix '/vol/medic01/users/bh1511/_opt/intelpython3'
New python executable in /homes/bh1511/myenv/bin/python
copying /vol/medic01/users/bh1511/_opt/intelpython3/bin/python => /homes/bh1511/myenv/bin/python2
copying /vol/medic01/users/bh1511/_opt/intelpython3/bin/../lib/libpython3.6m.so.1.0 => /homes/bh1511/myenv/lib/libpython3.6m.so.1.0
Installing setuptools, pip, wheel...done.
Overwriting /homes/bh1511/myenv/bin/activate.fish with new content

# Deactivating intelpython base environment
(base) bh1511@kythnos:~$ source deactivate

# Sourcing the newly created virtual environment
bh1511@kythnos:~$ source myenv/bin/activate
(myenv) bh1511@kythnos:~$ which python
/homes/bh1511/myenv/bin/python
(myenv) bh1511@kythnos:~$ which pip
/homes/bh1511/myenv/bin/pip
(myenv) bh1511@kythnos:~$ 
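If the environment has to be created from a script rather than an interactive shell, one alternative is to skip activation and call the base interpreter by its absolute path with python -m virtualenv. This is a sketch of that variant, not the procedure shown in the session above. It assumes virtualenv is installed in the base python and that the interpreter sits at bin/python under the install directory. Both function names are hypothetical.

```python
import subprocess
from pathlib import Path


def virtualenv_command(base_prefix, env_dir):
    """Build the command that creates `env_dir` with the python under `base_prefix`."""
    python = Path(base_prefix) / "bin" / "python"
    return [str(python), "-m", "virtualenv", str(env_dir)]


def create_env(base_prefix, env_dir):
    """Run the command; raises CalledProcessError if virtualenv exits with an error."""
    subprocess.run(virtualenv_command(base_prefix, env_dir), check=True)
```

For example, create_env("/vol/medic01/users/bh1511/_opt/intelpython3", "myenv") would correspond to steps 1 to 3 above. The new environment still has to be sourced afterwards.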

“myenv” now depends on the base python at /vol/medic01/users/bh1511/_opt/intelpython3, which lives on the network and is therefore independent of any one specific machine.
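To confirm which base python an environment points at, the interpreter itself can be asked. This is a minimal sketch: older virtualenv releases expose the base as sys.real_prefix, while the standard venv module and newer virtualenv releases use sys.base_prefix, so the helper checks both. The function names are illustrative.

```python
import sys


def base_prefix():
    """Return the prefix of the python this environment was created from."""
    # Older virtualenv releases set sys.real_prefix; the standard venv module
    # and newer virtualenv releases set sys.base_prefix instead.
    real = getattr(sys, "real_prefix", None)
    if real is not None:
        return real
    return getattr(sys, "base_prefix", sys.prefix)


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    return base_prefix() != sys.prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", base_prefix())
    print("virtual env:", in_virtual_environment())
```

Run with “myenv” activated, the base prefix should be the networked intelpython directory rather than /usr.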

Part 3: Remote deployment with PyCharm

[coming soon?]