IPython parallel setup on Carver at NERSC
IPython parallel is one of the easiest ways to spawn several Python sessions on a supercomputing cluster and process jobs in parallel.
On Carver, the basic setup is to run the controller on the login node and submit the engines to the compute nodes via PBS.
First, create your configuration files by running:
ipython profile create --parallel
Then, in ~/.config/ipython/profile_default/ipcluster_config.py, you just need to set:
c.IPClusterStart.controller_launcher_class = 'LocalControllerLauncher'
c.IPClusterStart.engine_launcher_class = 'PBS'
c.PBSLauncher.batch_template_file = u'~/.config/ipython/profile_default/pbs.engine.template'
where batch_template_file points to the PBS engine template (see below). You also need to allow connections to the controller from other hosts by setting, in ~/.config/ipython/profile_default/ipcontroller_config.py:
c.HubFactory.ip = '*'
Next, a couple of examples of PBS templates, for 2 or 8 processes per node:
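Here is a minimal sketch of the 8 processes per node version (the queue name, walltime, and mpirun launcher are assumptions, adjust them to your allocation); ipcluster substitutes {n}, the total number of engines, and {profile_dir}. The 2 ppn version differs only in the ppn value and the node-count divisor:
#!/bin/sh
#PBS -q regular
#PBS -l nodes={n/8}:ppn=8
#PBS -l walltime=00:30:00
cd $PBS_O_WORKDIR
mpirun -np {n} ipengine --profile-dir={profile_dir}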
IPython configuration does not seem to be flexible enough to expose a parameter for the number of processes per node.
So I just created a bash script that takes as parameters the processes per node and the total number of nodes (a sketch follows the examples below):
ipc 8 2 # 2 nodes with 8ppn, 16 total engines
ipc 2 3 # 3 nodes with 2ppn, 6 total engines
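A minimal sketch of such an ipc script, assuming you saved one template per ppn value under the hypothetical names pbs.engine.template.2 and pbs.engine.template.8:
#!/bin/bash
# ipc: start IPython engines on Carver via PBS
# usage: ipc PPN NODES
PPN=$1
NODES=$2
PROFILE=~/.config/ipython/profile_default
# select the template with the requested processes per node
cp $PROFILE/pbs.engine.template.$PPN $PROFILE/pbs.engine.template
# start the controller locally and PPN*NODES engines through PBS
ipcluster start -n $(($PPN * $NODES))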
Once the engines are running, jobs can be submitted by opening an IPython shell on the login node and running:
from IPython.parallel import Client
rc = Client()
lview = rc.load_balanced_view() # default load-balanced view
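As a quick sanity check, not part of the original setup, you can dispatch a toy function across the engines:
rc.ids                      # one id per running engine
def square(x):
    return x * x
result = lview.map_sync(square, range(16))   # 16 tasks, load-balanced across the engines
print(result)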