Deploy scalable Jupyterhub with Kubernetes on Jetstream

Tested in June 2018 with Ubuntu 18.04 and Kubernetes 1.10
Updated in February 2018 with newer version of kubeadm-bootstrap, Kubernetes 1.9.2

Introduction

The best infrastructure available to deploy Jupyterhub at scale is Kubernetes. Kubernetes provides a fault-tolerant system to deploy, manage and scale containers. The Jupyter team released a recipe to deploy Jupyterhub on top of Kubernetes, Zero to Jupyterhub. In this deployment both the hub, the proxy and all Jupyter Notebooks servers for the users are running inside Docker containers managed by Kubernetes.

Kubernetes is a highly sophisticated system, for smaller deployments (30/50 users, less then 10 servers), another option is to use the Docker Swarm mode, I covered this in a tutorial on how to deploy it on Jetstream.

If you are not already familiar with Kubernetes, better first read the section about tools in Zero to Jupyterhub.

In this tutorial we will be installing Kubernetes on 2 Ubuntu instances on the XSEDE Jetstream OpenStack-based cloud, configure permanent storage with the Ceph distributed filesystem and run the "Zero to Jupyterhub" recipe to install Jupyterhub on it.

Setup two virtual machines

First of all we need to create two Virtual Machines from the Jetstream Atmosphere admin panelI tested this on XSEDE Jetstream Ubuntu 16.04 image (with Docker pre-installed), for testing purposes "small" instances work, then they can be scaled up for production. You can name them master_node and node_1 for example. Make sure that port 80 and 443 are open to outside connections.

Then you can SSH into the first machine with your XSEDE username with sudo privileges.

Install Kubernetes

The "Zero to Jupyterhub" recipe targets an already existing Kubernetes cluster, for example on Google Cloud. However the Berkeley Data Science Education Program team, which administers one of the largest Jupyterhub deployments to date, released a set of scripts based on the kubeadm tool to setup Kubernetes from scratch.

This will install all the Kubernetes services and configure the kubectl command line tool for administering and monitoring the cluster and the helm package manager to install pre-packaged services.

SSH into the first server and follow the instructions at https://github.com/data-8/kubeadm-bootstrap to "Setup a Master Node" this will install a more recent version of Docker.

Once the initialization of the master node is completed, you should be able to check that several containers (pods in Kubernetes) are running:

zonca@js-xxx-xxx:~/kubeadm-bootstrap$ sudo kubectl get pods --all-namespaces
NAMESPACE     NAME                                                    READY     STATUS    RESTARTS   AGE
kube-system   etcd-js-169-xx.jetstream-cloud.org                      1/1       Running   0          1m
kube-system   kube-apiserver-js-169-xx.jetstream-cloud.org            1/1       Running   0          1m
kube-system   kube-controller-manager-js-169-xx.jetstream-cloud.org   1/1       Running   0          1m
kube-system   kube-dns-6f4fd4bdf-nxxkh                                3/3       Running   0          2m
kube-system   kube-flannel-ds-rlsgb                                   1/1       Running   1          2m
kube-system   kube-proxy-ntmwx                                        1/1       Running   0          2m
kube-system   kube-scheduler-js-169-xx.jetstream-cloud.org            1/1       Running   0          2m
kube-system   tiller-deploy-69cb6984f-77nx2                           1/1       Running   0          2m
support       support-nginx-ingress-controller-k4swb                  1/1       Running   0          36s
support       support-nginx-ingress-default-backend-cb84895fb-qs9pp   1/1       Running   0          36s

Make also sure routing is working by accessing with your web browser the address of the Virtual Machine js-169-xx.jetstream-cloud.org and verify you are getting the error message default backend - 404.

Then SSH to the other server and set it up as a worker following the instructions in "Setup a Worker Node" at https://github.com/data-8/kubeadm-bootstrap,

Once the setup is complete on the worker, log back in to the master and check that the worker joined Kubernetes:

zonca@js-169-xx:~/kubeadm-bootstrap$ sudo kubectl get nodes
NAME                             STATUS    ROLES     AGE       VERSION
js-168-yyy.jetstream-cloud.org   Ready     <none>    1m        v1.9.2
js-169-xx.jetstream-cloud.org    Ready     master    2h        v1.9.2

Setup permanent storage for Kubernetes

The cluster we just setup has no permament storage, so user data would disappear every time a container is killed. We woud like to provide users with a permament home that would be available across all of the Kubernetes cluster, so that even if a user container spawns again on a different server, the data are available.

First we want to login again to Jetstream web interface and create 2 Volumes (for example 10 GB) and attach them one each to the master and to the first node, these will be automatically mounted on /vol_b, with no need of rebooting the servers.

Kubernetes has capability to provide Permanent Volumes but it needs a backend distributed file system. In this tutorial we will be using Rook which sets up the Ceph distributed filesystem across the nodes.

We can first use Helm to install the Rook services (I ran my tests with v0.6.1):

sudo helm repo add rook-alpha https://charts.rook.io/alpha
sudo helm install rook-alpha/rook

Then check that the pods have started:

zonca@js-xxx-xxx:~/kubeadm-bootstrap$ sudo kubectl get pods
NAME                            READY     STATUS    RESTARTS   AGE
rook-agent-2v86r                1/1       Running   0          1h
rook-agent-7dfl9                1/1       Running   0          1h
rook-operator-88fb8f6f5-tss5t   1/1       Running   0          1h

Once the pods have started we can actually configure the storage, copy this rook-cluster.yaml file to the master node. Better clone all of the repository as we will be using other files later.

The most important bits are:

dataDirHostPath: this is a folder to save the Rook configuration, we can set it to /var/lib/rook
storage: directories: this is were data is stored, we can set this to /vol_b which is the default mount point of Volumes on Jetstream. This way we can more easily back those up or increase their size.
versionTag: make sure this is the same as your rook version (you can find it with sudo helm ls)

Then run it with:

sudo kubectl create -f rook-cluster.yaml

And wait for the services to launch:

zonca@js-xxx-xxx:~/kubeadm-bootstrap$ sudo kubectl -n rook get pods
NAME                              READY     STATUS    RESTARTS   AGE
rook-api-68b87d48d5-xmkpv         1/1       Running   0          6m
rook-ceph-mgr0-5ddd685b65-kw9bz   1/1       Running   0          6m
rook-ceph-mgr1-5fcf599447-j7bpn   1/1       Running   0          6m
rook-ceph-mon0-g7xsk              1/1       Running   0          7m
rook-ceph-mon1-zbfqt              1/1       Running   0          7m
rook-ceph-mon2-c6rzf              1/1       Running   0          6m
rook-ceph-osd-82lj5               1/1       Running   0          6m
rook-ceph-osd-cpln8               1/1       Running   0          6m

This step launches the distributed file system Ceph on all nodes.

Finally we can create a new StorageClass which provides block storage for the pods to store data persistently, get rook-storageclass.yaml from the same repository we used before and execute with:

sudo kubectl create -f rook-storageclass.yaml

You should now have the rook storageclass available:

sudo kubectl get storageclass
NAME         PROVISIONER
rook-block   rook.io/block

(Optional) Test Rook Persistent Storage

Optionally, we can deploy a simple pod to verify that the storage system is working properly.

You can copy alpine-rook.yaml from Github and launch it with:

sudo kubectl create -f alpine-rook.yaml

It is a very small pod with Alpine Linux that creates a 2 GB volume from Rook and mounts it on /data.

This creates a Pod with Alpine Linux that requests a Persistent Volume Claim to be mounted under /data. The Persistent Volume Claim specified the type of storage and its size. Once the Pod is created, it asks the Persistent Volume Claim to actually request Rook to prepare a Persistent Volume that is then mounted into the Pod.

We can verify the Persistent Volumes are created and associated with the pod, check:

sudo kubectl get pv
sudo kubectl get pvc
sudo kubectl get logs alpine

We can get a shell in the pod with:

sudo kubectl exec -it alpine  -- /bin/sh

access /data/ and make sure we can write some files.

Once you have completed testing, you can delete the pod and the Persistent Volume Claim with:

sudo kubectl delete -f alpine-rook.yaml

The Persistent Volume will be automatically deleted by Kubernetes after a few minutes.

Setup HTTPS with letsencrypt

We need kube-lego to automatically get a HTTPS certificate from Letsencrypt, For more information see the Ingress section on the Zero to Jupyterhub Advanced topics.

First we need to customize the Kube Lego configuration, edit the config_kube-lego_helm.yaml file from the repository and set your email address, then:

sudo helm install stable/kube-lego --namespace=support --name=lego -f config_kube-lego_helm.yaml

Then after you deploy Jupyterhub if you have some HTTPS trouble, you should check the logs of the kube-lego pod. First find the name of the pod with:

sudo kubectl get pods -n support

Then check its logs:

sudo kubectl logs -n support lego-kube-lego-xxxxx-xxx

Install Jupyterhub

Read all of the documentation of "Zero to Jupyterhub", then download config_jupyterhub_helm.yaml from the repository and customize it with the URL of the master node (for Jetstream js-xxx-xxx.jetstream-cloud.org) and generate the random strings for security, finally run the Helm chart:

sudo helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
sudo helm repo update
sudo helm install jupyterhub/jupyterhub --version=v0.6 --name=jup \
    --namespace=jup -f config_jupyterhub_helm.yaml

Once you modify the configuration you can update the deployment with:

sudo helm upgrade jup jupyterhub/jupyterhub -f config_jupyterhub_helm.yaml

Test Jupyterhub

Connect to the public URL of your master node instance at: https://js-xxx-xxx.jetstream-cloud.org

Try to login with your XSEDE username and password and check if Jupyterhub works properly.

If something is wrong, check:

sudo kubectl --namespace=jup get pods

Get the name of the hub pod and check the logs:

sudo kubectl --namespace=jup logs hub-xxxx-xxxxxxx

Check that Rook is working properly:

sudo kubectl --namespace=jup get pv
sudo kubectl --namespace=jup get pvc
sudo kubectl --namespace=jup describe pvc claim-YOURXSEDEUSERNAME

Administration tips

Add more servers to Kubernetes

We can create more Ubuntu instances (with a volume attached) and add them to Kubernetes by repeating the same setup we performed on the first worker node. Once the node joins Kubernetes, it will be automatically used as a node for the distributed filesystem by Rook and be available to host user containers.

Remove a server from Kubernetes

Launch first the kubectl drain command to move the currently active pods to other nodes:

sudo kubectl get nodes
sudo kubectl drain <node name>

Then suspend or delete the instance on the Jetstream admin panel.

Configure a different authentication system

"Zero to Jupyterhub" supports out of the box authentication with:

XSEDE credentials with CILogon
Many Campuses credentials with CILogon
Globus
Google

See the documentation and modify config_jupyterhub_helm_v0.5.0.yaml accordingly.

Acknowledgements

The Jupyter team, in particular Yuvi Panda, for providing a great software platform and a easy-to-user resrouce for deploying it and for direct support in debugging my issues
XSEDE Extended Collaborative Support Services for supporting part of my time to work on deploying Jupyterhub on Jetstream and providing computational time on Jetstream
Pacific Research Platform, in particular John Graham, Thomas DeFanti and Dmitry Mishin (SDSC) for access to their Kubernetes platform for testing
XSEDE Jetstream's Jeremy Fischer for prompt answers to my questions on Jetstream