Python is often the first choice for prototyping research ideas, but scaling that prototype to thousands of cores and multi‑node workflows needs a different toolkit. This post shares a webinar I authored for the Advanced HPC-CI Webinar series that walks through a practical path: start with idiomatic Python, isolate hotspots, accelerate them, then scale out while keeping the development loop fast.
I’m Andrea Zonca, Lead of the Scientific Computing Applications Group at the San Diego Supercomputer Center. In the session I focus on pragmatic techniques that have repeatedly worked for moving Python workloads onto supercomputers. About me: researcher profile.
Key chapters:
- Environment management (4:00) – Jupyter workflow; two deployment patterns: (1) pre-packaged Conda environment tarballs staged to fast storage (a packing sketch follows this list); (2) Singularity / Docker containers for portability and reproducibility.
- AI code assistants (9:17) – Using tools like GitHub Copilot plus terminal assistants to speed iteration; keep a tight human review loop.
- Threads vs processes (17:12) – GIL implications; when threads are fine (I/O-bound work, native extensions that release the GIL) and when to switch to multiprocessing or compiled sections; memory trade‑offs (see the thread/process comparison after this list).
- Numba optimization (32:54) – JIT-compiling hotspots, type specialization, parallel=True, cache usage, and when to stop micro-optimizing (a Numba sketch follows below).
- Dask for scaling out (55:24) – Task graphs, scheduler behavior, choosing between Array / Delayed / DataFrame, minimizing data movement, dashboard-driven tuning (a Dask sketch closes the examples below).
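For the first deployment pattern, here is a minimal sketch of packing an existing Conda environment into a relocatable tarball with the conda-pack Python API. It assumes conda-pack is installed, and the environment name and output path are placeholders; the tarball is then staged to your site's fast or scratch storage and unpacked there before use.

```python
import conda_pack

# Pack the existing Conda environment "myenv" (placeholder name) into a
# relocatable tarball that can be copied to fast/scratch storage and
# extracted on the compute nodes.
conda_pack.pack(name="myenv", output="myenv.tar.gz")
```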
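To make the GIL trade-off concrete, here is a small self-contained comparison (not taken from the webinar) of thread and process pools on a CPU-bound, pure-Python function; the work sizes are arbitrary and timings will vary by machine.

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import math
import time

def cpu_bound(n):
    # Pure-Python loop: it holds the GIL, so threads cannot run it in parallel.
    return sum(math.sqrt(i) for i in range(n))

def timed(executor_cls, work):
    start = time.perf_counter()
    with executor_cls(max_workers=4) as pool:
        list(pool.map(cpu_bound, work))
    return time.perf_counter() - start

if __name__ == "__main__":
    work = [2_000_000] * 8
    # Expect processes to win on CPU-bound pure-Python code; threads shine
    # for I/O or for native extensions that release the GIL.
    print(f"threads:   {timed(ThreadPoolExecutor, work):.2f} s")
    print(f"processes: {timed(ProcessPoolExecutor, work):.2f} s")
```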
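As a minimal illustration of the Numba chapter, here is a sketch of JIT-compiling a hotspot with parallel=True and cache=True; the pairwise-distance function and array sizes are invented for the example, not from the webinar.

```python
import numpy as np
from numba import njit, prange

# cache=True persists the compiled code between runs;
# parallel=True lets Numba split the prange loop across cores.
@njit(parallel=True, cache=True)
def pairwise_mean_distance(points):
    n = points.shape[0]
    total = 0.0
    for i in prange(n):
        for j in range(n):
            d = 0.0
            for k in range(points.shape[1]):
                diff = points[i, k] - points[j, k]
                d += diff * diff
            total += np.sqrt(d)  # scalar reduction, supported in parallel loops
    return total / (n * n)

points = np.random.default_rng(0).random((2000, 3))
print(pairwise_mean_distance(points))
```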
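Finally, a Dask sketch of the chunked-array/task-graph idea. LocalCluster stands in for an HPC deployment (on a supercomputer you would typically launch workers through the batch system, e.g. with dask-jobqueue), and the array and chunk sizes are placeholders; the dashboard link it prints is what drives the tuning discussed in the chapter.

```python
import dask.array as da
from dask.distributed import Client, LocalCluster

if __name__ == "__main__":
    # Local stand-in for a multi-node cluster; prints a dashboard URL.
    cluster = LocalCluster(n_workers=4, threads_per_worker=1)
    client = Client(cluster)
    print("dashboard:", client.dashboard_link)

    # Chunked array keeps per-task memory bounded; the reduction builds a
    # task graph that the scheduler executes across workers.
    x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
    print((x + x.T).mean().compute())

    client.close()
    cluster.close()
```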
My group, the Scientific Computing Applications Group, helps scientists across the US optimize their code for supercomputers. Feel free to reach out via my contact info.
If you have any problems or feedback, please open an issue in the repository.