HPC & Scientific Computing · C++/CUDA · Numerical Methods · MOX Laboratory
HPC and GPU software engineer working on scientific applications and scalable computing systems at the MOX Laboratory (Politecnico di Milano), contributing to research across applied mathematics, physics, and engineering.
Work focuses on C++/CUDA development, numerical methods, and performance engineering, including performance portability across heterogeneous CPU/GPU architectures and the design of scalable simulation codes.
Background in computational physics and numerical methods, with experience in large-scale simulations and performance-critical applications. Contributions include peer-reviewed publications and the development of scientific software.
Experience in developing and optimising numerical methods for large-scale simulations and scientific applications.
Focus on GPU optimisation, parallel algorithms, and development of portable, high-performance scientific software for research applications.
Experience in scientific software development and optimisation, physics-based modelling, and large-scale simulations, combining numerical methods with high-performance computing techniques.
Also involved in MSc-level teaching and training activities in scientific computing and HPC, including hands-on laboratories and support for large student cohorts.
Leading HPC infrastructure development and scientific computing strategy. Designing GPU-enabled systems and developing high-performance scientific software, while coordinating technical activities and supporting research projects.
Work includes developing scientific software from scratch that was later adopted and extended in subsequent research projects. It also covers analysing and extending existing codebases and adapting HPC software stacks (e.g. OpenPBS) to heterogeneous Linux environments.
Delivering MSc-level laboratories in scientific computing (C++, MPI, GPU programming), supervising students on HPC and GPU-related topics, and co-supervising a PhD student.
Developing massively parallel numerical methods and GPU-accelerated simulation codes with performance portability across architectures, alongside dissemination activities including publications, conference presentations, and organisation of minisymposia.
Design and deployment of modern GPU-based HPC systems to support research workloads.
Implementation of a massively parallel Material Point Method in C++20 using TBB and (roc)Thrust, designed for performance portability across CPU and GPU architectures.
Adopted for continued development within an ongoing PhD project.
Plasma edge modelling for nuclear fusion using SOLPS-ITER. Background strengthened through participation in the Max Planck Institute for Plasma Physics Summer School.
Implementation of a neural network from scratch in C++ using Eigen and STL.
Repository →
paolojoseph dot baioni at polimi dot it
paolojoseph dot baioni at gmail dot com