Pablo Casares is a quantum algorithm scientist with nine years of experience applying fault-tolerant quantum methods to chemistry, materials, and optimization problems, currently developing Trotter-based Hamiltonian simulation at Xanadu while directing the International Programme on AI Evaluation: Capabilities and Safety. He created GradDFT, a JAX library for differentiable exchange-correlation design, and led early quantum approaches to protein folding (QFold), lithium-ion battery simulation, and quantum interior-point algorithms with best-known scaling. Complementing his quantum work, Pablo has contributed to large-scale LLM evaluation efforts such as ADELE and helped build a multi-step arithmetic task for Google’s BIG-bench, bridging rigorous benchmarking and safety assessment. He holds a PhD in quantum algorithms, an Oxford MSc in theoretical physics, and combines deep technical research with public engagement, including award-winning journal reviewing and a public commitment to effective altruism. An unusual strength is his dual fluency in fault-tolerant quantum algorithm design and practical AI evaluation, enabling cross-disciplinary insights that inform safer, more useful systems.
9 years of coding experience
BSc in Physics at Universidad de Extremadura
MSc in Theoretical and Mathematical Physics at University of Oxford
PhD in Quantum algorithms, Physics at Universidad Complutense de Madrid
Degree in Physics at Uniwersytet Zielonogórski
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
Role in this project:
ML Engineer
Contributions: 8 reviews, 19 commits, 1 PR in 1 month
Contributions summary: Pablo primarily contributed a multi-step arithmetic task to the `google/big-bench` repository. He wrote the task in Python, integrated its evaluation results, and refined it over several iterations: fixing how parenthesis strings are generated, correcting minor errors so the task ran correctly, changing the default spacing, and updating the task keywords.
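To make "multi-step arithmetic" concrete, here is a minimal sketch of how such a task might generate nested, parenthesized expressions with configurable spacing. It is not the code from the `google/big-bench` repository: the function names `make_expression` and `make_example`, the operator set, the single-digit operands, and the `input`/`target` keys are all assumptions chosen for illustration.

```python
import operator
import random

# Hypothetical operator set; the real task may use a different one.
OPS = (("+", operator.add), ("-", operator.sub), ("*", operator.mul))


def make_expression(depth, rng, spaces=True):
    """Return (text, value) for a fully parenthesized expression.

    `depth` is the number of nesting levels; depth 0 is a single digit.
    `spaces` controls whether operators are padded with spaces.
    """
    if depth == 0:
        n = rng.randint(0, 9)
        return str(n), n
    sep = " " if spaces else ""
    left_text, left_value = make_expression(depth - 1, rng, spaces)
    right_text, right_value = make_expression(depth - 1, rng, spaces)
    symbol, fn = rng.choice(OPS)
    text = f"({left_text}{sep}{symbol}{sep}{right_text})"
    return text, fn(left_value, right_value)


def make_example(depth=2, seed=0, spaces=True):
    """Build one prompt/answer pair; the dict keys are illustrative."""
    rng = random.Random(seed)  # seeded so examples are reproducible
    text, value = make_expression(depth, rng, spaces)
    return {"input": f"{text} =", "target": str(value)}


if __name__ == "__main__":
    print(make_example(depth=2, seed=1))
```

Computing the value alongside the string avoids parsing the expression back, and the `spaces` flag shows the kind of default-spacing choice the summary mentions.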