Samyam Rajbhandari

Principal Architect at Snowflake

Redmond, Washington, United States

Summary

🤩 Rockstar
🎓 Top School
Samyam Rajbhandari is a Principal Architect in Redmond with 11 years of experience building and scaling AI systems, currently leading inference efforts at Snowflake after a multi-year tenure as a Principal Architect at Microsoft. He pairs deep academic roots—a PhD from The Ohio State University where he developed communication‑optimal tensor contraction algorithms and CUDA implementations—with hands‑on production engineering. An active contributor to DeepSpeed and DeepSpeed‑MII, he added low‑level CUDA kernels (including fused LAMB), ZeRO memory and checkpointing improvements, and features that enable low‑latency, tensor‑parallel inference. His work spans register‑tiling and fusion techniques for tensor kernels through to MLOps integrations like gRPC asynchronous serving and Azure ML model versioning. That blend of research-grade performance optimization and practical deployment know‑how helps him translate cutting‑edge model advances into reliable, scalable inference services.
11 years of coding experience
11 years of employment as a software developer
Doctor of Philosophy (PhD), Computer Science and Engineering, The Ohio State University
Languages: English, Nepali, Hindi

Github Skills (29)

algorithm 10
optimizations 10
pytorch 10
deploying 10
python 10
optimizers 10
machine-learning 10
inference 10
azure-machine-learning 10
azuremachinelearning 10
mlops 10
deepspeed 10
deep-learning 10
optimisation 10
cuda 10

Programming languages (3)

C++, Jupyter Notebook, Python

Github contributions (5)

deepspeedai/DeepSpeed-MII

Mar 2022 - Nov 2022

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Role in this project:
MLOps Engineer
Contributions: 1 review, 35 commits, 3 PRs in 7 months
Contributions summary: Samyam primarily contributed to the deployment and operationalization of machine learning models within the DeepSpeed-MII framework. Their work involved modifying `mii/server_client.py` and associated files to integrate gRPC for model serving, enabling asynchronous requests and tensor parallelism. They also implemented features for registering and managing models within an Azure Machine Learning (AML) environment, incorporating model versioning, and supporting different deployment configurations like local and AML-on-AKS. Furthermore, they introduced the ability to enable or disable DeepSpeed optimizations during deployment and implemented features for parallelism configuration.
Tags: pytorch, deepspeed, deep-learning, inference, latency
deepspeedai/DeepSpeed

Feb 2020 - Dec 2022

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Role in this project:
ML Engineer
Contributions: 78 reviews, 74 commits, 39 PRs in 2 years 11 months
Contributions summary: Samyam made several contributions focused on optimizing and extending the DeepSpeed library. They added new CUDA kernels for the fused LAMB optimizer, aimed at improving training performance. They also worked on features for ZeRO optimization (stages 2 and 3), including memory management and checkpointing improvements, which are crucial for training large models. Further contributions include debugging and performance enhancements for allreduce operations and gradient accumulation.
Tags: billion-parameters, finetuning, training, mixture-of-experts, zero