Ammar Ahmad Awan

Bellevue, Washington, United States

Summary

Ammar Awan is a Principal Research Manager at Microsoft in Bellevue, Washington, with 9 years of experience building teams and systems that push low-latency ML training and inference into production. He leads a new group focused on low-latency innovations and, as a former Senior Researcher on DeepSpeed, contributed core features such as 1-bit Adam, Mixture-of-Experts (MoE) integration, and inference pipelines for Hugging Face models. An active open-source contributor to DeepSpeed and Megatron-DeepSpeed, he has helped scale transformer training and serve large language models efficiently. His PhD research at Ohio State on GPU-aware MPI and large-scale deep learning benchmarks gives him a rare bridge between HPC communication stacks and practical, production ML systems. He combines hands-on systems engineering with research-driven leadership to turn performance research into usable tooling for distributed ML.
10 years of coding experience

GitHub Skills (18)

pytorch (10)
python (10)
machine-learning (10)
inference (10)
hugging-face-transformers (10)
transformer-models (10)
deepspeed (10)
deep-learning (10)
trainings (10)
gpu (10)
language-modeling (10)
modeling (10)
data-parallel (9)
model-optimization (9)
data-parallelism (9)

Programming languages (7)

C++, Shell, C, Jupyter Notebook, Python, Cuda, Fortran

GitHub contributions (5)

Example models using DeepSpeed
Role in this project: ML Engineer
Contributions: 51 reviews, 62 commits, 76 PRs in 2 years 4 months
Contributions summary: Ammar contributed to examples and utilities related to DeepSpeed, focusing in particular on inference with Hugging Face models. His work includes converting checkpoints, integrating models, and creating inference pipelines for text generation with models such as GPT-J, GPT-2, and BLOOM. He also implemented examples demonstrating Mixture of Experts (MoE) usage within the DeepSpeed framework (see the sketch after this entry).
deep-learning, pytorch, deepspeed
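To illustrate the kind of Hugging Face inference pipeline described in this entry, here is a minimal, hypothetical sketch using DeepSpeed's inference engine with a small GPT-2 model. The model name, dtype, and parallelism settings are illustrative assumptions, not taken from the actual DeepSpeedExamples code.

import torch
import deepspeed
from transformers import pipeline

# Hypothetical minimal sketch: wrap a Hugging Face text-generation pipeline
# with DeepSpeed's optimized inference kernels. Assumes a single GPU.
generator = pipeline("text-generation", model="gpt2", device=0)

# init_inference injects fused kernels into supported model architectures;
# mp_size=1 disables tensor-model parallelism for this toy example.
generator.model = deepspeed.init_inference(
    generator.model,
    mp_size=1,
    dtype=torch.half,
    replace_with_kernel_inject=True,
)

print(generator("DeepSpeed is", max_new_tokens=30)[0]["generated_text"])

Roughly the same pattern extends to larger models such as GPT-J or BLOOM by increasing the tensor-parallel degree and loading sharded checkpoints, though the exact steps depend on the example in question.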
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Role in this project: ML Engineer
Contributions: 18 reviews, 42 commits, 15 PRs in 11 months
Contributions summary: Ammar contributed to the implementation of Mixture of Experts (MoE) support within the training pipeline, modifying the `megatron/training.py` file. He also worked on integrating a teacher model for knowledge distillation, setting up the teacher model and incorporating it into the training loop (a distillation sketch follows this entry), and made multiple spelling corrections in configuration files.
nlp, transformers, bert, ongoing, scale
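As a hedged illustration of the teacher-model integration mentioned above, the following sketch shows a generic knowledge-distillation training step in plain PyTorch. The function name, temperature, and loss weighting are illustrative assumptions, not the actual Megatron-DeepSpeed code.

import torch
import torch.nn.functional as F

# Hypothetical sketch of a teacher-student distillation step: the frozen
# teacher provides soft targets, and the student is trained on a weighted
# mix of hard-label cross-entropy and soft-label KL divergence.
def distillation_step(student, teacher, batch, labels, temperature=2.0, alpha=0.5):
    teacher.eval()
    with torch.no_grad():  # the teacher is not updated
        teacher_logits = teacher(batch)

    student_logits = student(batch)

    ce_loss = F.cross_entropy(student_logits, labels)
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    return alpha * ce_loss + (1 - alpha) * kd_loss

In a Megatron-style training loop, a step like this would replace the plain language-modeling loss whenever a teacher checkpoint is configured.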