Ammar Ahmad Awan

Bellevue, Washington, United States

Summary

Ammar Awan is a Principal Research Manager at Microsoft in Bellevue, Washington, with 9 years of experience building teams and systems that push low-latency ML training and inference into production. He leads a new group focused on low-latency innovations and, as a former Senior Researcher on DeepSpeed, contributed core features such as 1-bit Adam, Mixture-of-Experts (MoE) integration, and inference pipelines for Hugging Face models. An active open-source contributor to DeepSpeed and Megatron-DeepSpeed, he has helped scale transformer training and serve large language models efficiently. His PhD research at Ohio State on GPU-aware MPI and large-scale deep learning benchmarks gives him a rare bridge between HPC communication stacks and practical, production ML systems. He combines hands-on systems engineering with research-driven leadership to turn performance research into usable tooling for distributed ML.
10 years of coding experience

GitHub Skills (18)

pytorch (10)
python (10)
machine-learning (10)
inference (10)
hugging-face-transformers (10)
transformer-models (10)
deepspeed (10)
deep-learning (10)
trainings (10)
gpu (10)
language-modeling (10)
modeling (10)
data-parallel (9)
model-optimization (9)
data-parallelism (9)

Programming languages (7)

C++, Shell, C, Jupyter Notebook, Python, Cuda, Fortran

GitHub contributions (5)

Example models using DeepSpeed
Role in this project: ML Engineer
Contributions: 51 reviews, 62 commits, 76 PRs in 2 years 4 months
Contributions summary: Ammar contributed to examples and utilities related to DeepSpeed, focusing in particular on inference with Hugging Face models. His work includes converting checkpoints, integrating models, and creating inference pipelines for text generation with models such as GPT-J, GPT-2, and BLOOM. He also implemented examples demonstrating Mixture of Experts (MoE) usage within the DeepSpeed framework (see the sketch after this entry).
deep-learning, pytorch, deepspeed
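To illustrate the kind of Hugging Face inference pipeline described in this entry, here is a minimal, hypothetical sketch using DeepSpeed's inference engine with a small GPT-2 model. The model name, dtype, and parallelism settings are illustrative assumptions, not taken from the actual DeepSpeedExamples code.

import torch
import deepspeed
from transformers import pipeline

# Hypothetical minimal sketch: wrap a Hugging Face text-generation pipeline
# with DeepSpeed's optimized inference kernels. Assumes a single GPU.
generator = pipeline("text-generation", model="gpt2", device=0)

# init_inference injects fused kernels into supported model architectures;
# mp_size=1 disables tensor-model parallelism for this toy example.
generator.model = deepspeed.init_inference(
    generator.model,
    mp_size=1,
    dtype=torch.half,
    replace_with_kernel_inject=True,
)

print(generator("DeepSpeed is", max_new_tokens=30)[0]["generated_text"])

Roughly the same pattern extends to larger models such as GPT-J or BLOOM by increasing the tensor-parallel degree and loading sharded checkpoints, though the exact steps depend on the example in question.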
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Role in this project: ML Engineer
Contributions: 18 reviews, 42 commits, 15 PRs in 11 months
Contributions summary: Ammar contributed to the implementation of Mixture of Experts (MoE) support within the training pipeline, modifying the `megatron/training.py` file. He also worked on integrating a teacher model for knowledge distillation, setting up the teacher model and incorporating it into the training loop (a distillation sketch follows this entry), and made multiple spelling corrections in configuration files.
nlp, transformers, bert, ongoing, scale
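As a hedged illustration of the teacher-model integration mentioned above, the following sketch shows a generic knowledge-distillation training step in plain PyTorch. The function name, temperature, and loss weighting are illustrative assumptions, not the actual Megatron-DeepSpeed code.

import torch
import torch.nn.functional as F

# Hypothetical sketch of a teacher-student distillation step: the frozen
# teacher provides soft targets, and the student is trained on a weighted
# mix of hard-label cross-entropy and soft-label KL divergence.
def distillation_step(student, teacher, batch, labels, temperature=2.0, alpha=0.5):
    teacher.eval()
    with torch.no_grad():  # the teacher is not updated
        teacher_logits = teacher(batch)

    student_logits = student(batch)

    ce_loss = F.cross_entropy(student_logits, labels)
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    return alpha * ce_loss + (1 - alpha) * kd_loss

In a Megatron-style training loop, a step like this would replace the plain language-modeling loss whenever a teacher checkpoint is configured.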