Staff Software Engineer Tech Lead, Deep Learning Compilers at NVIDIA
California, United States
Summary
🤩 Rockstar
🎓 Top School
Ayan Moitra is a Staff Software Engineer and Tech Lead specializing in deep learning compilers, with seven years of industry experience building high-performance GPU-accelerated tooling for AI workloads. Currently leading XLA development at NVIDIA, he has a strong track record integrating cuDNN and CUDA optimizations into major open-source projects like TensorFlow/XLA and nGraph to accelerate fused MHA, convolutions, and matrix kernels. His background includes architecting GPU inference and training engines at MathWorks and backend optimizations for Nervana/Intel, giving him rare end-to-end expertise across compiler, runtime, and kernel implementation layers. He holds a Ph.D. in Computational Dynamics, where he developed novel GPU-parallel algorithms for nonlinear wave instabilities—a research-to-production trajectory that surfaces in his meticulous unit-testing and performance-focused contributions. Based in California, he combines deep numerical modeling instincts with pragmatic engineering to squeeze performance out of modern accelerators.
7 years of coding experience
11 years of employment as a software developer
The University of Maryland, College Park
Master of Science (MS), Mechanical Engineering at University of Maryland
Bachelor of Technology (B.Tech.) at Indian Institute of Technology, Kharagpur
DeepRec is a high-performance deep learning framework for recommendation, based on TensorFlow. It is hosted as an incubation project at the LF AI & Data Foundation.
Role in this project:
Back-end Developer & ML Engineer
Contributions: 69 commits in 1 year 8 months
Contributions summary: Ayan contributed to the deep learning framework by implementing and fixing convolution operations, specifically backward-input and backward-filter convolutions. The commits modify the cuDNN convolution rewriter and the depthwise convolution converters. Ayan also incorporated comments and addressed potential issues in the source code.
A machine learning compiler for GPUs, CPUs, and ML accelerators
Role in this project:
Back-end Developer
Contributions: 1 review, 54 commits, 6 PRs in 3 years 2 months
Contributions summary: Ayan primarily contributed to the XLA compiler, focusing on integrating cuDNN APIs for grouped convolution operations. This work included enabling the cuDNN backprop APIs for grouped convolutions and handling depthwise forward and backward-filter convolutions with cuDNN. Ayan also addressed comments and resolved conflicts in the codebase.
compiler, community-driven, machine-learning, modular