Guangtai Huang

SDE at Amazon Web Services (AWS)

San Jose, California, United States

Summary

🤩 Rockstar
🎓 Top School
Guangtai Huang is an SDE based in San Jose with nine years of coding experience, building performant deep learning compilers and backend systems at AWS. He has a strong open-source track record in Apache MXNet and TVM, where he implemented operators (e.g., isnan, NumPy ops), added BF16 support to CUDA codegen, and improved correctness and memory efficiency at the C++/CUDA level. His background includes focused work on NumPy compatibility in MXNet during an AWS Shanghai AI Lab internship, showing both production-grade engineering and research-adjacent compiler expertise. Comfortable across low-level systems and ML stack integration, he combines pragmatic bug-fixing with adding core functionality that benefits diverse hardware backends.
9 years of coding experience
3 years of employment as a software developer
Bachelor of Engineering, Information Engineering at Southern University of Science and Technology

Github Skills (15)

cuda 10
compiler 10
tvm 10
mxnet 10
deeplearning-ai 10
c-language 10
compiler-compiler 10
deep-learning 10
cprogramming-language 10
python 10
numpy 10
gpu 9
machine-learning 9
bfd 9
tensor 9

Programming languages (9)

Julia, TypeScript, C++, CSCSS, JavaScript, Go, Jupyter Notebook

Github contributions (5)

apache/mxnet

Jul 2019 - May 2020

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
Role in this project: Back-end Developer
Contributions: 17 commits, 54 PRs, 100 comments in 9 months
Contributions summary: Guangtai primarily focused on fixing bugs and improving the `numpy` operator implementations in the `mxnet` deep learning framework. His work involved modifying C++ and CUDA code for the `np_unique_op` and `where` operators, optimizing memory usage, and addressing potential issues. He also adjusted and added test cases to verify that the implemented operators behave correctly.
python, scheduler, dataflow, mutation, data-science
apache/tvm

Sep 2019 - Feb 2022

Open deep learning compiler stack for cpu, gpu and specialized accelerators
Role in this project: ML Engineer
Contributions: 5 reviews, 10 commits, 12 PRs in 2 years 4 months
Contributions summary: Guangtai primarily contributed to the development of TVM, an open-source deep learning compiler stack. His work focused on adding the `isnan` operator, which involved implementing the operator, integrating it into the test suite, and supporting data types including float and bfloat16. He also modified the compile engine, updated Relay passes to improve efficiency, and added BF16 support to CUDA codegen, contributing to both core compiler functionality and GPU-specific code generation.
metal, vulkan, compiler, tensor, opencl