Hua Jiang

Principal Software Engineer at AMD

San Jose, California, United States

Summary

🤩 Rockstar
🎓 Top School
Hua Jiang is a Principal Software Engineer and systems architect based in San Jose with two decades of experience building drivers, runtimes, and compilers for AI accelerators and high-throughput networking. At AMD he leads AI Engine driver and framework efforts, combining TVM, LLVM/MLIR, and bare-metal runtimes to optimize LLM fine-tuning and inference on specialized hardware. His background spans kernel and firmware work across vendors (AMD, Juniper, Dell, Riverbed) and includes leading SD-WAN data-plane and WLAN driver teams, giving him cross-stack expertise from silicon to cloud. An active contributor to the widely used apache/tvm project, he has improved the VTA accelerator runtime, reduced its memory use, added operators to the TFLite frontend, and added multithreading support to the simulator. Known for diagnosing hard-to-reproduce kernel and hardware issues, he pairs pragmatic engineering with performance-first design. He holds a BS in Mechanical Design and Manufacturing, bringing a hardware-aware perspective to software architecture.
9 years of coding experience
15 years of employment as a software developer
BS, Mechanical Design and Manufacturing, Nanjing University of Aeronautics and Astronautics
Languages: English, Chinese

Github Skills (12)

tvm: 10
compiler: 10
compiler-compiler: 10
c-language: 10
deep-learning: 10
cprogramming-language: 10
performance-optimization: 10
gpu: 9
tensor: 9
machine-learning: 8
python: 8
cuda: 7

Programming languages (4)

C++, C, Scala, Python

Github contributions (5)

apache/tvm

May 2019 - Jul 2022

Open deep learning compiler stack for cpu, gpu and specialized accelerators
Role in this project: Back-end Developer & Performance Engineer
Contributions: 408 reviews, 56 commits, 66 PRs in 3 years 2 months
Contributions summary: Hua primarily worked on improving the VTA (Versatile Tensor Accelerator) component of the TVM project. His contributions focused on fixing critical bugs in the VTA runtime, DRAM memory access, and compilation on the PYNQ board. He also implemented optimizations to reduce memory usage within VTA, added support for new operators to the TFLite frontend, and added multithreading support for the function simulator. In addition, he resolved several performance-related issues, including ones involving DRAM logic and hardware compilation errors.
metal, vulkan, compiler, tensor, opencl
huajsj/tvm

May 2019 - Sep 2020

Open deep learning compiler stack for cpu, gpu and specialized accelerators
Contributions: 2 PRs, 181 pushes, 21 branches in 1 year 4 months
cpu, gpu-programming, amd, gpu-acceleration, tvm