Matt Stack

Solutions Architect at NVIDIA

California, United States

Summary

🤩 Rockstar
🎓 Top School
Matt Stack is a Solutions Architect at NVIDIA with eight years of experience building high-performance, GPU-accelerated systems and tooling. He combines hands-on C++ and CUDA expertise with architecture-level thinking, and has contributed backend work to the widely used Kokkos performance portability library, where he improved CUDA memory management and asynchronous behavior. A University of Delaware computer science alumnus, he started by accelerating biology modeling on GPUs, a background that informs his practical approach to computational problems. Based in California, Matt bridges research and production, translating low-level performance tuning into scalable solutions for complex technical teams.
8 years of coding experience
Bachelor of Science (BS), Computer Science at University of Delaware

Stack Overflow

Stats
1 reputation
0 reached
0 answers
0 questions

Github Skills (9)

cuda: 10
cluster-computing: 10
c-language: 10
parallel-computing: 10
cprogramming-language: 10
scientific-computing: 10
abstraction: 9
hip: 7
programming-language: 7

Programming languages (6)

Java, C++, JavaScript, Fortran, CUDA, Python

Github contributions (5)

kokkos/kokkos

May 2021 - Sep 2021

Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
Role in this project:
Back-end Developer
Contributions: 21 reviews, 23 commits, 11 PRs in 4 months
Contributions summary: Matt primarily contributed to the Kokkos library's CUDA integration. His work focused on implementing and refining CUDA memory management, including support for `cudaMallocAsync` and `cudaFreeAsync`, to improve performance. He added conditional compilation based on the CUDA version, addressed memory allocation issues, and added synchronization calls to prevent unintended asynchronous behavior. He also made corrections to fix compilation issues.
memory, mpi, c-plus-plus, multi-threading, kokkos
matt-stack/hello-world

Feb 2018 - Nov 2020

Contributions: 2 PRs, 3 pushes, 3 branches in 2 years 9 months
python