Muhammad Osama is a Senior Member of Technical Staff at AMD with ten years of experience optimizing GPU kernels for dense and sparse parallel computations. He holds a Ph.D. from UC Davis (Owens group) and combines research rigor with production engineering: he authored core algorithms in the open-source Gunrock graph analytics library (radix sort, SpGEMM, and a more efficient load-balancing scheme for the advance operator) and contributed modernization work, SM 80/86 support, and Windows support to ModernGPU. His background includes multiple NVIDIA research internships focused on load balancing irregular sparse workloads, and he has repeatedly modernized codebases through C++ refactors and cross-platform fixes. Based in Davis, CA, he delivers high-performance GPU primitives that bridge academic research and real-world HPC deployments.
11 years of coding experience
10 years of employment as a software developer
Doctor of Philosophy (Ph.D.), Computer Engineering at University of California, Davis
Bachelor of Engineering (B.E.), Electrical and Electronics Engineering at University of Washington
Associate of Science (A.S.), Computer Science and Electrical Engineering at Edmonds College
Contributions: 6 releases, 25 reviews, 1114 commits in 7 years 1 month
Contributions summary: Muhammad's contributions primarily involve implementing and optimizing fundamental graph algorithms within the Gunrock framework. His work includes implementing radix sort and building a new sparse matrix-matrix multiplication (SpGEMM) implementation. He added support for a more efficient load-balancing scheme for the advance operator. He has also been refactoring the graph algorithms and improving the class structure.
Contributions: 1 release, 1 review, 33 commits in 5 years
Contributions summary: Muhammad primarily contributes to the `moderngpu` project, focusing on GPU computing patterns. His work involves updating the code to support newer GPU architectures, specifically SM 80 and SM 86. He also refactored code to adhere to modern C++ standards, removing deprecated features and fixing type-mismatch issues, and made adjustments for cross-platform compatibility, including adding Windows support.
cuda, behaviors, gpu-programming, gpu-acceleration, gpu
Muhammad Osama - Senior Member Of Technical Staff at AMD