Devendar Bureddy

Distinguished Engineer at NVIDIA

California, United States

Summary

🤩 Rockstar
🎓 Top School
Devendar Bureddy is a Distinguished Engineer based in California with 12 years of experience building high-performance, GPU-accelerated system software for industry-leading vendors including NVIDIA and Mellanox. He specializes in low-latency distributed communication and MPI ecosystems, contributing upstream to prominent open-source projects like UCX and Open MPI where he implemented CUDA integration, memory-type support, and optimized collective operations. Known for solving hard problems in memory registration, datatype handling, and progress efficiency, he blends deep kernel-to-user-space systems expertise with pragmatic engineering leadership. A master’s graduate from IIT Kanpur, Devendar brings a research-informed approach to production-grade software and a track record of influencing both vendor platforms and community-driven HPC stacks.
12 years of coding experience
18 years of employment as a software developer
Jawaharlal Nehru Technological University Hyderabad
Indian Institute of Technology Kanpur

Stack Overflow

Stats
Reputation: 1
Reached: 0
Answers: 1
Questions: 0

Github Skills (13)

c17: 10
cuda: 10
gpu-programming: 10
mpi: 10
c-language: 10
hpc: 10
openmpi: 10
cprogramming-language: 10
c11: 10
performance-optimization: 10
rdma: 9
fortran: 9
networking: 8

Programming languages (5)

C++, C, PHP, Jupyter Notebook, Python

Github contributions (5)

openucx/ucx

Sep 2017 - Dec 2021

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
Role in this project: Backend Developer
Contributions: 150 reviews, 254 commits, 165 PRs in 4 years 3 months
Contributions summary: Devendar contributed primarily by implementing CUDA-related functionality in the Unified Communication X (UCX) library. His work included adding build configurations and flags for CUDA, configuring GDRCOPY, and introducing and modifying various UCT/API interfaces. He also incorporated and tested the new memory-type support. These contributions show a focus on integrating CUDA and GPU-accelerated technologies into the UCX framework.
cray, mpi, openshmem, yang, roce
open-mpi/ompi

Nov 2013 - Dec 2022

Open MPI main development repository
Role in this project: Backend Developer
Contributions: 12 reviews, 51 commits, 11 PRs in 9 years 2 months
Contributions summary: Devendar contributed to the Open MPI project primarily by modifying and enhancing the HCOLL (High-Performance Collective Communications) module. His changes included implementing support for additional collective operations such as gatherv and alltoallv, fixing datatype-handling issues, and improving performance through progress-related optimizations. He also addressed memory management issues, particularly memory registration limits in the OpenIB BTL (Byte Transfer Layer) component. These modifications show a focus on improving the efficiency and functionality of MPI collective operations.
mpi, cluster-computing, fortran, openmpi, petsc