Siddharth Dalmia - Member Of Technical Staff at WaveForms AI

New York, New York, United States

Summary

🤩 Rockstar

🎓 Top School

Siddharth Dalmia is a Member of Technical Staff at WaveForms AI with eight years of experience building audio LLMs and multimodal long-context systems. He was previously a Research Scientist at Google DeepMind, where he worked on multimodal audio and long-context capabilities for Gemini. Siddharth holds a Ph.D. from Carnegie Mellon University’s Language Technologies Institute, where his research made sequence models more practical in resource-constrained settings by applying compositional principles such as task simplification, reusability, transferability, and data pooling. He pairs research depth with production engineering: he has contributed to open-source speech tooling (notably backend and DevOps fixes for espnet) and automated tasks such as audio resampling and model logging. Based in New York, he specializes in turning speech and language research into robust, deployable systems.

8 years of coding experience

9 years of employment as a software developer

Bachelor’s Degree, Computer Science at Birla Institute of Technology and Science

Doctor of Philosophy (PhD), Language Technologies, Computer Science at Carnegie Mellon University

GitHub Skills (17)

pytorch: 10
python: 10
scripting: 9
machine-learning: 9
shell: 9
script: 9
sh: 9
speech-recognition: 8
deep-learning: 8
text-to-speech: 8
speech-synthesis: 8
machine-translation: 8
dockerce: 4
kubernetes: 4
kubernetes-pod: 4

Programming languages (7)

CSS, Shell, C++, SCSS, JavaScript, Python, Cuda

GitHub contributions (5)

espnet/espnet

Oct 2020 - Mar 2022

End-to-End Speech Processing Toolkit

Role in this project:

Backend & DevOps Engineer

Contributions: 26 reviews, 53 commits, 30 PRs in 1 year 4 months

Contributions summary: Siddharth focused primarily on improving compatibility, debugging, and extending the functionality of core components. He addressed issues with PyTorch versions and run-script errors, and automated the resampling of audio files. He also improved model parameter logging and made several changes to the language model's configuration options.

speech-recognition, speech-separation, chainer, spoken-language-understanding, speech-processing

siddalmia/espnet

Feb 2021 - Aug 2022

End-to-End Speech Processing Toolkit

Contributions: 5 PRs, 125 pushes, 24 branches in 1 year 6 months

end-to-end, speech-to-text, speech-recognition, speech-synthesis, speech
