Raul Puri

United States, United Kingdom

Summary

Raul Puri is a researcher and machine learning engineer with 10 years of experience based in Berkeley, currently on OpenAI’s Multimodal Learning team working on GPT-4 multimodality, Codex, embeddings, and the GPT Edit feature. He brings research leadership from NVIDIA and applied ML experience spanning unsupervised and curriculum learning, distributed and mixed-precision training, and production deployment of speech and NLP systems. An active open-source contributor, he has helped adapt high-profile NVIDIA projects: he patched Tacotron2 for reliable single-GPU and distributed inference and scaled unsupervised language modeling for robust sentiment classification, with contributions ranging from CUDA fixes to changing optimizer defaults to improve training stability. With an EECS degree and a near-complete bioengineering minor from UC Berkeley, he blends systems-level engineering with algorithmic research and a practical focus on reproducible ML at scale.
11 years of coding experience

Github Skills (24)

pytorch (10)
distributed-training (10)
python (10)
machine-learning (10)
speech-synthesis (10)
deep-learning (10)
cuda (10)
nlp (10)
preprocess (9)
datapreprocessing (9)
preprocessing (9)
pre-processing (9)
data-prep (9)
model-optimization (9)
data-pre-processing (9)

Programming languages (4)

C++, JavaScript, Jupyter Notebook, Python

Github contributions (5)

NVIDIA/sentiment-discovery

Dec 2017 - Oct 2018

Unsupervised Language Modeling at scale for robust sentiment classification
Role in this project: ML Engineer
Contributions: 3 releases, 27 commits, 9 PRs in 10 months
Contributions summary: Raul primarily contributed to the development and maintenance of the sentiment discovery model, addressing critical issues such as import changes, CUDA compatibility, and handling of different data types. He fixed bugs, updated data loading procedures, and integrated necessary dependencies. He also modified the model wrapper and main script, changing the default optimizer to Adam and adding a base-GPU argument for distributed training.
pytorch, nlp, bert, deep-learning, unsupervised
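The contribution summary above mentions changing the default optimizer to Adam and adding a base-GPU argument for distributed training. A minimal, hypothetical sketch of what such a CLI change can look like in a PyTorch training launcher (the flag names and structure here are illustrative, not the repository's actual arguments):

```python
import argparse

def build_parser():
    # Illustrative reconstruction of the kind of change described:
    # Adam becomes the default optimizer, and a --base-gpu offset lets
    # each distributed worker map its local rank onto the right device.
    parser = argparse.ArgumentParser(description="training launcher sketch")
    parser.add_argument("--optimizer", default="adam",
                        choices=["adam", "sgd"],
                        help="optimizer to use (default: adam)")
    parser.add_argument("--base-gpu", type=int, default=0,
                        help="first GPU index; worker i uses base_gpu + i")
    parser.add_argument("--local-rank", type=int, default=0,
                        help="rank of this worker on the local node")
    return parser

args = build_parser().parse_args(["--base-gpu", "2"])
# Each worker would pin itself to this device, e.g. via
# torch.cuda.set_device(device_index) in real training code.
device_index = args.base_gpu + args.local_rank
```

A base-GPU offset like this is useful when several distributed jobs share one multi-GPU machine and each job must claim a disjoint range of devices.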
NVIDIA/tacotron2

May 2018 - May 2018

Tacotron 2 - PyTorch implementation with faster-than-realtime inference
Role in this project: ML Engineer
Contributions: 7 commits, 3 PRs, 6 pushes in 2 days
Contributions summary: Raul primarily focused on updating and adapting the Tacotron2 model for a new version (0.4), including modifications to training scripts, model definitions, and utility functions. He adjusted the model's data handling, particularly regarding input lengths and padding. He also patched the inference script to address issues with distributed data parallel models and single-GPU execution.
pytorch, realtime, inference, faster, tacotron
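The inference patch described above touches a common PyTorch pitfall: checkpoints saved from a model wrapped in `DataParallel` or `DistributedDataParallel` prefix every parameter name with `module.`, which breaks loading into a plain single-GPU model. A hedged sketch of the usual remedy (not the repository's actual code; plain dicts stand in for tensor state dicts so the idea is self-contained):

```python
def unwrap_ddp_state_dict(state_dict):
    """Strip the 'module.' prefix that (Distributed)DataParallel prepends
    to parameter names, so a checkpoint saved from a wrapped model can be
    loaded into an unwrapped, single-GPU model."""
    prefix = "module."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }

# In real code the values would be tensors from torch.load(...);
# integers stand in here to keep the sketch dependency-free.
checkpoint = {"module.encoder.weight": 1, "step": 2}
clean = unwrap_ddp_state_dict(checkpoint)
```

After unwrapping, the cleaned dict can be passed to `model.load_state_dict(...)` on a bare model; keys that never had the prefix (e.g. bookkeeping entries) pass through unchanged.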