Kashif Rasul is a Principal Research Scientist based in Berlin with 17 years of experience in deep learning, probabilistic time series forecasting, reinforcement learning and high-performance GPU engineering. He bridges research and production, shipping models and tooling (from Time Series Transformers and distribution heads to DPO trainers for RL fine-tuning) while driving performance work such as cuDNN integrations in CuPy. An active open-source contributor, he has shaped major projects at Hugging Face (datasets, transformers, trl, diffusers), GluonTS and Keras, often improving test coverage, logging, docs and tensor-scaling utilities that make models reproducible in practice. His contributions span the full stack, from novel reward functions for RLHF to package and DevOps updates, reflecting a rare combination of algorithmic depth and systems-level pragmatism.
PyTorch-based probabilistic time series forecasting framework built on the GluonTS backend
Role in this project:
Back-end Developer & ML Engineer
Contributions: 14 releases, 1 review, 421 commits in 3 years 6 months
Contributions summary: Kashif made a series of commits focused on formatting fixes and cleanups within the `pytorch-ts` repository. These changes primarily refactored and modified existing transformation classes, indicating a focus on improving the data preprocessing pipeline. The commits also introduced fundamental elements for model training, including a trainer class. His contributions suggest involvement in model development and optimization, which aligns with ML engineering responsibilities.
Train transformer language models with reinforcement learning.
Role in this project:
ML Engineer
Contributions: 505 reviews, 235 PRs, 382 pushes in 2 years 1 month
Contributions summary: Kashif contributed significantly to the development of a DPO (Direct Preference Optimization) trainer, implementing its core functionality. He introduced the DPO trainer, added DPO data collators, incorporated loss functions (including SLiC hinge and IPO variants), and integrated support for precomputed reference log probabilities. He also added the KTO loss and an option for `compute_metrics`, demonstrating expertise in reinforcement learning and model training.
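For readers unfamiliar with the loss variants named in this summary, here is a minimal sketch of how the standard DPO (sigmoid), SLiC hinge and IPO losses differ for a single preference pair. It is an illustration in plain Python, not the code from the trl repository: the function name `preference_loss` and its parameters are hypothetical, and the reference log probabilities are passed in as plain numbers to mirror the idea of precomputing them once rather than running the reference model at every step.

```python
import math


def preference_loss(policy_chosen_logp, policy_rejected_logp,
                    ref_chosen_logp, ref_rejected_logp,
                    beta=0.1, loss_type="sigmoid"):
    """Loss for one (chosen, rejected) pair. Hypothetical helper, not the trl API."""
    # How much more strongly the policy prefers the chosen answer
    # than the (frozen) reference model does.
    margin = (policy_chosen_logp - policy_rejected_logp) - (
        ref_chosen_logp - ref_rejected_logp
    )
    if loss_type == "sigmoid":
        # Standard DPO: -log(sigmoid(beta * margin)), as a stable softplus.
        x = -beta * margin
        return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    if loss_type == "hinge":
        # SLiC-style hinge: zero loss once beta * margin reaches 1.
        return max(0.0, 1.0 - beta * margin)
    if loss_type == "ipo":
        # IPO: squared distance of the margin from the target 1 / (2 * beta).
        return (margin - 1.0 / (2.0 * beta)) ** 2
    raise ValueError(f"unknown loss_type: {loss_type!r}")


# Reference log probabilities computed once, ahead of training (assumed values).
ref = {"chosen": -1.5, "rejected": -1.5}
for kind in ("sigmoid", "hinge", "ipo"):
    loss = preference_loss(-1.0, -2.0, ref["chosen"], ref["rejected"],
                           beta=0.5, loss_type=kind)
    print(kind, round(loss, 4))
```

All three variants act on the same margin and differ only in how they penalize it: the sigmoid form keeps pushing the margin up, the hinge stops once the scaled margin reaches 1, and IPO pulls the margin toward a fixed target.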