Shahul Shereef is a San Francisco–based data science leader and founder with eight years of experience building end-to-end ML and NLP systems at startups and in production. As co-founder of ragas (YC W24), he is driving an open-source standard for evaluating LLM applications while contributing to OpenAssistant and other community projects. A Kaggle Grandmaster ranked in the top 20 among 100,000+ users, he blends competition-grade modeling with practical MLOps, including dataset conversions for instruction-following dialogue and audio augmentations (RandomCrop, Padding, SpliceOut) for deep learning. His background includes credit-underwriting NLP, TensorFlow model deployment on Google Cloud, and adding evaluation metrics like BertScore and EditScore to open-source toolkits. Curious and product-minded, he publishes work and code publicly (shahules786.github.io) and focuses on tooling that makes ML systems more testable and reproducible.
8 years of coding experience
3 years of employment as a software developer
CGPA 8.01 at Govt. Model Engineering College
Contributions: 255 reviews, 582 PRs, 319 pushes in 1 year 10 months
Contributions summary: Shahul implemented a BertScore metric and added SBERT score calculation and relative imports in the `belar/metrics/similarity.py` file. He also added an EditScore metric with distance and ratio measures, a BLEU score, a Textual Entailment Score, and the Q-square metric, and he fixed device checks and reformatted imports.
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Role in this project:
ML Engineer
Contributions: 5 reviews, 52 commits, 4 PRs in 1 month
Contributions summary: Shahul primarily implemented and tested audio data augmentation techniques in PyTorch for the `torch-audiomentations` repository. He developed a `RandomCrop` augmentation, covering the initial implementation, base class initialization, type conversions, and tests. He also added a `Padding` augmentation and contributed to a `SpliceOut` augmentation. The commits focus on extending the library's audio processing capabilities, particularly for deep learning applications.