Pulkit Arora

Data Engineer

Rotterdam, South Holland, Netherlands

Summary

Senior
Top School
Pulkit Arora is a data engineer based in Rotterdam with eight years of hands-on experience building cloud-native, cost-optimized data pipelines and MLOps systems that serve thousands of active users. He combines practical ETL, CI/CD, IaC and orchestration expertise with research-led machine learning experience from roles at Fraunhofer and the University of Bonn, where he studied LLM robustness and multimodal self-supervised models. Pulkit has shipped production ML features, including fine-tuning transformers on SageMaker and deploying distributed NER and extractor pipelines, and has contributed to open-source ML libraries by implementing and refactoring neural network padding layers in the mlpack project. His work emphasizes scalability and real-world impact: serverless ingestion, automated model deployment, and a false-linkage rate below 1% in knowledge-graph construction. Comfortable across research and production, he brings an uncommon blend of low-level ML understanding (C++ library contributions) and cloud-first engineering pragmatism.
8 years of coding experience
5 years of employment as a software developer
M.Sc. Media Informatics Computer Science at RWTH Aachen University
Languages: English, Hindi, German

Github Skills (12)

machine-learning: 10
deeplearning-ai: 10
deep-learning: 10
cpp: 10
cplus: 10
data-structure: 9
algorithm: 9
data-structures: 9
algorithms: 9
regression: 4
nearest-neighbor-search: 3
nearest-neighbors: 3

Programming languages (8)

Java, CSS, C++, Go, HTML, Jupyter Notebook, Python, Matlab

Github contributions (5)

mlpack/mlpack

Nov 2019 - Dec 2019

mlpack: a fast, header-only C++ machine learning library
Role in this project: ML Engineer
Contributions: 12 commits, 1 PR, 9 comments in 1 month
Contributions summary: Pulkit focused primarily on implementing and modifying the padding layer in the mlpack library. He added padding layers, integrated them into the transposed convolution layer, and resolved the associated testing issues. He also refactored code to streamline how padding is used in different contexts, demonstrating an understanding of neural network layer design. These changes likely required a solid grasp of the underlying mathematical operations.
regression, header, deep-learning, scientific-computing, c-plus-plus
Contributions: 19 pushes in 5 years 1 month