Nick Pentreath

Principal Engineer at The Apache Software Foundation

Western Cape, South Africa

Summary

Senior
Top School

Nick Pentreath is a principal engineer and AI/ML leader with 15 years of experience building scalable, data-driven systems across social, ecommerce, advertising, and finance domains. At Tumblr, he led ML development for core feeds, ranking and personalization, and helped architect embedding-based recommendations from the ground up. He is an active open-source contributor, Apache Spark PMC member and committer, and author of Machine Learning with Spark, with notable work on streaming analytics and scalable ML primitives. Nick co-founded GraphFlow and previously drove open-source ML initiatives at IBM CODAIT, bridging research rigor with production-grade systems. He is currently a Principal Engineer at Rumi.ai, shaping the intelligence layer for enterprise meetings and communication data. Based in the Western Cape, South Africa, he blends commercial focus with AI to deliver practical, data-driven business value across diverse industries.
13 years of coding experience
14 years of employment as a software developer
BSc, Quantitative Finance at University of Cape Town
University College London

Github Skills (40)

apache-spark 10
spark 10
recommendation-system 10
elasticsearch7 10
python 10
data-science 10
amazon-elasticsearch 10
elasticsearch8 10
machine-learning 10
recommender-systems 10
spark-streaming 10
elasticsearchapi 10
scala2 10
distributed-computing 10
scala 10

Programming languages (11)

Dockerfile, Java, C++, Shell, Scala, JavaScript, PHP, Swift

Github contributions (5)

Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch
Role in this project:
Full-stack Developer
Contributions: 90 commits, 19 PRs, 69 pushes in 3 years 1 month
Contributions summary: Nick updated Jupyter Notebooks to align with specific versions of Apache Spark and Elasticsearch. This involved modifying code within the notebooks to reflect changes in library versions and ensure compatibility. The changes focused on the core components of a recommender system, including data loading, ALS model training, and writing model factors to Elasticsearch.
python, recommender, jupyter-notebook, apache, spark
apache/spark

Jan 2015 - Sep 2020

Apache Spark - A unified analytics engine for large-scale data processing
Role in this project:
Back-end Developer & Data Scientist
Contributions: 37 PRs, 1796 comments in 5 years 9 months
Contributions summary: Nick's contributions primarily involve modifying the `ALS.scala` file, a core component for Alternating Least Squares matrix factorization, a technique often used for recommendation systems. The changes include enhancements for implicit feedback models and schema validation. Nick has also added or modified parameters related to storage levels and cold start strategies in the ALS algorithm, demonstrating the ability to configure and optimize the system for data processing and model training. He has also updated the ALS examples for Java and Python.
analytics, python, data-processing, sql, apache