Patrick Wendell is a co-founder and VP of Engineering at Databricks with 15 years of experience building large-scale data and machine learning platforms. He helped found Apache Spark as a Berkeley committer and continues to contribute to core open-source projects like Spark and Bahir, with hands-on improvements to shuffle performance, RDD tracking, and streaming examples. At Databricks he scaled engineering from 35 to 80 engineers and now leads multiple product areas including exploratory data analysis, ML, and ingestion for thousands of enterprise customers. He left a Berkeley PhD to pursue the startup, blending deep distributed-systems research with pragmatic product delivery. Based in San Francisco, he spends substantial time hiring senior ICs and managers, balancing technical leadership with team growth. An engineer at heart, he still digs into low-level performance, build systems, and developer tooling that keep massive data platforms running.
15 years of coding experience
5 years of employment as a software developer
Master of Science, Computer Science at University of California, Berkeley
San Francisco University High School
B.S.E., Computer Science at Princeton University
Lightning-fast cluster computing in Java, Scala and Python.
Role in this project:
Back-end Developer
Contributions: 464 commits in 1 year 2 months
Contributions summary: Patrick's contributions focused on adding features and improving the underlying functionality of Apache Spark. This included implementing RDD origin tracking to help users understand where RDDs are created in their code, which involved modifications to the `RDD.scala` and `DAGScheduler.scala` files. He also added several logging messages to make application performance easier to debug, and added documentation to various Java APIs.
Apache Spark - A unified analytics engine for large-scale data processing
Role in this project:
Back-end Developer
Contributions: 15 commits, 29 PRs, 1125 comments in 2 years 3 months
Contributions summary: Patrick's contributions focused on enhancing the Apache Spark codebase. He modified the build process, updating and refining dependencies such as Mesos and Jetty. He also contributed improvements to shuffle performance metrics and the user interface, showing experience with the low-level details of data processing, and he modified and fixed tests.
analytics, python, data-processing, sql, apache
Patrick Wendell - Co-founder And VP Of Engineering at Databricks