Holden Karau

San Francisco, California, United States

Summary

🤩
Rockstar
Holden Karau is a Co-Founder and seasoned software engineer with 25 years of experience specializing in big data, search and distributed systems. An Apache Spark committer and PMC member, she is author/co-author of Fast Data Processing With Spark, Learning Spark and High Performance Spark and has contributed to high-profile projects like Apache Spark, Beam and Kubeflow. She led Spark migration automation, dynamic scaling and Kubernetes integration efforts at Netflix and Apple and served as an open source advocate and contributor at Google and IBM. Now co-leading Fight Health Insurance, she still writes core tests, build tooling and release automation while juggling model tuning and product work. Early-career highlights include updating Linux kernel wireless drivers and building the All The Code source-search engine, and she holds a B.Math in Computer Science from the University of Waterloo.
25 years of coding experience

Stack Overflow

Stats
7,422 reputation
1.1m reached
223 answers
1 question
Badges
pyspark: top 5%
scala: top 5%
rdd: top 5%
apache-spark: top 1%
hadoop: top 5%

Github Skills (67)

apache-spark 10
documentations 10
python 10
testing 10
kubeflow 10
bash 10
scala2 10
data-ingestion 10
javas 10
automation 10
ci-cd 10
aws-s3 10
beamng 10
streaming 10
build-configuration 10

Programming languages (21)

MDX, Java, Jinja, C++, Rust, C, Scala, Go

Github contributions (5)

holdenk/spark-testing-base

Jan 2015 - Jan 2023

Base classes to use when writing tests with Spark
Role in this project: Back-end Developer
Contributions: 2 releases, 14 reviews, 923 commits in 8 years 1 month
Contributions summary: Holden's contributions primarily involved setting up the foundational build files and the base class for testing Spark Streaming applications. Her initial work established the project's build configuration using SBT, defining dependencies on core Spark components and testing libraries such as ScalaTest. Following this setup, she focused on creating a base test suite for Spark Streaming applications, developing core utility classes for testing stream consumption.
scala, base-classes, spark
Examples for High Performance Spark
Role in this project: Back-end Developer
Contributions: 849 commits, 168 PRs, 213 pushes in 7 years 3 months
Contributions summary: Holden primarily contributed Spark examples to the `high-performance-spark-examples` repository. Her work focused on building and refining the project's build system by introducing and configuring an SBT build, and on optimizing its build configuration and dependencies. She also added and revised examples illustrating key aspects of Apache Spark for high-performance computing.
performance, scala, high-performance, spark