DB Tsai

Senior Engineering Manager at The Apache Software Foundation

Cupertino, California, United States

Summary

🤩
Rockstar
🎓
Top School
DB Tsai is a senior engineering manager and open-source leader with 14+ years of experience building high-performance data platforms and ML infrastructure. He currently leads cloud data efforts from Cupertino and recently joined Databricks. He scaled and led Apple’s Spark, Flink, and Data Security teams from small, startup-style teams into award-winning organizations whose work earned back-to-back ACM SIGMOD Systems awards and produced industry-standard open-source components, such as a Spark-native accelerator and enterprise-grade encryption for columnar formats. A long-time Apache PMC member and committer on Spark and YuniKorn, he combines deep algorithmic contributions (e.g., Spark ML optimizers and online summarizers) with practical performance engineering: his commits touch core ML, SQL, and performance testing for one of the most widely used big-data engines. At Netflix and Alpine he built production ML pipelines and scalable algorithms that shortened experiment-to-deployment cycles and inspired patent-pending systems. He is equally comfortable in low-level algorithm design and large-scale team-building. Known for turning ambiguous, research-grade ideas into production at scale, he also has a habit of turning internal engineering projects into widely adopted open-source tools.
14 years of coding experience
11 years of employment as a software developer
Bachelor's degree, Physics, National Cheng Kung University
Master's degree, Physics, National Taiwan University
Doctor of Philosophy (Ph.D.) Program, Applied Physics, Stanford University
Languages: English, Chinese

Stackoverflow

Stats
1,388 reputation
70k reached
4 answers
10 questions

Github Skills (28)

apache-spark: 10
spark: 10
data-science: 10
big-data: 10
machine-learning: 10
plot: 10
java: 10
ml: 10
scala: 10
javas: 10
html: 10
documentation: 10
javascript: 9
unit-testing: 9
linear-regression: 9

Programming languages (11)

Java, C++, CSS, Shell, Rust, C, Scala, HTML

Github contributions (5)

vegas-viz/Vegas

May 2016 - Oct 2016

The missing MatPlotLib for Scala + Spark
Role in this project:
Back-end Developer
Contributions: 9 commits, 29 PRs, 30 pushes in 5 months
Contributions summary: DB primarily contributed to the development of the Vegas visualization library for Scala and Spark. Their work included enhancing the DSL for axis customization by adding parameters for labels, formats, and other axis-related properties. They also added Spark support to the library, enabling integration with Spark DataFrames, and wrote unit tests. Further contributions included minor documentation updates and bug fixes.
Tags: spark-scala, spark, missing, scala, datascience
apache/spark

Apr 2017 - Mar 2019

Apache Spark - A unified analytics engine for large-scale data processing
Role in this project:
Back-end Developer
Contributions: 58 reviews, 5 commits, 154 PRs in 1 year 11 months
Contributions summary: DB primarily contributed to the Apache Spark project by fixing bugs, improving code quality, and adding new features. Their work involved correcting typos, filtering empty strings from classloader-related configurations, and implementing online summarizer APIs for mean, variance, min, and max in MLlib. They also addressed issues related to error messages and the DataFrame API. These contributions span several modules, including MLlib, SQL, and core.
Tags: analytics, python, data-processing, sql, apache