Bo Gao

Sr. Software Engineer at Databricks

Seattle, Washington, United States

Summary

Badges: Rockstar, Top School
Bo Gao is a senior software engineer based in Seattle with 6+ years of experience building distributed systems and cloud-native services, currently advancing streaming engine capabilities at Databricks. He spent six years at AWS, where he led cross-organizational projects, shipped device inventory and Snow device platform improvements, and served as a technical mentor and SME. Skilled in SQL and NoSQL databases, REST API design, and serverless services such as Lambda and DynamoDB, Bo couples pragmatic engineering with a focus on scalability and operational excellence. He is also an active contributor to Apache Spark, enhancing Spark Connect’s Python APIs and stateful streaming features for real-time processing. His background in electrical and computer engineering gives him a systems-oriented perspective that informs both low-level reliability work and high-level architecture decisions.
3 years of coding experience
7 years of employment as a software developer
M.S., Computer Engineering at Northeastern University
Bachelor's Degree, Electrical and Electronics Engineering at Shanghai Jiao Tong University
Languages: English, Chinese

Github Skills (9)

big-data: 10
spark: 10
python: 10
streaming: 10
scala: 10
apidoc: 9
api: 9
java: 7
javas: 7

Programming languages (3)

Java, Scala, HTML

Github contributions (5)

apache/spark

Aug 2022 - Apr 2025

Apache Spark - A unified analytics engine for large-scale data processing
Role in this project: Back-end Developer
Contributions: 192 reviews, 41 PRs, 254 comments in 2 years 7 months
Contributions summary: Bo primarily contributed to Apache Spark by implementing and improving streaming functionality. He focused on adding Python APIs for Spark Connect, particularly for stateful streaming operations such as `dropDuplicatesWithinWatermark`, `mapGroupsWithState`, and `flatMapGroupsWithState`. He also addressed error handling and resource management issues in streaming Python workers and added client-side support for streaming listeners. These contributions enhance Spark Connect's capabilities for real-time data processing.
Tags: analytics, python, data-processing, sql, apache
bogao007/spark

Sep 2022 - Apr 2025

Apache Spark - A unified analytics engine for large-scale data processing
Contributions: 247 pushes, 35 branches in 2 years 7 months
Tags: analytics, data-processing, apache, big-data, spark