Bo Gao is a senior software engineer based in Seattle with 6+ years of experience building distributed systems and cloud-native services, currently advancing streaming engine capabilities at Databricks. He spent six years at AWS, where he led cross-organizational projects, shipped device inventory and Snow device platform improvements, and served as a technical mentor and SME. Skilled in SQL and NoSQL databases, REST API design, and serverless services such as Lambda and DynamoDB, Bo couples pragmatic engineering with a focus on scalability and operational excellence. He is also an active contributor to Apache Spark, enhancing Spark Connect’s Python APIs and stateful streaming features for real-time processing. His background in electrical and computer engineering gives him a systems-oriented perspective that informs both low-level reliability work and high-level architecture decisions.
3 years of coding experience
7 years of employment as a software developer
M.S., Computer Engineering at Northeastern University
Bachelor's Degree, Electrical and Electronics Engineering at Shanghai Jiao Tong University
Apache Spark - A unified analytics engine for large-scale data processing
Role in this project:
Back-end Developer
Contributions: 192 reviews, 41 PRs, 254 comments in 2 years 7 months
Contributions summary: Bo primarily contributed to the Apache Spark project by implementing and improving streaming functionality. He focused on adding Python APIs for Spark Connect, particularly for stateful streaming operations like `dropDuplicatesWithinWatermark`, `mapGroupsWithState`, and `flatMapGroupsWithState`. He also addressed issues related to error handling and resource management in streaming Python workers, and added client-side support for streaming listeners. His contributions enhance the capabilities of Spark Connect for real-time data processing.
Apache Spark - A unified analytics engine for large-scale data processing
Contributions: 247 pushes, 35 branches in 2 years 7 months
analytics, data-processing, apache, big-data, spark