Yinan Li is a software engineer based in Mountain View with 11 years of experience building ML-driven ad systems and distributed data infrastructure. Currently at Google working on Shopping Ads pCTR and personalization, he is an Apache Spark committer who helped bring Spark to Kubernetes by improving the Kubernetes cluster scheduler backend and contributing core API types and controllers to the Kubeflow spark-operator. He has productionized CTR and conversion models (Wide & Deep, Deep Cross) and engineered large-scale feature pipelines using Spark/Scala, TensorFlow, and AWS. Yinan blends backend and DevOps expertise—Kubernetes, init containers, resource scheduling—with applied ML and data engineering. Trained in biological sciences before earning an MS in Computer Science, he brings a research-honed analytical mindset to practical, production-grade systems and open-source projects.
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Role in this project:
Back-end & DevOps Engineer
Contributions: 30 releases, 145 reviews, 447 commits in 4 years 11 months
Contributions summary: Yinan primarily focused on the initial setup of the API types, client, and controllers for the Kubernetes operator. He also added code that creates the ConfigMaps used to manage Spark and Hadoop configuration files. His contributions center on the operator's core architecture and configuration handling, reflecting experience in backend development and infrastructure setup.
Apache Spark - A unified analytics engine for large-scale data processing
Role in this project:
Back-end & DevOps Engineer
Contributions: 20 PRs, 677 comments in 2 years 6 months
Contributions summary: Yinan contributed significantly to the Kubernetes integration in Apache Spark. He focused on implementing and refining the Kubernetes cluster scheduler backend, including executor pod creation, resource allocation, and dependency management via init containers. His work also included improvements to the submission client and documentation updates, covering both backend functionality and developer experience. Together, these changes enabled Spark applications to be deployed and run on Kubernetes clusters.
Topics: analytics, python, data-processing, sql, apache
Yinan Li - Software Engineer at Google GoogleCloudPlatform