Ke Jia is a seasoned software engineer with 21 years of experience, based in Shanghai, specializing in high-performance back-end systems for large-scale data processing. At Intel he is a PPMC member of Apache Gluten and a Velox committer, driving native-engine offloading and format support that bridge JVM SQL engines to faster execution layers. His open-source contributions include performance optimizations and bug fixes in Apache Spark's SQL engine, particularly join and shuffle improvements, and feature work in Velox to support Substrait plans, Parquet/DWRF formats, and Spark window functions. Ke combines deep systems-level C++ and Scala expertise with engineering pragmatism, improving adaptive query execution and IO buffering for real-world workloads. Colleagues rely on him to fix subtle performance regressions and to integrate complex file-format support across heterogeneous backends. He holds a master's degree in computer science from East China Normal University and brings a proven track record of shipping robust data-engine features in major OSS projects.
21 years of coding experience
Master's degree, Computer Science at East China Normal University
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Role in this project:
Back-end Developer
Contributions: 491 reviews, 35 commits, 353 PRs in 7 months
Contributions summary: Ke implemented DWRF format support in Gluten's Velox backend, adding new Scala and C++ code to enable reading and writing DWRF files. This involved creating new scan builders and scan classes, as well as modifying existing files to integrate DWRF. He also added Parquet write support in the Arrow backend and enabled date type validation. The commits suggest he is actively extending the Gluten project with support for various data formats.
A composable and fully extensible C++ execution engine library for data management systems.
Role in this project:
Back-end Developer
Contributions: 540 reviews, 2 commits, 83 PRs in 5 months
Contributions summary: Ke's contributions focused on enhancing the Velox execution engine library. He implemented file format capture for Substrait plans, modifying the SubstraitToVeloxPlan class to handle different file formats. He fixed an issue with the order of grouping keys in GroupIdNode's output and updated the buffer grow factor for the Parquet writer to improve performance. Additionally, he added support for Spark window functions and fixed multiple date/time-related functions.
query, vectorized, cpp, data-processing, acceleration