Li Shuming

Software Engineer at CelerData

Hangzhou City, Zhejiang, China

Summary

🤩
Rockstar
🎓
Top School
Li Shuming is a software engineer with 11 years of experience building high-performance query engines and big data systems, currently contributing at CelerData in Hangzhou. He has strong back-end expertise from roles at Alibaba (Hologres), Ant Group (AntSpark), NetEase, and Baidu, focusing on query optimization, storage, and distributed analytics. An active open-source contributor, he improved core database behavior in StarRocks, where he fixed a crash on empty hash tables and optimized hash joins, and he added SQL functions to Apache Calcite to better align it with PostgreSQL/MySQL semantics. His work combines low-level performance tuning with rigorous testing, including unit tests that validate these fixes. With a Master’s from the University of Chinese Academy of Sciences, he brings both research-informed thinking and production-hardened delivery to large-scale analytics platforms.
11 years of coding experience
7 years of employment as a software developer
Master’s degree, University of Chinese Academy of Sciences
Bachelor's degree, Central South University

Stackoverflow

Stats
153 reputation
8k reached
1 answer
2 questions

Github Skills (21)

apache-calcite 10
c-language 10
testing 10
databases 10
java 10
calc 10
javas 10
sql 10
hashtable 10
cprogramming-language 10
database 10
big-data 9
query-optimization 9
olap 9
distributed-database 9

Programming languages (6)

Java, C++, Lean, Scala, SCSS, Objective-C

Github contributions (5)

StarRocks/starrocks

Aug 2022 - Jan 2023

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
Role in this project:
Back-end Developer
Contributions: 1905 reviews, 42 commits, 1264 PRs in 5 months
Contributions summary: Li focused on optimizing hash join operations within the StarRocks database engine. Their work involved enhancing the handling of empty hash tables, including refactoring the JoinHashMap class. They fixed a core dump that occurred when hash tables were empty. They also contributed unit tests to validate these optimizations and bug fixes.
multi-dimensional, cloudnative, scenarios, realtime-database, real-time
apache/calcite

Apr 2019 - Dec 2019

Apache Calcite
Role in this project:
Back-end Developer
Contributions: 5 commits, 8 PRs, 29 comments in 8 months
Contributions summary: Li primarily contributed to Apache Calcite by adding new SQL functions and improving existing code. They implemented functions such as MD5, SHA1, and REGEXP_REPLACE, matching functionality available in other database systems like PostgreSQL and MySQL. They also modified test code to improve its efficiency.
geospatial, apache-calcite, sql, apache, big-data