Summary
Xiang Si is a software engineer based in the San Francisco Bay Area specializing in high-performance ML inference and distributed systems, currently working on Cloud TPU inference at Google. He holds an MS in Computer Science from Carnegie Mellon and previously held engineering roles at AWS (Neuron frameworks) and internships at Apple. His work focuses on optimizing LLM inference performance and enabling disaggregated, scalable model serving. His background combines research in statistical machine learning with hands-on production work accelerating distributed training and inference across hardware and framework stacks. Having moved between research environments and hyperscale cloud teams, he is practiced at translating algorithmic ideas into production-grade, hardware-aware systems.
1 year of coding experience
4 years of employment as a software developer
Master of Science - MS, Computer Science at Carnegie Mellon University
Chinese, English