Summary
Kuan Lu is an engineering manager and ML systems leader with 11 years of experience building and scaling Cloud/AI products across multimodal domains. At Microsoft he drove end-to-end delivery from model research to production, shipping foundation models, serverless inference platforms, and agentic search workflows; he now leads generative AI and vision platform efforts at TikTok. His work emphasizes operational rigor: revamping throttling and backpressure to improve cluster health, automating API and SDK release pipelines, and designing low-latency, cost-efficient inference stacks that reduced GPU spend by an order of magnitude. Kuan pairs hands-on technical depth (CUDA kernels, PyTorch migrations, large-scale benchmarking suites) with people leadership, routinely growing small teams into high-impact orgs. Based in Los Gatos and Berkeley-educated, he combines research pedigree with pragmatic product outcomes, including bringing Florence variants into production and achieving measurable ARR and performance gains. An under-the-radar strength is his track record of turning prototype model work into robust, secure services that preserve access controls with minimal overhead.
10 years of coding experience
7 years of employment as a software developer
Master's degree, Computer Science at University of California, Berkeley
Bachelor of Engineering (B.E.), Computer Science at Zhejiang University
English, Chinese