Ge Song

AI Frameworks Engineer at Intel Corporation

Shanghai, Shanghai, China

Summary

🤩 Rockstar
🎓 Top School
Ge Song is an AI Frameworks Engineer at Intel with six years of experience building and optimizing ML and big-data systems, currently focused on accelerating LLM inference on Intel XPU platforms. An HKUST MS student with prior research experience and an Intel internship, he has contributed deep practical work to Spark MLlib, Intel DAAL, and ML benchmarking. At Intel he has driven API unification, refactoring, and performance improvements in ipex-llm, implementing KV-cache optimizations and mixed-precision quantization support for models like Mistral, Mixtral, ChatGLM2, and Baichuan2. He blends systems-level engineering with applied ML, routinely bridging low-level performance tuning and higher-level framework ergonomics. Based in Shanghai, he brings a research-informed mindset to production-grade open-source projects, often surfacing tooling and documentation improvements alongside core optimizations.
6 years of coding experience
1 year of employment as a software developer
Bachelor's degree, Management Information Systems at Nanjing University
Hong Kong University of Science and Technology (HKUST)
IELTS

Github Skills (7)

transformers: 10
pytorch: 10
gpu: 10
python: 10
model-optimization: 10
llm: 10
machine-learning: 9

Programming languages (6)

TypeScript, C++, Go, HTML, Jupyter Notebook, Python

Github contributions (5)

intel/ipex-llm

Jul 2021 - Jan 2023

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
Role in this project:
ML Engineer
Contributions: 293 reviews, 130 commits, 309 PRs in 1 year 6 months
Contributions summary: Ge primarily contributed to optimizing and unifying the Transformers and Native APIs for LLM models within the ipex-llm repository. This involved refactoring code, renaming utilities, and updating API documentation. He also implemented and tested several improvements, including KV-cache optimizations for various models (e.g., GPTJ, Mistral, Mixtral, ChatGLM2, Baichuan2) and support for mixed-precision quantization (mixed_fp8/mixed_fp4).
llm-inference, llama2, python, finetuning, llama
sgwhat/ipex-llm

Apr 2024 - Feb 2025

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, ModelScope, etc.
Contributions: 166 pushes, 30 branches in 10 months