Ge Song - AI Frameworks Engineer at Intel Corporation

Shanghai, Shanghai, China

Summary

🤩 Rockstar

🎓 Top School

Ge Song is an AI Frameworks Engineer at Intel with six years of experience building and optimizing ML and big-data systems, currently focused on accelerating LLM inference on Intel XPU platforms. An HKUST MS student with prior research experience and an Intel internship, he has contributed hands-on work to Spark MLlib, Intel DAAL, and ML benchmarking. At Intel he has driven API unification, refactoring, and performance improvements in ipex-llm, implementing KV-cache optimizations and mixed-precision quantization support for models such as Mistral, Mixtral, ChatGLM2, and Baichuan2. His work combines systems-level engineering with applied ML, spanning low-level performance tuning and higher-level framework ergonomics. Based in Shanghai, he brings a research-informed approach to production-grade open-source projects, often contributing tooling and documentation improvements alongside core optimizations.

6 years of coding experience

1 year of employment as a software developer

Bachelor's degree, Management Information Systems at Nanjing University

Hong Kong University of Science and Technology (HKUST)

IELTS

Github Skills (7)

transformers: 10

pytorch: 10

gpu: 10

python: 10

model-optimization: 10

llm: 10

machine-learning: 9

Programming languages (6)

TypeScript, C++, Go, HTML, Jupyter Notebook, Python

Github contributions (5)

intel/ipex-llm

Jul 2021 - Jan 2023

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

Role in this project:

ML Engineer

Contributions: 293 reviews, 130 commits, 309 PRs in 1 year 6 months

Contributions summary: Ge primarily contributed to optimizing and unifying the Transformers and Native APIs for LLM models within the ipex-llm repository. This involved refactoring code, renaming utilities, and updating API documentation. He also implemented and tested several improvements, including KV-cache optimizations for various models (e.g., GPTJ, Mistral, Mixtral, ChatGLM2, Baichuan2) and support for mixed-precision quantization (mixed_fp8/mixed_fp4).

llm-inference, llama2, python, finetuning, llama

sgwhat/ipex-llm

Apr 2024 - Feb 2025

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, ModelScope, etc.

Contributions: 166 pushes, 30 branches in 10 months
