Daniel Han is the CEO and founder of Unsloth (YC S24), based in San Francisco, building open-source tooling that accelerates LLM fine-tuning (2–30x speedups and ~70% less memory). Over 8 years he has combined production ML engineering with numerical optimization: at NVIDIA he sped up t-SNE by 2000x, reduced SVD memory use in cuPy by ~50%, and improved cuML’s docs and APIs. He is a hands-on backend engineer who has implemented and optimized core Llama components and performance-critical linear algebra routines (SVD, Cholesky, CSR) across his open-source projects. Trained in Data Science and Actuarial/Law at UNSW, he pairs rigorous algorithmic thinking with product leadership, and declined a lifetime NVIDIA offer to pursue this startup mission.
8 years of coding experience
3 years of employment as a software developer
3.83/4 GPA (CS) at UNSW Australia
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Role in this project:
Back-end Developer
Contributions: 19 releases, 8 reviews, 519 PRs in 1 year 4 months
Contributions summary: Daniel appears to have been primarily involved in the initial development of the Unsloth code base. His commits centre on the project's core Llama models, with edits to components such as LlamaAttention and LlamaDecoderLayer and the implementation of key methods. His work includes patching functions in the base Llama models, which is likely how Unsloth's optimizations are applied to the upstream implementations. These tasks demand in-depth knowledge of the underlying code and focus on the technical implementation.
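The "patching" mentioned above typically means replacing a method on an upstream model class so every instance picks up the optimized version. A minimal sketch of that pattern, using toy stand-in classes rather than Unsloth's or Transformers' actual code:

```python
# Toy stand-in for an upstream attention module (not the real
# transformers.models.llama.modeling_llama.LlamaAttention).
class LlamaAttention:
    def forward(self, x):
        # Baseline implementation; pretend this is the slow path.
        return [v * 2 for v in x]

def fast_forward(self, x):
    # Optimized replacement with the same signature and semantics.
    # (A real patch would swap in fused kernels, not this arithmetic.)
    return [v + v for v in x]

def patch_llama_attention():
    # Rebind the method on the class itself, so existing and future
    # instances all dispatch to the patched version.
    LlamaAttention.forward = fast_forward

patch_llama_attention()
attn = LlamaAttention()
print(attn.forward([1, 2, 3]))  # [2, 4, 6]
```

The patch preserves the original method's interface, so the rest of the model code needs no changes.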
2-2000x faster ML algos, 50% less memory usage, works on all hardware - new and old.
Role in this project:
Data Scientist & ML Engineer
Contributions: 1 review, 254 commits, 12 PRs in 3 years 10 months
Contributions summary: Daniel contributed core machine learning algorithms to the hyperlearn library. His work focused on implementing and optimizing linear algebra routines, specifically SVD and Cholesky decompositions, and on functions for more efficient construction of CSR matrices. The code changes target performance and memory usage in matrix computations, in line with the repository's goals, and also include an SVD-based imputer.
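To make the CSR work above concrete, here is a minimal pure-Python sketch of what a CSR (compressed sparse row) construction routine has to produce; real implementations such as hyperlearn's or SciPy's do this in compiled code, and the function name here is illustrative:

```python
def to_csr(dense):
    """Convert a dense 2D list into the three CSR arrays:
    data (nonzero values), indices (their columns), indptr (row offsets)."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)   # nonzero values in row-major order
                indices.append(col)  # column index of each nonzero
        indptr.append(len(data))     # where the next row starts in data
    return data, indices, indptr

dense = [[1, 0, 0],
         [0, 0, 2],
         [0, 3, 0]]
print(to_csr(dense))  # ([1, 2, 3], [0, 2, 1], [0, 1, 2, 3])
```

Storing only the nonzeros plus one offset per row is what cuts memory for sparse data and makes row-wise operations fast, which matches the performance and memory goals described above.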
memory, memory-usage, python, regression-models, tensor