Sofie Van Landeghem

Nanochat's Repo Czar

Belgium

Summary

🤩 Rockstar
🎓 Top School
Top expert in Natural Language Processing and Machine Learning Technologies
Sofie Van Landeghem is an experienced software engineer and open-source maintainer with 12 years of expertise in NLP, ML and backend development, holding a PhD in Bioinformatics and a Master’s in Software Engineering. She founded OxyKodit to deliver tailored NLP and LLM solutions across biomedical, legal and enterprise datasets, combining hands‑on modeling with rigorous testing and production-ready quality practices. As a core contributor to spaCy, Thinc and spacy-transformers she built NER, entity linking and transformer integration features, and today helps maintain FastAPI/Typer and acts as "repo czar" for Andrej Karpathy’s educational nanochat, curating PRs and running experiments to keep the code minimal and teachable. Her work spans low-level ML optimizations (e.g., custom layers and shape/padding fixes) to developer UX improvements (docs, test suites, CLI ergonomics), showing a rare blend of research depth and pragmatic engineering. Based in Belgium, she balances independent consulting with high-impact open-source stewardship, often serving as the first line of review that turns community contributions into merge-ready code. An understated strength is her history of scaling academic NLP pipelines to process tens of millions of PubMed articles, which informs her practical, data-centric approach to language AI.
12 years of coding experience
9 years of employment as a software developer
Machine Learning Summer School at University of Cambridge
PhD in Sciences Bioinformatics at Ghent University
Languages: French, Dutch, English

Github Skills (28)

transformers 10
pytorch 10
python 10
distilbert 10
machine-learning 10
deep-learning 10
spacy 10
bert 10
nlp 10
xnet 10
cli 10
testing 9
data-structure 9
back-end-development 9
algorithm 9

Programming languages (11)

TypeScript, Java, C++, Rust, C, Handlebars, JavaScript, Jupyter Notebook

Github contributions (5)

explosion/spaCy

Nov 2018 - Jan 2023

💫 Industrial-strength Natural Language Processing (NLP) in Python
Role in this project:
Data Scientist
Contributions: 5 releases, 1295 reviews, 1044 commits in 4 years 2 months
Contributions summary: Sofie's contributions primarily involve the development of a Named Entity Recognition (NER) system within the spaCy NLP framework. They focused on leveraging entity descriptions and article texts to create input embedding vectors, which were then used for training. Their work includes creating and training a custom knowledge base and constructing training datasets for named entity linking. The code changes also indicate the creation of a model to predict NER labels and the implementation of methods for evaluating the performance of the developed models.
Tags: fairness-ml, python, data-preprocessing, language-processing, tokenization
explosion/spacy-transformers

Dec 2019 - Oct 2021

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Role in this project:
ML Engineer
Contributions: 1 release, 57 reviews, 96 commits in 1 year 10 months
Contributions summary: Sofie primarily contributed to the `spacy-transformers` repository by implementing and refining the integration of transformer models within spaCy. Their work included adding support for specific transformer models like DistilBERT and XLNet, enhancing the functionality of components such as the wordpiecer, text categorizer, and entity recognizer. Further contributions encompassed fixing configuration files and integrating features from the spaCy v3 branch.
Tags: natural-language-understanding, xlnet, bert, google, natural-language-processing