Soledad Galli

Data Scientist Instructor

Berlin, Germany

Summary

🤩
Rockstar
🎓
Top School
Soledad Galli is a Berlin-based data scientist, best-selling instructor and author with over a decade of experience building and productionizing machine learning solutions for finance and insurance. She founded Train in Data and created Feature-engine — an open-source feature engineering library with 250k+ monthly downloads, 2k+ GitHub stars and a 55+ contributor community — and has written practical books on feature engineering and feature selection. She leads end-to-end ML work from imputation and feature design to hyperparameter optimization, mentoring and hiring data teams while driving adoption of ML to combat fraud, automate claims processing and assess credit risk. A former research scientist with a Nature Communications publication and a 2018 Data Leaders Award, she also contributes to community tooling and docs (e.g., imbalanced-learn), reflecting a rare blend of rigorous research, production experience and pedagogy.
10 years of coding experience
13 years of employment as a software developer
Doctor of Philosophy (PhD), Cell/Cellular and Molecular Biology, Sobresaliente (highest distinction) at Universidad de Buenos Aires
Languages: English, Spanish, German

Stack Overflow

Stats
887 reputation
66k reached
28 answers
14 questions

Github Skills (35)

fasterrcnn 10
documentations 10
python 10
data-science 10
scikit 10
machine-learning 10
feature-engineering 10
hyperopt 10
mask-rcnn 10
hyperparameter-optimization 10
scikit-learn 10
faster-rcnn 10
jupyter-notebook 10
documentation 10
optuna 10

Programming languages (4)

CSS, HTML, Jupyter Notebook, Python

Github contributions (5)

Code repository for the online course Machine Learning with Imbalanced Data
Role in this project:
Data Scientist
Contributions: 51 commits, 9 PRs, 54 pushes in 2 years
Contributions summary: Soledad fixed a bug in the `return_minority_perc` function and added new content to sections 3 and 9. Section 3 covers metrics, and the bug fix shows familiarity with the code base. Section 9 covers the probability and calibration notebooks. The repository as a whole focuses on machine learning and is closely related to data science.
python, data-science, imbalanced-data, machine-learning-course, machine-learning
Feature engineering package with sklearn like functionality
Role in this project:
Data Scientist & ML Engineer
Contributions: 6 releases, 777 reviews, 154 commits in 2 years 10 months
Contributions summary: Soledad's commits show significant involvement in the feature_engine library, including code updates, bug fixes, and improvements to existing functionality. She updated and improved code for various features, discretizers, and imputers, and set up and configured continuous integration workflows. Her work included modifying the setup layout, updating requirements, adding and adjusting CI configuration, and incorporating style checks, suggesting a focus on code quality, model preparation, and overall project maintenance.
python, feature-extraction, data-science, sklearn, machine-learning