Tom Augspurger

Software Engineer at NVIDIA

United States

Summary

🤩 Rockstar
🎓 Top School
Tom Augspurger is a software engineer with 11 years of experience building scalable data and ML infrastructure, currently at NVIDIA after a four-year stint as a Geospatial Infrastructure Engineer at Microsoft. He is a prolific open-source contributor in the PyData ecosystem, notably improving Dask, Dask-ML, and fastparquet to tighten pandas and scikit-learn compatibility for large-scale, distributed workflows. His work spans backend systems, DevOps (including Kubernetes integration and CI tooling), and data-science-focused examples and docs that make distributed computing more accessible. He also maintains practical guides on effective pandas usage and has contributed to high-profile projects like xarray, seaborn, and fsspec/s3fs. Tom combines an academic grounding in econometrics with hands-on production experience, which shows in careful attention to compatibility, testing, and reproducible examples. A less obvious strength is his habit of improving developer experience (formatting, CI, and docs) so that technical improvements stick and scale across communities.
11 years of coding experience
7 years of employment as a software developer
Bachelor of Arts (BA), Econometrics and Quantitative Economics, University of Northern Iowa
Master of Arts (MA), Economics, University of Iowa

Stack Overflow

Stats
28,550 reputation
5.5m reached
246 answers
1 question
Badges
sql: top 5%
boxplot: top 5%
pandas: top 1%
matplotlib: top 1%
dataframe: top 1%
plot: top 1%

Github Skills (81)

python: 10
testing: 10
distributed-computing: 10
parquet: 10
scikit: 10
dataframes: 10
pandas: 10
statistics: 10
xray: 10
econometrics: 10
plot: 10
dockers: 10
numpy: 10
parallel-processing: 10
kubernetes-pods: 10

Programming languages (19)

Java, C++, CSS, C, TeX, PLpgSQL, Makefile, HTML

Github contributions (5)

Source code for my collection of articles on using pandas.
Role in this project: Data Scientist
Contributions: 28 commits, 11 PRs, 20 pushes in 3 years
Contributions summary: Tom primarily contributes to a collection of articles on using pandas, a data analysis library for Python. His commits focus on cleaning up and standardizing the code, adding caching for downloads, and updating the introduction with current resources. The changes update the examples and content in line with the project's goal of explaining effective pandas usage.
polars, python, dataframes, data-analysis, data-science
dask/dask-ml

Jun 2017 - Oct 2022

Scalable Machine Learning with Dask
Role in this project: ML Engineer
Contributions: 12 releases, 41 reviews, 51 commits in 5 years 5 months
Contributions summary: Tom's contributions focused on integrating and maintaining scikit-learn compatibility within the Dask-ML library. He addressed several bug fixes and enhancements, including improvements to existing metrics and the addition of new features for regression. He was also responsible for updating dependencies, particularly scikit-learn, and preparing new releases of the library.
scalable, python, data-science, dask, machine-learning