Forest Gregg

Data Fellow With The Columbia Labor Lab

Detroit Metropolitan Area, United States

Summary

🤩 Rockstar
🎓 Top School
Forest Gregg is a data-focused engineer and product builder with 14 years of experience applying machine learning and engineering to messy, real-world data problems. As a partner at DataMade and co‑founder of dedupe.io, he has shipped production-grade tools for record linkage, address parsing, and census data access while mentoring teams and shaping product direction. Currently a Data Fellow with the Columbia Labor Lab, he’s building data systems to support the California Fast Food Workers’ Union, blending civic impact with technical rigor. His open-source contributions include practical improvements to widely used projects like csvkit, dedupe, and usaddress, reflecting a knack for improving data tooling and test coverage. Trained in sociology at the University of Chicago, he brings social-science perspective to technical design, prioritizing how information helps communities recognize and address shared challenges. Colleagues describe him as someone who improves both codebases and the teams that maintain them.
14 years of coding experience
6 years of employment as a software developer
Master's degree, Sociology at University of Chicago

Github Skills (44)

e-government 10
deduplication 10
python 10
apidoc 10
py 10
address-parser 10
setuptools 10
testing 10
scikit 10
scraper 10
machine-learning 10
data-parsing 10
webscraping 10
cicd 10
csv-parser 10

Programming languages (24)

Java, C++, CSS, C, Rust, PLpgSQL, Makefile, Perl

Github contributions (5)

datamade/census

Oct 2016 - Jul 2022

A Python wrapper for the US Census API.
Role in this project: Back-end Developer
Contributions: 1 release, 3 reviews, 98 commits in 5 years 9 months
Contributions summary: Forest primarily focused on enhancing the Python wrapper for the US Census API. His contributions include allowing queries for more than 50 variables, refactoring core methods for efficiency, and preparing the project for new releases. He also addressed code style issues and updated dependencies. The contributions also show attention to testing.
api, python, python-wrapper, census, us-census
datamade/usaddress

Jun 2014 - Jun 2021

Role in this project: Data Scientist
Contributions: 165 commits, 13 PRs, 49 pushes in 7 years 1 month
Contributions summary: Forest's primary contribution was developing and refining a machine-learning model that parses unstructured United States address strings into structured components. He wrote test cases to validate the model's performance, engineered and refined the features used to identify address components, and contributed to the model's training data.
python-library, nlp, python, strings, address