Zygmunt Zając

The Proprietor at FastML.com

Poland

Summary

Senior
Zygmunt Zając is a software engineer based in Poland and the proprietor of FastML.com, with 13 years of coding experience building practical data and machine learning tooling. His open-source work includes a CSVDataset loader and a prediction script for pylearn2, and a toolkit of Python scripts for pre-processing and transforming large files. Across the data pipeline, he focuses on data ingestion, normalization, and sampling. He is active on GitHub and Twitter. His smaller utilities, such as the shuffle/unshuffle and column statistics scripts, address everyday problems of working with large files that ML projects often overlook.
13 years of coding experience

Github Skills (13)

file-handling: 10
data-preprocessing: 10
data-transformation: 10
machine-learning: 10
csv: 10
python: 10
numpy: 10
testing: 8
data-pipeline: 7
theano: 7
data-pipelines: 7
scikit: 4
scikit-learn: 4

Programming languages (5)

R, C++, Lua, Jupyter Notebook, Python

Github contributions (5)

zygmuntz/phraug

Jul 2013 - Nov 2015

A set of simple Python scripts for pre-processing large files
Role in this project: Data Engineer
Contributions: 43 commits, 2 PRs, 8 pushes in 2 years 4 months
Contributions summary: Zygmunt primarily contributed to the development of data preprocessing scripts. They wrote several Python scripts, including `chunk.py` for splitting files, `sample.py` for sampling lines, `tsv2csv.py` for format conversion, `delete_cols.py` for column deletion, `standardize.py` and `colstats.py` for data normalization, `split.py` for data splitting, `shuffle.py` and `unshuffle.py` for shuffling/unshuffling, and various utility scripts for data transformation. Their contributions focused on data manipulation and preparation for potential downstream analysis or machine learning tasks.
Tags: large-files, pre-processing, python, python-scripts
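
To make the normalization step described above concrete, here is a minimal sketch of two-pass, line-by-line standardization of a headerless numeric CSV. It is not taken from the phraug repository: the function names `column_stats` and `standardize` are hypothetical, and the sketch assumes non-empty, purely numeric input.

```python
import csv
import io
import math


def column_stats(lines):
    """First streaming pass: per-column mean and population standard deviation."""
    count = 0
    sums, sq_sums = [], []
    for row in csv.reader(lines):
        values = [float(x) for x in row]
        if not sums:
            # Size the accumulators from the first row.
            sums = [0.0] * len(values)
            sq_sums = [0.0] * len(values)
        for i, v in enumerate(values):
            sums[i] += v
            sq_sums[i] += v * v
        count += 1
    means = [s / count for s in sums]
    # max() guards against a tiny negative value caused by round-off.
    stds = [math.sqrt(max(q / count - m * m, 0.0))
            for q, m in zip(sq_sums, means)]
    return means, stds


def standardize(lines, means, stds):
    """Second streaming pass: yield rows rescaled to zero mean and unit variance."""
    for row in csv.reader(lines):
        yield [
            (float(x) - m) / s if s > 0 else 0.0  # constant columns map to 0.0
            for x, m, s in zip(row, means, stds)
        ]


# Illustrative data only; a real run would pass open file handles instead.
data = "1,10\n2,20\n3,30\n"
means, stds = column_stats(io.StringIO(data))
rows = list(standardize(io.StringIO(data), means, stds))
```

Both passes read one row at a time, so memory use depends on the number of columns rather than the number of lines, which is the property that matters for large files.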
lisa-lab/pylearn2

Nov 2013 - Mar 2014

Warning: This project does not have any current developer. See below.
Role in this project: ML Engineer
Contributions: 16 commits in 3 months
Contributions summary: Zygmunt primarily contributed to the development of a CSV dataset wrapper for the pylearn2 library. Their work included implementing the `CSVDataset` class, which handles loading and processing data from CSV files, including support for one-hot encoding and handling headers. They also added a unit test for the `CSVDataset` and a simple prediction script. The contributions focus on data loading and model prediction for machine learning tasks.
Tags: javascript, typescript
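
As an illustration of what header handling and one-hot encoding involve when loading a CSV, here is a minimal standard-library sketch. It does not reproduce pylearn2's `CSVDataset` or its API: the `load_csv` function, its parameters, and the assumption that the first column holds the class label are all hypothetical.

```python
import csv
import io


def load_csv(lines, has_header=True, one_hot=True):
    """Hypothetical loader: column 0 is the class label, the rest are numeric features."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row
    labels, features = [], []
    for row in reader:
        labels.append(row[0])
        features.append([float(x) for x in row[1:]])
    if not one_hot:
        return features, labels
    # Assign each distinct label a column index, in sorted order.
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    targets = [[0] * len(classes) for _ in labels]
    for row_number, label in enumerate(labels):
        targets[row_number][index[label]] = 1
    return features, targets


# Illustrative data only: three rows, two distinct labels ("0" and "2").
csv_text = "label,x1,x2\n0,1.5,2.5\n2,3.5,4.5\n0,5.5,6.5\n"
X, y = load_csv(io.StringIO(csv_text))
```

Each row of `y` contains a single 1 in the column assigned to that row's label, so the number of target columns equals the number of distinct labels in the file.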