Zygmunt Zając - The Proprietor at FastML.com

Poland

Summary

Senior

Zygmunt Zając is a software engineer based in Poland and the proprietor of FastML.com, with 13 years of experience building practical data and ML tooling. His machine learning engineering work includes a CSVDataset loader and prediction utilities contributed to pylearn2; his data engineering work includes a toolkit of Python scripts for preprocessing and transforming large files. Across the data pipeline, he focuses on reliable data ingestion, normalization, and sampling to make models production-ready, and on turning messy datasets into reproducible workflows. He is active on GitHub and Twitter. He also documents small but useful utilities (e.g., shuffle/unshuffle and column stats) that solve everyday scale problems often overlooked in ML projects.

13 years of coding experience

Github Skills (13)

file-handling: 10

data-preprocessing: 10

data-transformation: 10

machine-learning: 10

csv: 10

python: 10

numpy: 10

testing: 8

data-pipeline: 7

theano: 7

data-pipelines: 7

scikit: 4

scikit-learn: 4

Programming languages (5)

R, C++, Lua, Jupyter Notebook, Python

Github contributions (5)

zygmuntz/phraug

Jul 2013 - Nov 2015

A set of simple Python scripts for pre-processing large files

Role in this project:

Data Engineer

Contributions: 43 commits, 2 PRs, 8 pushes in 2 years 4 months

Contributions summary: Zygmunt primarily contributed data preprocessing scripts. They wrote several Python scripts, including `chunk.py` for splitting files, `sample.py` for sampling lines, `tsv2csv.py` for format conversion, `delete_cols.py` for column deletion, `standardize.py` and `colstats.py` for data normalization, `split.py` for data splitting, `shuffle.py` and `unshuffle.py` for shuffling and unshuffling, and various utility scripts for data transformation. Their contributions focused on manipulating and preparing data for downstream analysis or machine learning tasks.

large-files, pre-processing, python, python-scripts

lisa-lab/pylearn2

Nov 2013 - Mar 2014

Warning: This project does not have any current developer. See below.

Role in this project:

ML Engineer

Contributions: 16 commits in 3 months

Contributions summary: Zygmunt primarily contributed a CSV dataset wrapper for the pylearn2 library. Their work included implementing the `CSVDataset` class, which loads and processes data from CSV files, with support for one-hot encoding and header handling. They also added a unit test for `CSVDataset` and a simple prediction script. The contributions focus on data loading and model prediction for machine learning tasks.

javascript, typescript
