Iris Z - Software Engineer at Meta

San Francisco Bay Area, United States

Summary

🤩 Rockstar

🎓 Top School

Iris Z is a software engineer with eight years of experience building and optimizing distributed ML systems. She currently works on second-order optimizers for distributed training at Meta, after several years on the PyTorch Distributed team. She brings deep backend and systems expertise, having contributed code to core PyTorch for distributed checkpointing, error handling, and multi-threaded filesystem writers, and she regularly authors tutorials that improve developer onboarding for one of the most visible open-source ML projects. Her background spans production ML at scale (ads ranking and integrity) and academic research in grounded language understanding at Cornell, giving her a mix of practical systems engineering and research-driven problem framing. Based in the San Francisco Bay Area, she combines hands-on open-source impact with enterprise experience shipping robust distributed training infrastructure.

8 years of coding experience

8 years of employment as a software developer

Dual MS in Info Systems and Applied Science Computer Science at Cornell Tech

GitHub Skills (16)

pytorch: 10

rs: 10

distributed-systems: 10

distributed-training: 10

checkpoint: 10

restructuredtext: 10

python: 10

documentation: 10

checkpointing: 10

multithreading: 9

cprogramming-language: 9

c-language: 9

error-handling: 9

fs: 8

performance-optimization: 8

Programming languages (4)

C++, Jupyter Notebook, Ruby, Python

GitHub contributions (5)

pytorch/pytorch

Oct 2022 - Jan 2023

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Role in this project:

Back-end Developer

Contributions: 578 reviews, 34 commits, 415 PRs in 3 months

Contributions summary: Iris's commits focus on distributed checkpointing, error handling, and optimization in the PyTorch codebase. She added a customized logging handler for internal Facebook use and implemented exception handlers for distributed collectives. She also contributed to the migration of distributed checkpointing components, improved the error messages in distributed send operations, and introduced a multi-threaded file system writer for optimized checkpointing.

python, gpu-acceleration, deep-learning, gpu, numpy

pytorch/tutorials

Sep 2023 - Jan 2025

PyTorch tutorials.

Role in this project:

Technical Writer

Contributions: 17 reviews, 6 PRs, 2 branches in 1 year 3 months

Contributions summary: Iris contributes mainly by adding and updating tutorials on PyTorch functionality. Her commits create and modify documentation files in reStructuredText (RST) format, covering topics such as Distributed Checkpointing and DeviceMesh and adding deprecation notices to older tutorials. The updates also add links, correct indentation, and make minor content adjustments to improve clarity and user experience.

deep-learning, pytorch, pytorch-tutorials
