Brad Chapman - Principal Data Science Architect

Somerville, Massachusetts, United States

Summary

🤩 Rockstar

🎓 Top School

Top expert in Computational Genomics and Bioinformatics Workflow Development

Brad Chapman is a Principal Data Science Architect with 25 years of experience building scalable bioinformatics tools, infrastructure, and reproducible workflows for academia and industry. Based in Somerville, MA, he combines a PhD in Plant Biology with deep engineering experience: he has contributed to flagship open-source projects such as Biopython, IPython/ipyparallel, and MultiQC, and has packaged and deployed tools via Bioconda and CloudBioLinux. At Ginkgo Bioworks he translates complex biological problems into robust data pipelines and CWL-compatible workflows, drawing on prior leadership roles at Harvard and MGH. His strengths lie at the intersection of back-end development, DevOps, and scientific rigor: improving parser robustness, Docker/CWL integrations, and workflow engine reliability. Although his job titles do not make it obvious, he remains an active hands-on contributor who has implemented SQLite/BioSQL support, improved heartbeat and logging for parallel systems, and added parsing and plotting features for CNV and coverage analyses.

25 years of coding experience

14 years of employment as a software developer

University of Georgia

Bachelor of Science (BS), Plant Molecular Biology, at Michigan State University

Stack Overflow

Stats

1 reputation

0 reached

0 answers

0 questions

GitHub Skills (65)

variants: 10

python: 10

package-management: 10

bash: 10

ipython: 10

data-parsing: 10

gatk: 10

parallel-computing: 10

fabric: 10

reporting-services: 10

pandas: 10

dockers: 10

numpy: 10

workflow-description-language: 10

mu: 10

Programming languages (24)

C#, Java, C++, CSS, C, D, Scala, Go

GitHub contributions (5)

bcbio/bcbio-nextgen

Jul 2010 - Jan 2020

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

Role in this project:

Back-end Developer & Bioinformatics Analyst

Contributions: 5009 commits, 207 PRs, 2670 pushes in 9 years 7 months

Contributions summary: Brad primarily contributed to the bcbio-nextgen bioinformatics pipeline, enhancing and extending its capabilities. He fixed the handling of colons in sample and batch names, improved the documentation for CWL functionality, and implemented features for variant calling and data standardization. He also addressed build and performance issues, improving the stability and usability of the pipeline.

scalable, calling, rnaseq, variant-calling, genomics

chapmanb/cloudbiolinux

Dec 2008 - Mar 2020

CloudBioLinux: configure virtual (or real) machines with tools for biological analyses

Role in this project:

Back-end & DevOps Engineer

Contributions: 1752 commits, 68 PRs, 510 pushes in 11 years 5 months

Contributions summary: Brad's contributions centered on configuring and automating the installation of bioinformatics tools and their dependencies on Linux systems. He wrote installation scripts for several programs, including FreeNX, and automated the setting of environment variables to make those programs easier to use. He also created and configured Anaconda environments and their dependencies. This work reflects a focus on streamlined infrastructure for reproducible, easy-to-use environments.

analyses, machines, bioinformatics, configure, biological
