John Marshall

Senior Software Engineer at Centre for Population Genomics

Wānaka, Otago, New Zealand

Summary

🤩
Rockstar
🎓
Top School
John Marshall is a senior software engineer with 16 years of experience building high-performance bioinformatics tools and production-grade pipelines from Wellington to Wānaka. He has deep C and systems expertise, having maintained and optimized cornerstone projects like BWA and SAMtools—improving core alignment modules and adding SIMD intrinsics to accelerate performance. His background spans academia and industry, from developing clinical-grade NGS pipelines at the University of Glasgow to leading tooling and release automation for Bioconda packaging. Comfortable across backend development, integration, and CI/CD, he routinely fixes subtle edge cases and memory issues that improve stability at scale. John pairs algorithmic thinking (graph algorithms for cancer rearrangements) with practical engineering—automating tests, refactoring for clarity, and migrating specs and websites—making him equally at home in code, docs, and build systems. Based in New Zealand, he brings a rare combination of long-term open-source stewardship and hands-on performance tuning.
16 years of coding experience
9 years of employment as a software developer
BSc(Hons), Mathematics, Computer Science at University of Otago

Stackoverflow

Stats
6,925 reputation
485k reached
88 answers
0 questions
Badges
gnu-make: top 5%
makefile: top 1%
gnu: top 5%

Github Skills (66)

python: 10
package-management: 10
bash: 10
c11: 10
integrations: 10
c17: 10
build-automation: 10
markdown: 10
text-manipulation: 10
memory-management: 10
latex: 10
htslib: 10
sse: 10
test-automation: 10
algorithm: 10

Programming languages (29)

C#, C, Makefile, Go, Nextflow, HTML, Jupyter Notebook, TypeScript

Github contributions (5)

samtools/hts-specs

Aug 2012 - Aug 2022

Specifications of SAM/BAM and related high-throughput sequencing file formats
Role in this project: Technical Writer
Contributions: 45 reviews, 198 commits, 153 PRs in 10 years 1 month
Contributions summary: John primarily updated and maintained the specification documents in the repository. His contributions include adding new specifications for CRAM and the GA4GH htsget retrieval protocol, updating existing documents (VCFv4.2, SAMv1), and restructuring the website to include these specifications. He also migrated the website to Jekyll and reworked the build process to report more precise version information and automate LaTeX compilation.
Tags: fasta, file-formats, high-throughput-sequencing, sequencing, genomics
samtools/samtools

May 2013 - Aug 2022

Tools (written in C using htslib) for manipulating next-generation sequencing data
Role in this project: Backend Developer & Integration Engineer
Contributions: 3 releases, 51 reviews, 450 commits in 9 years 4 months
Contributions summary: John's commits focus on modifying and integrating code within the samtools/htslib project. He improved the handling of read groups, enhancing the ability to add or alter read group headers and tags in the output. He also added support for alternative loci annotation, maintaining and extending the capabilities of the library. These tasks required modifying core libraries and adding support for CRAM files.
Tags: next-generation, sequencing-data, next-generation-sequencing, genomics, sequencing