Patrick Xia

Researcher at Microsoft

Baltimore, Maryland, United States

Summary

🤩
Rockstar
🎓
Top School
Patrick Xia, a researcher and PhD candidate at Johns Hopkins CLSP who is now at Microsoft, specializes in entity-level information extraction, document representations, and efficient NLP systems for noisy web data. He combines academic training (PhD and MS from JHU; dual BS in CS and Math from CMU) with practical engineering experience from internships at Google, Meta, Semantic Machines, and Microsoft. His contributions to open-source NLP toolkits, such as integrating FastText, masked transformers, and ELMo support into the widely used jiant toolkit, reflect a focus on improving representation and encoder tooling for real-world tasks. With over a decade of research and teaching experience, he bridges rigorous modeling and production-minded system design, often optimizing for noisy inputs and scalability.
13 years of coding experience
Johns Hopkins University
Livingston High School
Bachelor of Science - BS, Mathematics at Carnegie Mellon University

GitHub Skills (8)

transformers: 10
pytorch: 10
nlp: 10
python: 10
fasttext: 10
bert: 9
transfer-learning: 8
multi-task-learning: 8

Programming languages (6)

JavaScript, Perl, HTML, Jupyter Notebook, Python, Jsonnet

GitHub contributions (5)

nyu-mll/jiant

Jun 2018 - Oct 2019

jiant is an NLP toolkit.
Role in this project: ML Engineer
Contributions: 16 commits, 2 PRs, 10 pushes in 1 year 4 months
Contributions summary: Patrick primarily implemented and integrated FastText embeddings for word representation in the toolkit. He developed functions to load and use pre-trained embeddings, including model loading and path configuration. He also introduced masked transformer components and changed the ELMo integration. These changes suggest a focus on extending the toolkit's encoding capabilities and model performance across NLP tasks.
nlp, transformers, multitask-learning, sentence-representation, bert
Code for "Moving on from OntoNotes: Coreference Resolution Model Transfer" and "Incremental Neural Coreference Resolution in Constant Memory"
Contributions: 10 commits in 1 year 2 months
pytorch, memory, incremental, deep-learning, coreference-resolution