Hoang Anh Ngo is a research-focused machine learning engineer with six years of experience applying advanced clustering and unsupervised methods to streaming data. Currently a Research Assistant at Télécom Paris and an alumnus of École Polytechnique and the University of Edinburgh, he has made significant contributions to the River online-ML library, adding algorithms such as CluStream, DenStream, and STREAMKMeans and integrating evaluation metrics such as Mutual Information and Silhouette. His background in mathematics and economics, combined with hands-on data wrangling and PCA/regression work in prior research, gives him a rare mix of theoretical rigor and practical data engineering. Energetic and creative by nature, he thrives on challenging problems and on leadership in collaborative research settings. His River contributions are used by industry projects and have spawned downstream tools such as IBM’s Sail.
6 years of coding experience
1 year of employment as a software developer
Master of Science - MS, Epidemiology at The University of Edinburgh
Bachelor's degree, Mathematics and Economics, minor in Computational Mathematics at École Polytechnique
High School Diploma, Mathematics at High School for the Gifted VNU - HCM
Contributions: 152 reviews, 14 commits, 34 PRs in 1 year 7 months
Contributions summary: Hoang primarily contributed to the development and implementation of clustering algorithms within the `river` library. His work included adding new clustering methods such as CluStream, DenStream, STREAMKMeans, and DBSTREAM, as well as refactoring existing ones. He also integrated a suite of internal and external metrics for evaluating clustering performance, including Mutual Information, Silhouette, and Generalized Dunn's indices. These contributions extended the library's capabilities for online machine learning, with a specific focus on stream data analysis and model evaluation.