Summary
W Huang is a Staff Research Scientist at Google DeepMind with eight years of experience building state-of-the-art speech and multimodal systems, leading core efforts on Gemini spoken-dialog pretraining and native audio-to-audio dialogue modeling. Trained as a physicist, with a PhD in EECS from MIT, he combines deep learning and optics expertise to move low-latency internal monologue and on-device speech language models from research into production. He led differentially private training and long-form ASR deployments for Pixel and YouTube, combining privacy, robustness, and product-minded engineering. Earlier work spans security and generalization research, consulting roles delivering ML solutions to clients, and laser-based experimental instrumentation at NASA and DESY. Comfortable working across theory and systems, he brings hands-on leadership in data pipelines, safety evaluations, and real-time audio understanding. Peers describe him as a researcher who ships, turning complex acoustic and ML ideas into scalable, deployed systems.
8 years of coding experience
8 years of employment as a software developer
Doctor of Philosophy (PhD), Electrical Engineering & Computer Science, Massachusetts Institute of Technology
B.S., Applied & Engineering Physics, Cornell University
English, Chinese