A curious human being – Ramblings on Deep Learning

About me

Thanks for visiting my space. This is where I learn, think, and write.

My background is mostly in distributed machine learning systems.

I graduated from the University of Michigan (summa cum laude) and Shanghai Jiao Tong University with degrees in Mechanical Engineering and in Electrical and Computer Engineering.
I obtained my Master’s degree from UCLA in Computer Science (specialization: data mining and machine learning).
I spent 2 years on Meta’s Marketplace team working on recommendation systems.
I spent 2 years at Voleon (the largest quant hedge fund in California) working on data and systems.
I spent 3 years traveling and exploring, learning the mathematical foundations behind deep learning, implementing various models and architectures, and playing a lot with GPU kernels.

I currently work at Manifest AI, carrying out research on efficient language modeling.

I also believe in longtermism, the view that positively influencing the long-term future is the priority of our time.

Kumar, S., Buckman, J., Gelada, C., & Zhang, X. (2025). Conformal Transformations for Symmetric Power Transformers. In First Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models.
Gelada, C., Buckman, J., Zhang, S., & Bach, T. (2025). Scaling Context Requires Rethinking Attention. arXiv preprint arXiv:2507.04239.

I’m currently based in Vancouver, Canada. I can be reached at sz@seanzhang.me.