Research

My publications, preprints, and ongoing research projects.

Research Interests

My research aims to understand the internal mechanisms of deep learning models, with an emphasis on interpretability and generative systems. Key areas include:

  • Mechanistic Interpretability and Sparse Autoencoders
  • Diffusion Models and Generative AI
  • Trustworthy and Robust Machine Learning

Publications