A collection of projects I've worked on, ranging from academic research to personal explorations. If you think it would be useful or interesting to collaborate on a project, please contact me to discuss.
Featured Projects
Replicate Internal Coherence Maximization (ICM) from Wen et al. (2025)
Work
Praxis Project Trial. ICM is an unsupervised method that elicits human concepts from base language models by maximizing mutual predictability and logical consistency among concept-related examples.
Replicate Counterfactual Test for Faithfulness from Atanasova et al. (2023)
Work
Pivotal Fellowship Spring 2026 Project Trial. Evaluates whether language model explanations are faithful by checking whether perturbing input features changes both the model's output and its explanations.