Eric Anthony Mitchell
Ph.D. Candidate, Stanford University

2024 update: I am on the job market! If you think my experience would be a good fit for your organization or institution, reach out! You can find my CV here.

I am a final-year PhD student in Stanford’s CS department, where I’m fortunate to be advised by Chelsea Finn and Christopher D. Manning. The goal of my research is to make foundation models, particularly language models, a more trustworthy and easier-to-use technology. In particular, I’ve been interested in making language models more factual, up-to-date, and able to understand user intent. I’m also quite interested in scalable oversight and in improving the reasoning and planning abilities of language models. Much of my PhD has been generously supported by a Knight-Hennessy Graduate Fellowship and a Stanford Accelerator for Learning grant for Generative AI for the Future of Learning.

In the summer of 2022, I was a research scientist intern at DeepMind in London, where I was lucky to spend four months working with Junyoung Chung, Nate Kushman, and Aäron van den Oord.

Before my PhD, I was a research engineer at Samsung’s AI Center in New York City, where I learned constantly from Volkan Isler, Daniel D. Lee, and many other wonderful (and patient) people. As an undergraduate, I completed my thesis under the guidance of H. Sebastian Seung after many hours in the Seung Lab at the Princeton Neuroscience Institute. I also competed for Princeton’s varsity men’s golf team.

In my free time, I make music for guitar and voice. I enjoy the outdoors, particularly playing golf, exploring mountains, and SCUBA diving.

Selected Works

Fine-tuning Language Models for Factuality
Katherine Tian*, Eric Mitchell*, Huaxiu Yao, Christopher D. Manning, Chelsea Finn
ICLR, 2024

An Emulator for Fine-Tuning Large Language Models using Small Language Models
Eric Mitchell, Rafael Rafailov, Archit Sharma, Chelsea Finn, Christopher D. Manning
ICLR, 2024

Meta-Learning Online Adaptation of Language Models
Nathan Hu*, Eric Mitchell*, Christopher D. Manning, Chelsea Finn
EMNLP, 2023

Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian*, Eric Mitchell*, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning
EMNLP, 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov*, Archit Sharma*, Eric Mitchell*, Stefano Ermon, Christopher D. Manning, Chelsea Finn
NeurIPS (Oral), 2023

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
Eric Mitchell, Yoonho Lee, Sasha Khazatsky, Christopher D. Manning, Chelsea Finn
ICML (Oral), 2023

Enhancing Self-Consistency and Performance of Pretrained Language Models with NLI
Eric Mitchell, Joseph J. Noh, Siyan Li, William S. Armstrong, Ananth Agarwal, Patrick Liu, Chelsea Finn, Christopher D. Manning
EMNLP (Oral), 2022

Self-Destructing Models: Increasing the Costs of Harmful Dual Uses in Foundation Models
Peter Henderson*, Eric Mitchell*, Christopher D. Manning, Dan Jurafsky, Chelsea Finn
AAAI/ACM Conference on AI, Ethics, and Society, 2022

Fast Model Editing at Scale
Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
ICLR, 2022