Hi, I’m Xavier.
I’m a PhD student in the Laboratory for Social Cognitive Science at Harvard. I work on people and on AI.
Specifically, in my PhD I investigate how culture scaffolds cognition, enabling people to accumulate knowledge and abilities over generations and to solve tasks more complex than anyone can accomplish alone. I also work on contractualist moral psychology.
My focus in AI security is on human data for scalable oversight. We urgently need techniques for supervising increasingly sophisticated AI systems on fuzzy, hard-to-verify tasks (like alignment research). One promising idea is scalable oversight: using AI systems themselves to help us with supervision. But existing scalable oversight protocols lack theoretical guarantees in this extremely difficult setting. Equally important, we don’t have a good sense of how they work in practice.
I’ve also done some work on technical governance topics (mostly the design and interpretation of evals). Previously, I read Psychology & Philosophy at New College, Oxford, studying empathy and deep learning theory. I then worked at McKinsey’s London office, on strategy, org design, and data/analytics topics.
I write at Seeing Truly on Substack. Send me an email if you’d like to chat!