Scott Viteri

PhD Candidate in Computer Science

Stanford University

Biography

I am a CS PhD candidate at the Center for AI Safety at Stanford University, advised by Prof. Clark Barrett. I am interested in training AIs to be human-like, both in the concepts they contain and in the values they hold. I think the key lies in training a language model to produce its own memory during training and to influence its future training data. In this vein, I have been optimizing pre-trained models to produce chain-of-thought reasoning that is informative both to themselves and to humans. Going forward, I am interested in applying a similar blend of unsupervised learning and reinforcement learning to active learning, in the hope that this sets the preconditions to raise an AI like a child and have that result in something child-like.

During my PhD, my focus has evolved from formal verification and programming languages to AI alignment. Before Stanford, I majored in computer science and electrical engineering at MIT, where I contributed to AI and robotics research. After MIT, I explored interactive theorem proving at CMU with Simon DeDeo, publishing research on abduction in mathematics in Cognition.

My primary character trait is curiosity, and I really love math.

Interests

  • AI Alignment
  • Large Language Models
  • Chain-of-Thought Reasoning
  • Reinforcement Learning
  • Interactive Theorem Proving

Education

  • Ph.D. in Computer Science (emphasis in Artificial Intelligence), present

    Stanford University

  • B.S. in Computer Science and Electrical Engineering, 2018

    Massachusetts Institute of Technology

Publications

(2026). Markovian Transformers for Informative Language Modeling. ICLR 2026.

(2025). Uncovering Latent Chain of Thought Vectors in Large Language Models. ICLR 2025 Workshop.

(2022). Flexible Proof Production in an Industrial-Strength SMT Solver. IJCAR.

(2021). Epistemic Phase Transitions in Mathematical Proofs. Cognition.

Talks

Ontology Identification and Utility Functions

Abstract:

For STS 10SI – Intro to AI Alignment in Winter 2023, I spoke about eliciting latent knowledge (ELK), cooperative inverse reinforcement learning, and shard theory. I first introduce the ELK problem as formulated by Paul Christiano and walk through the main proposals and counterexamples from the ELK document. I then introduce inverse reinforcement learning, cooperative inverse reinforcement learning, and their relationship to the alignment problem. We proceed with a discussion of whether humans can be thought of as having utility functions, leading into a conversation about shard theory. Lastly, we discuss similarities between the limbic-cortex relationship and the alignment problem, and frame utility functions as a story the brain tells about itself.

Ontology Maps for AI Alignment

Abstract:

A presentation on ontology maps as a framework for AI alignment, exploring how mappings between agent representations can be used to ensure AI systems retain human-compatible concepts and values.

Computation, Communication, and Ontology Maps

Abstract:

How can we create intelligent systems that retain and expand what we find valuable in the universe? In this talk I present my own thoughts based on ontology mapping, and I explain why I believe that mathematicians who think about systemic interactions are especially well placed to answer this question. I start with a framework in which read-eval-print loops (REPLs) form a basis for reasoning about agents and computation in general. I then build ontology maps as an alignment framework on top of REPLs and discuss its implications. Lastly, I invite discussion on what we might want out of the future and share my thoughts on the central role of communication and its relation to both REPLs and ontology maps.

Blog Posts

Clifford Algebra and SageMath Tutorial

A tutorial on how to use Clifford Algebras and SageMath to represent geometric objects as vectors in an algebraic manner amenable to direct computation.

Deep Dream for Transformers

An interactive D3.js visualization of GPT-2’s model structure that lets users explore neuron-level explanations across layers.

Joint Text-EEG Embeddings

A proposal to create joint text-EEG embeddings by having participants read language model outputs while wearing EEG headsets, aiming to increase the bandwidth of human feedback for AI alignment.

Democratic AI Constitution: Round-Robin Debate and Synthesis

Proposes using GPT-4 to generate group-specific AI constitutions for diverse demographic and philosophical groups, then having those constitutions debate in a round-robin competition to produce a democratic AI governance framework.

Nature < Nurture for AIs

Argues that nurture (training data and procedure) matters more than nature (architecture) for sufficiently capable AIs, and that the Bitter Lesson implies convergent abstractions and even convergent values.