CS 362: Research in AI Alignment (Fall 2024)

Sep 1, 2024

Course Information

Instructor Name(s): Scott Viteri
Teaching Assistant: K. Fronsdal
Course Faculty Sponsor: Clark Barrett
Graduate-level course, open to advanced undergraduates (contact course instructor)
3 Units, Fall 2024, ExploreCourses
Intro Lecture Slides

Course Description

In this course, we will explore the current state of research in AI alignment, the field that seeks to bring increasingly intelligent AI systems in line with human values and interests. As the energy in the AI alignment landscape has increasingly focused on political considerations, we seek to create a space to discuss which direction we should be pointing in, now that we have a better idea of what AI scaling will look like in the near future. This is a philosophical task, and we will invite several speakers with a philosophical orientation, but we also find that several of the most relevant philosophical questions cannot be asked without strong technical familiarity with the specifics of language models and reinforcement learning. The format will consist of weekly lectures in which speakers present their relationships to the alignment problem and their current research approaches. Before each speaker's lecture, we will assign corresponding readings along with some form of active engagement with the material: we will accept a blog post responding to the ideas in the readings, but we encourage Jupyter notebooks that engage with the technical material directly. This course therefore requires research experience, preferably using mathematical and programming tools (e.g. Python, PyTorch, calculus), and is a graduate-level course open to advanced undergraduates.