Course Overview

This graduate seminar covers the latest techniques and applications of AI agents that continuously improve through interaction with themselves and their environment. The course starts with self-improvement techniques for LLMs, such as constitutional AI, verifiers, test-time compute scaling, and combining search with LLMs. We then discuss the latest research on augmenting LLMs with tool use, memory, and retrieval, and on orchestrating AI capabilities with multimodal web interaction. Finally, we cover multi-step reasoning and planning for agentic workflows, and the challenges of building robust evaluation and orchestration frameworks.

Our goal is for students to learn from the latest research papers, discuss the suggested readings in each class, work on an original research project in this area, and hear from invited academic and industry speakers about applications in building coding agents, STEM research assistants, and agent orchestration frameworks.

Course Staff

Mert Yuksekgonul
Course Assistant
Jon Saad-Falcon
Course Assistant

Logistics


Schedule

# | Date | Description | Paper Readings* | Deadlines
1 | Mon Jan 6 | Course Overview | |
2 | Fri Jan 10 | Test-time Compute Scaling | |
3 | Mon Jan 13 | Self-Improvement Techniques with Verifiers | |
4 | Fri Jan 17 | Self-Improvement Techniques with RL | |
  | Mon Jan 20 | MLK Day - No classes | |
5 | Fri Jan 24 | Self-Improvement Techniques with Search | | Project proposal due @10pm; Homework 1 released
6 | Mon Jan 27 | Open-ended Agent Learning in the Era of Foundation Models (Guest Lecture: Prof. Jeff Clune, UBC/Google DeepMind) | |
7 | Fri Jan 31 | Augmenting LLMs with Tool Use/Actions | |
8 | Mon Feb 3 | Planning and Multi-Step Reasoning | | Homework 1 due Feb 4 @10pm; Homework 2 released Feb 5
9 | Fri Feb 7 | Reasoning across Modalities (incl. invited talk on Gemini Multimodal) | |
10 | Mon Feb 10 | Benchmarks & Challenges in Evaluating Agents | |
11 | Fri Feb 14 | AI Coding Agents (Guest Lecture: Michele Catasta, Replit) | | Homework 2 due Feb 18 @11:59pm
  | Mon Feb 17 | Presidents' Day - No classes | |
12 | Fri Feb 21 | Midterm Progress Presentations | |
13 | Mon Feb 24 | Midterm Progress Presentations | | Midterm Progress Presentation submission to Gradescope
14 | Fri Feb 28 | Agent Orchestration Frameworks (Guest Lecture: Chi Wang, AutoGen) | |
15 | Mon Mar 3 | Augmenting LLMs with Retrieval/Memory | |
16 | Fri Mar 7 | Guest Lecture: Lukasz Kaiser (OpenAI) | |
17 | Mon Mar 10 | Multimodal AI Agents (Guest Lecture: Prof. Ruslan Salakhutdinov, CMU/Meta) | |
18 | Fri Mar 14 | Multi-agent Systems & Future Research Areas | |
19 | Wed Mar 19 | Final Project Poster Presentation | | Project Final Report due Mar 21 @10pm
*Paper readings may be updated closer to the class date.

Grading

Student Lectures

Most weeks there will be at least one Student Lecture. Each Student Lecture will cover roughly three papers. Each paper is presented by a team of 2-3 students for about 15 minutes. The student lecture counts for 15% of the grade.

Format: Three paper presentations of 15 minutes each, followed by 10-minute Q&A sessions. During office hours, the presenting team can get feedback from the instructors and TAs to improve their presentation.

Weekly Discussion Questions

Students must submit 2 discussion questions for each student lecture via Gradescope. Questions are due by 10am on the day of the class. Students can submit discussion questions for 12 lectures over the quarter: 10 student lectures and 2 guest lectures. Each lecture's questions are worth 1 point; for example, submitting questions for 10 of the 12 lectures earns 10 points toward the final grade. Discussion questions should demonstrate engagement with the papers' key concepts and methodologies.

Homework Assignments

There will be two homework assignments. Homework 1 will be released on Jan 24 and is due on Feb 4 at 10pm. Homework 2 will be released on Feb 5 and is due on Feb 18 at 11:59pm. Both assignments are designed to build intuition for the basics of self-improvement, multi-step reasoning, and tool use techniques.

Research Projects

Because this is a graduate seminar, research is a major part of the class. Students will work in teams of 2 to 4 to complete an original research project. Projects should broadly relate to the research areas discussed in class and to benchmarks for agent workflows. Students will receive API credits to support their development work.

The final report is due on Mar 21 at 10pm.

Course Policies

Late Policy

Audit Policy

Audits are not allowed for this course.

Communication with Course Staff

Industry Sponsorship

We are grateful to Google DeepMind, Anthropic, OpenAI and Together AI for sponsoring API credits for this class.