ARBOx

(Alignment Research Bootcamp Oxford)


When: June 30th - July 11th, 2025

What: A 2-week intensive bootcamp to rapidly build skills in ML safety, including building GPT-2-small, learning interpretability techniques, understanding RLHF, and replicating key research papers.

Who: Ideal for those new to mechanistic interpretability, with basic familiarity with linear algebra, Python, and AI safety (e.g., AI Safety Fundamentals). Oxford students are particularly encouraged to apply.

Apply now!

(Applications close at the end of the day, anywhere on Earth, on 25th May 2025.)


Programme Details

The curriculum includes content from ARENA, whose alumni have gone on to become MATS scholars, LASR participants, and AI safety engineers at organizations such as Apollo Research, Anthropic, METR, and OpenAI; some have even founded their own AI safety initiatives.

Who should apply?

We’re looking for applicants who are new to mechanistic interpretability. Basic familiarity with linear algebra, Python programming, and AI safety is expected (e.g., having completed AI Safety Fundamentals or another fellowship).

You do not need to be an Oxford student to participate. ARBOx is designed to upskill participants in ML safety, targeting those who would benefit from this training, regardless of their background.

We think you would be a good fit if you are a postgraduate student or a working professional, though we will also consider strong undergraduate applicants.


Syllabus

We’ll cover building GPT-2-small, mechanistic interpretability techniques, RLHF, and replications of key research papers.

We’ll have lectures in the morning covering aspects of the syllabus, and the rest of the day will be spent pair-programming. During the lunch break there will be short talks from experts in the field, and we plan to run a couple of socials for participants.
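
To give a flavour of the hands-on work, here is a minimal, hypothetical sketch (not taken from the ARBOx or ARENA materials) of the kind of interpretability exercise involved: loading GPT-2-small with the open-source TransformerLens library and inspecting a cached attention pattern.

```python
# Hypothetical illustration only -- not part of the ARBOx curriculum materials.
# Assumes the open-source TransformerLens library (pip install transformer_lens).
from transformer_lens import HookedTransformer

# Load GPT-2-small with its internals exposed via hooks
model = HookedTransformer.from_pretrained("gpt2")

prompt = "Mechanistic interpretability aims to reverse-engineer neural networks."
tokens = model.to_tokens(prompt)

# Run the model while caching every intermediate activation
logits, cache = model.run_with_cache(tokens)

# Look at the attention pattern of head 0 in layer 0
attn = cache["pattern", 0][0, 0]  # shape: (seq_len, seq_len)
print(attn.shape)
```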