with RL for sequence models, post-training, preference-based learning, or agentic systems. Experience with modern research... Snapshot: We are starting a small team aimed at building a real science of post-training for agents. This involves...
academic community. Our focus areas are: LLM Training (Continued Pretraining, Instruction Tuning, Reinforcement Learning... Interested in training and evaluating large-scale LLMs ( 200B) in a frontier research team focused on AI impact...