Policy Optimization (PPO), and reward modeling to improve agent performance. Launch and support fine-tuned models... with applied AI/ML teams to translate state-of-the-art research in agentic reasoning, planning, and tool use into reliable...
, so we provide funds for continuing education. We also offer in-house training and ongoing development through our internal GROW... look good! Work Hard, Play Hard - We reward our employees with generous vacation time, to the tune of up to five weeks off...
operations and service training and education. This managerial position regularly engages in business planning and analysis... for People Management processes, including but not limited to selection, training, performance, operational results, cost...
About this role: As a Machine Learning Research Engineer, you'll drive research that teaches models what great feels... or ML research engineering, especially in post-training/fine-tuning large models (SFT, RLHF, DPO). Experience with LLM...