field 5+ years of experience in DevOps, MLOps, Data Engineering, Software Engineering or Site Reliability Engineering...Job Description DPR is looking for an experienced Data and MLOps Engineer to join our Data and AI team and work closely...
and maintaining FASTer Way's digital platforms, ecommerce infrastructure, and integrated client experiences. The Principal Engineer..., reliability, and compliance with industry standards. Oversee the development of IT service management strategies to improve the...
, reliability, efficiency, observability, and performance of our products, while maintaining high standards for correctness... that power them, ensuring performance, reliability, and maintainability. Run A/B experiments and analyze results to make data...
quality, testing, observability, reliability, and performance. Oversee end-to-end delivery processes, including requirements... and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site...
, and automated release management to accelerate delivery while improving quality and stability. Champion Site Reliability... Engineering (SRE) principles, including observability, error budgets, incident management, and continuous reliability improvement...
, repeatable systems. You will advance Platform Architecture & Reliability - contribute to the evolution of distributed, service...-oriented architectures that support high scale, resiliency, observability, and safe change across hundreds of tenants...
with observability, diagnostics, and live-site operations for mission-critical services. Experience working in environments with limited..., monitoring, and live-site support. Collaborate with cross-functional partners and partner engineering teams to translate product...
High-Impact Backend Role | AI-Native Legal Platform New York / On-Site Full-time | Mid–Senior.... This is a product-first backend role where reliability, safety, and extensibility are mission-critical. What You'll Build Core...
, applying site reliability engineering principles to drive automation, observability, and resilience across the data platform... Required Qualifications 5+ years in platform engineering, data platform operations, site reliability engineering, DevOps, or related roles...
-site health: improve observability, monitoring/alerting, incident response, and reduce time-to-diagnosis through systemic... of improving reliability, performance, and operational excellence through observability and systematic engineering practices....
. Integrate AI systems with code repositories, CI/CD pipelines, observability tools, and security/compliance frameworks to enhance... reliability and performance. Drive best practices, design reviews, and technical direction, ensuring data governance, security...
delivery schedules, drive alignment across partner teams, and ensure proper end-to-end testing, live-site coverage, scalability..., production reliability, and security hardening for both protections and detections. Hold accountability as a designated...
observability tools (logging, metrics, tracing) to diagnose service issues and improve system reliability. Experience.... Build extensible, maintainable services and features with strong diagnosability, reliability, and production-readiness...
, reliability, fault tolerance, and cost optimization. Experience using observability tools (logging, metrics, distributed tracing..., security best practices, and deployment infrastructure. Maintain operations of live site services on a rotational on-call basis...
. Guarantee Reliability and Security: Define and meet rigorous SLIs/SLOs by engineering robust observability stacks (Prometheus... architectures. Observability & Reliability Mindset: Experience building comprehensive monitoring and logging frameworks (ELK...
. Ensure secure, high-quality product delivery, overseeing system architecture and code quality. Champion Live Site culture..., ensuring reliability and customer delight, and mentor engineers, shaping the vision for agentic AI-powered work management. Seek...
improvements across agentic workflows. Oversee Live Site operations for agentic systems, ensuring reliability, rapid incident... for agent interoperability, real-time processing, and fault tolerance. Drive performance optimization and observability...
advanced deployment and support of enterprise software solutions, digital intelligence (monitoring and observability... their development and deployment processes. Mentor junior engineers on automation, observability, and continuous delivery concepts...
, and deployment of applications. System Reliability and Scalability: Implementing Site Reliability Engineering (SRE) principles... to enhance system reliability, availability, and performance. Monitoring and Optimization: Implementing monitoring...