You'll architect resilient, cloud-native backend services that power large-scale ad delivery and campaign planning tools... and development of scalable, maintainable services that power ad buying and its more complicated parts. Architect and evolve the...
production systems used by external enterprise customers. You will work alongside an established architect and senior product... speed of delivery with long-term operability and cost control. SRE & Reliability Practices - Establish and enforce SRE...
across both streaming and file-based data flows. Reliability & SRE: Establish SLOs/SLAs, observability standards, multi-region resiliency... Professional certifications (Cloud Architect, Data Engineer). Excellent communication and stakeholder influence. Proven expertise...
in evolving the way networks are managed—leveraging automation frameworks, observability tools, and SRE best practices to support... configurations, enforcing access policies across our data centers and AWS environments, and developing observability and monitoring...
and available for our customers. By curating observability data and insights, and finding opportunities for improvement through technical system... to collaborate on necessary changes. You will ensure our SRE team is a trusted partner to development and operational teams, helping...
This is a player-coach role: you'll architect and implement compliant solutions directly, while also guiding, mentoring, and enabling... performance reviews, and promote professional development across the team. Architect and execute cloud deployment strategies...
and evidence for governance and audit. Collaborate with Product Owners, Developers, SRE/Observability, and Governance to ensure... Center of Excellence (AI CoE), you will architect and drive the end-to-end AI Quality program from inception. This is a hands...
monitoring, observability, and alerting. Experience managing shared, multi-tenant services. SRE experience is a plus. Experience... Architect (RSA) on the design and execution plans regarding implementation and associated integrations of the HashiCorp tool...
You Have: 7+ years of experience in platform engineering, DevOps, or Site Reliability Engineering (SRE), supporting production... determination based on client requirements. Bachelor’s degree. Ability to obtain an AWS Solutions Architect – Associate...
in Agile/Lean, Platform Engineering, MLOps, SRE, Continuous Delivery, Cloud, Containers/Kubernetes, and compliance-by-design..., resilience, observability, cost). Ensure platform roadmaps consistently account for compliance, security, operations, performance...
reliability engineers, and leadership stakeholders to improve situational awareness through enterprise monitoring, observability..., analytics, alerts, and ITSI capabilities Architect complex SPL, correlation logic, and data models to support operational...
—recognized globally for its innovation, scale, and engineering excellence. As a Lead Software Engineer, you’ll help architect... As a Lead Software Engineer (Java/Python) and drive large-scale digital transformation. You will architect and modernize...
Collaborate with SRE and engineering teams to improve reliability, observability, and operational efficiency. Participate..., and end user support. This role is rooted in modern IT Operations but works closely with our SRE team to improve reliability...
pipelines, observability, release management, and reliability engineering (SRE). Innovation & Customer Focus: Understand... management. DevOps: building and operating CI/CD pipelines, infrastructure as code (IaC), observability (logs/metrics/traces...
scalable and resilient infrastructure on AWS. Architect and maintain Windows/Linux based environments, ensuring seamless... AI adoption at the platform level. Implement observability, security, data privacy and cost-optimization strategies specifically...
observability to manage complex routing, failover, and multi-region connectivity. - AI/ML & GPU Infrastructure: Architect large... teams spanning platform engineering, SRE, networking/traffic, storage and databases, data infrastructure, and GPU/ML...
enterprise-grade CI/CD pipelines end-to-end. Architect and manage highly available cloud infrastructure on AWS / Azure / GCP... monitoring, logging, and observability using Prometheus, Grafana, ELK, Splunk, Datadog, etc. Manage production issues, identify...
Reliability Engineering (SRE) best practices into our workflows. Key Responsibilities Cloud-Native Development Architect... in processes, tools, and technologies to maintain a competitive edge. Implement monitoring and observability solutions (e.g...
through sophisticated automation. What You Will Do Architect Reliability: Establish SRE as a core function at Laravel... Kubernetes clusters, building robust observability systems, and solving complex operational puzzles with code, you’ve found...