public cloud footprint, DevOps pipelines and processes, and observability practices. Job Responsibilities: Design...
and ESXi. Strong knowledge of Linux systems and networking. Experience with monitoring, logging, alerting & Observability...
observability tools, monitoring, and alerting systems to proactively identify and resolve potential issues. Strike a balance..., and implementing preventive measures. Familiarity with observability tools, monitoring, and alerting systems. A commitment...
, Docker, Apache-Tomcat JVM, Profiling Monitoring tools like Elastic APM, Observability Java or Python or Shell...
with monitoring/logging tools for API observability. Ability to work in Agile setups and coordinate closely with frontend, QA...
, observability, and SRE-style reliability practices. Proven experience establishing CI/CD for data and models with automated tests... to coaching and uplifting teams on data/ML testing, observability, CI/CD, trunk-based development/XP, and writing clear...
-tolerance improvements, and system observability enhancements. Participate in code reviews, design discussions, and agile...
and coding standards. Contribute to performance optimization, fault-tolerance improvements, and system observability...
systems. Familiarity with AI/ML integration in data platforms. Experience with data observability and monitoring tools...
, observability, and maintainability. Support capacity planning, performance tuning, and disaster recovery strategies in partnership...
-grade AI systems - from agent frameworks and LLM pipelines to observability and evaluation layers that ensure reliability... and configuration using Terraform and Infrastructure as Code principles. Implement monitoring, logging, and observability solutions...
AI systems - from agent frameworks and LLM pipelines to observability and evaluation layers that ensure reliability... pipelines. Experience with AI observability tools and practices (e.g., logging, monitoring, tracing, metrics for AI agents...
AI systems - from agent frameworks and LLM pipelines to observability and evaluation layers that ensure reliability...
with modularity, observability, and fault tolerance to ensure reliability and maintainability. Proven hands-on experience with cloud...
, and document systems. You will design and implement observability and incident response frameworks, scale high-performance... observability and incident response pipelines for LLM, API, and backend systems Define SLAs, SLIs, alerts, and dashboards...
, NeoLoad or equivalent). Extensive experience with APM/observability tools (Dynatrace, Datadog, AppDynamics, etc.). Extensive...
role to fit your skills and interests, whether it be in CI/CD, deployment automation, or observability engineering...
of infrastructure monitoring tools like Nagios, New Relic. Knowledge of GCP and/or observability platforms such as Splunk. Experience...