Platform Observability & SRE Enablement

Role: Architect / SRE Lead

Company: HHAeXchange

Duration: 2022-08-01 – Present

Project Overview

Designed and implemented platform-wide observability and SRE practices across
FMS Engine systems and supporting infrastructure.

Instrumented Linux services, systemd units, Monit-managed processes,
background workers (Resque), and ancillary dependencies to emit structured
metrics and health signals into Datadog.
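
As an illustration of the kind of signal described above, a minimal sketch of emitting a Resque queue-depth gauge to a local Datadog Agent over DogStatsD might look like the following. The metric name, tag keys, and helper functions are hypothetical and not taken from the project; the sketch assumes an Agent listening for DogStatsD on UDP port 8125, which is the Agent default.

```python
import socket


def format_gauge(name, value, tags=None):
    """Build a DogStatsD gauge datagram: <name>:<value>|g|#<tag>:<val>,..."""
    payload = f"{name}:{value}|g"
    if tags:
        # Sort tags so the datagram is deterministic and easy to test.
        tag_str = ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
        payload += f"|#{tag_str}"
    return payload


def send_gauge(name, value, tags=None, host="127.0.0.1", port=8125):
    """Send one gauge over UDP (fire-and-forget) and return the bytes sent."""
    datagram = format_gauge(name, value, tags).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))
    return datagram


if __name__ == "__main__":
    # Hypothetical metric and tags, for illustration only.
    send_gauge("resque.queue.depth", 42, {"queue": "default", "tenant": "acme"})
```

Tagging each sample by queue and tenant is what would let a dashboard or monitor slice backlog per tenant in a multi-tenant environment, rather than relying on host-level resource metrics alone.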

Built real-time dashboards and alerting aligned to service behavior rather
than raw resource usage, enabling faster detection of degraded states,
worker backlogs, and dependency failures across multi-tenant environments.

Established a shared operational view of system health and reduced mean
time to detection (MTTD) and mean time to recovery (MTTR) for production
incidents.