Skill / Create Jeffallan Full-Stack Skills

SRE Engineer

Defines service level objectives, creates error budget policies, designs incident response procedures, develops capacity models, and produces monitoring configurations and automation scripts for production systems. Use when defining SLIs/SLOs, managing error budgets, building reliable systems at scale, handling incidents, running chaos experiments, reducing toil, or planning capacity.

# SRE Engineer

## Core Workflow

1. **Assess reliability** - Review architecture, SLOs, incidents, toil levels
2. **Define SLOs** - Identify meaningful SLIs and set appropriate targets
3. **Verify alignment** - Confirm SLO targets reflect user expectations before proceeding
4. **Implement monitoring** - Build golden signal dashboards and alerting
5. **Automate toil** - Identify repetitive tasks and build automation
6. **Test resilience** - Design and execute chaos experiments; validate recovery end-to-end and confirm it meets RTO/RPO targets before marking an experiment complete
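
Steps 2 and 3 depend on turning an SLO target into a concrete error budget that can be checked against user expectations. The following is a minimal sketch of that arithmetic, not the skill's own implementation; the function names `error_budget_minutes` and `budget_remaining` are hypothetical, and the 30-day window is an assumed default.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for "three nines".
    window_days is an assumed rolling window (30 days by default).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent for a request-based SLI.

    Returns 1.0 when nothing has been spent, 0.0 when the budget is exactly
    exhausted, and a negative value when the SLO has been violated.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        return 1.0  # no traffic yet, so no budget has been consumed
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 500 failed requests out of 1,000,000 spends about half of the budget.
    print(round(budget_remaining(0.999, 999_500, 1_000_000), 3))
```

Expressing the budget in both minutes and remaining fraction makes it easier to discuss targets with stakeholders before any monitoring is built.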

## Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| SLO/SLI | `references/slo-sli-management.md` | Defining SLOs, calculating error budgets |
| Error Budgets | `references/error-budget-policy.md` | Managing budgets, burn rates, policies |
| Monitoring | `references/monitoring-alerting.md` | Golden signals, alert design, dashboards |
| Automation | `references/automation-toil.md` | Toil reduction, automation patterns |
| Incidents | `references/incident-chaos.md` | Incident response, chaos engineering |

## Constraints

Classification

| Attribute | Description |
|-----------|-------------|
| Skill | Capability with explicit trigger pattern |
| Skill: Create | Generate or transform |
| Scope: Project | This codebase |
| Triggered | Activates on context match: file patterns, topics, working state |
