Course Introduction
- Course Goals
- Course Agenda
Module 1: SRE Principles & Practices
- What is Site Reliability Engineering?
- SRE & DevOps: What is the Difference?
- SRE Principles & Practices
Module 2: Service Level Objectives & Error Budgets
- Service Level Objectives (SLOs)
- Error Budgets
- Error Budget Policies
Module 3: Reducing Toil
- What is Toil?
- Why is Toil Bad?
- Doing Something About Toil
Module 4: Monitoring & Service Level Indicators
- Service Level Indicators (SLIs)
- Monitoring
- Observability
Module 5: SRE Tools & Automation
- Automation Defined
- Automation Focus
- Hierarchy of Automation Types
- Secure Automation
- Automation Tools
Module 6: Anti-Fragility & Learning from Failure
- Why Learn from Failure
- Benefits of Anti-Fragility
- Shifting the Organizational Balance
Module 7: Organizational Impact of SRE
- Why Organizations Embrace SRE
- Patterns for SRE Adoption
- On-Call Necessities
- Blameless Post-Mortems
- SRE & Scale
Module 8: SRE, Other Frameworks, The Future
- SRE & Other Frameworks
- The Future
- Additional Sources of Information
- Exam Preparations
- Exam Requirements, Question Weighting, and Terminology List
- Sample Exam Review
Certification
Passing the 60-minute examination, which consists of 40 multiple-choice questions, with a score of at least 65% leads to the SRE (Site Reliability Engineering) Foundation certificate. The certification is governed and maintained by the DevOps Institute.