Sunday, February 1, 2026

Platform Engineering - SLA vs SLO & Platform Org Chart

When building and operating platform services—whether it’s cloud hosting, APIs, or IT support—you’ll often hear the terms SLA and SLO. They might sound similar, but they serve very different purposes.

What is a Service?

A service is a self-contained, reusable capability that the platform exposes to customers. It delivers value and solves a specific problem without requiring the customer to manage the underlying complexity.

  • Services are technical units of functionality
  • Customers interact with services via APIs, UIs, or SDKs
  • Services can be part of a larger product

Examples of platform services:

  • Compute / Virtual Machine
    • Service Name: vServer
      • Description: Virtual servers with configurable CPU, memory, and storage
      • SLA:
        • 99.999% uptime
      • SLO:
        • VM deployment < 5 min
  • Compute / Bare Metal
    • Service Name: pServer
      • Description: High-performance computing, compliance workloads
      • SLA:
        • 99.9% uptime
      • SLO:
        • provisioning < 30 min
  • Storage / Block Storage (vDisk)
    • Service Name: vHDD
      • Description: Capacity block storage device
      • SLA:
        • 99.999% uptime
        • 1 IOPS per GB (32 kB) at < 10 ms
      • SLO:
        • provisioning < 5 min
    • Service Name: vSSD
      • Description: Capacity block storage device
      • SLA:
        • 99.999% uptime
        • 3 IOPS per GB (32 kB) at < 5 ms
      • SLO:
        • provisioning < 5 min
  • etc. 
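
A catalog like the one above can also be kept as structured data, so that guarantees such as "IOPS per GB" can be computed per volume. The following is only a minimal sketch: the `Service` class, its field names, and the `guaranteed_iops` helper are hypothetical names chosen for illustration, and the values are simply copied from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Service:
    """One catalog entry; field names are illustrative assumptions."""
    category: str
    name: str
    sla_uptime_pct: float                   # contractual uptime (SLA)
    slo_provisioning_min: int               # internal provisioning target (SLO)
    sla_iops_per_gb: Optional[int] = None   # storage services only
    sla_latency_ms: Optional[int] = None    # storage services only, upper bound


# Values copied from the example list of platform services.
CATALOG = {
    s.name: s
    for s in (
        Service("Compute / Virtual Machine", "vServer", 99.999, 5),
        Service("Compute / Bare Metal", "pServer", 99.9, 30),
        Service("Storage / Block Storage", "vHDD", 99.999, 5, 1, 10),
        Service("Storage / Block Storage", "vSSD", 99.999, 5, 3, 5),
    )
}


def guaranteed_iops(service_name: str, size_gb: int) -> int:
    """IOPS promised by the SLA for a volume of the given size."""
    svc = CATALOG[service_name]
    if svc.sla_iops_per_gb is None:
        raise ValueError(f"{service_name} has no IOPS guarantee")
    return svc.sla_iops_per_gb * size_gb


# A 500 GB volume: vHDD promises 1 IOPS per GB, vSSD promises 3 IOPS per GB.
print(guaranteed_iops("vHDD", 500), guaranteed_iops("vSSD", 500))
```

Keeping SLA and SLO values in separate, explicitly named fields makes it harder to confuse the external promise with the internal target.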

SLA – Service Level Agreement

An SLA is a formal contract between a service provider and a customer. It defines what the customer can expect from the service and usually includes consequences if the service falls short.

Example:
“Our cloud storage guarantees 99.9% uptime per month. If we fail, you receive a service credit.”

SLO – Service Level Objective

An SLO is an internal goal for service performance. It helps teams measure and improve service quality but isn’t usually legally binding.

Example:
“Our API endpoints aim for 99.95% uptime per month.”

The Key Difference

  • SLO = internal target
  • SLA = external promise

Think of it like this: you aim for an SLO, but you commit to an SLA.

Why It Matters

Understanding SLAs vs SLOs helps teams:

  • Set realistic goals
  • Measure service quality accurately
  • Manage customer expectations clearly

Getting this distinction right is a cornerstone of good service management.

Example of a detailed SLA: Cloud Storage Service

Product: S3 Cloud Storage as a Service

SLA (Service Level Agreement) – What the provider promises to the customer:

  • Uptime guarantee: 99.9% per month
    • Allowed downtime: about 43 min 50 s per month (0.1% of an average 30.44-day month)
  • Data durability: 99.9999999% per year (9 nines)
    • With 1 million objects, the expected loss is 0.001 objects per year: practically zero.
    • With 1 billion objects, expect about 1 object lost per year, which is still within the SLA.
    • The SLA doesn’t guarantee zero data loss, but the chance of losing any given object in a year is tiny (1 in a billion).
  • Support response time: within 2 hours for critical issues
    • Our SME is available to you within 2 hours.
    • We guarantee the response time, not the time to fix.
  • Penalty if broken: service credits equivalent to 10% of the monthly bill
  • In short: “We guarantee our service will be available 99.9% of the time each month, with almost no data loss, and our subject matter experts are ready for you. If not, you get a credit.”

Cost: 100 credits (a credit can be in USD, EUR, CZK, or any other currency)
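
The numbers in this SLA example are simple arithmetic, and it can help to reproduce them. The sketch below is illustrative only: the function names are made up for this post, the month length is assumed to be an average month (365.25 / 12 days), and the 100 credits are assumed to be the monthly bill.

```python
# Assumption: an "average" month of 365.25 / 12 = 30.4375 days.
AVG_DAYS_PER_MONTH = 365.25 / 12


def allowed_downtime_minutes(uptime_pct: float, days: float = AVG_DAYS_PER_MONTH) -> float:
    """Minutes of downtime that still satisfy the uptime percentage."""
    return (1 - uptime_pct / 100) * days * 24 * 60


def expected_objects_lost(durability_pct: float, objects: int) -> float:
    """Expected number of objects lost per year at the given durability."""
    return (1 - durability_pct / 100) * objects


def service_credit(monthly_bill: float, credit_pct: float = 10.0) -> float:
    """Credit owed after an SLA breach (10% of the bill in the example)."""
    return monthly_bill * credit_pct / 100


minutes = allowed_downtime_minutes(99.9)
print(f"Allowed downtime: {int(minutes)} min {round(minutes % 1 * 60)} s")
print(f"Expected loss, 1 million objects: {expected_objects_lost(99.9999999, 10**6):.3f}")
print(f"Expected loss, 1 billion objects: {expected_objects_lost(99.9999999, 10**9):.1f}")
print(f"Credit on a 100-credit bill: {service_credit(100):.0f} credits")
```

With these assumptions the script reports roughly 43 min 50 s of allowed downtime, 0.001 and 1.0 expected lost objects, and a credit of 10 credits.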

Example of an SLO: Internal API Endpoint

Product: Internal REST API Endpoint

SLO (Service Level Objective) – What the provider aims for internally:

  • Uptime target: 99.95% per month
  • API request latency: 95% of requests under 200 ms
  • Backup success rate: 100% per day
  • In short: “Internally, we aim to exceed the SLA and keep our service as reliable and fast as possible.”

Responsibility: Platform Engineering SRE / DevOps Engineers

Accountability: Platform Engineering Lead / Manager with Product Owner / Platform Product Manager
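
To make the SLO targets above concrete, here is a rough sketch of how a team might check them against measurements. The function names, the sample data, and the 30-day reporting period are assumptions made for illustration; real monitoring stacks usually compute these values for you.

```python
from typing import Dict, Sequence


def fraction_under(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Share of requests that completed faster than threshold_ms."""
    if not latencies_ms:
        raise ValueError("no requests observed")
    fast = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return fast / len(latencies_ms)


def uptime_pct(downtime_min: float, period_min: float) -> float:
    """Uptime percentage for a period, given the minutes of downtime."""
    return 100.0 * (1 - downtime_min / period_min)


def check_slos(latencies_ms: Sequence[float], downtime_min: float,
               period_min: float, backups_ok: int, backups_total: int) -> Dict[str, bool]:
    """Compare measurements with the three SLO targets from the example."""
    return {
        "uptime": uptime_pct(downtime_min, period_min) >= 99.95,
        "latency": fraction_under(latencies_ms, 200.0) >= 0.95,
        "backups": backups_ok == backups_total,
    }


# Hypothetical month: 96 of 100 sampled requests were fast, 20 minutes of
# downtime in a 30-day month (43,200 minutes), and all 30 daily backups succeeded.
sample = [120.0] * 96 + [450.0] * 4
print(check_slos(sample, downtime_min=20, period_min=30 * 24 * 60,
                 backups_ok=30, backups_total=30))
```

A missed SLO is a signal for the team to act before the external SLA is at risk; it does not by itself trigger a penalty.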

Well, now there is another question. 

What is the difference between Responsibility and Accountability?

Quick way to remember:

  • Responsibility = doing
  • Accountability = owning

Responsibility

  • Definition: Being tasked with doing something or completing a specific duty.
  • Focus: The work itself.
  • Who it applies to: The person or team who actually performs the work.
  • Key idea: “I am responsible for completing this task.”
  • Example:
    • A developer is responsible for writing and maintaining the code of a REST API, including its functions and data validity.
    • An SRE / Platform Engineer is responsible for the availability of the API endpoint.

Accountability

  • Definition: Being answerable for the outcome of a task or decision, regardless of who did the work.
  • Focus: The results or impact.
  • Who it applies to: The person who owns the outcome and must report or justify it.
  • Key idea: “I am accountable if this task succeeds or fails.”
  • Example: The project manager is accountable for the feature being delivered on time, even if the developer does the coding.

The Platform Engineering team as a whole delivers the service (doing), but the Platform Engineering Lead or Platform Product Owner is typically accountable for ensuring that the platform works (owning), meets its SLAs, and enables developer productivity. The next two sections explain this in more detail.

RACI

The RACI table below defines the responsibility and accountability of the various Platform Engineering roles:

  • Platform Product Owner
  • Platform Engineering Lead (Manager)
  • Platform Architect
  • Platform Engineers
  • DevOps
  • SREs
  • Operations

| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
| Activity / Service                        | Accountable (A)           | Responsible (R)          | Consulted (C)               | Informed (I)          |
| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
| Platform roadmap & feature prioritization | Platform Product Owner    | Platform Engineers       | Developer teams, Architects | CTO, VP Engineering   |
| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
| Platform design & architecture            | Platform Engineering Lead | Platform Engineers       | Security, Architects        | Developer teams       |
| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
| Build & maintain CI/CD pipelines          | Platform Engineering Lead | DevOps / SREs            | Developer teams             | CTO, Product Managers |
| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
| Infrastructure provisioning & automation  | Platform Engineering Lead | Platform Engineers, SREs | Security, Cloud Ops         | Developer teams       |
| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
| Platform monitoring & incident response   | Platform Engineering Lead | SRE / Operations         | Developer teams             | CTO, VP Engineering   |
| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
| Developer enablement & support            | Platform Product Owner    | Platform Engineers       | Developer teams             | CTO, VP Engineering   |
| ----------------------------------------- | ------------------------- | ------------------------ | --------------------------- | --------------------- |
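
A RACI matrix is easy to keep as data and to sanity-check automatically. The sketch below transcribes the rows of the table into a Python dictionary and checks a common RACI convention: exactly one Accountable and at least one Responsible per activity. The names `RACI`, `validate`, and `activities_where` are hypothetical, and the convention itself is an assumption you may want to relax for your organization.

```python
from typing import Dict, List

Raci = Dict[str, Dict[str, List[str]]]

# Rows transcribed from the RACI table; keys A/R/C/I follow the column headers.
RACI: Raci = {
    "Platform roadmap & feature prioritization": {
        "A": ["Platform Product Owner"], "R": ["Platform Engineers"],
        "C": ["Developer teams", "Architects"], "I": ["CTO", "VP Engineering"],
    },
    "Platform design & architecture": {
        "A": ["Platform Engineering Lead"], "R": ["Platform Engineers"],
        "C": ["Security", "Architects"], "I": ["Developer teams"],
    },
    "Build & maintain CI/CD pipelines": {
        "A": ["Platform Engineering Lead"], "R": ["DevOps", "SREs"],
        "C": ["Developer teams"], "I": ["CTO", "Product Managers"],
    },
    "Infrastructure provisioning & automation": {
        "A": ["Platform Engineering Lead"], "R": ["Platform Engineers", "SREs"],
        "C": ["Security", "Cloud Ops"], "I": ["Developer teams"],
    },
    "Platform monitoring & incident response": {
        "A": ["Platform Engineering Lead"], "R": ["SREs", "Operations"],
        "C": ["Developer teams"], "I": ["CTO", "VP Engineering"],
    },
    "Developer enablement & support": {
        "A": ["Platform Product Owner"], "R": ["Platform Engineers"],
        "C": ["Developer teams"], "I": ["CTO", "VP Engineering"],
    },
}


def validate(matrix: Raci) -> List[str]:
    """Return a list of problems; an empty list means the matrix passes."""
    problems = []
    for activity, roles in matrix.items():
        if len(roles.get("A", [])) != 1:
            problems.append(f"{activity}: needs exactly one Accountable")
        if not roles.get("R"):
            problems.append(f"{activity}: needs at least one Responsible")
    return problems


def activities_where(matrix: Raci, role: str, letter: str) -> List[str]:
    """Activities in which the role holds the given RACI letter."""
    return [a for a, roles in matrix.items() if role in roles.get(letter, [])]


print(validate(RACI))
print(activities_where(RACI, "Platform Product Owner", "A"))
```

Keeping a single Accountable per activity is what makes "owning" unambiguous: if two roles are accountable, in practice nobody is.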

Roles

In the table below, the various platform engineering roles are explained.

| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| Role                 | Focus                     | Mindset / Goal             | Typical Work                    |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| Platform Product     | Value, usability, and     | Product manager mindset;   | Own the platform backlog;       |
| Owner                | adoption of the platform  | maximize platform value;   | gather user needs;              |
|                      |                           | ROI rather than features   | translate needs into            |
|                      |                           |                            | requirements;                   |
|                      |                           |                            | define platform objectives;     |
|                      |                           |                            | manage trade-offs between       |
|                      |                           |                            | features, reliability, and      |
|                      |                           |                            | technical debt;                 |
|                      |                           |                            | drive adoption and feedback     |
|                      |                           |                            | loops                           |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| Platform Engineering | Strategy, ownership, and  | Platform productivity,     | Define platform vision &        |
| Lead                 | evolution of the internal | reliability & long-term    | roadmap;                        |
|                      | platform                  | platform sustainability    | set standards and SLAs;         |
|                      |                           |                            | prioritize platform             |
|                      |                           |                            | initiatives;                    |
|                      |                           |                            | align with business, security,  |
|                      |                           |                            | and architecture;               |
|                      |                           |                            | lead and mentor platform        |
|                      |                           |                            | engineers;                      |
|                      |                           |                            | ensure platform adoption &      |
|                      |                           |                            | value                           |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| Platform Architect   | Technical architecture &  | Systems design trade-offs; | Define platform architecture    |
|                      | long-term design of       | design for scalability,    | and reference designs;          |
|                      | the platform              | resilience, security,      | select core technologies and    |
|                      |                           | and evolvability while     | patterns;                       |
|                      |                           | keeping platform usable    | set architectural standards     |
|                      |                           |                            | and guardrails;                 |
|                      |                           |                            | review platform changes;        |
|                      |                           |                            | ensure alignment with           |
|                      |                           |                            | architecture, security, and     |
|                      |                           |                            | compliance                      |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| Platform Engineer    | Developer of the platform | Enable developers;         | CI/CD, IaC, APIs,               |
|                      |                           | must have product mindset  | developer tools                 |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| DevOps               | Dev<->Ops collaboration   | Speed and automation;      | CI/CD, pipelines, IaC,          |
|                      |                           | culture and philosophy     | deployment tooling              |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| SRE                  | Reliability & uptime      | Prevent outages; proactive | Monitoring, incidents,          |
|                      |                           | automation                 | scaling, postmortems            |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
| Operations (Ops)     | System/admin maintenance  | Stability; reactive        | Patching, backups,              |
|                      |                           |                            | incident handling, networking   |
| -------------------- | ------------------------- | -------------------------- | ------------------------------- |
 

Clear distinction between roles

  • Platform Product Owner → decides why the platform exists and what problems it solves
  • Platform Architect → decides how the platform should be built
  • Platform Engineering Lead → decides what gets built and why
  • Platform Engineer → designs and builds the platform
  • DevOps Engineer → ensures the platform is deployable, scalable, and maintainable
  • Site Reliability Engineer (SRE) → ensures platform reliability and operational excellence
  • Operations / NOC / Support Engineer → handles day-to-day operational support

Typical Platform Org Chart

A Platform Engineering org chart could look like the one in the drawing below.

[Figure: Platform Engineering Org Chart]
