Paul Finlayson Adams
Tags: platform-engineering, developer-experience, devops

The Platform Engineering Playbook: From 50 to 300 Engineers

22 min read

There's a pattern every growing engineering organisation hits somewhere between 50 and 150 engineers. Deploys start taking 30 minutes. Flaky tests eat whole afternoons. New engineers take three weeks to get their first PR into production. The people who know how the CI pipeline actually works are leaving, and nobody has time to document it properly. Meanwhile, the Atlassian 2024 State of Developer Experience report found that 69% of developers lose eight or more hours per week to inefficiencies. That's 20% of engineering capacity evaporating into toil rather than product.

Platform engineering is the fix. But most teams either build the wrong thing or build it too soon. This post is the practical guide for organisations in the 50-to-300 engineer range: when to form a platform team, what it should own, how to structure it, and how to know whether it's working.

Key Takeaways

  • In 2024, the DORA Report found elite-performing teams deploy 182x more frequently than low performers and recover from incidents 2,293x faster. The gap is structural, not just cultural.
  • The trigger for a platform team isn't headcount; it's pain signals: deploy times over 20 minutes, engineers blocked on infrastructure, onboarding taking weeks.
  • Platform teams work best with clear internal customers, a product mindset, and SLA-like commitments to the teams they serve.

What Is Platform Engineering? (And What It's Not)

Platform engineering builds and maintains the internal infrastructure, tooling, and workflows that product engineers use to build, test, and deploy software. The platform team's customer is the internal developer. That framing matters more than it sounds. According to the CNCF Annual Survey 2024, 28% of organisations now have a dedicated platform engineering team, but the definition of what that team does varies wildly, which is where a lot of organisations go wrong.

Platform engineering isn't DevOps. DevOps is a culture and a set of practices; platform engineering is the organisational structure that makes those practices scale. It's not SRE, which is reliability-focused and tends to own production stability rather than developer tooling. And it's not just "the infra team" renamed. The key distinction: platform teams treat their tooling like a product, with users, adoption, feedback loops, and iteration cycles.

From practice: At Deliveroo-scale, a few hundred engineers across many squads, I saw the same confusion play out repeatedly. Teams would hire infrastructure engineers, call them "platform", and then have them spend 80% of their time on reactive ops work: unblocking builds, fixing broken Kubernetes configs, handling one-off requests. That's infrastructure management, not platform engineering. The shift happens when the team stops being a service desk and starts owning a product with documented APIs, self-service capabilities, and a roadmap.

Figure: DORA Metrics: Performance Tiers (2024). Grouped horizontal bar chart comparing elite and low performing engineering teams across four metrics. Deploy frequency: multiple per day (elite). Lead time: months (low) vs under 1 day (elite). Change failure rate: 46–60% (low) vs under 5% (elite). Time to restore: days or weeks (low) vs under 1 hour (elite). Source: DORA State of DevOps Report 2024, dora.dev

When Do You Actually Need a Platform Team?

The trigger for a dedicated platform team isn't a headcount number; it's a set of pain signals that tell you internal tooling has become a drag on product velocity. In 2024, the DORA research programme found that only 19% of surveyed teams qualified as elite performers, with the high-performance tier actually shrinking from 31% to 22%, partly because teams at growth stage lose the operational discipline they had when smaller.

The signals to watch for:

  • Deploy times exceeding 20-30 minutes for a standard service: your developers are paying a tax on every merge
  • Engineers regularly blocked by infrastructure, opening tickets or pinging a shared Slack channel to get things done
  • No clear ownership of CI/CD or observability: everybody maintains it, nobody improves it
  • Onboarding new engineers takes weeks rather than days to reach first production deploy
  • Repeated incidents caused by configuration drift or missing guardrails that a self-service platform would prevent
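The signals above can be turned into a simple recurring check. The sketch below is illustrative only: the `OrgSnapshot` fields and every threshold except the 20-minute deploy time are assumptions chosen for the example, not figures from DORA or from this post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OrgSnapshot:
    """Hypothetical measurements an engineering org might collect each quarter."""
    deploy_minutes: float                  # typical deploy time for a standard service
    infra_blocked_tickets_per_week: int    # tickets or Slack requests asking for infra help
    cicd_owner: Optional[str]              # named owner of CI/CD and observability, if any
    onboarding_days_to_first_deploy: int   # working days until a new hire ships to production
    drift_incidents_last_quarter: int      # incidents traced to config drift or missing guardrails


def pain_signals(s: OrgSnapshot) -> List[str]:
    """Return the pain signals currently firing. Thresholds are assumptions."""
    signals = []
    if s.deploy_minutes > 20:                    # lower bound of the 20-30 minute range in the text
        signals.append("slow deploys")
    if s.infra_blocked_tickets_per_week >= 5:    # assumed reading of "regularly blocked"
        signals.append("engineers blocked on infrastructure")
    if s.cicd_owner is None:
        signals.append("no clear CI/CD or observability owner")
    if s.onboarding_days_to_first_deploy > 10:   # assumed reading of "weeks rather than days"
        signals.append("slow onboarding")
    if s.drift_incidents_last_quarter >= 2:      # assumed reading of "repeated"
        signals.append("repeated drift incidents")
    return signals
```

The exact thresholds matter less than re-running the same check each quarter, so the trend is visible before the pain becomes the norm.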

The rough rule of thumb: a dedicated platform team becomes viable around 40-60 engineers total. Before that, an embedded model (one or two engineers with platform responsibilities sitting inside product squads) usually makes more sense. Forming a team too early means they have nothing to standardise yet, and nobody to serve at scale.


Citation: The 2024 DORA State of DevOps Report surveyed over 3,000 professionals globally. It found that burnout, organisational friction, and poor tooling are the primary factors separating low performers from elite ones, not technical skill. Teams with clear platform ownership consistently outperform those relying on shared, unowned infrastructure. Source: dora.dev

What Does a Platform Team Own?

Platform teams should own the foundational capabilities that every product team depends on: the ones that, if badly maintained, block everyone. According to the Atlassian Developer Experience Report 2024, developers spending 23-25% of their week on toil tasks is the norm, not the exception. A well-scoped platform team can recover a meaningful share of that.

The core ownership areas:

CI/CD pipelines: build, test, and deploy tooling and performance. This is usually where the most visible pain lives. Slow pipelines are an immediate tax on every engineer's day.

Developer environments: local dev setup, preview environments, branch deploys. The goal is to make a new engineer productive on day one, not day fifteen.

Observability: logging, metrics, tracing, alerting. Platform teams don't own the dashboards product teams build, but they own the infrastructure those dashboards run on and the standards for what gets instrumented.

Infrastructure provisioning: IaC templates, Kubernetes cluster management, cloud account structures, cost visibility. Self-service where possible; guardrails always.

Developer portal and internal docs: a central place for service catalogues, runbooks, and onboarding guides. This might be Backstage, a Notion space, or a custom tool. The tooling matters less than whether engineers actually use it.

Security tooling integration: secrets management, dependency scanning, SAST in CI. The platform team's job is to make secure defaults easy, not to be a security gatekeeper.

Incident management and on-call tooling: not owning incidents, but owning the tooling and processes that make incident response faster across all teams.

Not everything at once. The most common early mistake is trying to own all of this simultaneously. A new platform team of four engineers can't build a developer portal, migrate CI, and stand up a new observability stack in parallel. Prioritise ruthlessly: usually, start with whichever pain point is costing the most engineering time right now.


Figure: Developer Time: With vs Without Platform Team (% of work week). Lollipop chart comparing how developers allocate time before and after a platform team is in place. Product work: 45% without, 65% with. Infra toil: 25% without, 8% with. Coordination: 20% without, 17% with. Other: 10% without, 10% with. Based on Atlassian Developer Experience Report 2024 toil data.

How Should You Structure a Platform Team?

Platform teams typically represent 5-10% of total engineering headcount, so a team supporting 100 product engineers should be roughly 5-10 people. The rule of thumb from practice is one platform engineer per 8-12 product engineers, with a minimum viable team of three to avoid single-point-of-failure risk. Gartner predicts 80% of large engineering organisations will have dedicated platform teams by 2027, up from roughly 45% in 2024, but fewer than 30% will achieve measurable developer productivity gains, which says as much about structure as it does about intent.
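To see how the two sizing heuristics compare for a given organisation, here is a minimal sketch. The function name and return shape are illustrative; the 5-10% share, the 1:8-12 ratio, and the floor of three come from the paragraph above. Applying the share to product engineers rather than total headcount follows the worked example in the text and is a simplification.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding surprises."""
    return -(-a // b)


def platform_team_size(product_engineers: int, min_team: int = 3) -> dict:
    """Return (low, high) platform team sizes under both heuristics."""
    # Heuristic 1: 5-10% of the engineers being supported.
    share = (ceil_div(product_engineers * 5, 100), product_engineers * 10 // 100)
    # Heuristic 2: one platform engineer per 8-12 product engineers.
    ratio = (ceil_div(product_engineers, 12), product_engineers // 8)

    def clamp(bounds):
        # Apply the minimum viable team size to both ends of the range.
        low, high = bounds
        return (max(low, min_team), max(high, min_team))

    return {"by_share": clamp(share), "by_ratio": clamp(ratio)}
```

For 100 product engineers this returns 5-10 by share and 9-12 by ratio. The two ranges only overlap at 9-10, so the ratio rule is the more generous of the two and is better read as an upper planning bound.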

The three common structural models:

Fully centralised: one platform team owns all platform concerns for the entire engineering organisation. Works well up to about 200 engineers; beyond that, the team becomes a bottleneck by sheer volume of requests.

Federated: a small platform core (architecture, standards, shared tooling) with embedded platform engineers sitting in product squads. The embedded engineers handle squad-specific needs; the core team handles organisation-wide platform capabilities. This is the model I've seen work best at 150-300 engineer scale.

Hybrid: a centralised team with formal office hours, self-service first, and an escalation path for complex work. Good for organisations where product teams have strong engineering capability and just need guardrails, not handholding.

Every platform team needs a product manager (or a tech lead operating in a PM role). Without someone actively owning the roadmap, running discovery with internal users, and making prioritisation calls, platform teams default to reactive work, and reactive platform teams are slow platform teams.

From practice: At Scout24 scale, a large, multi-country product org, I saw what happens when platform teams don't publish an interface. Teams would go around the platform, building their own CI configurations, their own Terraform modules, their own observability setups. You end up with 15 slightly different ways to deploy a service. The fix isn't mandates; it's making the platform path so much easier than the DIY path that teams choose it voluntarily. That requires treating adoption as a success metric alongside reliability.

What Does Good Platform Engineering Look Like?

A healthy platform team is measurable. In 2024, the DORA Report identified elite performers as those who deploy multiple times per day, recover from incidents in under an hour, and maintain change failure rates below 5%, compared to low performers who deploy monthly and take days or weeks to recover. Those outcomes don't happen by accident; they're the product of well-maintained platform capabilities.

The indicators to track:

  • Deployment frequency, measured per service and trending over time: not just a vanity metric, but a proxy for how much friction exists in the path to production
  • P50 and P95 build times, tracked weekly. If your median build is 15 minutes and your 95th percentile is 45, something is broken and nobody has made it anyone's job to fix it
  • Developer satisfaction with tooling: a simple quarterly pulse survey asking engineers to rate their tools, their deploy experience, and their ability to debug production issues
  • Time from commit to production for a standard service: a concrete, reproducible measure of your pipeline's health
  • Time to first production deploy for a new engineer: the single best proxy for how good your onboarding and developer experience actually is

What you measure, you improve. If none of these are tracked, you can't make the case for investment and you can't demonstrate value.
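As an illustration of the build-time indicator, the sketch below computes P50 and P95 from a list of build durations using Python's standard library. The function name and the `tail_ratio` field are assumptions for the example; a CI provider's own analytics may define percentiles slightly differently.

```python
import statistics
from typing import Dict, List


def build_time_summary(durations_min: List[float]) -> Dict[str, float]:
    """Summarise build durations (in minutes); needs at least two data points."""
    # quantiles(n=100) returns 99 cut points: index 49 is P50, index 94 is P95.
    cuts = statistics.quantiles(durations_min, n=100, method="inclusive")
    p50, p95 = cuts[49], cuts[94]
    # A large P95/P50 ratio points at a long tail of slow builds
    # (flaky tests, cold caches, queueing) rather than uniformly slow ones.
    return {"p50": p50, "p95": p95, "tail_ratio": p95 / p50}
```

With the numbers from the list above (a 15-minute median and a 45-minute P95), the tail ratio would be 3, which is the kind of spread worth assigning an owner to.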


The Most Common Platform Engineering Mistakes

Platform engineering is still a relatively young discipline, and the failure modes are predictable. Here are the six I see most often.

1. Building too early. Forming a platform team at 20 engineers means you're standardising before you understand the problems. Wait for the pain signals described earlier. A premature platform team will spend its time on work that doesn't matter yet and will build abstractions that the organisation immediately outgrows.

2. Building the wrong thing. The most common version: spending six months building a beautiful developer portal in Backstage when the actual problem is that builds take 35 minutes. Ask engineers what's slowing them down. They'll tell you. Then fix that, not what looks impressive on a conference slide.

3. Treating internal developers like they don't have opinions. The platform team's users are engineers with strong views on tooling. If you build something they hate and then make it mandatory, you'll breed resentment and workarounds. Run discovery, ship MVPs, collect feedback.

4. Owning everything and saying yes to everything. Platform teams that can't say no become bottlenecks. Every request that comes in is a potential distraction from the roadmap. The team needs a published scope and a lightweight triage process.

5. No SLA or published interface. If product teams don't know what to expect from the platform team (how quickly requests are handled, what's self-service vs. what requires a ticket), they'll either underuse the platform or over-depend on individual relationships.

6. Building custom when off-the-shelf would do. European engineering orgs in particular have a tendency to underestimate the build cost of custom tooling and overestimate the uniqueness of their requirements. Buying a mature CI platform or a managed Kubernetes service is almost always cheaper than building and maintaining the equivalent in-house, especially when you factor in GDPR compliance requirements for any tooling that touches production data.

From practice: The mistake I saw most consistently, across organisations at Deliveroo scale and at Scout24, was platform teams that had excellent engineering capability but no product discipline. They'd ship features nobody asked for and miss the things that were actively blocking teams. Adopting a product mindset (users, discovery, outcome metrics, roadmap) is the single highest-leverage change a platform team can make.

How Do You Measure Platform Engineering ROI?

Platform engineering investment has to be justified to boards and CFOs, and that means translating engineering improvements into financial terms. The framework is straightforward: recovered developer time multiplied by headcount and fully-loaded salary cost gives you a productivity value. Combine that with incident rate reductions and onboarding improvements. Incidents are expensive: the average significant production incident costs a European mid-size company tens of thousands of euros in engineering time alone.

DORA metrics are the standard framework for the board conversation. Deploy frequency increasing from weekly to daily is measurable. Change failure rate dropping from 30% to 8% is measurable. Time to restore going from hours to minutes is measurable. These numbers translate to reduced incident cost, faster product iteration, and higher engineering retention, all things a CFO can price.
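A minimal sketch of how three of these numbers could be derived from a deploy log. The `Deploy` record and its field names are hypothetical; real data would come from your CI/CD system and incident tracker. Lead time for changes is left out because it also needs commit timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deploy:
    """Hypothetical deploy record."""
    at: datetime                            # when the deploy reached production
    failed: bool = False                    # True if it caused a failure in production
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deploys: List[Deploy], window_days: int) -> dict:
    """Deploy frequency, change failure rate and mean time to restore."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    failures = [d for d in deploys if d.failed]
    # Only failures with a recorded restore time contribute to time to restore.
    restores = [d.restored_at - d.at for d in failures if d.restored_at is not None]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "change_failure_rate": len(failures) / len(deploys) if deploys else 0.0,
        "mean_time_to_restore": (
            sum(restores, timedelta()) / len(restores) if restores else None
        ),
    }
```

Running this over the same window before and after a platform initiative gives the baseline and follow-up figures for the board conversation.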

Gartner's 2024 analysis notes that the organisations failing to see ROI from platform investment are typically those that haven't defined success metrics upfront. Set baseline measurements before you start, measure at 3-month intervals, and publish the results internally. Visibility creates accountability.

The financial case becomes clearest when you calculate: if 100 product engineers each recover five hours of a 40-hour week from reduced toil, at a fully-loaded cost of €120,000 per year, you've freed up roughly €1.5 million in annual engineering capacity (12.5% of a €12 million salary base). A platform team of eight costs perhaps €1.5 million all-in, so recovered time alone roughly pays for the team, before counting reduced incident cost and faster onboarding. The ROI is not theoretical.
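The arithmetic is worth making explicit, because the result depends on the length of the working week. A small sketch, assuming a 40-hour week (the text does not state one) and a hypothetical function name:

```python
def recovered_capacity_eur(
    engineers: int,
    hours_recovered_per_week: float,
    loaded_cost_eur: float,
    hours_per_week: float = 40.0,  # assumption: standard full-time week
) -> float:
    """Annual value of recovered engineering time, in euros."""
    # Share of each engineer's week that is recovered from toil.
    recovered_share = hours_recovered_per_week / hours_per_week
    return engineers * loaded_cost_eur * recovered_share
```

Under that assumption, `recovered_capacity_eur(100, 5, 120_000)` comes to €1.5 million per year, since five hours is 12.5% of a 40-hour week. Ten recovered hours per engineer would double that to €3 million. Running the function with your own headcount and salary data shows how sensitive the case is to the hours-recovered estimate.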

Your First 6 Months: A Platform Engineering Roadmap

Don't try to build everything at once. The teams that succeed are the ones that pick one thing, fix it properly, measure the improvement, and use that win to justify the next investment.

Months 1-2: Audit. Before writing a line of platform code, measure your current state. How long does a standard deploy take? What does P95 build time look like? How long does onboarding take? Where are engineers raising the most infrastructure-related tickets or Slack messages? Talk to product squads and ask them where they're losing time. Map the pain; don't assume it.

Months 3-4: Pick one thing. Based on the audit, identify the highest-impact improvement. Usually it's build time or deploy reliability. Form a clear problem statement: "Builds for our main service take 28 minutes at P50, causing X hours of lost time per week across N engineers. Our goal is P50 under 10 minutes within 90 days." Then work that problem only. Resist the pull to expand scope.
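To fill in the "X hours of lost time per week" in a problem statement like the one above, a rough estimate is enough. The sketch below assumes each engineer waits on a fixed number of builds per day and is fully blocked while waiting; both are simplifying assumptions, so treat the result as an upper bound. The function and parameter names are illustrative.

```python
def weekly_hours_lost(
    p50_minutes: float,
    target_minutes: float,
    engineers: int,
    builds_per_engineer_per_day: float,  # assumption: measure this from CI logs
    workdays_per_week: int = 5,
) -> float:
    """Upper-bound estimate of hours per week spent waiting beyond the target."""
    # Only the time above the target counts as recoverable.
    excess_minutes = max(p50_minutes - target_minutes, 0.0)
    total_minutes = (
        excess_minutes * builds_per_engineer_per_day * engineers * workdays_per_week
    )
    return total_minutes / 60.0
```

For example, a 28-minute P50 against a 10-minute target, with 50 engineers each waiting on three builds a day, gives `weekly_hours_lost(28, 10, 50, 3)`, or 225 hours per week.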

Months 5-6: Ship it, measure it, expand. Get the improvement into production, measure against your baseline, and communicate the result. A 15-minute build time reduction across 50 engineers is a visible, quantifiable win. Use it to build internal credibility and justify the next hire or the next initiative on the roadmap.

Figure: Platform Maturity by Team Size. Line and area chart of platform maturity (None, Basic, Defined, Managed, Optimised) against total engineering headcount (10 to 300), from ad hoc scripting at small team sizes to a full internal developer platform at 200-300 engineers. Milestones marked along the curve: embedded model, dedicated team viable, IDP + portal.

Frequently Asked Questions

What's the difference between platform engineering and DevOps?

DevOps is a culture and a set of practices: shared ownership of reliability, short feedback loops, automation-first thinking. Platform engineering is the organisational structure that makes those practices sustainable at scale. Platform teams build the tooling that makes DevOps principles easy to follow. You can have DevOps culture without a platform team, but at scale you'll eventually need one to maintain it.

Should a platform team report to the CTO or VP Infrastructure?

In most organisations at Series B–C stage, the platform team reporting to the CTO or a VP of Engineering works better than reporting into an Infrastructure or Ops function. The framing matters: platform teams are developer-experience functions, not infrastructure management functions. Reporting into infrastructure tends to pull the team toward reactive ops work rather than product thinking.

How do you get product teams to actually use the platform?

Make the platform path easier than the DIY path. Mandate is a last resort: it creates compliance without adoption and generates resentment. The better approach: identify a pain point product teams already have, solve it better than they could themselves, and make the solution available self-service. Adoption follows value. A platform team that reduces build time from 30 minutes to 8 minutes doesn't need to mandate its CI system; teams will migrate voluntarily.

What's a realistic platform team size at 100 engineers?

At 100 product engineers, a platform team of 6-10 people is typical, depending on the breadth of what you're owning. Start lean (4-5 people) and prove the value before scaling the team. The 1:8-12 ratio (one platform engineer per 8-12 product engineers) is a reasonable planning heuristic, but the right size depends heavily on how mature your existing tooling is and how much standardisation you have to build from scratch.

Should platform engineers have on-call responsibilities?

Yes, but scoped carefully. Platform engineers should be on-call for the platform itself: CI systems, shared Kubernetes infrastructure, observability tooling. They shouldn't be on-call for every product incident. That scope creep is a team morale killer and a fast path to burnout. European employment law also matters here: German Bereitschaftsdienst regulations and equivalent frameworks across the EU create real constraints on how on-call can be structured, which is worth addressing explicitly in team agreements.

How do platform teams avoid becoming bottlenecks?

Self-service is the primary answer. Every capability the platform team builds should have a self-service path as the default. The platform team's job is to build the road, not to drive every car. Beyond that: a published interface (what the team owns, how to engage, what's self-service vs. requires a ticket), a clear escalation path, and a commitment to responding to blocking issues within a defined SLA. Bottlenecks happen when platform teams don't publish their interface and product teams default to direct dependency on individual relationships.

Conclusion

The gap between elite-performing engineering teams and everyone else isn't primarily a hiring problem or a technology problem; it's a systems problem. In 2024, the DORA data showed that elite teams deploy 182 times more frequently than low performers. The infrastructure, tooling, and workflows that platform teams build are what close that gap.

Get the timing right (pain signals, not headcount thresholds), scope it correctly (one thing done properly beats five things half-built), and run it like a product (users, discovery, adoption metrics, roadmap). At 50-300 engineers, a well-structured platform team is one of the highest-leverage investments an engineering organisation can make.

The next question is usually how to structure the team's relationship with the rest of engineering. Explore platform team topologies for growing organisations for a deeper look at the federated model, team API design, and how to manage the transition as you scale through 100 and beyond.