Tribes, guilds and squads

Over the last few years, many organisations have been swept up by the wind of “tribes, guilds and squads”. “Chapters” often got lost along the way, and it all usually turned into herding cats.

The brief story of how it all started and what it means

Sounding like a childhood game, “tribes, guilds and squads”, aka “the Spotify model”, refers to Spotify’s model for agile at scale.

In 2014, Henrik Kniberg made two videos about Spotify’s Engineering Culture. They describe how a product company relying on B2C transactions moved from Scrum to a different type of Agile while growing extremely fast. Kniberg emphasized from the very first minute that the key was “autonomy”. Autonomy meant that ALL decisions are taken inside the squad. Features were small and decoupled through interfaces. Each squad was allowed to modify another squad’s code, as long as the squad owning the code reviewed the changes. Squads can be formed and dissolved as needed. In 2018, Spotify had 180 teams (squads).

Now, the terms used in the Spotify model:

Tribes (which in Europe existed up to the 6th century) are large groups (of squads) responsible for a set of features or having a specific function (e.g. Infrastructure).

A Squad (a mid-17th-century term in Europe) is a full-stack, cross-functional, self-organizing team, usually with fewer than 8 people, working co-located and having complete responsibility for what they do, from design to operations. They decide what to build and how to build it. They do, of course, align with a product strategy and a squad mission, meaning they are told where to arrive, but not how to get there.

A Chapter (this meaning is specific to North America) is a competency or, if you want, the equivalent of a practice area or service line (e.g. testing, architecture, web development).

A Guild (an 11th-century organisation in Europe) is an informal community based on a common interest, made for exchanging knowledge (a sort of internal MeetUp).

Herding cats

So far, I’ve worked with two organisations trying to apply the “Spotify model”. Here’s what happened:

Fail #1: They applied the tribe-squad-guild labels over the existing structure (and yes, “chapter” got lost on the way). Basically, the business unit / service line became a Tribe, and the project team or Scrum team became a Squad.

Wrong because… The Spotify model is not a labelling model; the names are literally the last thing to worry about. Even Spotify recommends not simply copying its model.

Fail #2: Every development squad had to follow Scrum. Being a construction / development framework, Scrum ignores design, transition to operations and end of product lifecycle, and is generally insufficient for enterprises.

Wrong because… In the Spotify model, ALL squads must use Agile methodologies, but there is no imposed framework. Visuals, a daily sync, a weekly demo and improvement boards are recommended. Teams have a tech lead, a product lead and a Definition of Awesome (which is allowed to be unrealistic). Impact is more important than velocity. Done is achieved only when quality has been reached. Focus is on value, not on plan fulfilment. If it works, you keep it; if not, you dump it.

Fail #3: There was no autonomy in the teams, and part of the enterprise ecosystem was not Agile (e.g. infra teams). Priorities came from the top down (with or without a clear roadmap) and were communicated to the team during refinement and planning. Architecture was already decided.

Wrong because… Autonomy is the core of Spotify – I really can’t emphasize this enough. Design is a fundamental part of a squad’s autonomy and it improves velocity because decision-making bottlenecks are removed. The team can simply refuse any idea, tool, anything. They just have to stay aligned with the company strategy and deliver their objective. You can’t experiment with a single autonomous squad among other directed teams. There is no hand-off between the squads (feature squads, infrastructure squads and platform squads).

Fail #4: The new so-called squads had their members distributed globally, which also made knowledge handover more difficult. Key knowledge per area resided with a single person.

Wrong because… Spotify designed its model for co-located teams with physical spaces for design (whiteboard walls) and meetings. Slack, Skype, WebEx, MeetMe, Zoom, etc. do work; they are functional and get the job done, but they will never have the value and force of a face-to-face meeting or human contact. Refactoring was done on a per-need basis (which makes sense), and pair programming was something I pushed for in order to eliminate key-person dependency.

Fail #6: Layers of formal management and a lot, a lot of politics.

Wrong because… Autonomy. Should I explain it again? Spotify accepted, however, that a minimum of bureaucracy is needed to avoid chaos. The idea was to have servant leaders and mutual respect, community over structure and trust over control. No politics, no fear.

Fail #7: Quarterly releases. Everything had to be ready, on all streams, to meet the quarterly deadline.

Wrong because… The fundamental concept of Release Trains was completely foreign to these organisations. So were feature toggles.
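For readers who have never seen a feature toggle, here is a minimal sketch in Python of how one can pair with a release train: the code ships on schedule, switched off or visible to only a small share of users, and the percentage is raised later. This is not Spotify’s implementation. The feature names, the ROLLOUT_PERCENT table and the helper functions are all hypothetical, and a real system would keep the percentages in a configuration service rather than in code.

```python
import hashlib

# Hypothetical toggle table: feature name -> percentage of users who see it.
# Assumption for illustration only; real systems load this from configuration.
ROLLOUT_PERCENT = {
    "new_playlist_view": 10,  # shipped on the train, visible to about 10% of users
    "offline_sync_v2": 0,     # shipped on the train, but switched off for everyone
}


def bucket(feature: str, user_id: str) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature: str, user_id: str) -> bool:
    """Return True if the feature is on for this user; unknown features are off."""
    return bucket(feature, user_id) < ROLLOUT_PERCENT.get(feature, 0)


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, is_enabled("new_playlist_view", user))
```

Because the bucket is derived from a hash of the feature and user, each user gets a consistent answer between requests, and raising the percentage only adds users without removing anyone who already had the feature. A release no longer has to wait for every stream to be “ready”: unfinished work rides the train with its toggle at 0.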

Fail #8: Production defects were usually a drama.

Wrong because… The model was supposed to be a fail-friendly environment, with a limited, non-critical, potential impact, given the gradual rollout and decoupled architecture. It was ok to fail, if you learned and improved fast.

As Taiichi Ohno said: “watch the process and think for yourself”. I will let you google his name or just click the link.