Stop creating Ops teams!

I really thought we had fixed this. I was thinking of retiring Next Gen DevOps because I assumed it was out of date, having been published 11 years ago. Sadly, it isn’t. If your development teams need to beg for favours from your ‘platform team’ in order to get things done, then you’ve recreated the nightmare of the Ops team. If the term ‘beg’ offends you, feel free to substitute the term ‘agree priorities’.

In this post I am going to give some helpful advice about how to enable your development teams to get to production without any blockers AND minimise their toil AND ensure your infrastructure is built safely, securely and in compliance with all your company’s standards.

The key to enabling the development teams is to free them from things they don’t need to care about in order to ensure the quality, performance, availability and security of their applications. Equally, the key to having a platform team that can enable the development teams, while also ensuring infrastructure is built securely and in compliance with all company standards, is knowing where to draw the line. On AWS this looks something like this:

  • Creating the VPC
  • Creating all the subnets
  • Creating the Internet Gateway
  • Creating the NAT Gateway
  • Mapping the public and the private subnets
  • Creating all security groups
  • Creating Policies
  • Creating CloudWatch Log groups

  ----------------------------------------

  • Creating secrets using Secrets Manager
  • Creating Parameter Store
  • Creating Load Balancer
  • Creating Target Groups
  • Creating Listener
  • Creating ECR Repositories
  • Preparing the Docker images
  • Creating IAM Roles and Policies
  • Creating ECS cluster
  • Creating ECS task definitions
  • Creating ECS services

The items above the line are global cloud configuration and cost management controls, and should be owned by the platform team. The items below the line are application-specific configuration and should be owned by each application team.

This split should be agreed and regularly reviewed by the engineering community as a whole. If the application teams see items below the line as toil, then they need some training to teach them how to reduce that toil for themselves.
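One way to make the split easy to review is to write it down as data. The following is only a sketch: the resource labels and the `owner_of` helper are hypothetical names invented for illustration, loosely following the list above, and your own split may well differ.

```python
from enum import Enum


class Owner(Enum):
    PLATFORM = "platform team"
    APPLICATION = "application team"


# Hypothetical resource labels, loosely following the list above.
# Above the line: global cloud configuration and cost controls.
# Below the line: application-specific configuration.
OWNERSHIP = {
    "vpc": Owner.PLATFORM,
    "subnet": Owner.PLATFORM,
    "internet_gateway": Owner.PLATFORM,
    "nat_gateway": Owner.PLATFORM,
    "security_group": Owner.PLATFORM,
    "cloudwatch_log_group": Owner.PLATFORM,
    "secret": Owner.APPLICATION,
    "load_balancer": Owner.APPLICATION,
    "ecr_repository": Owner.APPLICATION,
    "iam_role": Owner.APPLICATION,
    "ecs_cluster": Owner.APPLICATION,
    "ecs_task_definition": Owner.APPLICATION,
    "ecs_service": Owner.APPLICATION,
}


def owner_of(resource_type: str) -> Owner:
    """Return the owning team, or raise if no owner has been agreed yet."""
    try:
        return OWNERSHIP[resource_type]
    except KeyError:
        raise LookupError(
            f"No agreed owner for {resource_type!r}; raise it at the next review"
        ) from None
```

Kept in version control, a file like this gives the engineering community a single place to propose and review changes to the line, and an unknown resource type fails loudly instead of quietly defaulting to one team.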

As with security, we trust but verify. Compliance with company standards should be tested by running scripts and using AWS anomaly detection.
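As a minimal sketch of what such a script might look like, the function below flags security groups that allow SSH from anywhere. The rule itself ("no SSH open to the world") is an assumed example of a company standard, and `find_open_ssh` is a hypothetical name. The input dictionaries follow the shape of the EC2 DescribeSecurityGroups response (`GroupId`, `IpPermissions`, `IpRanges`, `CidrIp`), but the function makes no AWS calls itself.

```python
from typing import Iterable, List

OPEN_CIDR = "0.0.0.0/0"  # any IPv4 address
SSH_PORT = 22


def find_open_ssh(security_groups: Iterable[dict]) -> List[str]:
    """Return the IDs of groups that allow SSH ingress from any IPv4 address."""
    offenders = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            protocol = rule.get("IpProtocol")
            if protocol == "-1":
                # "-1" means all protocols and all ports.
                covers_ssh = True
            elif protocol == "tcp":
                from_port = rule.get("FromPort")
                to_port = rule.get("ToPort")
                covers_ssh = (
                    from_port is not None
                    and to_port is not None
                    and from_port <= SSH_PORT <= to_port
                )
            else:
                covers_ssh = False

            open_to_world = any(
                ip_range.get("CidrIp") == OPEN_CIDR
                for ip_range in rule.get("IpRanges", [])
            )
            if covers_ssh and open_to_world:
                offenders.append(group["GroupId"])
                break  # one offending rule is enough to report the group
    return offenders
```

In practice the list of groups would probably come from something like the `SecurityGroups` key returned by boto3's `describe_security_groups`, and the script would run on a schedule so that a non-compliant change is reported to the owning team rather than blocked up front.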

One of the key points I make in Next Gen DevOps is how the concept of operations teams went wrong because they had different priorities from the teams building applications and generating revenue. That point is just as valid for Platform teams today as it was for operations teams ten years ago.

If you’re an Engineering Leader and you are considering creating a platform team, pay close attention to what their priorities and motivations will be. If they are ostensibly motivated by safety and cost control, and your application teams are motivated by revenue and experimentation, then you’re going to have problems. You have one team pulling and another pushing. If you want some help thinking through this and aligning the priorities, then get in touch or buy the book.

NEXT GEN DEVOPS: A Manager’s Guide to DevOps and SRE

I’m excited to announce a new edition of Next Gen DevOps, newly subtitled A Manager’s Guide to DevOps and SRE. This third edition has been comprehensively restructured and rewritten as a definitive guide for managers.

I wrote the first edition of the book in 2012. At that time, as hard as it is to believe now, many senior technology leaders remained unconvinced about the DevOps paradigm.

I felt this very keenly when I was working on my DevOps Transformation Framework. For any readers who don’t already know, it wasn’t my initial intention to write a book at all. I wanted to create a framework for DevOps transformations. As I worked on the framework, I realised that many of the assertions I was basing it on were unproven to many in the industry. I ended up creating so much exposition to explain it all that it made more sense to collect it into a book.

I’ve had loads of great feedback in the 7 years that the book has been available. Over that time the main criticism I’ve received is that the book wasn’t instructional enough.

The book wasn’t originally intended to be instructional; that’s what the framework was for. The idea was that if executives could see where the old silos of development, testing and operations were failing, and could see a clearly presented alternative, thoroughly proven from first principles and supported by anecdotes, they could work with their leadership team on a transformation programme that suited their organisation. That’s where the Next Gen DevOps Transformation Framework would come in, to help them structure that transformation.

That isn’t what happened. Executives were convinced of the rightness of DevOps by a vast weight of evidence from multiple sources: Puppet’s State of DevOps reports, various Gartner analyses, pitches from new specialist consultancies like Contino and the DevOps Group (who were eating the older consultancies’ lunches in the DevOps arena), recruiters like Linuxrecruit and Esynergy, and a huge wealth of success stories, in which my small book was merely a drop in the ocean. These executives were then challenging their teams to create transformation programmes. Managers were then looking to my book, and others like The Phoenix Project, to help them figure out what good looked like.

Unfortunately, my book wasn’t hitting that spot. Until now. This new edition is aimed squarely at those managers looking to migrate development teams to DevOps working practices or operations teams to SRE.

Since I published the second edition, I’ve provided strategic leadership to the Department for Work and Pensions’ cloud and DevOps migration programme, and helped Just Eat improve their resilience and performance and transition their operations team to SRE. This new book adds experience from both of those initiatives.

I’ve learned a new technique for creating departmental strategy from business goals, which I’ve included in chapter six. I moved the popular history chapter to an appendix as its inclusion confused some readers. The point I was trying to make (and if I have to explain it I’ve clearly failed) was that the separate development and operations silos were the anomaly, not DevOps. By moving that to an appendix I’ve been able to create a smoother flow from the problem through to the solution and on to a new chapter about building DevOps teams, which includes a lot of information about hiring engineers. I’ve changed the management theories chapter into a chapter specifically about managing DevOps and SRE teams. Following on from that, chapter five details how DevOps teams work with other teams. Each subsequent chapter has been refocused away from transforming whole departments and down to transforming individual teams. I haven’t removed any content, so if you are looking for guidance about changing an entire technology department, that is all still there. In addition, there is now a wealth of guidance to help managers transform their teams and support their peers as they transform theirs.

If you bought either of the first two editions, I’ll send you a free PDF of the new edition if you email me at grant@nextgendevops.com with a picture of you holding a copy of the book in paperback or showing it on your e-reader, and give me permission to use those pictures in a future blog post.