I’ve been doing a bit of self-promotion recently. I’m close to finishing my new book and I’ve started putting the word around at a couple of conferences and other events. This has given me the opportunity to hang out with some mates of mine who work in recruiting. These guys work at three different agencies, each of a different size and level of maturity, with quite different clients. They all share the same problem though. Each of these agencies is placing more people than ever before but they’re also struggling to keep candidates in roles and their rebate levels are rising. In case you don’t know, agencies have to give rebates if the candidates they put forward don’t stay in the role for long enough. The problem is DevOps.
The Stated Expectations
Organisations of all shapes and sizes are approaching these agencies looking for DevOps Engineers.
For now let’s leave aside whether someone can be a DevOps Engineer and whether DevOps is a method or an attitude or a framework or whatever. I think the fact that there are so many “what is DevOps” articles is very telling but let’s stick to the matter at hand.
When organisations ask for DevOps their job descriptions usually talk about continuous integration, automated infrastructure configuration and automated testing. They may mention Puppet or Chef. The cooler ones might mention Ansible or Salt. They’ll probably state that they need experience of LAMP or .NET stacks. They will undoubtedly require experience with languages like Ruby or Java. Finally, no DevOps job description would be complete without requiring some experience with NoSQL solutions like Redis, MongoDB or Cassandra. To round it all off there will probably be a requirement for AWS, OpenStack or VMware experience.
There’s nothing wrong with this. Most of the people I’ve worked with over the last 15 years would relish the opportunity to build a continuous integration system with these technologies.
The problem comes from what isn’t mentioned in the job descriptions but seems to be expected as a matter of course.
The Implicit Requirements
As someone applying for a DevOps role you expect to be developing code for deployment mechanisms and automated infrastructure solutions. You expect to have to configure a lot of third-party tools like source control systems, build systems, containers and repositories. These all have to work with the deployment system, the infrastructure access mechanism and caching solutions and obviously all the security systems that sit between all of those.
You don’t necessarily expect to be on-call 24x7x365. Particularly because that wasn’t mentioned in the job description. You don’t expect to be responsible for everyone else’s applications as well as your own. You don’t expect to be responsible for defining the data retention policy and building the backup & restore mechanism and implementing a suitable testing regimen. You don’t expect to be first line support and product owner of the ticketing system and document management systems. You also don’t understand why you’re supposed to enjoy solving other people’s problems and being interrupted every 5 minutes.
However those are very often the expectations of the leadership that wrote the job description and approved your appointment.
The solution seems straightforward. The hiring organisation could simply analyse the role and responsibilities they really need and put them all in the job description. What might that job description look like? Well you’d have the usual descriptive stuff at the top:
Experienced DevOps Engineer required within a dynamic and innovative software company blah blah blah… Unique opportunity blah blah blah… LAMP stack experience blah blah blah… Agile environment blah blah blah… Working closely with developers blah blah blah…
Then the interesting bit, the key skills and background:
- Ideally you’ll have two or more years of experience in a DevOps role in a fast-paced agile environment
- Working experience with Jenkins or other Continuous Integration tools (e.g. Cruise Control, Go)
- Experience with Amazon AWS / scaling applications beyond single server deployment
- Good network management skills including firewall rule management, AWS security group and AMI management
- Network security skills suitable to classify, manage and protect PII data
- Knowledge of programming languages (Python, Ruby, JavaScript preferred)
- Proven knowledge of server-side technologies such as PHP, Nginx, Node, NoSQL and how they are utilised
- Strong storage management skills across both SAN and cloud-based storage solutions
- Experience of CDN and other caching solutions such as Memcache
- Experience of managing data retention solutions and implementing suitable tests
- Knowledge of business continuity best practice and experience implementing appropriate solutions
- Document management and Ticketing systems experience including migrations, upgrades, recovery and optimisation
- Strong incident and crisis management skills
- Change management experience
- Work across multiple work streams and development projects
- Proven experience with Git; ideally you’ll demonstrate your own projects during the interview
Wow.
Does that seem a bit much?
That has been my team’s role in my last three positions (OK, Playfish used SVN not Git, but you get the idea). Less than half of that was mentioned in the job descriptions my team and I applied for, and the list still neglects to mention 24x7x365 incident response, crisis management and mitigation for every product the company builds!
The Result
It doesn’t surprise me that people recruited for DevOps roles are leaving them once they realise that the job description only covered half of what was actually expected from them.
There’s been a really interesting reaction to this problem. Some recruiters are trying to distinguish System Administration from DevOps. From their point of view a company that asks for DevOps but wants System Administration can save itself upwards of 20%.
Some organisations have tried to segregate DevOps from System Administration (Engineering etc…). Ask anyone who’s tried to work in those environments and they’ll tell you that more silos aren’t the answer. IT has struggled for years with developer-operations relations simply because they’re distinct silos with different and conflicting goals.
Creating additional infrastructure, DevOps and support silos really isn’t the answer.
One of the principal issues that DevOps was trying to address was communication. We need people with multiple skill sets and experiences to work together to build, launch, manage and support internet products. Silos don’t encourage people to work together. They encourage specialisation within the silo but not communication between silos.
If more silos aren’t the answer, maybe the answer lies elsewhere.
The Solution
Agile gave us the clues to solve this problem back in 2001. The creators of the Agile movement knew that the way to prevent projects overrunning and failing to hit the expectations of the sponsors and stakeholders was to bring those people into the project team. Further, they received their requirements in small, discrete, testable units and they released and demonstrated often.
What if we extended those principles to not just development but also operations and testing?
Wait.
What if we extended those principles to the internet product as a whole? Not just the sterile application sitting in a build environment but the entire product.
The infrastructure that hosts it in every environment: development, build, test and live. The systems and mechanisms that build, deploy and test it. The monitoring and alerting systems that watch it. And the disaster recovery and business continuity systems that attempt to look after it.
Well, now we have a product team that combines their different skill sets, experiences and capabilities in the service of a product. That team can identify the technologies they need to build, launch and manage their products. They can choose the backup solutions they need, they can implement appropriate security, they can develop or hire in the expertise they need.
Rather than trying to make one team responsible for all the technologies and support challenges of all the products let each team focus on their own products. Let them share the on-call burden with each other. Let them support each other so that when there’s an incident out of hours the team can support their colleague on-call together.
It’s often said in IT that change is the only constant, and yet we’ve had separate development, operations and test teams for over six decades. Perhaps it’s time that changed.
I explore this idea and more in my new book, Next Gen DevOps: Creating the DevOps organisation, soon to be released on Amazon. If you’d like to know more please contact me at grant@nextgendevops.com.
The Snowball moves with the Great Atuin.