Thursday, July 16, 2015

Pitfalls of Agile Transformations


“We are a conservative company, so we are just starting our agile transformation,” the manager told me. “But we expect big things from it: faster delivery, easier recruiting, happier customers.”

“Interesting objectives,” I thought to myself. “Something I might have heard ten years ago.” It struck me that the reason an organization opts for late adoption is to learn from those who go first – from the companies that bushwhacked through the agile swamp a decade ago, or the organizations that followed a few years later. I wondered how much of what we have learned in the last decade will inform this budding agile transformation. I sensed that the answer was “not enough.”

Once you get past the sales pitches and confirmation biases, it doesn’t take much research to discover that agile and Scrum don’t have such a great track record. In the First Round Review article I'm Sorry, But Agile Won't Fix Your Products, Adam Pisoni, co-founder and former CTO of Yammer, contends that “While SCRUM did manage to rein in impulsive managers, it ended up being used more to exert tighter control over engineers’ work.” In The Failure of Agile, Andy Hunt, an original signatory of the Agile Manifesto, writes “Agile methods themselves have not been agile. Now there’s an irony for you.” Both of these pieces complain that agile does not provide real empowerment – one of several persistent problems we have observed in many organizations as they adopt agile practices.

Every organization undertaking an agile transformation imagines that the problems with other agile implementations will not plague THEIR transformation. If they hire the right consultants and use the best practices, they assume they will be fine. This kind of wishful thinking only lengthens the list of mediocre agile transformations. It would be more useful to understand the most predictable problems with agile implementations and actively help your organization avoid them.

With this in mind, I offer three questions you might ask to expose some of the typical ways in which agile disappoints, along with the best current approaches for avoiding these common agile pitfalls.

Question 1: Should you use Scrum or Continuous Delivery?

This may come as a surprise, but quite frankly, Scrum says nothing about how to develop software, nothing about how to deliver defect-free code and nothing about techniques for faster production releases. Other agile methodologies – especially the long-lost Extreme Programming – have more to say on these topics, but most agile transformations reserve little time for improving the actual work involved in generating top-notch software. Yet without a solid foundation in the technology that produces great systems, agile is pretty hollow.

The technical heart of agile is embodied in the practices articulated by Jez Humble and Dave Farley in Continuous Delivery: acceptance test-driven development; automated builds, automated testing, automated database migration, and automated deployment; everyone checks their code into the mainline at least daily (there are no branches!); the mainline is ALWAYS production ready and is deployed very frequently (daily is slow); release is by switch rather than by deployment. If you aren’t heading toward these or similar technical practices and you think you are doing an agile transformation, think again. Agile without a strong technology base is usually a mistake.
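To make "release is by switch rather than by deployment" concrete, here is a minimal sketch of a feature toggle. Everything in it is an assumption made for illustration: the `FeatureToggles` class, the `bulk_discount` toggle name, and the discount rule are hypothetical, and a real system would normally read toggles from configuration rather than from memory.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class FeatureToggles:
    """Hypothetical in-memory toggle store (real ones usually live in config)."""
    enabled: Set[str] = field(default_factory=set)

    def is_on(self, name: str) -> bool:
        return name in self.enabled

    def switch_on(self, name: str) -> None:
        self.enabled.add(name)


def checkout_total_cents(prices_cents: List[int], toggles: FeatureToggles) -> int:
    """The discount code is already deployed, but stays dark until switched on."""
    total = sum(prices_cents)
    if toggles.is_on("bulk_discount") and len(prices_cents) >= 3:
        total = total * 90 // 100  # hypothetical new behaviour: 10% off
    return total


basket = [1000, 2000, 3000]
toggles = FeatureToggles()

before = checkout_total_cents(basket, toggles)  # deployed, not yet released
toggles.switch_on("bulk_discount")              # this is the "release"
after = checkout_total_cents(basket, toggles)
```

The point of the design is that the same code is running before and after the switch. Deployment and release become two separate decisions, and a release that misbehaves can be undone by flipping the switch back.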

Start your agile transformation by acknowledging that software development is a deeply technical endeavor leading to highly complex systems. These systems behave like all complex systems – if you smash them with a big change, all bets are off – you cannot predict the results. The only way to have predictable, stable code bases is to modify them with small probes, observe the results, modify the code and probe again. [Incidentally, a small probe is not two weeks of work; it’s more like two hours of work.] If deploying small probes to live systems is not at the core of your agile transformation strategy, you are missing today’s most reliable tools for delivering stable systems with predictable results.

Yes, this means writing a lot more code. It means tests as code, infrastructure as code, deployment as code. It means no one writes production code until there is an acceptance test for it, written in an executable language. It means teams can pretend they are working in a cloud because the infrastructure they need is always available and can be provisioned as needed. It means that whole teams (which include everyone from product to operations) retain responsibility for their code even after it goes live. And it means that the most common way teams decide what to do next is to examine feedback from the effects of their work in actual use.
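As a rough illustration of an acceptance test "written in an executable language," the sketch below puts the tests first in the file and the production code after them. The money-transfer feature, the `Account` class, and the `transfer` function are all hypothetical examples, and Python's standard `unittest` module is simply one convenient way to make the tests executable.

```python
import unittest


class TransferAcceptanceTest(unittest.TestCase):
    """Written first: these state what the customer expects, in runnable form."""

    def test_transfer_moves_money_between_accounts(self):
        source, target = Account(5000), Account(1000)
        transfer(source, target, 2000)
        self.assertEqual(source.balance_cents, 3000)
        self.assertEqual(target.balance_cents, 3000)

    def test_transfer_is_rejected_when_funds_are_insufficient(self):
        source, target = Account(500), Account(1000)
        with self.assertRaises(InsufficientFunds):
            transfer(source, target, 2000)
        # A rejected transfer must leave both balances untouched.
        self.assertEqual(source.balance_cents, 500)
        self.assertEqual(target.balance_cents, 1000)


# Production code, written only after the tests above existed.
class InsufficientFunds(Exception):
    pass


class Account:
    def __init__(self, balance_cents: int) -> None:
        self.balance_cents = balance_cents


def transfer(source: Account, target: Account, amount_cents: int) -> None:
    if amount_cents > source.balance_cents:
        raise InsufficientFunds("source account cannot cover the transfer")
    source.balance_cents -= amount_cents
    target.balance_cents += amount_cents


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TransferAcceptanceTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the tests are code, they can run on every check-in as part of the automated build, which is what turns them from documentation into a safety net.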

The technology enabling Continuous Delivery should be at the core of any modern agile transformation because it has proven to be the safest way for an organization to gain and maintain control of complex software systems. If your agile transition team does not understand this technology, then you are probably trying to switch to agile without adequate technical leadership. This is not a good strategy.

Admittedly, Continuous Delivery is technically challenging, but no more so than the many other challenges that technical teams deal with every day. In fact, we have found that almost without exception, software engineers love to work in a Continuous Delivery environment because of the challenge, the discipline, the clarity, and the immediate feedback. One financial services company told us that in the three years since their (large) IT department switched to Continuous Delivery, they have had zero turnover, except for emigration. Their transformation resulted in the most desirable jobs in the area.

Question 2: Do you hire Developers or Engineers?

What title do you use for people who solve problems with software? Years upon years ago, I was called a programmer and that was a high status job. But once waterfall processes placed analysts between programmers and their customers, the programmers were no longer expected to analyze customer problems and solve them. The title “programmer” was downgraded to a second-class job which mostly involved coding what someone else wrote in a specification. Over time a new term – developers – came into use and referred to a more holistic job. But then, agile processes placed a product owner between developers and customers, so developers were no longer expected to analyze customer problems and solve them. Instead, they were given a prioritized list of relatively small stories to estimate, code, and (hopefully) test.

If you visit Silicon Valley these days you will find that software developers have been replaced by software engineers. We can only hope that those smart people who have this title will be presented with complete problems and expected to engineer a solution. They will not be given specs, because whoever wrote the spec designed the solution. They will not be given stories, because whoever wrote the stories designed the solution. They will be given real problems – customer problems, business problems, technical problems – and asked to engineer a solution. They will be expected to implement the solution within valid constraints and take responsibility for its success. Silicon Valley companies understand that this is the kind of job that attracts the best engineers.

If you want more effective recruiting in today’s very tight talent market, don’t look for software developers or mention your agile transformation. Look for software engineers and reliability engineers and make it clear that you expect them to engineer effective solutions to meaningful problems. Then make sure that your agile transformation makes this challenging work the responsibility of your engineers, because most agile methodologies place it elsewhere.

Question 3: How will you handle dependencies?

I was astonished when I heard that after Amazon completed its switch to services, the company no longer used central databases. How could this possibly work? I thought it was self-evident that a single system of record is fundamental to the success of an enterprise – so how could Amazon possibly survive without a central database? Either the information about abandoning central databases was wrong or Amazon was doing something that defied all conventional wisdom.

It turns out that the second was correct – Amazon had discovered something so obvious that it had escaped us for decades: A central database is one humongous dependency generator. Ouch! Take a look at Sam Newman’s book Building Microservices, which makes the case that dependencies are among the greatest evils in software development and central databases are among the most pernicious creators of dependencies in the software world. It’s eye-opening.

These days we see a lot of companies building microservices – Netflix, Gilt, and many more. Why? Because when they experience extremely high volume, the code that handles this volume needs constant attention and tuning. The only way to make that happen at scale is to adopt a structure which allows individual teams to deploy their code – live to production – independently of other teams. A microservice is exactly that – code owned by one (small) team that designs, monitors, maintains, and deploys the service – independent of other teams and other code.

If this sounds a lot like something you’ve heard of before, that’s because independent module deployment has been the dream of software development just about forever. A couple of decades ago, object-oriented programming promised this nirvana, but it never quite delivered. Now microservices are making the same promise, and there are instances of them working pretty well. Of course, microservices are rather new and the jury is still out. (See Martin Fowler’s summary of Microservices.) But we know that for very high volume systems, independent deployment appears to be mandatory and microservices seem to be the architecture of choice. Clearly microservices are a viable way – but not the only way – to handle dependencies.

No matter what kind of system you have, dependencies must be dealt with or else they will eventually haunt you. The Google code base started out as a monolith which rapidly developed many dependencies, but fortunately, Google's engineers understood the danger. So they developed a dependency matrix to keep track of code interactions, and whenever code was pushed to the test framework, the new code and all of its dependencies were tested together – immediately. If the test found problems, the code was reverted and everyone involved was notified. New code was system-tested thousands of times a day, which required a massive environment with a lot of automation. But it worked far better than manually testing large changes because it identified the precise cause of potential problems before they happened. As expensive as it seems, it turns out that testing each small change with its complete stack of dependent code is better, cheaper, safer and faster than testing big batch releases the way we used to.
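The idea of testing each small change together with its dependent code can be sketched in a few lines. This is not Google's system, only a toy version under stated assumptions: the module names and the `DEPENDS_ON` map are made up, each module's tests are reduced to a function returning True or False, and I read "dependent code" as the changed module plus everything that directly or indirectly depends on it.

```python
from collections import deque

# Hypothetical dependency map: module -> modules it depends on.
DEPENDS_ON = {
    "search_ui": {"search_api"},
    "search_api": {"indexer", "auth"},
    "indexer": {"storage"},
    "billing": {"auth", "storage"},
    "auth": set(),
    "storage": set(),
}


def affected_by(changed, depends_on):
    """Return the changed module plus everything that transitively depends on it."""
    # Invert the map so each module can be asked who depends on it.
    dependents = {}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    seen = {changed}
    queue = deque([changed])
    while queue:  # breadth-first walk up the dependency graph
        current = queue.popleft()
        for module in dependents.get(current, ()):
            if module not in seen:
                seen.add(module)
                queue.append(module)
    return seen


def submit_change(changed, depends_on, test_for):
    """Test the change with its dependents; report a revert if anything fails."""
    to_test = sorted(affected_by(changed, depends_on))
    failures = [module for module in to_test if not test_for[module]()]
    if failures:
        return ("reverted", failures)
    return ("accepted", to_test)


# Stand-in test suites: everything passes except search_ui.
test_for = {module: (lambda: True) for module in DEPENDS_ON}
test_for["search_ui"] = lambda: False

indexer_result = submit_change("indexer", DEPENDS_ON, test_for)
auth_only = affected_by("auth", DEPENDS_ON)
```

Because each submission is one small change, a failure points straight at that change. That is the property big-batch system testing cannot offer.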

“But how do we get from our legacy systems to that ideal state?” we are often asked. Well, that is precisely the question your agile transformation should answer. There are plenty of places to look for ideas, because this is a path many companies have taken. To get started, Martin Fowler's Strangler Application provides a general pattern for migrating away from legacy code, and several published case studies show how it has been applied. However, there are no canned answers for dealing with legacy code; the problems are quite specific to each situation. You need good engineers to take up the challenge supported by leadership that appreciates the importance of the issue. But the bottom line is that if an agile transformation does not provide a path from smashing your system with big releases to probing it with tiny bits of code, you have more homework to do before you get started.
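For readers who have not met the Strangler Application pattern, here is one way it might look in miniature. This is a sketch under my own assumptions, not Fowler's code: the `StranglerFacade` class, the handler functions, and the URL prefixes are all hypothetical. The essential idea is that a thin facade sits in front of the legacy system and hands over one slice of traffic at a time to new code.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def legacy_system(path: str) -> str:
    """Stand-in for the existing monolith."""
    return f"legacy handled {path}"


def new_invoice_service(path: str) -> str:
    """Stand-in for a newly written replacement for one slice of the legacy code."""
    return f"new service handled {path}"


class StranglerFacade:
    """Routes each request to migrated code if it exists, else to the legacy system."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        # Each call moves one small slice of behaviour off the legacy system.
        self._migrated[prefix] = handler

    def handle(self, path: str) -> str:
        # The longest matching prefix wins, so narrow routes override broad ones.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](path)
        return self._legacy(path)


facade = StranglerFacade(legacy_system)
before = facade.handle("/invoices/42")   # still served by the legacy system

facade.migrate("/invoices", new_invoice_service)
after = facade.handle("/invoices/42")    # now served by the new code
untouched = facade.handle("/orders/7")   # not migrated yet, still legacy
```

Each `migrate` call is a small probe: one route moves, you watch the results, and the legacy system shrinks a little at a time until nothing routes to it.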

We have learned a lot about how to deal with dependencies over the last few years. We can do it with an architecture that isolates dependencies – perhaps microservices – or by automatically testing the complete system of dependent code after every small change. We know we should NOT deal with dependencies by consuming the last third of a release cycle with system testing (and fixing) the way we used to in the waterfall days. And we know it does not make sense to automate tests just to make this back-end testing go faster – a mistake we have seen frequently that you want to avoid. Test automation should be aimed at defect prevention, not defect discovery. Preventing defects as the code is written pays for itself. Many times over. Every time.

Ask the Right Questions

If you are one of those conservative organizations that is just getting around to an agile transformation, be sure you ask the right questions before you take the leap. Remember that typical agile practices are just table stakes. You need to know how to play the complex systems game, a deeply technical game played by very smart engineers. Don’t insult their intelligence if you want to engage them.

Understand that dependencies cause most defects and fragile code bases, and they also lead to tangled organizational structures. Really. If you’re skeptical, check out Conway’s Law. Get your technical and architectural act together, as well as your strategy for dealing with dependencies, before you begin. This may prompt you to consider an organizational change as part of the transformation.

When you are ready to start, be sure to articulate the specific business goals the agile transition will help achieve and how you will measure the agile transition’s contribution to these goals. Then challenge your smart people to figure out how to move those metrics – and your transition will be off to a good start.

As an industry, we know how to do this. Your colleagues have done it. You may as well avoid the pitfalls they have discovered. Start by asking a few questions.