Friday, August 19, 2011

Don’t Separate Design from Implementation

I was a programmer for about fifteen years. Then I managed a factory IT department for a few years, and managed vendors delivering software for yet more years.  In all of those years (with one exception), software was delivered on time and customers were happy. Yet I never used a list of detailed requirements, let alone a backlog of stories, to figure out what should be done – not for myself, not for my department, not even for vendors.

In fact, I couldn’t imagine how one could look at a piece of paper – words – and decipher what to program. I felt that if the work to be done could be adequately written down in a detailed enough manner that code could be written from it, well, it pretty much had to be pseudocode. And if someone was going to write pseudocode, why not just write the code? It would be equally difficult, less error-prone, and much more efficient.

Software Without Stories
So if I didn’t use detailed requirements – how did I know what to code? Actually, everything had requirements, it’s just that they were high level goals and constraints, not low level directives. For example, when I was developing process control systems, the requirements were clear: the system had to control whatever process equipment the guys two floors up were designing, the product made by the process had to be consistently high quality, the operator had to find the control system convenient to use, and the plant engineer had to be able to maintain it. In addition, there was a deadline to meet and it would be career-threatening to be late. Of course there was a rough budget based on history, but when a control system was going to be used for some decades, one was never penny wise and pound foolish. With these high level goals and constraints, a small team of us proceeded to design, develop, install, and start up a sophisticated control system, with guidance from senior engineers who had been doing this kind of work for decades.

One day, after I had some experience myself, an engineering manager from upstairs came to ask me for help. He had decided to have an outside firm develop and install a process monitoring system for a plant. There was a sophisticated software system involved – the kind I could have written, except that it was too large a job for the limited number of engineers who were experienced programmers. He had chosen to contract with the outside firm on a time-and-materials basis even though his boss thought time-and-materials was a mistake. The engineering manager didn’t believe that it was possible to pre-specify the details of what was needed, but if a working system wasn’t delivered on time and on budget, he would be in deep trouble. So he gave me this job: “Keep me out of trouble by making sure that the system is delivered on time and on budget, and make sure that it does what Harold Stressman wants it to do.”

Harold was a very senior plant product engineer who wanted to capture real time process information in a database. He already had quality results in a database, and he wanted to do statistical analysis to determine which process settings gave the best results. Harold didn’t really care how the system would work, he just wanted the data. My job was to keep the engineering manager out of trouble by making sure that the firm delivered the system Harold envisioned within strict cost and schedule constraints.

The engineering manager suggested that I visit the vendor every few weeks to monitor their work. So every month for eighteen months I flew to Salt Lake City with a small group of people. Sometimes Harold came, sometimes the engineers responsible for the sensors joined us, sometimes the plant programmers were there. We did not deliver “requirements;” we were there to review the vendor’s design and implementation. Every visit I spent the first evening poring over the current listings to be sure I believed that the code would do what the vendor claimed it would do. During the next day and a half we covered two topics: 1) What could the system actually do today (and was this a reasonable step toward getting the data Harold needed)? and 2) Exactly how did the vendor plan to get the system done on time (and was the plan believable)?

This story has a happy ending: I kept the engineering manager out of trouble, the system paid for half of its cost in the first month, and Harold was so pleased with the system that he convinced the plant manager to hire me as IT manager.

At the plant, just about everything we did was aimed at improving plant capacity, quality, or throughput, and since we were keepers of those numbers, we could see the impact of changes immediately. The programmers in my department lived in the same small town as their customers in the warehouse and on the manufacturing floor. They played softball together at night, met in town stores and at church, had kids in the same scout troop. Believe me, we didn’t need a customer proxy to design a system. If we ever got even a small detail of any system wrong, the programmers heard about it overnight and fixed it the next day.

Bad Amateur Design
The theme running through all of my experience is that the long list of things we have come to call requirements – and the large backlog of things we have come to call stories – are actually the design of the system. Even a list of features and functions is design. And in my experience, design is the responsibility of the technical team developing the system. For example, even though I was perfectly capable of designing and developing Harold’s process monitoring system myself, I never presumed to tell the vendor’s team what features and functions the system should have. Designing the system was their job; my job was to review their designs to be sure they would solve Harold’s problem and be delivered on time.

If detailed requirements are actually design, if features and functions are design, if stories are design, then perhaps we should re-think who is responsible for this design. In most software development processes I have encountered, a business analyst or product owner has been assigned the job of writing the requirements or stories or use cases which constitute the design of the system. Quite frankly, people in these roles often lack the training and experience to do good system design, to propose alternative designs and weigh their trade-offs, to examine implementation details and modify the design as the system is being developed. All too often, detailed requirements lists and backlogs of stories are actually bad system design done by amateurs.

I suggest we might get better results if we skip writing lists of requirements and building backlogs of stories. Instead, expect the experienced designers, architects, and engineers on the development team to design the system against a set of high-level goals and constraints – with input from and review by business analysts and product managers, as well as users, maintainers, and other stakeholders.

A couple of my “old school” colleagues agree with me on this point. Fred Brooks, author of the software engineering classic “The Mythical Man-Month,” wrote in his recent book “The Design of Design” [1]:
“One of the most striking 20th century developments in the design disciplines is the progressive divorce of the designer from both the implementer and the user. … [As a result] instances of disastrous, costly, or embarrassing miscommunication abound.”
Tom Gilb, author of the very popular books “Principles of Software Engineering Management” and “Competitive Engineering,” recently wrote [2]:
“The worst scenario I can imagine is when we allow real customers, users, and our own salespeople to dictate ‘functions and features’ to the developers, carefully disguised as ‘customer requirements’. Maybe conveyed by our product owners. If you go slightly below the surface of these false ‘requirements’ (‘means’, not ‘ends’), you will immediately find that they are not really requirements. They are really bad amateur design for the ‘real’ requirements…. 
“Let developers engineer technical solutions to meet the quantified requirements. This gets the right job (design) done by the right people (developers) towards the right requirements (higher level views of the qualities of the application).”
Separating design from implementation amounts to outsourcing the responsibility for the suitability of the resulting system to people outside the development team. The team members are then in a position of simply doing what they are told to do, rather than being full partners collaborating to create great solutions to problems that they care about.

[1] “The Design of Design” by Fred Brooks, pp 176-77. Pearson Education, 2010
[2] “Value-Driven Development Principles and Values” by Tom Gilb, p. 18. Agile Record, Issue 3, July 2010

Friday, July 15, 2011

How Cadence Predicts Process

If you want to learn a lot about a software development organization very quickly, there are a few simple questions you might ask. You might find out if the organization focuses on projects or products. You might look into what development process it uses. But perhaps the most revealing question is this: How far apart are the software releases?

It is rare that new software is developed from scratch; typically existing software is expanded and modified, usually on a regular basis. As a result, most software development shops that we run into are focused on the next release, and very often releases are spaced out at regular intervals. We have discovered that a significant differentiator between development organizations is the length of that interval – the length of the software release cycle.

Organizations with release cycles of six months to a year (or more) tend to work like this: Before a release cycle begins, time is spent deciding what will be delivered in the next release. Estimates are made. Managers commit. Promises are made to customers. And then the development team is left to make good on all of those promises. As the code complete date gets closer, emergencies arise and changes have to be made, and yet, those initial promises are difficult to ignore. Pressure increases.

If all goes according to plan, about two-thirds of the way through the release cycle, code will be frozen for system integration testing (SIT) and user acceptance testing (UAT). Then the fun begins, because no one really knows what sort of unintended interactions will be exposed or how serious the consequences of those interactions will be. It goes without saying that there will be defects; the real question is, can all of the critical defects be found and fixed before the promised release date?

Releases are so time-consuming and risky that organizations tend to extend the length of their release cycle so as not to have to deal with this pain too often. Extending the release cycle invariably increases the pain, but at least the pain occurs less frequently. Counteracting the tendency to extend release cycles is the rapid pace of change in business environments that depend on software, because the longer release cycles become a constraint on business flexibility. This clash of cadences results in an intense pressure to cram as many features as possible into each release. As lengthy release cycles progress, pressure mounts to add more features, and yet the development organization is expected to meet the release date at all costs.

Into this intensely difficult environment a new idea often emerges – why not shorten the release cycle, rather than lengthen it? This seems like an excellent way to break the death spiral, but it isn’t as simple as it seems. The problem, as Kent Beck points out in his talk “Software G Forces: the Effects of Acceleration,” is that shorter release cycles demand different processes, different sales strategies, different behavior on the part of customers, and different governance systems. These kinds of changes are notoriously difficult to implement.

Quick and Dirty Value Stream Map
I’m standing in front of a large audience. I ask the question: “Who here has a release cycle longer than three months?” Many hands go up. I ask someone whose hand is up, “How long is your release cycle?” She may answer, “Six months.” “Let me guess how much time you reserve for final integration, testing, hardening, and UAT,” I say. “Maybe two months?” If she had said a year, I would have guessed four months. If she had said 18 months, I would have guessed 6 months. And my guess would be very close, every time. It seems quite acceptable to spend two-thirds of a release cycle building buggy software and the last third of the cycle finding and fixing as many of those bugs as possible.

The next question I ask is: When do you decide what features should go into the release? Invariably when the release cycle is six months or longer, the answer is: “Just before we start the cycle.” Think about a six month release cycle: For the half year prior to the start of the cycle, demand for new or changed features has been accumulating – presumably at a steady pace. So the average wait of a feature to even be considered for development is three months – half of the six month cycle time. Thus – on the average – it will take a feature three months of waiting before the cycle begins, plus six months of being in development and test before that feature is released to customers; nine months in all.

Finally I ask, “About how many features might you develop during a six month release cycle?” Answers to this vary widely from one domain to another, but let’s say I am told that about 25 features are developed in a six month release, which averages out to about one feature per week.

This leaves us with a quick and dirty vision of the value stream: a feature takes a week to develop and best case it takes nine months (38 weeks) to make it through the system. So the process efficiency is 1÷38, or about 2.6%. A lot of this low efficiency can be attributed to batching up 25 features in a single release. A lot more can be attributed to the fact that only 4 months of the 9 total months are actually spent developing software – the rest of the time is spent waiting for a release cycle to start or waiting for integration testing to finish.
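The arithmetic behind this quick and dirty value stream map can be sketched in a few lines of Python. This is only an illustration of the reasoning above: the function name `value_stream` and the weeks-per-month conversion are assumptions made for the sketch, and the unrounded result lands close to, not exactly on, the 2.6% figure.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average calendar month, about 4.33 weeks


def value_stream(cycle_months: float, features_per_release: int) -> dict:
    """Quick and dirty value stream numbers for a fixed release cadence."""
    # Requests accumulate steadily during the previous cycle, so on average
    # a feature waits half a cycle before it is even considered.
    avg_wait_months = cycle_months / 2
    # Once the cycle starts, the feature ships at the end of it.
    lead_time_months = avg_wait_months + cycle_months
    # Average development time per feature, spreading the cycle over the batch.
    weeks_per_feature = cycle_months * WEEKS_PER_MONTH / features_per_release
    efficiency = weeks_per_feature / (lead_time_months * WEEKS_PER_MONTH)
    return {
        "avg_wait_months": avg_wait_months,
        "lead_time_months": lead_time_months,
        "weeks_per_feature": weeks_per_feature,
        "efficiency": efficiency,
    }


result = value_stream(cycle_months=6, features_per_release=25)
print(f"lead time: {result['lead_time_months']:.0f} months")
print(f"process efficiency: {result['efficiency']:.1%}")
```

Under these assumptions the efficiency reduces to 1 ÷ (1.5 × features per release), so it depends on the size of the batch rather than on the length of the cycle. That is one way to see why batching 25 features into a single release is so costly.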

Why not Quarterly Releases?
With such dismal process efficiency, let’s revisit the brilliant idea of shortening release cycles. The first problem we encounter is that at a six month cadence, integration testing generally takes about two months; if releases are annual, integration testing probably takes three or four months. This makes quarterly releases quite a challenge.

For starters, the bulk of the integration testing is going to have to be automated. However, most people rapidly discover that their code base is very difficult to test automatically, because it wasn’t designed or written to be tested automatically. If this sounds like your situation, I recommend that you read Gojko Adzic’s book “Specification by Example.” You will learn to think of automated tests as executable specifications that become living documentation. You will not be surprised to discover that automating integration tests is technically challenging, but the detailed case studies of successful teams will give you guidance on both the benefits and the pitfalls of creating a well architected integration test harness.

Once you have the beginnings of an automated integration test harness in place, you may as well start using it frequently, because its real value is to expose problems as soon as possible. But you will find that code needs to be “done” in order to be tested in this harness, otherwise you will get a lot of false negatives. Thus all teams contributing to the release would do well to work in 2-4 week iterations and bring their code to a state that can be checked by the integration test harness at the end of every iteration. Once you can reasonably begin early, frequent integration testing, you will greatly reduce final integration time, making quarterly releases practical.

Be careful, however, not to move to quarterly releases without thinking through all of the implications. As Kent Beck noted in his Software G Forces talk, sales and support models at many companies are based on annual maintenance releases. If you move from an annual to a quarterly release, your support model will have to change for two reasons: 1) customers will not want to purchase a new release every quarter, and 2) you will not be able to support every single release over a long period of time. You might consider quarterly private releases with a single annual public release, or you might want to move to a subscription model for software support. In either case, you would be wise not to guarantee long term support for more than one release per year, or support will rapidly become very expensive.

From Quarterly to Monthly Releases
Organizations that have adjusted their processes and business models to deal with a quarterly release cycle begin to see the advantages of shorter release cycles. They see more stability, more predictability, less pressure, and they can be more responsive to their customers. The question then becomes, why not increase the pace and release monthly? They quickly discover that an additional level of process and business change will be necessary to achieve the faster cycle time because four weeks – twenty working days – from design to deployment is not a whole lot of time.

At this cadence, as Kent Beck notes, there isn’t time for a lot of information to move back and forth between different departments; you need a development team that includes analysts and testers, developers and build specialists. This cross-functional team synchronizes via short daily meetings and visualization techniques such as cards and charts on the wall – because there simply isn’t time for paper-based communication. The team adopts processes to ensure that the code base always remains defect-free, because there isn’t time to insert defects and then remove them later. Both TDD (Test Driven Development) and SBE (Specification by Example) become essential disciplines.

From a business standpoint, monthly releases tend to work best with software-as-a-service (SaaS). First of all, pushing monthly releases to users for them to install creates huge support headaches and takes far too much time. Secondly, it is easy to instrument a service to see how useful any new feature might be, giving the development team immediate and valuable feedback.

Weekly / Daily Releases
There are many organizations that consider monthly releases a glacial pace, so they adopt weekly or even daily releases. At a weekly or daily cadence, iterations become largely irrelevant, as do estimating and commitment. Instead, a flow approach is used; features flow from design to done without pause, and at the end of the day or week, everything that is ready to be deployed is pushed to production. This rapid deployment is supported by a great deal of automation and requires a great deal of discipline, and it is usually limited to internal or SaaS environments.

There are a lot of companies doing daily releases; for example, one of our customers with a very large web-based business has been doing daily releases for five years. The developers at this company don’t really relate to the concept of iterations. They work on something, push it to systems test, and if it passes it is deployed at the end of the day. Features that are not complete are hidden from view until a keystone is put in place to expose the feature, but code is deployed daily, as it is written. Occasionally a roll-back is necessary, but this is becoming increasingly rare as the test suites improve. Managers at the company cannot imagine working at a slower cadence; they believe that daily deployment increases predictability, stability, and responsiveness – all at the same time.
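The text does not say how this company hides incomplete features, but one common way to get the same effect is a feature toggle: the new code path is deployed every day, and a single switch (the “keystone”) decides whether users can reach it. The sketch below is a hypothetical illustration; `EXPOSED_FEATURES`, `is_exposed`, and `checkout_page` are invented names, not anything from the company described.

```python
# Hypothetical registry of features and whether they are visible to users.
# Code for "new_checkout" ships daily, but stays hidden until the flag flips.
EXPOSED_FEATURES = {"new_checkout": False}


def is_exposed(feature: str) -> bool:
    """Return True only for features that have been explicitly switched on."""
    return EXPOSED_FEATURES.get(feature, False)


def checkout_page() -> str:
    """Serve the new flow only once its keystone is in place."""
    if is_exposed("new_checkout"):
        return "new checkout flow"
    return "existing checkout flow"


print(checkout_page())                    # hidden: users still see the old flow
EXPOSED_FEATURES["new_checkout"] = True   # the "keystone" goes in
print(checkout_page())                    # the finished feature is now visible
```

Because unknown features default to hidden, half-finished work can be deployed safely, and exposing a feature becomes a separate decision from deploying its code.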

Continuous Delivery
Jez Humble and David Farley wrote “Continuous Delivery” to share with the software development community techniques they have developed to push code to the production environment as soon as it is developed and tested. But continuous delivery is not just a matter of automation. As noted above, sales models, pricing, organizational structure and the governance system all merit thoughtful consideration.

Every step of your software delivery process should operate at the same cadence. For example, with continuous delivery, portfolio management becomes a thing of the past; instead people make frequent decisions about what should be done next. Continuous design is necessary to keep pace with the downstream development, validation and verification flow. And finally, measurements of the “success” of software development have to be based on delivered value and improved business performance, because there is nothing else left to measure.

Monday, February 7, 2011

Before There Was Management

Management is a rather recent invention in the history of human evolution – it’s been around for maybe 100 or 150 years, about two or three times longer than software. But people have been living together for thousands of years, and it could be argued that over those thousands of years, we did pretty well without managers. People are social beings, hardwired through centuries of evolution to protect their family and community, and to provide for the next generation. For tens of thousands of years, people have lived together in small hamlets or clans that were relatively self-sufficient, where everyone knew – and was probably related to – everyone else. These hamlets inevitably had leaders to provide general direction, but day-to-day activities were governed by a set of well understood mutual obligations. As long as the hamlets stayed small enough, this was just about all the governance that was needed; and most hamlets stayed small enough to thrive without bureaucracy until the Industrial Revolution.

The Magic Number One Hundred and Fifty
Early in his career, British anthropologist Robin Dunbar found himself studying the sizes of monkey colonies, and he noticed that different species of monkeys preferred different size colonies. Interestingly, the size of a monkey colony seemed to be related to the size of the monkeys’ brains; the smaller the brain, the smaller the colony. Dunbar theorized that brain size limits the number of social contacts that a primate could maintain at one time. Thinking about how humans seemed to have evolved from primates, Dunbar wondered if, since the human brain was larger than the monkey brain, humans would tend to live in larger groups. He calculated the maximum group size that humans would be likely to live in based on the relative size of the human brain, and arrived at a number just short of 150. Dunbar theorized that humans might have a limit on their social channel capacity (the number of individuals with whom a stable inter-personal relationship can be maintained) of about 150.[1]

To test his theory, Dunbar and other researchers started looking at the size of social groups of people. They found that a community size of 150 has been a very common maximum limit in human societies around the world going back in time as far as they can investigate. And Dunbar’s Number (150) isn’t found only in ancient times. The Hutterites, a religious group that formed self-sufficient agricultural communities in Europe and North America, have kept colonies under 150 people for centuries. Beyond religious communities, Dunbar found that during the eighteenth century, the average number of people in villages in every English county except Kent was around 160. (In Kent it was 100.) Even today, academic communities that are focused on a particular narrow discipline tend to be between 100 and 200 – when the community gets larger, it tends to split into sub-disciplines.[2]

Something akin to Dunbar’s number can be found in the world of technology also. When Steve Jobs ran the Macintosh department at Apple, his magic number was 100. He figured he could not remember more than 100 names, so the department was limited to 100 people at one time. A team that never exceeded 100 people designed and developed both the hardware and software that became the legendary Apple Macintosh.[3] Another example: in a 2004 blog post, “The Dunbar Number as a Limit to Group Sizes,” Christopher Allen noted that on-line communities tend to have 40 to 60 active members at any one time. You can see two peaks in Allen’s chart of group satisfaction as a function of group size – one peak for a team size of 5 to 8, and an equally high peak when team size is around 50.[4]

Steve Jobs’s limit of 100 people was probably a derivative of the Dunbar Number, but Allen’s peak at 50 is something different. According to Dunbar, “If you look at the pattern of relationships within… our social world, a number of circles of intimacy can be detected. The innermost group consists of about three to five people. … Above this is a slightly larger grouping that typically consists of about ten additional people. And above this is a slightly bigger circle of around thirty more…”[5] In case you’ve stopped counting, the circles of intimacy are 5, 15, 50, 150 – each circle about three times the size of the smaller circle. The number 50, which Allen found in many on-line communities, is the number of people Dunbar found in many hunting groups in ancient times – and three of these groups of 50 would typically make up a clan.
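For readers who did stop counting, the circle sizes can be checked with a few lines of Python. This is just the arithmetic from the quotation, taking the upper figure of “three to five” as the innermost circle; the variable names are made up for the sketch.

```python
from itertools import accumulate

# Increments taken from the quotation: about 5, then 10 more, then 30 more.
increments = [5, 10, 30]
running_totals = list(accumulate(increments))
print(running_totals)  # cumulative circle sizes implied by the quotation

# The rounded circle sizes used in the text.
circles = [5, 15, 50, 150]
ratios = [larger / smaller for smaller, larger in zip(circles, circles[1:])]
print([round(r, 2) for r in ratios])  # growth factor from each circle to the next
```

The quoted increments add up to roughly 45 for the third circle, which the text rounds to 50, and each rounded circle is between 3 and 3.4 times the size of the one inside it.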

Does this Work in Companies?
One hundred and fifty is certainly a magic number for W.L. Gore & Associates. Gore is a privately held business that specializes in developing and manufacturing innovative products based on PTFE, the fluoropolymer in Gore-Tex fabrics. Gore has revenues exceeding 2.5 billion US dollars, employs over 8000 people, and has been profitable for over half a century. It has held a permanent spot on the U.S. “100 Best Companies to Work For” list since its inception in 1984, and is a fixture on similar lists in many countries in Europe. This amazing track record might be related to the fact that Gore doesn’t have managers. There are plenty of leaders at Gore, but leaders aren’t assigned the job, they have to earn it by attracting followers.

You’ve got to wonder how such a large company can turn in such consistent performance for such a long period of time without using traditional management structures. The answer seems to have something to do with the fact that Gore is organized into small business units that are limited to about 150 people. “We found again and again that things get clumsy at a hundred and fifty,” according to founder Bill Gore. So when the company builds a new plant, it puts 150 spaces in the parking lot, and when people start parking on the grass, they know it’s time to build a new plant.

Since associates at Gore do not have managers, they need different mechanisms to coordinate work, and interestingly, one of the key mechanisms is peer pressure. Here is a quote from Jim Buckley, a long-time associate at a Gore plant: “The pressure that comes to bear if we are not efficient as a plant, if we are not creating good enough earnings for the company, the peer pressure is unbelievable. …This is what you get when you have small teams, where everybody knows everybody. Peer pressure is much more effective than a concept of a boss. Many, many times more powerful.”[6]

Like many companies that depend on employees to work together and make good decisions, Gore is very careful to hire people who will fit well in its culture. Leaders create environments where people have the tools necessary for success and the information needed to make good decisions. Work groups are relatively stable so people get to know the capabilities and expectations of their colleagues. But in the end, the groups are organized around trust and mutual obligation – a throwback to the small communities in which humans have thrived for most of their history.

Google’s management culture has quite a few similarities with Gore’s. Google was designed to work more or less like a university – where people are encouraged to decide on their own (with guidance) what they want to investigate. Google is extremely careful about hiring people who will fit in its culture, and it creates environments where people can pursue their passion without too much management interference. For a deep dive into Google’s culture, see this video: Eric Schmidt at the Management Lab Summit

Peer Cultures
Before there were managers, peer cultures created the glue that held societies together. In clans and hamlets around the world throughout the centuries, the self-interest of the social group was tightly coupled with the self-interest of individuals and family units; and thus obligations based on family ties and reciprocity were essential in creating efficient communities.

There are many, many examples of peer cultures today, from volunteer organizations to open source software development to discussion forums and social networks on the web. In these communities, people are members by their own choice; they want to contribute to a worthy cause, get better at a personal skill, and feel good about their contribution. In a peer culture, leaders provide a vision, a way for people to contribute easily, and just enough guidance to be sure the vision is achieved.

Arguably, peer cultures work a lot better than management at getting many things done, because they create a social network and web of obligations that underlie intrinsic motivation. So perhaps we’d be better off taking a page out of the Gore or the Google or the Open Source playbook and leveraging thousands of years of human evolution. We are naturally social beings and have a built-in need to protect our social unit and ensure that it thrives.

Example: Hardware/Software Products
“We have found through experience that the ideal team size is somewhere between 30 and 70,” the executive told us. At first we were surprised. Aren’t teams supposed to be limited to about 7 people? Don’t teams start breaking up when they’re much larger? Clearly the executive was talking about a different kind of team than we generally run into in agile software development. But his company was one of the most successful businesses we have encountered recently, so we figured there had to be something important in his observation.

We spent a morning with a senior project manager at the company – the guy who coordinated 60 people in the development of a spectacular product in record time. The resulting product was far ahead of its time and gave the company a significant competitive advantage. He explained how he coordinated the work: “Every 2 or 3 months we produced a working prototype, each one more sophisticated than the last one. As we were nearing the end of development, a new (faster, better, cheaper) chip hit the market. The team decided to delay the next prototype by two months so they could incorporate the new chip. Obviously we didn’t keep to the original schedule, but in this business, you have to be ready to seize the opportunities that present themselves.”

It’s not that this company had no small teams inside the larger teams; of course they did. It’s just that the coordination was done at the large team level, and the members of the smaller teams communicated on a regular basis with everyone on the larger team. All team members were keenly aware of the need to meet the prototype deadlines and they didn’t need much structure or encouragement to understand and meet the needs of their colleagues.

Another Example: Construction
The Lean Construction Institute has developed a similar approach to effectively organizing construction work. The first thing they do is to break down very large projects into multiple smaller ones so that a reasonable number of contractors can work together. (Remember Dunbar’s Number.) For example, they might completely separate a parking structure and landscaping from the main building; in a large building, the exterior would probably be a separate project from the interior. Each sub-project is further divided into phases of a few months; for example, foundation, structure, interior systems, etc. Before a phase starts, a meeting of all involved contractors is held and all of the things that need to be done to complete that phase are posted on cards on a wall by the contractors. The cards are organized into a timeline that takes dependencies into account, and all of the contractors agree that the wall represents a reasonable simulation of the work that needs to be done. This is not really a plan so much as an agreement among the contractors doing the work about what SHOULD be done to complete the phase.

Each week all of the “Last Planners” (crew chiefs, superintendents, etc.) get together and look at what they SHOULD do, and also what they CAN do, given the situation at the building site. Then they commit to each other what they WILL complete in the next week. The contractors make face-to-face commitments to peers that they know personally. This mutual commitment just plain gets things done faster and more reliably than almost any other organizing technique, including every classic scheduling approach in the book.
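
For readers who think in code, the weekly SHOULD / CAN / WILL conversation can be sketched roughly as follows. This is only an illustration, not the Lean Construction Institute's tooling: the `Card` class, the function names, and the sample cards are all hypothetical. The sketch assumes that SHOULD means "every prerequisite card is finished" and that CAN is simply whatever the crew chiefs report at the meeting.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Card:
    """One card on the wall (hypothetical structure for illustration)."""
    name: str
    crew: str
    depends_on: set[str] = field(default_factory=set)


def should_do(cards: list[Card], done: set[str]) -> list[Card]:
    """SHOULD: unfinished cards whose prerequisites are all complete."""
    return [c for c in cards if c.name not in done and c.depends_on <= done]


def will_do(cards: list[Card], done: set[str], can_do: set[str]) -> list[Card]:
    """WILL: the SHOULD cards that the crews say they CAN do this week."""
    return [c for c in should_do(cards, done) if c.name in can_do]


# Sample wall for one phase; the card names are invented.
wall = [
    Card("pour footings", "concrete"),
    Card("erect steel", "ironworkers", {"pour footings"}),
    Card("rough-in plumbing", "plumbers", {"pour footings"}),
    Card("hang drywall", "drywall", {"erect steel", "rough-in plumbing"}),
]
done = {"pour footings"}
can = {"erect steel", "hang drywall"}  # reported by crew chiefs on site

commitments = will_do(wall, done, can)
print([c.name for c in commitments])  # ['erect steel']
```

Note what the code cannot capture: the commitment is made face-to-face to peers, and that is where the reliability comes from.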

The Magic Number Seven
George Miller published “The Magical Number Seven, Plus or Minus Two” in The Psychological Review in 1956. Miller wasn’t talking about team size in this article; he was discussing the capacity of people to distinguish between alternatives. For example, most people can remember a string of 7 numbers, and they can divide colors or musical tones into about 7 categories. Ask people to distinguish between more than 7 categories, and they start making mistakes. “There seems to be some limitation built into us either by learning or by the design of our nervous systems, a limit that keeps our channel capacities in this general range [of seven],” Miller wrote.

This channel capacity seems to affect our direct interaction with other people – we can keep up a conversation with 7 or so people, but when a group gets larger, it is difficult to maintain a single dialog, and small groups tend to start separate discussions. So for face-to-face groups that must maintain a single conversation, the magic number of 7 +/-2 is a good size limit. And historically, most agile software development teams have been about this size.

Moving Beyond Seven
The problem is, 7 people are not enough to accomplish many jobs. Take the job of putting a new software-intensive product on the market, for example. The product is almost never about the software – the product is a medical device or a car or a mobile phone or maybe it’s a financial application. Invariably software is a subsystem of a larger overall system, which means that invariably the software development team is a sub-team of a larger overall system team.

In the book Scaling Lean & Agile Development, Craig Larman and Bas Vodde make a strong case for feature teams – cross-functional teams that deliver end-to-end customer features. They recommend against component teams, groups formed around a single component or layer of the system. I agree with their advice, but it seems to me that software is invariably a component of whatever system we are building. We might be creating software to automate a process or software to control a product, software to deliver information or software to provide entertainment. But our customers don’t care about the software; they care about how the product or process works, how relevant the information is or how entertaining the game might be. And if software is a component of a system, then software teams are component teams. What we might want to consider is that real feature teams – teams chartered to achieve a business goal – will almost certainly include more than software development.

Agile development started out as a practice for small software teams, but these days we often see teams of 40 or 50 developers applying agile practices to a single business problem. In almost every case, we notice that the developers are organized into several small teams that work quite separately – and in almost every case, therefore, the biggest problem seems to be coordination across the small teams. There are many mechanisms: use a divisible system architecture so teams can be truly independent; draw from a common list of tasks, which makes teams highly interdependent; send small team representatives to weekly coordinating meetings; and so on. But rarely do we see the most powerful coordination mechanism of all for groups this size: create a sense of mutual obligation through peer commitments.

Mutual Obligation
You can call mutual obligation peer pressure if you like, but whatever name you use, when individuals on a large team make a commitment to people they know well, the commitment will almost certainly be honored. Mutual obligation is a much more powerful motivating force than being told to do something by an authority figure. And the interesting thing is, the power of mutual obligation is not confined to small teams. It works very well in teams of 50, and can be effective with teams up to 150. The time to split teams is not necessarily when they reach 10; team sizes up to 100 or 150 can be very effective – if you can create a sense of mutual obligation among the team members.

There are, of course, a few things that need to be in place before mutual commitment can happen. First of all, team members must know each other – well. So this won’t work if you constantly reform teams. In addition to knowing each other’s names, teammates must understand the capabilities of their colleagues on the team, have the capacity to make reliable commitments, and be able to trust that their teammates will meet their commitments. This process of creating mutual obligations actually works best if there is no manager brokering commitments, because then the commitments are made to the manager, not to teammates. Instead, a leader’s role is to lay out the overall objectives, clarify the constraints, and create the environment in which reliable commitments are exchanged.

For example, the project manager of the hardware/software product (above) laid out a series of increasingly sophisticated prototypes scheduled about three months apart. Having made a commitment to the team, sub-teams organized their work so as to have something appropriate ready at each prototype deadline. When an opportunity arose to dramatically improve the product by incorporating a new chip, the whole team was in a position to rapidly re-think what needed to be done and commit to the new goal.

In the case of lean construction (above), a large team of contractor representatives works out the details of a “schedule” every few months. Each week, the same team gets together and re-thinks how that “schedule” will have to be adapted to fit current reality. At that same weekly meeting, team members commit to each other what they will actually accomplish in the next week, which gives their colleagues a week to plan work crews, material arrival, and so on for the following week.

It certainly is a good idea to have small sub-teams whose members work closely together on focused technical problems, coordinating their work with brief daily meetings to touch base and make sure they are on track to meet their commitments. But the manner in which these sub-teams arrive at those commitments is open for re-thinking. It may be better to leverage thousands of years of human evolution and create an environment whereby people know each other and make mutual commitments to meet the critical goals of the larger community. After all, that’s the way most things got accomplished before there was management.

[1] Technically, Dunbar calculated the relative sizes of the neocortex – the outer surface of the brain responsible for conscious thinking. For a humorous parody of Dunbar's theory, see "What is the Monkeysphere?" by David Wong.

[2] Information in this paragraph is from: How Many Friends Does One Person Need? by Robin Dunbar.

[3] See John Sculley On Steve Jobs.

[4] This figure from “The Dunbar Number as a Limit to Group Sizes” is anecdotal.

[5] From How Many Friends Does One Person Need? by Robin Dunbar. Interestingly, while Dunbar finds 15 an approximate limit of the second circle of intimacy, Allen finds a group of 15 problematic.

[6] The Dunbar Number was popularized by Malcolm Gladwell in Tipping Point. Much information and both quotes in this section are from Chapter 5 of that book. See for an extended excerpt.

Saturday, January 15, 2011

A Tale of Two Terminals

By any measure, Terminal 3 at Beijing Capital Airport is impressive. Built in less than four years and officially opened barely four months before the Olympics, the massive terminal has received numerous awards for both its stunning design and its comfortable atmosphere. And it escaped the start-up affliction of many new airport terminals when it commenced full operations on March 26, 2008 without any notable problems.

The next day, halfway around the world, Heathrow Terminal 5 opened for business. At one-third the size of Beijing Terminal 3, the new London terminal had taken twice as long to build and cost twice as much. Proud executives at British Airways and BAA (British Airports Authority) exuded confidence in a flawless opening, but that was not to be. Instead, hundreds of flights were canceled in the first few days of operation, and about 28,000 bags went missing the first weekend. The chaotic opening of Heathrow Terminal 5 was such an embarrassment that it triggered a government investigation.

The smooth opening of Beijing Terminal 3 was not an anomaly – Terminal 2 at Shanghai Pudong International Airport opened the same day, also without newsworthy incident. Given the timing just before the Beijing Olympic games, it was clear that China was keenly interested in projecting an image of competence to the traveling public. But of course, the UK was equally interested in showcasing its proficiency, and the British executives clearly expected that the opening of Heathrow Terminal 5 would go smoothly. So the question to ponder is this: How did the Chinese airports manage two uneventful terminal openings? Did they do something different, or were the problems in London just bad luck?

It’s not like testing was overlooked at Heathrow Terminal 5. In fact, a simulation of the terminal’s systems was developed and all of the technical systems were tested exhaustively, even before they were built. A special testing committee was formed and thousands of people were recruited to be mock passengers, culminating in a test with 2000 volunteer passengers a few weeks before the terminal opened. On the other hand, the planned testing regime was curtailed because the terminal construction was not completed as early as planned; in fact, hard-hats were required in the baggage handling area until shortly before opening day. In addition, a decision was made to move 70% of the flights targeted for Terminal 5 on the very first day of operations, because it was difficult to imagine how to move in smaller increments.[1]

Those of us in the software industry have heard this story before: the time runs out for system testing, but a big-bang cut-over to a mission critical new system proceeds anyway, because the planned date just can’t be delayed. The result is predictable: wishful thinking gives way to various degrees of disaster.

That kind of disaster wasn’t going to happen in Beijing. I was in Beijing a month before the Olympics, and every single person I met – from tour guide to restaurant worker – seemed to feel personally responsible for projecting a favorable image as the eyes of the world focused on their city. I imagine that for every worker in Terminal 3, a smooth startup was a matter of national pride. But the terminal didn’t open smoothly just because everyone wanted things to go well. It opened smoothly because the airport authorities understood how to orchestrate such a large, complex undertaking that involved hundreds of people. After all, they had just finished building the airport at amazing speed.[2]

The opening ceremony of the Beijing Olympics was also a large, complex undertaking that involved hundreds of people. It’s easy to imagine the many rehearsals that took place to make sure that everyone knew their part. When it comes to opening a new terminal, the idea of rehearsals doesn’t usually occur to the authorities, but at Beijing Capital Airport, rehearsals started in early February. First a couple of thousand mock passengers took part in a rehearsal, then five thousand, and finally, on February 23rd, 8000 mock passengers checked in luggage for 146 flights. This was the average daily load expected a week later, when six minor airlines moved into Terminal 3. During the month of March, Terminal 3 operated on a trial basis, ironing out any problems that arose. Meanwhile, staff from the large airlines about to move to the terminal rehearsed their jobs in the new terminal day after day, so that when the big moving day arrived, everyone knew what to do. On March 26, all the practice paid off when the Terminal was opened with very few problems.

This certainly wasn’t the approach taken at Heathrow Terminal 5. It’s pretty clear that the opening chaos arose because people did not know what to do: they didn’t know where to park, couldn’t get through security, didn’t know how to sign on to the new PDAs to get their work assignments, didn’t know where to get help, and didn’t know how to stop all the luggage from coming at them until their problems got sorted out. Even the worst of the technical problems was actually a people problem: the baggage handling software had been put in a ‘safe’ mode for testing, and apparently no one was responsible for removing the patch which cut off communication to other terminals in the airport. It took three days to realize that this very human error was the main cause of the software problems![3]

In testimony to the British House of Commons, union Shop Steward Iggy Vaid testified:[4]
We raised [worker concerns] with our senior management team especially in British Airways. … [Their response was to] involve what we call process engineers who came in and decided what type of process needed to be installed. They only wanted the union to implement that process and it was decided by somebody else, not the people who really worked it. The fact is that they paid lip service to, ignored or did not implement any suggestion we made.

… as early as January there was a meeting with the senior management team at which we highlighted our concerns about how the baggage system and everything else would fail, that the process introduced would not work and so on. We highlighted all these concerns, but there was no time to change the whole plan.

… [Workers] had two days of familiarization in a van or were shown slides; they were shown where their lockers were and so on, but there was no training for hands-on work. …

The opening of a new airport terminal is an exercise in dealing with complexity. At Heathrow Terminal 5, new technical systems and new work arrangements had to come together virtually overnight – and changing the date once it has been set would have been difficult and expensive. Hundreds of people were involved, and every glitch in the work system had a tendency to cascade into ever larger problems.

If this sounds familiar, it’s because this scenario has been played out several times in the lives of many of us in software development. Over time, we have learned a lot about handling unforgiving, complex systems, particularly systems that include people interacting with new technology. But every time we encounter a messy transition like the one at Heathrow Terminal 5, we wonder if our hard-learned lessons for dealing with complexity couldn’t be spread a bit wider.

Socio-technical Systems
Not very far from Heathrow, the Tavistock Institute of London has spent some decades researching work designs that deal effectively with turbulence and complexity. In the 1950’s and 60’s, renowned scientists such as Eric Trist and Fred Emery documented novel working arrangements that were particularly effective in the coal mines and factories of Great Britain. They found that especially effective work systems were designed (and continually improved) by semi-autonomous work teams of between 10 and 100 people that accepted responsibility for meaningful (end-to-end) tasks. The teams used their knowledge of the work and of high-level objectives to design a system to accomplish the job in a manner that optimized the overall results. Moreover, these teams were much better at managing uncertainty and rapidly adapting to various problems as they were encountered. The researchers found that the most effective work design occurs when the social aspects of the work are balanced with its technical aspects, so they called these balanced work systems Socio-technical systems.

In 1981, Eric Trist published “The Evolution of Socio-technical Systems,” an engaging history of his work. He attributes the “old paradigm” of work design to Max Weber (bureaucracy) and Frederick Taylor (work fragmentation). He proposed that a “new paradigm” would be far more effective for organizations in turbulent, competitive, or rapidly changing situations:[5]
Old Paradigm | New Paradigm
The technological imperative | Joint optimization [of social & technical systems]
Man as an extension of the machine | Man as complementary to the machine
Man as an expendable spare part | Man as a resource to be developed
Maximum task breakdown, simple narrow skills | Optimum task grouping, multiple broad skills
External controls (supervisors, specialist staffs, procedures) | Internal controls (self-regulating subsystems)
Tall organization chart, autocratic style | Flat organization chart, participative style
Competition, gamesmanship | Collaboration, collegiality
Organization’s purposes only | Members’ and society’s purposes also
Alienation | Commitment
Low risk taking | Innovation

In the 1980’s the socio-technical paradigm gained increased popularity when team work practices from Japan were widely copied in Europe and America. In the 1990’s socio-technical ideas merged with general systems theory, and the term “socio-technical systems” fell into disuse. But the ideas lived on. These days, it is generally accepted that the most effective way to deal with complex or fast-changing situations is by structuring work around semi-autonomous teams that have the leadership and training to respond effectively to any situation the groups are likely to encounter.

The clearest examples we have of semi-autonomous work teams are emergency response teams – firefighters, paramedics, emergency room staff. Their job is to respond to challenging, complex, rapidly changing situations, frequently in dangerous surroundings and often with lives at stake. Emergency response teams prepare for these difficult situations by rehearsing their roles, so everyone knows what to do. During a real emergency, that training coupled with the experience of internal leaders enables the teams to respond dynamically and creatively to the emergency as events unfold.

Design Social Systems Along with Technical Systems
Developing a software system that automates a work system is fraught with just about as much danger as moving to a new airport terminal. There are many things we can do to mitigate that risk:
1. Cut-over to any new system should be in small increments. Impossible? Don’t give up on increments too quickly – and don’t leave this to “customers” to decide! The technical risk of a big-bang cut-over is immense. And it’s almost always easier to divide the system in some way to facilitate incremental deployment than it is to deal with the virtually guaranteed chaos of a big-bang cut-over.

2. Simplify before you automate. Never automate a work process until the work teams have devised as simple a work process as they possibly can. Automating the right thing is at least as important as automating it right.

3. Do not freeze work design into code! Leave as much work design as possible for work teams to determine and modify. If that is not possible, make sure that the people who will live with the new system are involved in the design of their work.

4. Rehearse! Don’t just test the technical part, include the people who will use the new system in end-to-end rehearsals. Be prepared to adapt the technical system to the social system and to refine the social system as well. Be sure everyone knows what to do; be sure that the new work design makes sense. Leave time to adjust and adapt. Don’t cut this part short.

5. Organize to manage complexity. Structure work around work teams that can adapt to changing situations, especially if the environment is complex, could change rapidly, or is mission critical. At minimum, have emergency response teams on hand when the new system goes live.
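
The first recommendation, cutting over in small increments, might look something like the sketch below in a software setting. It is a minimal illustration under stated assumptions, not a prescription: the stage percentages, the 1% error threshold, and every function name are hypothetical. It assumes the work can be divided into units with stable ids (customers, flights, accounts), each of which is handled entirely by either the old or the new system.

```python
import hashlib

# Hypothetical ramp: percent of work units handled by the new system.
STAGES = [5, 25, 50, 100]


def bucket(unit_id: str) -> int:
    """Map an id to a stable bucket in 0..99 so a unit stays put between runs."""
    digest = hashlib.sha256(unit_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def use_new_system(unit_id: str, percent: int) -> bool:
    """True if this unit belongs to the slice already cut over."""
    return bucket(unit_id) < percent


def next_stage(current: int, error_rate: float, max_error_rate: float = 0.01) -> int:
    """Advance one stage after a clean run; fall back one stage after a bad one."""
    i = STAGES.index(current)
    if error_rate > max_error_rate:
        return STAGES[max(i - 1, 0)]
    return STAGES[min(i + 1, len(STAGES) - 1)]
```

Because the bucket depends only on the id, any unit that is on the new system at 25% is still on it at 50%, so each stage widens the previous one rather than reshuffling it. The time between stages is where the rehearsing and adjusting described in point 4 takes place.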

Much of the software we write ends up having an impact on the lives of other people; in other words, our work creates changes in social systems. We would do well to consider those social systems as we develop the technical systems. If we want to create systems that are truly successful, the technical and social aspects of our systems must be designed together and kept in balance.

[1] “The opening of Heathrow Terminal 5” report to the House of Commons Transport Committee, pages 13-14.

[2] Contrast this with the absence of the BAA management team that oversaw the on-time, on-budget construction of Heathrow Terminal 5; they were replaced after a 2006 takeover of BAA by the Spanish company Ferrovial.

[3] See “The opening of Heathrow Terminal 5” report to the House of Commons Transport Committee.

[4] “The opening of Heathrow Terminal 5” report to the House of Commons Transport Committee, pages 22-25.

[5] From "The Evolution of Socio-technical Systems" by Eric Trist, 1981. p 42.