Lean Essays: Cloud

January 22, 2018

Official Intelligence

Every morning I pick up a small black remote, push a button and quietly say, “Alexa, turn on Mary’s Desk.” In the distance, I hear “Ok” and my desk lights come on. I never imagined that I would use voice control. I don’t like shouting at devices and don’t like announcing what I’m doing to those around me. My experience with voice in cars and smartphones has been mediocre. An Echo sat in our house for three years before I began talking to it.

Our home has been highly automated since the 1980s, but my low voltage desk lamps are not compatible with our automation system. Last fall we connected our system to Alexa and bought an Alexa-compatible power strip for the lamps. Voilà! We could control everything by voice – in fact, voice was the only way to control my desk lamps remotely. So I had to talk to a device. After I got used to it, I tried Alexa’s shopping list and found it convenient. Then I discovered that Alexa’s timers are well suited to cooking, especially with full hands. Soon I got an Echo Spot so I could see the timers as they counted down – I was hooked.

A Killer App for IoT

When devices are scattered throughout a physical space, controlling them by voice is a killer app for the Internet of Things [1] – but only if a simple control standard is embedded in every device. For the past couple of years, Amazon has been making it very easy for just about any Wi-Fi enabled device to be controlled through an Alexa skill. Better yet, taking a page out of the old Intel playbook, Amazon sells inexpensive kits and supplies testing support, so designers can easily embed microphones and Alexa intelligence inside their devices. True, these devices compete directly with Amazon Echos, but as a platform company, Amazon understands that this is a good thing.

Amazon was once considered the weakest of the voice assistant competitors, which include Google, Apple, Samsung, Microsoft, and Facebook. Most voice assistants are, well, assistants. They reserve movie tickets, arrange transportation, find answers to questions. Amazon’s Echo has a different focus: always-on listening and hands-free control of an exploding array of internet devices. This turned out to be a good choice. Google, playing fast follower, quickly introduced Google Home, while Microsoft’s Cortana is being integrated with Alexa. With its first mover advantage, Alexa has captured a commanding share of the voice control market, which could eventually become a fourth ‘pillar’ of Amazon’s success.[2]

I have to wonder: Is it an accident that Amazon discovered the most attractive use of voice while its biggest competitors were heading in a different direction? Or is there something about Amazon that gives it an edge, something that we might learn from? How does such a massively large company foster the kind of innovation that leads to completely new markets?

Day 1

In his letter to shareholders last spring, Jeff Bezos explained his longstanding mantra “It’s still Day 1” by describing Day 2: “Day 2 is stasis. Followed by irrelevance. Followed by excruciating, painful decline. Followed by death. And that is why it is always Day 1.” Then Bezos laid out his principles for keeping the vitality of Day 1 alive:

True Customer Obsession
A Skeptical View of Proxies
Eager Adoption of External Trends
High-Velocity Decision Making

True Customer Obsession

We’ve heard so much about Amazon’s customer obsession that it can get boring. After all, doesn’t every company focus on customers?

Actually, no. Most executives lose a lot more sleep over profits, or shareholders, or competitors than they do worrying about customers. Imagine you worked at an airline, for example, and you had an idea of how to make customers really happy: “Let’s eliminate the baggage fees!” Your manager frowns at you and says: “Have you any idea how much money those baggage fees bring into this airline every month?” And that would be the last customer-obsessed suggestion you would make.

But consider the Amazon team that came up with Lambda. Some customers report up to an order of magnitude reduction in cost when they switch to Lambda. Yet the Lambda team did not have to answer the sobering question: “Do you know how much revenue Lambda might cannibalize?” Everyone understood that lower prices are good for customers, so they are good for Amazon.

How does customer obsession get all the way from a statement in a shareholder letter to the actions of front line employees? Amazon does this by creating a direct line of sight between small teams and the customers they are supposed to be obsessed with, then making the teams responsible for improving the lives of those customers in some way.

Too Big to Communicate

Around 2001, Amazon’s growth was outstripping the capability of its internal systems to keep up. The leadership team came to a pretty standard conclusion – better communication was needed. Jeff Bezos was wise enough to realize that if communication was the problem, the solution had to be less communication, not more. He wanted the company to grow much larger, and if communication was impeding growth at this early stage, they had better figure out how to operate with a lot less of it.

How did the Internet grow so large? Through a lot of independent agents following their own agendas. How does Open Source software grow? The same way. Bezos decided that Amazon should transition to the independent agent model by organizing into small, independent teams. “If you can arrange to do big things with a multitude of small teams – that takes a lot of effort to organize, but if you can figure that out – the communication on those small teams will be very natural and easy,”[3] Bezos observed.

What, exactly, does Bezos mean by a team? At Amazon: [3,4]

Teams are groups of 6-12 people with a leader who acts something like a team CEO. The leader often recruits the rest of the team, and members usually stay with a team for two or more years.
Teams are ‘separable’ [separated organizationally] and ‘single-threaded’ [work on a single thing].
Teams are responsible for a measurable set of external outcomes, usually focused on customers.
Teams decide internally both what they will work on and how they do the work.
Dependencies between teams are kept to an absolute minimum.

Once Amazon decided to structure a company composed of small, autonomous teams responsible for small, independent services, it had to figure out how to build an extensible infrastructure with these teams. That took a lot of time and experimentation, but in the end, it worked. In fact, it worked so well that Amazon decided to sell the infrastructure rather than keep it proprietary. And thus, we have Amazon Web Services (AWS), also known as ‘the Cloud’. As more and more companies move to the cloud, they would be wise to understand that before it was a system architecture, the Cloud was an organizational architecture designed to streamline communication.

The Cathedral and the Bazaar

You are probably saying to yourself about now, “Cloud architectures are fine for a digital world, but how can they possibly work for a large company?” When I first heard about AWS, I asked the same question. To echo Eric Raymond’s “The Cathedral and the Bazaar”: [5]

I used to believe there was a certain critical complexity above which a centralized, a priori approach to running a company was required. I thought that successful large companies were built like cathedrals, carefully crafted by individual wizards or small bands of magicians who orchestrated successful strategies.

Amazon’s style of organization – assemble hundreds upon hundreds of autonomous teams that decide for themselves what they are going to work on – seemed to resemble a great babbling bazaar of differing agendas and approaches out of which a coherent and stable business could seemingly emerge only by a succession of miracles.

The fact that this bazaar style seems to work, and work well, came as a distinct shock. As I learned more about AWS, I worked hard at trying to understand why the Amazon world not only didn’t fly apart in confusion but seemed to go from strength to strength at a speed barely imaginable to cathedral-builders.

Knowledge Workers

In 1999, Peter Drucker published the paper "Knowledge-Worker Productivity: The Biggest Challenge." The productivity gains of the 20th century, he noted, applied to manual labor. In the 21st century, the challenge will be increasing knowledge worker productivity, and this requires an approach opposite to the one we have been using for manual labor productivity. He wrote: [6]

“Knowledge-worker productivity requires that the knowledge worker is both seen and treated as an ‘asset’ rather than a ‘cost.’ It requires that knowledge workers want to work for the organization in preference to all other opportunities.”

“Knowledge workers [unlike manual workers] own the means of production. That knowledge between their ears is a totally portable and enormous capital asset.”

“Economic theory and most business practice sees manual workers as a cost. To be productive, knowledge workers must be considered a capital asset. Costs need to be controlled and reduced. Assets need to be made to grow.”

Drucker pointed out that there are many jobs which we might consider manual labor that involve a lot of knowledge work. One good example would be retail clerks, the subject of Zeynep Ton’s book "The Good Jobs Strategy." Ton illustrates how several retail chains have generated strong growth and higher than normal profits by paying well, training extensively, and expecting their intelligent employees to create a great experience for customers.

Knowledge workers are everywhere in our companies, and they represent a huge opportunity for improved performance, if only we learn how to see them as assets and grow their potential.

Volunteers

In his 2001 book "Management Challenges for the 21st Century," Peter Drucker pointed out that knowledge workers must be managed as if they were volunteers, because in fact, they are volunteers. During my last three years at 3M, I used 3M’s then-classic approach of ‘bootlegging’ the efforts of dozens of scientists and engineers, and led a volunteer team developing a product called ‘Light Fiber’ and the process needed to manufacture it. I learned a lot about what it takes to lead a large team of volunteer knowledge workers, and it boils down to this: Understand what energizes every person on the team and arrange for each person to do as much of what energizes them as possible. This works particularly well because people tend to be energized by what they are good at, and focusing people’s work on what they are good at creates a win-win situation.

The Light Fiber team met every Wednesday morning before regular work hours (when everyone was free). I supplied breakfast and a couple dozen people showed up to coordinate their efforts – every week, for three years! The meeting was essentially a forum where everyone got to brag about their accomplishments and make promises to each other about what they would do in the future. Even though there were no work assignments, only promises, the team accomplished amazing things.

Promises

I was reminded of this experience by the book "Thinking in Promises" by Mark Burgess. He defines a promise as a public declaration of intention by an agent. Agents can only make promises for themselves – they cannot make promises for (impose intentions on) other agents. Agents communicate about what is necessary to achieve shared goals, and then make promises to each other about their intention to contribute to the shared goal. Trust develops when agents are observed routinely keeping promises. Agents can make promises contingent on the trusted behavior of other agents, but they need a fallback plan, since the best of intentions can go awry. This sounded very familiar to me – it is a good description of what happened every week at our Light Fiber meeting. And I agree with Burgess that a system built on promises can be very reliable and robust.

Think of the Bazaar approach as a marketplace where knowledge workers can find the best places to utilize their strengths. The currency of the Bazaar is the promises made to colleagues and the trust built up by promises that are kept. Companies that function as Bazaars have discovered a secret: “Peer pressure is much more powerful than a concept of a boss. Many, many times more powerful.”[7] When people make promises to colleagues and customers, they feel a personal commitment to keep the promise. When managers impose obligations on their teams, there are many points of failure.

A Skeptical View of Proxies

Jeff Bezos’ second principle for vitality is to take a skeptical view of proxies. What are proxies? Bezos cites process as a typical example – he doesn’t think “I followed the process” is a good excuse for poor results.

The most vexing proxies in the development world are the project metrics of cost, schedule, and scope. Teams that can focus directly on the desired outcome usually perform a lot better than teams constrained by these proxies. For IT departments, ‘The Business’ is a proxy. For many businesses, profits are a proxy for delighted customers. [8] Be skeptical.

Even if someone does their homework and proves that delivering specific proxy results will surely deliver the desired end results, a direct line of sight to the desired outcomes is much better. Why? 1) Things change, but proxies prevent the team from changing accordingly. 2) Proxies tend to mask the intent or purpose of the work, diminishing engagement. 3) Proxies interfere with feedback and therefore slow things down.

High-Velocity Decision Making

This brings us to another of Bezos’ principles for maintaining company vitality: high-velocity decision making. Fast decisions are local decisions, because when a decision must be made immediately, there is no time to push it up the chain of command. In military organizations, where high-velocity decision making is a matter of life and death, front line units make local decisions based on situational awareness and their understanding of command intent. Wise commanders get very good at communicating their intent and the desired end state, so that the rapid decisions made on the front lines will be good decisions. This well-tested approach is probably the best model we have for making fast decisions in rapidly changing environments.

There are three things that get in the way of high-velocity decisions at the local (team) level:

Proxies rather than a clear understanding of the desired end state.
Permission required from management or other teams.
Punishment if the decision is wrong.

We’ve already discussed proxies.

Permission

The State of DevOps report in 2017 found that the best performing teams are those that operate without the need to obtain permission from outside the team. Obviously, wise managers should try to step back and let teams make their own decisions. But there is a bigger issue here. All too often, there are significant dependencies between teams, requiring multiple teams to coordinate their actions across a large set of interconnections. This was once the reason for long delays between software releases, but we now know that breaking dependencies is a far better strategy than catering to them.

Dependencies can be subtle, and are usually based on the system architecture. In the late 20th century, companies spent years pursuing the holy grail of integration, only to discover that integrated systems create a legacy of intertwined processes. This interconnectedness makes it almost impossible for teams to make changes without getting permission from a lot of other teams. So much for high-velocity decision making.

Punishment

Some years ago, 3M’s visionary CEO, William McKnight, made clear what he thought about punishing mistakes: [9]

“As our business grows, it becomes increasingly necessary to delegate responsibility and to encourage men and women to exercise their initiative. This requires considerable tolerance. Those men and women, to whom we delegate authority and responsibility, if they are good people, are going to want to do their jobs in their own way.”

“Mistakes will be made. But if a person is essentially right, the mistakes he or she makes are not as serious in the long run as the mistakes management will make if it undertakes to tell those in authority exactly how they must do their jobs.”

“Management that is destructively critical when mistakes are made kills initiative. And it’s essential that we have many people with initiative if we are to continue to grow.”

Eager Adoption of External Trends

To wrap up our discussion of Day 1 principles, let’s consider why Bezos thinks it’s important to adopt external trends. He says that if you pay attention to how external trends are likely to affect you and your customers, you will have a tail wind that can push you toward some interesting opportunities.

Here is an example: Today’s big trends are artificial intelligence (AI) and machine learning – voice recognition is a trendy use of artificial intelligence. Verbal interaction has not been a large part of Amazon’s retail and web services businesses, but that did not keep the company from making very strong investments in voice technologies. By late 2017, the AWS re:Invent conference centered on voice recognition and machine learning embedded in devices at the edge of the cloud. You could feel the tail wind.

Let’s assume you want to take Bezos’ advice and embrace today’s big trend – artificial intelligence. You might start by asking: Can artificial intelligence be used to help understand customers better? [We know of a company using Watson to filter social media comments in order to find key customer frustrations.] Could it be used to improve the development process? [Think automation on steroids.] What could machine learning do to increase the reliability of deployed systems? [All the data needed to discover the causes of crashes is probably in logs somewhere.] If you are not asking yourself these questions, you’re asking for a headwind.

Official Intelligence

When (not if) you find uses for artificial intelligence, it’s time to hit the pause button. If you find jobs that could be replaced by smart machines, you should ask yourself: Why? Why aren’t the intelligent people who currently do those jobs being challenged to think, to find innovative solutions to problems, to be obsessed with customers? Your challenge, should you accept it, is not to embrace artificial intelligence; it is to uncover the official intelligence that is going to waste in your organization. Don’t worry about how AI might reduce the cost of development; focus on how it might be used to leverage the knowledge and creativity of all those intelligent people in your organization.

As Peter Drucker pointed out, we know how to make manual labor more productive – in fact, artificial intelligence can be quite helpful there. But the real advances of the 21st Century will come when we figure out how to make sure that everyone in our organization is officially considered an intelligent person. Officially intelligent people don’t ask permission, they make promises. Officially intelligent people don’t need proxies, they are challenged with the end game. Officially intelligent people have jobs that are augmented by artificial intelligence, not replaced by it.

Unleashing the potential of all the bright, creative people in our organizations is the central challenge of the digital age.

_________________________
Footnotes:

1. See Alexa, the Killer App
2. The first three pillars are Marketplace, Prime, and AWS. See Alexa Could Be Amazon's "Fourth Pillar" and Why Amazon is the new Microsoft
3. See Leadership advice: How Amazon maintains focus while competing in so many industries at once. The video is worth watching.
4. See Amazon’s “two-pizza teams”: The ultimate divisional organization
5. I take great liberties paraphrasing Eric Raymond’s “The Cathedral and the Bazaar”
6. Knowledge-Worker Productivity: The Biggest Challenge by Peter F. Drucker, California Management Review vol. 41, no. 2, Winter 1999. Italics from the original.
7. In The Tipping Point, Malcolm Gladwell wrote about Gore and Associates, a well-known Bazaar company. Gladwell attributes this quote to Jim Buckley of Gore and Associates.
8. I wrote more about proxies in this post: The Cost Center Trap
9. McKnight Principles

June 16, 2016

Integration Does. Not. Scale.

In times past, there was a difference between the front office of a business – designed to make a good impression – and the back office – a utilitarian place where most of the routine work got done. The first (and for a long time the predominant) use of computers in business centered around automating back office processes, so of course, IT was relegated to the back office.

As businesses grew, various back office functions developed their own computer systems – one for purchasing, one for payroll, one for manufacturing, and so on. The manufacturing system in vogue when I was in a factory was called MRP – Material Requirements Planning. As time went on, MRP systems were expanded to the supply chain, and then to the rest of the business, where they acquired the name ERP – Enterprise Resource Planning.

Over time it became obvious that the disparate systems for each function were handling the same data in different ways, making it difficult to coordinate across functions. So IT departments worked to create a single data repository, which quite often resided in the ERP system. The ERP suite of tools expanded to include most back office processes, including customer relationship management, order processing, human resources, and financial management.

The good news was that now all the enterprise data could be found in the single database managed by the ERP system. The bad news was that the ERP system became complex and slow. Even worse, enterprise processes had to either conform to “best practices” supported by the ERP suite or the ERP system had to be customized to support unique processes. In either case, these changes took a long time.

ERP Systems Meet Digital Organizations

As enterprise IT focused on implementing ERP suites and developing an authoritative system of record, the Internet became a platform for a whole new category of software, spawning new business models that did not fit into the traditional processes managed by ERP systems. Here are a few examples:

Many software offerings that used to be sold as products are now being sold “as a service”. However, ERP systems were designed to manage the manufacture and distribution of physical products; they don’t generally manage subscription services.
Some companies (Google for example) give away their services and sell advertising. Other companies (such as eBay and Airbnb) create platforms that unite consumers with suppliers, often disrupting traditional industries. In a platform business, the most critical processes focus on driving network effects by facilitating interactions between buyers and sellers. Although ERP systems can manage both suppliers and customers, they usually do not focus on the interactions between them.
The Internet of Things (IoT) brings real time data into many processes, changing the way they are best executed. For example, predictive maintenance of heavy equipment can be scheduled based on sensor data, resulting in better outcomes for customers and thus for the enterprise. ERP suites are intended to support standard practices; they struggle to support processes that change dynamically in response to digital input.
Capitalizing on the availability of data generated by products, companies are moving to selling business outcomes rather than individual products (GE is an example). When you are selling engine thrust or lighting costs, rather than engines or lightbulbs, processes need to be focused on the customer context. ERP systems generally focus on internal processes.
ERP systems are supposed to provide a single, integrated record of important enterprise data, but that data rarely includes dynamic product performance data, information about consumer characteristics and preferences, or other information that has come to be called “Big Data”. This kind of information is becoming an extremely valuable resource, but there isn’t room in ERP databases to store and manage the massive amount of interesting data that is available.

In summary, digitization is bringing the back office much closer to the front office, providing the data for dynamic decision-making, and substituting short feedback loops and data-driven interactions for “best practices.” Since enterprise ERP suites were not built for speed or rapidly changing processes, they are increasingly being supplemented with other systems that manage critical enterprise processes.

Postmodern ERP

In the last few years, in the wake of the success of Salesforce.com, many cloud-based software services have become available. Some target the entire enterprise (NetSuite for example), but many are focused on particular areas (e.g. human resources) or particular industries (e.g. construction). These services are finding an eager audience – even in companies that have existing ERP systems. Today, about 30% of the spend for IT systems is coming from business units outside of IT [1]. If they cannot get the software they need from their IT departments, business leaders are likely to purchase cloud-based services instead.

The cloud reduces dependence on a company’s IT department, so it has become quite easy for various areas of the enterprise to independently adopt “best-of-breed” solutions specifically targeted at their needs, rather than use a single ERP suite across the enterprise. These best-of-breed systems are usually selected by line business leaders and hosted in the cloud. They tend to be faster to implement and more responsive to changing business situations than the enterprise ERP suite – partly because they are decoupled from the rest of the enterprise. Gartner calls the movement from a single ERP suite to a collection of ERP modules from multiple vendors “Postmodern ERP”[2].

Gartner warns that a multi-vendor ERP approach can lead to significant integration problems, and recommends that multiple vendors should not be used until the integration issues are sorted out. Of course, business leaders want to know why integration is important. IT departments typically respond that the ERP’s central database is the enterprise system-of-record; other ERP modules – financial reporting, for example – depend on this database for critical data. Without an integrated database, how will the rest of the enterprise be able to operate? How will the accounting department produce its required financial reports?

Integration Does. Not. Scale

But hold on. There are plenty of very large companies that work remarkably well – and produce financial reports on time – without an integrated system-of-record. In fact, internet-scale companies have discovered that integration does not scale. If we go back to the year 2000, we find that Amazon.com had a traditional architecture – a big front end and a big back end – which got slower and slower as volume grew. In the early 2000s, Amazon abandoned its integrated backend database in favor of independent services that manage their own data and communicate with each other exclusively through clearly defined interfaces.

If we have learned one thing from internet-scale players, it’s that true scale is not about integration, it is about federation. Amazon runs a massive order fulfillment business on a platform built out of small, independently deployable, horizontally scalable services. Each service is owned by a responsible team that decides what data the service will maintain and how that data will be exposed to other services. Netflix operates with the same architecture, as do many other internet-scale companies. In fact, adopting federated services is a proven approach for organizations that wish to scale beyond their current limitations.

Let’s revisit the enterprise where business units prefer to run best-of-breed ERP modules to handle the specific needs of their business. This enterprise has two choices:

Integrate the various ERP modules and store their data in a single ERP database.
Coordinate independently-maintained enterprise data through API contracts.

The problem with the first option is that integration creates dependencies across the enterprise. Each time a data definition in the central database is added or changed, every software module that uses the database must be updated to match the new schema. This makes the integrated database a massive dependency generator; the result is a monolithic code base where changes are slow and painful.
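The coupling can be seen in miniature. The following is a minimal sketch, not a description of any real ERP product: an in-memory SQLite database stands in for the central database, and the table, column, and function names (`customers`, `billing_lookup`, `shipping_lookup`, `broken_modules`) are invented for illustration. Two modules read the shared table directly; when one column in the central schema is renamed, both modules fail until their queries are rewritten.

```python
import sqlite3

# Hypothetical stand-in for the central ERP database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme')")


def billing_lookup(conn, customer_id):
    """Billing module: reads the shared table directly."""
    row = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0]


def shipping_lookup(conn, customer_id):
    """Shipping module: also reads the shared table directly."""
    row = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return f"Ship to {row[1]}"


def broken_modules(conn, modules):
    """Return the names of modules whose queries no longer match the schema."""
    broken = []
    for module in modules:
        try:
            module(conn, 1)
        except sqlite3.OperationalError:
            broken.append(module.__name__)
    return broken


modules = [billing_lookup, shipping_lookup]
before = broken_modules(db, modules)

# One change to a shared data definition: "name" becomes "legal_name".
db.execute("DROP TABLE customers")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, legal_name TEXT)")
db.execute("INSERT INTO customers (id, legal_name) VALUES (1, 'Acme')")

after = broken_modules(db, modules)
print("broken before change:", before)
print("broken after change:", after)
```

With two modules the repair is trivial; the point of the sketch is that the list of broken modules grows with every module that reads the shared schema, and all of them have to be fixed and redeployed together.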

Enterprises that want to move fast will select the second option. They will move to a federated architecture in which each module owns and maintains its own data, with data moving between modules via very well defined and stable interfaces. As radical as this approach may seem, internet-scale businesses have been living with services and local data stores for quite a while now, and they have found that managing interface contracts is no more difficult than managing a single, integrated database.
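To make the idea concrete, here is a minimal sketch of two federated modules, written under stated assumptions: the class and method names (`CustomerService`, `ShippingService`, `CustomerSummary`, `get_customer`, `rename_internal_field`) are hypothetical, and in-process method calls stand in for what would normally be network APIs. Each module keeps its own private data, and the only thing they share is a small, stable interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerSummary:
    """The contract: the only customer data other modules may rely on."""
    customer_id: int
    display_name: str


class CustomerService:
    """Owns customer data; the storage layout is private to the owning team."""

    def __init__(self):
        # Private storage: only this service reads or writes it.
        self._rows = {1: {"name": "Acme"}}
        self._name_key = "name"

    def get_customer(self, customer_id: int) -> CustomerSummary:
        row = self._rows[customer_id]
        return CustomerSummary(customer_id, row[self._name_key])

    def rename_internal_field(self, new_key: str) -> None:
        """An internal schema change made by the owning team alone."""
        self._rows = {
            cid: {new_key: row[self._name_key]}
            for cid, row in self._rows.items()
        }
        self._name_key = new_key


class ShippingService:
    """Owns shipment data; sees customers only through get_customer."""

    def __init__(self, customers: CustomerService):
        self._customers = customers
        self._shipments = []  # private to the shipping team

    def create_shipment(self, customer_id: int) -> str:
        summary = self._customers.get_customer(customer_id)
        label = f"Ship to {summary.display_name}"
        self._shipments.append(label)
        return label


customers = CustomerService()
shipping = ShippingService(customers)

label_before = shipping.create_shipment(1)
customers.rename_internal_field("legal_name")  # storage changes, contract does not
label_after = shipping.create_shipment(1)
print(label_before, "|", label_after)
```

In this sketch the customer team can restructure its storage whenever it likes, because the shipping module never sees anything except `CustomerSummary`. The cost is that the contract itself has to be kept stable and agreed with consumers.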

What Scales

Assume that every team responsible for a process can choose its own best-of-breed software module and is responsible for maintaining its own data in appropriately secure data stores. Then maintaining an authoritative source of data becomes an API problem, not a database problem. When the system-of-record for each process is contained within its own modules, new modules can be added for handling software-as-a-service, two-sided platforms, data from IoT sensors, customer outcomes or other new business models that may evolve. These modules will exchange a limited amount of data through well-defined APIs with the credit, order fulfillment, human resources, and financial modules. Internally, the new modules will collect, store, and act upon as much unstructured data and real time information as may be useful. More importantly, these modules can be updated at any time, independent of other modules in the system. In addition, they can be replicated horizontally as scale demands.

It is the API contract, not the central database, that assures each part of the company looks at the same data in the same way. Make no mistake, these API contracts are extremely important and must be carefully vetted by each data provider with all of its consumers. API contracts take the place of database schemas, and data providers must ensure that their data meets the standards of a valid system-of-record. However, changes to an API contract are handled differently than most database schema changes. Each change creates a new version of the API; both old and new versions remain valid while other software modules are gradually updated to use the new version. A wise API versioning strategy eliminates the tight coupling that makes database changes so slow and cumbersome. Federation scales, while a central database approach does not, because with a well-defined API strategy individual modules are not dependent on other modules, so each module can be deployed independently and (usually) scaled horizontally.
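One way to picture side-by-side contract versions is the sketch below. It is an illustration under assumptions, not a prescription: the names (`CustomerApi`, `CustomerV1`, `CustomerV2`, `get`) and the version labels `"v1"` and `"v2"` are invented here, and a real provider would expose these versions over a network rather than as method calls. The provider stores data in its current form and keeps the old contract valid by deriving it from that data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerV1:
    """Original contract: a single name field."""
    customer_id: int
    name: str


@dataclass(frozen=True)
class CustomerV2:
    """New contract: the name is split into legal and trading names."""
    customer_id: int
    legal_name: str
    trading_name: str


class CustomerApi:
    """Hypothetical data provider that serves two contract versions at once."""

    def __init__(self):
        # Private storage, held in the newest shape.
        self._store = {1: ("Acme Corporation", "Acme")}

    def get_v2(self, customer_id: int) -> CustomerV2:
        legal, trading = self._store[customer_id]
        return CustomerV2(customer_id, legal, trading)

    def get_v1(self, customer_id: int) -> CustomerV1:
        # v1 stays valid: it is derived from the current data.
        current = self.get_v2(customer_id)
        return CustomerV1(current.customer_id, current.trading_name)

    def get(self, version: str, customer_id: int):
        handlers = {"v1": self.get_v1, "v2": self.get_v2}
        if version not in handlers:
            raise ValueError(f"unsupported contract version: {version}")
        return handlers[version](customer_id)


api = CustomerApi()
legacy_view = api.get("v1", 1)   # a consumer that has not migrated yet
current_view = api.get("v2", 1)  # a consumer already on the new contract
print(legacy_view)
print(current_view)
```

Consumers move from `"v1"` to `"v2"` on their own schedules. Once no consumer asks for `"v1"`, the provider can retire it; how long to keep old versions alive is a policy decision that this sketch leaves open.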

When you think of Enterprise ERP as a federation of independent modules communicating via APIs (rather than a database), the problems with multi-vendor ERP systems fade because the system-of-record is no longer a massive dependency-generator that requires lockstep deployments. With a federated approach, business leaders can move fast and experiment with different systems as they become available, and still synchronize critical enterprise data with the rest of the company. In addition, similar processes in different parts of the enterprise can use different applications to meet their unique needs without the significant tailoring expense encountered when a single ERP suite is imposed on the entire enterprise.

What about Standardization?

Won’t separate ERP modules lead to different processes in different parts of the enterprise? Yes, certainly. But the question is – under what circumstances are standard processes important? In the days of manual back office processes, there was a lot of labor-intensive work: drafting, accounting, phone calls, people moving paperwork from one desk to another. Standardization in this kind of operating environment made sense and could lead to significant efficiencies. But in a digitized world, the important thing is not uniformity; it is rapid and continuous improvement in each business area. Different processes for different problems in different contexts can be a very good thing.

Jeff Bezos agrees; he believes that the only path to serious scale is to have a lot of independent agents making their own decisions about the best way to do things. This belief was a key factor in the birth of Amazon Web Services, a $10 billion business that keeps on growing. Amazon began its journey away from a big back end by creating small, cross-functional teams with end-to-end responsibility for a service. These teams designed their own processes to fit their particular environment. Amazon then developed a software architecture and data center infrastructure that allowed these teams to operate and deploy independently. The rest is history.

In Conclusion

It is time for enterprise processes to become federated instead of integrated. This is not a new path – embedded software has used a similar architecture for decades. Today, almost every successful internet-scale business has adopted some type of federated approach because it is the only way to scale beyond the limitations of the enterprise.

As digitization brings back-office teams closer to consumers and providers, they must join with their front-office colleagues and form teams that are fully capable of designing and improving a process or a line of business. These “full stack” teams should be responsible for managing their own practices, technology and data, meeting industry standards for their particular areas. They should communicate with other areas of the enterprise on demand through well-defined interfaces.

The good news is that you can gradually migrate to a federation from almost any starting point, including an enterprise-wide ERP system. Even better, as IT moves from enforcing compliance with the company’s ERP system to brokering interface contracts and ensuring data security, it becomes a business enabler rather than a bottleneck. And best of all, responsible full stack teams that solve their own problems will create attractive jobs for talented engineers and give business units control over their own digital destiny.

February 10, 2016

The New Technology Stack

Over the last two decades, the software technology stack has undergone a rapid evolution, as this diagram from Docker.io lays out.

The evolution continues. Today’s world of smart phones is giving way to tomorrow’s world of smart devices with sensors and actuators and not much more. The app layer will only get thinner.

If you think this trend will not affect your organization, think again. Tony Scott, CIO of the US federal government, advised CIOs throughout the country to move to the cloud as fast as possible. Why? Because the large cloud providers can provide more secure, less expensive, and more reliable infrastructure than most organizations can provide for themselves. Major industries, from banking to health care, are discovering the benefits of moving to the cloud. Thin apps and assembled services running on off-premises hardware will soon become the norm for most organizations, probably even yours.

What does the cloud have to do with software development? Quite a bit, it turns out. In the cloud:

1. The development team is responsible for product design.
Assembling services is a dynamic process, not a one-time affair.
The thin app is often the only differentiator in the stack.

2. The development team is responsible for its own infrastructure.
When infrastructure is code, one team does it all:
design/code/test/deploy/monitor/maintain.
Keeping things running is a new challenge for many software engineers.

3. Apps must be immune to infrastructure and service failure.
Stateless designs replace object-oriented designs.
Distributed, immutable data sets replace databases.
Things get done through producer/consumer chains.
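As a rough illustration of the third point, here is a small Python sketch of a producer/consumer chain built from stateless stages that pass immutable records along. The `Reading` type, the stage names, and the temperature threshold are all made up for this example; a real cloud system would typically put a queue or stream between the stages rather than chaining them in one process.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Reading:
    """An immutable record; stages create new records, never edit old ones."""
    sensor_id: str
    celsius: float
    alert: bool = False


def produce(raw: Iterable[Tuple[str, float]]) -> Iterator[Reading]:
    """Producer: turn raw tuples into immutable readings."""
    for sensor_id, celsius in raw:
        yield Reading(sensor_id, celsius)


def flag_hot(readings: Iterable[Reading], threshold: float = 30.0) -> Iterator[Reading]:
    """Stateless stage: emits a new record with the alert flag set."""
    for r in readings:
        yield replace(r, alert=r.celsius > threshold)


def alerts_only(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Consumer-side filter: pass along only the flagged readings."""
    return (r for r in readings if r.alert)


def run_chain(raw: Iterable[Tuple[str, float]]) -> List[Reading]:
    """Wire the stages together: produce -> flag_hot -> alerts_only."""
    return list(alerts_only(flag_hot(produce(raw))))
```

Because no stage keeps state between records, any stage can be restarted or run as several parallel copies without coordinating with the others.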

So here’s the point: Practices designed for the problems of 1995 are not going to work for the problems of 2020. We need to frame today’s and tomorrow’s problems in a way that helps us to identify and tackle them effectively; we need to use fundamental principles to help us ask the right questions. [1]

What are the right questions? Consider this guidance from Taiichi Ohno, the father of Lean:

All we are doing is looking at the time line, from the moment the customer gives us an order to the point when we collect the cash. And we are reducing the time line by reducing the non-value adding wastes.

In the product development world, our timeline starts with a consumer problem instead of a customer order:

We look at the time line from the moment our consumers experience a problem until that problem is resolved. And we reduce the time line by reducing the non-value adding friction.

The technology stack of 1995 generated different kinds of friction than you will find in a modern technology stack. When banks moved to mobile apps a few years ago, they discovered that app development requires an agile approach because the underlying platforms change all the time. While the old technology stack resisted agile practices, the cloud demands them. There is no place for large projects or long release cycles in the new technology stack; agile development is simply table stakes - you need it to play the cloud game.

The new technology stack produces its own friction, a different kind of friction than was typically found in the old stack. This friction is particularly strong in organizations moving from the old to the new technology stack because the transition brings a lot of change to software development. Unfortunately, that change is not always well supported by the organization or welcomed by the software engineers.

Friction Generator #1: Since the new technology stack virtually requires small deployments, the development team can - and should - become deeply involved in designing differentiated products using tight feedback loops. In short, the development team becomes a product team. But frequently this product team does not have the right people (designers, for example), the authority, or the process to make dynamic product decisions. Too often development teams are told what to develop, rather than being asked to move business measures in the right direction. A lot of friction can occur if the organizational structure does not support the concept of fully responsible product teams.

Friction Generator #2: The development team must engineer solutions to quality, reliability and resilience issues that arise after deployment. This requires a different mindset than was common with the old technology stack, when the development team sent their code to the ops department, whose job it was to keep the system running. In the cloud, a team procures and releases to its own infrastructure, and there is no one else to deal with the inevitable problems that occur. Product teams must have the capability, the charter, and the mindset to accept 24/7 responsibility for their deployed code.

Friction Generator #3: The new technology stack is designed to be fault tolerant, not failure proof. This means that any service or app must be able to fail and get restarted at any time, and not produce problems due to these interruptions. But writing "restartable" code [idempotent modules with immutable data sets] is new to most software engineers and is rarely taught in schools. Software engineers skilled at writing code for the new technology stack are in short supply and demand is intense. Good leadership, training, and support are required to help interested software engineers transition to the new languages and paradigms needed to thrive in the cloud.
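For readers who have not written "restartable" code before, the sketch below shows one common way to get there, under simplifying assumptions: every event carries a unique id, events are immutable, and the handler returns new state instead of modifying the old. The `Payment` type and `apply_events` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class Payment:
    """An immutable event with a unique id used for de-duplication."""
    event_id: str
    account: str
    amount_cents: int


def apply_events(
    balances: Dict[str, int],
    processed: FrozenSet[str],
    events: Iterable[Payment],
) -> Tuple[Dict[str, int], FrozenSet[str]]:
    """Apply events idempotently and return new state.

    Events whose ids were already processed are skipped, so replaying
    the same log after a crash does not double-count anything.
    """
    new_balances = dict(balances)
    new_processed = set(processed)
    for event in events:
        if event.event_id in new_processed:
            continue  # already applied before the restart
        new_balances[event.account] = (
            new_balances.get(event.account, 0) + event.amount_cents
        )
        new_processed.add(event.event_id)
    return new_balances, frozenset(new_processed)
```

Suppose the service handles the first event and then fails. After the restart it can simply replay the whole log from the beginning:

```python
log = (Payment("e1", "acct", 500), Payment("e2", "acct", 250))
state = apply_events({}, frozenset(), log[:1])  # interrupted after e1
state = apply_events(*state, log)               # restart: replay everything
```

The second call skips `e1` and applies only `e2`, so the final balance is the same as if there had been no interruption.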

Friction Generator #4: The old technology stack and associated batch processes encouraged extensive outsourcing, leaving many IT departments without software engineers or even data centers. Today, as software drives differentiation, many firms are attempting to bring software technology back in-house. But they often lack the management experience, organizational structure and personnel policies necessary to attract and retain the skilled software and reliability engineers they need for the new technology stack.

Today, almost every business has to face the fact that their most serious competition is likely to come from companies living in the new technology stack, unencumbered by the old way of doing things. Governments and non-profits must realize that the people they serve have their expectations set by experiences with the cloud. If your organization is living in the old paradigm, it’s time to move on; big back end systems are rapidly becoming the COBOL of the 21st century.

To assess the current situation, take a look at the value stream – the stream of activities that deliver value to customers – and identify areas of friction. In the modern technology stack, friction generators tend to be either deeply technical or highly organizational in nature, as you can see from the discussion above. Unfortunately, these are not usually the problems that companies tackle when they move to modern software development. Why? Quite often the organizational structure is so entrenched that changing it is not considered. Or perhaps the people leading the transition do not understand the underlying technology and the problems presented by the new stack. In either case, the underlying problem becomes an elephant in the room that everyone ignores, while easier challenges - like adopting agile processes - are taken up.

It is important to confront the deep-seated friction generators that people would rather ignore. Start by talking about the elephant, and then actively imagine what your world would be like without that elephant. Once you have a clear vision of the future, you can work out how to move constantly toward that vision by eliminating the most pernicious friction generators, one step at a time. This approach has helped teams and organizations around the world make steady progress in the right direction, and eventually the steady progress adds up to amazing accomplishments.

Identifying, addressing, and overcoming challenging problems is one of the most engaging activities there is. People thrive when their day-to-day work involves getting good at conquering meaningful challenges. Companies do much better when they wake up the sleeping giant in each employee by encouraging them to reduce the friction that gets in the way of delivering value to customers.

If your company is not the highly successful leader-in-its-field that you hoped it would be (and no company ever is), then waiting around for things to change is not likely to make the situation better. Round up your colleagues and assess the situation. Find the elephant in the room and imagine what things would be like if it were gone. And then – since you are smart engineers – you need to engineer a way to get that elephant out of the room. Quit waiting for someone else to do this for you. You’re on.
______________________
Footnote:
1. One proven set of principles for tackling tough technology problems is the Lean principles: Focus on Customers, Energize Workers, Reduce Friction, Enhance Learning, Increase Flow, Build Quality In, Keep Getting Better.