An Agile-Lean Transformation: How Getty Images Transitioned to Scrum and Evolved to Kanban
Find out how applying Agile, Lean, and Kanban thinking helped the IT organization at Getty Images adopt a culture of continuous improvement, moving them first from waterfall, then to sprints and timeboxes, and finally to a continuous flow of features. Read their story below.
Good Migrations: Getty Images Scrumban Marathon
This is an excerpt from Beyond Agile: Tales of Continuous Improvement © 2013, a collection of 10 case studies showing how Lean, Agile, and other schools of management thought can support continuous improvement.
Company: Getty Images
Location: Seattle, Washington State, United States
Industry and Domain: Media
Insights by: Jeff Oberlander
Jeff is currently the Director of Delivery Leadership at AIM Consulting Group in Seattle, Washington. At the time of this story, he was the Senior Director of Application Development at Getty Images. He has more than 25 years of professional software leadership, development, design, and architecture experience at both large and small companies throughout the American Northwest. Jeff is an expert practitioner, coach, trainer, and mentor in Lean and Agile principles and practices, working as a developer on XP teams, transforming teams to Scrum, and working as an enterprise Lean-Agile coach.
In 2005, Getty Images was building their software using traditional waterfall practices. At the time, they were generating over US$500 million in annual revenue, most of which flowed directly through their e-commerce website. The site was seeing 11 million unique visitors per month, delivering over 10 million images, and handling 6 million searches per day. This hugely popular site was a 24-7 global cash register for Getty Images and any site downtime affected revenue instantly.
The Getty Images customer base included major advertising agencies and companies across the world, as well as heavy photo users like Sports Illustrated, People Magazine, Time-Life, and nearly all American morning newspapers. Content delivery had to respond as fast as new worldwide events occurred. Time to market of new features was critical.
Although Getty Images had built a booming e-commerce business between 1995 and 2005, the company found itself with a technology platform that had accumulated a significant amount of technical debt, making it difficult to support further business growth. The development process, as well as the underlying code and technology platform, were all creating delays in bringing enhancements to market. Simply put, the business had outgrown its current processes and technology. Changes were needed on many fronts if Getty Images were to remain as successful as before.
In August 2005, Getty Images duly embarked on the ambitious task of rebuilding its primary website and all of its primary back-end processing systems from the ground up. This included the image ingestion process, controlled vocabulary for tagging images, and the image search engine. (Nothing much really, just the heart of the entire web site.) It almost goes without saying that this was a critical undertaking for the company, both technically and operationally, and most certainly strategically.
As is often the case with such critical undertakings, there were a few hiccups along the way…
The “Web Vision” project, as it was dubbed, included over 20 project teams and over 150 people, including operations. The project ran for more than two years—a year longer than originally planned. When the new site was eventually delivered, its performance was terrible. It was extremely slow from the end-user perspective and many of the features did not make sense to them, leaving Getty Images with a slew of upset and frustrated customers.
This created the risk of losing both new and existing customers to competing sites. For the following four months, Getty Images found themselves fixing problems, unable to add any enhancements to the site that could deliver business value.
As high-profile and important as the site was to Getty Images, the executive team simply had to find a better way to do things. They could not continue to place their revenue stream at risk by continuing to develop and maintain the gettyimages.com site using processes and technology that were patently not working for them. They had to step out of their comfort zone, and quickly.
In search of better answers, the leadership team attended a Lean-Agile Conference for the first time in January 2008.
Enthused by what they saw and learnt, they came back to Getty Images committed to bringing Lean-Agile thinking into the Waterfall world of Getty Images. This is the story of their journey—of how they rolled out Scrum and subsequently Kanban across their entire enterprise of over 25 teams and 200 people, and the lessons they learnt along the way.
The Frying Pan and the Fire
In 2005, the Getty Images web site was under severe strain, seriously affecting both business value and work conditions. Clients and stakeholders could not get new features into the system in time to meet the changing demands of their customers or the market. Production bugs were at an all-time high and software releases took weeks to stabilize both before and after launch.
Production support costs were sky-rocketing. The technical teams were spending long nights and weekends at the office trying to resolve the myriad of issues.
The development teams in particular were bearing a large measure of the pain. With deadlines looming, they regularly endured working late into the night, only to discover breaking issues in the final hour that caused management to hold off on the release altogether. Moreover, after a release, the development team had no insight into customer satisfaction. There was no feature feedback from either customers or stakeholders during development cycles in this period. For the development teams, this was like shooting blindfolded.
Planning for development also involved a considerable amount of wasted effort. Development work was planned using micro-estimation down to the hour by team leads or managers. These estimates were wildly inaccurate and served little purpose. Product owners would write extensive use case scenario documents that didn’t help developers do their work. On top of that, people were organized in silos according to organizational structure or specialization, creating an unnecessary communication barrier between functional teams when collaborating on a single feature.
In this environment, everyone felt like they were failing. The business could not get the work they needed from IT to meet the demands of the business, and the development and technology services teams worked long hours only to end up with frustrated and unhappy business stakeholders.
Delivery of Requirements
With the way things were at Getty Images, a business owner who wanted a feature that might take 3-4 days to develop would have to wait a minimum of 3 months to see it materialize on the website. This is because software releases were scheduled every 2 months and requirements had to be defined 1 month before the release cycle. In effect, any feature, regardless of priority, size, or business value, had a minimum lead time to market of 3 months. In theory, that is.
Note how Jeff says “everyone felt like they were failing” and that “everyone” means developers, managers, literally everyone. The team felt their work wasn’t good enough. The business felt it wasn’t keeping up with market needs. Managers felt stuck in the middle. This low morale frustrated the entire organization.
Typically, it was much longer because teams first had to plan which features would hit the requirements definition cycle first. Throughout this timeline, requirements would change, features would be moved in and out of releases, and time-to-market continued to stretch while business owners were spending time on wasteful work, creating feature inventories. In reality, time-to-market was 6 months or more in most cases. This was incredibly demoralizing for staff.
This IT department of more than 200 people was structured as follows: Tech Services (the traditional operations group), the Project Management Office (PMO), Quality Assurance (QA), and Application Development. Each of these areas had its own Vice President (VP). Oddly, the PMO and QA were organized under the same VP but separated from development. The developers were organized under multiple directors according to their specialties, and never really cross-pollinated across those specialties. Not a single team was cross-functional.
Handoffs and Dependencies
Within this organizational structure, cross-team dependencies often affected productivity adversely. In the typical lifecycle of a feature, Project Managers (PMs) started by writing requirements for the feature. Lead developers would then write application architecture documents for it, developers would implement it, and finally it was handed off to QA for testing.
The hidden opportunity cost reflected in this convoluted requirements cycle is staggering. It goes beyond demoralizing your staff. This is business value being held ransom by inefficient process. The process itself creates additional waste making a self-perpetuating cycle geared against release of quality product.
Once a feature entered the testing stage, developers and testers iteratively passed the feature back and forth to each other, each time making minimal effort to address each other’s needs. Bug tracking systems abounded. Issues bounced from one team to the next, and all teams struggled to achieve the ever-elusive Zero Bug Bounce (ZBB) and Zero Resolved Bugs (ZRB) statuses before a release. The last person holding the bug felt the wrath of the VPs looking at open bug lists before the drop-dead date on release windows.
Broken process leading to low morale and productivity
Given the nature of requirements flow and change, coupled with a waterfall process that relied on strict deadlines and handoffs, and the siloed and specialized nature of teams, very little business value was being delivered, yet there was high cost to the business. Simply put, the Information Technology (IT) organization was unable to effectively meet the demands of Getty Images’ digital media business in the time required by this very competitive market. The staff in Getty Images IT always felt busy and overworked and that they were failing to satisfy their business stakeholders.
Infancy of Transition
An unofficial Scrum team in a Waterfall world
In 2005, Jeff was in charge of the Search team, which was tasked with rebuilding the company’s search engine for the new Web Vision project. Before joining Getty Images, he had spent the previous 5 years adopting and leading Agile practices in other companies. Upon arriving at Getty Images, Jeff transitioned his team to Scrum, making it the only Agile team in a Waterfall organization. This Scrum setup had a virtual cross-functional team that included developers, testers, and a project manager. He worked with the project manager to establish a small product backlog, and then worked with the team to establish a sprint. The team set a technical goal for their first sprint, and proceeded each day with daily Scrums.
At the end of the first sprint, Jeff conducted a sprint review for the business stakeholders, the Project Management Office group, and many others from the technology management team. People were thrilled to see the makings of the new search engine live only one month into the start of the project. This was the beginning of the first cross-functional and agile team at Getty Images.
FAST CHANGE SHOCKS SYSTEMS
Process adoptions often fail because of attempts to “drive change” rather than nurturing it to grow organically. Slower, more deliberate change tends to be more stable because more of the people involved understand why the change is occurring and what the specific changes are expected to provide.
The search team continued to adopt and ingrain this process for themselves over the next two years. At first, they had four-week sprints and monthly demos. The demos were not only about delivering requirements, but also focused on improving stakeholders’ visibility into the team process. The team could not influence the process of planning releases, nor could they change any deadlines or requirements, but they changed the development process within their team, as that was something they could actually influence.
They employed every Scrum tactic and tool. The team had its own product backlog, burndown charts, daily Scrums, and retrospectives. To create the new product backlog, the project manager wrote up stories based on material she pulled out of the larger out-of-date requirements documents.
Over time, other development teams started participating in the Search Team demo and sprint reviews. The demos were increasingly frequented by new teams, as well as their accompanying stakeholders, including executive management, program managers, and anyone interested in visibility into what was happening in development (which was most people).
It is also deeply motivating for the people doing the work to be given the opportunity to deliver achievable, measurable tasks that, when seen together, combine to form an important whole.
As the demos became more popular, the Search Team developed a reputation for its productivity and predictability—delivering sprints of work month after month reliably. Since this was new development which would eventually launch a new website, the sprints were building up to the final product which would, when complete, flip all customers from the old web site to the brand new one.
Compared to many of the other teams, the Search team’s ongoing success month after month was extremely refreshing for the company. Their performance started receiving a lot of attention from the executive leadership, enabling Jeff to hire another agile development manager. This manager also converted his team to Scrum, and they too showed an incredible productivity improvement in a very short timeframe.
Scrum and Agile were starting to catch on in Getty Images.
After investing two extremely long and arduous years, only to be hammered repeatedly by poor customer experience, the executive team was finally ready for some big process changes. In January 2008, Jeff organized a leadership outing to a free one-day Lean-Agile overview workshop in Seattle. The SVP of Technology, the SVP of E-commerce, and several VPs, PMs, and Getty Images management staff attended. The principles discussed were very enthusiastically received by the entire group. Given that there were proven success stories within Getty Images at this time, a decision was made to convert the whole development organization to Scrum and focus on Agile principles as an organization.
The question on the table was how. While converting one team at a time was a nice, safe concept, the reality was that, given all of the dependencies between teams and Getty Images’ tightly coupled release system, everyone really had to make the jump at the same time. Fortunately, by this time, several grassroots agile advocates and leaders were appearing on the floor: developers, project managers, and Product Owners. Jeff also had agile partners in the PMO, who were very enthusiastic advocates and coaches within their own group. So even though the IT organization was still siloed, the leadership and coaching partnership between IT and the PMO really set the stage for success.
The first order of business was to create independent, cross-functional teams. This change meant that there would no longer be a separate Quality Assurance group in the organization. All teams would henceforth consist of both developers and testers under a single manager. Each team was given full vertical ownership of a particular product.
Product Owners and Backlogs
The next step was redefining the roles of project managers in the Project Management Office and converting them to Product Owners. These newly minted Product Owners started creating product backlogs and wrote user stories for the teams they were designated to. For the first time user stories, product backlogs, and acceptance test concepts were being used across the enterprise.
Automated Testing Up Front
Another big change followed soon after—moving testing up front in the process and automating it. At the time, Getty Images had many black box UI testers who were used to testing the product visually from the outside, and did not have a high level of technical skill. This change required the business to hire several new testers who could program. Many internal people were also re-trained.
Single Week Sprints
With the fundamentals in place, the basic Scrum framework was rolled out across all of the teams. One of the key decisions here was to put all teams on the same sprint schedule. The leadership decided the best way to move the concepts along quickly was to make the sprints as small as possible. So they chose one-week sprints across the board. This was extremely uncomfortable for most people and teams. They wondered how they would write requirements, develop, and test in a week and actually get to Done—where Done meant a ready-for-production release story. This was precisely the Waterfall mindset that the short sprint cycles set out to break.
DISCOMFORT AS A TOOL
Used well, as they were here, sprints can achieve very positive results because they demand an entire re-imagining of how work gets done, as long as the integrity of the story (actual, workable software) is maintained. Here we see the unnatural one-week delivery cycle being so uncomfortable that it invited immediate focus and promoted future experimentation.
It forced teams to move testing to the front, to automate, and to not get too bogged down by design up front. The transition was a lot of hard work. Teams had to figure out, in their own organic way, how to develop and test collaboratively, sometimes even at the same time.
All previous estimation techniques were set aside in favor of story points. Each team had its own scale, although most used the Fibonacci scale. Teams were coached on the value of relative sizing, and the use of actual figures was strongly discouraged. The primary benefit taught was that story points were a tool to help the team determine what they could get done in a week.
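As a rough illustration of how story points can help a team judge what fits into a one-week sprint, the sketch below forecasts capacity from the average of recent sprint velocities and then walks the backlog in priority order, taking each story that still fits. The story names, point values, and the averaging rule are all assumptions made for illustration; the source does not describe how individual Getty Images teams did this.

```python
from statistics import mean


def plan_sprint(backlog, recent_velocities):
    """Select stories in priority order that fit the forecast capacity.

    backlog: list of (story_name, story_points), highest priority first.
    recent_velocities: points completed in recent sprints (assumed data).
    Stories that would exceed the remaining capacity are skipped.
    """
    capacity = mean(recent_velocities)  # simple average as the forecast
    selected, used = [], 0
    for name, points in backlog:
        if used + points <= capacity:
            selected.append(name)
            used += points
    return selected, used


# Hypothetical backlog sized on a Fibonacci scale.
backlog = [
    ("search filters", 5),
    ("lightbox sharing", 8),
    ("checkout copy fix", 2),
    ("image zoom", 3),
]
selected, used = plan_sprint(backlog, [12, 9, 12])
print(selected, used)
```

With an average velocity of 11 points, the 8-point story does not fit after the first 5-point story, so the sketch skips it and fills the remaining capacity with the two smaller stories.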
Getty Images realized that a wholesale Agile transition could not be achieved successfully without thorough training at all levels of the organization. To start with, all Getty Images IT executives, managers, and project managers were given an overview of Lean-Agile over the course of two days. After that, each team was required to find a ScrumMaster. The chosen ScrumMasters all received Certified ScrumMaster (CSM) training. Each team was also assigned a Product Owner from the Project Management Office, and each Product Owner received Certified Product Owner training.
Creating 25 product backlogs and allowing visibility across the enterprise of each backlog called for the adoption of a Scrum tool that could work across the enterprise. With so many backlogs, the teams needed more clarity and a single place to store the backlogs and product roadmaps. All development teams migrated to a new Agile enterprise tracking tool at once. The teams now had a new process and tools to help make their new approach visible in the organization.
Embedding the Process and Principles
The Getty Images leadership recognized early on that once-off class training would not be sufficient to fully embed Lean-Agile principles and processes across so many teams and people, especially once those people left the cozy classroom, and went back to their work areas to face the reality of delivering product to market. To augment the initial training, Getty Images established an ongoing internal training program that incorporated a Lean-Agile steering committee, designated internal Agile coaches, and set forth a series of standing weekly training meetings.
Informal but sanctioned learning drives intentional organizational change and frees us from overly focusing on the team level.
The Lean-Agile steering committee was composed of a cross-section of people from upper management and was run by Jeff. The committee served as a check-in point to give visibility to progress made and to flag areas in the organization where internal struggles were occurring. Discussing these struggles was essential so that people could explicitly examine both the personal and team issues that were bound to occur with such a big organizational change. Jeff and one other colleague were designated the go-to Agile coaches for the company.
As coaches, they set up training sessions with each team to walk through Lean and Agile principles. The foundation of their approach was to focus on principles rather than the process—the “why” behind the changes. The coaches established weekly training sessions with all the ScrumMasters where they walked them through agile training content and scenarios. Jeff also instituted a weekly Lean-Agile Q&A which was open to everybody in IT. This was a standing meeting where people could come, ask questions, and bring their issues regarding the change to the table for discussion.
From Scrum to Kanban
Getty Images worked in this Scrum mode for two more years with a continuing focus on coaching and mentoring the principles behind Lean and Agile development, continuous improvement amongst the teams, maturation in story writing, acceptance test-driven development, automation, pair programming, and team metrics (e.g. velocity, delivery of value). Along the way, there were two specific process issues that were never fully resolved.
For one, changing business priorities regularly made it difficult for Product Owners to have great stories ready at the start of a sprint. Although sprints were only one week, and that might seem to be a short enough iteration, it wasn’t. Business needs would change within a few days.
The other issue was being felt in the development teams. Teams found themselves arbitrarily splitting work that was naturally larger than one week into multiple stories. Often, one or both of the smaller stories did not actually deliver business value in isolation. They were writing the stories to fit the sprint, rather than writing the stories to achieve business value. Teams were also reluctant to take on new work towards the end of a sprint (e.g. Thursday or Friday) because doing so meant that they would risk not meeting their sprint goal—a success measure that was actively monitored in the organization.
These two challenges were hampering the smooth flow of work and the delivery of business value.
Initially, Getty Images could not deliver, and much of this was due to a lack of focus. Scrum helped teams improve and release, but as the organization matured, time boxes started to show their limitations: breaking down work beyond what is valuable or shippable and interfering with the natural flow of work.
As with the early adoption of Scrum, a few mature teams started working with Kanban. They put up physical kanban boards and started using Kanban in the context of their Scrum process. Kanban proved to be a highly successful model for these teams. They no longer needed to think about how to split a story arbitrarily, but could focus on story completion, swarming around a single story if required, and achieving consistent throughput and flow of work.
Seeing the success achieved by these teams, Getty Images management recognized that Kanban could help them address the anti-patterns that had evolved in the development teams. More and more teams switched to Kanban and dropped the Scrum framework altogether so they, too, could benefit from the more flexible approach to scheduling and having stories to implement that were actually meaningful. They did, however, retain many of the positive continuous improvement and communication practices such as retrospectives and daily standups.
While team-level Kanban started solving issues within teams, it didn’t address the other big problem: business priorities changing mid-sprint, just before sprints started, or faster than Product Owners could do any analysis. So in February 2011, Getty Images made another big leap. They switched to an enterprise pull model. Part of this transition included moving away from user stories to what Getty Images called Minimum Valuable Features (MVFs), based on the concept of Minimum Marketable Features (MMFs), yet acknowledging that some features add value before being shipped to market.
There was a single enterprise backlog of MVFs, prioritized by the PMO. Pull was regulated by the Development directors, who were responsible for pulling from the top of the backlog whenever a team had the capacity (and capability) to take on work. Pulling an item signaled a commitment to that work, and Product Owners now started writing stories for a feature only once it had been committed to.
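The pull rule described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a description of Getty Images’ actual tooling: the `Team` and `Feature` classes, the skill tags used to model capability, and the work-in-progress limit used to model capacity are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    title: str
    required_skill: str  # hypothetical tag modelling "capability"


@dataclass
class Team:
    name: str
    skills: set
    wip_limit: int = 1  # assumed work-in-progress limit modelling "capacity"
    in_progress: list = field(default_factory=list)

    def has_capacity(self):
        return len(self.in_progress) < self.wip_limit


def pull_next(backlog, team):
    """Pull the highest-priority feature the team is able to take on.

    backlog is ordered from highest to lowest priority. Returns the pulled
    feature, or None if the team is at its limit or nothing matches.
    """
    if not team.has_capacity():
        return None
    for index, feature in enumerate(backlog):
        if feature.required_skill in team.skills:
            team.in_progress.append(backlog.pop(index))
            return feature
    return None


# Hypothetical enterprise backlog, highest priority first.
backlog = [
    Feature("Faster image search", "search"),
    Feature("New checkout flow", "ecommerce"),
    Feature("Search by color", "search"),
]
checkout_team = Team("Checkout", {"ecommerce"})

first = pull_next(backlog, checkout_team)   # takes "New checkout flow"
second = pull_next(backlog, checkout_team)  # None: team is at its WIP limit
print(first, second)
```

In this sketch nothing is assigned to a team in advance. Work leaves the backlog only when a team with free capacity pulls it, which is the moment at which story writing would begin under the model described above.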
No more sprints, no more arbitrary timeboxes—just continuous enterprise flow of features!
What’s going on with Getty Images Now
Editor’s note: These insights were gathered prior to publishing the book in March 2013.
After two years of using this enterprise pull model things are still going well. The concept of time-boxed sprints has completely disappeared. The focus is on flow. Their next step is to create better visibility into the metrics around time to market. The need to make cycle time visible has become key. More emphasis is now being placed on Cumulative Flow Diagrams to understand real value delivered across the business, and less on individual team velocity.
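To make the two metrics mentioned above concrete, the following sketch computes cycle time per feature and the daily counts that a Cumulative Flow Diagram would plot. The feature records and dates are invented for illustration, and the two-state model (started, done) is an assumption; a real board would usually track more stages.

```python
from datetime import date, timedelta

# Hypothetical records: (title, date started, date done or None if open).
features = [
    ("A", date(2013, 1, 7), date(2013, 1, 11)),
    ("B", date(2013, 1, 8), date(2013, 1, 16)),
    ("C", date(2013, 1, 14), None),
]


def cycle_times(features):
    """Days from start to done for each finished feature."""
    return {
        title: (done - started).days
        for title, started, done in features
        if done is not None
    }


def cumulative_flow(features, first_day, last_day):
    """One row per day: (day, features started so far, features done so far)."""
    rows = []
    day = first_day
    while day <= last_day:
        started = sum(1 for _, s, _ in features if s <= day)
        done = sum(1 for _, _, d in features if d is not None and d <= day)
        rows.append((day, started, done))
        day += timedelta(days=1)
    return rows


print(cycle_times(features))
for day, started, done in cumulative_flow(
    features, date(2013, 1, 7), date(2013, 1, 16)
):
    # The gap between the two counts is the work in progress on that day.
    print(day, started, done, started - done)
```

Plotting the started and done counts as stacked bands over time gives the diagram: the vertical distance between the bands shows work in progress, and the horizontal distance approximates cycle time.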
Getty Images Application Development looks, performs, and feels like a completely different company than it was in 2005. Employee morale runs high: work-life balance has improved and the stress of business pressure has been greatly reduced, all while the teams deliver much more business value and throughput. In 2005, a world without project schedules, huge coupled releases, and bug databases, and with highly collaborative one-hat teams shipping new software to the business on demand, did not seem conceivable. But that is exactly the world of Getty Images today, thanks to the foundation of Agile and Lean principles, and Scrum and Kanban as process frameworks for guiding that transformation.