“Why isn’t my code doing what I expect?”
That’s a question we ask ourselves every day as software engineers, and we’ve built a repertoire of techniques to help us debug code when things go wrong. Tools like breakpoints and strace let us inspect the state of a running system. Assertions and print statements enable us to validate our assumptions about a program’s behavior. Minimal reproducible test cases eliminate distractions so we can direct our attention to just the relevant parts of a faulty system. Without these techniques, day-to-day development would slow significantly.
And yet, when it comes to answering a similar question — “why aren’t my users doing what I expect?” — many of us often throw the best practices we know about debugging code behavior out the window.
We might acquire new users who abandon the product after their first use — and struggle to inspect their mindset the same way we inspect our code. We might endlessly debate how to design a new feature — and not validate the basic assumptions behind the designs we propose. Or we might spend weeks or months adding layers upon layers of functionality — without ever testing a minimal version on actual users to see which parts they find relevant.
Fortunately, it doesn’t have to be this way. The same mindsets and tools that we use to debug our code can help us demystify our users’ behaviors and build products more effectively.
Debug Your Users’ Behavior Like You Debug Your Code’s
Many years ago, I attended a tech talk at Google by Joshua Bloch — the lead architect behind the Java collections API and the author of Effective Java — on API design. When you’re designing interfaces for thousands or millions of developers, getting the API wrong can be costly. It can mean more ramp-up time and more errors from incorrect usage. A good design, on the other hand, is easy to understand and hard to misuse.
Bloch explained that, before even writing any code, he would write up the interface in one or two pages and then actually survey other engineers (either in person or online) on how they would use the API. This was his minimum viable product, or MVP — a tool that allowed him to debug any problems in the API design without writing any code.
The minimum viable product we build for users is analogous to the minimal reproducible test case we build for our code. With a minimal reproducible test case, we ask the question: What’s the smallest piece of code that exhibits the faulty behavior we care about? With an MVP, we ask: What’s the smallest unit of functionality we can build to validate that a user will behave as we expect? Both of these techniques achieve similar goals: they help us focus our energy and efforts on what’s actually important.
Learn to Read Your Users’ Minds
Moreover, just as inspecting program state is critical to debugging code, so too is understanding what goes on inside a user’s head when we’re debugging user behavior.
When a product has significant traffic available, analyzing web and click logs may be sufficient to debug how a user behaves. Etsy, for example, uses a process of continuous experimentation to design their products. 1 When redesigning a product page, they’ll formulate a hypothesis such as “users will purchase more products if they see images of related products on the product page.” They’ll validate that hypothesis by running an A/B test, for instance by showing a banner of related products to a segment of users. And they’ll use their behavioral learnings from that test to shape the next iteration of their design.
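The arithmetic behind judging such a test is simple enough to sketch. Below is a minimal two-proportion z-test in Python; the segment sizes and conversion counts are hypothetical, and a real experiment platform like Etsy’s would also handle user assignment, logging, and repeated-peeking concerns:

```python
import math

def two_proportion_z_test(conversions_a, total_a, conversions_b, total_b):
    """Return the z statistic comparing two conversion rates."""
    p_a = conversions_a / total_a
    p_b = conversions_b / total_b
    # Pooled rate under the null hypothesis that the variants don't differ.
    p = (conversions_a + conversions_b) / (total_a + total_b)
    se = math.sqrt(p * (1 - p) * (1 / total_a + 1 / total_b))
    return (p_b - p_a) / se

# Hypothetical results: control vs. a variant showing related products.
z = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}")  # |z| > 1.96 would be significant at the 95% level
```

With these made-up numbers the variant’s lift is large enough to clear the usual 95% significance bar; with a much smaller sample it would not be, which is exactly when the session-log and user-test approaches below become more useful.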
Running A/B tests in this fashion is a powerful technique when a clear metric — such as clickthrough rate, signup rate, or purchase rate — defines success. When we don’t have a large enough sample size to run a statistically significant test, or when we want to explore deeper patterns in user behavior, another powerful approach is to study session logs. By combining all behavioral data from an individual user in the logs, we can build a time-ordered transcript of events for a user’s session. Studying how a user navigated through the product and what actions she took can paint a detailed portrait of a user’s intentions that aggregate numbers from an A/B test often cannot. Google, for example, uses tools like their Session Viewer to study and visualize how users complete search-related tasks. 2
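Building such a transcript amounts to grouping raw log events by user and ordering each group by timestamp. A minimal sketch in Python, with entirely hypothetical event records (a production pipeline would read these from log files and also split a user’s events into distinct sessions by idle gaps):

```python
from collections import defaultdict

# Hypothetical click-log events: (user_id, unix_timestamp, action).
events = [
    ("u42", 1002, "search: 'wool scarf'"),
    ("u17", 1001, "view: product/981"),
    ("u42", 1005, "view: product/314"),
    ("u42", 1009, "add_to_cart: product/314"),
    ("u17", 1004, "leave site"),
]

def session_transcripts(events):
    """Group events by user, then order each user's events by time."""
    by_user = defaultdict(list)
    for user, ts, action in events:
        by_user[user].append((ts, action))
    return {user: [action for _, action in sorted(acts)]
            for user, acts in by_user.items()}

for user, transcript in sorted(session_transcripts(events).items()):
    print(user, "->", " | ".join(transcript))
```

Reading the resulting per-user transcripts side by side is what surfaces the intent — searches that led nowhere, carts abandoned right after a shipping quote — that an aggregate metric flattens away.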
Sometimes, even session logs can’t provide the depth of insight that we need. User tests can be useful in these situations. For example, at Quip, even as we’re iterating on a new product or feature, we’ll actively test new changes we plan to make on actual users. We might let engaged customers or early adopters beta test a feature and then collect their feedback in a doc or talk to them in person. Or we might use a service like usertesting.com to distribute a script of tasks to online users and collect detailed videos of them using the product and explaining their thought processes, often within the hour. By making user tests a regular part of the product development process, we build confidence around which designs work, which ones break down, and where we should be investing our efforts.
Tools for debugging user behavior are often right in front of us. We would never write thousands of lines of code without testing smaller pieces to ensure that they worked individually, so why would we use this ineffective approach for building products? Aggressively seek ways to apply the same mindsets you use when debugging your code to debugging your users’ behaviors.
Dan McKinley, “Design for Continuous Experimentation: Talk and Slides”
Heidi Lam, et al., “Session Viewer: Visual Exploratory Analysis of Web Session Logs”, Symposium on Visual Analytics Science and Technology (VAST), IEEE (2007), pp. 147–154