The beginning of discovery

Lisa Crispin picked up on this small, seemingly innocent sentence about the importance of ‘shallow tests’ in discovery:

“These [shallow] tests are often essential beginnings to discovery” (Anne-Marie Charrett, “Shallow Testing gets a bad rap”)

I’m glad she did, thanks Lisa! The idea that cheap, easy confirmatory tests are essential to exploration is a concept I’ve been working on for a few years now. I’ve shared it with a few testers, but have never put the concepts down on paper. Her tweet has prompted me to do so. And I’m happy, as I get to reference one of my all-time favourite software testing books (one that doesn’t mention software testing): David Klahr’s Exploring Science. In fact, I like this book so much, I’ve adapted his work and created a workshop on discovery and exploration.

Typically, when people talk about shallow tests in software testing, they mean tests that are cheap to design and execute, easy for the business to observe, and confirmatory in nature. Personally, I’m not a huge fan of that term, but that’s not the topic of today. (To find out why, go back and read “Shallow Testing gets a bad rap” and “Black Box Testing”.)

In his book “Exploring Science”, David Klahr examines the process of discovery. To do that, he conducts a series of studies where he gives people a robot (BigTrak) and asks them to identify the purpose of its RPT button. In order to figure out what RPT does, the participants hypothesise, then design and conduct tests, then observe and evaluate the results. All the time, Klahr is observing their process.

He noticed that participants 1) leaned towards a positive test strategy (tests that confirm a hypothesis) and 2) typically chose tests that were cheap, easy to conduct and easy to observe. In short, they used shallow tests.

Here’s a summary of what he found and his observations.

Test Design and Discovery

Figure: Region 1, Region 2, Region 3 (Exploring Science, David Klahr)

Klahr analysed the design of the tests. He categorised the tests into three regions. The cheap, easy to observe tests reside in Region 1. Most participants started with tests in this region.

The trouble is that when these tests pass, they’re ineffective for evaluation. That is, it is hard to draw a conclusion from the result. Sure, the result indicates the “hypothesis at hand” is right, but it also confirms hypotheses that nobody has considered yet. This makes it hard to know if the hypothesis is the right one. Klahr describes these tests as having low discriminatory power.

Introducing Region 2. Tests from Region 2 are more effective at providing useful evidence. Essentially, Region 2 tests are designed with evaluation in mind. Because of that, they have greater discriminatory power. When we run a test from Region 2, we can evaluate whether this particular hypothesis is true or not. (I’m not going into Region 3 today; go and read the book.)
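
To make “discriminatory power” a bit more concrete, here is a minimal sketch in Python. The three hypotheses about RPT and the command names (FWD, LEFT, FIRE) are my own made-up illustrations, not taken from Klahr’s studies, and the two example tests are only loosely in the spirit of Region 1 and Region 2. The sketch counts how many different outcomes the competing hypotheses predict for a given test: the fewer distinct predictions, the less a result can tell you.

```python
from typing import Callable, Dict, List

Steps = List[str]

# Hypothetical theories about what "RPT n" does (illustration only,
# not the hypotheses recorded in Exploring Science).
def repeat_whole_program(steps: Steps, n: int) -> Steps:
    return steps + steps * n

def repeat_last_step(steps: Steps, n: int) -> Steps:
    return steps + [steps[-1]] * n

def repeat_last_n_steps(steps: Steps, n: int) -> Steps:
    return steps + steps[-n:]

HYPOTHESES: Dict[str, Callable[[Steps, int], Steps]] = {
    "repeat whole program n times": repeat_whole_program,
    "repeat last step n times": repeat_last_step,
    "repeat last n steps once": repeat_last_n_steps,
}

def distinct_predictions(steps: Steps, n: int) -> int:
    """Count how many different outcomes the hypotheses predict for one test."""
    predictions = {tuple(predict(steps, n)) for predict in HYPOTHESES.values()}
    return len(predictions)

# A cheap one-step test versus a slightly more deliberate three-step test.
cheap = distinct_predictions(["FWD"], 1)
sharper = distinct_predictions(["FWD", "LEFT", "FIRE"], 2)
print(cheap, sharper)
```

Under these assumptions the one-step test gets the same prediction from all three hypotheses, so a pass can’t tell them apart, while the three-step test gets a different prediction from each.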

But here’s the interesting thing. Yes, the tests that passed in Region 1 were less effective for evaluation than those in Region 2. But when they failed, they provided huge value! In fact, they became the blockbusters of the discovery process. (OK, I made that up.) What is going on?

Discovery & a Positive Test Strategy

Klahr observed that most participants followed a positive test strategy. They reasoned: “My theory is that RPT does X. If I am right and I write program Y, then BigTrak will do Z.”

Klahr saw this positive test strategy as beneficial. He writes:

“...a positive test strategy may be a useful heuristic in the early stages of investigation, as it allows the participant to determine types [variables to you and me] and instances [tests] that are worthy of further investigation...” (Exploring Science, David Klahr)

In addition, Klahr observed that when these tests failed, they encouraged participants to re-evaluate their understanding of the RPT button and the nature of the tests they were conducting. In essence, a failure caused them to think twice and go back and reassess their understanding.

This was important, as the toy has over 30 billion distinct programs to choose from for each experiment. A positive test that failed allowed participants to quickly identify the key dimensions that affect the experiments, narrowing the search space considerably.
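
Here is a small sketch of why a failed cheap test is so informative. Again, the theories and the “observed” behaviour below are invented for illustration; they are assumptions of mine, not data from the book. The idea is simply to keep only the theories whose prediction matches what the toy actually did.

```python
from typing import Callable, Dict, List

Steps = List[str]

# Made-up candidate theories about "RPT n" (illustration only).
THEORIES: Dict[str, Callable[[Steps, int], Steps]] = {
    "repeat whole program n times": lambda s, n: s + s * n,
    "repeat last step n times": lambda s, n: s + [s[-1]] * n,
    "repeat last n steps once": lambda s, n: s + s[-n:],
}

def surviving_theories(steps: Steps, n: int, observed: Steps) -> List[str]:
    """Return the theories whose prediction matches the observed behaviour."""
    return [name for name, predict in THEORIES.items()
            if predict(steps, n) == observed]

steps, n = ["FWD"], 1

# The cheap test passes: the toy does what every theory predicted.
after_pass = surviving_theories(steps, n, ["FWD", "FWD"])

# The cheap test fails (hypothetical observation): nothing was repeated.
after_fail = surviving_theories(steps, n, ["FWD"])

print(len(after_pass), len(after_fail))
```

In this toy setup a pass leaves every theory standing, while a single failure knocks them all out at once, which is exactly the moment you are forced to go back and rethink what the button might be doing.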

Klahr also highlights another benefit. He adds:

“A positive test strategy provides at least a sufficiency test of one’s current hypothesis.” (Exploring Science, David Klahr)

In short, running a test that’s simple to execute and observe, and that passes, guarantees at the very least that you have everything in place to run your experiments. In the case of BigTrak, this would be: yes, I understand how the toy operates and yes, the batteries are not dead, etc.

Discovery & Uncertainty

As an interesting aside, Klahr also cites Klayman and Ha (1987), who theorise that when there’s high uncertainty and plenty of unknowns, a positive test strategy is a solid approach to take. (I got lost in the maths here, but take a look.)

Business Friendly

I’m a big fan of cheap, easy to run tests, not just because of all the good stuff above, but because of the value they represent to people who don’t spend all their days in the bowels of tech and experimentation. People outside of tech have plenty of work to do; it’s not on them to understand the subtle differences between testing strategies. It’s on us to provide the information in a language they understand, so they can get their job done.

Klahr seems to think so too. He suggests that how you portray data is critical for discovery, and recommends that data representation be included in discovery, alongside hypothesis and experimentation.

“There is no question that search for an effective representation can play a crucial role in the discovery process” (Exploring Science, Adding a New Space, David Klahr)

My Thoughts

As a software tester, this all makes total sense to me. A huge part of starting to test software involves building a mental model of the system under test. I want to figure out its purpose, its constraints, its idiosyncrasies. I do this not by looking for negative tests, but by drawing on past knowledge, making some simple assumptions, and then running easy-to-execute tests designed to validate my thinking.

This is an essential part of the discovery process.

These tests come from experience and know-how, and they are not to be swept under the carpet or dismissed. Instead, they should be worn with pride: a badge that proudly proclaims your curiosity and your willingness to use past knowledge and acquired heuristics.

What’s not to love?