Wednesday, October 3, 2018

Deep Testing and Test Levels

Back in my days of 2002, I have written an article for an academic conference that basically centered around the idea that test levels (as they were taught much then without a "test automation pyramid") while not time-based are useful in agile. These days, I rarely speak of this idea any more, but it is a foundation I speak from.

I came back to think about this after my Deep Testing post a few days ago, as Lisa shared:

Since I have written about the very same levels, I felt like I wanted to express how I model test levels as a very different idea than the depth of testing. Depth works as a synonym for words like "bad quality" = shallow and "good quality" = deep, and multi-dimensional coverage. Levels as a concept for me is both more shallow and serves a different purpose.

Levels of testing tell me that as an observer of testing, there is one helpful set of glasses I can wear to notice information about the system. Looking at the details of the leaf in a tree, it may be hard for me to appreciate what makes up the tree and why it matters, or how trees make up a forest or how forests belong into the world as lungs of it. Looking at things on different levels leads me to generate a little bit different ideas. I may or may not act on those ideas. I may or may not recognize that those ideas even exist.

That is where depth comes in. If I don't have the skill to use the heuristic of levels to see things, my testing, even if it happens on all of the different levels is shallow. It finds easy to spot bugs, that I'm ready to spot with the learning of the system I have done so far.

Depth speaks about my perceptions of trustworthiness of the testing performed. Shallow is testing that you perform with your mind's eye more closed, with single heuristic applied and not doing complex modeling on multiple dimensions. Deep is testing you do that finds more of the important things, things that are not straightforward, things that are not just stuff users find when left alone, but that users trip on when you watch them using the system and they don't even understand they could be asking for more and better. Deep testing is for the problems where your system is down for 5 minutes and everyone just accepts that because no one can reproduce how you get there and why no one even needs to do anything to recover from the problem. Users just know to go for coffee when that happens.