
Thursday, December 13, 2018

A Pesky Bug that Exploring Would Help With

I work with a particularly great team, and even great teams make mistakes. Many other teams, great or less so, would choose to hide their mistakes. I wear our mistakes as a medal of honor: having looked at them and figured out what I could try doing differently, I go into the future an experience richer. And looking forward to a different mistake.

In the last weeks, we've dealt with a mistake that is particularly pesky from a tester's point of view, because it is a failure in how we test.

As bugs go, different ones show themselves in different ways. This particular one has limited visibility to our customers, who can only see second-order symptoms. But its cost has been high: it blocked the work of multiple other teams, diverting them from creating good, valuable items for our users and instead forcing them to build tooling to keep their system alive while we were oversharing data towards them.

So there was a bug. A bad bug. Not a cosmetic one. But also not one easily visible to an end user.

The bug was created by one of our most valued developers.

Since it was created by someone we've grown to rely on, other people in the team looked at the pull request feeling confident about accepting it. After all, the developer is valued for a reason: consistently great work. No one saw the bug.

As we were testing the system, we made a few wrong judgements:

  1. We relied on the unit and system level test automation, which tests the functionality from a limited perspective. 
  2. We didn't explore around the changes, because exploring from the perspective of another system as a user requires special attention, and we did not call for it. 
  3. We relied on repeating tests as we had before, and none of the tests we did before would have paid attention to the volume of information we were sending (see the sketch after this list). 
  4. We had limited availability of team members, and we only see in hindsight that the changes were to a critical component. 
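The kind of check we were missing is easier to see in hindsight. Here is a minimal sketch of a test that makes data volume a first-class assertion; the system, the downstream feed and every name in it are imaginary stand-ins, not our actual code.

    class CapturingFeed:
        """Test double standing in for the downstream system we overshared to."""
        def __init__(self):
            self.messages = []

        def publish(self, message):
            self.messages.append(message)


    def sync_changed_items(feed, changed_items):
        """Imaginary production logic: it should only forward what actually changed."""
        for item in changed_items:
            feed.publish(item)


    def test_only_changed_items_are_forwarded():
        feed = CapturingFeed()
        changed = [{"id": 1}, {"id": 7}]

        sync_changed_items(feed, changed)

        # The volume assertion our earlier tests lacked: if a change suddenly
        # forwards the whole dataset instead of just the delta, this fails loudly.
        assert len(feed.messages) == len(changed)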
So we'll be looking at changes:
  • Figuring out whether pull requests could work better for identifying problems, or if they are more about consistency of style and structure, as they've grown to be
  • Figuring out how to better integrate deep exploratory testing activities towards system functionalities (over user functionalities)
I have a few (ehh, 50) colleagues who wasted a significant amount of time keeping the mistake from surfacing wider while we worked on our remedies. 

These kinds of bugs would be the ones I'd want to find through exploring. And it would be a reasonable expectation. 

Less managing, more testing. My kind is more valuable not as a manager. The work happens hands-on. 

Tuesday, December 4, 2018

Testing a Modify Sprite Toolbar

I've been teaching hands-on exploratory testing on a course I called "Exploratory Testing Work Course" for quite a few years. At first, I taught my courses based on slides. I would tell stories: stuff I've experienced in projects, things I consider testing folklore. A lot of how we learn testing is folklore.

The folklore we tell can be split into the core of testing - how we really approach a particular testing problem - and the things around testing - the conditions making testing possible, easy or difficult, as none of it exists in a vacuum. I find agile testing still talks mostly about the things around testing, and those things, like the fact that testing is too important to be left only to testers and that testing is a whole-team responsibility, are great things to share and learn about. 

All too often we diminish the core of testing into test automation. Today, I want to try out describing one small piece in the core of testing from my current favorite application under test while teaching, Dark Function Editor.

Dark Function Editor is an open source tool for editing spritesheets (collections of images) and creating animations out of those spritesheets. Over the time I've used it as my test target, I've come to think of it as serving two main purposes:
  • Create animated gifs
  • Create spritesheets with computer readable data defining how images are shown in a game
To test the whole application, you could easily spend a work week or a few. The courses I run are 1-2 days, and we make choices of what and how we test to illustrate the lessons I have in mind: 
  • Testing sympathetically to understand the main use cases
  • Intentional testing
  • Tools for documenting & test data generation
  • Labeling and naming
  • Isolating bugs and testing to understand issues deeper
  • Making notes vs. reporting bugs

Today, I had 1.5 hours at an Aalto University course to do some testing with students. We tested sympathetically to understand the main use cases, and then went into an exercise of labeling and naming for better discussion of coverage. Let's look at what we tested. 

Within Dark Function Editor, there is a big (pink) canvas that can hold one or more sprites (images) for each individual frame in an animation. To edit an image on that canvas, the program offers the Modify Sprite toolbar. 


How would you test this? 

We approached the testing with Labeling and naming. I guided the students into creating a mindmap that would describe what they see and test. 

They named each functionality that can be seen on the toolbar: Delete, Rotate x2, Flip x2, Angle and Z-Order. To name the functionalities, they looked at the tooltips of some of these, in particular the green arrows. And they made a note of the first bug. 
  • The green arrows look like undo / redo, knowing how other applications use similar imagery. 
They did not label and name the tooltips, nor the actual undo/redo that they found in a separate menu, vaguely realizing it was a functionality that belonged in this group yet lived elsewhere in the application. Missing a label and a name, it became a thing they would have needed to intentionally rediscover later. They also missed labeling and naming the little x-mark in the corner that closes the toolbar, and thus would need to discover the toggle for the Modify Sprite toolbar later, provided they had the discipline. 

The fields where you can write drew their attention the most. They started playing with the Z-order, giving it different values for two images - someone in the group knew without googling that this would have an impact on which of the images was on top. They quickly ran into the usual confusion. The bigger number would mean that the image is in the background, and they noted their second bug:
  • The chosen convention of Z-order is opposite to what we're used to seeing in other applications
I guided the group to label and name every idea they tried on the field. They labeled numbers, positive and negative. As they typed in a number, they pressed enter. They missed labeling and naming the enter; if they had, they would have realized that in addition to enter, they had the arrow keys and moving the cursor out of focus to test. They added decimals under positive numbers, and a third category of input values: text. 
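To make the labeling and naming concrete, here is one way the outline for that single field could be captured as data. This is a sketch only; the labels come from this session, the structure and names are made up, and a mindmap on a whiteboard serves exactly the same purpose.

    # The coverage outline for the Z-order field as data: what we named,
    # and what we actually covered. Keeping the two honest is the point.
    z_order_field = {
        "input values": {
            "positive numbers": "covered",
            "negative numbers": "covered",
            "decimals": "covered - rejected outright",
            "text": "covered",
        },
        "ways to commit the value": {
            "enter": "used, but not named at first",
            "arrow keys": "not covered",
            "moving focus away": "not covered",
        },
    }

    def not_yet_covered(outline):
        """List the ideas we have named but not yet covered."""
        return [
            f"{group} / {idea}"
            for group, ideas in outline.items()
            for idea, status in ideas.items()
            if status.startswith("not covered")
        ]

    print(not_yet_covered(z_order_field))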

They repeated the same exercise on Angle. They quickly went for symmetry with Z-order, and remembered from the earlier sympathetic testing that they had already seen the positive value 9 work in the angle. They were quick to call the positive category covered, so we talked about what we had actually tested on it.

We had changed two images at once to 9 degree angle. 
We had not looked at 9 degrees in relation to any other angle, to see if it would appear to match our expectations. 
We had not looked at positive angle values where it would be easy to see correctness. 
We had not looked at positive angles with images that would make it easy to see correctness. 
We had jumped to assuming that one positive number would represent all positive numbers, and yet we had not looked at the end result with a critical eye. 

We talked about how the label and name could help us think critically around what we wanted to call tested, and how specific we want to be on what ideas we've covered. 

As we worked through the symmetry, the group tried a decimal number. Decimal numbers were flat out rejected for the Z-order, which is what we expected here too. Instead, we found that when changing the angle from value 1 to value 5.6, the angle ended up as 5 as we pressed enter. Changing value 4 to 4.3 still showed 4.3 after pressing enter, and went to 4 only when moving focus away from the toolbar. We noted another bug:
  • Input validation for decimal numbers worked differently depending on whether the integer part stayed the same or changed.
As we were isolating this bug, part of the reason it was so evident was that the computer we were testing with was connected to a projector that amplified sounds. The error buzz was very easy to spot, and someone in the group realized there was an asymmetry between those sounds on the Angle field and the Z-order field. We investigated further and realized that the two fields, appearing very similar and sitting side by side, dealt with wrong inputs in an inconsistent manner. This bug we did not only note; we spent a significant time writing a proper report on it, only to realize how hard that was. 
  • Input validation was inconsistent between two similar-looking fields.
I guided the group to review the tooltips they had not labeled and named, and as they noticed one of the tooltips was incorrect, they added the label to the model and noted a bug. 
  • The tooltip for Angle was the same as the Z-order description. 
In an hour, we barely scratched the surface of this area of functionality. We concluded with a discussion of what matters and who decides. If no one mentions any of the problems, most likely people will imagine there are none. Thinking back to a developer giving a statement on the Cucumber podcast about me exploring their application:
She's like "I want to exploratory test your ApprovalTests" and I'm like "Yeah, go for it", cause it's all written test first and its code I'm very proud of. And she destroyed it in like an hour and a half.
You can think your code is great and your application works perfectly, until someone teaches you otherwise.

I should know, I do this for a living. And I just learned that the things I tested work 50% of the time in production. But that, my friends, is a story for another time. 






It's not What Happens at the Keyboard

"What if we built a tool that records what you do when you test?", they asked. "We want to create tooling to help exploratory testing.", they continued. "There's already some tools that record what you do, like as an action tree, and allow you to repeat those things."

I wasn't particularly excited about the idea of recording my actions on the keyboard. I fairly regularly record my actions on the keyboard in the form of video, and some of those videos are the most useless pieces of documentation I can create. They help me backtrack what I was doing, especially when there are many things that are hard to observe at once and watching a video is a better use of my time than trying the same things again on the keyboard - which is not very often. Or when trying to figure out a pesky condition I created and did not even realize was connected. But even for that, 25 years of testing has brought me better mechanisms of reconnecting with what just happened, and I've learned to ask (even demand!) for logs that help us all when my memory fails, as the users are worse at remembering than I will be.

So, what if I had that in writing, or in an executable format? It's not like I am looking for record-and-playback automation, so the value those recordings would provide must be elsewhere. Perhaps they could save me from typing details down? But I type just the right thing - after all, I'm writing for an audience - so I would need to clean a recording up to the right thing, or learn not to mind the extra fluff there might be.

I already know from recording videos and blogging while testing, that the tool changes how I test. I become more structured, more careful, more deliberate in my actions. I'm more on a script just so that I - or anyone else - could have a chance of following later. I unfold layers I'm usually comfortable with, to make future me and my audience comfortable. And I prefer to do this after rehearsal, as I know more than I usually do when I first start learning and exploring.

A model of exploratory testing starts to form in my head, as I'm processing the idea of tooling from the collection of data of the activity. I soon realize that the stuff the computer could collect data on is my actions on the computer. But most of exploratory testing happens in my head.


The action on the computer is what my hands end up doing, and what ends up happening with the software - the things we could see and model there. It could be how a page renders, captured precisely as it is, so that in the future I have an approved golden master to compare against. It could be recognizing elements, and which of them are active. It could be the paths I take. 
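The golden master part is the most mechanical of these, which is also why it is the easiest to hand to a tool. A minimal hand-rolled sketch of the idea, with made-up file names: compare today's rendering against an approved file, and leave a received file behind for a human to inspect and approve.

    from pathlib import Path

    def verify_against_golden_master(rendered: str, name: str) -> None:
        approved = Path(f"{name}.approved.txt")
        received = Path(f"{name}.received.txt")

        if approved.exists() and approved.read_text() == rendered:
            received.unlink(missing_ok=True)  # nothing left to review
            return

        # Save what we actually got so a human can diff it, and approve it
        # if the change was intended.
        received.write_text(rendered)
        raise AssertionError(f"{name}: rendering differs from the approved golden master")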

It would not know my intent. It would not know the reasons why I do what I do. And you know, sometimes I don't know that either. If you ask me why I do something, you're asking me to invent a narrative that makes sense to me but may be a result of the human need to rationalize. But the longer I've been testing, the more I work with intentional testing (and programming): saying what I want so that I would know when I'm not doing what I wanted. With testing, I track intent because it changes uncontrollably unless I choose to control it. With programming, I track intent because if I'm not clear on what I'm implementing, chances are the computer won't be doing it either. 

As I explore with the software as my external imagination, there are many ways I can get it to talk to me. What looks like repetitive steps, could be observing different factors, in isolation and chosen combinations. What looks like repetitive steps, could be me making space in my mind to think outside the box I've placed myself in, inviting my external imagination to give me ideas. Or, what looks like repetitive steps, could be me being frustrated with the application not responding, and me just trying again. 

Observation is another thing the human side of exploratory testing brings. We can have tools, like a magnifying glass, to enhance our ability to observe. But the ideas of what we want to observe, and their multidimensional nature, are hard to capture as data points, and even harder to capture as rules. 

Many times the way we feel, our emotion is what gives another dimension to our observations. We don't see things just with our eyes, but also with how we experience things. Feeling annoyed or frustrated is an important data point in exploratory testing. I find myself often thinking that the main tool I've developed over years comes from psychology books, helping me name emotions, pick up when they come to play and notice reasons for them. My emotions make me brave to speak about problems others dismiss. 

Finally, this is all founded on who I am today: the skills, habits and knowledge I build upon. We improve every day, as we learn. We know a little more (knowledge), we can do a little more (skills) and we can routinely do a little more (habits). In all of these we both learn and unlearn. 

I don't think any of these four human-side parts of exploratory testing can be seen by looking at the action data alone. There's a lot of meaning to codify before tooling in this area is helpful. 

Then again, we start somewhere. I look forward to seeing how things unfold. 



Wednesday, October 3, 2018

Chartering for Exploratory Testing

As exploratory testing is framed around learning and discovery, done by a person, it is unnatural to split it into test cases; instead we use time, often referred to as a session. Some folks have suggested that a session (time-box) should be uninterrupted and focused, which is quite natural given the learning nature of exploratory testing. If you find yourself distracted and interrupted, the likelihood of doing the same starting work many times and not making much progress is high. There are different ideas of what the uninterrupted time can be, and also of what types of interruptions really matter so much that you need to break out of your reporting unit.

Some talk about doing a pomodoro - 25 minutes, referring to research on how long we people can focus. Some talk about at most two hours. My personal preference is to deal with a unit of "days of work", or at most "before lunch" and "after lunch" half days, and mind a little less about the interruptions.

With the session as the unit, before going into that unit of time it makes sense to stop and think about what you would be doing. Since test cases make little sense, in exploratory testing we've come to talk about charters. A charter is an idea guiding you as you go into exploration. What would you try to do? What would you focus on? How would you tell if you're done as in task completed, or done as in time ran out?

In her book Explore It!, Elisabeth Hendrickson proposed a template that is helpful in the agile, whole-team-shares-the-exploring kind of context for sharing the ideas of what needs to be tested as charters. The template to help thinking is:
Explore . . .
With . . .
To discover . . .
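To make the template concrete, here is a made-up example, borrowing the Dark Function Editor I use as a teaching target:
Explore the Modify Sprite toolbar
With boundary values and inconsistent inputs in the Angle and Z-Order fields
To discover how input validation surprises show up in an animation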
I’ve not cared much for the charter template, and rather than looking for a particular form of a charter, I think of the timeframe and the goal setting for myself. I have no issue using a user story as my charter, or even using the same user story with the idea of paying attention to a particular perspective on consecutive sessions. A lot of times I cannot even say I have a charter for a specific session, other than “get started with testing, figure out what you got done”.

Today, my team’s tester brought in a list of features and perspectives. They were not organized as charters, but it was clear that they could have been. That, however, would have meant fixing the ideas of how to combine them prematurely. Sometimes the need to charter (in writing) in agile teams creates this idea of “check this, done”, when each charter is really an open-ended quest for information and can / should both create new charters and transform older charters into something better, using what the testing we do teaches us.

If I write charters, I write one for each person who is testing, and debrief to create the next ones after the first ones are completed.

A lot of times I don’t need to share charters when exploring with others. I need to share questions, ideas of documentation (automation), and bugs.

There is a problem before chartering where a lot of testers stumble, as per my observation: having the skills to generate versatile ideas. I was watching a candidate for a job test in front of my eyes today, and was slightly surprised at the low number of ideas they would consider given an application, expecting a specification to prompt them for all things relevant. At the best of times, a spec exists and is useful, but it is never complete. Charters are only as good as the ideas we have to put into them. 

Deep Testing and Test Levels

Back in 2002, I wrote an article for an academic conference that basically centered around the idea that test levels (as they were commonly taught then, without a "test automation pyramid"), while not time-based, are useful in agile. These days I rarely speak of this idea any more, but it is a foundation I speak from.

I came back to think about this after my Deep Testing post a few days ago, as Lisa shared:

Since I have written about the very same levels, I felt I wanted to express how I model test levels as a very different idea than the depth of testing. Depth works as a synonym for words like "bad quality" = shallow and "good quality" = deep, and for multi-dimensional coverage. Levels as a concept is, for me, both more shallow and serving a different purpose.

Levels of testing tell me that as an observer of testing, there is one helpful set of glasses I can wear to notice information about the system. Looking at the details of a leaf on a tree, it may be hard for me to appreciate what makes up the tree and why it matters, or how trees make up a forest, or how forests belong to the world as its lungs. Looking at things on different levels leads me to generate slightly different ideas. I may or may not act on those ideas. I may or may not recognize that those ideas even exist.

That is where depth comes in. If I don't have the skill to use the heuristic of levels to see things, my testing, even if it happens on all of the different levels, is shallow. It finds easy-to-spot bugs that I'm ready to spot with the learning of the system I have done so far.

Depth speaks about my perception of the trustworthiness of the testing performed. Shallow is testing that you perform with your mind's eye more closed, with a single heuristic applied and without doing complex modeling on multiple dimensions. Deep is testing that finds more of the important things: things that are not straightforward, things that are not just stuff users find when left alone, but that users trip on when you watch them using the system, not even understanding they could be asking for more and better. Deep testing is for the problems where your system is down for 5 minutes and everyone just accepts that, because no one can reproduce how you get there, and no one even needs to do anything to recover from the problem. Users just know to go for coffee when that happens.


Tuesday, October 2, 2018

Finding bugs serendipitously

Serendipity means 'lucky accident'. As I spoke of doing shallow exploratory testing, a colleague expressed their concern about finding all the bugs they find serendipitously.
"I feel like most of my bugs are serendipitous, and that concerns me."
I wanted to share a story, and a perspective.

When I joined a new job, the one before this, I was determined to do hands-on, good quality testing in my first week. I'd had the experience of joining companies before, where I found myself being trained into the company without actually doing any of the work I was hired for in the first weeks. And I wanted things to be different. I wanted the old saying that "people take months in a new job before they are productive" not to be true, and set that out as my goal.

As I arrived at the office, they gave me access to the system I was to test. I had barely gotten my computer open, logged into the system with my credentials and bookmarked the page to remember where the system was, when I was dragged into a four-something-hour meeting spree where they poured into my head information I have absolutely no recollection of.

In the afternoon, I returned to my computer with the original determination, and I opened the application only to see a big visible crash.


I had done NOTHING. No use of my brilliant testing skills. Very very shallow testing at best. Anyone would see this problem. Except they did not.

I had serendipitously found a bug by bookmarking one particular subpage of the application (I had managed to click ONE thing after logging in before bookmarking it) that crashed when the login was no longer valid. When we investigated the bug with the developers, we learned it was also the ONLY subpage of that type. I honestly got lucky that day, but over time I would have increased my likelihood of running into this with ideas to do exactly this - ideas I was in control of thanks to Elisabeth Hendrickson's cheat sheet. 
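To show what turning that idea into a deliberate probe could look like, here is a sketch; the URLs, pages and session cookie are imaginary, and the interesting part in reality is choosing which pages and which invalidated states to try.

    import requests

    SUBPAGES = ["/", "/reports", "/reports/weekly", "/settings"]
    EXPIRED_SESSION = {"session_id": "expired-or-revoked-token"}

    def probe_with_expired_session(base_url: str) -> None:
        for page in SUBPAGES:
            response = requests.get(base_url + page, cookies=EXPIRED_SESSION, timeout=10)
            # A polite application redirects to login or answers 401/403;
            # a crash (500) on any one page is the bug worth a closer look.
            print(page, response.status_code)

    if __name__ == "__main__":
        probe_with_expired_session("https://system-under-test.example")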

A lot of the depth in testing comes with skill, and knowing how to exercise a variety of ideas. But much of it also comes from serendipity combined with recognizing problems when you see them (a skill!) and just sticking with the applications longer. 

Serendipity sounds like just luck, but it is particular kind of luck combined with skill and perseverance. 

Monday, October 1, 2018

Deep Exploratory Testing

There's a famous saying by Linus Torvalds:
Given enough eyeballs, all bugs are shallow. 
Crowdsourcing references often like to quote this, pointing out that of the bugs we could find in testing, the users in production, over the masses, end up finding all the relevant ones, even if they do not report them. A crowd could do well in hitting a bunch of bugs.

For the purposes of me doing and guiding exploratory testing, I find it really beneficial to think in terms of shallow vs. deep testing. Shallow can be done with less skill and with less time. Deep testing requires more skill, more insight, and a foundation of learning that is built in layers and requires time.

Many people find that agile somehow guides them to doing only shallow testing. They feel their testing is always squeezed to the end of the sprints, and that it is so because the development schedule is flexible while the testing schedule is fixed. However, they may fail to see the opportunity of testing continuing after the release, focusing on going deeper.

Shallow testing finds shallow bugs. Shallow bugs are easy to find; they are obvious and would become a problem in production immediately. Deep testing finds deep bugs. It may lead us to shallow bugs that just take a bit more effort to see, combinations and conditions that take time to set up. But it also may lead us to bugs some don't consider bugs: things that threaten the value of the product, things that should be different to be better.

Going deep happens in layers. You don't repeat the same things; you go further, deeper. You start before implementing. You continue while implementing. You don't stop for releasing. You don't have to, because you are not on a project. Agile made it a continuous process where there is no end.


Sum it up, and it totals to deep testing. Miss the skills, and all you get is shallow. The additive way of doing testing is not regression testing. It is finding new perspectives, and exploratory testing is the core practice in doing that. 

Friday, September 28, 2018

Managing Testing based on Threads

Session-based Test Management is a form of managing testing based on sessions - time boxes in which ideally we could maintain focus, have a clear charter and generate some metrics by counting things around the time boxes and results. I believe it is not one practice but many; way too many ways of managing testing based on sessions get bundled with the original description of a technique that is very specific.

Later on emerged the idea that sometimes maintaining focus for a time box may not be possible. You might be interrupted, for various reasons. You may be jumping between this thing and that thing, and that way of managing testing was then dubbed Thread-Based Test Management.

Here, I wanted to describe one very recent experience from my work, that utilized managing testing based on threads, and some observations around how we ended up organizing and what unfolded in the activity.

Yesterday, we were wrapping up a release. Making a release in general is still a bit of an effort in my team, even if we have moved from it taking a week to it taking a few hours. But yesterday's release was special. It was a release we were shaping together to introduce two major product upgrade paths, each with its own logic, risk of dependencies in a complex environment, and need for many pairs of eyes in verifying the functionality, each configuration and environment, and the problems we might be experiencing and resolving on the go.

We had strategized on a whiteboard the day before, in a session I considered magical. The three people around the whiteboard built on each other, added knowledge, clarified priorities and made an action plan we were ready to execute. The overall work would be done by eight people, which meant a bit of coordination was required unless we somehow magically danced just right together.

In the morning, we got the build we thought could be the one. We knew there were many things to do, and started a discussion thread - one single thread - in our Teams chat. It did not take long for it to grow to hundreds of messages, and for confusion to emerge on who was doing what, who was talking about what, what was really done and what was just a miscommunication. A colleague jumping into the discussion at some point exclaimed at the awfulness of mixing it all there, and while I and the two others from around the whiteboard could keep track with the model inside our heads, the other five people and everyone watching were in pain.

Seeing it did not work, we came to two immediate solutions.

First, a Jira ticket got created by one of us, who announced they would now track the work there because the discussion was awful. They wrote down 13 steps to making a release, mentioning the two upgrade paths each as one step.

At almost the same time, I pulled the intertwined tasks apart and introduced them as threads: things we'd try to drive through, with some idea of priority and their status as they were named.

  • Thread 1 was about "release as usual" - all the moves we had done and practiced for all the previous releases and it was just business as usual. 
  • Thread 2 was about the most business critical of the two upgrade paths, and we had not set up the environment to be able to test that at all.
  • Thread 3 was about the second upgrade path and we had just identified a blocking bug that needed addressing before it would make sense to finalize it
  • Thread 4 was a surprise path of fast forwarding thread 3 in a specific way
  • Thread 6 was doing thread 2 in production environments
  • Thread 7 was doing thread 3 in production environments
Half a day later, we added thread 5 (because, just for the fun of it, of course someone needed to make a joke of an off-by-one error) on adding test automation for threads 2 & 3, not accepting that we would ever again have to do this stuff without the help of some tooling. 

Teams wasn't making the thread management that easy, jumping focus wherever someone typed in something, and given the limits of how we could use Teams for anything but one long thread, we did not always trust that a comment hit the right thread. But they did, and we clarified and worked through each thread. The visibility per thread enabled people to call out for help in a more specific way, identify problems with each of the separate goals we were working towards, and make priority calls on what we'd do first and later, as this was changing heavily while we identified and investigated problems.

The Jira ticket looks clean, as if one person did it all. The threads in the discussion enabled us to start with a task that we discovered as it was happening, and to coordinate many people pitching in information from test results, investigations, availability of fixes, plans for next steps and the definition of being done for each of the threads. 

I wanted to share this experience as a reminder that threads may be a thing you want to visualize to share the load. Uninterrupted time is not something to use on all tasks. Sometimes threads are the best thing you can do. And they may enable discovering the work that really needs doing, whereas a Jira ticket gets people to deliver what was asked and to forget to discover the many things that are implicit. 


Friday, September 7, 2018

Three Kinds Of Testing

This week brought me a couple of reminders of a past I wish I had left behind, but one that is still very much day-to-day for some other testers. This is a past of writing test cases. And when I say writing test cases, I mean the non-automated kind. The documents that help drive testing. The idea that if only we wrote them well enough, anyone could pitch in to any of the testing.

There are organizations that put careful thought into their test case documentation. I'm lucky to be in an organization that puts careful thought into their test execution with emphasis on learning.

Some weeks ago I tweeted that I don't think we need to use both automated and exploratory testing, because these are not the same. With this week's realizations, I think there are three kinds of testing.

There's the kind of testing I work with. I would call that exploratory testing. It engulfs smart use of tools, programming and even regression test automation in a frame of learning.

There's the kind of testing that the test case folks work with. I would call that manual testing. It includes the creation of manual procedures for testing, with emphasis on planning ahead of time, not so much on learning.

And then there's the kind of testing that all too many test automation folks do. They take a manual test idea and turn it into automation, so that whatever is hard gets left out. They take their "designs for tests" from somewhere outside their own work.

The first kind is really the only kind. And people doing that kind of testing may identify as testers, test automation specialists, or software developers. It's not about the role, but about the mindset of learning through empirical evidence that seeks to disprove to build a stronger case for the idea that things might work after all.

Thursday, August 30, 2018

Seeing Negative Space

Have you ever heard the saying: "There is no I in the TEAM"? And the proper response to it: "Yes there is, it is hiding in the A-holes". This is an illustration of the idea of negative space having a meaning, and that with the right font, you can very clearly see "i" inside the A.

I'm thinking this because of something I just tested, that made me realize I see negative space a lot as a tester.

The feature of the day was a new setting that I wanted to get my hands on. Instead of doing what a regular user would do and looking only at the settings that were revealed, I went under the hood to look at them all. I found the one I was looking for, set it to False as I had intended, and watched the application behavior not change. I felt disappointed. I was missing something.

I opened a project in Stash that is the master of all things settings. I had been part of pushing for a central repo with documentation more than a year ago, and had the expectation that I might find my answer there on what I was missing. I found the setting in question with a vague hint saying it would depend on a mode of some sort, which I deduced to mean that it must be another setting. I asked, and got the name of the setting I needed, with the obvious name "feature_enabled". I wasn't happy with just knowing what I needed to set, and kept trying to find it in the master of all things settings, only to hear that since the way we are using this one is way 1 out of 4, I could not expect to find it there. I would just need to "know it". And that the backend system encodes this knowledge, so I would be better off if I used the system end to end.

Instead of obeying, I worked on my model of the settings system. There are the two things that are visible, and the two things that are invisible. All four are different in how we use them.

Finding the invisible and modeling on it, I found something relevant.

It's not only what there is that you need to track; you also need to track what is there but isn't visible. The lack of something is just as relevant as the presence of something when you're testing.
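A minimal sketch of that modeling, with made-up setting names: diff what the system actually reads against what the settings UI reveals, and the negative space is whatever only shows up on one side.

    settings_the_system_reads = {"timeout", "retries", "feature_enabled", "mode"}
    settings_the_ui_reveals = {"timeout", "retries"}

    # The invisible-but-real settings are the negative space worth modeling.
    invisible_but_real = settings_the_system_reads - settings_the_ui_reveals
    print(sorted(invisible_but_real))  # -> ['feature_enabled', 'mode']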


Tuesday, August 14, 2018

Options that Expire

When you're exploratory testing, the world is an open book. You get to write it. You consider the audience. You consider their requirements. But whatever those are, what matters is your actions and choices.

I've been thinking about this a lot recently, observing some problems that people have with exploratory testing.

There is a group of people who don't know at all what to do with an open book. If not given some constraints, they feel paralyzed. They need a support system, like a recipe to get started. The recipe could be to start with specifications. It could be to start with using the system like an end user. It could be using the system by inserting testing values of presumed impact into all the fields you can see. It could be a feature tour. It could really be anything, but this group of people want a limited set of options.

As we start working towards more full form exploratory testing, I find we are often debriefing to discuss what I could do to start. What are my options? What is possible, necessary, even right? There is no absolute answer for that question, but a seemingly endless list of ways to approach testing, and intertwining them creates a dynamic that is hard if not impossible to describe.

I find myself talking about the concept of options that expire. When you're testing, there is only one time when you know nothing about the software: before you start any work on it. The only time to truly test with those eyes is then. That option expires as soon as you work with the application; it is no longer available. What do you then do with that rare moment?

My rule is: try different things different times. Sometimes start with the specification. Sometimes start with talking to the dev. Sometimes start with just using it. Sometimes pay attention to how it could work. Sometimes pay attention on how it could fail. Stop then to think about what made this time around different. If nothing else, the intentional change makes you more alert in your exploration.

I've observed other people's rules to be different. Some people always start with the dev, to cut through the unnecessary into the developer intent. That is not a worse way or a better way, but a different way. Eventually what matters, for better or worse, is when you stop. Do the expiring options fool you into believing things are different than they are? Make you blind to *relevant* feedback?


Wednesday, August 1, 2018

My first job and what has changed since

Falling for testing is a story I've shared multiple times in various places. It's not like I intended to become a tester. I had studied the Greek language on the side of high school, and as I was in my first years at university studying Computer Science, someone put things together. They talked me into just trying the entry exercise: comparing English and Finnish versions of WordPad with seeded bugs, writing step-by-step problem reports. And when offered a job on the side, I just did not know how to say no.

Thinking back to that time, I had no clue what I was getting myself into. I did not know that falling for testing was like falling in love.

So I ended up testing Microsoft Access, the database-ish extension to the Office family - in the Greek language. As new testers, we were handed test cases used across multiple languages, and I think my office had four languages to test, Finnish being one of them. Looking back, I went to work whenever I had promised, did whatever was assigned to me, and probably did a decent job in following orders, including tracking the work and diligently comparing the English and the Greek versions to identify whether there were functional differences to log as bugs. I remember the fear of "QA", which back then meant that some of the senior testers at Microsoft would sample some of our test cases and see if they found problems we missed.

I had a very nice test manager, and as I was generally interested in how programs work, I was allowed to do something called "exploratory testing". I had absolutely no guidance on how to do it. I was just told how many hours I could use on doing whatever I wanted with the application.

Thinking back, I found myself stuck a lot. I had no strategies for how to approach it. I had a database project in mind, so I was basically implementing that, creating some screens. I wasn't particularly diligent in my comparisons to the English version here, unlike with the test cases. I had no ideas of how to think about coverage. With the information I have today, I know I did a bad job. I found no problems. I was handed a blank check, and for all I know, I could have used it for just sitting at the coffee table drinking something other than the coffee I never learned to enjoy.

Nowadays, if I'm handed a blank check like that (and I regularly am), I pay attention to the value that investment provides. I create coverage outlines that help me make sense of what I have covered and what I've realized I could cover. When I feel stuck, I decide on something I will do. I often find myself starting with tutorials or technical help documentation. I select something and figure it out. All of these are things no one told me to do back then.

The pivotal moment between then and now was the first time I entered a project that had no test cases unless I created some. The change from a passive user of test cases to an active explorer is what sealed the love I still feel for testing.

The book I'm working on (https://leanpub.com/exploratorytesting/) hopes to capture some of the things I wish someone would have taught me when I was new. It builds on the basics to take people closer to testing I do now. That's the vision. Writing it down felt urgent enough to get up in the middle of the night.

Testing is the thing, not testers. 

Tuesday, July 31, 2018

Stop thinking like a tester

I'm very much an advocate for exploratory testing, and yet I find myself seeking something like what Marlena Compton seems to be doing in the space of Extreme Programming and pairing - seeking the practicality, the inclusion and the voices that keep getting shouted down by the One Truth.

Whenever I find people doing good testing (including automation), I find exploratory testing plays a part. The projects lacking exploratory testing are ones that I can break in two hours.


So clearly the focus and techniques I bring into a project, as I apply them, are something special.

In this particular project, some of the observations I shared led to immediate fixes, easing things for whoever came after me. Some of the fixes (documentation) were done in a mid-term timeframe, and looking at the documentation now, I don't want to test it, I want to write it better. And some of the fixes remained promiseware (making the API discoverable, which it isn't - and the message was well delivered by watching a group of people with relevant skills fail miserably in its use).

So sometimes I've found myself saying that I think like a tester. I do this stuff that testers do. It's not manual, so it must be the way I think, as a tester.

I've seen the same or similar curiosity, and the relentless will to believe that things can be different, in other roles too. My favorite group of like-minded peers is programming architects, and I get endless joy from those conversations where I feel like I'm with my people.

So I came to a conclusion. Saying that we teach how to think like a tester is like brute-forcing your thinking patterns onto others. Are you sure the way those people already think wouldn't actually improve the way you're building things, if you carefully made sure everyone in the team is celebrated for their way of thinking?

I sum it up like this:
Be your own, true, unique self and help others do that too. Growing is a thing, but while growing, be careful to not force the good those people already have in the hiding.

It took me so much time to realize which things I do because they are expected of me and my kind, and which I do because I believe they are the right things for me to do. Appreciating differences should be a thing. Think your way.

Monday, July 30, 2018

The line between Exploratory Testing and Managing It

There's no better way of clarifying one's own thoughts than writing in a blog where one has given themselves permission to learn to be wrong. This is one of those posts that I probably would not write yet if this blog wasn't an ongoing investigation into the way I think around various topics.

A friend shared a piece of feedback on what I might be missing from my "What is Exploratory Testing" article, and I cannot decide whether I feel it is an omission or just how I structure what is what. What they shared is:
I believe that exploratory testing is a separate concept from managing exploratory testing.

Exploratory testing is the idea of skilled testing where learning continuously, and letting the learning change the next steps, is the core. To manage something like that, you end up with considerations like: what if you need to convince others that what you are doing is worthwhile, beyond reporting discussion starters like bugs or questions? What if you're not given an area to work on by yourself, but need to figure out how to share that area with others?

While trying to understand that line between doing it and managing it, I've identified quite many things that some people find absolutely necessary for managing it, to a degree that they would not be comfortable calling it exploratory testing without them. I've come to the idea that as long as we're not the testers who are like fridge lights, only on when the door is closed, with bug reporting, any structures around the tester's days of work are optional. They become necessary when there is a group rather than an individual.


As for visibility and learnings from the testing I do, it's been years of doing exploratory testing where no one cares about the detail I care about. I find myself introspecting, looking at a wall, or writing one of these blog posts at times when others would not notice anything was different between this time and another. Learning to learn, learning to critique your own way of doing things, identifying things you can do differently and diligently doing them differently are all part of the self-management within the "days of work" of doing exploratory testing. 

Exploring in a group

There's such a thing as low quality exploratory testing

I'm picking up weak signals, and one of those signals recently has been suggesting that exploratory testing isn't every tester's bread and butter.

First it was an organization that introduced the agile testing idea that developers test. This left testers of traditional background wondering what they would now contribute, and finding themselves unable to figure out what depth in testing would look like. There were cries of dislike for exploratory testing, testers not knowing what they should do so as not to repeat the tests developers were already doing, and realizing they were no longer able to find problems.

Then it was an organization that tried out exploratory testing for a limited timeframe before it was time for the traditional, test-case-led manual testing. The testers were again filled with despair, needing more structure and expecting the structure to emerge from somewhere outside themselves.

If and when there is something core to exploratory testing in addition to learning, self-management is it. You need to be able to make your own plans, and create your own structures of support by selecting from multitudes of examples and reflecting on your own results. Here's one example of what exploring with intent could look like.

The other side of the coin is that people who are not working in exploratory testing find themselves frustrated with the lack of challenge, having to manage and maintain tons of test cases and still continuously getting feedback about missing relevant bugs. Yet when they get out of it, they struggle if they cannot find the depth: the multidisciplinary nature of testing and the vast set of perspectives that the others testing could still be missing.

Exploratory testing is skilled work. That means the most common way it shows up in projects is as low quality exploratory testing, which has very little to add on top of what developers are already capable of doing.




Going Meta: Writing an Article about What Is Exploratory Testing

I'm working on my book, Exploratory Testing, published on LeanPub. LeanPub is a lovely platform, because I can publish versions of my book as I go, and the magic of people paying me for work that is still in progress is the best cure for an author's procrastination I know of. Also, paying for this book is a way of financially supporting all the work I put into defining testing and helping people learn it. Obviously, you can also get it for free.

There was a chapter I wanted to write, that was particularly difficult for me: What is Exploratory Testing?

My first version was a list of bullet points:

WHAT IS EXPLORATORY TESTING
  • more productive
  • better testing
  • multidisciplinary
  • intertwined test design and execution
  • difference to *manual* and *scripted* and *automated* 
  • starting from scratch vs. continuing on a product
  • product as external imagination
  • empirical, evaluation as intent
  • allows moving into scripting and out of it based on feeling - discretion of the tester centered
  • recognizing exploring based on what it gives you
  • it’s about HOW we do testing in a skilled way. Not when, nor in what kind of process, or by whom. 
  • performance, improvisation, intentional
  • learning and modeling for multidimensional coverage
  • next test influenced by lessons learned on previous tests
  • can’t tell in advance which tests should be run or in particular how they should be run in detail
  • test cases documented as an output of testing
  • scripted approach takes ideas out of designers head and puts them on paper, and assumes people follow that as instructions
  • three scopes, many ways to manage
  • premature writing of instructions hinders intellectual processes
  • limited only by breadth and depth of our imagination and willingness to go investigate
  • enable intake of new ideas into the work immediately
  • automation is a modern form of documentation
  • focused intent on what to evaluate, appropriate documentation
  • discover patterns, opportunities and risks
  • instead of pass/fail, “is there a problem here?”
I wanted to write something that Medium says you can read in five minutes. This is what I ended up with: https://medium.com/@maaret.pyhajarvi/what-is-exploratory-testing-88d967060145

So now you see the short and the long version. What should I leave out of what I wrote to include more of what I left out? 

Friday, July 27, 2018

Three cool recipes to bring exploratory testing to stage

Going into the fourth year of European Testing Conference, where one type of testing I've wanted people to teach practical lessons on is exploratory testing, I find myself still in need of actively convincing skilled and awesome testers to teach testing instead of the things around it.

I'm all for agile testing, yet I see it as mostly discussions around testing (who can/should do it, what size of chunks we do it in, how we can do more of it earlier and continuously) - the person doing testing and looking at the application is largely left to their own devices.

Exploratory testing is, when you pop into a design meeting, the questions you choose to ask and the problems you choose to pinpoint. It is, when you sit next to a developer to pair on programming, the problems you pinpoint immediately, the ones you keep track of for after the pairing, and the ones you know you will need to spend private time on, in an environment where initiating the move would feel like it slows the pairing down. It is, when you listen to people, read documentation and plan for hands-on time learning with the application, empirically figuring out what you really know and don't know.

I want to see more of stuff on how to really do that. And I know it isn't easy.

On year one of European Testing Conference, I got on the stage myself to deliver a demo talk on exploratory testing. I called it "Learning in Layers - A Demo of Exploratory Testing", and started my session off by removing my personal access to the keyboard, inviting someone I had never met from the crowd to be my hands. For an idea to get from my head to the keyboard, it needed to be spoken out as intent, followed through with location and details on where on the screen I wanted things to happen. This style of pairing is called Strong-Style Pairing, and my demo pair was awesome at speaking back in questions, pointing out things I wasn't seeing.

On year two of European Testing Conference, I convinced Huib Schoots to do a practical exploratory testing session. His version was called "Testopsy", and it was a fun session where the audience first listed what activities we expect to see while someone is testing, and then mapped out which of those activities we were actually seeing. If you need a vocabulary for the activities, the Exploratory Testing Dynamics list gives a nice basis for that.

On year three of European Testing Conference, I had Alex Schladebeck step up to the challenge. Her version was a show and tell, with deep insights of what the audience can take away from seeing others.

So, here's three recipes:

  • Do it strong style paired so that everything must be spoken out loud
  • Focus on the activities that get intertwined so that you can develop skills in each activity not just the umbrella term
  • Show what you do and what it finds with a real live in production application
I've since added a fourth recipe I like showing: show how you explore through creating automation. I took that version on stage at Agile Testing Days USA and Selenium Conference India. 
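A sketch of what that fourth recipe can start from - nothing here is from my actual demos, the URL is a placeholder, and the point is that the script asserts nothing; it surfaces raw material for the exploring human to react to:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get("https://application-under-test.example")
        # Print raw material for labeling and naming: every input we can find,
        # its name, type, and whether the page marks it as required.
        for field in driver.find_elements(By.TAG_NAME, "input"):
            print(field.get_attribute("name"), field.get_attribute("type"),
                  field.get_attribute("required"))
    finally:
        driver.quit()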

What is your recipe? Come tell us in the European Testing Conference Call for Collaboration.


Refining a 34 year old practice

Exploratory testing is a term that Cem Kaner coined 34 years ago to describe a style of skilled testing work that was common in Silicon Valley, uncommon elsewhere. When the rest of the world was focusing on plans, test cases and the separation of test design and execution, exploratory testing was the word to emphasize how combining the activities (time with the application) and emphasizing continuous learning about the application and its risks created smarter testing. The risks exploratory testing is concerned with are not limited to just the application right now, but include everything the application goes through in its lifecycle. Automation of relevant parts of tests was always a part of exploratory testing, as the tangible ideas of what to automate next are a result of exploring the application and its risks.

There are a few things in particular that refine what exploratory testing ends up looking like in different places:

  • Testing skill
  • Programming skill 
  • Opportunity cost
  • Outputs required by the domain
Testing skill

Testing skill is about actively looking at an application in a deliberate way of identifying things worth noting in multiple dimensions. It's about knowing what might go wrong, and actively making space for symptoms to show up and building a coherent story of what the symptoms indicate and why that would be relevant.

The fewer ideas people have about how we could approach an application for testing, the easier the job they feel they have on their hands. Shallow testing is still testing.

Programming skill

Programming skill is about identifying, designing and creating instructions the computer can execute. It's about making a recipe out of a thing, and using the computer to do varying degrees of the overall activity. When applied to tests, it leaves behind executable documentation of your expectations, or enables you to do things that would be hard (or impossible) to do without it.

Computers only look at what they're programmed to look at, so the testing skill is essential for test automation.

Opportunity cost

When testing (or building software for that matter), we have a limited amount of effort available at any given time. We need to make choices of what we use the effort on, and one of those choices is to strike a personal and team level balance of how we split the effort between tests worth trying once and tests that turn out to be worth keeping, documenting and/or automating.

We strike a balance between investing in information today and information in the future. We find it hard, if not impossible, to do both deep investigative thinking with the real application and maintainable test automation at the same time. But we can learn to create a balance by time-boxing some of each, intertwined in a way that appears as if there were no split.

Outputs required by the domain

Sometimes exploratory testing produces discussions initiated around potential issues. Other times those discussions are tracked in a bug tracking tool and bug reports are the minimum visible output you'd expect to see. And sometimes, in domains where documentation as proof of testing is a core deliverable, test cases are an output of the exploratory testing done.

Some folks are keen on managing exploratory testing with sessions, splitting the effort used into time boxes with reporting rules. Others are keen to create charters to make visible, in agile teams, what the time is used on, as a means of talking about and sharing what the box of exploration is.

Your domain defines what the outputs look like, on a scale from informal to formal.


All skilled work relies on availability of that skill. Exploratory testing is an approach, not a technique.


Sunday, July 15, 2018

Testing does not improve quality - but a tester often does!

Being a self-proclaimed authority in exploratory testing, I find it fun when I feel the need to appeal to another authority. But the out-of-the-blue comment the awesome Kelsey Hightower made today succinctly puts together something I feel I'm still struggling to say: testers do their work to save time for the stakeholders next in the chain.
Actually, nothing in this tweet says that you need a *tester* to do this. It just refers to a highly intellectual and time-consuming activity, which to me implies that doing something like that takes focused time.

With the European Testing Collaboration Calls, I've again been privileged to chat with people who steer my focus to the important bits. Yesterday it was someone stating an observation very much in sync with what Kelsey is saying here: in many of the organizations we look at that go for fully hybrid roles, the *minority perspective* of the exploratory testers tends to lose the battle, and everyone just turns into programmers, without even realizing why they have problems in production at a scale beyond "this is what we intended to build".

Today, in prep for one of the Collaboration Calls, I was triggered by the sentence "testing does not improve quality". I sort of believe it doesn't, especially when testing is overly focused on what we intended to build and verifying that. The bar is somewhere, and it is not going up.

But as a tester, I've lived a career of raising the bar - through what I call testing. It might have started off like in one organization where 20% of users were seeing big visible error messages, and that is where the bar stayed until I pointed out how to reproduce those issues so that the fixing could start. But I never stop at where we are now; I look for the next stretch. When the basics are in place, we can start adding more, and optimizing. I have yet to find an organization where my tester work would have stalled, but that is a question of *attitude*. And that attitude goes well with being a tester who is valuable in their organization.

How do you raise the bar through your tester (or developer) role?

Saturday, July 14, 2018

All of us have a test environment

This post is inspired by a text I saw fly by as I was reading stuff in the last hours: All of us have a test environment, but some of us are lucky enough to have a production environment too.

Test environments have been somewhat of a specialty of mine for the 25 years I've spent in testing, yet I rarely talk about them. So to celebrate that, I wanted to take a trip down memory lane.

Lesson One: Make it Clean

I started as a localization tester for Windows applications. After a few decades, I still remember the routines I was taught early on. When starting to test a new build (we got them twice a week back then), I would first need to clean my environment. That meant disk imaging software and resetting the whole Windows operating system to a state close to factory settings. Back then it wasn't a problem that after doing something like this you'd spend the next hours receiving updates. We just routinely took ourselves to what we considered a clean state.

Whenever you found a problem, the first thing people would ask was whether it had been tested on a clean machine. I can still remember how I felt about those questions, and the need to make sure I would never fail that check.

Lesson Two: You Don't Have to Make it Clean

Eventually, I changed jobs and obviously took with me a lot of the unspoken attitudes and ideas my first job had trained into me. I believed I knew what a test case looked like (numbered steps, y'all!) to the extent that I taught university students what I knew, ruining some of them for a while.

I worked on another Windows application, in a company with less of a routine around the cleanliness of test environments, and learned a lot about the fact that when environments are realistic rather than clean, there's a whole category of relevant problems we get to find. It might have made sense to leave those out as long as we were focusing on localization testing, but it definitely did not make sense now that I was doing functional testing.

I realized that we were not just testing our code, but our code in an environment. And that environment has a gazillion variables I could play with. Clean meant fewer variables in play and was useful for a purpose. But it definitely was not all there was.

Lesson Three: The Environment is Not My Machine but It's Also My Machine

Time moved forward, and the types of applications I was testing became more varied. I ended up working with something I'd label client-server, and later on web. The client-server application environments were no longer as much under my personal control as the Windows applications I had started off with. There was my machine with the client on it, but also a huge dependency on a server somewhere, often out of my reach. What version was where mattered. What I configured the client to talk to mattered. And I learned the concept of different test environments receiving fairly regular new deliveries.

We had an integration test environment, meaning the server environment where we'd deliver new versions fairly often and which was usually a mess. We had a system test environment, where we'd deliver selected versions once they were deemed good enough based on what happened in the integration test environment. And we had an environment that was a copy of production: the most realistic, but also not a place where we could bring in new versions.

For most people, these different environments were a list of addresses handed to them, but that was never my approach. I often ended up introducing new environments, rationalizing existing ones with rules, and knowing exactly what purpose each of them could serve and how it impacted the flow of my testing.
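As a sketch of what writing that purpose down could look like, the hypothetical map below documents each environment's address, delivery cadence and role; the names, URLs and rules are invented for illustration, not the ones from that organization.

```python
# A hypothetical map of test environments and what each one is for.
# Names, addresses and delivery rules are invented for illustration.
ENVIRONMENTS = {
    "integration": {
        "address": "https://int.example.test",
        "deliveries": "new versions fairly often",
        "purpose": "first checks on fresh builds; expect instability",
    },
    "system": {
        "address": "https://sys.example.test",
        "deliveries": "selected versions promoted from integration",
        "purpose": "deeper exploratory and end-to-end testing",
    },
    "prod_copy": {
        "address": "https://preprod.example.test",
        "deliveries": "rare and controlled",
        "purpose": "most realistic data and topology; no new versions",
    },
}


def describe(name: str) -> str:
    env = ENVIRONMENTS[name]
    return f"{name}: {env['purpose']} ({env['address']})"


if __name__ == "__main__":
    print(describe("system"))
```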

Lesson Four: Sometimes They Cost a Million 

Getting a new environment wasn't always straightforward; it was usually a few months of making a business case and then shopping for some rack servers we could hide in our server lab. I remember standing in front of one of those racks, listening to the humming from both it and the air conditioning needed to run that much hardware, and being fascinated. Even if it took a few months of arguing and a few more months of delivering, it was still something that could be done.

But then I started working with mainframes, and the cost of a new environment went from some thousands to a million. It took me two years to get a new environment while working in that world.

Being aware of the cost (not just the hardware but the work to configure it), I got to know the environments we were working with in even more detail. I would know what data (which day's production copy, scrambled) resided where. I would know which projects would do what testing that would cause the data to change. In a long chain of backend environments, I knew which environments belonged together.

In particular, I knew how the environments differed from the production environment. I still think of a proud moment in my career when we were taking a multi-million project into production as a big bang, and I had scheduled a test to happen in production as the very first thing - one we couldn't run elsewhere, as the same kind of duplication and network topology wasn't available. The test succeeded, meaning the application failed. It was one of those big problems, and my pride centers on the fact that we managed to pinpoint and fix it within the four-hour maintenance window because we were prepared for it.

Lesson Five: It Shouldn't be Such a Specialty

Knowing the environments the way I did, I ended up being the go-to person for checking which of the addresses to use. I felt frustrated that other people - in the same kinds of positions I was holding - did not seem to care enough to figure it out themselves. I was more successful than others in my testing because I knew exactly what I was testing and what pieces it consisted of. My tests were more relevant. I had fewer moments of "oops, wrong environment, it wasn't supposed to work there".

So think about your environments and your attitude towards them: are they in your control, or are you in their control?