A Seasoned Tester's Crystal Ball: July 2021

Friday, July 30, 2021

How Would You Test A Text Field?

A long, long time ago we used the question "How would you test a text field?" in interview situations. We learned that how well a person answered such a simple question seemed to correlate with how well they had their testing game together, and we noted four categories of responses that we saw repeatedly.

Every aspiring tester, and a lot of developers aspiring to turn into testers, approached the problem with a simple "inputs, and automate it all" approach. They imagined data you can put into the field, and automating data when there is a simple way of recognizing success is a natural starting point. They may even imagine easily repeatable dimensions like different environments or speed, and while they think in terms of automating, they generally automate regression, not reliability. Typical misconceptions include thinking that the hardware you run on always matters (well, it may matter with embedded software and some functionalities we use text fields for) or expecting someone else to tell them what to test. It used to be that they talked about someone else's test cases, but with agile, the replacement word is now acceptance criteria. Effectively, they think testing is checking against a listing someone else already created, when it is at most half the work.

Functional testers are only a notch stronger than aspiring testers. They come packed with more ideas, but their ideas are dull - like writing SQL into a text field in a system that has no database. It only matters if there is a connection to an SQL database somewhere further down the line. So while the listing of things we could try has more width, it lacks depth in understanding what would make sense to do. Typical added dimensions for functional testers are environments, separating function and data, seeing function from the interface (like enter vs. mouse click), and applying various kinds of lists and tools that focus on some aspect, like HTML validators, accessibility checkers or security checkers. Usually people in this category also talk about what to do with the information that testing provides and about writing good bug reports. On this level, when they mention acceptance criteria, they expect to contribute to it.
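To make the SQL remark concrete, here is a minimal sketch of when that kind of input starts to matter. It assumes a text field value that ends up in a lookup against a database; the table, the function names and the probe string are all made up for illustration, with Python's built-in `sqlite3` standing in for whatever database might sit further down the line.

```python
import sqlite3


def find_users_concatenated(conn, text_field_value):
    # Risky: the text field value becomes part of the SQL statement itself.
    query = "SELECT name FROM users WHERE name = '" + text_field_value + "'"
    return [row[0] for row in conn.execute(query)]


def find_users_parameterized(conn, text_field_value):
    # Safer: the value is passed as data and is never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (text_field_value,))]


def demo():
    # Hypothetical in-memory database with two users.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    probe = "nobody' OR '1'='1"
    return (
        sorted(find_users_concatenated(conn, probe)),
        find_users_parameterized(conn, probe),
    )
```

With the concatenated query the probe returns every user, and with the parameterized one it returns nothing. If no query like this exists anywhere behind the field, the same probe is just another string.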

The highest levels are separated only by what the tester in question starts with. If they start with the *why would anyone use this?* and continue questioning not only what they are told but what they think they know based on what they see, they are Real senior testers, putting every opportunity to test in context of a real application, a real team, and a real organization with real business needs and constraints. If they start with showing off techniques and approaches and dimensions of thinking, they still need work on the *financial motivations of information* dimension. The difference to Close to Senior tester level is in prioritizing in the moment, which is one of the key elements of good and solid exploratory testing. Just because something could be tested it does not mean it has to be, and we make choices on what we end up testing every time we decide on our next steps.

If we don't have multidimensional ideas of what we could do, we don't do well. If we don't put our ideas in an order where we are already doing the best possible work in the time available when we stop without exhausting our ideas, we don't do well.

With years of experience with the abstract question, I started moving to making the question more concrete and sharing something that was a text field on the screen and asking two questions:

What do you want to test next?
What did you learn from this test?

I learned that the latter question in general helps people do better testing than they would without the coaching that sort of takes place there, but I don't want to hire a tester who is so stuck on their past experiences that they can't take in new information and adapt. I've used four text fields as typical starting points:

Test This Box. This is an application that is only a text field and a button, and provides very little context around it. Seniors do well in extracting a theory of purpose, comparing it to the given purpose, and dealing with the idea that it is the first step in incrementally building the application. They learn that while the field is not used (yet), it already displays, and that the application has many dimensions in which it fails that are not intended.
Gilded Rose. This is a programming kata, a function that takes three inputs, and inputs could just as well be text fields. Text field is just an interface. The function has a clear and strong perspective to code coverage but also risk coverage - like who said you weren't supposed to use hexadecimal numbers? Using this text field I see ability to learn and this is my favorite one when selecting juniors I will teach testing but who will need to be picking up guidance from me. Also, if you can't see that code and IDE is just a UI when someone is helping you through it, I feel unsure in supporting you in growing to be a balanced contemporary exploratory tester who documents with automation and works closely with developers.
Dfeditor animations pane. This is a real-size application where the UI has text fields, like they all do. The text field is in the context of a real feature, and a lot of the functionality is there by convention. This one reveals to me whether people discover functionality, and they need to be able to do that to do well in testing.
Azure Sentiment API. This is an API with a web page front end, but ML implementation recognizing sentiments of text automatically. This one is hardest to test and makes aspiring testers overfocus on data. For seniors it really reveals if people can make a difference between feedback that can be useful and feedback that isn't so useful through connections of business and architecture.
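The hexadecimal remark in the Gilded Rose item can be tried out with a small sketch. This is not the kata's code. It only assumes that a numeric input arrives as text, and the function names are invented for illustration. The two parsers differ only in the base argument given to Python's built-in `int`.

```python
def parse_strict(text):
    # Base 10 only: "0x10" is rejected with ValueError.
    return int(text, 10)


def parse_lenient(text):
    # Base 0 lets Python infer the base from the prefix, so "0x10" is 16.
    return int(text, 0)


def classify(text):
    """Report how each parsing choice treats the same text field input."""
    results = {}
    for name, parser in (("strict", parse_strict), ("lenient", parse_lenient)):
        try:
            results[name] = parser(text)
        except ValueError:
            results[name] = "rejected"
    return results
```

Here `classify("0x10")` gives `{"strict": "rejected", "lenient": 16}`, while `classify("010")` flips the outcome: the strict parser reads it as 10 and the lenient one rejects the leading zero. Neither is right or wrong until someone says what the field is for.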

Watching people in interviews and trainings, my conclusion is that more practice is still needed. We continue to treat testing as something that is easy and learned on job without much focus.

If I had the answer key to where bugs are, wouldn't I trust that the devs can read it too and take those out? But alas, I don't have the answer key. My job is to create that answer key.

Thursday, July 29, 2021

Tester roles and services

An interesting image came across on my twitter timeline. It looked like my favorite product management space person had been thinking and modeling, and created an illustration of the many hats people usually have around product creation. Looking at the picture made me wonder where is testing? Is it really that one hat for one category of hats? Is it the reverse side of every single one of these hats? Don't pictures like this make other people's specialties more invisible?

As I was talking about this with a colleague (like you do when you have something on your mind), I remembered I had created a listing of the services testing provides where I work. And reading through that list, I could create my own image of the many hats of testing:

Feature Shaper focuses on what we think of as feature testing.
Release Shaper focuses on what we think of as release testing.
Code Shaper focuses on what we think of as unit testing.
Lab Technician builds systems that are required to test systems.
On-Caller provides quick feedback on changes and features so that no one has to carry major responsibilities alone.
Designer figures out how we know what we don't know about the products.
Scoper ensures there's less promiseware and more empirical evidence.
Strategist sets us on a journey to the future we want for this project, team and organization.
Pipeline Architect helps people with their chosen tools and drives the tooling forward.
Parafunctionalist does testing on the top skills areas extending functional: security, reliability, performance and usability.
Automation Developer extends test automation just as application is extended.
Product Historian remembers what works and what does not and if we know so that we know.
Improver tests product, process and organization and does not stop with reporting but drives through changes.
Teacher brings forward skills and competencies in testing.
Pipeline Maintainer keeps pipelines alive and well so that a failing test ends up with an appropriate response.

With all these roles, the hats in my team are distributed across the entire team, but they already create a reality where no two testers are exactly the same. And why should they be: we figure out the work that needs doing in teams where everyone tests - just not the same things, the same way.

Wednesday, July 28, 2021

The Most Overused Test Example - Login

As I am looking for a particular slide I created to teach testing many, many years ago, I run into other ones I have used in teaching. Like the infamous, most overused test example in particular in the test automation space - the login.

As I look at my old three levels of detail example, I can't help but to laugh at myself.

Honestly, I have seen these all. And yet while it is only a year since I last tested a login that was rewritten, I had zero test cases I wrote down.

Instead, I had to find a number of problems with the login:

Complementing functions. While it did log me in, it did not log me out but pretended it did.
Performance. While it did log me in, it took its time.
Session length. While it did log me in, the two different parts of it disagreed on how long I was supposed to be in, resulting in fascinating symptoms while being logged in long enough combined with selected use of features.
Concurrency. While it did log me in, it also logged me in a second time. And when it did so, it got really confused on which one of me did what.
Security controls. While I could log in, the scenarios around forgetting passwords weren't quite what I would have expected.
Multi-user. While it logged me in, it did not log me out fully, and sharing a computer between two different user names was an interesting experience.
Browser functions. While it logged me in, it did not play nicely with browser functions remembering user names and passwords and password managers.
Environment. While it worked on the test environment, it stopped working on the test environment when a component got upgraded. And it did not work in the production environment without ensuring it was set up (and tested) before depending on it.

I could continue the list far further than I would feel comfortable.

Notice how none of the forms of documenting testing suggest finding any of these problems.

Testing isn't about the test cases, it's about comparing to expectations. The better I understand what I expect, the better I test. And like a good tester, if I know what I expect, I tell it in advance and it still allows me to find things I did not know I expect - with software under test as my external imagination.
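A few of the expectations from the list above can be written down as checks. This is a minimal sketch against a toy in-memory stand-in, not the login I tested; `SessionStore`, `PretendLogoutStore` and the user name are all hypothetical.

```python
import secrets


class SessionStore:
    """Toy in-memory stand-in for a login service (hypothetical)."""

    def __init__(self):
        self._sessions = {}  # token -> user name

    def login(self, user):
        token = secrets.token_hex(8)
        self._sessions[token] = user
        return token

    def logout(self, token):
        # Removing the token is what actually ends the session.
        return self._sessions.pop(token, None) is not None

    def whoami(self, token):
        return self._sessions.get(token)


class PretendLogoutStore(SessionStore):
    def logout(self, token):
        return True  # claims success but keeps the session alive


def check_login_expectations(store):
    """Return a list of findings; an empty list means expectations held."""
    findings = []
    first = store.login("user")
    second = store.login("user")
    if first == second:
        findings.append("concurrent logins share a token")
    store.logout(first)
    if store.whoami(first) is not None:
        findings.append("logout did not end the session")
    if store.whoami(second) != "user":
        findings.append("logging out one session ended the other")
    return findings
```

Running the checks against `SessionStore()` finds nothing, and against `PretendLogoutStore()` it reports that logout did not end the session. The checks only exist because the expectation was noticed first, which is the point: they document what I learned, they did not find it.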

Feeling Testing

I have noticed I feel testing from three perspectives.

I do 1st person testing for any and all software that I personally create.
I do 2nd person testing for my work as a testing specialist in a software team.
I do 3rd person testing by using software.

I feel very different about testing depending on which perspective I take.

When I do 1st person testing, I don't care where testing begins, it is everywhere. I explore my problem, identify the next piece of capability I will be adding, and test everything I rely on as well as what I create. When people later tell me things I already know of, I am annoyed. When people later tell me things that surprise me, I'm delighted. I appreciate everything more if they walk with me, implement with me and not just guide me from the side without walking in these shoes. What helps me do well is having testing always there and not only after it's otherwise complete. Applying my 2nd person hat while doing 1st person testing happens naturally with meetings and end of days interrupting me. I used to hate this feeling of knowing a few ways to go forward and a hundred ways of going backward. With years in sitting with the feelings I deal with it a little better.

When I do 2nd person testing, I go for actively figuring out what the 1st person could have missed. Even when working in close collaboration, I step away from our mutual agreements. I create new models to see what we're missing. I start with the why (value), pay equal attention to what this depends on and what we're creating, and use all of my energy in aspiring for better. I care about the timing of my feedback, but I care about sufficiency (completeness) even more. I seek conversations, and expect conversations to change things. Good conversations - clarity of bug reports, mutual learning - are what give me joy.

When I do 3rd person testing, every single problem annoys me. I carefully tread in ways that don't show me the most likely problems, because I'm doing my thing, not the testing thing. Users stumble on problems, testers go seek them. If I find and report, it comes with extra - visibility they wouldn't want or a request to compensate for my losses of service.

The world is currently moving more and more work from 2nd person testing to 3rd person and 1st person testing. We know the 3rd person isn't willing to accept the work and do free labor. We know the 1st person needs that pair because software development is a social effort and working together helps us move (and learn) faster.

I still feel testing is awesome - and too important to be left just for testers by profession. Looking forward to seeing where the mix goes.

Friday, July 23, 2021

Ensemble Programming as Idea Integration

This week at the Agile 2021 conference, I took in some ideas that I am now processing. Perhaps the most pressing of those ideas came from a post-talk question and answer session with Chris Lucian, where I picked up the following details about Hunter ensemble (mob) programming:

Ideas of what to implement generally come to their team more ready than what would be my happy place - I like to do a mix of discovery and delivery work and would find myself unhappy with discovery being someone else's work.
Optimizing for flow through a repeatable pattern is a focus: from scenario example to TDD all the way through. The focus of the talk is on habits, as skill is both built into a habit and overemphasized in the industry.
A working day for a full-time ensemble (mob) has one hour of learning in groups, 7 hours of other work split to a timeframe of working in rotations, pausing to retrospect and taking breaks. Friday is a special working day with two hours of learning in groups.

The learning time puzzled me in particular - it is used on pulling knowledge others have, improving efficiency and looking at new tech.

If you recognize others could teach you something, ask for a learning session on that. If you recognize inefficiencies, that is also source of learning sessions. Also pure exploratory stuff of emerging tech is a learning session. @ChristophLucian #Agile2021
— Maaret Pyhäjärvi (@maaretp) July 21, 2021

A question people seem to ask a lot about Ensemble Programming (and Testing) is if this would be something we do full-time and that is exactly what it is as per accounts from Hunter that originated the practice. Then again, with all the breaks they take, the learning time and the continuous stops for retrospective conversations, is that full time? Well, it definitely fills the day and sets the agenda for people, together.

This led me to think about individual contributions and ensembling. I do not come up with my best ideas while in the group. I come up with them when I sit and stare at a wall. Or take a shower. Or talk with my people (often other than the colleagues I work with) explaining my experience trying to catch a thought. The best work-related ideas I have are individual reflections that, when feeling welcome, I share with the group. They are born in isolation, fueled by time together with people, and implemented, improved and integrated in collaboration.

Full 8 hour working days with a preset agenda would leave thinking time to my free time. Or it would mean making a change in how the time is allocated so that it fits. With so many retrospectives and a focus on kindness, consideration and respect, things do sound negotiable, as long as one does not fold under the group's different opinions.

I took a moment to rewatch the amazing talk by Susan Cain on introverts. She reminds us: "Being best talker and having best ideas has zero correlation." However, being the worst talker and having the best ideas also has zero correlation. If you can't communicate your ideas and get others to accept them and even amplify them, your ideas may never see the light of day. This was a particularly important lesson for me on Ensemble Programming. I had great ideas as a tester who did not program, but many - most - of my ideas did not see the light of day.

Here's the thing. In most software development efforts, we are not looking for the absolute best ideas. But it would be great if we did not miss out on the great ideas of the people we hired just because we don't know how to work together and hear each other out.

And trust me - everyone has ideas worth listening to. Ideas worth evolving. Ideas that deserve to be heard. People matter and are valuable, and I'd love to see collaboration as value over competitiveness.

Best ideas are not created in ensembles, they are implemented and integrated in ensembles. If you can’t effectively bind together the ideas of multiple people, you won’t get big things done. Collaboration is aligning our individual contributions while optimizing learning so that the individuals in the group can contribute their best.

Tuesday, July 20, 2021

Mapping the Future in Ensemble Testing

Six years ago when I started experimenting with ensemble testing, one of the key dynamics I set a lot of experiments around was *taking notes* and through that, *modeling the system*.

At first, I used notetaking/modeling as a role in an ensemble, rotating in the same cycle as other roles. It was a role in which people were lost. Being handed a document you had not seen and trying to continue from it was harder than other activities, and I quickly concluded that for an ensemble to stay on a common problem, the notes/model needed to be shared.

I also tried a volunteer notetaker who would continuously be describing what the ensemble was learning. I noticed a good notetaker became the facilitator, and generally ended up hijacking control from the rest of the group by pointing out in a nice and unassuming way what was the level of conversation we were having.

So I arrived at the dynamic I start with now in all ensemble testing sessions. Notetaking/modeling is part of testing, and Hands (driver) will be executing notetaking from the request of the Brains (designated navigator) or Voices (other navigators). Other navigators can also keep their own notes of information to feed in later, but I have come to understand that in a new ensemble, they almost never will, and it works well for me as a facilitator to occasionally make space for people to offload the threads they model inside their heads into the shared visible notes/model.

Recently I have been experimenting with yet another variation of the dynamic. Instead of a notes/model that we share as a group and use Hands to make visible, I've allowed an ensemble to use Mural (a post-it wall) in the background to offload their threads, with a focus on mapping the future they are not allowed to work on right now because of the ongoing focus. It shows early promise of giving people who are Voices something extra to do, and of using their voice in a way that isn't shouting their ideas on top of what is ongoing but improving something that is shared.

Early observations say that some people like this, but it skews the idea of us all being on this task together and can cause people to find themselves unavailable for the work we are doing now, dwelling in the possible future.

I could use a control group that ensembles together for longer. My groups tend to be formed for a teaching purpose, and the dynamics of an established ensemble are very different from those of a first-time ensemble.

Experimenting continues.

Wednesday, July 14, 2021

Ensemble Exploratory Testing and Unshared Model of Test Threads

When I first started working on specific adaptations of Ensemble Programming to become Ensemble Testing, I learned that it felt a lot harder to get a good experience on exploratory testing activity than on a programming activity, like test automation creation. When the world of options is completely open, and every single person in the ensemble has their own model of how they test, people need constraints that align them.

An experienced exploratory tester creates constraints - and some even explain their constraints - in the moment to guide the learning that happens. But what about when our testers are not experienced exploratory testers, nor experienced in explaining their thinking?

When we explore alone, we start somewhere, and we call that start of a thread. Every test where we learn creates new options and opportunities, and sometimes we *name a new thread* yet continue on what we were doing, sometimes we *start a new thread*. We build a tree of these threads, choosing which one is active and manage the connections that soon start to resemble a network rather than a tree. This is a model that guides our decisions on what we do next, and when we will say we are done.
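The thread model described above could be sketched as a small data structure. This is only an illustration of the idea under my own assumptions; the class, its method names and the example threads are invented, and a real model in someone's head is far messier.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadModel:
    """Hypothetical model of exploratory testing threads as a graph."""

    links: dict = field(default_factory=dict)  # thread name -> related names
    active: str = ""

    def name_thread(self, name, related_to=None):
        # Naming a thread records it without leaving the active one.
        self.links.setdefault(name, set())
        if related_to is not None:
            self.links.setdefault(related_to, set()).add(name)
            self.links[name].add(related_to)

    def start_thread(self, name, related_to=None):
        # Starting a thread names it and makes it the active one.
        self.name_thread(name, related_to)
        self.active = name

    def parked(self):
        # Threads that are named but not being worked on right now.
        return sorted(n for n in self.links if n != self.active)


def demo():
    model = ThreadModel()
    model.start_thread("basic input")
    model.name_thread("long input", related_to="basic input")
    model.start_thread("paste from clipboard", related_to="long input")
    # A second link to an existing thread turns the tree into a network.
    model.name_thread("paste from clipboard", related_to="basic input")
    return model
```

After `demo()`, the active thread is "paste from clipboard", it connects to both "basic input" and "long input", and `parked()` lists the two threads waiting for attention.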

The model of threads is a personal thing we hold in our heads. And when we explore together in ensemble testing, we have two options:

We accept that we have personal models that aren't shared, that could cause friction (hijacking control)
We create a visual model of our threads

The more we document - and modeling together is documenting - the slower we go.

I reviewed various ensemble testing sessions I have been facilitating, and noticed an interesting pattern. The ensemble was more comfortable and at ease with their exploratory testing if I first gave them a constraint of producing visible test ideas before exploring. At the same time, they generally found fewer issues to start conversations on, and held more strongly to the premature assumptions they had made of the application under test.

Over time, it would be good for a group to create constraints that allow for different people to show their natural styles of exploratory testing, to create a style the group shares.

Wednesday, July 7, 2021

Working with Requirements

A long, long time ago I wrote down a quote that I never manage to remember when I want to write it down. Being on my top-10 of things I go back to, I should remember it by now. Alas, no.

"If it's your decision to make, it's design. If it's not, it's a requirement." - Alistair Cockburn

Instead, I have no difficulty recalling the numerous times someone - usually one of the developers - has said that something *was not a requirement*. The number is overwhelming. With all these years working to deliver software, I think we hide behind requirements a lot. And I feel we need to reconsider what really is a requirement.

When our customers ask us for things they want in order to buy our solution, there's a lot of interpretation around their requirements. I have discovered we do best with that interpretation when we get to the *why* behind the *what* they are asking, and even then, things are negotiable much more often than not.

In the last year, requirements have been my source of discontent, and concern. In the first project we delivered together, we had four requirements and one test case. And a thousand conversations. It was brilliant, and the conversations still pay back today.

In the second project we delivered together, we had more carefully isolated requirements for various subsystems, but the conversation was significantly more cumbersome. I call it success when 90% of the requirements vanished a week before delivery, while the scope of the delivery was better than those requirements led us to believe.

In another effort in the last year, I have been going through requirements meticulously written down and finding it harder because of the requirements to understand what and why we are building.

Requirements, for the most part, should truly be about the decisions that are not ours to make. Otherwise, let's focus on designing for value.

Thursday, July 1, 2021

Learning while Testing

You know how I keep emphasizing that exploratory testing is about learning? Not all testing is, but to really do a good job of exploratory testing, I would expect centering learning. Learning to optimize value of our testing. But what does that mean in practice? Jenna gives us a chance of having a conversation on that with her thought experiment:

Testing though experiment:

An ATM allows a daily withdraw limit of $300

You can’t ask the stakeholder any questions about the requirement, what tests would you run to understand the requirement better and test this functionality?
— Jenna Charlton she/they 🏳️‍🌈 (@SheWrestlesTest) July 1, 2021

When I first came across Jenna's thought experiment, I was going to pass on it. I hate being on the spot with exercises where the exercise designer holds the secret to what you will trip on. But then someone I admire dared to take the challenge on in a way that did not optimize for *speed of learning*, and this made me wonder what I would really even respond.

It Starts with a Model

Reading through the statement, a model starts to form in my head. I have a balance of some sort that limits my ability to withdraw money, a withdrawal of some sort that describes the action I'm about to perform, ATM functionality of some sort, and a day that frames my limitation in time.

I have no knowledge on what works and what does not, and I don't know *why* it would matter that there is a limit in the first place.

The First Test

If I first test a positive case - having more than $300 on my balance, withdrawing $300 and expecting to get the cash in hand, and then trying the smallest sum on top of that which the functionalities of the ATM allow for - I would at least know the limit can work. But that significantly limits anything else I can learn on the first day.

I would not have learned anything about the requirement though. Or risks, as the other side of the requirement.

But I could have learned that even the most basic case does not work. That isn't a lot of learning though.
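That first test is small enough to sketch against a toy model. Everything here is an assumption for illustration: `ToyAtm` is my invention, it knows one account and whole-dollar sums, and the smallest sum the ATM allows is a parameter because the requirement does not say.

```python
class ToyAtm:
    """Hypothetical ATM model: one account, whole-dollar withdrawals."""

    DAILY_LIMIT = 300

    def __init__(self, balance):
        self.balance = balance
        self.withdrawn_today = 0

    def withdraw(self, amount):
        # Refuse nonsensical sums and sums the balance cannot cover.
        if amount <= 0 or amount > self.balance:
            return False
        # Refuse anything that would take today's total over the limit.
        if self.withdrawn_today + amount > self.DAILY_LIMIT:
            return False
        self.balance -= amount
        self.withdrawn_today += amount
        return True


def first_test(atm, smallest_sum):
    # Withdraw exactly the limit, then try the smallest sum on top of it.
    at_limit = atm.withdraw(300)
    over_limit = atm.withdraw(smallest_sum)
    return at_limit, over_limit
```

With `first_test(ToyAtm(1000), 20)` the model answers `(True, False)`: the limit can work. The sketch also shows the cost, since after this sequence the day's allowance is spent and every further withdrawal on the same day only repeats the refusal.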

Seeing Options

To know if I am learning as much as I could, it helps if I see my options. So I draw a mindmap.

Marking the choices my 1st test would be making makes it clear how I limit my learning about the requirements. Every single branch in itself is a question of whether that type of a thing exists within the requirements, and I would know little other than what was easily made available.

I could easily adjust my first test in at least giving myself a tour of the functionalities the ATM has before completing my transaction. And covering all ways I can imagine going over after that first transaction getting me to limit would lift some of the limitations I first see as limiting learning over time.

Making choices on learning

To learn about the requirement, I would like to test things that would teach me around the concepts of what scope the limit pertains to (one user / one account / one account type / one ATM) and what assumptions are built into the idea of having a daily limit with a secondary limit through balance.

For all we know, the way the requirement is written, it could be ATM specific withdrawal limit!
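The question of scope can be turned into a short probe. This is a sketch under assumptions: `LimitPolicy` is a hypothetical stand-in that only knows two of the possible scopes (account and ATM), and the account and ATM names are made up. The three withdrawals are picked so that the two readings of the requirement give different answers.

```python
from collections import defaultdict


class LimitPolicy:
    """Hypothetical daily limit tracker keyed by a chosen scope."""

    def __init__(self, scope, limit=300):
        self.scope = scope            # "account" or "atm"
        self.limit = limit
        self.used = defaultdict(int)  # scope key -> amount withdrawn today

    def withdraw(self, account, atm, amount):
        key = account if self.scope == "account" else atm
        if self.used[key] + amount > self.limit:
            return False
        self.used[key] += amount
        return True


def probe(policy):
    # Same account at two ATMs, then a second account at the first ATM.
    return [
        policy.withdraw("account-A", "atm-1", 300),
        policy.withdraw("account-A", "atm-2", 300),
        policy.withdraw("account-B", "atm-1", 300),
    ]
```

An account-scoped limit answers `[True, False, True]` and an ATM-scoped limit answers `[True, True, False]`. Against the real application, whichever pattern shows up tells me something about the requirement that the one-line statement did not.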

Hands on with the application, learning, would show us what frames its behavior fits in, and without time to test first things out, I would just want to walk away at this point.
