Saturday, September 26, 2020

A Step-Wise Guide into Automation

Watching a group of testers struggle with automation, I listened to their concerns:

    - Automating was slow. It would easily take a whole week to get one thing
      automated.

    - Finding stuff to automate from manual test cases was hard. It was usually a step,
      not the whole test case, that could be automated.

    - It was easy to forget earlier ideas. Writing them down in Jira was encouraged, but
      Jira was where information went to die - if it didn't die on its way there, as it often did.

I'm sure all their concerns and experiences were true and valid. The way the system of work had been set up did not really give them a good perspective on what was done and what was not, and things were hard.

In my mind, I knew what was expected of the testing they should do. Looking at the testing they had done, it was all good, but not all that was needed. Continuing exactly as before would not bring the change we needed. So I introduced an experiment.

We would, through the shared test automation codebase, automate all the tests we thought we could document. No separate manual test cases. Only automated. We would split our efforts so that we could see coverage through the codebase: first adding quick skeleton versions of all the tests, then adding actual automation into them a test at a time. Or even a step of a test at a time, if it made sense.

We would refactor our tests so that mapping manual tests to automated tests was not an issue, as all tests were targeted to become automation.

None of the people in the team had ever heard of the idea that you'd create tests that had only a name and a line of log, but they agreed to play along. Robot Framework, the tool they were already using, made this particularly straightforward.

Since I can't share actual work examples, I will give you the starter I just wrote to illustrate the idea of documenting like this while exploring, using the E-Primer app from EvilTester as a test target.

*** Settings ***
Documentation     Who knows what, just starting to explore
...               https://eviltester.github.io/TestingApp/apps/eprimer/eprimer.html

*** Test Cases ***
Example test having NOTHING to do with what we are testing but it runs!
    [Tags]    skeleton
    Log    Yep, 1/1 pass!

This is already an executable test. All it does is log. The name and log message can convey information on the design. Using a tag shows the number of these tests in the reports and logs.
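
If you want to see just these placeholders counted, Robot Framework's command line lets you select tests by tag. A minimal run, assuming the file above is saved as eprimer.robot (a file name I picked for this illustration):

robot --include skeleton eprimer.robot

The generated report then shows the skeleton tag in its tag statistics, which is what makes the number of planned-but-not-yet-automated tests visible.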

Notice that while the Documentation part of this identifies my test target, there is actually nothing automated against it. It is a form of test case documentation, but this time it is in code, moving to more code, and keeping people together on the system we are implementing to test the system we have. 

As I added the first set of skeleton tests and shared them with my team, they were already surprised. The examples showed them what they were responsible for, which was different from their current understanding. And I designed my placeholders in a way that could later be automated: I had placeholders for keywords that I recognized while designing, and I had placeholders for test cases.
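
To make the idea concrete, here is a sketch of what such placeholders could look like in Robot Framework. The test and keyword names are ones I invented for this illustration, not from the actual work:

*** Test Cases ***
User can toggle checking on and off
    [Tags]    skeleton
    Open the e-primer page
    Toggle checking off
    Log    Placeholder: verify the text is no longer analyzed

*** Keywords ***
Open the e-primer page
    Log    Placeholder for opening a browser to the app later

Toggle checking off
    Log    Placeholder for clicking the toggle later

Every keyword only logs, so the skeleton runs and passes from day one, and automating then means replacing one Log at a time with real steps.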

Finally, at work, I introduced a four-level categorization of "system tests" in automation:

Level 1 is team functionality on real hardware.

Level 2 is combining level 1 features.

Level 3 is combining level 1 features for user-relevant flows.

Level 4 is seeing the system of systems around our thing.
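
In the automation codebase, these levels can be made visible with the same tagging idea as the skeletons. A minimal sketch with invented test names, assuming one tag per level:

*** Test Cases ***
Team feature works on real hardware
    [Tags]    skeleton    level-1
    Log    Placeholder for a single-feature test on hardware

Two team features work together
    [Tags]    skeleton    level-2
    Log    Placeholder for a feature combination test

Running robot with --include level-1 (or any other level tag) then gives a view per level, the same way --include skeleton gives a view of the placeholders.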

The work on this continues at work, and the concreteness of it enables me to introduce abilities to test that the team may have assumed they can't have. It also enables them to correct me, a complete newbie to their problem domain, on any misunderstandings I have about what and how we could test.

The experiment is still new, but I am also trying it out with the way I teach exploratory testing. One of the sessions I have been creating recently is on using test automation as documentation and reach. With people who have never written any automated tests, I imagined browser tests in Robot Framework might do. They do, but then tool learning takes time away from testing. Now I will try whether the first step of creating skeletons enables a new group to stick with exploration first, and only jump into the details of automating after.



Introducing Exploratory Testing

 "All testing is exploratory". 

I saw the phrase pop up again in my good friend's bio, and stopped to think if we agree or disagree.

All testing I ever do is exploratory. All testing that qualifies as good testing for me is exploratory. But all testing that people ask from me definitely is not exploratory.

I worked at one product company for two separate stints of about 3 years each, with 10 years in between. If there is a type of company that needs good exploratory testing to survive, it is product companies. The whole concept originated 35 years ago in the Silicon Valley product companies Cem Kaner was working in. Yet when I first joined, exploratory testing was something we did after test cases.

We wrote test cases, tracked running of those test cases, and with our Word-Excel in-house tooling had some of the better support for a continuously moving test target version. What made that tooling better than any of the tools I have in use now was the built-in idea that you primarily need to understand when you last verified a particular test idea, since every change effectively cancels out the results of everything you've done before.

On top of test cases, we were asked to do exploratory testing. We were guided to test our tests so that we did not obey the steps, but introduced variance. We were guided to take a moment after each test to think about what we would test because of what we had just learned, and do it. We were guided to take half a day every week to put the test cases aside and test freely.

It was clear that all testing was not exploratory testing.

Ten years later, there were no test cases. There was one person who would do "exploratory testing", meaning they would follow user flows to confirm things without documentation, without any ability to explain what they were doing other than the rare bugs they might run into, and missing a lot of problems. And then there was test automation that codified lessons on how to keep testing continuously, discovered through detailed exploration that found problems.

It was clear that the testing they now called exploratory testing was not exploratory testing. It was test automation avoidance. And the testing they called test automation was the real exploratory testing.

I get to go around companies and inside companies, and I can confirm that we are far from all testing being exploratory. We still have the tyranny of test cases in many places. We have test automation done without exploring while doing it, taking away some of its value.

We have managers asking for a "planned" approach to testing, asking for "requirements coverage". They ask for those as a proxy for good testing, and yet in asking, they often end up getting worse testing, testing that is not exploratory.

The opposite of exploratory is confirmatory. It does not seek the holes between the things you realized to ask for, but only the things you realized to ask for. And you want more than you know to ask for.

So I keep going to companies, to introduce exploratory testing. 

I bring down test cases, I move documenting tests to automation. 

I convince people to not care so much for who does what, but work together to learn. Developers are great exploratory testers if only they let go of the recipe where they do what they were told, and own up to creating a worthwhile solution. 

I break the idea of exploratory testing as something you do on top of other testing. I introduce a foundation of confirming that what we know holds true and learning, and then building the figuring out of the unknown on top of it.

We read requirements, we track requirements, we use them as testing starters. But instead of confirming what is there, we seek to find what is not. 

Not all testing will be exploratory testing while we write and execute test cases. All testing may have an exploratory angle. But we can do better.

Monday, September 21, 2020

Exploratory Testing and Explaining Time

If you've looked into exploratory testing, chances are you've run into two models of explaining time.

The first one is around the observation that there are four broad categories of "work" when you do exploratory testing, and only one of them actually takes your testing forward, so visualizing the proportions may be useful. In that model, we split our time into setup, test, bug and off-charter. Setup is anything we do to get ready to test. Test is anything we do to amp up coverage from just starting to getting close to completion. Bug is when we get interrupted by reporting and collaboration on the results of testing. And off-charter is when we don't get to do testing but to exist in the organization that contains the testing we do.

The broad categories of work model has been very helpful for me in explaining testing time to people over the years. It really boils down to a simple statement: getting to coverage takes focused time, and if we report the time spent on it, you may have an idea of testing progressing. Let's not measure test cases or test ideas, but let's measure time that gives us a fighting chance of getting testing done.

The three other categories of time use outside "test" can be set up as the possible enemy. Setup takes time - it's away from testing! Finding many bugs - not only away from testing, but requiring us to repeat all testing done so far! Off-charter - you're having me sit in meetings! They can also be set up as questions to facilitate a positive impact on the "test", as in investing in setup that makes test time sufficient, or investing in pairing on bugs that makes future bugs less frequent.

The second model making the rounds includes a more fine-grained split of the activities happening within exploratory testing sessions, one that people could even use to explain their sessions in things like daily meetings. Instead of saying you are doing "testing" day after day for that big login feature, you could explain your focus with words like intake (asking around for information and expectations), survey (using the software to map, but not really test), setup (infra and data specifically), analysis (creating a coverage outline), deep coverage (getting serious testing done), and closure (retesting and reporting).

If we map these activities to the four categories, a lot of them fall under setup: intake, survey, setup, analysis and closure are all mostly setup - they don't really build up coverage, but they are necessary parts of doing testing properly.

While the first model has been valuable for me in years of use, I would replace the latter model by finding the words that help you communicate with your team. If these words help, great. If these words silence your team members and create a distance where they don't understand your work, not so great.

The words I find myself using to explain how I progress through a change / feature related exploratory testing are:

  • what am I investing in: in the now, or for later; getting the job done quickly vs. enabling myself and others in the future
  • what kind of outputs I'm generating: the story of my testing, bug reports, mindmaps, executable specifications
  • what kind of output mindset my work has: generative or completion-oriented; some work generates more work, some gets stuff done
  • whether you see movement: working vs. introspecting; some work looks like hands on a keyboard, other work looks like staring at a whiteboard

For me it is important to add more words to "test" too: mapping, acquiring coverage, completing, as well as for "bugs": isolating, documenting, demonstrating.

Looking at the way I work, I explain very little of my testing in daily meetings and don't write a report of any kind. The documentation I leave behind is automation. For transferring deeper knowledge, I pair with people, and to get new people started, I write a one-page playbook of testing that sets the stage. The people I explain testing to are the ones I pair with, either on automation or on doing particular testing.

The real question of explaining your time is: why are you doing it? Do you grow others with it? Do you explain yourself to the people who hired you so that they can trust you? Or maybe you explain it to yourself as introspection, to figure out how things could be different for you tomorrow.


Sunday, September 20, 2020

There is Non Exploratory Testing

Celebrating my personal 25 years of growth as an exploratory tester doing exploratory testing either all or much of my work time, I regularly stop to think what makes it important to me. 

It has given my results as a tester a clear boost over the years. The better I am at it, the more magical the connections of information I make seem to people who are not as practiced at it. I see everything as a system - people, working conditions and constraints, the software we test, the world around us. And as these are complex systems, I know I can design probes to change them, but I can't design the change exactly.

Exploratory testing is how I can get more done with less time and effort. 

Alongside appreciating how it has helped me grow, I look at the people around me, some of whom are now growing on their own paths, and some of whom have grown tired but do testing as it is all they know for a career. I recognize plenty of non-exploratory testing around me, even if we like to think that all (good) testing is exploratory. Not all testing is exploratory testing. You can use the very same testing techniques in a non-exploratory and an exploratory fashion. Some people still box exploratory testing into that one Friday afternoon a month when they let go of their stiff harnesses and see what happens when it's just a person with the software created, on a quest to learn something new worth reporting.

Exploratory testing is testing with a mutually supportive set of ideals and practices. Today, I want to talk about the four ideals I look for to recognize testing as exploratory testing.

Learning

At the core of exploratory testing is learning. And not just learning every now and then, but really centering learning, focusing on learning, and letting learning change your plans.

As you are testing, you come upon an idea. Perhaps what you were doing was tedious or boring, and the negative feelings let your subconscious roam free, and you remember something completely different and connect it with your application. Perhaps what you came to realize was exactly about what you were observing - an asymmetry in functionalities, or a feeling that you've seen something like this before.

When learning guides you, you now have a choice: you can park the new idea - maybe make a note of it; you can act on the new idea - letting go of what was ongoing; or you can discard it.

The learning impacts you in the moment, in your short-term plans, and in your long-term plans. It impacts what and how you test, but also what and how you communicate to others. The mindset of learning has you thinking about yourself, your abilities and your reactions, creating tests for yourself: experimenting with new ways that don't come naturally, seeing if what you believe about yourself is true, and becoming a backbone on a journey to be a better person.

The tool you're sharpening through learning is you, and the other people around you.

When you see testing efforts with very little learning happening, test cases being repeated, and a focus on reapplying recipes, with low-quality retrospectives and a lack of introspection, you are likely not looking at exploratory testing.

When you're being trained to become an exploratory tester, you may notice that you're not given the answers. You're taught to figure out the answers. You're not given a test case with expected values; you're told there is a feature in your software and you need to figure out whether it works. You're expected to turn work that appears to be 5-minute work into 5-hour work and 5-day work by understanding how it is connected to information and value in software creation. All work of testing has surprising depth to it, and that depth is easy to miss.

Agency

Another significant ideal in exploratory testing is agency. A word that does not even translate to my own language has become a core of the way I think about exploratory testing. In sociology, agency is defined as the capacity of individuals to act independently and to make their own free choices. Free choice is not exactly what we give to testers in organizations, as per common experience. Testers and testing are often very constrained, as a project manager of my past once told me: "No one in their right mind could enjoy testing. Marking test cases done daily is the only way to force people to do it."

Exploratory testing isn't fully free choice, but it makes the testing box of choices free and adds agency to the other choices testers can make as full members of their work communities.

Agency is constrained by structure, such as social class (power assigned to testers), ability (skill of exploratory testing, skill of programming, skill of business domain understanding...) and customs (testers don't make decisions, testers don't program). Exploratory testing is a continuous act of changing those structures to enable the best possible testing.

And obviously, best possible testing is a journey we are on, not a destination. The world around us changes and we change with it. 

Not everyone in current organizations has agency. Testers are considered a low social class in some places. Testers are not given many choices in how testing is done; process determines that. Testing without testers does not include many choices either; process determines that. Our organizations become constrained to our test cases, barely adding to them per recipe and keeping the added tests afloat. That is not exploratory testing.

For me, agency explains what draws me to exploratory testing. Having agency, and allowing others around me agency, is my core value. I frame it as being a rebel, finding my own path, and ensuring that while serving the world I am in, I'm a free agent, not an item.

Opportunity Cost

Neither learning nor agency yet captures the constraint in the world of exploratory testing: we are all constrained by the time we have available. Opportunity cost is the ideal that we respect our limited time and use it for the best possible outcome. We need to make choices between what we do, and whatever we choose to do leaves out something else we could not do in that time. Being aware of those choices, and making them in a way where we also see the cost of the things we don't do, is central to exploratory testing.

We can spend time hands on with the application, or we can spend time creating a software system that does testing of our application. Both are important, valuable activities. We intertwine them under whatever constraints we have in the moment, optimizing the cost towards value.

Not all organizations allow testing to consider opportunity cost. The experiment is set somewhere else. The skills constrain possible choices, and changing skills isn't considered a real possibility. 

Systems Thinking

Finally, the fourth ideal around exploratory testing is systems thinking. The software my team creates runs with software someone else created, and our users don't care which of us causes the problem they see. A lot of the time, teams have developers feeling responsible for their own code while testers are responsible for the system. The software is part of something users are trying to achieve; that too is part of the system. There are other stakeholders. Software does not exist in a bubble. Exploratory testing starts with rejecting the bubble.

Not all organizations break the bubble for all their teams. They create multiple bubbles and hierarchies. 

Conclusion

I've seen too much testing that is test-case based, recipe based, founded on poor retrospective capabilities, and limited to a scope smaller than it needs to be, for more reasons than I can list in this moment. The change, as I see it, starts with agency used for learning.


Sunday, August 30, 2020

Pondering on Requirements

I remember the day in my career when I understood that Requirements were something special.

I had spent years with software testing where feedback was welcome even if the resulting changes got prioritized to wait, but this project was different.

I could have seen it coming from the advice people were giving on exact requirements traceability, like we were preparing for a war to defend what was rightfully ours.

I had tested a system and found out it had been built on a version of an open source component that was end of life, with problems that would, as the project progressed, lead us to a dead end with regard to our ability to react. It had been built in a way where changing the component was not straightforward, quite the opposite. I reported this, getting the attention of my own organization's entire management team. We scheduled a meeting with the subcontractor's architect, and he played the Requirements card. We had not specifically said that a brand new, in-progress, multi-million system that would put us into this position should rely on something different.

My years with this company were filled with experiences like this. Continuous fights over contractual clauses. Meetings where we would discuss the movement of money for yet another newly discovered Requirement, like one saying that a machine intended to calculate monetary benefits should calculate at least roughly correctly. No, all of those were our mistakes in the Requirements.

Years passed, and I learned to choose my work so that instead of focusing on Requirements, we focused on value and features and progress. Requirements stayed in the role they should have: points of communication, aiming for mutual understanding and benefit to our customers. Useful for testing to know as the rough idea of what we were building, but not the focus or the limit of what there is to that system.

With agile, we learned that epics and stories were not requirements; they were a new kind of mix of a mini project plan and something requirements-like. With continuous delivery, we could do small slices at a time, and running tested features supported by test automation and a caring team became our new normal.

When the Requirements card gets played now, it is played to avoid responsibility on one side of the mutual relationship of building something good. It's played to say there needs to be a list and proof of covering everything on it, because someone expects something they are sure you cannot do without that. The cost of the proof - not just the direct work but the impact on being able to see things and stay motivated - is irrelevant.



Friday, August 21, 2020

A Tester Hiring Experiment - Test with Them

For two summers in a row, for two different companies, I have been in the lovely position of being able to offer a temporary trainee position to someone for that summer. As one can imagine, when there is a true beginner position available, there are a lot of applicants.

Unfortunately, it is really hard to tell one applicant from another when what you look for is potential. Our general approximations of potential are off, and the biases we have will impact our choices.

The first trainee that we selected, for last summer, came through the HR pipeline. The lucky 5, selected out of thousands to be seen at the final stages, were fascinating people to talk with.

First of all, they all had programming experience. I particularly remember a woman with 2 years in a position at another company, whom my co-interviewer rejected based on "not knowing this piece of trivia means she does not know anything", and a 15-year-old boy with 6 years of programming experience. While I'm delighted the promising young man got a chance, that recruitment filled my heart with hopelessness for anyone who starts later in life.

Again, we are recruiting on potential. Obviously we want the person to contribute to the work. But we also want the person to learn, to grow, and to become someone they are not yet when they start their work with us. And you can't see that potential from their past achievements when we talk about an entry-level position.

Entry-level positions are like placing bets. Chances are we will never know we missed an awesome candidate. Or, chances are, we will know later in life when the person we rejected on trivia shows up as our boss.

With my hopeless heart, I needed an experiment to bring back hope. To balance the last line of trainees all being men, I facilitated the creation of another position, open to everyone but primarily marketed in women's spaces. I did so well with targeting my marketing that men did not apply. Where you post matters for who you get.

Also, I wanted to try different criteria. Instead of selecting people based on how they write and talk about their aspirations and experiences, I refused to read their CVs to prioritize who I would talk to. I talked to every single one of them, for 15 minutes. Or rather, as I had already set expectations in the invitation to apply, we wouldn't talk about them. We would pair test an application, because the main skill I would look for in a candidate under my supervision is learning under my supervision.

Looks like I tweeted on Fri Jun 21, 2019 that I went through 28 people.

"created_at" : "Fri Jun 21 08:29:11 +0000 2019",
"full_text" : "Gilded Rose brought us ‘junior developer’, our summer intern. I made 
28 people do it for 15 minutes each. Never read their CV. It shows interesting 
things about how you approach a problem. https://t.co/iWebQMcJwx",
"lang" : "en"

I saw the people on video calls as we worked together on the same problem, over and over again with the candidate changing. I would write notes on how they approached the problem and how they incorporated my guidance as we were strong-style pairing on test code. And after a lot of calls, I had a small group of people with a tester kind of thinking and learning pattern that I would select from. The final choice was based on luck: I put the three names in a hat and pulled one out.

Turns out she was a 47-year-old career changer. The moment when I felt like I should have given this rare opportunity to a younger woman was a great revelation of my personal built-in ageism. Having acknowledged my bias, I set out to help her succeed.

During the summer, she learned to write test code in Python and include it in a continuous integration system. She explored and analyzed a feature, and got multiple things in it corrected. The tests she contributed were our choice, and in hindsight, our choice sucked. She was part of starting a larger discussion on what types of tests are worth it, and which are of little value. Her coded tests didn't fail because she couldn't code or analyze a feature, but because we pointed her at a feature we really should have thought twice about.

The lessons I drew from this are invaluable to me:

  • Choosing a tester by testing with them is a better foundation
  • Choosing a tester by testing can happen in short sessions, and the overall time is better used in this activity than in deciphering a CV
  • The work we allocate to someone starting as new does matter, and their success is founded on our choices
  • Diversity of our workforce will never change if we expect our 15-year-old summer trainees to come with 6 years of programming experience. The field evidence shows that a late start does not hinder later usefulness.

So this summer, my experiment has been around how I teach at work. I throw new people at the versatility of a real project and protect their corner less. I work to make myself somewhat available for moving them forward. And after two months of watching, I am delighted with the results and how well they do the basic tester job: finding information, driving fixes and making some themselves, and automating in a selection of programming languages. Obviously they have more learning to do, but so do I, and I have been at this for 25 years.




Saturday, August 8, 2020

Recall Heuristics for Test Design

Good exploratory testing balances our choices of what to do now so that whenever we run out of time, we've done the best job of testing we could in the time we were given, and are capable of having a conversation about our ideas of the risks we have not assessed. To balance choices, we need to know there are choices, and recently I have observed that the range of choices some testers make is limited. A lot of what we call test design nowadays is recalling information to make informed selections. Just like they say:

     If the only tool you know is a hammer, everything starts to look like a nail. 

We could add an exploratory testing disillusionment corollary: 

    It's not just that everything starts to look like a nail, we are only capable of noticing nails. 

The most common nail I see testers notice is the error handling cases of any functionality. This balances the idea that the most common nail programmers see is the sunny day scenario of any functionality, and with the two roles working together, we already have a little better coverage of functionality in general.

To avoid the one-ingredient recipe, we need awareness of all kinds of ingredients. We need to know a wide selection of options for how to document our testing: from writing instructional test cases, to making freeform notes, to making structured notes on an individual or group level, to documenting tests as automation as we are doing it. We need to know a selection of coverage perspectives. We need to know that while we are creating programs in code, they are made for people, and a wide variety of people and societal disciplines, from social sciences to economics to legal, apply. We need to know the relevant ways things have failed before, being well versed in both generally available bug folklore and local bug folklore, and to consider not failing the same way again, but also not allowing our past failures to limit our future potential: driving testing by risk, not fear.

This all comes down to the moment you sit in a team meeting, and you do backlog refinement over the new functionality your team is about to work on. What are the tasks you ensure the list includes so that testing gets done? 

In that moment, when put on the spot, what I find useful is recall heuristics: something that helps me remember and explain my thoughts in a team setting. We can't make a decision in the moment without knowing our options.

I find I use three different recall heuristics to bring back my options in the moment. Each explores at a different level of abstraction:

  • change: starting from a baseline where the code worked, a lot of the time what I get to test is at the level of a code commit to trunk (or about to head to trunk).
  • story: starting from a supposedly vertical slice of a feature, a user story. In my experience, though, people are really bad at story-based development in teams, and this abstraction is rarely available even if it is often presented as the go-to level for agile teams.
  • feature: starting from a value collection in the hands of customers, where we can all buy into the idea of enabling new functionality.

For a story-level recall heuristic, I really like what Anne-Marie Charrett has offered in her post here. At the same time, I am in a position of not seeing much story-based development; the backlogs around me tend to be about value items (features and capabilities), and the story format is not considered essential.

Recall on level of change

The trigger for this level of recall is a change in code. Not a Jira ticket, but seeing lines of code change with a comment that describes the programmer's intent for the change.

Sometimes this happens in a situation of pairing, on the programmer's computer, the two of you working together on a change. 

Sometimes this happens on a pull request, someone having made a change and asking for approval to merge it to trunk. 

Sometimes this happens on seeing a pull request merged and thus available in the test environment. 

This moment of recall happens many times a day, and your ability to think quickly on your feet about an unknown change is the difference between fast feedback and delayed feedback.

How I recall here:

  • (I) intent: What is supposed to be different? 
  • (S) scope: How much code changed? Focused or dispersed? 
  • (F) fingerprint: Whose change, what track record?   
  • (O) on it: How do I see it work?
  • (A) around it: How do I see other potentially connected things still work?

Recall on level of feature

The trigger for this level of recall is the need for test planning on the scale of a feature, to facilitate programmers carrying their share of testing but also to make space for testing.

Sometimes this happens in a backlog refinement meeting, the whole team brainstorming how we would test a feature.

Sometimes this happens in a pair, coming up with ideas of what we'd want to see tested. 

Sometimes this happens alone, thinking through the work that needs doing for a new feature, when the work list is formed by a process implying "testing" happens on every story ticket and epic ticket, without agreeing on what it specifically would mean.

How I recall here:

  • (L) Learning: Where can we get more information about this: documents, domain understanding, customer contacts.
  • (A) Architecture: What building it means for us, what changes, what comes in new, and what stays.
  • (F) Functionality: What does it do and where's the value? How do we see the value in monitoring?
  • (P) Parafunctional: Not just that it works, but how: usability, accessibility, security, reliability, performance...
  • (D) Data: What information gets saved temporarily, what gets retained, and where. How do we create what we need in terms of data?
  • (E) Environment: What does it rely on? How do we get to see it in growing pieces, and where?
  • (S) Stakeholders: People we hold space for. Not just users/customers but also our support, our documentation, our business management.
  • (L) Lifecycle: Features connect to processes, in time. Not just once but many times.
  • (I) Integrations: Other folks' things we rely on.

Recalling helps us make choices because it makes us aware of our choices. It also helps us call in help in making those choices.