Tuesday, December 28, 2021

Reflections on a Cadence

When last year was ending, I remember the distinct feeling of not wanting to move as I had before with my end of year reflections. Instead of looking back on yet another year, I wanted to look forward. So instead of writing a summarizing blog post, I thought about what I might want to find my balance on, and came up with this.  


I identified five areas of focus to seek balance in: paid-for hours, paid-for value, networking, upskilling and self-care. I did many of the things, but some of what I had planned turned out less successful. 

Let's summarize the numbers:

  • 47 external talks, with 30 different topics/titles
  • 58 blog posts, 3 #TalksTurnedArticles and 1 full course turned text on dev.to 
  • 16 rackets recorded with Ru
  • a regular 7.5 hrs a day job doing things I can take inspiration from but not detail here
  • someone created a wikipedia page of me
  • 4 times Tester of the Day
This brings my total talks to 474 since I started public speaking in 2001, my blog posts to 784 with a total of 783,567 page views, and my twitter follower count to 7,835. That means two years have added 152,560 page views on my blog and 2,173 followers on my twitter. With my blog and twitter being more public notetaking than writing to an audience, I still find it amazing that some people find value in what I write. 

Overall, the year at work was amazing. Did good stuff with good people. 

Instead of reflecting on all of this and feeling overwhelmed, I want to write down some of my failures. 
  • I committed to contributing to "cloud center of excellence" and "company-wide process improvement", only to learn that I hated it. The idea of finding common denominators for essentially different teams and needs felt like I was not doing anyone a favor, and I walked out of the effort. Being true to how I felt was more important than struggling with work I was not cut out for. 
  • I struggled with regularly exerting energy with people I felt disconnected from. If I did not have my home group of many awesome people around the organization, I would have found myself completely depleted. I tried to fix that by introducing a change and trying a different group, different approach, different challenges. 
  • I failed to notice someone was not speaking to me or listening to me until it blew up 3 months later when I said something they needed to listen to. 
  • I decided to do measuring, and did some, then gave up on most of it, as there's enough trouble measuring the physical world; I can rethink measuring the invisible in software later. 
  • I avoided writing my books by writing tons of talks and articles. Productive procrastination, but procrastination all the same. 
  • I did not learn to draw, but I learned to test better (again, and still more to do), and I did learn a lot about python. 
  • I did not learn to communicate without insulting the other party in search of change, and probably never will. "Extending use of pytest to system testing" feels much more dishonest than "removing Robot Framework", even though they essentially mean the same thing, and the latter causes multiple people to actively attack me even though I am still only seeking to improve the efficiency and effectiveness of our testing.  
  • I replaced all other ideas of "self-care" with lots of hugs, deep conversations with my teens, and digging myself a hole by not doing bookkeeping in time but finding family come to the rescue when I need them. 
Every year I learn about what I can do and can't do. I make choices on accepting some, and changing some. I get many things right, and many things wrong. But I total on the positive.

Because I don't have to go to the office, I have more time on my hands. I'm healthier on old dimensions, since I have always been allergic to people with animals, and there's plenty of those at the office. I'm still connected with people who matter to me, and physical location isn't what determines my best connections outside my immediate family. 

We built something relevant at work, and I leave things better than they were when I found them. My new place finds use for me and has a lovely group of people, and all this leaving and joining fits the storyline we've built for me. I'm happy that I dare to try things that don't always work out, and step out when that is the right thing to do. 

Doing many small things continuously gets you through a lot. So while I can't explain all of my work here, I will get to reflect on that - on a cadence - next. 


Thursday, December 23, 2021

Robot Framework and the Myth of Embedded Devices

Coming into projects, I find myself shaping my work and perspectives to balance. Back in the days of being a test manager, working with project managers, I noticed that working with an optimist made me a pessimist; working with a pessimist made me an optimist. Extreme views on one end lead to opposing extreme views on my end. 

Robot Framework is like this for me. I respect the fact that I have been dropped into a puddle of Robot Framework love, both in terms of where I work and in terms of being in Finland, and I just don't buy into the love. 

Instead, I look at what goes on. Like noticing a Robot Framework Test Automation Expert who, due to technology choices, ends up in a team where that wasn't the team's tool of choice and becomes unable to contribute, period. Like noticing the difference in speed from problem to solution in the test system space when comparing someone specializing in Robot Framework and someone who can also equally fluently use pytest. Like noticing how moving away from Robot Framework increases whole-team testing using test automation. 

Or, most recently, like noticing a colleague ask for an example of "how to run modbus-protocol device with python and playwright" - a common example of not thinking about what pieces Robot Framework consists of, and of not recognizing that the lovely modbus/nimbus-R library we use at work is a piece of python code we ourselves created for the purpose of running a modbus-protocol device with python. 

Surely short questions and answers will always be inaccurate and lead to conclusions about understanding, so I decided to take a longer text form - even if it creates more confusion. 

The key thing here is understanding architectures just enough to know what we connect with. For the rest of it, the chosen framework may help us create the clarity of structure we can enable the future of our organizations with. 

While on the hobby side I do a lot of my examples on either web UIs (choosing an appropriate driver) or REST APIs (choosing an appropriate driver), work wise I have had a mix of web in cloud, web from device, embedded devices, Windows UIs, and Windows/Linux servers just in the last few years. Somehow people think embedded is so different, but it really isn't. It's all hard, just a little different kind of hard. 

Let's talk about this a bit on the level of architecture and what it might mean for test automation. 

Imagine you have two teams, each building an embedded device. 

The first one (pink) has a very light set of operating system services, including the idea that instead of something as advanced as a file system, you have a set of registries you store values in. These devices tend to be highly specialized for their purpose, but the purpose can be entirely different from one seemingly similar device to another. There's some way of connecting with the device, in the form of a cord you can plug in, or some form of wireless messaging. There might be a button somewhere, and there might be nice blinking lights. Be it wired or wireless, there's one or more communication protocols to read and/or write data. If you're in the team testing the pink device, the likelihood is you won't be running your tests inside that device but controlling inputs and observing outputs on one of the connections. 

The second one (blue) has embedded linux on it, and even if it is an embedded device, it's not the kind of embedded device you read about in the age-old books. It's a full-fledged computer, and with linux, everything is a file! The options of what structures you end up implementing are endless - and uncertain. Again you can connect, wired and/or wireless, most likely with a different set of protocols. You have a button to press, and some lights to see blink in colors. But most of all, you can drop code into this - general purpose test code - make it do things inside, and collect results when done. That is, should you want to. 

To automate, you control whatever your control points are. 

  • Power: you turn the device's power supply on/off (requires a hardware interface for remote control)
  • Communication: you send/receive stuff in batches or continuously to run the device on any of the protocols and interfaces you have. 
  • Visual: you "see" lights with light-sensitive test devices or read filesystem for logically turning the lights on/off. 
No matter what your test automation tool is, this is an exercise of recognizing interfaces and having libraries available to make accessing those interfaces possible.
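
To make that concrete, here is a minimal pytest-style sketch of testing the pink device from the controller PC. The PowerSupply and DeviceLink classes are hypothetical stand-ins for whatever libraries give you access to your power and communication interfaces - they are not the modbus library we built at work - and the register addresses are made up for illustration.

import pytest


class PowerSupply:
    """Hypothetical stand-in for a remotely controllable power interface."""
    def __init__(self, outlet):
        self.outlet = outlet

    def on(self):
        pass

    def off(self):
        pass


class DeviceLink:
    """Hypothetical stand-in for a communication library (wired or wireless protocol)."""
    def __init__(self, port):
        self.port = port
        self.registers = {0x10: 0, 0x20: 0, 0x30: 0}

    def write_register(self, address, value):
        self.registers[address] = value
        if address == 0x10 and value == 1:
            self.registers[0x20] = 42   # fake a measurement appearing
            self.registers[0x30] = 1    # fake the status light turning on

    def read_register(self, address):
        return self.registers[address]

    def close(self):
        pass


@pytest.fixture
def device():
    """Power-cycle the device and open a communication link; all of this runs on the controller PC."""
    power = PowerSupply(outlet=3)
    power.on()
    link = DeviceLink(port="/dev/ttyUSB0")
    yield link
    link.close()
    power.off()


def test_measurement_register_updates(device):
    # Communication control point: write a configuration value, read a measurement back.
    device.write_register(address=0x10, value=1)
    assert device.read_register(address=0x20) != 0


def test_status_light_turns_on(device):
    # Visual control point: read the logical state of the light instead of "seeing" it.
    device.write_register(address=0x10, value=1)
    assert device.read_register(address=0x30) == 1

The framework on top changes nothing about this exercise; the work is in the interface libraries and in the fixtures that wire them together.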

Neither Robot Framework nor pytest will run on the pink device. They run on the controller PC connected to it. 

Both could run on the blue device; neither one needs to, unless your test design for a specific feature demands it for proper testing. 

Both run on the PC connected to the device's available interfaces. Both are capable of orchestrating multiple interfaces. 

Instead of asking why I am so vehemently against Robot Framework, try asking why you are so vehemently for it. Did you really compare what the same people can get to when maintaining a test framework where they can name every driver, rather than thinking it "comes with Robot Framework" - which is, just like pytest, an extendable ecosystem? The former has a "vibrant community", but so does the latter. 

You already see the former (because it is the local market leader); why are you so against seeing the latter that you think attacking me for saying I see problems with the former is warranted? 




2022 Session Sensemaking

As this year is coming to an end, I have my eyes set firmly on things I want to do next year. Earlier this year I was trying to figure out what my next career progress theme is. Earlier I wanted to:

  • Become a keynote speaker ✅
  • Get a page of me on wikipedia ✅
I can still become a better speaker, and a more frequent and more awesome keynote speaker. No one is ever ready; speaking is continuous analysis of the themes you would choose to invest time on, and keynote speaking is about choosing some talks you really finetune. I choose new experiences and attempts at learning something new, and don't see keynoting now as a goal I would want to pursue actively, just continue. 

I can still get people to collect more of my legacy on wikipedia, and move from the Finnish wikipedia to the English one, but I am happy with how my local community fulfilled my aspiration - and with how I don't really know the people who wrote my page. I couldn't value what I have now more, even if it were bigger and more visible. Again, not a goal I would want to pursue actively, just continue. 

I can't dream of a raise in salary; I have a director-level salary as a tester. And I don't want to become a director or VP. I could become a consultant, but I have reserved that goal for a few years later, connected with the idea of making money to buy apartments for my two kids before they move out. 

I love hands-on work, and improving real results. The theories and words to explain what great testing looks like are just words if we can't turn them into action. And I want to be better, still, at turning my words into actions. 

So I chose what is new for 2022:
  • Upskilling as board member
  • Building up Finnish community of testing
I applied and got accepted to TIVIA board, and I'm now actively learning what boardroom work would look like. I will seek opportunities in this space beyond non-profits in due time. 

I resurrected from hibernation the Finnish testing non-profit I have been leading: Software Testing Finland (Ohjelmistotestaus ry) will see a steady monthly session investment from me in 2022. Because I pair up with Ru Cindrea, the language we use is English / Finnish mix. 

So I will run sessions with four "entities":
  • inside Vaisala: broadcasts (anyone can hear me inside the company if they opt into it), test automation demos and test experiences discussions. 
  • Ohjelmistotestaus ry: monthly sessions to build up local testing community, my primary place to try out sharing new talks that aren't internal
  • Exploratory Testing Academy: free sessions on hands-on activities I am turning into creative commons training materials. I'm adding a theme trio later today with unit, api and UI exercises. 
  • General Conferences, Meetups: I try to say yes to invitations within what my schedule allows, as I did this year, creating new talks and repeating what I had already created. 
I have my own company (maaretp) on the side, through which I do open and company-specific trainings occasionally, invoicing for the teaching work. I have limited availability to this, as I *want* to invest my regular work weeks in making testing and quality even better at Vaisala, in scale. 

Saturday, December 11, 2021

Transforming Agile Ceremonies

Time and time again, I join teams with the usual agile ceremonies and can barely hold myself together. The daily meetings make me cringe. The retrospectives are post-its, but the conversations are prioritized for the majority, meaning my things never get talked about. The planning for the next increment feels like a forced routine, and what fits in 2 weeks is at the center of the stage. The long-term planning, refinement, is half-hearted since heads are already full with what is going on. And the demos - they are either non-existent or the best bit of the whole set. 

In a greater picture, "agile" as the ceremonies and discussions on what is the correct way of doing things is something I would prefer to step away from. But I do care about how I feel, and how my colleagues feel, and I care about the results we are able to provide. 

What I would like us to do is to take the ceremonies and turn them into their better versions.

Daily should not be about each individual explaining what they did. It should be about each individual synchronizing and pulling work to advance the highest priority items the team co-owns. A good daily creates shared understanding, and we make decisions on the next day more than explain that we continue on the plan. We should already know what goes on; after all, we are collaborating through chats and calls on the themes throughout the working day. Or at least we should be. 

A better daily centers around the "epics" or "stories" we are delivering - value to the customer. We optimize all our work so that we don't progress everything at once but the topmost item the most and fastest with whatever limits we currently have on our abilities. 

Retrospectives should not be about minimizing our difficult conversations into post-its that don't get discussed. They should be a space in which we come together to hear what others have in mind, and sometimes turn that into actions right away. With one of our teams, we had a homework questionnaire for the retro, showing that the team was heavily divided in its opinions. This was never visible in the shared session, where a loud majority creates an appearance of the truth. We did not agree on actions to fix it, but the mere understanding changed things within just one month - each individual chose their own ways of showing up better for that particular theme. 

A better retrospective is a versatile continuous improvement conversation. Sometimes we collect views. Sometimes we agree on solutions. Sometimes we follow one structure and other times another. Everyone's voice should be heard. Minority voices should be amplified. We build the working environment for us all. 

Planning should not be about effort estimates and fitting a sprint, but about agreeing on the next smallest possible scope of delivering something of value. Estimating should be replaced with not estimating, and the task split should be replaced with trusting people that the value card is enough - and when it isn't enough, the team can use tools to make notes that support them. The cards are best written as part of work intake: a person taking work describes the work they take on, and the person prioritizing can review and collaborate. 

A better planning enables us to start together, and make sense of the threads we have ongoing - releases, epics/stories, capability improvements - and supports us in getting stuff done, even stuff we did not agree on, as long as it makes sense. And we know it makes sense when we understand how the changes fit our overarching vision of what good would look like. 

Refinement should not be about co-existing in a meeting room while someone reads us the new things they are hoping we would work on next. It should be about checking changes in our understanding of what work we need to discover in ways that require calendar time, and moving all of us in the team to a common understanding we can work further from. 

A better refinement discusses customer problems and our ideas of how we could prepare for solving them. 

None of these require an agenda to be great. We can use the first 5 minutes on discovering the agenda together, and prioritizing it to fit into the timebox. They are just reminders of conversations we usually want to have on different cadences. 

For testing, when stories are actually stories instead of tasks we call stories, there is a lovely structure to work from. But not having that structure isn't stopping us from always working on the different timeframes: start early on things that require time (refinement); think about what we're doing and how that fits a bigger picture; follow the changes over the plans and learn continuously. 


 

Friday, December 10, 2021

TDD makes a bad programming test for testers

I'm a tester by trade and heart, meaning that looking at a piece of code, I get my thrills from thinking about how it will fail over how I can get it to work. Pairing with a developer who doesn't understand the difference can be an uncomfortable experience. Seeking weaknesses in something that exists is a bit of a different exercise than building something up.

Imagine an interview situation, going in with "Set up your IDE on a language of your choice, ready to go". A tester pairing with a developer on a "programming skills test". That setup alone is making me uncomfortable. 

With an exercise out of the blue, the usual happens. The expectations of what we're about to do get muddled. They share a gist of numbers written in English as text. Working on code, they start by asking for a "signature" over explaining the problem. The usual troubles of pairing with someone new. 

With an empty canvas, we write the first test selecting something from that list in gist. 

def test_1_becomes_one():
    assert int_to_english(1) == "one"

Following red, the test won't pass without implementation. So you add implementation. 

def int_to_english(n):
    return "one"

Expectations are about to get more muddled. While I know where this is going with TDD, this is not the thing I signed up for. 

I know that I can build up the implementation from here. And that it shows how I would solve this puzzle. But my work starts from seeing one of the many ways the puzzle is solved, and I try to steer towards it (with little success, of course). 

I ask to just say the problem in English. Knowing the IDE runs co-pilot, I am hoping to steer the conversation from designing code with TDD to testing code someone designed - critiquing the design that emerged, figuring out what values reveal information that we missed something, seeking limits of the solution. 

The English comes off as 

#function to convert integer to text 

It's not particularly good English of the problem, and if I knew what the pair holds in their head, I could have improved the English to 

#function to convert integer to English for values from 0 to 999

Copilot is super-helpful giving multiple solutions to choose from, with ctrl+enter. Now we're talking testing!

The first option is next to hilarious. 

def int_to_english(n):
    if n == 0:
        return "zero"
    elif n == 1:
        return "one"
    elif n == 2:
        return "two"
    elif n == 3:
        return "three"
    elif n == 4:
        return "four"
    elif n == 5:
        return "five"
    elif n == 6:
        return "six"
    elif n == 7:
        return "seven"
    elif n == 8:
        return "eight"
    elif n == 9:
        return "nine"
    elif n == 10:
        return "ten"
    elif n == 11:
        return "eleven"
    elif n == 12:
        return "twelve"
    elif n == 13:
        return "thirteen"
    elif n == 14:
        return "fourteen"
    elif n == 15:
        return "fifteen"
    elif n == 16:
        return "sixteen"
    elif n == 17:
        return "seventeen"
    elif n == 18:
        return "eighteen"
    elif n == 19:
        return "nineteen"
    elif n == 20:
        return "twenty"
    elif n == 30:
        return "thirty"
    elif n == 40:
        return "forty"
    elif n == 50:
        return "fifty"
    elif n == 60:
        return "sixty"
    elif n == 70:
        return "seventy"
    elif n == 80:
        return "eighty"
    elif n == 90:
        return "ninety"
    elif n == 100:
        return "one hundred"
    elif n == 200:
        return "two hundred"
    elif n == 300:
        return "three hundred"
    elif n == 400:
        return "four hundred"
    elif n == 500:
        return "five hundred"
    elif n == 600:
        return "six hundred"

You don't need to test beyond a glimpse of a review that this is not going to work out. Browsing further, you see more promising options. 

def int_to_english(n):
    #dictionary to store the values of each digit
    num_dict = {
        0: 'zero',
        1: 'one',
        2: 'two',
        3: 'three',
        4: 'four',
        5: 'five',
        6: 'six',
        7: 'seven',
        8: 'eight',
        9: 'nine',
        10: 'ten',
        11: 'eleven',
        12: 'twelve',
        13: 'thirteen',
        14: 'fourteen',
        15: 'fifteen',
        16: 'sixteen',
        17: 'seventeen',
        18: 'eighteen',
        19: 'nineteen',
        20: 'twenty',
        30: 'thirty',
        40: 'forty',
        50: 'fifty',
        60: 'sixty',
        70: 'seventy',
        80: 'eighty',
        90: 'ninety',
        100: 'hundred',
        1000: 'thousand'
    }
    #if the number is less than 20
    if n < 20:
        return num_dict[n]
    #if the number is less than 100
    elif n < 100:
        #if the number is a multiple of 10
        if n % 10 == 0:
            return num_dict[n]
        #if the number is not a multiple of 10
        else:
            return num_dict[n // 10 * 10] + ' ' + num_dict[n % 10]
    #if the number is less than 1000
    elif n < 1000:
        #if the number is a multiple of 100
        if n % 100 == 0:
            return num_dict[n // 100] + ' ' + num_dict[100]
        #if the number is not a multiple of 100
        else:
            return (num_dict[n // 100] + ' ' + num_dict[100] + ' '
                    + int_to_english(n % 100))
    #if the number is 1000
    else:
        return num_dict[1000]

Now I have something to test! 

I would already have lovingly crafted tests for each of my branches if I had stuck to the process of TDDing the solution out, but that is not what I personally consider interesting. I'm a tester, and the test target is my external imagination. And I wouldn't hand-craft the examples; there are much more effective strategies for this.

I turn this into an approval test of all values from zero to 999 - and beyond if I feel like it. 

from approvaltests.combination_approvals import verify_all_combinations


def numbers_list(num):
    # All integers from 0 to num, inclusive.
    return list(range(0, num + 1))


def test_all_cases():
    verify_all_combinations(int_to_english, [numbers_list(1000)])

Now I can visually verify if I like the pattern I see. 

args: (0,) => 'zero'
args: (1,) => 'one'
args: (2,) => 'two'
...
args: (19,) => 'nineteen'
args: (20,) => 'twenty'
args: (21,) => 'twenty one'
...
args: (998,) => 'nine hundred ninety eight'
args: (999,) => 'nine hundred ninety nine'
args: (1000,) => 'thousand'

The ... here marks my omission of a very long list beautifully doing what I was expecting. 

I have worked with programmers for 25 years, knowing that the solution could be different. It could be any of the other suggestions copilot gives me, and I - an exploratory tester extraordinaire - care for results matching what is necessary for the success of the business with the user and other stakeholders. 

Another solution I could have ended up with is this one: 

def int_to_text(n):
    if n < 20:
        return ["zero", "one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve",
                "thirteen", "fourteen", "fifteen", "sixteen",
                "seventeen", "eighteen", "nineteen"][n]
    elif n < 100:
        return (["twenty", "thirty", "forty", "fifty", "sixty",
                 "seventy", "eighty", "ninety"][(n // 10) - 2]
                + (["", " " + int_to_text(n % 10)][n % 10 > 0]))
    elif n < 1000:
        return (int_to_text(n // 100) + " hundred"
                + (["", " " + int_to_text(n % 100)][n % 100 > 0]))
    else:
        return "one thousand"

Which is nice and concise.

Comparing its output from the same test to the previous implementation, the difference is glaring: 


I could also have ended up with this: 

ones = ["one", "two", "three", "four", "five", "six", "seven",
     "eight", "nine"]
tens = ["ten", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"]
teens = ["eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]

def int_to_english(n):
    if n < 0:
        return "negative " + int_to_english(-n)
    if n == 0:
        return "zero"
    if n < 10:
        return ones[n]
    if n < 20:
        return teens[n - 10]
    if n < 100:
        return tens[n // 10] + " " + int_to_english(n % 10)
    if n < 1000:
        return int_to_english(n // 100) + " hundred "
               + int_to_english(n % 100)
    if n < 1000000:
        return int_to_english(n // 1000) + " thousand "
               + int_to_english(n % 1000)
    if n < 1000000000:
        return int_to_english(n // 1000000) + " million "
+ int_to_english(n % 1000000)
    if n < 1000000000000:
        return int_to_english(n // 1000000000) + " billion "
+ int_to_english(n % 1000000000)

And with the very same approach to testing, I would have learned that

args: (0,) => 'zero'
args: (1,) => 'two'
...
args: (8,) => 'nine'
args: (9,) => IndexError('list index out of range')
...
args: (18,) => 'nineteen'
args: (19,) => IndexError('list index out of range')
args: (20,) => 'thirty zero'

And trust me, at worst this is what I could expect to get functionally. And with all this explanation, we did not get to talk about choices of algorithms, whether performance matters, whether this can or even needs to be extended, or where (and why) anyone would care to implement such a thing for real use. 

With copilot, I wouldn't even have to read the given values from the file you gave me in the first place. That I did after the interview, because I was sure it was not complicated, and some of that work feels like adding new failure modes around file handling that I would deal with when they exist. 

Instead of us having a fun time testing, we had a different kind of fun. Because I still often fail in convincing developers, especially in interview situations where they are fitting me into their box, that what I do for work is different. And I can do it in many scales and with many programming languages. Because my work does not center on the language. It centers on the information. 

Conclusion to this interview experience: a nice story for a blog post but not the life I want to live at work. 

Friday, December 3, 2021

Testing is a fascinating word

We all think we know what testing is. We can even whip up a few options to choose from on the definitions, but definitions don't matter as much as our perceptions. When I ask you to think of testing, you think of testing. It may well be that the scope of testing you think of and the scope of testing I think of have little, if anything, in common. 

I ended up thinking about this again after seeing the infamous words in a memo: "We automate all our testing". Well, true. And well, not true at all. There is something that creates all that test automation that now does all the testing you care to name, and that too is testing. Confused yet? 

To continue with the confusion, let's point out that testing is both a verb and a noun. It is a great general word, a bit like thinking and learning. And those are the core elements of testing, not the documentation we create out of thinking and learning. Test automation is to a large degree documentation, with some awesome features like alerting when it is no longer up to date! 

Charity Majors said it well - testing we have automated isn't all there is to testing, because *past* mistakes aren't all we are searching for. 

Adding to an ambiguous word "testing" more words does not make it less ambiguous, but it does reveal how versatile the word is. Like this one: 

I propose we focus on the one thing we agree on: testing is something important for us to figure out and controlling words out of our control isn't how we figure it out. Automating all tests means we are adding automation. Automated testing is created manually.

Correcting language is a silencing technique. Correcting language is an investment of time. Is that really how we end up understanding each other, when in fact sometimes we already do? 



Friday, November 26, 2021

Removing Robot Framework From My World

I'm making progress in removing Robot Framework from my world. Why, you might ask, especially when so many managers in Finland have been taught to expect it. If you do something where you frequently need the help of a community to learn it further, why would you choose something where information search doesn't give you a large community?

I put a complete newbie through a month of Robot Framework and a month of pytest. Pytest won, hands down. My key takeaways:

  1. Information search is a completely different experience: both the quantity and the *tone* of responses the communities provide are different, and developer-first communities do better in leveling materials for fast-tracking newbies
  2. Debugging tests in the IDE: you'll need it, and you save tons of time by avoiding abstractions that take some of that power away from you. Running a single test from a suite is pure bliss in a suite that takes an hour to run
  3. One less layer, one less source of problems. Tools have bugs. You can always let the bugs hit others and carefully select the time when you allow for new versions, but a foot on the brake is energy away from where it should be. And when it is not bugs, you are waiting for (or contributing to) the translation layer to get the new cool stuff from the underlying library into your use (see the sketch after this list).
  4. Learn more. Robot Framework has become a synonym for bad programming combined with bad testing. In recruiting, it is more likely to mean "not good" than the other way around. Sadly, this is the trend. Making easy things easier is great, but making hard things harder gets people to stop at easy over important. Look at the behaviors people have with tools; there is a difference that matters
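
To make the "one less layer" point concrete, here is a minimal sketch of what the pytest side can look like against a REST API. The service URL and endpoints are made up for illustration, and requests is just one example of an underlying library you name and use directly, rather than through a keyword translation layer.

import pytest
import requests   # the underlying library, named and used directly - no translation layer

BASE_URL = "http://localhost:8080"   # hypothetical service under test


@pytest.mark.parametrize("endpoint", ["/status", "/measurements"])
def test_endpoint_responds(endpoint):
    # Each endpoint should answer with HTTP 200.
    response = requests.get(BASE_URL + endpoint, timeout=5)
    assert response.status_code == 200


def test_latest_measurement_has_a_timestamp():
    # The first measurement returned should carry a timestamp field.
    response = requests.get(BASE_URL + "/measurements", timeout=5)
    assert "timestamp" in response.json()[0]

Running and debugging a single test from this suite is plain pytest: pytest test_service.py::test_latest_measurement_has_a_timestamp on the command line, or the same test from a breakpoint in the IDE, with test_service.py being the hypothetical file this sketch lives in.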

In an hour of contemporary exploratory testing, we can go from this: 


to this - with people who have never written a line of code, with the power of ensemble programming. 
Next up, recognize the ten bugs these tests document as "works as implemented". 


Sunday, November 21, 2021

Balancing the Speaker Circuit

On this lovely Sunday, representatives of two different conferences ended up in the things I read, with a very similar message. Both conferences would love to have more women speak, but both also suffer from the same problem - women won't submit.

With this thought on my mind, I browsed further on the random things people share and came by a visualization of why it is a problem that hospitals have as many vaccinated as unvaccinated people. This led me to think about the tech conferences' problem of equal representation even further.

When the size of the population is large enough, a small location such as a conference stage is easy to fill. All you need is five "best people" from each of your two categories, and your stage is full. With a population large enough, you can't claim that the people you choose while using some helpful categorization aren't the best - you can't have everyone on display anyway, and for almost any topic that a representative of one group could present, you could find a representative of another group. 


This takes us to why it matters. Conferences model the world we expect to have. Every speaker invites speakers who identify with them to feel included - for their topics, for their experiences and for their representation. The world has slightly more women than men, and that is what the tech of the future should look like. Intelligence is distributed in the world, and we want to bring intelligent people in to contribute to tech. 


The current reality we source our speakers from, however, is the current tech industry, where we still have a lot more men than women. And with this dynamic, there is extra work included in being part of the minority group. We still have plenty of people in both groups to choose from, but the majority group uses less energy on just existing and thus has more energy to exert in putting themselves forward. 

Expecting people in both groups to show up equally in response to a call for proposals would assume that existing was equally laborious. 

While we don't have equality, we need equity - particularly reaching out to the minority group so that we can have a balanced representation. We shouldn't be choosing the "best out of people who did the free work for us" in a CFP; we should be choosing the best lessons our paying audiences benefit from in creating the software of the future. 

They say that they won't come to your home to find you, but with the power of networks, we could easily source the people - both men and women - from the companies doing the work, and thus qualified to share the work, without making the people themselves do the work of submitting to a call for proposals.

With 465 talks under my belt, I still rarely get paid for the teaching I do from all the stages. But the longer I am at this, the more I require that the conferences who won't pay for my work do their own work of reaching out for a balanced representation. I'm personally not available without an invitation, and consider the invitation a significant part of not having to invest quite so much into speaking. But I think this is true for people who need the extra step of being invited to feel welcome, and to dare to consider taking the stage. 

Find the ones who aren't in current speaker circulation, and invite them. Nothing less is sufficient in this time of connectedness. 

 

Tuesday, November 16, 2021

Increasing Understanding of Modern (Exploratory) Testing

Many, many years have passed since I published an article with the title Increasing Understanding of Modern Testing Perspective (2003). What I argued back then was that the V-model is harmful, and funnily enough, it has been years since I have run into that model anywhere but in academic writings. We treat unit, integration, system, and acceptance testing very differently these days. 

Now that I seek to understand and explain modern testing, I seek to understand and explain exploratory testing - the approach. A few months ago I wrote about how we have plenty of ways around to talk about it and confuse people, so adding labels to make sense of the difference between the two is necessary. Just as we did not need the term acoustic guitar before we had the electric guitar to distinguish it from. 

I seek to find ways to talk about contemporary exploratory testing, which has recently given me great results at work. 

  • We now fairly regularly release new versions of our firmware and upgrade the customers automatically for a particular experimental product
  • We moved from 34 working days of release testing to 2 days of release testing
  • We release two (soon three) products simultaneously when we used to release only one
  • We have 39% test automation coverage (as per features we've put to production, rated on none / some / good enough levels), and the reliability of tests has moved from weeks to fix to hours to fix
  • We find bugs that used to escape us
We do a better job with this idea of intertwining automation into exploratory testing. Same people explore with and without code. 

But to explain that what we do is different, I've been seeking ways to visualize it. I've tried explaining that we have different ways to talk about exploratory testing.

I've tried explaining that we apply exploratory testing in different scopes, and my scope includes the whole - contemporary exploratory testing is an approach to testing, not a technique. 
I've tried explaining we frame what belongs inside the box of exploratory testing differently - contemporary exploratory testing includes test automation, very explicitly. 

I'm still processing what might be the helpful ways of explaining which kind of exploratory testing we are sharing results on. Because my fact is, my results now are significantly different from my results back in the days of the other kinds, and I've been through them all. 

Labels help me sort out my past from my today, and hopefully share better what my today is. I've tried doing that with my talks recently - on Contemporary Exploratory Testing, Test Automationist's Gambit and Hands-Off Exploratory Testing to Manage in Scale. 




Thursday, November 11, 2021

The Ways Bugs Cost Us

When I was growing up to be a tester, I learned to think in terms of importance when it comes to bugs. 

Working with remotely installable antivirus, the bugs that would block us from remotely fixing the broken remotely installed antivirus - those would be tough ones on the global market. And I learned that really well by one day distributing a new version to early adopters, one of whom was a high-level exec, and sending someone over to their home to fix their computer that could no longer get online on their remote day. We considered it so important that there was a really insightful design of a feature that would enable fixing. 

I was thinking about this bug yesterday, when I was watching a sales colleague wonder about a device installed up in the air out of his reach, figuring out if we really would need to lift him up there with a computer to know what was going on. This time we were lucky - there was enough time to resolve the issue, but the time needed was long enough to make us worry. A positive outcome, though, was really being part of the experience and building a better connection with a colleague I don't always work with closely - that relationship usually turns into magic over time. 

With the experience of the problem not being the kind of problem I thought it would be, I stopped to reflect on how the world as I know it has changed.

Importance of the bug is less central now. The speed of analysis, and applying a fix is the new essential thing. 

I've had seemingly small problems (in terms of mistakes we made in creating software) that took a long time to fix, because in a multi-team distributed system, finding the right person feels like the jokes in which you knock on each door only to be directed to another one, ending up back where you started with more information. 

If this throughput time to resolution is intertwined with the importance in scale, time escalates the problem. 

Time from report to a fix in the customers environment is key. 

Every customer matters. 


Sunday, November 7, 2021

Ensemble Programming and Behaviors for Hands, Brains and Voices

While Agile 2021 was ongoing in summer, I was doing an observation activity that I never got around to finishing. I was organizing a series of new groups trying out ensemble programming, and watching what they do. What I ended up with was material I just ran into today, which I had titled "Behaviors for Hands, Brains and Voices". 

If you have ever been wondering what you should do when ensemble programming, this listing might be beneficial. 

The Roles

We have established we have three roles in Ensemble Programming:

  • Hands (driver) are on the keyboard and don't make decisions. 
  • Brains (designated navigator, talker, translator, pilot) is the current main decision-maker and uses words to enable hands to work effectively. 
  • Voices (other navigators) are everyone else who support brains in getting the work done by providing added timely, correct information.

Sounds easy, but looking at what people do, what are typical things people in these roles do? What are the behaviors we could observe and fine-tune to get our group working really well together? 

Behaviors for Hands

  • Ask clarifying questions on what to type
  • Intentionally write/do something the Brains did not mean, to model correcting
  • Write slowly to encourage thoughtful navigation
  • Out of two ways of doing what the Brains ask, choose the one you think is worse to see the ensemble's reaction
  • Listen to the Brains carefully and do what is requested to the best of your ability
  • Listen to everyone and ask the Brains to make choices when you recognize multiple requests

Behaviors for Brains

  • Give instructions to the Hands at a pace they can consume
  • Navigate on the level of intent and drill in through location to details if you see no movement
  • Choose a solution from the ensemble you would not have chosen and give it a chance to unfold
  • Invite proposals on the solution or next step from the Voices before deciding where to go
  • Listen to the Voices making proposals and help the Hands choose what to do
  • Navigate on a high level of abstraction, focusing on reviewing direction and implementation, and make space for the Voices to improve the end result 

Behaviors for Voices

  • Make an observation about the application and point that out to others
  • Make an observation about the group working together in the moment and point that out to others
  • Notice someone trying to make a point and support them in getting the space
  • Notice someone not engaging and invite them to contribute
  • Categorize what you want to say as say now (you need to hear this as it changes what we do to be right), soon (you need to hear this on this thread of conversations) and later (I want to say this but it can wait as long as I remember)
  • Raise hand to indicate you want to say something but it isn't urgent enough to interrupt
  • Propose a better way of doing what is being done right now 
  • Ask a question that improves focus and gets the ensemble moving forward
  • Recognize need to talk about how we work and propose a retrospective discussion
  • Offload your ideas of the next activity the ensemble should focus on as post-its on a shared wall
  • Quietly make notes of bugs the ensemble isn't seeing, to come back to them soon
  • Propose to make a shared note on the shared computer for documentation / group synch purposes
  • Correct small mistakes like typos after giving the Hands (driver) a chance to correct them at a time fitting their writing style
  • Point out possibility of passing-by cleanup or test
  • Point out possibility of cleanup before changing the area
  • Invite group to brainstorm solutions
  • Invite group to choose least likely solution to be implemented first
  • Point out to the group if we are not doing what we agreed to be doing
  • Point out to the group if we appear to be doing what we agreed on but it is not important
  • Suggest changes in how the current work is done, e.g. "could we test in another browser / with another data sample?"
  • Directly speak to the Brains to help them improve their navigation


Reacting to Numbers

A significant part of my work is to explore how testing is done where I work, in scale, to figure out what I should help with, what I should focus on, and what might even be going on. And it is not like that would be an easy task. 

Talking with people in scale is difficult. I can do questionnaires in scale, but there are only so many meaningful conversations I can have in a day of work. But in the last year, I have started to be able to give shape to describing those conversations with numbers, knowing that I connect with about 60 people monthly, where 30 are constant from the project / team I focus on, and the other 30 vary - one group of 30 some month, another group of 30 another month. 

From the conversations I have had, I have found out where people keep their artifacts, and sampled them. I've had conversations comparing their artifacts to the things they tell me, and come to the conclusion that actively pulling help is a hard thing to do, because the help people would know to pull is limited to what they know they know. 

In addition to sampling their artifacts, I have counted them. And last Friday I showed a group of peers numbers from counting changes (pull requests) to a particular artifact (test automation system code) for a conversation, getting a bit of a conversation I did not expect. 

Personally, I look at the quantitative side of pull requests as an invitation to explore. A small number, a number different than what I would expect, and a large number all require me to ask what happens behind that number. I am well aware that pull requests to test automation represent only a part of the work we do, and that I could create a higher number by artificially splitting the changes. But what a number tells me is that nothing will change if we don't change anything. The number of test automation pull requests in relation to pull requests to the application that the automation tests tells me a little about how we work on things that go together (the app and its tests), and the number of people contributing to code bases tells me a little about how tightly specialized maintaining test code bases is. There's not a number I expect and target; it is a description of what it turned out to be. 
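
As an illustration of the kind of counting I mean, here is a small sketch using the public GitHub REST API; the organization and repository names are placeholders, and the same idea works against whatever hosts your repositories.

import requests


def count_pull_requests(owner, repo, token=None):
    """Count all pull requests in a repository via the GitHub REST API, 100 per page."""
    headers = {"Authorization": f"token {token}"} if token else {}
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls"
    total, page = 0, 1
    while True:
        response = requests.get(
            url,
            headers=headers,
            params={"state": "all", "per_page": 100, "page": page},
            timeout=10,
        )
        response.raise_for_status()
        batch = response.json()
        total += len(batch)
        if len(batch) < 100:
            return total
        page += 1


# Placeholder names: compare activity on the application and its test automation code base.
print(count_pull_requests("example-org", "application"))
print(count_pull_requests("example-org", "application-tests"))

Whatever number comes out is the invitation to explore, not the answer.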

If I ask for a number, or go get a number, I find the idea of "you must tell me exactly what question you are trying to answer" peculiar for someone who is exploring. My questions aren't absolutes, but probes. Exploring, like with failing test automation, calls me to dig deeper. It is not the end result, it is a step on my way. 

Distrust in numbers runs deep. And while I decided to be ok with trusting managers with numbers, I have been learning that the step that was difficult for me is even more difficult for others. So it's time to make it less special, and normalize the fact that numbers exist. Interpretations exist. And conversations exist, even ones I would like not to have because they derail me from what I would like to see happen. 


Friday, November 5, 2021

Turn work samples into a portfolio

For the last few weeks, I have had the pleasure of discussing testing in general, test careers and finding your start with Okechukwu Egbete. He lives in Finland (Oulu), completed financial studies to a degree here, and has been working towards finding a great position to grow into testing. In addition to great inspiring conversations, we've been pair testing together and geeking out about the peculiarities of software quality and this industry. 

One of the conversations we had this week was about him being a little busy with various possible positions, each asking for a homework sample. Imagine this from the job seeker's perspective: every single company has an 8-16 hour exercise expectation to show you are a worthwhile candidate. And not only that. Every company has a different exercise. And as I was reminded of how this works, some companies have the exercise before they talk with you, without even the basic level of verifying that they aren't wasting the candidate's time. 

It is easy for me to say that I find companies are at least on the verge of misusing their position with regards to selecting candidates. Having spent at worst 2 full days being assessed by psychologists and potential colleagues - delivering training, doing homework and filling complex loops of papers for a single position I rejected for the final interviewer's attitude after they offered the position - I can appreciate the load companies feel they have the right to expect without compensation. 

When finding that first opening, you won't walk away at the end. But the work needed can be even more significant. 

So I propose turning the activity into a +1 for you. When a company sends you the exercise, create a private project on github for that activity, as well as for all the other companies' activities - turn the work you do, as it is your work, into your portfolio. And include the growing portfolio of samples in your application. Mark clearly which of the things you have you think is your best work. Show the level of effort you've put into different samples. Don't publish the test problems companies have, but use your solutions to those problems as part of your continued job search. 

It would impress me if people did that - as long as you show you are mindful of not making anyone's future recruiting efforts harder. 

Friday, October 29, 2021

Lessons Learned on Working with Data-Intensive Applications

In my years of teaching test design techniques, I have come to teach people that there are (at least) two essentially different types of functionalities we design tests for:

  • function-intensive applications are ones where you list the tricks the app can do, and a lot of the work in designing tests is creating lists of functionalities and exploring when they work. 
  • data-intensive applications are ones where the same functionality is riddled with data-oriented rules, and you are collecting business rules captured in data. 
This difference became clear to me as I switched jobs a long time ago from antivirus software development (function-intensive) to pension insurance (data-intensive), and spent the next few years trying to wrap my head around the new challenges I had not paid attention to before. 

When data became the center of my testing universe, I learned that one of the major challenges we would be spending a significant chunk of our time on would be picking the "test data". If we needed a person who was just about to turn 63 (age of early pension), and we wanted to test today the scenario that they were under the limit and tomorrow the scenario that they were over the limit, we needed to find a precise set of data with those conditions.
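
As a small sketch of what "finding a precise set of data" starts from, the date arithmetic below computes the birthdate you would then search for in the connected systems; the age limit of 63 comes from the example above, and the helper names are mine.

from datetime import date, timedelta


def birthdate_turning_age_tomorrow(age, today=None):
    """Pick a birthdate so the person is under the age limit today and over it tomorrow."""
    today = today or date.today()
    tomorrow = today + timedelta(days=1)
    # Ignores the Feb 29 corner case for brevity.
    return tomorrow.replace(year=tomorrow.year - age)


def turns_age_on(birthdate, age):
    """The date on which a person born on birthdate turns the given age (again ignoring Feb 29)."""
    return birthdate.replace(year=birthdate.year + age)


birthdate = birthdate_turning_age_tomorrow(63)
assert turns_age_on(birthdate, 63) == date.today() + timedelta(days=1)

The hard part was never this arithmetic, but locating a person with exactly this birthdate - and all the connected data - across the systems described below.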

And the data was not a simple and straightforward row in a database. It was a connected set of databases, some owned by our company, some by other companies, and to get a production-like experience in the test environment, we had tools to choose someone and pull all related information into our systems. Similarly, we knew we had an agreement that, with 14 days' notice, the other pension insurance field players would set their data on request to match what their production had. When I knew who pulls one data set and scrambles the data in the process, and who pulls entire copies of their production databases and scrambles the piece we use to match the way we scramble, I could help the testing done by our business experts flow a lot nicer. 

Ten years passed, and I forgot it was difficult because for me it was routine. Until today, when I had to explain in my current place of work why something so easy and obvious to me is so difficult and complex to many others. 

This is what we had today. An application hooked into its own database. A separate production and test environment. But connected business application data between production and test environments. 


Some applications make it prohibitively hard to use artificial data in test environments. When your business systems run tens or hundreds of hourly synch batch jobs, have tens or hundreds of users executing their day-to-day manual processing tasks on the user interfaces, and the system clock inevitably changes things because you have time-based logic, you will need to replenish the data from production. 

What we had was two different very simple replenish cycles. Once a month on an agreed date, one of the systems would get a refreshed copy from production. Once a year, another of the systems would get a refreshed copy from production.

Last year I had designed our test data, independent of production in the other parts of the end-to-end test environment, to be ok when data moves like this. The two systems being synchronized had the necessary data in production, and it would be reintroduced when replenishing the data.

Except it did not work. 

The application had bugs around not expecting the data to be replenished, but only in the logic that changes once a year. 

Someone else had not understood the rule of how to set up the data and had requested data that vanished in the replenish. 

I spent significant time teaching how to follow the data across the systems, and how the logic works between connected data sources and different environments. If I had not known this, I could not have tested the functionalities (data-intensive ones) last year when I did. 

What I learned though is that: 
  • documenting and knowledge sharing do not help if there are 10 months between being taught and needing the information
  • what I consider clear may be unclear to others
  • everything that can fail will fail, but at least it failed in test environment

Sunday, October 24, 2021

Talks Turned Articles

At my blog, I write whatever I feel like writing down. With blog posts, I accept that the posts are windows into my thoughts, representing whatever I had as context in the moment. People read some of it; some of them connect with me on the topics I am discussing. The temporal nature of blogging means that I don't expect to write here sources that would be useful as articles and references, but different perspectives that I am processing towards those articles and references - and talks. 

I am committed to writing proper articles, and I have different collaboration platforms I share those articles on. By an article, I mean something that should be useful even if you don't come to it at the time it was written, something that collates work into a more concise package. Articles should stand on their own, and teach you something that new people need to learn, again and again. 

Talks delivered, however, are something in between. The spoken format allows for - and requires - content in a style and format we rarely write in. Talks are created to stay valid for a longer time, but the way they are presented is very temporal. A lot of times when I choose talks for conferences, I find myself using the phrase "not a talk, should be an article". 

Thus I find it interesting to experiment with something in between. I have started writing up some of my talks after delivering them once. Video is available, but who watches videos as replay when there is a continuous stream of new content? I call this experiment #TalksTurnedArticles and you can find those on my dev.to profile.

The latest in the collection is the talk I delivered today at TestFlix: Better Ideas at Test Design. Before that I published my favorite experience of transformations, Practice Makes Better - 5x to Continuous Releases. And the first in this series was Exploring Pipelines.

These are articles in the sense that they are content I believe will stand the test of time. They are in a different place so that this place remains a low bar to write on my experiences, which turn into summarized talks and articles usually on a cycle of some years. 

In addition to #TalksTurnedArticles, I chose dev.to as the place to host my full courses as text. The first one is available already: Exploratory Testing Foundations.

You follow what I write here, but my best writing - at least to my standards - resides elsewhere. 

Tuesday, October 12, 2021

Three Stories Leading Into Exploratory Testing

At the end of September, I volunteered for a small lunch-time panel on exploratory testing at the conference. I sat down for a conversation and had no idea it would be such a significant one for my understanding. The panel was titled "To Explore or Follow the Map" and I entered the session with concerns about the framing. After all, I explore with a map and follow the map while exploring. 

Dorota, our session facilitator, opened the session by inviting stories like the one she was about to share on first experiences with exploratory testing. 

Paraphrasing Dorota from memory, she shared a story of how her first testing experience in the industry was on a military project where the project practice included requirements analysis and writing and executing test cases to do the best possible testing she could. One day the test leader invited all the testers to a half-a-day workshop where they would do something different. The advice was to forget the test cases and explore to find new information. And they did. The experience was eye-opening to all the things the thorough test case writing was making them miss. 

I listened to Dorota's recount and recognized she was talking of exactly the expectations I am trying to untangle in my current organization. Designing test cases creates a lovely requirement-to-test linking, but misses all too many of the issues we would expect to find before the software reaches our customers. 

Next up was Adam, who shared a story of his first job in testing. His manager / tutor introduced him to the work expected from him by giving him an excel with test cases, and a column in which to mark the pass/fail results. Paraphrasing his experience from memory, he shared that after he finished the list, the next step was to start over from the beginning. The enlightenment came with a conference where he met an exploratory testing advocate and realized there were options to this. 

My story was quite different. When I first started as a tester, I was given test cases, but also a budget of time to do whatever I wanted with the application that I considered would teach me to understand the application and its problems better. The test cases gave some kind of structure for talking about progress against them, and I could also log my hours on whatever I was doing outside the test cases without very rigid boundaries between the activities. The time budget and expectations were set for the testing activity as a whole, and I could expect a regular assessment of my results by the customer organization's more seasoned testers. The mechanism worked so that for a new person, the first "QA of testing" was feedback, and the latter ones carried a financial penalty if I was missing information they expected me to reasonably find with the mix of freedom and test cases to start with. 

While I was given space for better, I did not do better. No one supported me the way I nowadays aspire to support new joiners. Either I knew what I was doing or a minor penalty on invoicing was ahead; I would still be paid for all of my hours. I never knew anything but exploratory testing, and the stories of injecting it into organizations as Friday afternoon sessions or rebellious use of test cases to stretch from have always been a little foreign to me. 

What the three stories have in common is that exploratory testing is part of these pivotal moments that make us love the testing work and do well with results. My pivotal moment came in my second job, where I was handed a specification, not test cases, and I had to turn my brain on - and I've been on the path of extraordinary agency and learning since. 

Also, these stories illustrate how important the managers / tutors are in setting people up on a good path. Given requirements to turn into test cases, you simplify the work and miss the results. Given test cases, you do work better left for computers. Given time without support, you do what you can - but support is what turns your usefulness around.