Tuesday, December 28, 2021

Reflections on a Cadence

When last year was ending, I remember the distinct feeling of not wanting to proceed as I had before with my end-of-year reflections. Instead of looking back on yet another year, I wanted to look forward. So instead of writing a summarizing blog post, I thought about what I might want to find my balance on, and came up with this.


I identified five areas of focus in which to seek balance: paid-for hours, paid-for value, networking, upskilling and self-care. I did many of the things, but some of the things I had plans for turned out less successful than hoped.

Let's summarize the numbers:

  • 47 external talks, with 30 different topics/titles
  • 58 blog posts, 3 #TalksTurnedArticles and 1 full course turned text on dev.to 
  • 16 rackets recorded with Ru
  • a regular 7.5-hours-a-day job doing work I can take inspiration from but cannot detail here
  • someone created a Wikipedia page about me
  • 4 times Tester of the Day
This brings my talks to a total of 474 since I started public speaking in 2001, my blog posts to 784 with a total of 783,567 page views, and my twitter followers to 7,835. That means two years have added 152,560 page views to my blog and 2,173 followers to my twitter. With my blog and twitter being more public notetaking than writing to an audience, I still find it amazing that some people find value in what I write. 

Overall, the year at work was amazing. Did good stuff with good people. 

Instead of reflecting on all of this and feeling overwhelmed, I want to write down some of my failures. 
  • I committed to contributing to "cloud center of excellence" and "company-wide process improvement", only to learn that I hated it. The idea of finding common denominators for essentially different teams and needs felt like I was not doing anyone a favor, and I walked out of the effort. Being true to how I felt was more important than struggling with work I was not cut out for. 
  • I regularly struggled with expending energy on people I felt disconnected from. If I did not have my home group of many awesome people around the organization, I would have found myself completely depleted. I tried to fix that by introducing a change: trying a different group, a different approach, different challenges. 
  • I failed to notice someone was not speaking to me or listening to me until it blew up 3 months later when I said something they needed to listen to. 
  • I decided to do measuring, and did some, but gave up on most of it; there's enough trouble measuring the physical world, so I can rethink measuring the invisible in software later. 
  • I avoided writing my books by writing tons of talks and articles. Productive procrastination, but procrastination all the same. 
  • I did not learn to draw, but I learned to test better (again, and still more to do), and I learned much about python. 
  • I did not learn to communicate without insulting the other party in search of change, and probably never will. "Extending use of pytest to system testing" feels much more dishonest than "removing Robot Framework", even though they essentially mean the same thing, and the latter causes multiple people to actively attack me even though I am still only seeking to improve the efficiency and effectiveness of our testing.  
  • I replaced all other ideas of "self-care" with lots of hugs, deep conversations with my teens, and digging a hole for myself by not doing bookkeeping in time but finding family come to the rescue when I need them. 
Every year I learn about what I can do and can't do. I make choices on accepting some, and changing some. I get many things right, and many things wrong. But I total on the positive.

Because I don't have to go to the office, I have more time on my hands. I'm healthier on old dimensions, since I have always been allergic to people with animals, and there are plenty of those at the office. I'm still connected with people who matter to me, and physical location isn't what determines my best connections outside my immediate family. 

We built something relevant at work, and I leave things better than they were when I found them. My new place finds use for me and has a lovely group of people, and all this leaving and joining fits the storyline we've built for me. I'm happy that I dare to try things that don't always work out, and to step out when that is the right thing to do. 

Doing many small things continuously gets you through a lot. So while I can't explain all of the work here, I will get to reflect on that - on a cadence - next. 


Thursday, December 23, 2021

Robot Framework and the Myth of Embedded Devices

Coming into projects, I find myself shaping my work and perspectives to create balance. Back in my days as a test manager, working with project managers, I noticed that working with an optimist made me a pessimist; working with a pessimist made me an optimist. Extreme views on one end lead to opposing extreme views on my end. 

Robot Framework is like this for me. I respect the fact that I have been dropped into a puddle of Robot Framework love, both in terms of where I work and in terms of being in Finland, and I just don't buy into the love. 

Instead, I look at what goes on. Like noticing a Robot Framework Test Automation Expert who, due to technology choices, ends up in a team where that wasn't the tool of choice and becomes unable to contribute, period. Like noticing the difference in speed from problem to solution in the test system space between someone specializing in Robot Framework and someone who can also use pytest equally fluently. Like noticing how moving away from Robot Framework increases whole-team testing using test automation. 

Or, most recently, noticing how a colleague asks for an example of "how to run a modbus-protocol device with python and playwright" - a common example of not thinking about what pieces Robot Framework consists of, and of not recognizing that the lovely modbus/nimbus-R library we use at work is a piece of python code we ourselves created for the purpose of running a modbus-protocol device with python. 
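To make that concrete, here is a minimal sketch of what such a piece of python could look like - assuming pymodbus as the protocol library, with the hostname and register address made up for illustration, and noting the exact pymodbus API varies between versions:

# modbus_device.py - a sketch, not our actual library. Assumes pymodbus 3.x;
# the hostname and register address are placeholders.
from pymodbus.client import ModbusTcpClient

class ModbusDevice:
    """Talks to a modbus-protocol device over TCP."""

    def __init__(self, host, port=502):
        self.client = ModbusTcpClient(host, port=port)
        self.client.connect()

    def read_measurement(self, register=100):
        # Real devices document their register map; 100 is a placeholder.
        response = self.client.read_holding_registers(register, count=1)
        return response.registers[0]

    def close(self):
        self.client.close()

Robot Framework can import a class like this as a keyword library; pytest imports it like any other python module. The protocol work is identical either way. 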

Surely short questions and answers will always be inaccurate and lead to premature conclusions about understanding, so I decided to take a longer text form - even if it risks creating more confusion. 

The key thing here is understanding architectures just enough to know what we connect with. For the rest, the chosen framework may help us create the clarity of structure with which we can enable the future of our organizations. 

While on the hobby side I do a lot of my examples on either web UIs (choosing an appropriate driver) or REST APIs (choosing an appropriate driver), work-wise I have had a mix of web in cloud, web from device, embedded devices, Windows UIs, and Windows/Linux servers just in the last few years. Somehow people think embedded is so different, but it really isn't. It's all hard, just a little different kinds of hard. 

Let's talk about this a bit on the level of architecture and what it might mean for test automation. 

Imagine you have two teams, each building an embedded device. 

The first one (pink) has a very light set of operating system services, including the idea that instead of something as advanced as a file system, you have a set of registers you store values in. These devices tend to be highly specialized for their purpose, but the purpose of one such device can be all different from another that is seemingly similar. There's some way of connecting with the device, in the form of a cord you can plug in, or some form of wireless messaging. There might be a button somewhere, and there might be nice blinking lights. Be it wired or wireless, there's one or more communication protocols to read and/or write data. If you're in the team testing the pink device, the likelihood is you won't be running your tests inside that device but controlling inputs and observing outputs on one of the connections. 

The second one (blue) has embedded linux on it, and even if it is an embedded device, it's not the kind of embedded device you read about in the age-old books. It's a full-fledged computer, and with linux, everything is a file! The options of what structures you end up implementing are endless - and uncertain. Again you can connect, wired and/or wireless, most likely with a different set of protocols. You have a button to press, and some lights to watch blink in colors. But most of all, you can drop code into this - general-purpose test code - make it do things inside, and collect results when done. That is, should you want to. 
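As a sketch of what "dropping code in" could look like - assuming the blue device exposes SSH, using paramiko, with the hostname, credentials and script path made up:

# Run a test script on the blue device itself and collect its output.
# A sketch only: host, credentials and the script path are placeholders.
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect("blue-device.local", username="tester", password="secret")

stdin, stdout, stderr = ssh.exec_command("/opt/tests/run_selftest.sh")
print(stdout.read().decode())  # results collected when done
ssh.close()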

To automate, you control whatever your control points are. 

  • Power: you turn on/off the power supply the device relies on (this requires a hardware interface for remote control). 
  • Communication: you send/receive data, in batches or continuously, to run the device on any of the protocols and interfaces you have. 
  • Visual: you "see" lights with light-sensitive test devices, or read the filesystem for the logical state of the lights. 
No matter what your test automation tool is, this is an exercise of recognizing interfaces and having libraries available to make accessing those interfaces possible.
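As a minimal pytest sketch of what orchestrating those control points can look like - the power supply class here is a made-up stand-in for whatever remote-control interface your lab has, and ModbusDevice is the sketch from earlier in this post:

# Orchestrating two control points from one test: power and communication.
# RemotePowerSupply is hypothetical; ModbusDevice is the earlier sketch.
import time
import pytest
from modbus_device import ModbusDevice

class RemotePowerSupply:
    """Made-up wrapper for a network-controllable power supply."""
    def __init__(self, host):
        self.host = host
    def on(self):
        pass  # would command the supply's output on
    def off(self):
        pass  # would command the supply's output off

@pytest.fixture
def powered_device():
    supply = RemotePowerSupply("psu-lab-1.local")  # made-up hostname
    supply.on()
    time.sleep(5)  # let the device boot; real code would poll for readiness
    yield
    supply.off()

def test_measurement_available_after_power_cycle(powered_device):
    device = ModbusDevice("pink-device.local")  # made-up hostname
    assert device.read_measurement() >= 0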

Neither Robot Framework nor pytest will run on the pink device. They run on the controller PC connected to it. 

Both could run on the blue device; neither one needs to, unless your test design for a specific feature demands it for proper testing. 

Both run on a PC connected to the available interfaces of the device(s). Both are capable of orchestrating multiple interfaces. 

Instead of asking why I am so vehemently against Robot Framework, try asking why you are so vehemently for it. Did you really compare what the same people could achieve maintaining a test framework where they can name every driver, rather than thinking it "comes with Robot Framework" - which is, just like pytest, an extendable ecosystem? The first has a "vibrant community", but so does the latter. 

You already see the first (because it is the local market leader); why are you so against seeing the latter that you think attacking me for saying I see problems with the first is warranted? 




2022 Session Sensemaking

As this year is coming to an end, I have my eyes set firmly on things I want to do next year. Earlier this year I was trying to figure out what my next career progress theme is. Earlier I wanted to:

  • Become a keynote speaker ✅
  • Get a page of me on wikipedia ✅
I can still become a better speaker, and a more frequent and more awesome keynote speaker. No one is ever ready; speaking is continuous analysis of the themes you choose to invest time on, and keynote speaking is about choosing some talks you really finetune. I choose new experiences and attempts at learning something new, and don't see keynoting now as a goal I would want to pursue actively - just continue. 

I can still get people to collect more of my legacy on Wikipedia, and move from the Finnish Wikipedia to the English one, but I am happy with how my local community fulfilled my aspiration - and with how I don't really know the people who wrote my page. I couldn't value what I have now more, even if it were bigger and more visible. Again, not a goal I would want to pursue actively, just continue. 

I can't dream of a raise in salary; I have a director-level salary as a tester. And I don't want to become a director or VP. I could become a consultant, but I have reserved that goal for a few years later, connected with an idea of making money to buy apartments for my two kids before they move out. 

I love hands-on work, and improving real results. The theories and words to explain what great testing looks like are just words if we can't turn them into action. And I want to be better, still, at turning my words into actions. 

So I chose what is new for 2022:
  • Upskilling as board member
  • Building up Finnish community of testing
I applied for and got accepted to the TIVIA board, and I'm now actively learning what boardroom work looks like. I will seek opportunities in this space beyond non-profits in due time. 

I resurrected from hibernation the Finnish testing non-profit I have been leading: Software Testing Finland (Ohjelmistotestaus ry) will see a steady monthly session investment from me in 2022. Because I pair up with Ru Cindrea, the language we use is an English/Finnish mix. 

So I will run sessions with four "entities":
  • inside Vaisala: broadcasts (anyone inside the company can hear me if they opt in), test automation demos and test experience discussions. 
  • Ohjelmistotestaus ry: monthly sessions to build up local testing community, my primary place to try out sharing new talks that aren't internal
  • Exploratory Testing Academy: free sessions on hands-on activities I am turning into creative-commons training materials. I'm adding a theme trio later today with unit, API and UI exercises. 
  • General conferences and meetups: I try to say yes to being invited, within what my schedule allows, as I did this year - creating new talks and repeating what I had already created. 
I have my own company (maaretp) on the side, through which I occasionally do open and company-specific trainings, invoicing for the teaching work. I have limited availability for this, as I *want* to invest my regular work weeks in making testing and quality even better at Vaisala, at scale. 

Saturday, December 11, 2021

Transforming Agile Ceremonies

Time and time again, I join teams with the usual agile ceremonies and can barely hold myself together. The daily meetings make me cringe. The retrospectives are post-its, but conversations are prioritized for the majority, meaning my things never get talked about. The planning for the next increment feels like a forced routine, and what fits in 2 weeks is at center stage. The long-term planning, refinement, is half-hearted since heads are already full with what is going on. And the demos - they are either non-existent or the best bit of the whole set. 

In a greater picture, "agile" as the ceremonies and discussions on what is the correct way of doing things is something I would prefer to step away from. But I do care about how I feel, and how my colleagues feel, and I care about the results we are able to provide. 

What I would like us to do is to take the ceremonies and turn them to their better versions.

Daily should not be about each individual explaining what they did. It should be about each individual synchronizing and pulling work to advance the highest-priority items the team co-owns. In a good daily we create a shared understanding and make decisions about the next day, more than explain that we continue on the plan. We should already know what goes on; after all, we are collaborating through chats and calls on these themes throughout the working day. Or at least we should be. 

A better daily centers around the "epics" or "stories" we are delivering - value to the customer. We optimize all our work so that we don't progress everything at once, but advance the topmost item the most and fastest, with whatever limits we currently have on our abilities. 

Retrospectives should not be about minimizing our difficult conversations to post-its that don't get discussed. They should be a space in which we come together to hear what others have in mind, and sometimes turn that into actions right away. With one of our teams, we had a homework questionnaire for the retro, showing that the team was heavily divided in its opinions. This was never visible in the shared session, where a loud majority creates an appearance of the truth. We did not agree on actions to fix it, but the mere understanding changed things within just one month - each individual chose their own ways of showing up better for that particular theme. 

A better retrospective is a versatile continuous-improvement conversation. Sometimes we collect views. Sometimes we agree on solutions. Sometimes we follow one structure, other times another. Everyone's voice should be heard. Minority voices should be amplified. We build the working environment for us all. 

Planning should not be about effort estimates and fitting a sprint, but about agreeing on the next smallest possible scope of delivering something of value. Estimating should be replaced with not estimating, and the task split should be replaced with trusting people that the value card is enough - and when it isn't enough, the team can use tools to make notes that support them. The cards are best written as part of work intake: the person taking work describes the work they take on, and the person prioritizing can review and collaborate. 

A better planning enables us to start together and make sense of the threads we have ongoing - releases, epics/stories, capability improvements - and supports us in getting stuff done, even stuff we did not agree on, as long as it makes sense. And we know if it makes sense when we understand how the changes fit our overarching vision of what good would look like. 

Refinement should not be about co-existing in a meeting room while someone reads us the new things they are hoping we will work on next. It should be about checking changes in our understanding of what work we need to discover in ways that require calendar time, and moving all of us in the team to a common understanding we can work further from. 

A better refinement discusses customer problems and our ideas of how we could prepare for solving them. 

None of these require an agenda to be great. We can use the first 5 minutes on discovering the agenda together and prioritizing it to fit into the timebox. The ceremonies are just reminders of conversations we usually want to have on different cadences. 

For testing, when stories are actually stories instead of tasks we call stories, there is a lovely structure to work from. But not having that structure isn't stopping us from always working on the different timeframes: starting early on things that require time (refinement); thinking about what we're doing and how it fits a bigger picture; following the changes over the plans and learning continuously. 


 

Friday, December 10, 2021

TDD makes a bad programming test for testers

I'm a tester by trade and heart, meaning that looking at a piece of code, I get my thrills from thinking about how it will fail over how I can get it to work. Pairing with a developer who doesn't understand the difference can be an uncomfortable experience. Seeking weaknesses in something that exists is a bit of a different exercise than building something up.

Imagine an interview situation, going in with "Set up your IDE on a language of choice, ready to go". A tester pairing with a developer on a "programming skills test". That setup alone makes me uncomfortable. 

With an exercise out of the blue, the usual happens. The expectations of what we're about to do get muddled. They share a gist of numbers written out in English as text. Working on code, they start by asking for a "signature" over explaining the problem. The usual troubles with pairing with someone new. 

With an empty canvas, we write the first test, selecting something from that list in the gist. 

def test_1_becomes_one():
    assert int_to_english(1) == "one"

Following red, the test won't pass without implementation. So you add implementation. 

def int_to_english(n):
    return "one"

Expectations are about to get more muddled. While I know where this is going with TDD, this is not the thing I signed up for. 

I know that I can build up the implementation from here, and that it shows how I would solve this puzzle. But my work starts from seeing one of the many ways the puzzle is solved, so I try to steer towards that (with little success, of course). 

I ask them to just state the problem in English. Knowing the IDE runs Copilot, I am hoping to steer the conversation from designing code with TDD to testing code someone else designed - critiquing the design that emerged, figuring out what values reveal information that we missed something, seeking the limits of the solution. 

The English comes off as 

#function to convert integer to text 

It's not particularly good English for the problem, and if I knew what the pair holds in their head, I could have improved the English to 

#function to convert integer to English for values from 0 to 999

Copilot is super helpful, giving multiple solutions to choose from with ctrl+enter. Now we're talking testing!

The first option is next to hilarious. 

def int_to_english(n):
    if n == 0:
        return "zero"
    elif n == 1:
        return "one"
    elif n == 2:
        return "two"
    elif n == 3:
        return "three"
    elif n == 4:
        return "four"
    elif n == 5:
        return "five"
    elif n == 6:
        return "six"
    elif n == 7:
        return "seven"
    elif n == 8:
        return "eight"
    elif n == 9:
        return "nine"
    elif n == 10:
        return "ten"
    elif n == 11:
        return "eleven"
    elif n == 12:
        return "twelve"
    elif n == 13:
        return "thirteen"
    elif n == 14:
        return "fourteen"
    elif n == 15:
        return "fifteen"
    elif n == 16:
        return "sixteen"
    elif n == 17:
        return "seventeen"
    elif n == 18:
        return "eighteen"
    elif n == 19:
        return "nineteen"
    elif n == 20:
        return "twenty"
    elif n == 30:
        return "thirty"
    elif n == 40:
        return "forty"
    elif n == 50:
        return "fifty"
    elif n == 60:
        return "sixty"
    elif n == 70:
        return "seventy"
    elif n == 80:
        return "eighty"
    elif n == 90:
        return "ninety"
    elif n == 100:
        return "one hundred"
    elif n == 200:
        return "two hundred"
    elif n == 300:
        return "three hundred"
    elif n == 400:
        return "four hundred"
    elif n == 500:
        return "five hundred"
    elif n == 600:
        return "six hundred"

You don't need to test beyond a glimpse of a review to see that this is not going to work out. Browsing further, you see more promising options. 

def int_to_english(n):
    #dictionary to store the values of each digit
    num_dict = {
        0: 'zero',
        1: 'one',
        2: 'two',
        3: 'three',
        4: 'four',
        5: 'five',
        6: 'six',
        7: 'seven',
        8: 'eight',
        9: 'nine',
        10: 'ten',
        11: 'eleven',
        12: 'twelve',
        13: 'thirteen',
        14: 'fourteen',
        15: 'fifteen',
        16: 'sixteen',
        17: 'seventeen',
        18: 'eighteen',
        19: 'nineteen',
        20: 'twenty',
        30: 'thirty',
        40: 'forty',
        50: 'fifty',
        60: 'sixty',
        70: 'seventy',
        80: 'eighty',
        90: 'ninety',
        100: 'hundred',
        1000: 'thousand'
    }
    #if the number is less than 20
    if n < 20:
        return num_dict[n]
    #if the number is less than 100
    elif n < 100:
        #if the number is a multiple of 10
        if n % 10 == 0:
            return num_dict[n]
        #if the number is not a multiple of 10
        else:
            return num_dict[n // 10 * 10] + ' ' + num_dict[n % 10]
    #if the number is less than 1000
    elif n < 1000:
        #if the number is a multiple of 100
        if n % 100 == 0:
            return num_dict[n // 100] + ' ' + num_dict[100]
        #if the number is not a multiple of 100
        else:
            return (num_dict[n // 100] + ' ' + num_dict[100] + ' '
                    + int_to_english(n % 100))
    #if the number is 1000
    else:
        return num_dict[1000]

Now I have something to test! 

I would already have lovingly crafted tests for each of my branches if I had stuck to the process of TDDing the solution out, but that is not what I personally consider interesting. I'm a tester, and the test target is my external imagination. And I wouldn't hand-craft the examples; there are much more effective strategies for this.

I turn this into an approval test of all values from zero to 999 - and beyond if I feel like it. 

def numbers_list(num):
    # All integers from 0 to num, inclusive.
    num_list = []
    for i in range(0, num + 1):
        num_list.append(i)
    return num_list

from approvaltests.combination_approvals import verify_all_combinations

def test_all_cases():
    verify_all_combinations(int_to_english, [
        numbers_list(1000)])

Now I can visually verify if I like the pattern I see. 

args: (0,) => 'zero'
args: (1,) => 'one'
args: (2,) => 'two'
...
args: (19,) => 'nineteen'
args: (20,) => 'twenty'
args: (21,) => 'twenty one'
...
args: (998,) => 'nine hundred ninety eight'
args: (999,) => 'nine hundred ninety nine'
args: (1000,) => 'thousand'

The ... are added to omit a very long list that is beautifully doing what I was expecting. 
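For those who haven't used approvaltests: verify_all_combinations writes the received output to a text file and compares it against an approved file stored alongside the tests. The first run fails until you review and approve the received file; after that, any change in the output fails the test and points you at the diff. 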

I have worked with programmers for 25 years, knowing that the solution could be different. It could be any of the other suggestions Copilot gives me, and I - an exploratory tester extraordinaire - care for results matching what is necessary for the success of the business with the user and other stakeholders. 

Another solution I could have ended up with is this one: 

def int_to_text(n):
    if n < 20:
        return ["zero", "one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve",
                "thirteen", "fourteen", "fifteen", "sixteen",
                "seventeen", "eighteen", "nineteen"][n]
    elif n < 100:
        return (["twenty", "thirty", "forty", "fifty", "sixty",
                 "seventy", "eighty", "ninety"][(n // 10) - 2]
                + (["", " " + int_to_text(n % 10)][n % 10 > 0]))
    elif n < 1000:
        return (int_to_text(n // 100) + " hundred"
                + (["", " " + int_to_text(n % 100)][n % 100 > 0]))
    else:
        return "one thousand"

Which is nice and concise.

Comparing its output from the same test to the previous implementation, the difference is glaring: 


I could also have ended up with this: 

ones = ["one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine"]
tens = ["ten", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"]
teens = ["eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]

def int_to_english(n):
    if n < 0:
        return "negative " + int_to_english(-n)
    if n == 0:
        return "zero"
    if n < 10:
        return ones[n]
    if n < 20:
        return teens[n - 10]
    if n < 100:
        return tens[n // 10] + " " + int_to_english(n % 10)
    if n < 1000:
        return (int_to_english(n // 100) + " hundred "
                + int_to_english(n % 100))
    if n < 1000000:
        return (int_to_english(n // 1000) + " thousand "
                + int_to_english(n % 1000))
    if n < 1000000000:
        return (int_to_english(n // 1000000) + " million "
                + int_to_english(n % 1000000))
    if n < 1000000000000:
        return (int_to_english(n // 1000000000) + " billion "
                + int_to_english(n % 1000000000))

And with the very same approach to testing, I would have learned that

args: (0,) => 'zero'
args: (1,) => 'two'
...
args: (8,) => 'nine'
args: (9,) => IndexError('list index out of range')
...
args: (18,) => 'nineteen'
args: (19,) => IndexError('list index out of range')
args: (20,) => 'thirty zero'

And trust me, at worst this is what I could expect to be getting functionally. And with all this explanation, we did not get to talk about choices of algorithms, whether performance matters, whether this can or even needs to be extended, or where (and why) anyone would care to implement such a thing for real use. 

With Copilot, I wouldn't have had to read the given values from a file you gave me in the first place. That I did after the interview, because I was sure it was not complicated, and some of that work feels like adding new failure modes around file handling that I would deal with when they exist. 

Instead of us having fun testing together, we had a different kind of fun. Because I still often fail at convincing developers, especially in interview situations where they are fitting me into their box, that what I do for work is different. And I can do it at many scales and with many programming languages, because my work does not center on the language. It centers on the information. 

Conclusion to this interview experience: a nice story for a blog post but not the life I want to live at work. 

Friday, December 3, 2021

Testing is a fascinating word

We all think we know what testing is. We can even whip up a few options to choose from on the definitions, but definitions don't matter as much as our perceptions. When I ask you to think of testing, you think of testing. It just may well be that the scope of testing you think of and the scope of testing I think of have little, if anything, in common. 

I ended up thinking about this again after seeing the infamous words in a memo: "We automate all our testing". Well, true. And well, not true at all. There is something that creates all that test automation that now does all the testing you care to name, and that too is testing. Confused yet? 

To continue with the confusion, let's point out that testing is both a verb and a noun. It is a great general word, a bit like thinking and learning. And those are the core elements of testing - not the documentation we create out of thinking and learning. Test automation is to a large degree documentation, with some awesome features like alerting when it is no longer up to date! 

Charity Majors said it well: the testing we have automated isn't all there is to testing, because *past* mistakes aren't all we are searching for. 

Adding more words to the ambiguous word "testing" does not make it less ambiguous, but it does reveal how versatile the word is. 

I propose we focus on the one thing we agree on: testing is something important for us to figure out, and controlling words out of our control isn't how we figure it out. Automating all tests means we are adding automation. Automated testing is created manually.

Correcting language is a silencing technique. Correcting language is an investment of time. Is that really how we end up understanding each other, when in fact sometimes we already do?