Wednesday, December 26, 2018

Thank you 2018, what next?

The year is approaching its end, and I continue my tradition of cyclic reflection. The year cannot change unless I look back at what I did, what big lessons I'm taking out of the experience, and whether things are changing. I live my life letting myself do things I want to do, not things I plan to do. And enjoying what I do is my very public secret to getting a thing or two done.

For continuity in my reflection, I looked at what I wrote a year ago. Things look very different with perspective and more understanding. I looked at 2017 as a difficult year, feeling alone at conferences. I could see 2018 as a difficult year, as it saw the end of a long-term relationship; instead I see it as liberating me from the pains I was going through in 2017. I look forward to seeing how things look yet another year into the future.

Conferences and Travel

To end last year, I decided:
"So 2018 will see less of me abroad."
That did not work out too well. Or in a way, it was clearly divided. I traveled a lot in the first half of the year, unable to cancel any of my existing commitments, until I got to the point where I needed to clear the second half of the year, keeping only one travel destination.


I ended up spending 120 hours on planes even with the 6 conferences I cancelled for the latter half of 2018. I was guilt-tripping over the "first yes, then no", but looking at the new speakers and keynotes my cancellations helped enable, I can only be delighted and hope these conferences will want to hear from me when my family life is better sorted out.



Even while off conferences, I seemed to compensate with local meetups, online webinars and paid trainings, ending up with 28 sessions delivered in 2018, bringing my public speaking total to 380 sessions since I started. I have always done these on the side of a full-time hands-on job, and find they bring great energy and balance.

There are three particular highlights of 2018 for me. The talk I did with our 16-year-old intern at Nordic Testing Days was an effort to deliver, but it showcased awesome growth as a tester, and turning a 16-year-old into a public speaker in a conversational two-person talk was lovely. Finding #MimmitKoodaa (enabling women programmers in Finland) and teaching Java with them was another highlight. The third was the invitation to teach the Testing and Quality Assurance course at Aalto University again.

Comparing Numbers
  • 10/28 sessions were abroad (2017: 21/30)
  • 110 blog posts (2017: 103)
  • 572 448 total hits on my blog (2017: 490 628)
  • 4905 followers on Twitter (2017: 3889)
  • 1090 readers with 300 paid for Mob Programming Guidebook (2017: 741 with 201 paid)
  • 336 readers with 73 paid for the Exploratory Testing book (2017: 145 with 17 paid)
  • 201 collaboration calls with potential speakers for European Testing Conference (2017: 120 calls)

Personal Branding

What I did in 2018 followed my energies, and it's been a little funny realizing how muddled my personal brand is. I have no control over what people know me for, and frankly I'm no longer sure I care.

I'm a development manager (not a test manager), a hands-on tester and a programmer.

I'm a speaker in topics of exploratory testing, pairing and mobbing, and agile.

I'm a mentor and a teacher. I teach exploratory testing hands-on, and I mentor people on their general career but in particular on becoming speakers.

I'm an author of this blog and now three books: Mob Programming Guidebook, Exploratory Testing and Strong-Style Pair Programming. All my books are works in progress on LeanPub.

I'm a conference organizer and community facilitator, organizing European Testing Conference, facilitating Software Testing Finland and Tech Excellence Finland, and helping run SpeakEasy as one of its four leadership team members. I show up for the Women in Testing Slack group (which grew from 150 to 350 members) and take time to promote the awesomeness around me.

I'm a social justice warrior, a derogatory term I feel like owning since it was used on me by a family member. I try to change the things I can change, and not give up just to be comfortable. My most common cause is #PayToSpeak - the unfairness of speakers having to pay for their own travel to work for conferences - and turning that dynamic around to enable diverse voices.

Looking Back

I did a lot and became more comfortable being me, doing my things. I learned to appreciate that people can seem good and still do a lot of damage, and I found some of the inner strength I had given away.

Work at F-Secure is still wonderful. I became a development manager, and have spent my time since job-crafting the manager job to look almost exactly like the tester job I used to have. I have some of the best colleagues, and I look with delight at how much we got done in our No Product Owner mode. We enabled faster releases by more than just two people in the team, got our user numbers up to levels that feel intimidating now that we've started using the word "million", and found new ways of using statistics on both successes and failures to transform the ways we serve our user base.

My kids do a lot better in school now that they have me home more. And my home is full of love, with two 10+ year-olds being their own personalities.

I made an impact on some people's speaking careers, and one person in particular made an impact on me: Kristine Corbus showed me how any one of us can choose to raise up the others, and how I'd rather model myself after someone like her.


Tuesday, December 25, 2018

Don't be a backseat driver

As I'm pairing with someone, I find it really difficult to negotiate the "contract" for that pairing session. Asking for strong-style pairing (I have an idea, you take the keyboard and I tell you what to do) or traditional-style pairing (I have an idea, give me the keyboard and watch and comment) can both be appropriate, depending on who the person I'm pairing with is and how they interact with me. But the time when we set the rules on how we pair, unless a facilitator gives them for the session, is the time when I know the least about my pair. It's the time when my inherent "making space for the other" is at its strongest, and I easily find myself in a place where I'm disengaged and uncomfortable.

At a workshop a few weeks back, a friend of mine ended up pairing with a stranger. They had only done pairing in workshops with me, where I introduce and enforce strong-style for the connection, but also make the rules and expectations clear. Now they were told to pair, with someone who does not pair, and the setting was far from optimal. There was a skills difference not in their favor, and as they ended up watching the more experienced one, they quickly fell out of the loop of what was even going on. The computer they paired on belonged to the other, who wouldn't share it because it was set up just right for work. And the only way to pair my friend had been taught was strong-style, which really increases newbie involvement in an uneven pair. It was clear they did not enjoy it. They left halfway through the three sessions.

Learning to talk about the two styles of pairing has helped me a lot in this regard. Now I have words to start the negotiation from. So I was delighted to find two more words for pairing patterns in videos of Alex Harms delivering a talk on pairing. The words were more anti-patterns than good styles: side-by-side pairing, where the more experienced one sets themselves above and outside the engaged pair, doing their own thing while being available for questions and mild hovering; and backseat driving, where the person not in a position to steer tries to do so anyway.

I could not help but wonder if Alex had run into a particularly inconsiderate experience with strong-style pairing, because without setting up the relationship with consent, strong-style pairing can easily be indistinguishable from backseat driving.

Let's stop to think about that for a moment. What does good pairing look like? It looks like work done by two people, where both are engaged in the same work. To be engaged, you need to be there willingly. And opting in to pair isn't always willing, if you did not know what was coming.

Thinking about the roles in a car is helpful in remembering what it could look like. 
  • The driver is always the person at the controls. No matter what anyone else says, the driver has the ultimate power to take things their way.
  • The navigator helps the driver. Navigators can be well versed in the big picture without paying attention to the road, or know the details of the road and help step through the route in an optimal way. In traditional pairing, the navigator reviews. In strong-style pairing, the navigator controls the high-level choices with words.
If you had a backseat driver in the car, that person would be like a navigator, but operating without consent. That person could be very engaged in the pairing, but their input isn't welcomed or accepted by the driver. A backseat driver might act exactly like a strong-style navigator. The difference is in the contract, which is often implicit, and in the assumed power difference.

In the workshop some weeks ago, I also ended up pairing with someone I had not paired with before. It was their computer, and they used Vim - effectively making me feel unwelcome on the keyboard. I did not leave halfway through and quite enjoyed the session. Looking back, we ended up with strong-style pairing where I would actively suggest ideas.

The more I pair, the less the difference between traditional and strong-style makes sense. But when I was starting out, it meant the world to me. And continuing long-term, I realize that strong-style has also made me uncomfortable many times, pushing a power differential I did not consent to.

Having both in the bag is good. The lesson here is that you should take a moment to negotiate the pairing contract. Especially for people who have a hard time connecting with the other on an emotional level and hearing what's said when words are not used, strong-style can become an act of forcing your opinions on the other, just as hogging the keyboard in traditional style would.

The difference between a backseat driver and a strong-style navigator is consent and trust. The first delivers unwelcome guidance; the latter provides the instructions asked for, at the level they are able to give and find necessary.

And since mob programming relies on strong-style pairing as its mechanism for connecting the group, imagine having a whole car full of backseat drivers... That could be very uncomfortable.


Thursday, December 13, 2018

A Pesky Bug that Exploring Would Help With

I work with a particularly great team, and even great teams make mistakes. Many other teams, great or less so, would choose to hide their mistakes. I find I wear our mistakes as a medal of honor: having looked at them and figured out what I could try doing differently, I go into the future again an experience richer. And I look forward to a different mistake.

In the last weeks, we've dealt with a particularly pesky mistake from a tester's point of view, because it is a failure in how we test.

As bugs go, different ones show themselves in different ways. This particular one has limited visibility to our customers, as they can only see second-order symptoms. But its cost has been high - blocking the work of multiple other teams, diverting them from creating good, valuable things for our users, and instead making them create tooling to keep their system alive while we were oversharing data towards them.

So there was a bug. A bad bug. Not a cosmetic one. But also not one easily visible to an end user.

The bug was created by one of our most valued developers.

Since it was created by someone we've grown to rely on, other people in the team looked at the pull request feeling confident in accepting it. After all, the developer is valued for a reason: consistency in great work. No one saw the bug.

As we were testing the system, we made a few wrong judgements:

  1. We relied on the unit and system level test automation, which tests the functionality from a limited perspective.
  2. We didn't explore around the changes, because exploring from the perspective of another system as the user requires special attention, and we did not call for it.
  3. We relied on repeating tests as we had before, and none of the tests we did before would have paid attention to the volume of information we were sending.
  4. We had limited availability of team members, and we see only in hindsight that the changes were to a critical component.
So we'll be looking at changes:
  • Figuring out how pull requests could work better at identifying problems, or whether they are more about consistency of style and structure, as they've grown to be
  • Figuring out how to better integrate deep exploratory testing activities towards system functionalities (over user functionalities)
I have a few (ehh, 50) colleagues who wasted a significant amount of time keeping the mistake from surfacing wider while we did our remedies.

These kinds of bugs are the ones I'd want to find through exploring. And that would be a reasonable expectation.

Less managing, more testing. My kind is more valuable not as a manager. The work happens hands-on.

Wednesday, December 12, 2018

An Evening Detour to TCR

At an unusually early time for me, I waved goodbye at the office, announcing I was heading to a workshop on TCR, and was greeted with a bit of rolled eyes and quick googling fingers. It was already established that I was the one to volunteer for all sorts of learning activities. Aki Salmi hosted a workshop session on TCR at the Tech Excellence meetup, which I happen to facilitate, so I had all the excuses I could need to get over myself and over there.

TCR - Test && Commit || Revert - is a programming workflow, or a test-driven development flavor. Reactions have ranged from "well, got to try it" to "why are we confusing TDD more" to "will TCR replace TDD", and it felt like a worthwhile thing to try in great company.

Aki introduced the thing we were about to experiment with. Test-Driven Development (TDD) as we've come to know it has three steps, RED - GREEN - REFACTOR, and Test && Commit || Revert (TCR) removes the RED. If your tests aren't green when run, you lose what you worked on. If they are green, the work gets committed as the next baseline to build from.
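In shell terms, the workflow really is just test && commit || revert chained together. As a sketch of one way to wire that up - my own illustration assuming a git repository and a pytest suite, not the workshop's tooling - it could look like this:

    # TCR: run the tests; green commits a new baseline, red throws the
    # uncommitted work away. A sketch only, not the workshop's setup.
    import subprocess

    def sh(cmd: str) -> bool:
        """Run a shell command; True when it exits 0."""
        return subprocess.run(cmd, shell=True).returncode == 0

    def tcr() -> None:
        if sh("python -m pytest -q"):          # Test &&
            sh('git commit -am "tcr: green"')  # Commit the new baseline
        else:
            sh("git reset --hard")             # Revert: lose what you typed

    if __name__ == "__main__":
        tcr()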

Other than the focus on experiencing TCR, the session was framed by 3 x 25 minutes of paired work with impressions shared in between, the Lift (Elevator) kata, and whatever language each pair ended up choosing.

I paired with one of my favorite people, whom I had never paired with before. They came in with things set up on their computer, so the choice of language and IDE was settled: Python in Vim.

Over the three 25-minute sessions, the promise of having fun was well delivered.
  • The most common reason for reverting, for us, was syntax - we missed a part of the formatting. This made me aware of how much I prefer relying on PyCharm as my IDE, keeping my focus free from the details of the syntax. We also had a great little discussion on the feeling of control from having to know and do every bit in Vim, which I wasn't appreciating.
  • With another IDE, I find it relieving to work with intent and generate frames for the code, and Vim as the editor made me aware of how much I appreciate other tooling.
  • The differences between Python and Java were evident in running tests with the same name: Python just dealt with it for us, while we would have had a couple more reverts had we worked in Java.
  • One pair "cheated" by commenting out the failing tests, and I'm still confused by it. Staying always green, if cheating is encouraged, is easiest by never having much in the way of tests, and that cannot be what Kent Beck means with "Cheating is encouraged, as long as you don't stop there."
  • With the workflow, I missed seeing the test fail, to test the test before implementing. I disliked having to think as the computer when I would much rather see the test fail first, but I wasn't willing to let the test be reverted just for that.
  • Putting the commands together so that your changes are gone on red increased the sense of risk of losing your changes, and introduced a language of betting a small amount of work.
  • Being painfully aware of the risk of losing changes keeps changes small. It would require next-level abilities, compared to what we were working with, to identify designs that drive us to smaller steps here.
Overall, I think I was just as bothered by "losing my IDE" as by "losing the RED".

In the discussions afterwards, there was speculation about whether something of this sort would be more necessary in trunk-based development, where folks might commit tests while they are red, but that sounds to me like a different problem than the programming workflow.

I find all these things useful as ways of learning about what you're comfortable with and how constraints of all sorts impact your way of working.

All in all, this felt like a relaxed version of Adi Bolboaca's baby-steps constraint, where you revert if you're not green within 3 minutes. With that style you can see red, but you get to a similar practice - making changes intentionally small - without losing the feedback of first seeing a test fail to know you're actually testing what you intended.
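Adi's timed variant could be sketched in the same spirit - again my own guess at the wiring, under the same git + pytest assumptions, with only the 3-minute window coming from the constraint itself:

    # Baby steps: red is allowed inside the window, but if you are not
    # green by the deadline, the changes are reverted. A sketch only.
    import subprocess, time

    def green() -> bool:
        return subprocess.run("python -m pytest -q", shell=True).returncode == 0

    def baby_step(window_seconds: int = 180) -> None:
        deadline = time.monotonic() + window_seconds
        while time.monotonic() < deadline:
            if green():
                subprocess.run('git commit -am "baby step"', shell=True)
                return                    # green in time: commit, restart clock
            time.sleep(10)                # still red: keep editing
        subprocess.run("git reset --hard", shell=True)  # out of time: revert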

Tuesday, December 4, 2018

Testing a Modify Sprite Toolbar

I've been teaching hands-on exploratory testing on a course I call "Exploratory Testing Work Course" for quite a few years. At first, I taught my courses based on slides. I would tell stories: stuff I've experienced in projects, things I consider testing folklore. A lot of how we learn testing is folklore.

The folklore we tell can be split into the core of testing - how we really approach a particular testing problem - and the things around testing - the conditions making testing possible, easy or difficult, as none of it exists in a vacuum. I find agile testing still talks mostly about the things around testing, and some of those, like the fact that testing is too important to be left only to testers and that testing is a whole-team responsibility, are great things to share and learn about.

All too often we diminish the core of testing into test automation. Today, I want to try describing one small piece of the core of testing, using my current favorite application under test while teaching: Dark Function Editor.

Dark Function Editor is an open source tool for editing spritesheets (collections of images) and creating animations out of those spritesheets. Over time of using it as my test target, I've come to think of it as serving two main purposes:
  • Create animated gifs
  • Create spritesheets with computer readable data defining how images are shown in a game
To test the whole application, you could easily spend a work week or a few. The courses I run are 1-2 days, and we make choices of what and how we test to illustrate the lessons I have in mind.
  • Testing sympathetically to understand the main use cases
  • Intentional testing
  • Tools for documenting & test data generation
  • Labeling and naming
  • Isolating bugs and testing to understand issues deeper
  • Making notes vs. reporting bugs

Today, I had 1.5 hours at an Aalto University course to do some testing with students. We tested sympathetically to understand the main use cases, and then went into an exercise of labeling and naming for better discussion of coverage. Let's look at what we tested.

Within Dark Function Editor, there is a big (pink) canvas that can hold one or more sprites (images) for each individual frame in an animation. To edit an image on that canvas, the program offers a Modify Sprite toolbar.


How would you test this? 

We approached the testing with Labeling and naming. I guided the students into creating a mindmap that would describe what they see and test. 

They named each functionality they could see on the toolbar: Delete, Rotate x2, Flip x2, Angle and Z-Order. To name the functionalities, they looked at the tooltips of some of these, in particular the green arrows. And they made a note of the first bug:
  • The green arrows look like undo / redo, knowing how other applications use similar imagery.
They did not label and name the tooltips, nor the actual undo/redo that they found in a separate menu, vaguely realizing it was functionality that belonged in this group yet lived elsewhere in the application. Missing a label and name, it became something they would have needed to intentionally rediscover later. They also missed labeling and naming the little x-mark in the corner that closes the toolbar, and thus would need to discover the toggle for the Modify Sprite toolbar later, given they had the discipline.

The fields where you can write drew their attention the most. They started playing with the Z-order, giving it different values for two images - someone in the group knew without googling that this would affect which of the images was on top. They quickly ran into the usual confusion. The bigger number meant that the image went to the background, and they noted their second bug:
  • The chosen convention for Z-order is the opposite of what we're used to seeing in other applications.
I guided the group to label and name every idea they tried on the field. They labeled numbers, positive and negative. As they typed in a number, they pressed enter. They missed labeling and naming the enter, and if they had, they would have realized that in addition to enter, they had the arrow keys and moving the cursor out of focus to test. They added decimals under positive numbers, and a third category of input values: text.

They repeated the same exercise on Angle. They quickly went for symmetry with Z-order, and remembered from the earlier sympathetic testing that they had already seen the positive value 9 work in the angle field. They were quick to call the category of positive numbers covered, so we talked about what we had actually tested with it.

We had changed two images at once to a 9-degree angle.
We had not looked at 9 degrees in relation to any other angle, to see whether it matched our expectations.
We had not looked at positive angles using numbers where it would be easy to see correctness.
We had not looked at positive angles with images that would make it easy to see correctness.
We had jumped to assuming that one positive number would represent all positive numbers, and yet we had not looked at the end result with a critical eye.

We talked about how the labels and names could help us think critically about what we wanted to call tested, and how specific we want to be about the ideas we've covered.
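To make that discussion concrete, the mindmap could be typed out as nested labels, something like the sketch below - my own reconstruction for illustration, not an artifact from the course:

    # The labels so far; the ideas the students missed are marked as
    # comments, because unlabeled ideas are the ones that get lost.
    toolbar_model = {
        "Delete": [],
        "Rotate (x2)": [],
        "Flip (x2)": [],
        "Z-Order": {
            "numbers": ["positive", "negative"],
            "decimals": [],
            "text": [],
            # missed: enter vs. arrow keys vs. moving focus out
        },
        "Angle": {
            "numbers": ["positive: only 9, on two images at once"],
        },
        # missed: tooltips, undo/redo in a separate menu, the x-mark
    }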

As we worked through the symmetry, the group tried a decimal number. Decimal numbers were flat out rejected for the Z-order, which is what we expected here too. Instead, we found that when changing the angle from 1 to 5.6, the value ended up as 5 as we pressed enter. Changing 4 to 4.3 still showed 4.3 after pressing enter, and would go to 4 only when moving focus away from the toolbar. We noted another bug:
  • Input validation for decimal numbers worked differently depending on whether the value stayed within the same digit or moved to another.
As we were isolating this bug, part of the reason it was so evident was that the computer we were testing with was connected to a projector that amplified sounds. The error buzz was very easy to spot, and someone in the group realized there was an asymmetry in those sounds between the Angle field and the Z-order field. We investigated further and realized that the two fields, appearing very similar and sitting side by side, dealt with wrong inputs in an inconsistent manner. This bug we did not only note, but spent significant time writing a proper report on, only to realize how hard that was.
  • Input validation was inconsistent between two similar-looking fields.
I guided the group to review the tooltips they had not labeled and named, and as they noticed one of the tooltips was incorrect, they added the label to the model and noted a bug:
  • The tooltip for Angle was the same as the Z-order description.
In an hour, we barely scratched the surface of this area of functionality. We concluded with a discussion of what matters and who decides. If no one mentions any of the problems, most likely people will imagine there are none. Thinking back to a developer giving a statement about me exploring their application on the Cucumber podcast:
She's like "I want to exploratory test your ApprovalTests" and I'm like "Yeah, go for it", cause it's all written test first and its code I'm very proud of. And she destroyed it in like an hour and a half.
You can think your code is great and your application works perfectly, until someone teaches you otherwise.

I should know, I do this for a living. And I just learned that the things I tested work 50% in production. But that, my friends, is a story for another time.


It's not What Happens at the Keyboard

"What if we built a tool that records what you do when you test?", they asked. "We want to create tooling to help exploratory testing.", they continued. "There's already some tools that record what you do, like as an action tree, and allow you to repeat those things."

I wasn't particularly excited about the idea of recording my actions on the keyboard. I fairly regularly record my actions in the form of video, and some of those videos are the most useless pieces of documentation I create. They help me backtrack what I was doing, especially when there are many things that are hard to observe at once - but how often is watching a video a better use of my time than trying the same things again at the keyboard? Not very often. Or they help in figuring out a pesky condition I created without even realizing it was connected. But even for that, 25 years of testing has brought me better mechanisms for reconnecting with what just happened, and I've learned to ask (even demand!) for logs that help us all when memory fails, as the users are worse at remembering than I will be.

So, what if I had that in writing? Or in an executable format? It's not like I'm looking for record-and-playback automation, so the value those would provide must be elsewhere. Perhaps it could save me from typing details down? But to get from typing to just the right thing - after all, I'm writing for an audience - I would need to clean it up to the right thing, or not mind the extra fluff there might be.

I already know from recording videos and blogging while testing that the tool changes how I test. I become more structured, more careful, more deliberate in my actions. I'm more on a script, just so that I - or anyone else - would have a chance of following along later. I unfold layers I'm usually comfortable with, to make future me and my audience comfortable. And I prefer to do this after a rehearsal, as I then know more than I usually do when first learning and exploring.

A model of exploratory testing starts to form in my head as I process the idea of tooling built on data collected from the activity. I soon realize that the stuff the computer could collect data on is my actions on the computer. But most of exploratory testing happens in my head.


The action on the computer is what my hands end up doing, and what ends up happening with the software - the things we could see and model there. It could be how a page renders, captured precisely as it is, so that in the future I have an approved golden master to compare against. It could be recognizing elements, and which ones are active. It could be the paths I take.
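The golden-master part is the piece I can most readily imagine as code. A minimal sketch of the idea - my own illustration, not any actual tool's API:

    # Golden master, minimally: compare current output to an approved
    # file; on mismatch, stash what we received for a human to review.
    from pathlib import Path

    def verify(name: str, received: str, where: Path = Path("approvals")) -> None:
        where.mkdir(exist_ok=True)
        approved = where / (name + ".approved.txt")
        got = where / (name + ".received.txt")
        if not approved.exists() or approved.read_text() != received:
            got.write_text(received)  # renaming this to .approved.txt
                                      # is how a human approves the change
            raise AssertionError(name + ": output differs, review " + str(got))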

It would not know my intent. It would not know the reasons why I do what I do. And you know, sometimes I don't know that either. If you ask me why I do something, you're asking me to invent a narrative that makes sense to me but may be a result of the human need to rationalize. But the longer I've been testing, the more I work with intentional testing (and programming): saying what I want, so that I know when I'm not doing what I wanted. With testing, I track intent because it changes uncontrollably unless I choose to control it. With programming, I track intent because if I'm not clear on what I'm implementing, chances are the computer won't be doing it either.

As I explore with the software as my external imagination, there are many ways I can get it to talk to me. What looks like repetitive steps could be observing different factors, in isolation and in chosen combinations. What looks like repetitive steps could be me making space in my mind to think outside the box I've placed myself in, inviting my external imagination to give me ideas. Or what looks like repetitive steps could be me being frustrated with the application not responding, and just trying again.

Observation is another thing the human side of exploratory testing brings. We can have tools, like a magnifying glass, to enhance our ability to observe. But ideas of what we want to observe, and their multidimensional nature, are hard to capture as data points, and even harder to capture as rules.

Many times the way we feel - our emotions - gives another dimension to our observations. We don't see things just with our eyes, but also through how we experience them. Feeling annoyed or frustrated is an important data point in exploratory testing. I often find myself thinking that the main tool I've developed over the years comes from psychology books: naming emotions, picking up when they come into play, and noticing the reasons for them. My emotions make me brave enough to speak about problems others dismiss.

Finally, this is all founded on who I am today: the skills, habits and knowledge I build upon. We improve every day as we learn. We know a little more (knowledge), we can do a little more (skills), and we can routinely do a little more (habits). In all of these we both learn and unlearn.

I don't think any of the four human-side parts of exploratory testing can be seen by looking at the action data alone. There's a lot of meaning to codify before tooling in this area is helpful.

Then again, we start somewhere. I look forward to seeing how things unfold.