A Seasoned Tester's Crystal Ball: 2019

Tuesday, December 31, 2019

2019 and Me

Every year I look back and see how things turned out. While I'm one of those people who collects numbers, the relevant insights are hardly ever quantitative in nature.

At work, I looked at how our team had evolved based on the tracks we left in the world and learned that we've grown more used to uncertainty, incremental plans and delivering great with continuous flow. I learned about ways people had grown, and exceeded their past selves.

My past self sets me a standard that I work against. Not the other people. Past me. And how the current me is learning to be more of me, more intentional and accidental, and free to make my choices. I don't want to live my life on other people's defaults, I want to tweak my own settings and explore where they take me.

I tried many different settings in 2019:

I allowed myself to be *less responsible* and let things fall forward when I was low on energy. I learned I have people close to me who catch things, take balls forward and don't blame me for being a human.
I tried not blogging for six months. It was hard to stop a routine but also felt liberating to confirm why I blog when I do. I have not written for an audience in general, but just allowed people to see what I write for myself.
I tried blogging for an audience, behind a paywall. I did not enjoy it and came to the idea of blogging and making videos for an audience in 2020. Can't wait to try that one.
I said yes to all speaking that pays a minimum of travel but applied for none. Ended up with 20 talks.
I auto blocked 700 people on twitter to learn about enforcing boundaries and doing what I needed over other people's comfort.

Some things I will remember from this year:

FlowCon talk in France and my talk #400 - with standing ovation.
DDD EU keynote in the Netherlands and finding my crowd.
Making it to 100 most influential in ICT in Finland list and having a 4-page article of me in ITViikko magazine
Talking to 150 people for Calls of Collaboration to choose speakers for my own conference (European Testing Conference) but also others with TechVoices (Agile Testing Days USA, Selenium Conference London keynote)

I have some numbers too:

45 blog posts with a half-a-year break from blogging (2018: 110), but split across 3 platforms
20 talks, out of which 6 keynotes
+2 countries I have spoken at, totaling 26 now
8 graduated TechVoices mentors (I helped them become speakers!)
2 conferences organized
2 Exploratory Testing Peer Conferences organized for #35YearsOfExploratoryTesting
50 flights, sitting in planes for 165 hrs to fly total of 107 878 km (2018: 120 hrs)
5662 twitter followers - after blocking 700
+58 556 page views in the year totaling to 631 004 page views all time to my blog (2018: +81820, 582 448)

While all of the above are on my "on the side of work" achievements, work is where I go to learn.

I learned about business value, and how to discuss it a little better at office, creating a business value learning game.

I learned about making space for people to discover what they are capable of doing, and not pointing out when they contradict their past selves before they are ready to see it.

I learned that the manager role is exactly like my tester role except for three things: 1) having to click "approve" as manager comes with the role 2) feeling equal with the most intimidating, wonderful and special developers I did not realize I wasn't feeling equal with even though I already was 3) *lack of performance* management is the hardest job I have ever done.

I learned I am a master of procrastination as I can turn ideas into code without writing code myself and I want to overcome my internal excuses of not just doing it.

I was there to witness us moving to great and improving results and might have had something to do with some of it.

Turning the "impossible" to possible should happen any moment now, when my consistent push for 3 years turns into continuous deployment for our product type.

2019 was great. 2020 just needs to be different.

Happy new year y'all.

Thursday, December 5, 2019

A New Style for Conference Speaker Intake: Call for Collaboration

Drawing from a personal experience as conference speaker and conference organizer wanting to see change in how conference speakers are selected, I have been experimenting with something completely different.

The usual way for conferences to find their speakers is casting two nets:

Invite people you know
Invite everyone to submit to call for proposals/papers (CfP) and select based on the written submission

Inviting works with people with name and fame. If you want to find new voices with brilliant stories from the trenches, the likelihood of you not knowing all those people (yet) is quite high. Asking them to announce themselves makes sense.

How a speaker announces their existence to a conference is where I have discovered that a completely new way of dealing with submissions makes a difference.

What is a Call for Proposals/Papers

In the usual world of announcing you might be interested in speaking in a conference, you respond to a Call for Proposals/Papers. The Papers version is what you would expect in more academically oriented conferences, and the paper they mean is usually an 8-page document explaining result of years of research. The Proposals version is what you would expect in a more industry-oriented conference, and the proposal is a title, 200-words abstract and 200-words bio of yourself, and whatever other information a particular conference feels they want to see you write that would help them make selections.

While speaking in public is about getting in front of a crowd to share, conference CfPs are about writing. The way I think of it is that writing is a gate-keeping mechanism to speaking in conferences.

As a new speaker, learning to write in this particular style to be accepted may be harder than getting on that stage and delivering your lessons by speaking about them. At the very least, it is a different set of skills.

In my experiences in working to increase new voices and diversity at conferences, there are two things that most get in the way:

Finances - underrepresented groups find it harder to finance their travel if conference does not address that
Writing to the audience - unrehearsed people don't write great texts of their great talk ideas. Many feel the writing to be a task so overwhelming they don't submit.

Conferences try to help people in multiple ways, usually seeking writing-based ways. It is fairly common to expect a conference to provide some feedback on your written text, especially when using supportive submission systems where you then improve your text based on feedback. But the edits are usually minor even when you could frame your talk differently to make it better presented. Many ask for speaking samples (videos), adding to the work expected in the competition for a conference speaking slot. Some conferences shortlist proposals and then call people, to ensure the speaking matches the writing. Some conferences realize after selection you could use help and call mentors like myself to help bring out the better delivery of an already great idea.

What is a Call for Collaboration

Call for Collaboration is a submission process I have been discovering for the last five years, coming to terms with my discomfort on choosing a speaker based on writing instead of speaking. I have felt I don't appreciate the purely competitive approach of writing for a CfP to win a speaking slot, and wanted to find something different.

Call for Collaboration is about aspiring speakers announcing their existence and collaborating on creating that proposal. It's a process where the conference representatives invest online face to face time to getting to know great people they could invite. And it's a process where the investment from the speaker side is smaller, creating less waste in case of not fitting the scarce conference slots. It's a human-human process where people speak and instead of assessing we build the best possible proposal from whatever the aspiring speaker comes in with.

In Call for Collaboration (CfC), we appreciate that every voice and story belongs on a stage, and making the story the best form of itself increases its chances at this conference, but has a ripple effect of improving it for other conferences too.

This submission process was first created for European Testing Conference, and later used for TechVoices track for Agile Testing Days USA 2018 and 2019, and TechVoices keynote for Selenium Conference London. So far I have done about 500 15-minute calls over the years of discovering this.

How Does This Work?

It all starts with an aspiring speaker thinking they want to make their existence and idea known for a particular Call for Collaboration a conference kicks off and being willing to invest 15 minutes of their life to have a discussion about a talk idea they have.

Image. TechVoices version of CfC + Activity Mentoring

Schedule a Call

To announce their existence, they get a link created with Calendly that shows 15-minute timeslots available to schedule.

Behind the scenes, a conference representative has connected Calendly to their calendar knowing when they are not available and defined time frames when they accept calls and limits to numbers of calls per day. They can define questions they want answered, and I usually go for minimum:

Your talk's working title
Optional abstract if you want to pass us one already
Your pronouns

Each call from a conference representative perspective is 15 minutes, like a coffee break. It includes taking an online call to someone anywhere in the world and meeting someone awesome.

If the aspiring speaker has something come up, they can reschedule with Calendly. Calendly also handles timezones so that both parties end up expecting the same time - at least if you have the tool create a calendar appointment for you.
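To illustrate why timezone handling matters for these calls, here is a minimal sketch of translating one 15-minute slot between two locations. This is not Calendly's API. The function name, the example date and the fixed UTC offsets (Helsinki and New York in winter) are assumptions made up for the illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed winter offsets, for illustration only.
# A real scheduling tool would use a timezone database that knows about daylight saving.
HELSINKI_WINTER = timezone(timedelta(hours=2))
NEW_YORK_WINTER = timezone(timedelta(hours=-5))

def slot_in_other_timezone(start, other_tz, minutes=15):
    """Return (start, end) of a call slot as seen in another timezone."""
    # Do the arithmetic in UTC so both parties mean the same instant.
    start_utc = start.astimezone(timezone.utc)
    end_utc = start_utc + timedelta(minutes=minutes)
    return start_utc.astimezone(other_tz), end_utc.astimezone(other_tz)

# Hypothetical slot: the conference representative offers 16:00 Helsinki time.
organizer_start = datetime(2019, 12, 5, 16, 0, tzinfo=HELSINKI_WINTER)
speaker_start, speaker_end = slot_in_other_timezone(organizer_start, NEW_YORK_WINTER)

print(speaker_start.strftime("%H:%M"), "-", speaker_end.strftime("%H:%M"))
```

The same instant shows up as an afternoon coffee break for one party and a morning slot for the other, which is exactly the kind of thing you want a tool to keep straight for you.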

Show Up for the Call

The 15-minute call is for collaboration. It starts with establishing we don't need to discuss credentials but just the talk idea and that everyone is awesome. We are not here to drop people away from the conference, but to understand the world of options and make this particular option shine, together.

It continues with the aspiring speaker telling how they see their talk: what is it about, what they teach and what the audience would get from that.

The usual questions to ask are on "Would you have an example of this?" and "What is your current idea of how you would illustrate this?" or on "Have you considered who is your audience?" or "Why should people care about this?" or "We know you should be talking of this, but how would you tell that to people who don't know it yet and have many options in similar topics?".

I have had people come to the call with the whole story of their life in agile, and leave with one concrete idea of what they are uniquely able to teach. There are talks in this world that exist because they were discovered through these 15-minute discussions.

For some of the calls, we've had a whole group of people from the conference - this serves as a great way of teaching the mechanism further - mentoring the mentors to take the right mindset. We are there to build up the speaker and idea, not to test it for possible problems.

Share to the World

At the end of the call, the conference representative asks for permission to summarize what they learned about the talk in a tweet with reference to the aspiring speaker, and with permission shares that. Sharing serves three purposes: it helps remember what the talk was about (to prioritize for invitation to work on the talk further); it allows the aspiring speaker to confirm if their core message was heard and to correct; and it creates a connection for the aspiring speaker to other people interested in this theme in the community.

Prioritize to Invite

Now we are at a point where there isn't really an abstract, but there is a tweet and there are the lessons of what the abstract could be about from the aspiring speaker to the conference representative. We can make a selection based on how people speak in that call, and particularly, what unique content they would bring to that conference.

If someone isn't quite there yet on how they deliver their message, we can invite them and ask them to pair up with an activity mentor for rehearsing the talk. This is the only way to get some of the unique new experiences from people who are not accustomed to speaking in public. With rehearsing, people can do it. The only concern around rehearsing I have sometimes is on English skills - I have mentored people who would either need a translator (used a translator in our call) or a few more years of spoken English.

This is a point where you have usually used about same effort on the person as you would if you were carefully reading their written abstract - but you might have now a different talk to consider as a result of the collaboration.

If you invite, the next step is needing the abstract for the conference program. Or, it could be that this is the abstract used for yet another round of selections if you want to pin this process on a more traditional CfP.

Activity Mentoring for Conference Proposal - What Is This?

The activity of creating that title, abstract and bio to show the best side of the talk is the next part. The newer the speaker, the harder this is to get right without help. A natural continuation of CfC is activity mentoring, ensuring the written text as a deliverable of the process reflects the greatness of the talk.

1st draft is what comes out of the aspiring speaker without particularly trying to optimize for correctness. It is good to set up expectations, but also encourage: something is better than nothing. This is just a start.
2nd draft is what comes out when the conference representative from the call puts together the 1st draft, their notes on what the talk is about, the tweet they summarized things into and their expertise on abstracts. It's usually an exercise of copy-pasting together my notes of their spoken words and their written words in an enhanced format.
Submission is what the conference system sees, and it is an improved version of 2nd draft.

Example

The latest effort from this process will be soon published as the TechVoices Track of Agile Testing Days USA 2020. 9 speakers with stories from new speakers that are not available without taking the effort to get to know the people. Many of these voices would not be available if the choice was done on what they wrote alone. Every single one of these voices is something I look forward to, and they teach us unique perspectives.

I still have another activity mentoring period ahead of me with a chance of hearing the premieres of these talks before the conference and helping them shine with yet another round of feedback.

We chose 9 talks out of 45 proposals invited from USA, South America and Canada. I also had a few calls with ideas from people not from invited geographies and helped them figure out their talks in the same 15 minute slots.

As a mentor, I had time to talk to all of these people and feel privileged to have had the chance of hearing their stories and telling about their existence through tweets. I would not have had time to help all of them get the proposals to a shape where they could get accepted to the conference. The activity mentoring is a focus-draining activity, whereas having a call is more easily time-boxed and happens without special efforts.

I spent 11 hours over timeframe of a month talking to people and getting to know them. I spent another 10 hours on the 9 people that were selected.
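The hours follow directly from the slot length. A small sketch of the arithmetic, assuming one 15-minute call per proposal (the few extra calls outside the invited geographies are left out, and the variable names are mine):

```python
CALL_MINUTES = 15        # length of one Call for Collaboration slot
proposals = 45           # proposals from the invited geographies
selected = 9             # talks chosen for the track
mentoring_hours = 10     # activity mentoring time spent on the selected speakers

# Time-boxed calls: one slot per proposal (an assumption for this sketch).
call_hours = proposals * CALL_MINUTES / 60

# Activity mentoring is spread over far fewer people.
mentoring_minutes_each = mentoring_hours * 60 / selected

print(f"{call_hours:.2f} hours of calls")
print(f"{mentoring_minutes_each:.0f} minutes of mentoring per selected speaker")
```

The calls come out at a little over 11 hours for everyone, while the mentoring works out to more than an hour per selected speaker, which is why the call scales to all comers and the mentoring does not.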

The hours the 15-minute slots make have been the best possible testing awareness training I could have personally received. It has given me a lot of perspective, and made me someone who can drop names and topics for conferences that have a more traditional Call for Proposals.

Tuesday, December 3, 2019

The First Test On a New Feature

Testing various features, I have just one rule:

Never be bored.

Sometimes I try to figure out a template to what I do when I test and how I end up doing what I do, and it always boils down to this. Do something that keeps you engaged.

Today I was testing a feature about scheduling tasks of all sorts, and as I was thinking about getting started, I told myself I should test the basic simple positive scenario first. Not that I thought that wouldn't work, but to show myself how the feature could work. I've used that rule a lot, but today that rule did not make me feel excited.

Instead, I started off with a list of examples that was provided. I quickly glanced through the list, and selected one that looked complicated, telling myself that if something wouldn't work, that would probably be it.

It turns out I was right. Instead of seeing the feature work, I got to see a fairly non-explanatory error message on the log I was monitoring. So I ran a second test, the simplest sample from the list, to see how it could work. From starting testing to discussing a bug with the developer was just minutes away.

Meanwhile, I was having a discussion with another developer to refresh my ideas of test strategy: would we test these types of things in long term reliably with unit tests, yes - getting to a confirmation that while fixing what I complained about, the first developer had already improved the unit tests on it.

Similarly, the fix for the bug was also just minutes away, and I needed to chip in some more ideas.

I ended up asking a lot of questions. Questions around how things would work on leap years and days, impossible combinations of months and days and instead of ending up testing these, the developer volunteered. Questions around how time could be too short or too long for our taste. Questions around what the log would show. Discussions changed my perspectives, and clarified our shared intent around what we were building.

I ended my day of testing with googling evil cron for testing, and had lots of fun with online tools generating me test data I could try. Things that were not supposed to work did and things that really stretched what was supposed to be valid were confirmed. I explored time calculation, passing valid and invalid values. And as always with testing, had a great time.
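The questions about leap days and impossible month and day combinations can be turned into test data fairly mechanically. Below is a minimal sketch of that idea. It is not how our feature is implemented, and the function names and candidate values are made up for illustration.

```python
import calendar

def max_day_in_month(month):
    """Largest day-of-month that can ever occur in this month, leap years included."""
    # 2020 is a leap year, so February reports 29 days here.
    return calendar.monthrange(2020, month)[1]

def can_ever_fire(day, month):
    """True if a schedule pinned to this day and month matches any real date."""
    if not 1 <= month <= 12 or day < 1:
        return False
    return day <= max_day_in_month(month)

def fires_in_year(day, month, year):
    """True if the day and month combination exists in the given year."""
    return can_ever_fire(day, month) and day <= calendar.monthrange(year, month)[1]

# Hypothetical candidates: valid, leap-year-only, impossible and out-of-range values.
candidates = [(29, 2), (30, 2), (31, 4), (31, 12), (0, 1), (15, 13)]
for day, month in candidates:
    print(f"day={day} month={month} can ever fire: {can_ever_fire(day, month)}")

print("Feb 29 fires in 2019:", fires_in_year(29, 2, 2019))
print("Feb 29 fires in 2020:", fires_in_year(29, 2, 2020))
```

A list like this gives you values that are supposed to be rejected, values that are valid only some years, and values that stretch what is valid, so each one is a small question to ask of the feature and its log.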

These are first tests, and I'm not done. Instead of testing this all by myself, I invited a group of people to do mob testing with me on this, to test my own thinking and improve our test strategy around what makes sense to automate even further.

There is no first absolute test. It is never too late to do the thing you thought you should have done first. With testing, very few options expire. You just need to keep track of your options.

Sunday, December 1, 2019

Sequences of time and cognition

Back in the days when we were starting to figure out what the essential difference between exploratory testing and the other testing was, we figured out that we would call the other scripted testing.

We recognized separations of sequence in:

time - activities separated by time
cognition - activities separated by skill focus

For exploratory testing to take place, the design and execution activities needed to be intertwined so that both time and cognition could not allow for separation. The reasons were clear:

designing and planning what to do early when you know the least makes little sense and has a high risk of wasteful activity
designing and planning by a "more skilled" person to be executed by "less skilled" person assumes knowledge can be transferred with a document instead of seeing it as something acquired through effort.

With DevOps and Continuous Delivery, the time separation has transformed. We still see some of it in the sprint-based ways of working where we start with BDD-style scenarios very much separated in time even if only by days or hours. That part is scripted testing, not exploratory testing.

The cognitive separation has also transformed. For exploratory testing to be great, the cognitive sequence of today includes using code as a way of executing things and documenting things, and this can only be included if the exploratory tester knows programming at least to a level of effectively pairing with others in turning ideas into code. The scripted variants separate the cognitive sequence into the person who designs the tests, the person who automates the tests and monitors them, and the person who tests stuff that automation may not cover, perhaps in what we used to know as the exploratory testing way.

Thinking through this, I have come to the idea of cognitive bridges. In close collaboration with teams, the cognition around the testing activity even when it is split for multiple people can either be isolated (not exploratory) or bridged (exploratory).

We used to have cognitive bridges we organized for larger exploratory testing efforts through time-boxing and debriefing with other testers. Now we have cognitive bridges in the whole team, and with activities beyond just plain testing. The one that fascinates me the most right now is how the tone of discussing developer intent builds a cognitive bridge.

Friday, November 22, 2019

35 Years of Exploratory Testing

Exploratory Testing is 35 years old this year. I've celebrated this by organizing two peer conferences to discuss what it is today and my summary is that it is:

more confused than ever
different to different people
still relevant and important

Cem Kaner described it 12 years ago this way:

The core of what I pick up from this is emphasis on individual tester without separation of cognitive sequence, optimizing value through opportunity cost awareness and learning supporting test design and execution throughout the work being done.

I dropped words like project (because in my world of agile continuous delivery, they don't exist), and added words like opportunity cost to emphasize the continuous choice of time on something being time away from something else. I also brought in the concept of cognitive load separation as the idea is to not separate work into roles but build skills and knowledge in the people through doing the work.

Exploratory Testing after 23 years

Cem's notes are available in the presentation, and I wanted to summarize them here. He identified four areas to describe from his circles:

Areas of agreement

Definitions
Everyone does it to a degree
Approach, not a technique
Antithesis to scripting

Areas of controversy

Not quicktesting - packaged recipes around particular theory of error; requires domain and application knowledge to do well
Not only functional testing - quality beyond functional is of concern to exploring
Uses tools - test automation is a tool, but there could also be tools specifically in support of exploratory testing
Not only test execution - not a technique but an evolutionary approach to software testing. You can do all things testing in an exploratory and not-exploratory way.
Complex tests requiring prep included - cycle of learning is not in the moment but varies
Certifications - don't understand this style of testing and can be worthless or anti-productive for the industry trying to do exploratory testing

Areas of progress

Understanding of quicktests - from works of Whittaker, Hendrickson and Bach
Oracle problem - thinking around oracles has evolved
Learning and cognition in the focus of ET - individual and paired work
Multiple guiding models - everyone with their own

Areas of ongoing concern

Modeling an area of early understanding - talk of it goes on
Myths - making way to understand it is cognitively challenging, skilled and multidisciplinary
Tracking and reporting status - dashboards and time-boxed approaches were the whim of the day
Individual tester performance - we don't know how to assess that
No standard test tool suite - tools guiding thinking in the smart way

Exploratory Testing after 35 years

Previous areas of agreement are not that anymore.

Some folks have decided that Exploratory Testing is deprecated.

Other folks are rediscovering Exploratory Testing as the smart way of testing that was relevant 35 years ago that is different today when automation is part of how we explore rather than a separate technique.

We still agree that it is an approach not a technique, just not if it is necessary separation. Wasteful practices of testing both with and without automation are still popular, and it is a cost-aware approach to casting nets to identify quality-related information.

Previous areas of controversy are old lessons and not particularly controversial. Some of them (like certifications) describe a divided industry that no longer gets the same attention. Most of what was controversial 12 years ago, is now part of defining the approach.

Areas of progress seem far from done from today's perspective. Quicktests give less value with the emergence of automation. Oracles have been a focus of Cem Kaner's teaching for a decade after the previous summary and we are working to understand them with respect to automation being closely intertwined in our testing. Paired work of testing was a whim of a moment as a focus of really understanding the cognitive side of it, but in addition to paired work, we now have mob testing - work in a group.

Concerns are still concerns except for tracking and reporting status. With automation integrated into exploratory testing and continuous delivery, status is not a problem but industry is further divided to those in the fast paced deliveries and those who are not.

What do we agree on, what are the controversies, what progress can we hope we are still making and what concerns we have at 35 years of exploratory testing?

Areas of agreement

All testing is exploratory to a degree
Exploratory testing is skilled, multidisciplinary and cognitively challenging and finds unknown unknowns

Areas of controversy

Is exploratory testing even a necessary concept in the world of continuous development, when the forms of it happening can be labeled as "automation", "pairing", "production monitoring", "learning", and "test automation maintenance"?
The early experts cling to materials and lessons from early 2000 and fail to connect well with a wider community, creating a tester community isolated from the overall software communities

Areas of progress

New voices thinking and sharing from practice first, in agile development delivering continuously: Anne-Marie Charrett, Alex Schladebeck, Maaret Pyhäjärvi and many others are working actively to move the area further
Developers are doing exploratory testing, the split to this being tester specialty is giving way to working together and learning together

Areas of concern

Lack of shared learning and seeking common understanding. The field feels more like a competition of attribution than learning to do testing well in modern circumstances.
Trainings on exploratory testing are hard to select for organizations as they include very different things in similar looking box. A field as big as exploratory testing should have a whole series of trainings, not one that introduces the concept again and again.
Testers, in particular those without coding skills, are dropped out of industry. We are losing people who believe they don't code without helping them see their contributions through collaboration skills. People dropping out are disproportionally women. The old adage of "coding not being trainable" to many testers is harmful.

Testing Computer Software, 2nd ed

I'm collecting some of the history of Exploratory Testing and for that, going into selected references that brought the idea to me. The first book I read on testing (I'm a newbie, only 22 years in the industry :D ) was

Cem Kaner et al. Testing Computer Software, 2nd ed. published: 1999

For research purposes, I have had my hands on the 1st edition too, which is from 1988 but it's been a while.

For the second edition, I go for index and look for exploratory testing.

exploratory testing
- generally 6, 7-11, 215
- boundary conditions 7-11
- discover the program's limits 215, 241

Page 6 is in the chapter "The first cycle of testing", at Step 4: Do some testing "on the fly". Here the book speaks of running out of formally planned test cases. It emphasizes that regardless, you can keep testing. It encourages thinking of new things without spending much time preparing or explaining the tests. It encourages trusting instincts and trying anything that feels promising.

It introduced the example in question as something where you switched from formal to informal very early on as the program crashed. And then it introduces the concept:

"Rather than gambling away the planning time, try some exploratory tests - what ever comes in mind."

It then pulls a core concept out:

"Always write down what you do and what happens when you run exploratory tests."

It moves on to give examples of how surprising problems you did not anticipate when planning can leave your formal test series useless for figuring out what the problems with the software are.

Pages 7-11 go into Step 5: Summarize what you know about the program and its problems. It introduces this with a table, "Further exploratory tests". It then discusses a possible implementation as the reason for the problems testing is finding, explaining ASCII character codes and linking boundaries to them. Finally, it describes a second cycle of testing after fixes and reminds there will be multiple cycles and that you want to select tests for each cycle based on what the programmer said she changed. The book emphasizes that it is not a mere act of selecting but you need to come up with new tests too.

Of total of 16 pages on the first sequence of testing, the book discusses exploratory tests for 5 pages at a time the rest of the world was still speaking only of scripted tests.

Page 215 talks about Initial development of test materials. It talks of parallel work on testing and the test plan saying "you never let one get far ahead of the other". It moves to talk about the first prompts you should consider: testing against documentation, starting to create a function list and analyzing limits. It never mentions exploratory testing, but describes how a plan is created as per "our approach" which reads as exploratory.

Page 241 is on components of test planning documents and specifically three kinds of matrices: environment matrix, input combination matrix and error message and keyboard matrix. For input combination matrix, it elaborates:

"Our approach is more experiential. We learn a lot about input combinations as we test. We learn about natural combinations of these variables, and about variables that seem totally independent. We also go to the programmers with the list of variables and ask which ones are supposed to be totally independent."

The book was a collaboration between Cem Kaner, Jack Falk and Hung Quoc Nguyen and as such it does not outline if exploratory testing was coined by one or all of them, but it being published under their names says there was some level of agreement to allow such a concept in the book.

While these are the specific parts marked for "Exploratory testing", the way I read it, the whole book is a description of exploratory testing as it was known then - the smart way of testing with care but attention to costs and nature of testing in product companies that lived or died with wasteful practices.

Thursday, November 21, 2019

Tools Guide Thinking

I have spent a significant chunk of my day today in thinking about how exploratory testing is the approach that ties together all the testing activities even when test automation plays a significant role. Discussing this with my team from every developer to test automation specialists to the no-code-for-me-tester, finding a common chord isn't hard. But explaining it to people who have a different platform for their experiences isn't always easy. Blogging is a great way of rehearsing that explanation.

I frame my thinking today around the idea I again picked up from Cem Kaner's presentation on Exploratory Testing after 23 years - presented 12 years ago.

"Tools guide thinking" - Cem Kaner

Back then, Cem discussed tools that would support exploratory thinking, giving examples like mindmaps and atlas.ti. But looking back at that insight today, the tools that guide a rich multi-dimensional thinking can be tools we think of as test automation.

We have a tool we refer to as TA - short-hand for Test Automation. It is more than a set of scripts doing testing, but it is also a set of scripts doing testing. To shortly describe the parts:

machinery around spawning virtual environments
job orchestration and remote control of the virtual machines
test runners and their extensions for versatile logging
layers of scripts to run on the virtual environments
execution status, both snapshot and event-based timelining

Having a tool like this guides thinking.

When we have a testing question we can't see from our existing visualizations, we can go back to event telemetry (both from product and TA) and explore the answers without executing new tests.

When we want to see something still works, we can check the status from the most recent snapshot automatically made available.

When we want to explore on top of what the scripts checked, we can monitor the script real time in the orchestration tooling seeing what it logs, or remote to the virtual machine it is running and watch. Or we can stop it from running and do whatever attended testing we need.

We can explore a risky change seeing what the TA catches and move either back or forward based on what we are learning.

We can explore a wide selection of virtual environments simultaneously, running TA on a combination we just designed.

When we want a fresh image to test on without any scripted actions going on, we take a virtual environment that is at hand, ready to run in the 2 seconds it takes to type it into a remote desktop tool.

It makes sense to me to talk about all of this as exploratory testing, and split it to parts that are by design attended and unattended. A mix of those two extends my exploration reach.

With every test I attend to either by proactive choice or reactive choice being called in by a color other than blue (or unexpected blue knowing the changes), I learn after every test. I learn about the product and its quality, about the domain and for exploratory testing most importantly, I learn about what more information I want to test for.

Tool guides my thinking. But this tooling does not limit my thinking, it enables it. It makes me a more powerful explorer, but it does not take away my focus on the attended testing. That is where my head is needed to learn, to take things where they should go. Calling *this* manual is a crude underrepresentation of what we do.

A Good Week for Exploratory Testing

Exploratory Testing is a 35-year-old approach to testing, created back in its days to describe a style of testing common in Silicon Valley companies that was clearly different from what we saw as mainstream.

In the 35 years of Exploratory Testing, we've grown to understand it better.

We now understand that the core of exploratory testing is the degree to which learning from the test we do now impacts our choice of what we do next.

We understood clearly that the old and traditional way of testing where people design and document tests to be later executed effectively is not what we would call exploratory testing. Exploratory testing is something else.

Where we still struggle is understanding the relationship of test automation and exploratory testing.

Test automation is a process in which we learn. Exploratory testing is a process in which we learn. When we center incremental and iterative, and learning in multiple dimensions, they are the two sides of the same coin.

This was a good week for exploratory testing, because someone many of us follow raised it up by writing about it.

new post: I'm a strong proponent of extensive automated testing, but there is still an essential role for Exploratory Testing https://t.co/ZGdTaLkAMu
— Martin Fowler (@martinfowler) November 18, 2019

Fowler describes it as:

a style of testing that emphasizes a rapid cycle of learning, test design, and test execution
avoiding separation of script creation and execution
avoiding predetermined expected behavior (being open to more than what we predetermine)
seeking to find new behaviors not covered by already defined tests
seeking failures defined tests don't catch
informal but relying on discipline to do it well
something that is good to consider as a task type of its own
requiring skill, curiosity and a learning mindset

Sounds like how great test automation is created! But this is exploratory testing? YES!

Fowler concludes his article:

Even the best automated testing is inherently scripted testing - and that alone is not good enough.

I argue that the best automated testing is inherently exploratory testing. The bad automated testing is inherently scripted testing. Because test automation, when it really works out, is a product of documenting what we are learning in an executable artifact.

I specialize in exploratory testing. While I read and even occasionally write automation code, it is a platform for a great exploratory testing. It extends my reach. Exploratory testing includes test automation.

In testing, we start with a baseline of knowledge and tools. I start my day with a very different baseline than someone new to my organization. Where exploratory and scripted approaches differ is in where our feedback loops are, and what learning is welcomed. Scripted approach learns about testing in the end, rather than continuously.

Saturday, November 16, 2019

From Single Scenario to Feedback in Test Automation

Understanding your system and environment is core to designing a strategy of how you balance unattended testing with test automation against attended testing.

Imagine you just built a new feature that really sums up to a single scenario: your user can say "Do a scan on Wednesdays at 10.21 every second week of the month".

To test that basic positive scenario as attended testing, you kick up a Windows machine you will test on. It just happens to be an out-of-the-box-clean US 64-bit Windows 10 with the latest OS patches all in it. You confirm that indeed there is a scan on Wednesday (as it happened to be today) at 10.21 and that it looks like it is supposed to happen again in a month. Having chosen a realistic scenario, one that a user would definitely be likely to set up, you leave the machine waiting and make a note to test again in one month, and verify there were no surprise runs in the meanwhile.

If you know something about exploratory testing, you would probably look at the basic positive scenario as a starting point, and explore a lot more around it:

Explore Day, Time and Cadence
Explore type of task, is "Scan" always the same thing, realizing it scans something that could be entirely different
Explore plausible (and less plausible) error scenarios around wrong values, missing values, mixed up values, unintended use
Explore other things that could happen on the computer simultaneously and could have an impact

It all works. Are you done?

You are not. It works on your machine. Your machine that is just one kind of machine there could be.

And it worked today. With the version you had now. It will change.

This is the thinking that drives us to do test automation, or TA as we seem to lovingly call it.

For every scenario, it gets run on:

Versions of Windows for Workstations and Windows for Servers
64bit and 32bit - same code, different executables that only can be tested with right bitness.
Mixes of patch level and "computer cleanliness"

Sometimes, it gets run on more. There's a nice list of Windows OS versions that matters in what we are testing.

No human can attend to all that.

For our 550 scenarios, we ended up with 178745 tests run on the busiest day of last week. A very small percentage of them fail, but the failures are usually crash dumps related to timing that are super hard to reveal by human repetition, or information on changes and their side effects.
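
How scenarios multiply across environments can be sketched in a few lines of Python. The environment values below are illustrative assumptions, not our real list, and the function names are made up; the only point is how quickly the count grows past what a human can attend to.

```python
from itertools import product

# Illustrative values only; the real list of environments is not given here.
OS_VERSIONS = ["Windows 8.1", "Windows 10", "Windows Server 2016", "Windows Server 2019"]
BITNESS = ["32-bit", "64-bit"]
MACHINE_STATES = ["clean, latest patches", "clean, no patches", "long-lived, latest patches"]

# Windows Server 2016 and 2019 ship as 64-bit only, so 32-bit pairs are skipped.
SIXTY_FOUR_BIT_ONLY = {"Windows Server 2016", "Windows Server 2019"}


def environment_matrix():
    """List every valid combination of OS version, bitness and machine state."""
    return [
        {"os": os_name, "bitness": bits, "state": state}
        for os_name, bits, state in product(OS_VERSIONS, BITNESS, MACHINE_STATES)
        if not (bits == "32-bit" and os_name in SIXTY_FOUR_BIT_ONLY)
    ]


def planned_runs(scenario_count, environments):
    """Each scenario runs once on each environment."""
    return scenario_count * len(environments)
```

With these made-up values there are 18 environments, so 550 scenarios already become 9900 runs, before adding repetition over builds during the day.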

This takes us from 'works on my machine' to 'works on some thousands of our machines'. Yet it does not mean that things work in production in full.

The basic scenario gets tested on "my machine" as I implement the automation. The exploration that must happen to script a scenario qualifies as testing.

The other dimensions of attended exploratory testing are relevant. But so is the tooling to enable unattended exploratory testing, one that covers new environments. The tooling that aids us calls us when we need to attend to it.

Yesterday, it was 0.9% of our tests that called us to attend to it. And knowing these dimensions we work, I was very happy with seeing the new telemetry give us the list of priorities of what would make most impact if we attended to it.
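
As a rough sketch of what such a priority list could be computed from, assume each test result is a record with a status and a failure cause. Those field names are my assumption for the example, not the shape of our real telemetry.

```python
from collections import Counter


def attention_summary(results):
    """Return the share of results needing attention and causes ranked by count.

    `results` is a list of dicts with the hypothetical keys 'status' and 'cause'.
    """
    failures = [r for r in results if r["status"] != "pass"]
    share = len(failures) / len(results) if results else 0.0
    # The most frequent cause comes first: fixing it would clear the most tests.
    ranked = Counter(r["cause"] for r in failures).most_common()
    return share, ranked
```

Ranking by count is the simplest possible notion of impact; a real list could also weigh which environments or which changes the failures cluster around.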

And at the same time, I was still doing very traditional exploratory testing to find problems that automation was not the best fit for.

Thursday, November 14, 2019

Seasonal Fluctuation

It's that time of the year in Finland where days are feeling shorter and shorter, with the dark period taking more than half of the day. The lesser amount of light is bound to have its impact: it's dark when we go to work, and it is dark when we get home from work.

Just like people's moods have seasonal fluctuation in response to the visible external impacts, a phenomenon very similar happens for our testing at work.

The easy season to spot for a lot of us in software is ahead of us, with Black Friday starting the Christmas shopping season and the webshops all around the world are under heavy loads from the eager shoppers. We just had a lovely discussion Lisa Crispin from MABL hosted online to cover that one.

But let's think of the visible external impacts a little more. There's more to seasonal fluctuation than the day of heavy load e-commerce sites prep for, and knowing your company's and business's annual clock gives you insight into priorities that can really make you shine in your choices of how you test right now.

Budgeting and End of Year

I think we've all received the emails towards the year closing that we need to remember to invoice whatever expenses we have. Budgeting and end of year are such a common cycle in companies that it is hard to miss.

If you want to ask for a raise, your odds of getting it are very much tied to the annual cycle.

If you want to go to a conference, you could notice a pattern: it's easy at the end of the year ("we have to use our budgets, or we won't get one next year!") or next to impossible before the year changes.

The computer systems have a lot of logic around the year changing, and the budgeting seasons are a specific feature of financial systems of many sorts. When on a normal month we run some batch processing sets, the end of year can be a very special one and we only get to rehearse it once a year, and with little rehearsal, there is often a surprise ahead.

Christmas Freeze

A seasonal feature I have seen a lot is driven by a company's IT and operations department, scheduling annually a freeze time when changes to production don't happen. For other companies, the same period seems to be the time of taking the huge new systems into production, while everything else around is not allowed to change so that granularity for identifying problems is better.

The time just before Christmas freeze is often critical. Make sure you're risk averse at that time, because fixing is slower and harder. It starts with the assumption it should not happen unless it must happen.

The Events

You might have noticed that the plans of the company you work for have a spike at some time of the year, and especially the new-cool-feature is something that creates a special kind of buzz at a certain time.

The big event your company must show new stuff at is bound to have a spike in what is expected of you. The requests of "just one more thing" come from many directions. The emphasis of how important this is eats up a significant chunk of focus. For those of us with a bit of nerves, the seasonal spike promises we must watch out for symptoms of burnout, and create those personal ways of managing the stress the managers of all levels popping in to check progress cause.

The Two Spikes A Year Principle

A more subtle fluctuation I have noticed is that a lot of companies find it necessary to tell something big and important to customers two times a year. If you recognize those times, there's relevant goodwill to collect on paying attention to these spikes and ensuring all the changes we deliver in an agile fashion sum up to a story worth telling. Also that story gives food for thinking about tests we might not have considered in the continuous flow of changes and a timely heads up can be greatly appreciated.

The Human Tendency to Seek Achievements

Two more spikes I see annually come from categorizing people into two boxes. People on the management side bring out a spike around October, ensuring all the dreams and wishes from the whole year they have been planning for get done before the year ends. This is a time when I see a lot of plans, meetings, workshops and the like. For people on the doing side, the spike comes earlier, right after summer vacations, with fully refreshed minds tackling problems they now have ideas and energy to improve on.

Removing fluctuation

Over the years, agile and continuous delivery has changed a lot of the seasonal fluctuation to feel less intense and I welcome the less stressed continuously flowing value approach that can come with it. But the fluctuation is there, and if we notice it, and are able to react appropriately, it can really amplify the impact we have as individuals, and the perceptions of how helpful and useful other people around us consider us.

What is your annual cycle? How about chatting on that the next time you sit for a cup of coffee at work chatting work? The powers that impact us are good to know.

Tuesday, November 12, 2019

A Feature Was Born

There is a fascinating phenomenon that I am following, one I would call "Overanalyze, Underdeliver". I find it fascinating because I catch myself often wanting to slow things down to think, not trusting the others on their thinking and assuming delivering something could be the end of it.

When delivering truly continuously, there is no beginning, and no end. There are just steps on our journey to build something our users find more valuable, rather than less.

There is an evolving product vision we work against. It wasn't defined by product management, but it is most certainly influenced by them channeling different stakeholders - with their particular kind of filter. It is emerging from discussions with many different parties, including customers we actively seek out.

From this foundation, a single developer can have a great idea of how to make things better.

In the last week, I have been following a feature being born and the discussions and actions we take around it.

Awareness of such a feature was born two years ago, and the wishful thinking of it was cut down before it bloomed. It needed people in a particular team to have time for it and they had higher priority work.

In two years, things change. Not that the particular team would have time, they don't. But we took our internal open source practices to a next level, where we don't only share components in our main programming language, but bravely go polyglot-one-more and with the right motivation, can make changes beyond our previous scope.

So it bloomed again. The "we need to think this through" meaning "I can't think this through right now" came about again. But this time instead of spending time on thinking it through in an abstract way, a developer molded the thing in code.

Today came the time to think it through - demoing, testing and improving the feature as a group. Tomorrow is the time to get it in, in a pull request.

A feature was born. It was born in a time where choosing the discussion route, we would still be discussing. What fascinates me most is how much power there is in breaking off the defaults and reorganizing the flow.

We probably saw the earliest exploratory testing & fixing we have been capable of so far - before ever making that pull request.

An early Christmas for a Testing Dreamer. And just a happy day for a process rebel.

Monday, November 11, 2019

What the Testing We Do Looks Like?

Back in the days before releasing frequently was a thing, testing looked very different. The main difference was that we thought of automation as something that was replacing attended testing, whereas we now see it more as a way of introducing scale of unattended testing we could never have done attended.

The stuff we attend to changed as the world around us shifted. We still do "testing" with just as many hours, but the work is split on working heavily on unit tests, test automation and other quality practices.

I try to explain the change I see with a metaphor: a pool is not a bigger bathtub. Both these are containers of water (just like software development is just making and delivering code changes), but what you can do in a pool is very different than what you can do in a bathtub. Imagine a bath guard - kind of hilarious. Or a bath party - very eccentric. We create new things we can do with just a bigger container of water, and bringing down the release cycle does something very similar to the ways we work. The whole conceptual model of what makes sense changes.

In efforts to describe what seems to make sense right now, I work on finding words to describe what testing I do looks like to me. It is not a tool for sense making the whole world of all testing, and I don't need a tool for that - I don't work in the whole world, I don't consult in the whole world. I merely describe my lessons from where I am for the organization that I work for but also in my understanding of how the world comes together.

For me, it all is customer-obsessed and developer-centric. We all care for where the money comes from for our business. And we recognize and appreciate that we have something out there that people are already paying for that we want to always improve and never make worse. But we already have something valuable.

Smart Developers Turn Ideas Into Code

Since it is already out there, I can't describe a phase of requirements gathering, but rather it begins with a vision of value. There is always, even if informally, a mostly shared base of understanding of the types of things we are providing for the customers. I recognize vision not from a document someone wrote, but from discussions with multiple members of community bringing perspectives driving us to a similar direction. Similar, not same, as vision is in its most powerful state when it guides individual actions without excessive coordination.

From having something out there working for the customers and vision, we come to a set of ideas on what we could try to make things better. This part of the process of coming up and acting on the ideas, turning them into code, is where the change gets made.

Some ideas are bigger, and they are hard to grasp alone. Smart people are not alone, but surrounded by other smart people. Together, we can make ideas better, and as such, the resulting code better. Or we can discover the ideas in the first place.

Let me share a small example of how I can reasonably expect my days to be from today.

I had a Windows machine I had kicked up from an image last week, forgetting that the images in that particular set give me a fairly old version of Windows, one where our .NET dependency for showing a full UI stops me from seeing the UI that I wanted to test. I ran a Windows update tool on Friday and, since that takes its time, did something different, going back to the computer today. As I remoted to the computer, I saw the IP but did not remember the name of the computer. There is one very simple way for me to find it: logging into our security management portal and seeing it there. So I did.

I found the computer, got the things I needed but also noticed we had started showing both IPv4 and IPv6 addresses. Having been elsewhere, I had not followed that detail, but just looking at it I said that I didn't like a detail on how they were ordered. 5 minutes after I casually mentioned this, I had a pull request to review that fixed the problem. We added a bit of discussion around what I would test around more network interfaces that was hard to simulate, and another pull request on an additional unit test was created.
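
To give a feel for how small such a fix and its unit test can be, here is a Python sketch. Which order is the right one is a product decision; listing IPv4 before IPv6 is purely my assumption for the example, and the function name is made up. Our real code is not in Python.

```python
import ipaddress


def order_addresses(addresses):
    """Order address strings: IPv4 before IPv6 (assumed), numerically within each."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    parsed.sort(key=lambda ip: (ip.version, int(ip)))
    return [str(ip) for ip in parsed]


def test_ipv4_is_listed_before_ipv6():
    # A machine with several network interfaces, given in mixed order.
    mixed = ["fe80::1", "192.168.1.5", "2001:db8::2", "10.0.0.12"]
    ordered = order_addresses(mixed)
    assert ordered[:2] == ["10.0.0.12", "192.168.1.5"]
    assert all(":" in address for address in ordered[2:])
```

The test takes plain strings as input, so a machine with many network interfaces does not need to be simulated to check the ordering itself.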

My tester-contribution can happen anywhere, anytime, without constraint of a process. It will happen from a foundation of me (the tool) being in the right mind in the right place to connect things. And I cultivate the chances of that lucky accident through discussions but also, hands-on with my external imagination - the product - either directly or through a set of scripts known as test automation (TA).

On doing my work, I rely on an existing structure that we have built and are enhancing as we learn what we are missing:

A Smart Developer creates code to change the application
Unit tests on local machine to 80-90% coverage
Pull Request Review - a minimum of second pair of eyes on change
Static tools of various flavors in the pipeline
Unit tests in the pipeline
TA in the pipeline (with TA telemetry)
CI environment application telemetry
Change-log driven exploratory testing
Changes out in other product line continuous beta
Changes out in internal pilot
Changes out in early access
Synthetic monitoring aka. production TA (or machines, in production, continuously monitored)
Production Telemetry, positive events and error events

The line "TA in the pipeline" is more than a few scripts run, and I will dedicate a post of its own to it later.

With every change, we try to leave things better than they were before. Caring developers pulling in the help they need, and making the help readily available is what testing looks like to me. Testers are developers. We change things, but we also change the other developers' perceptions.

We see things other people don't.

Saturday, November 9, 2019

A Career Retrospective

With a few decades in the software industry, I have something I would call a career. I would not have seen how my career unfolds in advance, and I could never have described what I do as a path I want to take.

A few heuristics have served me well:

Do something you enjoy (and some of it should bring money in)
Always be learning.
Have a goal, just to recognize whether what you enjoy takes you there. If not, change the goal.

For some years, I had a goal of becoming an international keynote speaker and I did a lot of my choices around that goal. I chose jobs that built me a platform of experiences to speak on, doing things hands on with product development rather than consulting. And I became a keynote speaker that wants to quit speaking and replace her contribution of 20-30 talks a year with 20-30 new speakers who start from a platform of mentoring.

My current goal goes under the working name "Maaret to Wikipedia" and if you have ideas on how to do that, I am open to suggestions. Currently it feels funny, self-indulgent and next to impossible to see the route, and it is already making an impact on how I prioritize - within things I enjoy - the things I end up doing. The best thing about this goal is how supported I feel for it at work with my close colleagues and how it helps me see some people who always lift me up when I need it (Marit van Dijk, I'm thinking of you).

A lot of the work I do is invisible.

I help people who are speakers to get their messages out better. Sometimes giving people the transformative insight into what they are speaking on takes me literally a few minutes, and the people seeing their own experience in a different light completely miss the shine I added. They would not be better off without running into me - usually very intentionally on my part.

At work I facilitate a developer-centric way of working in a way that mixes holding space for good things, injecting good things and leading by doing some of the things. The work I do leaves behind very few pull requests, but many things others do shine a little better because of the stuff I do.

I change jobs fairly often, leaving behind people who, I would hope, know something more because our paths have crossed. The companies' interest is not lifting their people up, rather to abstract us under a brand.

I write blogs, articles and all kinds of texts. It changes what some people do. Yet when they do it, doing it is their own success.

I build talks of my experiences, I show up to share, and I discuss with people. Meeting people gives me a lot of energy, new ideas and drive.

I'm not exactly a one trick pony within software - but software all in all is my trick. My interests are manyfold. I speak and write about exploratory testing, test automation, teaching programming, mob & pair programming, agile, management, self-development, conference organizing, speaking, diversity, and any observations around software that I feel like. I'm usually known for things I do on the side, rather than the things I focus on.

What I'm particularly proud of is my ability to re-invent myself and see my belief systems shattered - with my own initiative. Listing things that I believed to be true that aren't so is one of my favorite pastimes.

A Vague Timeline

In recent reflections, I have come to appreciate how large chunks of my work during my career have been left to oblivion as per how things are and personal choices of not sticking with them. They've all given me a platform to observe things from. They also bring out feelings of wishing someone would have taught my younger self some of the things I now know. But I also recognize that my younger self did what she could with the conditions she was under, and every experience I have had has made me the person I am today. Hindsight bias makes us feel like we could have known things, and if there is one thing exploratory testing really enforces in learning, it is that the reality of missing things is the reality and outcomes are unpredictable.

“Looking back, journeys are never clear. so why do we still expect them to be when we start a new one?” — @a_bangser
— Maaret Pyhäjärvi (@maaretp) November 9, 2019

Today:

Describing test automation at work as a baseline for returning to research - on applying AI in testing, and applying testing in AI-based systems
Building a self-organized developer-centric team with modern agile practices that have enough structure for the powerless
Writing further my books on Mob Programming, Strong-Style Pair Programming and Exploratory Testing
Organizing a conference as experimentation platform to change the world of conferences
Helping aspiring speakers by finding them mentors with SpeakEasy (or mentoring them myself)

Before, each step going sort of backwards in time in a way that makes sense to me:

Becoming an expert in exploratory testing. I've done this all my career, and it is the one thing that has been my continued focus.
Becoming an expert in engineering management. I did not realize I had been learning this in my test manager role before. A few decades of reading every book on the topic to manage up effectively as a tester did help.
Becoming an expert in test automation. Moving it from none to some, and from some to better. Knowing well what better looks like.
Speaking in conferences, meetups and delivering training sessions that total 399 sessions.
Discussing (and improving) conference proposals in 15 minute time-slots over three years with about 500 people and discovering a process I call "Call for Collaboration".
Popularizing "Testers don't break your code, they break your illusions about the code" by speaking about it, elaborating it with samples from my professional life, beyond testing conferences. The guy who said it did not do the work I did around it. Google for evidence and stop assuming my work belongs to him.
Introducing frequent product releases where it was "impossible" as release updates computers in the millions.
Introducing daily product releases where it was "impossible" as there was no test automation.
Organizing 5 years of European Testing Conference to learn how (if) conferences should pay the speakers, to create a true networking conference and to bring together developers and testers on a shared testing agenda.
Becoming an expert in pair and mob testing (and programming).
Teaching programming (in Java) to women over 30 and kids with the Intentional method using pair and mob programming as core instruments in teaching.
Teaching Software Testing at Aalto University of Applied Sciences / Helsinki University of Technology both as main lecturer but also as visiting industry speaker
Doing my first keynote to only be known as the woman the other keynoter spent their keynote bashing "out of respect and surprise how alike we think".
Building and teaching a 22-day on-site Testing training program to enable unemployed career changers into the industry. Delivering a second iteration as independent trainer.
Running Finnish Association for Software Testing for a decade and letting it wither away as a man was rewarded and thanked for starting the thing. Starting Software Testing Finland (Ohjelmistotestaus ry) to start over, only to realize that there was no correcting as any communities around the topic in Finland are intertwined in people's minds.
Becoming an expert in complex test environments. If you ever feel like talking about the kinds of environments that cost a million and take minimum of 6 months to deliver, then we have similar experiences.
Becoming an expert in defect management and bug advocacy. Analyzing a large set of defect management tools in order to select one against requirements gathered in a fairly large organization.
Becoming an expert in acceptance testing. I know how to give domain experts clueless on testing just enough structure to excel and not waste effort, and how to impact the quality at the start of acceptance testing through contracts and collaboration. I spent some years intensively learning it.
Becoming an expert in test management. Running multi-million projects as test manager, but also running smaller ones. I did this for different companies to get the crux of it.
Becoming an expert in software contract quality and testing -related aspects. If you ever want to spend a few hours on discussing how badly contractors can behave and how you recognize loopholes in contracts around this, I'm your person.
Becoming an expert in software processes leading up to agile. When Alistair Cockburn asked who had read his work on Crystal, there were not many others in the room who had. Research gives you chances to read and think deeply about what others are saying.
Becoming an expert in benchmarking with the TPI-model. Analyzing 25 Finnish companies with TPI-model and doing a benchmark on state of testing in Finland. I can still speak on the details because I did the work even if the company kept me in the background.
Doing my first talk on the topic of Extreme Programming in 2001.
Researching (and publishing) on software product development, and (exploratory) testing
Becoming an expert in localization testing. I spent years running localization testing projects and doing it myself and learning everything I could read then and since on how localization testing works.

Even if I have my "Maaret to wikipedia" project, it serves more as a way of thinking through what there is that I could even do. In the end of the day, I go back to my heuristics: do what you enjoy, and always be learning. Goals move, but appreciation of learning with great people remains.

Rethinking Test Automation - From Radiators to Telemetry

Introducing Product Telemetry

A week after we started our "No Product Owner" experiment a few years back, the developers, now each playing their part as product owner, decided they were no longer comfortable making product decisions on hunches. In the now-common no-hassle way, they made a few pull requests to change how things were, and our product started sending telemetry data on its use.

As is so often the case, things in the background were a little more complex. There was another product doing the pioneer work on what kind of events to send and on sending them, so we could ride on their lessons learned and, to a large extent, their implementation. The thing I have learned to appreciate most in hindsight is the pioneer work they did on creating us an approach that treats privacy and consent as key design principles. I've come to appreciate it only through other players asking us how we do it.

The data-driven ways took hold of us, and transformed the ways we built some of the features. It showed us that what our support folks know and what our real customers know can be very far apart, and we as a devops team could change the customer reality without support in the middle.

The concept of Telemetry was a central one. It is a feature of the product that enables us to extend other features so that they send us event information about their use.

At first, product telemetry was telling us about positive events. Someone wanted to use our new feature, yay! From the positive, we could also deduce the negative: we created this awesome feature and this week only a handful of people used it, what are we not getting here? We learned that based on those events, we did not need to ask all questions beforehand, but we could go back exploring the data to learn patterns that confirmed or rejected our ideas.

We soon came to the conclusion that events about error scenarios would also tell us a lot, and experimented with building abilities to fix things so that the users wouldn't have to do the work of complaining.
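
To make the idea of event telemetry concrete, here is a minimal sketch in Python. It is not our product's implementation: the TelemetryClient class, the event names and the properties are all made up for illustration, and a list stands in for whatever actually ships events to a backend. It only shows the shape of the idea: features report both use and errors, and nothing is sent without consent.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TelemetryClient:
    """Hypothetical client: records events only when the user has consented."""
    consent_given: bool
    sent: list = field(default_factory=list)  # stands in for a network call

    def send_event(self, name, **properties):
        # Privacy and consent first: without consent, nothing leaves the product.
        if not self.consent_given:
            return False
        self.sent.append({
            "event": name,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "properties": properties,
        })
        return True


def run_feature(telemetry):
    """A feature extended to report both its use and its errors."""
    telemetry.send_event("feature_used", feature="new_feature")
    try:
        raise TimeoutError("backend did not answer")  # simulated failure
    except TimeoutError as error:
        telemetry.send_event("feature_error", feature="new_feature",
                             error=type(error).__name__)


consenting = TelemetryClient(consent_given=True)
run_feature(consenting)
print([event["event"] for event in consenting.sent])

opted_out = TelemetryClient(consent_given=False)
run_feature(opted_out)
print(opted_out.sent)
```

The consent check sits inside the client rather than in each feature, so a feature author cannot forget it.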

This was all new to us and as such cool, but it is not like we invented this. We just did what the giants did before us, adapting it to fit the ideas of how we work with our customers.

We Could Do This in CI!

As telemetry was a product feature, we tested it as a feature, but did not at first realize that it could have other dimensions. It took us a while to realize that if we collected the same product telemetry from our CI (testing) environment as we did in production, it would not tell us about our customers but it would tell us about our testing.

As we did that, we learned that at the scale we test (with automation in particular), fascinating coverage patterns emerge. There were events that would never be triggered. There was a profile of events that was very different to that of production. A whole new layer of coverage discussions was available.
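
The two observations above, events never triggered and a profile different from production, can be sketched roughly like this. The event names and all the counts are invented for illustration, not our real data. The only assumption is that both environments report counts per event name.

```python
from collections import Counter

# Hypothetical event names; real ones would come from the product's telemetry.
KNOWN_EVENTS = {"install_started", "install_finished", "scan_started",
                "scan_finished", "settings_changed", "uninstall"}

# Invented counts, only to show the shape of the comparison.
production = Counter({"install_started": 120, "install_finished": 118,
                      "scan_started": 900, "scan_finished": 890,
                      "settings_changed": 300, "uninstall": 15})
ci = Counter({"install_started": 5000, "install_finished": 5000,
              "scan_started": 200, "scan_finished": 200})


def never_triggered(known, observed):
    """Events the product can send that this environment never produced."""
    return sorted(known - set(observed))


def profile(counts):
    """Share of each event of all events seen, so environments of different size compare."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


print("Never triggered in CI:", never_triggered(KNOWN_EVENTS, ci))

ci_profile, production_profile = profile(ci), profile(production)
for name in sorted(KNOWN_EVENTS):
    print(f"{name:18} CI {ci_profile.get(name, 0):6.1%}"
          f"  production {production_profile.get(name, 0):6.1%}")
```

Comparing shares rather than raw counts matters here, since CI and production differ wildly in volume.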

This was a different use of the same product feature: in test rather than in production.

The Test Automation Frustration

To test the product we are creating, we have loads of unit tests to do a lot of heavy lifting on giving feedback on mistakes we may make when changing things. As useful as unit tests are, we still need the other kinds of testing, and we bundle this all together in a system we lovingly call TA. As you may imagine, TA is shorthand for Test Automation, but the way I hear it, I rarely hear the long word at work but TA is all around.

"We need to change TA for this."
"We need to add this to TA."
"TA is not blue. Let's look at it."

TA for us is a fairly complex system, and I'm not trying to explain it all today. Just to give some keywords of it: Python3, Nosetest, DVMPS/KVM, Jenkins, and Radiators.

Radiator is something you can expect to see in every team room. The ones we're using were built by some consultants back in the days when this whole thing was new, and I have only recently seen modernized versions someone else built in some of the teams. It's a visual into all of the TA jobs we have and a core part of TA as such.

The Radiator builds on a core principle of how we would want to do things. We would want it to be blue. As you see from the image of its state yesterday as I was leaving office, it isn't.

When a box in that view is not blue, you know a Jenkins job is failing. You can click on the job, and check the results. Effectively you read a log telling you what failed.

A lot of times what failed is that some part the TA relies on in its infrastructure was overloaded. "Please work on the infrastructure, or try again later."

A lot of times what failed is that while we test our functionalities, they rely on others. They may be unavailable or broken. Effectively we do acceptance testing of other folks' changes in the system context.

Some people love this. I love it with huge reservations, meaning I complain about it. A lot. It frustrates me.

It turns me into someone who either ignores a red or risks overlapping work. It requires a secretary that communicates for it. It begs people to ignore it unless reminded. It casts a wide net with poor granularity. It creates silent maintenance work where someone is continuously turning it back blue, which hides the problems and does not enable us to fix the system that creates the pain.

I admire the few people we have that open a box and routinely figure out what the problem was. I just wish it already said the problem.

And as I get to complaining about the few people, I get to complain about the logs. They are not visitor friendly. I don't even want to get started on how hard it is for people to tell me what tests we have for X (I ask to share my pain) or for me to read that code (which I do). And logs reflect the code.

From Radiator to Telemetry

A month ago, I was facilitating a session to figure out how to improve what we have now in TA. My list of gripes is long, but I do recognize that what we do is great, lovely, wonderful and all that. It just can be better.

The TA we have:

spawns 14 000 Windows virtual machines a day (older number, I am in process of checking a newer one)
serves three teams, where my team is just one
tests 550 unique tests for my team for a number of Windows flavors on pull request
tests all the 15 products we are delivering from my team
runs 100 000 - 150 000 tests a day for my team
finds crashes and automatically analyzes them
finds regression bugs in important flows
enables us to not die of boredom repeating the same tests over and over again
allows us to add new OS support and new products very efficiently

The meeting concluded it was time for us to introduce telemetry to TA - and some of the numbers above on the unique tests and number of runs daily are our first results of that telemetry in action.

Just as with the product, we changed the TA product to include a feature that allows us to send event telemetry.

We see things like passes and fails now in the context of the large numbers, instead of the latest results within a box on the radiator.

We see results from multiple radiator boxes combined into the reason we previously needed to verify from the logs.

We see which tests take long. We see which tests pass and which fail.
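
As a rough sketch of what such TA telemetry makes possible, assume each test run sends one event with a test name, the job (radiator box) it ran in, an outcome, a duration and a failure reason. Those field names and the sample events below are assumptions for illustration, not our actual schema or results.

```python
from collections import Counter, defaultdict

# Invented sample events; the field names are assumptions.
events = [
    {"test": "test_install", "job": "win10", "outcome": "pass", "seconds": 40.0, "reason": None},
    {"test": "test_install", "job": "win7", "outcome": "fail", "seconds": 300.0, "reason": "vm_start_timeout"},
    {"test": "test_scan", "job": "win10", "outcome": "fail", "seconds": 12.0, "reason": "backend_unavailable"},
    {"test": "test_scan", "job": "win7", "outcome": "fail", "seconds": 11.0, "reason": "backend_unavailable"},
    {"test": "test_update", "job": "win10", "outcome": "pass", "seconds": 95.0, "reason": None},
]


def pass_rate(events):
    """Passes in the context of all runs, not just the latest result per box."""
    outcomes = Counter(event["outcome"] for event in events)
    return outcomes["pass"] / len(events)


def failure_reasons(events):
    """Combine failures across jobs (radiator boxes) by their reason."""
    return Counter(event["reason"] for event in events if event["outcome"] == "fail")


def slowest_tests(events, top=2):
    """Tests ordered by average duration, slowest first."""
    durations = defaultdict(list)
    for event in events:
        durations[event["test"]].append(event["seconds"])
    averages = {test: sum(secs) / len(secs) for test, secs in durations.items()}
    return sorted(averages, key=averages.get, reverse=True)[:top]


print(f"pass rate: {pass_rate(events):.0%}")
print("failure reasons:", failure_reasons(events).most_common())
print("slowest tests:", slowest_tests(events))
```

Two red boxes that share one reason show up as one line in the summary instead of two logs to open.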

And we have only gotten started.

The historic date of the feature going live was this week Thursday. I'm immensely proud of my colleague Tatu Aalto for driving through code changes to make it possible, and the tweets where he is correcting me on my optimism warning he had a few bugs he already fixed. I'm delighted that my colleague Oleg Fedorov got us to see a solution through seeing things. And I can't wait to see what we make out of it.

Monday, November 4, 2019

A meeting culture transformation

As I was looking into mob programming some years back, we summarized a common theme of complaints into a little cartoon with people discussing in a meeting room.

Person 1:
My team is interested in trying Mob Programming.
The idea is everyone works together on one computer.
The person at the keyboard is just typing what the whole team tells them to. So everyone is involved, instead of 5 people watching 1 person work.
You rotate quickly, every 5 minutes, to develop cross-functional teams and eliminate knowledge silos.
Ideas get implemented the best way the team can no matter who has them.
Misunderstandings and bugs are minimized.

Person 2:
Sounds like I'd be paying 5 people to do 1 job.
Now let's stop talking such nonsense. I still have a lot of slides to go through.

The word around is that managers hate mob programming. As a manager who wants my team to do mob programming but they refuse, I think we love blaming managers for our own assumptions we did not keep in check.

Up until this morning when I came to office, I was discussing how Mob Programming is different than a meeting. What changed this morning is that a colleague read my latest Mob Programming Guidebook and pointed out that while we don't really do full-on mob programming, we have managed to transform our meetings into little mob sessions.

It's funny how you need someone else's eyes to see how you're different.

For the last three years here, I have not gone to a single meeting with slides prepared.
I don't go unprepared. But I never ever write an agenda in advance.
When I start a meeting, we build an agenda. It might be that we actively take time to build it. Or it might be that we build it by parking themes that pop up that are relevant but not about the thing we are trying to sort out right now.
We work the agenda within a timebox either by doing the most important work first, or by doing just enough of it that the rest can happen offline, outside the meeting without others losing context completely.

As my colleague points out: all our meetings are little mob sessions. How about yours?

Sunday, November 3, 2019

Mobbing with an Audience

I've run some hundreds of mob programming and testing sessions with new groups for purposes of conference talks and trainings, and while I prefer setting up a full day session so that I can mob with the whole group of 25 people, sometimes I end up splitting the group for demo purposes. I was writing about this for the new version of Mob Programming Guidebook, and thought it might make useful content just as a blog post.

Mob programming with an audience is a special setup that is a useful tool, especially for someone teaching mob programming, teaching any software development skill in a hands-on style that makes new kinds of sessions available for conferences, or generally running demo sessions with partial participant involvement. As conference speakers and trainers, a lot of our mob programming experience comes from facilitating mob programming sessions with various groups. For a training, we usually set up the whole group into a mob where everyone rotates. For conference sessions where time constraints limit participant numbers for effective mobbing, we use mobbing with an audience.

THE SETUP

For mobbing with an audience, you split the room into two groups:

The Mob. The most effective mob made of complete strangers is small. You want to have a diverse set of mob programmers. These are the people doing the work.
The Audience. The rest of the group sit in rows as audience. The role of the audience is to watch and make observations, and their participation is welcome when doing a retrospective.

For the mob, you will set up a basic mob setup in the front of the room with chairs for each person, whiteboard furthest away from the computer to ensure speaking volume for the designated navigator through the physical setup.

For this setup, you will need a room with chairs that are freely moving. Make sure text on the screen is big enough not only for the mob to see, but the audience to follow as well.

TIPS FOR THE FACILITATOR

As we have run some hundreds of sessions with various groups in this format, we have had things go wrong in many ways.

Things you can do in advance to ensure fewer problems

If the room is big, ask for a microphone for both the driver and designated navigator. It is essential that people in the room can hear their dialog. While there are no decisions allowed on the driver seat, speaking back to the navigators pointing out things you see and they don’t is often necessary.
If you have only one microphone, give that to the designated navigator. Even in smaller rooms, the microphone can work as a talking stick the designated navigator passes around for other navigators and can help create an atmosphere where everyone in the mob gets to contribute.
Make sure the text on the screen is visible from the back row. Avoid dark theme, it does not serve you well for live coding and testing in front of an audience.
When selecting the diverse mob, what you need to do depends on who you are. If you are a white man facilitator and want women, start with inviting women or facilitate mob member selection in a way that gives you a diverse set of mob programmers. As a white woman, women volunteer for me in ways they don’t for the men, and I need to work on other aspects of diversity.
For a demo mob, you may want to demo a group with experience working on the problem and even together. If that is your aim, invite the people you want for the mob in advance.
A new mob with different experiences highlights many powerful lessons around collaboration and people helping each other, and a fluent demo is probably rarely your goal. New programmers exclaiming “they now know how to do TDD” as equal contributors is a powerful teaching tool.

Things you can do while mobbing to improve the experience

Encourage people in the audience who want to be navigating from the audience to join the mob. To be more exact, demand that they either join or hold their perspective, as navigating from the audience can be very disruptive.
If you want to introduce who is in the mob, you can do that on the first round of rotation. If you want a deeper introduction, you can have a different question for people to tell about themselves on each round of rotation.
When people rotate, ask them to tell what they continue on. It helps to enforce the “yes, and” rule and is sometimes necessary when nervous participants have been building their private plan waiting for the hot seat.
When the group is stuck, ask questions. “Does it compile?”, “What should you do next?”, “Did you run the tests?”, “What are your options now?”. Your goal is not to do things for them but get them to see what they could be doing.
When the group is stuck in not knowing how to do a thing, say “Let me step in to navigate” and model how to do the thing for a short timeframe. Expect the group to do that themselves the next time.

Things you can do in retrospective to salvage a messy session

Facilitate the retrospective towards discussing the reasons for the lack of progress that we could learn from.
Introduce theories or ideas of how you could try doing things differently the next time.
Find your own style of facilitating groups of strangers. Having seen multiple people facilitate, there are style differences where one person’s approach would feel off on another. Strong-handed “supporting progress” and light-handed “enabling discovery” will result in sessions that are different.

Saturday, November 2, 2019

Never Stop Learning

I have full-time work that I enjoy, and I very carefully review my own satisfaction with the impacts going on at work. I require of myself a balance of being productive and generative. Not one over the other, but a balance of the two.

I'm being productive when I:

strategize testing and communicate strategies so that we are better aware of problems I will be looking for
test (possibly documenting as test automation) to add to coverage of what might work but particularly identify things that did not
have the gazillion discussions leading over time to a process improvement or someone else's raise
fix problems, be it in the program or in the way people interact

I'm being generative when I:

teach others how to do better testing when I am not around to do it
lead people into insights that make them do things in a way that is more productive
bring in ideas that inspire me and through me, us overall

The way I control my work weeks is that I try to be mindful doing things that are directly for my employer the 40 hours a week, and then have 'hobbies' that resemble work but are fully my choice, my control - even though these activities benefit my employer too.

Realistically, I cannot split work and fun. Work is fun. So I manage my own expectations of what I do, and try being mindful of the work-life balance when the lines are blurred by my own choices.

Doing stuff that resembles work and could be work 140% is a better framing. On top of that there's family, friends and stuff that does not resemble work. Writing a blog post on a Saturday resembles work.

I do this because my interests are divided. While I love the impact we are building for at work, which I have defined as my purpose (while there, for now), I also love making a dent in the world outside: helping new speakers get started, building my own talks, writing articles beyond what can fit in my work day frame.

In theory, I could be giving more for the purpose at work. The 100% time I give them could arguably be more awake, more focused if I wasn't doing all the other things. But thinking this way would be shortsighted because the 40% time gives me learnings that change who I am and what I can do, both in providing motivation and actual skills.

Having discussed this with a colleague with similar yet different profile, I'm taking a learning from it:

It's not the hours and their efficiency today, it's the continuous growth on our ability to deliver.

It's the math of never stopping learning.

Wednesday, October 30, 2019

Assert and Approvals, and Why that Matters

As an exploratory tester, core to my existence are unknown unknowns. I stumble upon problems - to an extent people like Marit van Dijk call out that "I don't find bugs, the bugs find me". But stumbling upon them is no accident. I intentionally find my way to places where bugs could be. And when there is a bug, I try to finetune my ability to recognize it - a concept we testers call oracles.

As I'm automating, I codify oracles. In some ways of automating tests, the oracles are multipurpose (like property-based testing, describing and codifying rules that should hold true over generated samples), and sometimes they are very specific.

Both these multipurpose partial oracles and single purpose specific partial oracles are usually things we build as asserts. In the do-verify layers of creating a test automation script, asserts belong in the verify part. It's how we tell the computer what to verify - blocklisting behaviors that cannot be different. Much of our automation is founded on the idea of it alerting us when a rule we create does not hold true. Some rules are fit to run unattended (which is why we focus on granularity) while others are for attended testing like exploratory unit testing.
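
A small sketch of the two kinds of asserts, using only the Python standard library. The function under test, normalize_whitespace, is a made-up example, and the random sample generation is a hand-rolled stand-in for what a property-based testing library such as Hypothesis would normally do (including shrinking failing samples, which this sketch does not attempt).

```python
import random


def normalize_whitespace(text):
    """Hypothetical function under test: collapse whitespace runs into single spaces."""
    return " ".join(text.split())


def check_properties(sample):
    """Multipurpose partial oracles: rules that should hold for any input."""
    result = normalize_whitespace(sample)
    assert "  " not in result                      # no double spaces remain
    assert result == result.strip()                # no leading or trailing space
    assert normalize_whitespace(result) == result  # doing it twice changes nothing
    assert result.split() == sample.split()        # the words themselves are preserved


def test_specific_example():
    """Single purpose specific oracle: one input, one expected output."""
    assert normalize_whitespace(" a \t b\n") == "a b"


def test_generated_samples(runs=200, seed=2019):
    """Run the rules over generated samples; the fixed seed keeps failures repeatable."""
    rng = random.Random(seed)
    alphabet = "ab \t\n"
    for _ in range(runs):
        length = rng.randint(0, 20)
        sample = "".join(rng.choice(alphabet) for _ in range(length))
        check_properties(sample)
    return runs


test_specific_example()
print("samples checked:", test_generated_samples())
```

The specific assert tells me about one case I thought of. The rules tell me about cases I did not think of, as long as I could think of the rule.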

Another approach to the same codifying oracles problem comes through approval testing. What if we approached the problem with the idea that a tester (doing whatever magic they do in their heads) would recognize right-enough when they see it, and approve it? That is where approvals come in. It is still in the verify-layer of creating a test automation script but the process and lifecycle are essentially different. It alerts us when things change, giving names to rules through naming tests without a pre-assumed rule of comparing to a golden master.

Approvals in automation increase the chance of serendipity, a lucky accident of recognizing unknown unknowns when programming, and they speak to the core of my exploratory tester being as such.

The Difference in the Process and Lifecycle

When we create the tests in the first place, creating an assert and creating an approval are essentially different:

An assert is added to codify the pieces we want to verify and thus we carefully design what will tell us that this worked or didn't. Coming up with that design is part of creating the assert and running the assert (see it fail for simulated errors) is part of creating it.
An approval is prepared by creating a way to turn an object or aspect of an object into file representation, usually text. The text file would be named with the name of the test, and thus the textual representation of what we are creating becomes the focus of our design. We look at the textual representation and say "this looks good, I approve", saving it for future comparison.
Assert you write and run to see green. Approval you write and run to see red, then you approve to see green.
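
The lifecycle difference in the list above can be sketched with a hand-rolled approval check. This is not the ApprovalTests library, only a minimal imitation of the idea using the standard library. The verify, approve and describe functions and the order example are made up for illustration, and the approve step stands in for a human looking at the received file and accepting it.

```python
import tempfile
from pathlib import Path


class ApprovalMismatch(AssertionError):
    """Raised when the received text differs from the approved text."""


def verify(received_text, test_name, directory):
    """Compare text against the file approved earlier for this test name."""
    approved = Path(directory) / f"{test_name}.approved.txt"
    received = Path(directory) / f"{test_name}.received.txt"
    if approved.exists() and approved.read_text() == received_text:
        received.unlink(missing_ok=True)  # nothing left to review
        return
    received.write_text(received_text)    # keep it around for a human to look at
    raise ApprovalMismatch(f"{test_name}: review {received.name} and approve it")


def approve(test_name, directory):
    """Stand-in for the human step: 'this looks good, I approve'."""
    received = Path(directory) / f"{test_name}.received.txt"
    received.replace(Path(directory) / f"{test_name}.approved.txt")


def describe(order):
    """Turn the object into a stable text representation."""
    return "\n".join(f"{key}: {order[key]}" for key in sorted(order))


order = {"item": "book", "quantity": 2, "total": 39.8}

# Assert: I choose up front the one thing that tells me it worked.
assert order["quantity"] == 2

# Approval: the first run is red, I approve what I see, the second run is green.
outcomes = []
with tempfile.TemporaryDirectory() as folder:
    for _ in range(2):
        try:
            verify(describe(order), "test_order_summary", folder)
            outcomes.append("green")
        except ApprovalMismatch:
            outcomes.append("red")
            approve("test_order_summary", folder)
print(outcomes)
```

When the representation later changes on purpose, the assert would need rewriting while the approval only needs the approve step again.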

When we run the tests and they pass, you see no difference: you see green.

When we run the tests and they fail for a bug we introduced, there is again an essential difference:

An assert tells us exactly what comparison failed in a format we are used to seeing within our IDE. If run in headless mode, the logs tell what the failed assert was.
An approval tells us that it failed and shows the context of failure e.g. opening a diff tool automatically when running within our IDE. Especially on the unit level tests, you would want to run the tests in IDE and fix the cause of failure in IDE, having it all at your fingertips.

When we run the tests and they fail for a change we introduced, we have one more essential difference:

An assert needs to be rewritten to match the new expectation.
An approval needs to be reapproved to match the new expectation.

When looking for things we did not know to look for, we are again different:

An assert alerts us to the specific thing we are codifying
An approval forces us to view a representation of an object, opening us to chances of seeing things we did not know we were seeking.

Back to exploratory and why this distinction matters so much to me

Even as a programmer, I am first and foremost an exploratory tester. My belief system is built around the idea that I will not know the mistakes I will make but I might recognize them when I see them.

I will create automation that I use to explore, even unit tests. Sometimes these tests are throwaway tests that I never want to push into the codebase. Sometimes these tests belong to a category of me fishing for new problems e.g. around reliability and I want them running regularly, failing sometimes. I will keep my eye on the failures and improve the code they test based on it. Sometimes these tests are intended to run unattended and just help everyone get granular feedback when introducing problems accidentally.

With approvals, I see representations of objects (even if I may have to force objects into files by creating appropriate toStrings). I see more than I specifically command it to show. Looking at a REST API response with approvals gives me EVERYTHING from the headers and the message, and then I can EXCLUDE nondeterministic change. Creating an assert makes me choose first and moves exploration to the time I am making my choices.
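
As a sketch of the REST API case: take everything in the response, then exclude only what changes from run to run before turning it into text to approve. The response below is invented, and which keys count as nondeterministic is an assumption that would differ per API.

```python
import json

# Hypothetical response captured from a REST API call; all values are invented.
response = {
    "status": 200,
    "headers": {
        "Content-Type": "application/json",
        "Date": "Wed, 30 Oct 2019 08:15:00 GMT",
        "X-Request-Id": "7f3c2a10-9b1e-4c55-8d2f-0a1b2c3d4e5f",
    },
    "body": {"id": 42, "name": "example", "created": "2019-10-30T08:15:00Z"},
}

# Assumption: these keys change on every call and would make approvals flaky.
NONDETERMINISTIC_KEYS = {"Date", "X-Request-Id", "created"}


def scrub(value):
    """Replace nondeterministic values with a placeholder, keeping the keys visible."""
    if isinstance(value, dict):
        return {key: "<scrubbed>" if key in NONDETERMINISTIC_KEYS else scrub(item)
                for key, item in value.items()}
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def to_approvable_text(resp):
    """Everything in the response as stable text: sorted keys, fixed indentation."""
    return json.dumps(scrub(resp), indent=2, sort_keys=True)


print(to_approvable_text(response))
```

The keys stay visible with a placeholder value, so a header or field that disappears still shows up as a change.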

The difference these create matters to my thinking. It might matter to your thinking too.
