Saturday, August 31, 2024

The Ethical Stress of Consulting Work

For three months now, I have been adjusting to a new identity. With my new job, I am now a consultant, a contractor and a service provider. My work is managing a product made of testing services, and providing some of those services. 

It's not my first time on this side of the table, but it's my first time on this side of the table knowing what I know now. 20 years ago when I was a senior consultant, I was far from senior. I was senior in a particular style of testing, driven to teach that style forward and to learn as much as I could. And while I got to work with thirty-some customers opening up new testing services business back then, I was blessed with externally provided focus and the bliss of ignorance.

I had a few reasons to stop being a consultant back then: 

  • Public speaking. I wanted to speak in public, and as a consultant your secondary agenda of sales was getting in the way. Not really for me, but in the eyes of others. I got tired of explaining that I would not go to my organization for sponsorship just so that I could speak, and that I was uncomfortable building that link when content should earn the stage. I knew being in a customer organization, doing exactly the same work, would change the story. And it did. And that mattered to me. 
  • Power structures. With customers and contractors, there is a distribution of power. When a major contractor in a multi-customer environment kicked the testing representatives of two out of three organizations out of the steering group citing "competitive secrets", and I was the only one allowed in the room to block the play that was about to unfold, I learned a lesson: being in the customer organization lent me power others lacked. Back then I had no countermoves, and I do now, having been in boardrooms both as a testing expert advising board members and as a member of those boards. 

Thus 20 years later, I knew what I was doing when I joined consulting to solve the problem of testing competences in testing services at scale. I knew consultancies are the place where this can be solved at scale. I knew the numbers are not on an individual customer's side, and scale means I need to serve many customers. I knew I needed to level up the testing services to become the testing services I had so much difficulty purchasing when I was on the customer side. 

What I did not know is that the job of consulting would teach me about ethical stress. Because when you serve many yet invoice some, your life is a daily balancing act against your sense of fairness. And you will feel the pull of working for just one customer, so that the overhead of context switching would not fall on any of them. The teaching on ethics this particular organization offers adds to the stress. Working by the hour is an extra mental load. 

It's not that I can't deal with it. It's just that there is so much more of it than before that it sticks out, and I need to label it: 

Ethical stress is the continuous sense of having to balance the different perspectives. 

If it takes me an hour to report hours, who should pay for that hour? 
If I get interrupted with another customer while working for a different one, who pays for the reorientation time?
If I have to create a plan and actually have to follow that plan even though I know better, do I look worse even though I am better? 

Having recognized this, I now use it to discuss no estimates in agile teams. Because ethical stress is the big thing that estimating and tracking detailed hours brings to people in a team. It is an invisible motivation killer, pushing us to do tasks in ways that make them trackable rather than flow the best way we know how. 

Ethical stress costs us energy. And sometimes we are so focused on teaching the ethics part of things that we forget the stress part of the same. 


Thursday, August 22, 2024

Prepare to shift down

While the world of testing conferences is discussing shift left - a really important movement in how we make our efforts count instead of creating failure demand - we are noticing another shift: shift down. 

For years, we have been discussing the idea that sustainable success in the programmatic testing space requires you to decompose testing differently. Great automation for testing purposes is rarely built by automating what you considered the end-to-end flows a human experiences. Great automation optimizes for speed and accuracy of feedback, and for granularity in both time - immediately available to the developer who made an impacting change - and location. Not having to speculate on the source of a problem saves a whole lot of effort. 

As I talked about these two shifts today, as part of a talk on applying the AI of today in testing - framing the two shifts as foundational, in that I believe we should first shift left and down, with a personal suspicion that we might like what AI does for us after the shifts - I also realized that positioning organizations against these two shifts helps make sense of the kinds of tools and workflow ideas they are automating from. 


What is this shift down that we talk about? Shift left is a commonplace term, and I prefer not having left or right but single-commit delivery making things continuous. One can dream. But shift down, that is not as commonly discussed.


Shift down is the idea that test-driven development is great, yet limited by the intent of the individual developer. A lot of developers are good and getting better daily at expressing and capturing that intent, and having that intent is hugely beneficial in routinely accepting or rejecting generated code from modern tools and choosing to stay in control. From a sample set of projects most certainly not built with TDD, and with an unknown level of unit testing, we have seen a report recounting that 77% of the bugs that escaped to production after all our efforts could, in hindsight, be reproduced with a unit test - meaning there is a lot more potential for doing good testing work at the unit test level. I like to play with the term exploratory unit testing, which is a way of stretching the intent of today with learning in the context of the code, to figure out some of that 77%.
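
To make that concrete, here is a minimal sketch of what exploratory unit testing could look like with pytest. The function parse_quantity and its range rule are invented for illustration; the point is using cheap unit-level experiments to learn what the code actually does beyond the intent the existing tests capture.

    # A hypothetical sketch of exploratory unit testing with pytest.
    # parse_quantity and its specification are invented for illustration.
    import pytest

    def parse_quantity(text: str) -> int:
        """Parse a user-entered quantity; the captured intent covers 1..99."""
        value = int(text.strip())
        if not 1 <= value <= 99:
            raise ValueError(f"quantity out of range: {value}")
        return value

    # The test that captures the original intent:
    def test_parses_plain_number():
        assert parse_quantity("5") == 5

    # Exploratory unit tests: cheap probes at and beyond the boundaries,
    # asking 'what might go wrong here?' rather than restating the intent.
    @pytest.mark.parametrize("probe", ["0", "100", "-1", " 7 ", "07", "1e2", "", "５"])
    def test_probing_beyond_intent(probe):
        try:
            value = parse_quantity(probe)
            # Anything accepted must still honor the range rule we know of.
            assert 1 <= value <= 99, f"accepted out-of-range input {probe!r}"
        except ValueError:
            pass  # rejection is a documented, acceptable outcome

Some of these probes only teach (the full-width digit "５" is quietly accepted, because Python's int() understands Unicode digits), and that learning is the point; the probes worth keeping become part of the growing suite.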

For a few crafters I have had the pleasure to work with, Test-Driven Development and Exploratory Unit Testing could be interchangeable. For others, the latter encourages us to take time to figure out the gap, especially with regard to legacy code where those who came before us left a less than complete set of tests to capture their intent. 

Shifting down pushes conversations to unit tests, component tests, and subsystem tests, and guides us to design for decomposed feedback. We've been on that shift as long as the other one. 

Tuesday, August 20, 2024

How to make testing fun and enjoyable?

I have believed, and experienced, that testing is fun and enjoyable for 27 years. I have had that experience often enough to talk about my primary heuristic from stage:

Never be bored.


This confuses people, especially when their idea of testing is repetition growing over time. 

You keep replenishing the test results. Sort of the same results. Except that while the tests may be the same, you don't have to be. You can vary things, and return to a common baseline when the variation takes you to surprising information. Every change, every changer, is different in the moment. And it's like a puzzle, figuring out how to create a spider web of programmatic tests that tells you enough while not everything, and yet looking at each change with the curiosity of 'what might go wrong here'. 

If I feel bored, I introduce variation. I change the user I log in with. I change the colleague I pair with. I change the order in which I test. I write test automation that does not fit our existing patterns of how we automate. I write detailed public blog posts while I test, unlike normally. I experiment with separating programmatic tests that always run into suites where I run each suite every second day, to save on replenishment resources. Well, the list of variations is kind of endless. 
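
That last experiment is easy to sketch. A hypothetical conftest.py for pytest could alternate suites by day, assuming tests carry the (invented) markers suite_a or suite_b:

    # conftest.py - a hypothetical sketch: run alternating suites on alternating
    # days to save on replenishment resources. Marker names are invented and
    # would need registering in pytest configuration to avoid warnings.
    import datetime
    import pytest

    def pytest_collection_modifyitems(config, items):
        # Even days run suite_a, odd days suite_b; unmarked tests always run.
        todays = "suite_a" if datetime.date.today().toordinal() % 2 == 0 else "suite_b"
        skip = pytest.mark.skip(reason=f"scheduled for another day, not {todays}")
        for item in items:
            suites = {m.name for m in item.iter_markers()} & {"suite_a", "suite_b"}
            if suites and todays not in suites:
                item.add_marker(skip)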

I love testing. And today I have been testing a system under test that is new (to me). 

In order to test the way I love testing, I have to have a solid foundation of programmatic tests that we grow gradually as an output of our testing, capturing the pieces of learning worth keeping around. Today, I want to recognize a few things that I need to keep testing fun and enjoyable. 

  1. Agency. You don't give me a test case to automate. You give me a feature to test, and out of that I will automate some test cases. Thinking that you plan and I execute takes the fun out of my testing. Even the more junior folks do better starting with WHY, not HOW. 
  2. Smart constraints. You don't tell me that programmatic tests need to mimic written step-by-step test cases. That makes me spend my time updating two documentation sets for the same purpose, and doing busywork is not fun. 
  3. Test environment. You don't deny me access to explore an old version while I design and collect ideas for how we should test changes in the new version. External imagination - the product without the change - makes the task more productive, and it's fun to do good work. There need to be enough of these to go around for us all, every day. 

Notice how my fun and enjoyment isn't asking for documentation or answers to all things about the product. Not having those around is sometimes a different kind of fun; even if I prefer us starting with a better agreement, you can be sure I will discover things outside it. It also does not include great people and friendly developers, because today I choose to believe that people are good and want to do good. Discovering exactly how many jokes going around that requires is part of the variation. 

A colleague inspired this post by wishing that we had common templates and a unified front on what test documentation looks like. Figuring out how I could ever do that, when I do decent plans and strategies but not to a template, should be fun. Yet while it's fun and enjoyable, it is less impactful for the good results I would want out of my testing. Plans are more often a way for me to think through the big picture than the most relevant deliverable. 

That's my shortlist (today), what's yours? 

 


Saturday, August 17, 2024

Believing things can be different

27 years of testing, and I still think problems with software are fascinating. Running into a problem does not mean it wasn't tested at all; it means I ran into a flow that wasn't covered, or a problem considered not relevant to fix under whatever constraints the organization has. Sometimes the conditions are fairly general throughout the user base, like with CrowdStrike, and impact all users. Sometimes the conditions are fairly specific, and impact some users. 

There's a story I have told a lot, an illustration of serendipity and the role it plays in testing. Joining a new organization as a tester, on my first day I got access to the system under test and credentials to log in, and had barely logged in before I was asked to join whatever induction program the company had in mind. Pressed for time and not trusting my memory, I bookmarked the address. Not the main page though; I had already clicked onwards a bit, not even paying attention. When I got back to testing, I used the link to find the application and experienced a big visible error screen. Further investigation showed that I had been lucky enough to bookmark the single page inside the application with a different implementation pattern resulting in this error. Quite a way to make a tester's entrance. 

For some years after that, I was thinking that the core of testing seemed to be serendipity and perseverance. I was working with a team where developers tested - and had tested for 15 years before the first tester ever joined the team - so claiming there was no testing done would have misrepresented the efforts. But something about the way I tested was different. I got lucky with running into bugs. A lot. Like more than you can imagine. And this sentiment is something a lot of testers relate to. And I gave the program chances to show how I was lucky, systematically working through flows, time, and odds, fast-forwarding different users a year in production, time and time again. 

I pulled up two quotes to summarize the insight I had gained: 

"The more I practice, the luckier I get." - Arnold Palmer (golfer)

"It's not that I am so lucky, I just stick with the problems longer." - Albert Einstein (scientist)

This kind of luck favors my kind, testers - or at least we frame it as a positive event of luck, serving as a means to remove such luck from those who would not welcome it. But we aren't always working. Work is when you get paid. Most of your time, you are just you, a user amongst all the others. You'd love for the systems you use to have been built and fixed to work, but things are complicated.

A favorite incident, one really showing perseverance with a problem, comes from a friend. She was out at an event one evening with her development team, having a good time. She returned home late, crashing for the night as soon as she could. The next morning she realized things were a little off at her house. Her scented lamp had fallen and stained her sofa. Going into her kitchen, she discovered a handwritten note. In her absence, the police had visited inside her apartment to turn off Alexa, and left a note. A little later in the day, a worried neighbor reached out and told her about music blasting at full volume, which had led to calling the police over, this being so unusual. Turns out Alexa had been throwing an empty-house party. 

We were recounting this incident and all the details of interoperability at a conference when we realized - later confirmed by another colleague working inside AWS - that there was indeed a case where you could have Spotify in a state where you could be far away trying to connect Spotify elsewhere, only to turn up the volume at the previous location. A very specific flow. This interoperability bug resulted in an invoice from the apartment company for opening the doors to the police late in the evening in the owner's absence, and a lot of research on whether you could be made to pay up for a bug somewhere between Amazon and Spotify. Turns out that if you spent enough time fighting, you did not have to pay. 

I did not turn this one into a talk and research quotes for it, but I did draw a major lesson: 

Getting lucky costs you time and money that no one is responsible for. 

It's a lesson that is hard to miss as a tester. You are sometimes painfully aware that outside the paid hours, where you advocate for users in a very particular style, the same characteristic serendipity is used as free labor - sometimes by accident, and sometimes by design of the companies. 

When I serendipitously ran into a Foodora security bug a few years ago that allowed users to get free food, I did the free labor of following through to a fix before I disclosed my knowledge. They gave me a discount code worth a few euros for hours of investigating with them, which, compared to my professional rates, is quite a contrast. 

When I serendipitously ran into another Foodora bug last week, one that now impacts users, I came to realize the immense power difference when the tables are turned. User advocacy takes even more time. I do user advocacy because I believe things can be different. 

Things can be different if people share their experiences, and individual experiences sum up to a phenomenon that speaks to power. 

Things can be different if people care enough to pick up feedback outside the usual channels. 

Things can be different if people believe there may be a step in the flow where while it works on your machine, it indeed does not work on mine. 

Believing things can be different guides my choices of what I end up doing with my time and effort. I ended up researching legislation to learn where to report and how, so that I can borrow power a regular user does not have. Asked in a specific way, they would have to investigate and report back to me in writing within 14 days. They would have to correct the state of my banking within one day. 

I did not imagine that in reporting this I would serendipitously find one more bug: the messaging app fails with error code 500 ("it's our backend") on a specific part of the message I had crafted, even though it seems like regular text with no special features. Reporting the second bug was possible in the messaging app; reporting the first bug required sending over a portion of the text as a screenshot. I did also try talking to a developer who worked at the company and was helpful in explaining how things should work at Foodora, because believing things can be different means I believe people can help when their organization's technical channels fail. 

From this one, I can again draw major lessons: 

Reporting channels don't always work even if you find them, reducing your chances of solving your issues.

Testing gives you the tools and attitude to work around what blocks you: the text vs. the screenshot, the developer inside the organization, the legal references.

A regular user would give up, but I am no regular user, even if I am a volunteer on this task of user advocacy. When I get the written report on this, it still does not conclude the case, even if it hopefully fixes this bug. There's still the work we need to do in the industry to connect regular users with working feedback systems to get things set right, and the work of setting our systems right. I hope the programmer and tester community does better on quality, and I know it is not easy. I know we try, and that we are not done. 

I've been a tester for 27 years because I stick with the problems longer. And I stick with the problems of user advocacy. It's as important as ever, when we can't even find a channel to communicate our troubles back to the multinational corporations. Just look at the troubles content creators have on social media, being kicked out from the foundation their businesses run on, without routes to appeal. 

User advocacy is speaking. It's listening. And the latter is needed even more.  


Thursday, August 15, 2024

Which of us did not understand, again?

A few years back, a friend in testing got on a stage and shared a personal story of bug advocacy. She found and reported a bug, and the ticket kept pinging back and forth as "does not reproduce" up to the point when the two of them were put on a spot no one should ever be in: a boss with the power of letting you go, and an ultimatum for the two stuck in that loop. If the problem could be reproduced in front of the boss, the tester stays. If it could not, the developer stays. The problem was reproduced, and the story illustrates how things can escalate. 

I think about this story whenever I find myself in an argument with a developer about a bug that I think I am experiencing. When we disagree, it almost always means one of us is missing information. I have had this argument professionally often enough to know that sometimes it is me missing information. But a lot of the time it isn't, even when I am told I am. 

Something went wrong

Last Saturday I ordered food with the Foodora delivery app. I have done so before. I was already using the app less frequently, choosing another option over it due to my previous two experiences with it, but I gave it a go. 

I ordered food once. I confirmed the purchase with the bank twice. And I confirmed the purchase with mobile pay three times. This is not how the purchase flow is supposed to work. But this is how the purchase flow has gone for me on my last three uses. It's not a temporary glitch. Having seen this three times, I have a good idea of what it is in my flow of use that reveals the problem. Yet I am unwilling to test in production, with my own money. After all, the result for the user is that today, 5 days later, the money has been invoiced from my account once (correct), but another amount still remains reserved so that it is unavailable for me to use (incorrect). 


I also know my experience in the Foodora app. The user interface flow never completed with a confirmation. Yet I got my food, I paid for my food, and I forcibly loaned money to the bank in their reservations queue. 

I offered the service desk person that I could work with them for free to isolate the problem. They told me they don't have that kind of contact within their company, and that they could only tell me to wait and see whether I get invoiced once or twice, as their system shows only one invoice going out. 

I have experienced reservations of money for a failed transaction before. I just don't experience it every time I use the application, or in a way where the application never finishes the user's flow yet delivers me the food. So I suspect there is a problem here. 

Speaking about problems

I decided to use the experience as a talking point on LinkedIn. What I wanted to say was muddled by having multiple messages. I wanted to say:
  1. I am still happy to help Foodora figure out the bug, because I think I can isolate my actions better than an average person, with 27 years of testing experience. Not asking to be paid.
  2. Paying people for testing, isolating problems being part of it, would be good. 
  3. Users need advocates, and holding double bookings, especially at scale, is unacceptable, even when they get the money back in a week or two. 
  4. When users suffer the consequence, we only care at scale. When the company suffers the consequence, we call it a security vulnerability and care immediately. 
  5. A user's option - my option - is to not use this service. They would not know why my business goes elsewhere. 

Lovely developers jumped in to help. 

The first one wanted to help me with not saying bad things about Foodora in public, enough to go and snitch-tag my employer. I got a lovely Sunday evening of worrying whether a multinational corporation would let me go during my trial period because I shared an experience, until I stopped spiraling and realized I work with smart people who would do no such thing. His point was that, seeing so many messages and not my context, he did not appreciate the tone of my post, which resulted from comparing how differently we treat individuals and companies. 

The second one wanted to help me like they help users. He explained the preauthorization process - which I am also aware of, and even did some research on related to this case - and how it is designed to ensure the company gets their money. He insisted, and probably is still insisting, that it is not a bug that I don't have access to 35,69 € of my own money 5 days later. I agree that it would be a worse problem if I ended up with a negative balance due to a missing reservation. And I agree we prioritize the company getting their money, even if the user is forced to live with extra reservations. 

What we don't agree on is my experience in the app that causes the double and triple transactions. I am pretty sure it is an interoperability problem, and a difficult one to test because of the specific conditions in my flow of use, unlikely to be available without setup in an integrated test environment. 

I have been a part of enough financial systems integrations to have learned two things:
  • Production-like test environments don't exist for all financial integrations
  • We can't test the exact user flows if we don't have the user's kind of environment

Financial integrations tend to be, at least in Finland, held by heavily dominant market positions, and you can't exactly choose who you integrate with if you want to move money.

I tried making a final point: 
The company whose app I used to make my purchase is responsible for the choices of its contracting chain.
Theoretically, they could choose a financial service provider that designs the transaction flow differently. Realistically there are no options, and becoming an option would be hard. But if quality issues with double reservations were a problem at scale, they could seek solutions with options other than shouting down the chain. 

The end result of this is that Foodora lost my business until I forgive and forget. Since this is my second rodeo, I know it took me something in the neighborhood of a year last time. In that time, I may choose to pay with a different flow, or someone may catch the glitch they have for now. Not that we would know of it. 

I wish all the best for the development team at Foodora: great production-like environments, feedback reaching them from their call center, and a lovely developer who will notice and fix the problem even without the other two. And thanks for fixing the security bug I reported where I got free food. I wish there was the same sense of urgency when your users are experiencing the trouble in whatever sociotechnical system you have going on. It might even include my colleagues; I would not know. 

At least it was not that one of us would be fired on the spot. 


Tuesday, August 13, 2024

Did you make notes of what you learned on the CrowdStrike case?

It was hard to miss a few weeks back: CrowdStrike's off-by-one inputs bug bluescreened a number of computers relevant enough to impact how businesses of all sorts run globally. I'm still not sure what the top sentiment of this is for me:

  • Sucks to have been the cause of this. Sociotechnical systems fail, and you never know when you're part of the system that causes it. 
  • The scale of who was impacted surprised me. Managing to deliver something visibly broken to this scale in an hour feels rushed. 
  • The costs of this across the industry must be everyone's responsibility. Having people manually fix this on business computers globally must have cost a fortune, and the bill of it will end up being split instead of finding its way to the source. 

Everyone in this tech bubble followed this. Everyone had an opinion. There's a case of failure now, something other than Therac-25, which we still talk about as an example to this day even though it happened in 1987. 

An Outsider's Recount of What Happened

The details of the issue reached us in two waves, and I learned more from the second wave of reporting. 

I learned the problem was an off-by-one in the number of inputs of a specific kind between two components. One component produced templates with 21 input parameters, the extra parameter carrying a specific kind of input. The other component expected 20 input parameters and did not work well with the extra parameter of that specific kind. 

The time-based distribution of why this leaked was particularly interesting. 

The capability to produce templates with 21 inputs was introduced in February and taken to production. The use of the capability with incompatible input happened in July, resulting in the blue screen. 
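
In the abstract, the shape of this failure is easy to sketch. Everything below is a hypothetical simplification in Python, not CrowdStrike's actual (native) code, and the names are invented; the IndexError stands in for an out-of-bounds memory read in the kernel.

    # A hypothetical simplification of the off-by-one shape of the failure.
    EXPECTED_PARAMETERS = 20  # what the consumer component was built for

    def match_rule(param: str) -> None:
        pass  # stand-in for the actual matching logic

    def consume_template(params: list[str], used_indices: list[int]) -> None:
        buffer = params[:EXPECTED_PARAMETERS]  # room for 20 only, as built
        for i in used_indices:
            match_rule(buffer[i])  # index 20 -> IndexError, the Python
                                   # stand-in for an out-of-bounds read

    template = ["input"] * 21  # February: producing 21 parameters is possible

    # February to July: no content actually reads the 21st parameter.
    consume_template(template, used_indices=[0, 5])  # works, bug stays latent

    # July: new content references the 21st parameter (index 20).
    try:
        consume_template(template, used_indices=[0, 5, 20])
    except IndexError as crash:
        print(f"blue screen stand-in: {crash}")

The sketch also shows why the time distribution mattered: the mismatch sat in production for months but stayed harmless until content actually exercised the extra parameter.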

Process-wise, the bug was missed in February. It may have been introduced by another developer. The latter developer, in July, would have found the bug if they had tested the previous developer's work again. Testing it, as it appears, would have required two things that take effort beyond the development environment: 
  1. A production-like environment. If Windows blue screens, a Windows machine would have been needed. Nothing in the materials discusses whether developers could test this on their development machines, or whether it would require connecting to a virtual or remote environment. 
  2. Looking at the result from the outside in. As the last step of making a change, someone could look at the change working on at least one Windows machine. The reporting, however, does not make it completely clear whether there really was no third condition, like having to run the system for more than 5 minutes. These kinds of faults can take time, as the computer does the work, before everything comes crashing down. 

This could have happened in many of the organizations I have been at. Developer teams shortcut production-like environments and looking at the result from the outside in. Even when automated, end-to-end tests like this take longer to run. It is often considered a risk worth taking to make software available for users with zero pairs of eyes on it. I tend to prefer two pairs of eyes on a change, because no one - no one - should be left alone with the responsibility of potentially breaking millions of computers globally. It is always a failure of the sociotechnical system when pushing bugs to production, at scale, happens. 

The changes outlined were interesting. Items 1-4 are missing behaviors (features) of the code. Item 5 says they will now test changes in integration instead of in isolation. Item 6 says they will throttle delivery to destroy a select group rather than everyone. Items 5 and 6 are generally considered good practices. 

Recounting Personal History
 
I used to work with security software with risks exactly like this - the possibility of taking down millions of Windows machines. This whole incident reminds me that we too, a decade or two ago, had a problem like this that test coverage missed. But we had a few other mechanisms that protected us: 
  1. Delay-based distribution: the group who got things first was small but reactive, and rendering a board member's computer useless without manual intervention did a lot of good for investments in ensuring the lessons were learned without impacting customers
  2. Eating our own dog food in the company: we learned to distribute internally, continuously, and to segment distributions. The whole environment for testing was built to provide HUMAN interventions, because test systems fail. 
  3. Throttling, autorevert, and secondary update channels: we built a number of features that would enable us to fix things without showing up face to face (see the sketch after this list)
  4. Architectural rewrites: isolating risks like this, because not all software can cause blue screens. 
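
For the throttling and autorevert pieces, a minimal sketch might look like the following. The ring names, thresholds, and telemetry stub are all invented; the point is promoting an update ring by ring and backing out automatically when early rings report failures.

    # A hypothetical sketch of ring-based, health-gated rollout with autorevert.
    import random
    import time

    RINGS = ["internal-dogfood", "early-reactive", "broad", "everyone"]
    FAILURE_THRESHOLD = 0.01  # abort if over 1% of a ring reports failures
    SOAK_SECONDS = 1          # seconds for the sketch; hours in real life

    def deploy(update: str, ring: str) -> None:
        print(f"deploying {update} to {ring}")

    def revert(update: str, rings: list[str]) -> None:
        # The secondary update channel: undo without showing up face to face.
        print(f"reverting {update} from {rings}")

    def failure_rate(ring: str) -> float:
        return random.random() * 0.02  # stand-in for real crash telemetry

    def rollout(update: str) -> bool:
        completed: list[str] = []
        for ring in RINGS:
            deploy(update, ring)
            time.sleep(SOAK_SECONDS)  # soak: give faults time to surface
            if failure_rate(ring) > FAILURE_THRESHOLD:
                revert(update, completed + [ring])
                return False  # stopped before reaching everyone
            completed.append(ring)
        return True

    rollout("content-update-42")
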
And then there was the continuous investment in test coverage. I would remain disappointed if the conclusion in the end were that test coverage is all they missed, when they have a wealth of investments their board members would happily sponsor to never see this happen to them again. 

Moving on

The sociotechnical system designed and put in place did not account for a risk that materialized. And that leaves me thinking of this:
"Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand."

--Norm Kerth, Project Retrospectives: A Handbook for Team Review
This is our chance as an industry to remember: production-like environments and looking at the result from the outside in. In some teams, developers do that. And in many teams still, it's considered so important that the team invests in a team member with a testing emphasis. Some teams call that person a developer, but others may use the familiar term 'tester'.

Sunday, August 11, 2024

Explaining Exploratory Testing

This last week, a realization hit me. Exploratory testing - coined by Cem Kaner, and the most natural frame in which testers work and do good work - just turned 40 years old this year. It has been around longer than I have, and yet we don't agree on it or understand it fully. 

When first observed and labeled 40 years ago, it meant the exceptionally different way in which cost- and results-aware companies in Silicon Valley were doing testing. It was multidisciplinary, and it generally avoided the test cases that the rest, the non-exploratory testing companies, were obsessed with. 

We learned it was founded on agency: the idea that when two things belong together, we don't separate them. And a lot of people separate them, by having different people do different parts of what is essentially the same task, or by creating a separation in time to protect the thinking and learning time of testers. We learned that opportunity cost was essential, because we could choose to do things differently with the same limited time we had available. 

Some people ran with the concept and framed it as testing vs. checking. Checking was their choice of word to say that, for the exact same reason exploratory testing was framed as an observation of Silicon Valley product companies doing something different, we needed to wrap the other thing in a contrast. Ever since I realized that checking is an output of exploring, I have not cared much for this distinction. And when it became the main tool for correcting people, I stepped away from it more actively. 

We can still observe that not all companies do exploratory testing. And looking deeper, we can see that some companies do exploratory testing as a technique, kind of as it is framed in a lot of writing that is based on how Lisa Crispin and Janet Gregory describe it. Others do it as an approach, and that is how I tend to describe it. 

For sake of illustration, I sketched a typical social agreement of how I work with an agile team. 

My work as a tester starts already before the team gets together to look at a feature in a story kickoff. I usually work with product owners on impact analysis, exploring the sufficiency of our test environments, the possible dependencies that would need to be in place, the availability of the right skills and competencies for success, the other features that will be impacted, and so on. With impact analysis, I usually use the previous version of the application as my external imagination while thinking of what we'd need to address. That is very much exploratory testing. 

When we then prioritize this feature and have a story kickoff, I join the team in exploring what examples would be the minimal set for us to understand what is in scope and out of scope. With my best efforts as a seasoned tester, I seem to get to about a 70% success rate in identifying claims of relevance with my teams. We usually write these down as acceptance criteria (updating whatever was already there from impact analysis), and as examples we would like to see in test automation. 

While implementing, we also implement the test system: the unit tests, the other tests, the reviews of the tests, and the new capabilities the other tests rely on if a feature touches areas where some automation capabilities are missing. If you wonder what I might mean by an automation capability, a good example is a library allowing for certain kinds of commonly needed functions, like simulation. 
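
To illustrate what such a capability could look like - the names and the payment domain here are hypothetical - consider a small simulator that other tests rely on to make hard-to-reach conditions reachable:

    # A hypothetical sketch of an automation capability: a simulator for an
    # external payment dependency, so programmatic tests don't need the real one.
    from dataclasses import dataclass, field

    @dataclass
    class SimulatedPaymentService:
        fail_next: bool = False               # scriptable failure injection
        reservations: list[float] = field(default_factory=list)

        def reserve(self, amount: float) -> str:
            if self.fail_next:
                self.fail_next = False
                raise TimeoutError("simulated upstream timeout")
            self.reservations.append(amount)
            return f"res-{len(self.reservations)}"

    # In a test, the simulator makes a rare condition reachable on demand:
    payments = SimulatedPaymentService(fail_next=True)
    try:
        payments.reserve(35.69)
    except TimeoutError:
        pass  # now exercise the retry path that might double-reserve
    assert payments.reservations == []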

Even though I have been exploring throughout the implementation, I take a breather moment and just look at the thing we created as my external imagination, trying to think of stakeholders and their feedback. I might even drag them in, strengthening my external imagination from just the application that speaks to me to actual people looking at the application with me. 

Then, I will still look at things with my team once more, seeing if we can just press the button to release or if we want to double-check something. I aim to minimize anything I would have to do while releasing, but at the same time, I make sure one of us is exploring the experience with the new feature included. 

Finally, I follow through. Sometimes I follow immediately. Sometimes I follow in a month, three months and six months. I deal with the long tail of learning by exploring what use in production is like. In mature organizations, I do much of this from logs and access to customer facing issue trackers. In not so mature organizations, I drink coffee with people who meet real users. 

Within the social agreement in a team, I have an exceptional level of agency: I am allowed to break the social contract at any time when I recognize something important. I discuss what I plan on doing, and sometimes we have a conversation. Some of my activities feed into the feature we are on now, others into the features we are about to think about after this one. My exceptional level of agency allows me to choose what I do and in which order, making agreements on who does what. Then again, I work in a very social context where dailies allow us to redistribute work, not from a backlog of tasks for one. 

If at any stage of the process people talk about test cases, it's as an output. Sometimes we need to leave those behind for standards that don't quite get exploratory testing. And in most cases, we hear "test case" and just transform it into a programmatic test. While that does only a limited part of the testing, it provides enough evidence as long as it is framed in an exploratory testing mindset. 

For me, exploratory testing is an approach: I explore a target, with resources, for information, throughout. When it is a technique, it's that rethink-with-external-imagination part of the social agreement.

At its core, exploratory testing starts with the idea that while there are knowns (or targeted requirements we think we know we want), there's movement in continuous learning. 

The difference in thinking tends to drive opportunity cost and thus prioritization. Having to choose between writing automation and using the application, a lot of people in exploratory testing would choose the latter. And when I speak of contemporary exploratory testing including automation, I discuss a frame in which we actively change things so that we don't need to make choices between the two, but can merge them. Modern tools, modern teamwork, and short cycles in agile development all enable that. Our ideas of what exploratory testing is and isn't still sometimes get in the way. 



Saturday, August 10, 2024

It took me two decades to get to quality engineering

I have carefully curated the description of me on Mastodon:

🇫🇮Tester of products and organizations. Regretful Manager. Exploratory tester, (Polyglot) Programmer, Speaker, Author, Conference Designer. She/Her. maaret@iki.fi

The order matters. Each choice matters. And I am on the verge of changing the word my experiences center around from *tester* to *quality engineer*. 

20 years ago, I was working in an organization that called the group I joined quality engineering. It was important to some people; it was not important to me. Showing up as me meant, by personality, that I was always living a bit in the future, and I put significant effort into grounding my professional skills in the now, in empirical information, in seeing the difference between asking for something and actually having what you asked for. Testing lives in the NOW, and a lot of people don't live in the now. 

It still takes me effort to focus on the now, when the impacts I am seeking are in the future. But teams need people who bring in different perspectives. And the perspective grounded in the now and in empirical information is hard to come by. Models that help me be intentional about the horizon I am working in, notice where others are working, and make choices to be the complementary force have led my professional choices, even when I don't explain why or how I think about things. In practice, I described this a lot some years back in materials I was creating about being a test manager, where you would have entirely different items to work with depending on whether your project manager counterpart was an optimist or a pessimist. The job of working to fill the negative space was hard to share and teach, and it is a worthwhile investment to find some people for your teams who can do it, no matter what position you end up assigning them. 

When the world talked of quality assurance, I emphasized testing. When the world moved to quality assistance, I recognized more of the sentiment and still emphasized testing. Over the years, I stuck with testing, only adding the piece you can see in my Mastodon bio: I am a tester of products and organizations. This thing we do and know as testing, living in the now, is just as needed for products and for organizations. 

Some weeks ago, I finally managed to make the connection. Testing lives in the now, and what we have now may be bad quality. It may be a lack of technical excellence. It may be unhappy people, bad practices, and subpar outcomes. When I enter a case of bad quality, I don't accept the bad quality. I don't accept the job of pointing out pizza boxes on the living room floor; I take it upon myself to improve the habitability of the work, and the experience of quality for different stakeholders. I have called this contemporary exploratory testing, where I take the center of the practice but require more than what people who would traditionally do exploratory testing offer. I require automation. I require decision-making. And I require social software testing approaches. I require focusing on the now, to move towards the future. Quality is the future; it's the target, it's the goal. It is the good enough for our present state. It's the rising bar, the productivity impact on the whole of development. That mindset that I frame as contemporary exploratory testing, other people like Anne-Marie Charrett frame as quality engineering. 

Quality engineering is forward facing from a platform of testing, in the now. 

With testing, we become generalists amplifying all the failure signals of things that might go wrong. With quality engineering, we join the path of fixing things.

It's sometimes hard to get what we have in testing, because we are moving and learning. And while the testing future is already here, it's really not evenly distributed. Testing to quality engineering is a part of that future, and I finally made the connection after all these years. 



Sunday, August 4, 2024

Ensemble Testing — How to Enable Better Habits and Skills

This post has originally been published as Medium post on October 27th, 2019. Since Medium did not win on my choices of blogging platforms, I am including it here in efforts to consolidate my legacy. This post has 2.8K views on Medium.

Five years ago, I was sitting in a conference front row feeling puzzled. I was listening to Woody Zuill on Ensemble (Mob) Programming and the idea he presented felt too extreme to be good:

Whole team working on the same thing using one computer.

I had to try it, and it transformed me from a non-programming tester back to the polyglot programmer I had already been since the age of 12. To help other people get started, I started the Ensemble Programming Guidebook. But the most significant transformation came from what I started calling Ensemble Testing.

Ensemble Testing

Ensemble Testing is to testing activities what Ensemble Programming is to all activities. While I can easily recommend Ensemble Programming 40 hours a week, if you did Ensemble Testing 40 hours a week, you would soon find yourself grown into Ensemble Programming.

Ensemble Testing is ensembling on testing activities:

  • Cleaning up test automation code
  • Creating test automation code on any or many layers
  • Exploring an application while writing code
  • Exploring an application without writing code
(Photo in the original post: ensemble testing ongoing, the group working together.)

Before, when testers got together to test and learn about testing, we would work in the same space, everyone with our own computer, or at minimum a computer per pair. Ensemble Testing gave us a vocabulary to say that one computer, or a set of computers controlled together as a group, would be an option. And that while we might do less, we might learn more. Usually in ensemble testing we don’t uncover as many issues, but we learn a lot. And we learn to work better together over time!

A Facilitating Teacher’s Golden Egg

As I discovered Ensemble Testing, I started using it for all my hands-on exploratory testing courses. As a teacher trying to enable movement in habits and skills for a group of 10–16 students, ensembling took my ability to teach to the next level. I would always see what the students could do (not just what they said they could do), I could have the students teach each other, providing great leveling of the next steps to learn, and I could take control myself to teach hands-on how to apply a particular idea.

I would find myself saying “Let me navigate for a while” and showing how I would test it. I would ask questions about the intent, asking people to give words to what they were trying to do. I would have them write it down on a whiteboard so that they could anchor their learning. And I would watch people move to the keyboard and know how to do a thing they had not known to the same level on a previous round.

In particular, I learned that when there are things you do not know you do not know (unknown unknowns), ensembling is the way to reveal them.

I moved my courses from exploratory testing without automation to very specific courses on automation, and regardless of what would come up, I could rely on the group’s collective power intertwined with my knowledge on solving the problems. I have no fear.

Skills and Habits

Skills are about actionable knowledge: being able to do something, in a smart way — the right tool for the right job. Habits are about consistency: being intentional not accidental about our results.

I learned people are often at their best behavior in groups. We want to please our peers, be it kindness or innovation.

The best results from Ensemble Testing come over time. Not just one training or one session, but doing your testing work in a group setting every now and then. Mixing up unexpected people. Building bridges. Transforming practices and organizations.

If you have not yet tried, please do. If you have, tell me what your experience has been like with #EnsembleTesting.

Turning Ideas into Code

This post has originally been published as Medium post on Jul 27th, 2019. Since Medium did not win on my choices of blogging platforms, I am including it here in efforts to consolidate my legacy. This post has 1.98K views on Medium.

Recently, I’ve been asked to visualize our software process. Whenever I see visualizations of what we’re supposedly doing or supposed to do, they make me cringe. Yet I find myself unable to do any better. This article is one attempt to bring clarity to how I think about software process.

Developer-Centric Ways of Working

As much as people like to draw the ownership of products into a product management organization, I see that the true ownership lies with the developers. If they don’t, successfully, create a pull request that implements a change, the users will not see change. The core of the process is turning ideas into code.

Smart Developers Turn Ideas Into Code

No matter what else we agree goes on, it all flows through here. Smart ideas, into smart code, by people able to do that transformation.

There are three clear points of failure we often create processes around:

  • What if the ideas are not smart? What if the ideas are not worth spending time on?
  • What if the people doing the transformation don’t have ideas of high quality, in context and are missing relevant perspectives?
  • What if people turning ideas into code could do things faster having help?

Clearly, smart people can learn to have smarter ideas. Looking at what they do and what its impact is, they can change. Externalizing the looking often has adverse effects on motivation and on the ability to hit the mark on delivering, eventually, something of value.

Heart for the Customer

A smart developer would not be very smart if they did not care why they are building the software. There’s someone, somewhere, willing to trade money for having the thing they’re building. We can recognize that the relationship is complex and requires more attention, especially at scale, than a developer can put there, but behind all software is a need it is supposed to fulfill.

Caring for the Customer

Some people talk about being customer-obsessed, but obsession sounds like a negative thing. I believe we need to have our hearts set out to hear, to understand, and to care. From this relationship of us caring, we also find inherent motivation. Addressing a real need, and the right need, is motivating. Filters and proxies dampen the motivation, while listeners help manage the relationship.

Being Smart Takes a Village

Looking at the way of working as developer-centric, we also soon come to realize that a developer, as smart as they may be, doesn’t need and doesn’t want to be alone. Making changes through pull requests that end up delivered to millions of computers out there is a heavy burden to carry alone. So we have a principle of always having at least two pairs of eyes on every change.

Smart Ideas Improve with Collaboration

A lot of the time, the process focuses on how the others are intended to contribute to turning ideas into code — improving the ideas about to be turned into code, or those already turned into code and needing adjustment. The product owners, the designers, and the testers all work particular perspectives of improving the ideas.

A Transformative Way of Thinking

Looking at software product creation this way, every developer welcomes help in understanding what is the right thing to build and how we together could learn about it more effectively.

Let’s face it: only by making a change through code and delivering it all the way to the hands of the users do things change. We can analyze, plan, and prepare all we want, but unless that helps a smart developer have smarter ideas, we are probably not improving the impact we make with software development.

Let’s address the weak points: ideas become smart by working together and learning; people are smarter in diverse groups; the work we do can be shared in many ways.

How am I supposed to describe this in a process model?

What is Exploratory Testing?

This post has originally been published as Medium post on Jul 3rd, 2019. Since Medium did not win on my choices of blogging platforms, I am including it here in efforts to consolidate my legacy. This post has 816 views on Medium.

It has been 35 years since Cem Kaner coined the term to describe a style of skilled multidisciplinary testing common in Silicon Valley. I’ve walked the path of exploratory testing for 25 years and it has been a foundational practice in becoming the testing professional I am today. Let’s look at what it is, to understand why it still matters — more than ever.

Living Life on Defaults

Teaching programming to a group of children, we look at the little program we’ve created following a recipe. The little turtle, which we have learned to refer to as a tortoise and give commands to, draws a square, just as we’ve told it to. We’re ready to look for something more interesting, to unleash the power of programming, allowing the computer to do things we would not do. The children have no concept of this, yet.

We identify attributes of the square. We find things that could be different, things we will later call variables: the number of sides, the length of a side, the color of the line, and eventually, looking at the line, its width. We realize it is not very thick, but we also realize that there is nothing in our code saying how thick the line should be. Jumping to the conclusion of a hidden default value usually takes only a moment.

We learn that the first step to changing the default is to reveal the default.
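
In Python’s turtle module, the lesson might look like this sketch (assuming the recipe the children followed was close to this):

    # A sketch of the children's square in Python's turtle module.
    import turtle

    tortoise = turtle.Turtle()   # the turtle we learned to call a tortoise

    # Nothing here says how thick the line is; we are drawing on a default.
    for _ in range(4):           # number of sides
        tortoise.forward(100)    # length of a side
        tortoise.right(90)

    print(tortoise.pensize())    # reveal the hidden default: 1
    tortoise.pensize(5)          # only once revealed can we change it
    turtle.done()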

A lot of our life, just like a lot of the software building we do, makes us move on defaults. We let life happen on defaults, when we could take control and optimize for what we find important. This is an insight I recently found through a proxy: Scott Hanselman talking about wisdom his wife shares, on one of the Hanselminutes podcast episodes.

Even in life, and in our work, we move with defaults. And the first step to changing the default is to reveal the default.

What does this have to do with Exploratory Testing?

The Bridge Between What Comes Easy and The Real World

At work, we were building a new major functionality to replace the old, and I had a history with the old. The old had a developer known to frustrate the likes of me, testers, by breaking things left and right with every change. They would commit their code and pass it on without testing it. They would then start testing it, and often by the time I had tried whether it worked (and it didn’t), all they had to say was that they knew. The functionality was also complex, so I could have lived through this groundhog day more times than I care to admit.

I was particularly frustrated. The other tester assigned to the work approached it with automation, and had a habit of explaining in meetings how they had 100% of the functionality automated. Reading their code, I knew they had 3 scenarios automated, and it was far from sufficient. The bar for 100% was low. The bar for 100% was their defaults: what they were aware the functionality needed to do.

Armed with their defaults, they looked at the functionality like a bystander. Whatever was coming their way, they would take it, and turn it into automation. But where does the work come from? The first three cases came from the programmer dealing with the bulk of the new functionality. There were two main options for recognizing more:

  • Exploratory testing: Finding new perspectives beyond what was already on the table that could show things were not what they should be
  • The Real World: Finding out what users would say

The programmer was heavily opting for the Real World; after all, they had already seen that the tester assigned to the work was 100% done and everything they shared as knowledge was automated.

The missing piece in the puzzle was someone with the explorer mindset, who could find out more about the real world, before and in production.

Adding perspectives that were not addressed, as charters to explore, was not complicated. Some of them would result in changes without testing anything, as the questions revealed problems. Others, we would insist, could not cause problems, and yet, when we spent time exploring the functionality, they would have unexpected side effects.

Exploratory Perspective is Moving Our Defaults

In hindsight, without exploratory testing in between the bystander perspective and the real world, the real world would not have received what we generally expect to deliver. The work here in between, modeling and actively learning about the real world, is what I think of as exploratory testing.

Exploratory Testing, Exploratory Development

We often come to realize all testing is exploratory to a degree. Similarly, we realize all programming is exploratory to a degree. Why, then, do we have an approach called exploratory testing, when we don’t have a named practice on the programming side?

What we seem to be missing is that testing and debugging are the names we’ve given to the simplest exploratory loop in programming. We test to learn about the program. Exploratory testing says we test to learn about the program, and about the testing we should be doing. It says we move from the bystander role to being an active contributor.

On both the building and the breaking (of illusions) perspectives, we want to move smart people from moving with defaults, being bystanders in their projects, to active people making choices. Exploratory testing is, and has been, a way of moving testing away from defaults.