Saturday, November 30, 2024

You are enough

Sometimes there is just too much going on. Too many self-volunteered tasks and deadlines, some more visible than others. And you find yourself doing a lot. Feeling in control though, you chuck through the pile recognizing there is no progress if you juggle the load, filled with anxiety. Finding that sense of agency, you do what is in your power to do. That sense of power, it is essential. 

I came to think of this because of multiple triggers in this space in the last weeks: 

  • Firehosing information
  • Exercising replan
  • Managing anxiety in others
With the three stories to share, I ground myself in the purpose I have for this blog in the first place. Not to write the perfect articles. But to write down things to reflect. Usefulness of my reflections to others remains at the discretion of the reader. It's different to what the majority of bloggers do, but then again, not everyone frames their blog as 'seasoned tester's crystal ball'. 

Firehosing information

I know a thing or two. I learn more by building a foundation of what I know by sharing it, and have built a bit of an identity in reflection and proving myself wrong. I take pride in the 180 turns of opinions I can recognize on my work. The attitude towards whether continuous integration is a good idea. The idea of what test cases should look like. The idea about separation of concerns to developers and testers. The attitude towards automation. I have written evidence that I learned and changed views. It used to scare me enough to not say things of today. But I learned that I am enough today, even when I am not enough as today for tomorrow. 

I have needed to tap into this lesson a lot now that I have a lot of new colleagues, with a lot of those learnings I have needed to lead myself through over the years. I have needed to remember that I did not change my perspective because I was told what was right. I changed my perspective because I observed the options myself, and had agency in making those changes to my internal model of the world. 

I have a lot to say. And I moderate how much of it I say. I have decided now that I say one piece from stage a month with an audience of my colleagues and anyone in the Finnish community, through the platform that Software Testing Finland (Ohjelmistotestaus ry) offers. And I hold space for conversations with my colleagues on leading and managing testing twice a month. Once a day I can post on our internal channel to share something. Once a day I can post on LinkedIn to share something. And on Mastodon / Bluesky, I can say what I want to say when I want to say it. 

I write and say more in a week than others can consume. Sharing is an outlet, and a processing method. 

November was a month of firehosing. I did seven new talks. One because of my choices of cadence, but 6 others because someone pulled and asked for them. Doing my tally of stage presence a month early was an act of offloading. 

It's been hard to remember that the one guy giving me feedback last year that everything I ever say is shit, and the other guy giving me feedback this year that "quantity is not quality", just in case I did not already deal with enough negative internal self-talk, aren't the full picture. I am enough. You are enough. It is true for us all, without comparison to others. 

Exercising replan

While my head needs an outlet of information through sharing, the visible parts of what I do aren't all I do. Life has been a lot, work has been a lot. 

I found myself in a situation where I had so many competing tasks to complete that I couldn't. 

I couldn't deliver a report to a customer that I promised. So I told the customer, and took a week of extra time. Turns out they appreciated the confession of a consultant overestimating the pace at which they can analyze a complex situation. 

I could not find time to talk to half a dozen people about test automation I wanted to. I didn't, and while they may not forgive me, I forgave myself by letting them know what I learned while I could not show up for them. 

I have still one more thing on replanning, and I am balancing the cost of replanning on that one. Me doing it might just be easier than me replanning it for someone else, we've already been through two bounces back to me. 

Ending up with too much requires replan. Requires confessing need of help, need of time and space, and support. Remembering I am enough even when I am not enough for all the things I may end up with. 

Managing anxiety in others

Being a leader is about having people who follow you, sometimes with positional power but sometimes just because they were in search of someone with ideas. I still call myself a regretful manager, because I don't want to manage; I don't want to lead. I would much prefer if we shared the leadership and the doing, and found ourselves negotiating our journey together.

I have bubbles that recharge me where the world is peer to peer. Our group of regretful managers. My monthly benchmarks on work with a peer, now running for multiple years. I love these groups. 

But I also have other groups where I show up to hold space. Even on the volunteering side, I find I volunteer to manage. Set context for decisions, while trying to stick to my own boundaries of what I can take on. And making space with spoons to ease the anxiety of others. Reminding them they are enough, because while I can tell that to myself and believe it, some people need to hear it from me. 

With these stories, I would remind you: you are enough. You have agency. You have outlets - writing, talking from stage, talking to people. And just because it's not easy or even possible, it's not you. 

You are enough.

Monday, November 18, 2024

Cost-Constrained Exploratory Testing

The world of software used to run on-premise, and the basic promise was this: you buy a computer of the expensive sort, "server", and whatever you do on that ever since costs little. We did not calculate electricity or operations team attention to that server, it was just there. Costs were essentially there but distributed and hidden. 

Then we got the cloud and now we pay per use. I can't be the only person who caused 2k€ extra costs on her 1st month on cloud use by not understanding all aspects of pay per use and differently priced services, but once burned, I have been more cautious. 

With that cost caution in mind, I set out to observe my thinking and actions when trying out a new genAI tool published today, Hercules, https://github.com/test-zeus-ai/testzeus-hercules

I loaded some money on my personal OpenAI API account. I verified that my settings would lead me to losing the money I loaded but no more. I created the API key I needed to run Hercules. 

First four tests I explored cost me 0.36€. Two later, I am at 1.13€ and well aware of cost of exploring. I also note that awareness of the cost makes me consider somewhat more carefully what I will try. 

Hercules?

The high level promise of Hercules is to do agentic transformation (fancy way of saying multiple LLM calls within a logic frame that could be just about anything) of Gherkin to test results. So given this: 


I get this: 

No code written. Annoying level of detail with entering inputs, pressing buttons and all that, but a pass as it should be for this case. 

That was my exploratory test #2. The first one skipped line 7, resulting in a fail because you have to press the button to see the results. Tests #1 and #2 cost me 0.18€, and did not scare me off on cost-constrained and cost-aware exploratory testing I was on. 

With test #3 I invested in changing the level of language for my gherkin file. Adding the URL to my gherkin examples for my exploratory testing foundations course, I went about seeing what would happen with three tests in a single feature file, where one test is two tests parametrized, totaling this to four tests. 

Again a green run. Three tests not four, but watching video evidence of the last, both sub scenarios were included. 

Where test #2 execution cost was 0.085, this one cost me 0.312. Whatever the unit, because it did not match what I ended up seeing on the costs panel in openai portal. 

Test #4 I dedicated to seeing a test fail for the right reasons. Taking incorrect prime analysis and setting expected values to calculate 8 words for "to be or not to be - hamlet's dilemma", I indeed got the fail with an unexpected error analysis. The words on my tests and on the UI for the concepts don't match literally, and I wrote them in a different order, and yet the connecting of concepts hit the mark and compared the right things. 

Tests #1-#4 cost me 0.36€ to run. 

For test #5, I was sure I would end up adding to the cost. I took a screenshot of the application and passed the screenshot to Claude asking for Gherkin feature file. 

Not quite perfect scenarios. Discouraged words is a concept that is completely misinterpreted, but that also tells about the UI concepts not being intuitive. Example for e-prime mastery tries to avoid the verb 'to be' but ends up still having one amongst the avoided examples. 

Running the test, I start to see 429 responses from the API - asking too much too soon, at least as per my cost aware settings and after some minutes I decide to not risk the cost, paying 0.50€ for this one failed experiment. Failed as in did not produce the report, but did produce some of the videos. 


Video showed me that not specifying the location of the app I want tested results in testing eviltester's version of the same. First hit on google and all that. 

The first three tests resulted in failure. 'Will be' is not 'be', and other imprecisions in the scenarios end up requiring some more work. 

<failure message="EXPECTED RESULT: The tool should identify the verbs 'am', 
'is', and 'will be'. ACTUAL RESULT: The tool identified the verbs 'am', 'is', 
and 'be'."/>
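
To see exactly where expected and actual diverge, the failure message can be picked apart with a few lines of Python. This is only a sketch: it assumes the report is JUnit-style XML (which the failure element suggests), and the variable names are mine, not anything Hercules provides.

```python
import re
import xml.etree.ElementTree as ET

# The failure element quoted above, joined onto one line.
failure_xml = (
    "<failure message=\"EXPECTED RESULT: The tool should identify the verbs "
    "'am', 'is', and 'will be'. ACTUAL RESULT: The tool identified the verbs "
    "'am', 'is', and 'be'.\"/>"
)

message = ET.fromstring(failure_xml).get("message")

# Split at the ACTUAL RESULT marker, then collect the single-quoted verbs.
expected_part, actual_part = message.split("ACTUAL RESULT:")
expected = set(re.findall(r"'([^']+)'", expected_part))
actual = set(re.findall(r"'([^']+)'", actual_part))

missing = expected - actual      # expected by the scenario, not reported
unexpected = actual - expected   # reported by the tool, not expected

print("missing:", sorted(missing))        # missing: ['will be']
print("unexpected:", sorted(unexpected))  # unexpected: ['be']
```

The diff makes the scenario problem visible: the generated Gherkin expected 'will be' as one verb, while the application counts 'be'.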

Final test #6 was against a live system of a pair I tested with. We tested search, with a passing test, ending up at €1.13 for my out of pocket cost for exploratory testing a new thing. 

That investment pays me back a "coffee or beer" next time I meet the tool creator. Or when revealing pepsi max as my beverage of choice, a six pack of it. I ended up finding a bug on telemetry and the bug ended up being fixed and fix deployed already. 
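
The running totals quoted through this post can be checked with a little arithmetic. A minimal sketch follows: the four stated figures are from the text above, the per-step costs are derived by subtraction and are not figures I read off any panel, and the variable names are made up for illustration.

```python
from decimal import Decimal

# Figures in euros as stated in this post.
after_test_2 = Decimal("0.18")   # tests #1 and #2
after_test_4 = Decimal("0.36")   # tests #1-#4
test_5_alone = Decimal("0.50")   # the failed experiment
after_test_6 = Decimal("1.13")   # final out of pocket total

# Derived by subtraction; these are not stated anywhere in the post.
tests_3_and_4 = after_test_4 - after_test_2
after_test_5 = after_test_4 + test_5_alone
test_6_alone = after_test_6 - after_test_5

print(f"tests #3-#4 together: {tests_3_and_4} EUR")  # 0.18 EUR
print(f"total after test #5:  {after_test_5} EUR")   # 0.86 EUR
print(f"test #6 alone:        {test_6_alone} EUR")   # 0.27 EUR
```

Decimal is used instead of float so that the cent amounts subtract exactly.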

Conclusions

Every time one explores under a constraint, it has an impact on thinking and resulting intent on action. Cost-awareness drives me to think about what information I am seeking before hitting the tool. 

Costs are one side of the coin, but we do pay a lot more for people figuring out locators and clicks to implement what gets generated here. 

Costs drive reuse, and wishes that we would share the scenarios so that we don't have to rerun the same. Learning from what others already paid for is in the future.

Generating worthwhile gherkin might still be a human effort for now. 

Replay without the GPT-4o cost would be nice but we did not find it - yet. 



Friday, November 8, 2024

Dear customer, what you assess me for is extremely ambiguous

Dear customer, 

I am delighted to note you have come to realize you could use some testing services. I am delighted, because I deeply care about how those services could help customers work out their quality, productivity and decision-making areas with timely empirical information. And I am even more delighted I get to be in the position to receive such requests. No sarcasm. I look forward to any level of collaboration we may have, be it the RFP you just submitted or the potential engagement that could follow. 

That said, I need to bring to your attention that I am puzzled with many things you are asking for. 

As someone who studied at Helsinki University of Technology for 251.25 cr (just checked) but never graduated, I have a bit of a pet peeve on the idea that my 27 years in testing don't qualify me to work on your projects. I can justify your request for a Bachelor or Master of Science degree by balancing it with the fact that your other requirements often speak of someone with few years on the job, and delight myself in how that is a smart move on the requests. Balance of cheap - some experience - completed studies bodes well. I would wish you would allow for expensive sold cheap - lots of experience - knowledge you seek of the studies without completion. But that is kind of a personal request. 

That is not why I am writing this public letter to so many of you though. That is just a backdrop. The real reason is that you ask other things, and I don't think you realize how ambiguous those requests you make are. I am sampling some from the 5 months of samples I now have had access to. 

Requesting Acceptance Testing experience

When in 2001 I summarized my literature research on Testing in Extreme Programming (XP), the complete lack of tester in the method was a key takeaway. Acceptance testing was used as a way of describing a particular style of functional tests in the team. The world of testing knew these as system tests, yet the term Acceptance tests took a significant foothold. 

Meanwhile, the rest of the world thought of Acceptance testing as anything that the customer organization would do for purposes of acceptance. It could be anything from accepting based on a report of other testing, to very detailed and organized test effort ensuring the system adheres to explicit and implicit requirements in the scale that was asked for (and paid for) as well as fits the purpose of use it was commissioned for. Key takeaway is that it is something the customer organization does. 

When you ask for experience of Acceptance testing, you could mean modern end to end automation on GUI and API, driven by examples. You could also ask if we have people who worked on customer organization payroll. You could ask if they have been specifically commissioned to represent a customer organization in an acceptance testing project. 

You probably had an idea of why you asked for this experience. You would do better if you could explain that. 

Requesting Integration Testing experience

If there ever was an ambiguous term, integration testing is guaranteed to be the one. Ask 10 people, you get 10 different answers. 

It used to be popular to call rehearsal rounds of system testing integration testing. The only real separation between these two was that contractors would get to keep bugs they found in integration testing a secret, but had to share them with the customer while in system testing. I am almost certain you weren't looking for people who have played this game. It was really popular 15-20 years ago. 

Those tired of the games started calling it integration testing if we tested a feature while other features were still unimplemented. Usually the focus was on integrating that feature to other pre-existing features. 

Some books recommended integration testing is when we have components or subsystems, and we specifically focus on the back and forth communication between those parts. People rarely knew how to do this in practice though.

Other books emphasized that integration testing is about integration strategies, focusing on smaller scales in large and complex systems, advising to choose an incremental strategy over a big-bang strategy. There integration testing meant control of test environments. 

Many people though went with a very straightforward idea. They called it integration testing if you could test through an API. However, most people would call that API testing rather than integration testing. 

You probably had an idea of why you asked for this experience. You would do better if you could explain that. 

Requesting Test Automation experience

You asking for test automation is a different kind of ambiguous. It is hard to find people who don't have experience of test automation in their teams, but only some write it themselves. You may not realize it, but people who don't write test automation themselves and instead get it done through exceptional collaboration with developers may do better in the long-term automation efforts you could be seeking. 

My personal pet peeve is that a lot of people who know how to automate don't know how to test. So they have experience in test automation but that may not be the test automation you want. 

You probably have more specific needs than the high level category. Maybe you have existing tests that need maintaining? Maybe you are about to get started? Maybe you need build pipelines rather than the tests? 

You probably had an idea of why you asked for this experience. You would do better if you could explain that. 

Requesting UI Testing experience

By now, you already know what I am about to say. Experience of WebUIs is different to experience on WindowsUIs, or the intricate details of Java UIs intended to be cross-platform. You may mean usability testing. You may mean specific technology. You may mean awareness of isolating UIs from the backend so that you can do great UI component testing enabling UI changes of the future. 

You probably had an idea of why you asked for this experience. You would do better if you could explain that. 


I want to help you. I want to find you the right people, exceptional people. I want those people to enjoy working in your projects, doing good work, and getting praise. And it would be a lot better for us all if we ended up aiming for the same thing. 

The ways you ask in public sector bids aren't helping you. We've learned how to work with those, and change is so much work that we just do what you ask. Unless you ask that we figure out better together. I am sure I am not the only contractor representative who would be happy to volunteer to work with you for better. Perhaps we could do this as a joint community effort, customers and contractors roundtable style?