Monday, June 20, 2022

Untested testing book for children

Many moons ago when I had small children (I now have teenagers), I saw an invite to a playtesting session by Linda Liukas. In case you don't know who she is, she is a superstar and one of the loveliest, most approachable people I have had the pleasure of meeting. She has authored multiple children's books on the adventures of Ruby and learning programming. I have all her books, I have purchased her books for my kids' schools, and I taught lower grades programming with her brilliant exercises while I still had kids of that age. For years I have been working from the observation that kids, and girls in particular, are social learners, and the best way to transform age groups is to take them all along for the ride. 

The playtesting experience - watching it as a parent specializing in testing - was professional. From seeing kids try out the new exercises to the lessons instilled, my kids still remember the pieces of a computer that were the topic of the session, in addition to the fact that I have dragged them along to fangirl Linda's work for years. 

So when I heard that there is now a children's book by another Finnish author that teaches testing to children, I was intrigued but worried. Matching Linda's work is a hard task. Linda, being a software tester in a past life while also being a programmer advocate and a renowned author and speaker in this space, sets a high bar. So I had avoided the book "Dragons Out" by Kari Kakkonen until EuroSTAR publicised that they have the book available on their hub.

However, this experience really did not get off on a good foot. 

First, while promotional materials led me to think the book was available, what actually was available was an "ebook", meaning one chapter of the book and some marketing text. Not quite what I had understood. 

Second, I was annoyed that a children's book where pictures play such a strong role is not promoted with the name of the illustrator. Actually, the illustrator is well hidden, and Adrienn Szell's work does not get the attribution it deserves beyond a mention on the pages that people don't read. Excusing the misattribution of a gifted artist's work by not crediting her as second author works against my sense of justice. 

So I jumped into the sample, to see what I get. 

The abstract starts me off with annoyance. It announces "male and female knights", and I wonder why we have to have children's books where they could be just knights, or at least boys/girls or men/women, over getting identified by their reproductive systems. Knights of all genders, please, and I continue.

Getting into the book beyond the front page that keeps Adrienn invisible, I find her mentioned. 

"Ragons". "Cand". Typos hit me next. Perhaps I am looking at an early version and these are not in the printed copy? 

Just starting the story gives me concerns. Why would someone start chapter 1 of testing for children with *memory leaks*? Reading the first description of a village and the commentary of it representing software while sheep are memory, I am already tracking in my head where the story can be heading. 

For a village being software, that village is definitely not going to exist in the sheep? I feel the curse of fuzzy metaphors hitting me and continue.

The second chapter makes me convinced that the book could use a translator. The sentences feel like Finglish - a translation from Finnish to English. "Gorge" cannot really be what they meant? Or at least it is too elaborate a word for describing cracks into which sheep can vanish. Sentences like "sheep had stumbled into rock land" sound almost Google-translated. The language is getting in the way. "Laura began to suspect that something else than dangerous gorges was now." leaves me totally puzzled about what it is trying to say.

Realising the language is going to be a problem, I move to give less time to the language, and just try to make sense of the points. The first chapter introduces the first dragon, and dragons are defects. This particular dragon causes loss of sheep, which is loss of memory. And dragons are killed by developers who are also testers and live elsewhere. 

We could discuss how to choose metaphors but they all are bad in some ways, so I can live with this metaphor. There are other things that annoy me though.

When a developer makes an error, she is a woman. That is, when the explanation text introduces dragons as defects, we read that the "developer has made an error in her coding". Yet as soon as we seek a developer to fix it, we load on a different gender with "he is known to be able to remove problems, so he is good enough". Talk about loading subliminal gender roles here. 

What really leaves me unhappy is that this chapter says *nothing* about testing. The testing done here (noticing systematically by counting sheep every day) was not done by the knights representing developers/testers. The book starts with a story telling that dragons just emerge without us leaving anything undone, and presents those *unleashing the dragons* as saviors of the sheep instead of as responsible for the loss of the sheep in the first place. The book takes the effort of making the point that knights are not villagers - developers/testers are not users - and yet it leaves all of the testing to the villagers and gives only debugging (which is NOT testing) to the developers/testers. 

If it is a book about testing, it is a book about bad testing. Let's have one developer set up fires, wait for users to notice them, and have another developer extinguish the fire! Sounds like testing?!? Not really. 

On the nature of this red dragon (the memory leak), the simplifications made me cringe and I had to wonder: has the author ever been part of doing more than the villagers do (noting sheep missing) with regard to memory leaks? 

This is a testing book for children, untested or at least unfixed. Not recommended. 

Unlearning is harder than learning the right things in the first place, so this one gets a no from me. If one cared about testing the book, setting up some playtesting sessions to see engagement and retention of concepts would be recommended. However, this recommendation comes late to the project. 

Monday, June 13, 2022

Testing, A Day At a Time

A new day at work, a new morning. What is the first thing you do? Do you have a routine for how you go about doing the testing work you frame your job description around? How do you balance fast feedback and thoughtful development of executable documentation with improvement, the end game, the important beginning of a new thing, and being around all the time? Especially when there are many of them, developers, and just one of you, testers. 

What I expect is not that complicated, yet it seems to be just that complicated. 

I start and end my days by looking at two things: 

  • pull requests - what is ready or about to be ready for change testing
  • pipeline - what the executable documentation we have tells us about being able to move forward
If the pipeline fails, it needs to be fixed, and we work with the team on the idea of learning to expect a green pipeline with each change - with success rates measured over the last 10 two-week iterations ranging from 35 to 85%, a trend that isn't in the right direction, and an excuse of architectural changes. 
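To keep an honest view of that trend, the pass percentage per iteration is worth tracking. A minimal sketch of that bookkeeping, assuming pipeline run results come in as (iteration, passed) pairs - the data shape here is my own invention, not any specific CI tool's API:

```python
from collections import defaultdict

def success_rates(runs):
    """runs: iterable of (iteration_id, passed) pairs.
    Returns {iteration_id: pass percentage, rounded}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for iteration, passed in runs:
        totals[iteration] += 1
        if passed:
            passes[iteration] += 1
    return {it: round(100 * passes[it] / totals[it]) for it in totals}

# Hypothetical run log for two iterations
rates = success_rates([
    ("2022-W18", True), ("2022-W18", False),
    ("2022-W20", True), ("2022-W20", True), ("2022-W20", False),
])
print(rates)  # {'2022-W18': 50, '2022-W20': 67}
```

With a dict like this per sprint, the 35-85% spread and its direction are visible at a glance instead of being an impression.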

Pull requests give me a different thing than they give developers, it seems. For me they tell about absolute change control and the reality of what is in the test environment, and reviewing contents is secondary to designing change-based exploratory testing that may grow the executable documentation - or not. Where Jira tickets are the theory, the pull requests are the practice. And many of the features show up as many changes over time, where discussion-based guidance on the order of changes helps test for significant risks earlier on. 

A lot of times my work is nodding to the new unit tests, functional tests in the integration of particular services, and end-to-end tests, and then giving the application a possibility to reveal more than what the tests already revealed - addressing the exploratory testing gap between results based on artifacts and artifacts enhanced with imagination. 

That's the small loop routine, on change.

In addition, there's a feature loop routine. Before a feature starts, I usually work with a product owner to "plan testing of the feature", except I don't really plan the testing of the feature. I clarify scope to a level where I could succeed with testing, and a lot of times that brings out the "NOT list" of things that we are not about to do even though someone might think they too will be included. I use a significant focus on scoping features, scoping what is in a release, what changes on feature level for the release, and what that means for testing on each change, each feature, and the system at hand. 

At the end of a feature loop, I track the things the daily change testing identifies, and ensure I review the work of the team not only on each task, but with the lenses of change, feature, and system. 

I tend to opt in to pick up some of the tasks the team owns on adding executable documentation, setting up new environments, and fixing bugs. The amount of work in this space is always too much for one person, but there is always something I can pitch in on. 

That's the feature loop routine, from starting together with me, to finishing together with me. 

The third loop is on improvement. My personal approach to doing this is a continuous retrospective of collecting metrics, collecting observations, identifying experiments, and choosing the one I personally believe should be THE ONE to pitch to the team just now. I frame this work as "I don't only test products, I also test organizations creating those products". 

It all seems so easy, simple and straightforward. Yet it isn't. It has uncertainty. It has a need for making decisions. It has dependencies on everyone else in the team and a need for communicating. And overall, it works against that invisible task list of finding some of what others have missed for resultful testing. 

Bugs, by definition, are behaviours we did not expect. What sets Exploratory Testing apart from the non-exploratory is that our reference of expectation is not an artifact but human imagination, supported by external imagination of the application and any and all artifacts. 

Saturday, May 28, 2022

Sample More

Testing is a sampling problem. And in sampling, that's where we make our significant mistakes.

The mistake of sampling on the developer's computer leads to the infamous phrases like "works on my computer" and "we're not shipping your computer". 

The mistake of sampling just once leads to the experience where we realise it was working when we looked at it, even if it is clear it does not work as someone else is looking at it. And we go back to our sampling notes of exactly what combination we had, to understand if the problem was in the batch we were sampling, or if it was just that one sample does not make a good test approach.

This week I was sampling. I had a new report intended for flight preparations, including a weather conditions snapshot in time. If a computer can do it once, it should be able to repeat it. But I had other plans for testing it.

I wrote a small script that logged in and captured the report every 10 seconds, targeting 10 000 versions of it. Part of my motivation for doing this was that I did not feel like looking at the user interface. But a bigger part was that I did not have the focus time; I was otherwise engaged pairing with a trainee on the first test automation project she is assigned to. 
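The script itself need not be anything fancy. A minimal sketch of the sampling loop, with the login and report-fetching details abstracted behind a hypothetical fetch_report callable (the real script's endpoints and credentials are not shown here):

```python
import time

def sample(fetch_report, store, target=10_000, interval=10, sleep=time.sleep):
    """Capture `target` snapshots of a report, `interval` seconds apart.

    fetch_report: zero-argument callable returning the report body
    store: callable taking (sample_index, report_body), e.g. writing to disk
    """
    for i in range(target):
        store(i, fetch_report())
        sleep(interval)

# Dry run with a fake fetcher and no real waiting, to show the shape
snapshots = {}
sample(lambda: "report body", snapshots.__setitem__, target=3, sleep=lambda s: None)
print(len(snapshots))  # 3
```

Injecting the sleep function keeps the loop testable without waiting; in unattended use, the defaults give roughly a sample every 10 seconds until the target count is reached.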

It is easy to say in hindsight that turning up the sample size was a worthwhile act of exploratory testing. 

I learned that regular sampling through the user interface acts as a keep-alive mechanism for tokens that then don't expire like I expect them to.

I learned that while expecting a new report every minute, the number of 10-second samples I could fit in between reports varies a lot, and I could explore that timing issue some more.

I learned that given enough opportunities to show change, when change does not happen, something is broken - and with a smaller sample size I could just have been unlucky in not noticing it. 

I learned that sampling allows me to point out times and patterns of our system dying while doing its job. 

I learned that dead systems produce incorrect reports while I expect them to produce no reports. 
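Several of these lessons boil down to comparing consecutive samples. A sketch of flagging the "no change when change was expected" pattern over captured snapshots - the data shape and the threshold parameter are my own assumptions, not from the actual script:

```python
def stale_runs(snapshots, expected_every):
    """Return lengths of runs of identical consecutive snapshots that
    exceed the expected refresh interval (in number of samples)."""
    runs, run = [], 1
    for prev, curr in zip(snapshots, snapshots[1:]):
        if curr == prev:
            run += 1
        else:
            if run > expected_every:
                runs.append(run)
            run = 1
    if run > expected_every:  # don't forget a stale run at the tail
        runs.append(run)
    return runs

# With a new report expected at least every 2 samples, three identical
# snapshots in a row is a suspicious stall worth investigating.
print(stale_runs(["a", "a", "a", "b", "b", "c"], expected_every=2))  # [3]
```

Run over a night's worth of captures, the positions and lengths of such runs are exactly the times-and-patterns evidence of the system dying on the job.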

A single test - sampling many times - provided me more value than I had anticipated. It allowed testing to happen, unattended, until I had time to attend again. It was not automated: I reviewed the logs for the results, tweaked my scripts for the next day to see different patterns, and now make better choices on the values I would like to leave behind for regression concerns. 

This is exploratory testing. Not manual. Not automated. Both. Be smart about the information you are looking for, now and later. Learning matters. 

Friday, May 6, 2022

Salesforce Testing - Components and APIs to Solutions of CRM

In a project I was working on, we used Salesforce as the source of our login data. So I got the hang of the basics and access to both test (we called it QUAT - Quality User Acceptance Testing environment) and production. I learned QUAT got data from yet another system that had a test environment too (we called that one UAT - User Acceptance Testing) with a batch job run every hour, and that the two environments had different data replenish policies. 

In addition to realising that I had become one of the very few people who understood how to get test data in place across the three systems so that you could experience what users really experience, I learned to proactively design test data that wouldn't vanish every six months, and to talk to people across two parts of the organization that could not be any more different.

Salesforce, and business support systems like it, are not systems that product development (R&D) teams maintain. They are IT systems. And even within the same company, those are essentially different frames for how testing ends up being organised. 

Stereotypically, the product development teams want to just use the services and thus treat them as a black box - yet our users have no idea which of the systems in the chain causes trouble. The difference, and the reluctance to own experiences across two such different things, is a risk in terms of clearing up the problems that will eventually happen. 

On the Salesforce component acceptance testing that my team ended up being responsible for, we had very few tests in both test and production environments, and a rule that if those fail, we just know to discuss it with the other team. 

On the Salesforce feature acceptance testing that the other team ended up being responsible for, they tested, with a checklist, the basic flows they had promised to support with every release, and dreamed of automation. 

On a couple of occasions, I picked up the business acceptance testing person and paired with her on some automation. Within a few hours, she learned to create basic UI test cases, but since she did not run and maintain those continuously, the newly acquired skills grew into awareness rather than a change in what fits into her days. The core business acceptance testing person is probably the most overworked person I have gotten to know, and anything most people would ask of her would go through strict prioritisation with her manager. I got a direct route through our mutually beneficial working relationship. 

Later, I worked together with the manager and the business acceptance testing person to create a job for someone specialising in test automation there. And when the test automation person was hired, I helped her and her managers make choices on the tooling, while remembering that it was their work and their choices, and their possible mistakes to live with. 

This paints a picture of a loosely coupled "team" with sparse resources in the company, and change work being done by external contractors. Business acceptance testing isn't testing in the same teams as devs work, but it is work supported by domain specialists with deep business understanding, and now, a single test automation person. 

They chose a test automation tool that I don't agree with, but then again, I am not the one using the tool. So today, I was again thinking back to the choice of this tool, and how testing in that area could be organized. As a response to a probing tweet, I was linked to an article on the Salesforce Developers Blog on UI Test Automation on Salesforce. What that article basically says is that they intentionally hide identifiers and use shadow DOM, and you'll need people and tools that deal with that. Their recommendation is not on the tools, but on the options of who to pay: tool vendor / integrator / internal.

I started drafting the way I understand the world of options here. 

For any functionality that integrates with APIs, OSS Setup 1 (Open Source Setup 1) is possible. It's REST APIs, and the team doing the integration (the integrator) probably finds value for their own work too if we ask them to spend time on this. It is really tempting for the test automation person on the business acceptance testing side to do this too, but it risks delayed feedback, and it is anyway an approximation that does not help the business acceptance testing person make sense of the business flows in their busy schedule and work that focuses on whole business processes. 

The article mentions two GUI open source tools, and I personally used (and taught the business acceptance testing person to use) a third one, namely Playwright. I colour-coded the conceptual difference of getting one more box from the tool versus having to build it yourself, but probably the skills profile you need to create the helper utilities, or to use someone else's helper utilities, isn't that different, provided the open source tool community has plenty of open online material and examples. Locators are where the pain resides, as the platform itself isn't really making it easy - maintenance can be expected, and choosing locators that work can be hard, sometimes even prohibitively hard. Also, this is full-on test automation programming work, and an added challenge is that Salesforce automation work inside your company may be lonely work, and it may not be considered technically interesting by capable people. You can expect the test automation people to spend limited time in the area before longing for the next challenge, and building for sustainability needs attention. 

The commercial tool setup comes into play by having the locator problem outsourced to a specialist team that serves many customers at the same time - adding to the interest and making it a team's job over an individual's job. If only the commercial tool vendors did a little less misleading marketing, some of them might have me on their side. The "no code, anyone can do it" isn't really the core here. It's someone attending to the changes and providing a service. On the other side, what comes out of this is a fully bespoke API for driving the UI, and a closed community helping to figure that API out. The weeks and weeks of courses on how to use a vendor's "AI approach" create a specialty capability profile that I generally don't vouch for. For the tester, it may be great to specialise in "Salesforce Test Tool No 1" for a while, but it also creates a lock-in. The longer you stay in it, the harder it may be to get to do other things too. 

Summing up, how would I make my choices in this space: 

  1. Choose an open source tool with a high community adoption rate to drive the UI, as a capability we grow. Ensure the people we hire learn skills that benefit their career growth, not just what needs testing now.
  2. Teach your integrator. Don't hire your own test automation person if one is enough. Or if you hire one of your own, make them work in a team with the integrator to move feedback down the chain.
  3. Pay attention to bugs you find, and let past bugs drive your automation focus. 


Thursday, May 5, 2022

The Artefact of Exploratory Testing

Sometimes people say that all testing is exploratory testing. This puzzles me, because for sure I have been through, again and again, a framing of testing in organisations that is very far from exploratory testing. It's all about test cases, manual or automated, prepared in advance, maintained while at it, and left for posterity with hopes of reuse. 

Our industry just loves thinking in terms of artefacts - something we produce and leave behind - over focusing on the performance, the right work now for the purposes of now and the future. For that purpose, I find myself now discussing, more often, an artefact of an answer key to all bugs. I would hope we all want one, but if we had one in advance, we would just tick them all off into fixes and no testing would be needed. One does not exist, but we can build one, and we do that by exploratory testing. By the time we are done, our answer key to all bugs is as ready as it will get. 

Keep in mind though that done is not when we release. Exploratory testing tasks in particular come with a tail - following through in various timeframes on what the results ended up being, keeping an attentive ear directed towards the user base, and doing deep dives into the production logs to note patterns changing in ways that should add to that answer key to all the bugs. 

We can't do this work manually. We do it as a combination of attended and unattended testing work, where creating the capabilities for unattended requires us to attend to those capabilities, in addition to the systems we are building. 

As I was writing about all this in a post on LinkedIn, someone commented in a thoughtful way that I found a lot of value in for myself. He told of incredible results and relevant change in the last year. The very same results through relevant change I have been experiencing, I would like to think. 

With the assignment of go find (some of) what others have missed we go and provide the results that make up the answer key to bugs. Sounds like something I can't wait to do more of! 

Thursday, April 21, 2022

The Ghosts Among Us

It started some years ago. I started seeing ghosts. Ghost writers writing blog posts. Ghost conference proposers. Ghost tweeters. And when I saw it, I could not unsee it.

My first encounter with the phenomenon of ghost contributions was when I sought advice on how to blog regularly from one of the people I looked up to for their steady pace of decent content, and learned the answer was "Fiverr". Write a title and a couple of pointers, send the work off to a distant country for a very affordable price, and put your own name on the resulting article. You commissioned it. You outlined it. You review it, and you publish it. 

If you did such a thing as part of your academic work, it would be a particular type of plagiarism where, while you are not infringing someone else's copyright, you are ethically way off the mark. But for buying a commercial service and employing someone, there is an ethical consideration, yet it is less clear cut. 

Later, I organised a conference with a call for collaboration. This meant we scheduled a call with everyone proposing a session. Not just the best-looking titles, every individual. It surprised me when the ghosts emerged. There were multiple paper proposals by men where the scheduling conversation was with a woman, and multiple where the actual conversation was with a woman they had employed to represent them. The more I dug, the more I realised: while one name shows up on the stage, the entire process up to that point, including creating the slides, may be by someone completely invisible. 

As someone who mentors new speakers, partially ghost-writing their talk proposal has often been a service I provide. I listen to them speak about their idea. I combine that with what I know of the world of testing, and present their thing in writing in the best possible light. What I do is light ghosting, since I very carefully collect their words, and many times just reorganise words they already put on paper. My work as a mentor is to stay hidden and let them shine. 

Not long ago, I got a chance to talk with a high-profile woman in the communications field about social media presence. Surprised, I learned she was the ghost of multiple CxO-level influencers in tech. She knew what they wanted to say, collected their words for them, and stayed invisible for a compensation. 

I'm tired of the financial imbalance where ghost writing is a thing for some to do and others to pay for. I'm tired that it is so often a thing where women are rendered invisible and men become more visible, in an industry where people already imagine women don't exist. Yay for being paid. But being paid to become a ghost in the eyes of the audience is just wrong.

It's a relevant privilege to be financially able to pay someone to do the work and remain invisible. Yet it is the dynamic that the world runs on. And it seems to disproportionately erase the work of women, and particularly women in low-income societies. It can only be fixed by the privileged actively doing the work of sharing the credit, or even allocating the credit where it belongs. 

We could start with a book - where illustrations play as big if not a bigger role than the text - and add the illustrator's name on the cover. We should know Adrienn Szell's name.

I have not yet decided if I should play the game and pay my share - after all, I have acquired plenty of privilege by now - or if I should use my platform to just point this out in order to erase it. 

It makes my heart ache when a woman calls me because her 6 months of daily work is erased by the men around her, who say the work she put 6 months into is the achievement of a man who was visiting for 6 months to work as her pair. 

It makes my heart ache that I have to point out to my managers, every single time, what and how I contribute. 

It makes my heart ache that Patricia Aas needs to give advice that includes not sharing an idea without an artefact (slides - 10, demo -11), and be absolutely right about how things are for women in IT. 

We can't mention the underprivileged too much for their contributions. There's work to do there. Share the money, share the credit. 

Friday, April 15, 2022

20 years of teaching testing

I remember the first testing courses I delivered back in the day. I had a full day or even two days of time with the group. I had a topic (testing) to teach and I had split it into subtopics. Between the various topics - lectures - I had exercises. Most of the exercises were about brainstorming ideas for testing or collecting experiences of the group. I approached the lectures early on as summaries of things I had learned from others, and later as summaries of my stories of how I had done testing in projects. People liked the courses, but it bothered me that while we learned about things around testing, we did not really learn testing.

Some years into teaching, I mustered the courage and created a hands-on course on exploratory testing. I was terrified, because while teaching through telling stories requires you to have those stories, teaching hands-on requires you to be comfortable with things you don't know but will discover. I had people pairing on the courses, and we did regular reflections on observations from the pairs to sync up with the ideas I had guided people to try in the sessions. We moved from "find a bug, any bug" to taking structured notes and intertwining testing for information about problems with testing for ideas about future sessions of testing. The pairs could level up to the better of the pair, and if I moved people around to new pairs, I could distribute ideas a little, but generally new pairs took a lot of time establishing common ground. People liked the course, but it bothered me that while the students got better at testing, they did not get to the levels I hoped I could guide them to. 

When I then discovered there can be such a thing as ensemble testing (I started to apply ensemble programming very specifically to the testing domain), a whole new world of teaching opened up. I could learn the level each of my students contributes on with the testing problems. I could have each of them teach the others from their perspective. And I could still level up the entire group on things I had that were not present in the group, by taking the navigator role and modelling what I believed good would look like. 

I have now been teaching with ensemble testing for 8 years, and I consider it a core method more teachers would benefit from using. Ensemble testing combined with nicely layered task assignments that stretch the group to try out different flavours of testing skills is brilliant. It allows me to teach programming to newbies fairly comfortably in a timeframe shorter than folklore lets us assume programming can be taught in. And it reinforces the learning of the students by the students becoming teachers of one another as they contribute together on the problems. 

There is still a need for trainings that focus on the stories from projects. The stories give us ideas, and hope, and we can run far with that too. But there is also the need for the hands-on, skills-oriented learning that ensemble testing has provided me. 

In an ensemble, we have a single person at the computer, and this single person is our hands. The hands don't decide what to do; they follow the brains, who make sense of all the voices and call a decision. We rotate the roles regularly. Everyone is learning and contributing. In a training session, we expect everyone to be learning a little more and thus contributing a little less, but growing in the ability to contribute as the work progresses. The real responsibility for getting the task done is not with the learners, but with the teacher. 

We have now taught 3 of 4 half-day sessions at our work in ensemble testing format for the Python for Testing course, and the feedback reinforces what I have explained. 
"The structure is very clear and well prepared. Everything we have done I've thought I knew but I'm picking up lots of new information. Ensemble programming approach is really good and I like that the class progresses at the same pace."
"I've learned a lot of useful things but I am concerned I will forget everything if I don't get to use those things in my everyday work. The course has woken up my coding brain and it has been easier to debug and fix our test automation cases."
"There were few new ideas and a bunch of better ways of doing things I have previously done in a very clumsy way. Definitely worth it for me. It would be good to take time to refactor our TA to use the lessons while they are fresh."

The best feedback on a course done where I work is seeing new test automation emerge - like we have seen - in the weeks between the sessions. The lessons are being used, with the woken-up coding brain or with the specific tool and technique we just taught.