Friday, July 1, 2022

Testing on THEIR production

Many years ago, a candidate was seeking employment as a software tester on a team I was interviewing candidates for. The candidate had done prep work and tested the company's web site looking for functional, performance and security problems. They had caused relevant load (preventing others from using the site), found functionalities that did not match their expectations and had ideas of possible vulnerabilities. They were, however, completely oblivious to the idea that other organisations' production environments are available for *fair use* as per *intended purposes*, and testing is not an intended purpose of production environments. They had caused multiple denial of service attacks on a site that was not built to resist them, and considered it a success. We did not. We considered it unethical, bordering on illegal, and did not hire.

In the years since, I have taught on every single course that we as testers need to be aware of not only what we test, but where we test too. THEIR production isn't our test environment.

When I discovered a security bug in Foodora that allowed me to get food without paying, I did my very best to avoid hitting that bug, because I did not want to spend time on reporting it. THEIR production was not my test environment. My inability to avoid it led some folks in the security community to speak poorly of me: I was unwilling to do the work, yet mentioned (without details) that such a problem existed, after I had done the work I did not want to do on helping them fix it. They considered that since I knew how to test (and was more aware of how the bug could be reproduced), my responsibilities were higher than a user's. I considered requiring free use of my professional skills unfair.

What should be clear though: 

Other organisations' production is not your test environment. That is just not how we should roll in this industry.

When I teach testing, I teach on other people's software deployed to my own test environment. When I test in production, I do so because my own company asks and consents to it. When I test on other people's production, I do that to provide a service they have asked for and consented to. 

There are some parallels here to web scraping, which isn't illegal. The legal system is still figuring out "good bots" and "bad bots", requiring us to adhere to fair use and explicitly agreed terms of use to protect data ownership.

Building your scrapers and testing web sites are yet a different use case from running scrapers. When building and testing, we have unintentional side effects. When testing in particular, we look for things that are broken and can be made more broken by specific use patterns.

Testing on someone else's production isn't ethically what we should do, even if legally it may be a grey area. We can and should test on environments that exist for that purpose.

Regularly I still come across companies recruiting with a take-home assignment of automating against someone else's production. Asking a newer tester to show their skills by potentially causing denial of service impacts, without the consent of the company whose site is being tested, is not recommended. Would these people have the standing to say no? Most likely not.

So today I sent two emails. One to a testing contractor company using a big popular web shop as their test target, letting them know that they should have permission before making their candidates test on other people's production. Another to the big popular web shop, to let them know which company is risking their production for its test recruiting purposes.

The more we know, the more we can avoid unintentional side effects, but even then - THEIR production isn't your test environment. Stick to fair use and start your learning / teaching on sites with consent for such a pattern.

Wednesday, June 29, 2022

Fundamental Disagreements

There are things in testing that never get old for me. The intellectual puzzle of uncovering the answer key to bugs that matter, and the dynamics of removing friction from uncovering bugs to such a level that the bugs never really existed. Figuring out communication that conveys things the people in this particular conversation would otherwise miss, with realisation that writing a specification is the wrong mindset on a learning problem. The results in efficiency and effectiveness, and building something of relevance that works. 

As a tester, while I center information about products and organizations, I'm also an active agent in the system that creates the products.  We're builders, building together.

I felt the need of saying this once again seeing Michael Bolton proclaim: 

So, for the thousandth time: testing does not make things better. Weighing yourself does not improve your health. Weighing yourself informs decisions about what you might do with regard to your health.

Me and my tester colleagues in projects are more than a measurement device like a scale. If we were like a scale, we would be a commodity. We choose the information we produce based on risks, and make decisions when we call out the risks we focus on. We find information and choose to present and act on it. We have agency and a place at the table to do something with that information.

Testing is about making things better. It is about selecting the information that would be valuable enough to make things better. And it is about timely application of that information, sometimes so early that we avoid making certain mistakes. To make things better, we need to be founded in empiricism and experimentation. We learn about the state of things and we actively change the state of things every single day at work. We don't decide alone but in collaboration with others. 

It's not enough to know; we have long since moved to the combination of knowing AND doing something about it.

We want to know some things continuously and we create test automation systems to replenish that information in a world that keeps moving, that we want to move. 

We want to know some things deeply, and spend time thinking about them with product as our external imagination, without caring to capture all of that to test automation systems. 

We build better people, better products and better societal impacts by actively participating in those conversations. We consume as much information as we produce. 

We can decide and we should. 

While this is a fundamental disagreement, acknowledging it as such is what could move us forward.

Monday, June 20, 2022

Untested testing book for children

Many moons ago when I had small children (I now have teenagers), I saw an invite to a playtesting session by Linda Liukas. In case you don't know who she is, she is a superstar and one of the loveliest, most approachable people I have had the pleasure of meeting. She has authored multiple children's books on the adventures of Ruby and learning programming. I have all her books, I have purchased her books for my kids' schools, and I taught the lower grades programming with her brilliant exercises while I still had kids of that age. For years I have been working from the observation that kids, and girls in particular, are social learners, and the best way to transform age groups is to take them all along for the ride.

The playtesting experience - watching it as a parent specializing in testing - was professional. From seeing kids try out the new exercises to the lessons instilled, my kids still remember the pieces of a computer that were the topic of the session, in addition to the fact that I have dragged them along to fangirl Linda's work for years.

So when I heard that there is now a children's book by another Finnish author that teaches testing to children, I was intrigued but worried. Matching Linda's work is a hard task. Linda, being a software tester in the past while also being a programmer advocate and a renowned author and speaker in this space, sets a high bar. So I had avoided the book "Dragons Out" by Kari Kakkonen until now, when EuroSTAR publicised that they have the book available on their hub.

However, this experience really did not start on a good foot. 

First, while promotional materials led me to think the book was available, what actually was available was an "ebook", meaning one chapter of the book and some marketing text. Not quite what I had understood.

Second, I was annoyed by the fact that a children's book where pictures play such a strong role is not promoted with the name of the illustrator. Actually, the illustrator is well hidden, and Adrienn Szell's work does not get the attribution it deserves with only a mention on the pages that people don't read. Excusing the misattribution of a gifted artist's work by not crediting her as second author works against my sense of justice.

So I jumped into the sample, to see what I get. 

I get to the abstract, and start with annoyance. It announces "male and female knights" and I wonder why we have to have children's books where they could be just knights, or at least boys/girls or men/women, over getting identified by their reproductive systems. Knights of all genders, please, and I continue.

Getting into the book beyond the front page that keeps Adrienn invisible, I find her mentioned.

"Ragons". "Cand". Typos hit me next. Perhaps I am looking at an early version and these are not in the printed copy?

Just starting with the story gives me concerns. Why would someone start chapter 1 of teaching testing to children with *memory leaks*? Reading the first description of a village, and the commentary that it represents software while sheep are memory, I am already tracking in my head where the story can be heading.

For a village being software, that village is definitely not going to exist in the sheep? I feel the curse of fuzzy metaphors hitting me and continue.

The second chapter makes me convinced that the book could use a translator. The sentences feel like Finglish - a translation from Finnish to English. Gorge cannot really be what they meant? Or at least it has to be too elaborate a word for describing cracks into which sheep can vanish. Sentences like "sheep had stumbled into rock land" sound almost Google-translated. The language is getting in the way. "Laura began to suspect that something else than dangerous gorges was now." leaves me totally puzzled about what this is trying to say.

Realising the language is going to be a problem, I decide to give less time to the language, and just try to make sense of the points. The first chapter introduces the first dragon, and dragons are defects. This particular dragon causes loss of sheep, which is loss of memory. And dragons are killed by developers who are also testers and live elsewhere.

We could discuss how to choose metaphors but they all are bad in some ways, so I can live with this metaphor. There are other things that annoy me though.

When a developer makes an error, she is a woman. That is, when the explanation text introduces dragons as defects, we read that "developer has made an error in her coding". Yet, as soon as we seek a developer to fix it, we load on a different gender with "he is known to be able to remove problems, so he is good enough". Talk about loading subliminal gender roles here.

What really leaves me unhappy is that this chapter said *nothing* about testing. The testing done here (noticing sheep missing by systematically counting them every day) was not done by the knights representing developers/testers. The book starts with a story that tells us that dragons just emerge without us leaving anything undone, and presents those *unleashing the dragons* as saviors of the sheep instead of those responsible for the loss of the sheep in the first place. The book takes the effort of making the point that knights are not villagers, that developers/testers are not users, and yet leaves all of the testing to the villagers and gives only debugging (which is NOT testing) to the developers/testers.

If it is a book about testing, it is a book about bad testing. Let's have one developer set up fires, wait for users to notice them, and have another developer extinguish the fire! Sounds like testing?!? Not really.

On the nature of this red dragon (the memory leak), the simplifications made me cringe and I had to wonder: has the author ever been part of doing more than what the villagers do (noting sheep missing) with regard to memory leaks?

This is a testing book for children, untested or at least unfixed. Not recommended. 

Unlearning is harder than learning the right things in the first place, so this one gets a no from me. If I cared about testing the book, setting up some playtesting sessions to see engagement and retention of concepts would be my recommendation. However, that recommendation comes late to the project.

Monday, June 13, 2022

Testing, A Day At a Time

A new day at work, a new morning. What is the first thing you do? Do you have a routine for how you go about doing the testing work you frame your job description around? How do you balance fast feedback and thoughtful development of executable documentation with improvement, the end game, the important beginning of a new thing, and being around all the time? Especially when there are many of them, developers, and just one of you, a tester.

What I expect is not that complicated, yet it seems to be just that complicated. 

I start and end my days with looking at two things: 

  • pull requests - what is ready or about to be ready for change testing
  • pipeline - what of the executable documentation we have tells us of being able to move forward

If the pipeline fails, it needs to be fixed, and we work with the team on the idea of learning to expect a green pipeline with each change - with success rates measured over the last 10 two-week iterations being 35-85 %, and a trend that isn't in the right direction, with an excuse of architectural changes.
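Measuring that expectation can be made concrete. A minimal sketch of computing the per-iteration success rate and a crude trend from pipeline run results (the data shape is hypothetical; a real CI server would expose this through its own API):

```python
def pipeline_success_rate(runs):
    """Percentage of green pipeline runs in one iteration.

    runs is a list of booleans, one per run (True = green). The shape
    is an assumption for illustration, not any particular CI's API.
    """
    return round(100 * sum(runs) / len(runs), 1)


def trend(rates):
    """Crude trend over iterations: mean of the later half minus the earlier.

    Positive means the pipeline is getting greener; negative means the
    'architectural changes' excuse is wearing thin.
    """
    mid = len(rates) // 2
    first, last = rates[:mid], rates[mid:]
    return sum(last) / len(last) - sum(first) / len(first)
```

Even this crude a number, replenished every iteration, turns "the pipeline feels flaky" into something the team can be held to.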

Pull requests give me a different thing than they give developers, it seems. For me, they tell about absolute change control and the reality of what is in the test environment, and reviewing contents is secondary to designing change-based exploratory testing that may grow the executable documentation - or not. Where Jira tickets are the theory, the pull requests are the practice. And many of the features show up as many changes over time, where discussion-based guidance on the order of changes helps test for significant risks earlier on.

A lot of times my work is nodding to the new unit tests, functional tests in integration of particular services, end to end tests and then giving the application a possibility to reveal more than what the tests already revealed - addressing the exploratory testing gap in results based on artifacts and artifacts enhanced with imagination. 

That's the small loop routine, on change.

In addition, there's a feature loop routine. Before a feature starts, I usually work with a product owner to "plan testing of the feature", except I don't really plan the testing of the feature. I clarify scope to a level where I could succeed with testing, and a lot of times that brings out the "NOT list" of things that we are not about to do even though someone might think they too will be included. I use a significant focus on scoping features, scoping what is in a release, what changes on feature level for the release, and what that means for testing on each change, each feature, and the system at hand. 

At the end of a feature loop, I track things the daily change testing identifies, and ensure I review the work of the team not only on each task, but with the lenses of change, feature and system.

I tend to opt in to pick up some of the tasks the team owns on adding executable documentation, setting up new environments, and fixing bugs. The amount of work in this space is always too much for one person, but there is always something I can pitch in on.

That's the feature loop routine, from starting together with me, to finishing together with me. 

The third loop is on improvement. My personal approach to doing this is a continuous retrospective of collecting metrics, collecting observations, identifying experiments, and choosing which one I personally believe should be THE ONE I could pitch in just now for the team. I frame this work as "I don't only test products, I also test organizations creating those products". 

It all seems so easy, simple and straightforward. Yet it isn't. It has uncertainty. It has a need for making decisions. It has dependencies on everyone else in the team and a need for communicating. And overall, it works against that invisible task list of finding some of what others have missed, for resultful testing.

Bugs, by definition, are behaviours we did not expect. What sets Exploratory Testing apart from the non-exploratory is that our reference of expectation is not an artifact but human imagination, supported by external imagination of the application and any and all artifacts. 

Saturday, May 28, 2022

Sample More

Testing is a sampling problem. And in sampling, that's where we make our significant mistakes.

The mistake of sampling on the developers computer leads to the infamous phrases like "works on my computer" and "we're not shipping your computer". 

The mistake of sampling just once leads to the experience where we realise it was working when we looked at it, even if it is clear it does not work as someone else is looking at it. And we go back to our sampling notes of exactly what combination we had, to understand if the problem was in the batch we were sampling, or if it was just that one sample does not make a good test approach.

This week I was sampling. I had a new report intended for flight preparations, including a snapshot of weather conditions at a point in time. If a computer can do it once, it should be able to repeat it. But I had other plans for testing it.

I wrote a small script that logged in and captured the report every 10 seconds, targeting 10 000 versions of it. Part of my motivation for doing this was that I did not feel like looking at the user interface. But a bigger part was that I did not have the focus time; I was otherwise engaged pairing with a trainee on the first test automation project she is assigned to.

It is easy to say in hindsight that the activity of turning up the sample size was a worthwhile act of exploratory testing.

I learned that regular sampling through the user interface acts as a keep-alive mechanism for tokens that then don't expire like I expect them to.

I learned that for a report expected to be new every minute, the number of 10-second samples I could fit in varied a lot, and I could explore that timing issue some more.

I learned that given enough opportunities to show change, when change does not happen, something is broken and I could just be unlucky in not noticing it with smaller sample size. 

I learned that sampling allows me to point out times and patterns of our system dying while doing its job. 

I learned that dead systems produce incorrect reports while I expect them to produce no reports. 

A single test - sampling many times - provided me more value than I had anticipated. It allowed testing to happen unattended, until I had time to attend again. It was not automated: I reviewed the logs for the results, tweaked my scripts for the next day to see different patterns, and now make better choices on the values I would like to leave behind for regression concerns.

This is exploratory testing. Not manual. Not automated. Both. Be smart about the information you are looking for, now and later. Learning matters. 

Friday, May 6, 2022

Salesforce Testing - Components and APIs to Solutions of CRM

In a project I was working on, we used Salesforce as the source of our login data. So I got the hang of the basics, and access to both test (we called it QUAT - Quality User Acceptance Testing environment) and production. I learned QUAT got data from yet another system that had a test environment too (we called that one UAT - User Acceptance Testing) with a batch job run every hour, and that the two environments had different data replenishment policies.

In addition to realising that I had become one of the very few people who understood how to get test data in place across the three systems so that you could experience what users really experience, I learned to proactively design test data that wouldn't vanish every six months, and to talk to people across two parts of the organization that could not be any more different.

Salesforce, and business support systems like that, are not systems product development (R&D) teams maintain. They are IT systems. And even within the same company, those are essentially different frames for how testing ends up being organised. 

Stereotypically, the product development teams want to just use the services and thus treat them as a black box - yet our users have no idea which of the systems in the chain causes trouble. The difference, and the reluctance to own experiences across two such different things, is a risk in terms of clearing up the problems that will eventually happen.

For the Salesforce component acceptance testing that my team ended up being responsible for, we had very few tests in both test and production environments, and a rule that if those fail, we just know to discuss it with the other team.

For the Salesforce feature acceptance testing that the other team ended up being responsible for, they tested, with a checklist, the basic flows they had promised to support with every release, and dreamed of automation.

On a couple of occasions, I picked up the business acceptance testing person and paired with her on some automation. Within a few hours, she learned to create basic UI test cases, but since she did not run and maintain those continuously, the newly acquired skills grew into awareness rather than a change in what to fit into her days. The core business acceptance testing person is probably the most overworked person I have gotten to know, and anything most people would ask of her would go through strict prioritisation with her manager. I got a direct route through our mutually beneficial working relationship.

Later, I worked together with the manager and the business acceptance testing person to create a job for someone specialising in test automation there. And when the test automation person was hired, I helped her and her managers make choices on the tooling, while remembering that it was their work and their choices, and their possible mistakes to live with. 

This paints a picture of a loosely coupled "team" with sparse resources in the company, and change work being done by external contractors. Business acceptance testing isn't testing in the same teams as devs work, but it is work supported by domain specialists with deep business understanding, and now, a single test automation person. 

They chose a test automation tool that I don't agree with, but then again, I am not the one using that tool. So today, I was again thinking back to the choice of this tool, and how testing in that area could be organized. In response to a probing tweet, I was linked to an article on the Salesforce Developers Blog on UI Test Automation on Salesforce. What that article basically says is that they intentionally hide identifiers and use shadow DOM, and you'll need people and tools that deal with that. Their recommendation is not on the tools, but on the options of who to pay: tool vendor / integrator / internal.

I started drafting the way I understand the world of options here. 

For any functionality that is integrating with APIs, the OSS Setup 1 (Open Source Setup 1) is possible. It's REST APIs, and the team doing the integration (the integrator) probably finds value for their own work too if we ask them to spend time on this. It is really tempting for the test automation person on the business acceptance testing side to do this too, but it risks delayed feedback and is anyway an approximation that does not help the business acceptance testing person make sense of the business flows within their busy schedule and work that focuses on whole business processes.

The article mentions two GUI open source tools, and I personally used (and taught the business acceptance testing person to use) a third one, namely Playwright. I colour-coded a conceptual difference between getting one more box from the tool and having to build it yourself, but probably the skills profile you need to create the helper utilities versus using someone else's helper utilities isn't that different, provided the open source tool community has plenty of open online material and examples. Locators are where the pain resides, as the platform itself isn't really making it easy - maintenance can be expected, and choosing locators that work can be hard, sometimes even prohibitively hard. Also, this is full-on test automation programming work, and an added challenge is that Salesforce automation work inside your company may be lonely work, and it may not be considered technically interesting by capable people. You can expect the test automation people to spend limited time in the area before longing for the next challenge, and building for sustainability needs attention.
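Because generated ids rot between deployments, the locator strategy matters more than the tool. A sketch of the prioritisation I would apply, written as a pure helper that picks the most resilient Playwright-style locator an element offers (the attribute names and the ordering are my own heuristic, not anything Salesforce or Playwright documents):

```python
def pick_locator(attrs):
    """Pick the most resilient locator call for a Lightning element.

    attrs is a dict of what the element exposes. Generated ids
    (e.g. id="input-37") change between deployments, so user-visible
    anchors - labels, roles, text - come first and the id comes last.
    Returns the locator expression as a string, for illustration.
    """
    if "label" in attrs:
        # Labels survive redeployments; Playwright's locators also
        # pierce open shadow roots, so this works inside components.
        return f'get_by_label("{attrs["label"]}")'
    if "role" in attrs and "name" in attrs:
        return f'get_by_role("{attrs["role"]}", name="{attrs["name"]}")'
    if "text" in attrs:
        return f'get_by_text("{attrs["text"]}")'
    # Last resort: the brittle generated id.
    return f'locator("#{attrs["id"]}")'
```

For example, `pick_locator({"label": "Account Name"})` prefers the label, while an element exposing only a generated id falls through to the brittle option that will need maintenance.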

The commercial tool setup comes into play by having the locator problem outsourced to a specialist team that serves many customers at the same time - adding to the interest and making it a team's job over an individual's job. If only the commercial tool vendors did a little less misleading marketing, some of them might have me on their side. The "no code, anyone can do it" isn't really the core here. It's someone attending to the changes and providing a service. On the other side, what comes out of this is a fully bespoke API for driving the UI, and a closed community helping to figure that API out. The weeks and weeks of courses on how to use a vendor's "AI approach" create a specialty capability profile that I generally don't vouch for. For the tester, it may be great to specialise in "Salesforce Test Tool No 1" for a while, but it also creates a lock-in. The longer you stay in it, the harder it may be to get to do other things too.

Summing up, how would I be making my choices in this space: 

  1. Choose an open source tool with a high community adoption rate to drive the UI, as a capability we grow. Ensure people we hire learn skills that benefit their career growth, not just what needs testing now.
  2. Teach your integrator. Don't hire your own test automation person if one is enough. Or if you hire one of your own, make them work in teams with the integrator to move feedback down the chain.
  3. Pay attention to bugs you find, and let past bugs drive your automation focus. 


Thursday, May 5, 2022

The Artefact of Exploratory Testing

Sometimes people say that all testing is exploratory testing. This puzzles me, because for sure I have been through, again and again, a frame of testing in organisations that is very far from exploratory testing. It's all about test cases, manual or automated, prepared in advance, maintained while at it, and left for posterity with hopes of reuse.

Our industry just loves thinking in terms of artefacts - something we produce and leave behind - over focusing on the performance, the right work now for the purposes of now and the future. For that purpose, I find myself now discussing, more often, an artefact of an answer key to all bugs. I would hope we all want one, but if we had one in advance, we would just tick them all off into fixes and no testing would be needed. One does not exist, but we can build one, and we do that by exploratory testing. By the time we are done, our answer key to all bugs is as ready as it will get.

Keep in mind though that done is not when we release. Exploratory testing tasks in particular come with a tail - following through to various timeframes on what the results ended up being, keeping an attentive ear directed towards the user base and doing deep dives in the production logs to note patterns changing in ways that should add to that answer key to all the bugs. 

We can't do this work manually. We do it as a combination of attended and unattended testing work, where creating capabilities for the unattended requires us to attend to those capabilities, in addition to the systems we are building.

As I was writing about all this in a post on LinkedIn, someone commented in a thoughtful way that I found a lot of value in. He told of incredible results and relevant change in the last year. The very same results through relevant change I have been experiencing, I would like to think.

With the assignment of *go find (some of) what others have missed*, we go and provide the results that make up the answer key to bugs. Sounds like something I can't wait to do more of!

Thursday, April 21, 2022

The Ghosts Among Us

It started some years ago. I started seeing ghosts. Ghost writers writing blog posts. Ghost conference proposers. Ghost tweeters. And when I saw it, I could not unsee it.

My first encounter with the phenomenon of ghost contributions was when I sought advice on how to blog regularly from one of the people I looked up to for their steady pace of decent content, and learned the answer was "Fiverr". Write a title and a couple of pointers, send the work off to a distant country for a very affordable price, and put your own name on the resulting article. You commissioned it. You outlined it. You review it, and you publish it.

If you did such a thing as part of your academic work, it would be a particular type of plagiarism where, while you are not infringing someone else's copyright, you are ethically way off the mark. But for buying a commercial service and employing someone, there is an ethical consideration, but it is less clear cut.

Later, I organised a conference with a call for collaboration. This meant we scheduled a call with everyone proposing a session. Not just the best looking titles - every individual. It surprised me when the ghosts emerged. There were multiple paper proposals by men where the scheduling conversation was with a woman, and multiple proposals by men where the actual conversation was with a woman they had employed to represent them. The more I dig, the more I realise: while one name shows up on the stage, the entire process up to that point, including creating the slides, may be by someone completely invisible.

As someone who mentors new speakers, partially ghost writing their talk proposal has often been a service I provide. I listen to them speak about their idea. I combine that with what I know of the world of testing, and present their thing in writing in the best possible light. What I do is light ghosting, since I very carefully collect their words, and many times just reorganise words they already put on paper. My work as a mentor is to stay hidden and let them shine.

Not long ago, I got a chance to talk with a high profile woman in the communication field about social media presence. Surprised, I learned she was the ghost of multiple CxO level influencers in tech. She knew what they wanted to say, collected their words for them, and stayed invisible for a compensation.

I'm tired of the financial imbalance where ghost writing is a thing for some to do and others to pay for. I'm tired that it is so often a thing where women are rendered invisible and men become more visible, in an industry where people already imagine women don't exist. Yay for being paid. But being paid to become a ghost in the eyes of the audience is just wrong.

It's a relevant privilege to be financially able to pay for someone to do the work and remain invisible. Yet it is the dynamic that the world runs on. And it seems to disproportionately erase the work of women, and particularly women in low-income societies. It can only be fixed by the privileged actively doing the work of sharing the credit, or even allocating the credit where it belongs.

We could start with a book - where illustrations play as big a role as the texts, if not bigger - and add the illustrator's name on the cover. We should know Adrienn Szell.

I have not yet decided if I should play the game and pay my share - after all, I have acquired plenty of privilege by now - or if I should use my platform to point this out and help erase it. 

It makes my heart ache when a woman calls me because six months of her daily work is being erased by the men around her, who say that the work she put six months into is the achievement of a man who was visiting for those six months to work as her pair. 

It makes my heart ache when I have to, every single time, point out to my managers what and how I contribute. 

It makes my heart ache that Patricia Aas needs to give advice that includes not sharing an idea without an artefact (slides - 10, demo - 11), and that she is absolutely right about how things are for women in IT. 

We can't credit the underprivileged too much for their contributions. There's work to do there. Share the money, share the credit. 

Friday, April 15, 2022

20 years of teaching testing

I remember the first testing courses I delivered back in the day. I had a full day or even two days of time with the group. I had a topic (testing) to teach and I had split that into subtopics. Between the topics - lectures - I had exercises. Most of the exercises were about brainstorming ideas for testing or collecting experiences from the group. Early on I approached the lectures as summaries of things I had learned from others, and later as summaries of my stories of how I had done testing in projects. People liked the courses, but it bothered me that while we learned about things around testing, we did not really learn testing.

Some years into teaching, I mustered the courage and created a hands-on course on exploratory testing. I was terrified, because while teaching through telling stories requires you to have those stories, teaching hands-on requires you to be comfortable with things you don't know but will discover. I had people pairing on the courses and we did regular reflections on observations from the pairs, to sync up with ideas that I had guided people to try in the sessions. We moved from "find a bug, any bug" to taking structured notes and intertwining testing for information about problems with testing for ideas about future sessions of testing. The pairs could level up to the better of the pair, and if I moved people around to new pairs I could distribute ideas a little, but generally new pairs took a lot of time establishing common ground. People liked the course, but it bothered me that while the students got better at testing, they did not get to the levels I hoped I could guide them to. 

When I then discovered there can be such a thing as ensemble testing (applying ensemble programming specifically to the testing domain), a whole new world of teaching opened up. I could learn the level each of my students contributes at on the testing problems. I could have each of them teach the others from their perspective. And I could still level up the entire group on things I had that were not present in the group, by taking the navigator role and modelling what I believed good would look like. 

I have now been teaching with ensemble testing for 8 years, and I consider it a core method more teachers would benefit from using. Ensemble testing combined with nicely layered task assignments that stretch the group to try out different flavours of testing skills is brilliant. It allows me to teach programming to newbies fairly comfortably, in a timeframe shorter than folklore lets us assume programming can be taught in. And it reinforces the students' learning, as the students become teachers of one another while contributing together on the problems. 

There is still a need for trainings that focus on the stories from projects. The stories give us ideas, and hope, and we can run far with that too. But there is also a need for the hands-on, skills-oriented learning that ensemble testing has provided me. 

In an ensemble, we have a single person on a computer, and this single person is our hands. The hands don't decide what to do; they follow the brains, which make sense of all the voices and call a decision. We rotate the roles regularly. Everyone is learning and contributing. In a training session, we expect everyone to be learning a little more and thus contributing a little less, but growing in ability to contribute as the work progresses. The real responsibility for getting the task done is not with the learners, but with the teacher. 

We have now taught three of the four half-day sessions at work in ensemble testing format for our Python for testing course, and the feedback reinforces what I have explained. 
"The structure is very clear and well prepared. Everything we have done I've thought I knew but I'm picking up lots of new information. Ensemble programming approach is really good and I like that the class progresses at the same pace."
"I've learned a lot of useful things but I am concerned I will forget everything if I don't get to use those things in my everyday work. The course has woken up my coding brain and it has been easier to debug and fix our test automation cases."
"There were few new ideas and a bunch of better ways of doing things I have previously done in a very clumsy way. Definitely worth it for me. It would be good to take time to refactor our TA to use the lessons while they are fresh."

The best feedback on a course done where I work is seeing new test automation emerge - like we have seen - in the weeks between the sessions. The lessons are being used, with the woken-up coding brain or with the specific tool and technique we just taught. 



Friday, April 1, 2022

Why Ensemble Programming/Testing isn't Group Programming/Testing

It's been two years since I took action on the words I use around this particular style of collaborative software testing (and development), and decided I would no longer exclude people uncomfortable with the term mobbing, which by dictionary definition means bullying of an individual by a group. Repurposing mobbing to mean a collaborative programming style that centers kindness, consideration and respect just felt like too much of a dissonance. While I don't have the energy to change the world or those who don't follow and choose to change themselves, I committed to changing myself. That alone was quite a significant energy commitment. 

I learned a new language. I would talk of ensembling, of ensemble programming and ensemble testing. Being the person who led the way in figuring out ensemble testing in the first place, I went back to my old writings and replaced the terminology. I had a book out that I renamed to Ensemble Programming Guidebook. I changed the domain names I had. I made my peace with the fact that I might find myself alone in speaking of the common thing with a new term. 

Soon I noticed I had made that peace too early, as many people I hold dear followed. Lisi Hocke, Lisa Crispin, Emily Bache - and with Emily, a lot of the programming world. 

To discover the new word, I collected and investigated options. I learned what different groups of animals are called, and the final selected word was one I would not have come by without the essential contribution of Denise Yu. As per our mutual agreement, the term is a result of collaboration, and the work we both did was essential. 

For two years now, the word has coexisted with the original. It has made its way to Wikipedia, and it is understood as a synonym even amongst people like the Mob Mentality Show, who have visible high stakes in the original name. Renaming your own thing isn't easy, and the ripple effects are quite a commitment. 

I personally have reached a point where I no longer say "ensemble programming, previously known as mob programming"; the term lives on its own. I still need to explain it, but the conversations around the chosen term are no longer about how negative the connotations are, rather about how ensemble is a difficult word for native English speakers. We could also have said group/team programming, as those were the options back then, so I wanted to reiterate today why those words did not get chosen.

Ensemble programming as a word tries to say this technique is different than just some group / team getting together to work. 

Group programming / testing says: 'all hands', with multiple computers. 
Ensemble programming / testing says: 'one set of hands', with a single computer (or a few for some activities). 

Group programming / testing says: 'all voices, for themselves, you overhear some stuff'. 
Ensemble programming / testing says: 'all voices, one at a time, listen to the voice'. 

Both say all brains, but in a very different composition. A group has brains co-located, doing their own thing. An ensemble has brains co-located, doing the same thing. 

With these distinctions in mind, I look back at a training class I gave this week at work. We had the whole group working together through a single pair of hands that rotated. We had primarily two voices taking turns in guiding the work, as the group did not yet know how to do the work they were there to learn. 

In an ensemble, the hands listen to and follow the voices. We work as a single mind that is stronger together, hearing the contributions through verbalizing them. 

Group / team testing does not come even close to expressing how this dynamic is different. 

Saturday, March 5, 2022

Services I provide for my team

I have now been with a new team for two months, and my work with the team is starting to take a recognizable shape. I'm around, and I hold space for testing. I define, with examples, where the testing my team does did not provide all the results for quality we may wish for. And I refine, again with examples in addition to rules, what we are agreeing to build. Every now and then, I fix problems I run into, with the rule of thumb of weighing reporting time against fixing time, and taking the fixing upon myself when the fix would take no longer than the report. 

Let's talk about what these concepts mean in practice. 

Holding space for testing

Right now, as official communication goes, my team does not have a tester. They are in the process of hiring one. I am around temporarily, with other responsibilities in addition to showing up for them. The team owns testing, and executes their idea of what it looks like: automating programmer intent on three levels, and showing up as brilliant colleagues so that no one is alone with the responsibility of change. 

For this service, I listen more than I talk. I often talk with the raise of an eyebrow or another facial expression. When I do talk, I talk in questions, even in cases where I think I hold the answer. I use my questions to share the questions we could all have. 

I listen to more than words - I listen to actions, and I listen to coherence of words and actions. By listening, I notice learning, I notice patterns of what is easy and what is hard. That information is input for other services. 

Being around, I remind people that testing exists without doing anything about it. I see programmers look at me and say "you'd want me to test this", explaining how proud they are of the way they configured fast feedback, and provide value by showing up whole-heartedly to share the joy. I understand. And I am delighted for their success.

Learning to do less to achieve more has been one of the hardest skills I have been working on. It's not about me. It's about the human system, it's about the results, it's about us all together. 

Defining lack of results with examples

When I turn to testing the verb, I start with (contemporary) exploratory testing. I might look at what I see when I explore by changing developer-intent tests on unit/integration/e2e levels, or I might look at what I see when I explore the different user interfaces: the GUI, the APIs, the config files, the 3rd party integrations, the environments. It's a user who turns off the computer the system runs on. It's a user who reconfigures and reboots. It's a user who builds their own integration, programming on the APIs. The visible user interface isn't the only access point for active actors in the system. 

I go and find some of what others may have missed. If I can drop an example while development isn't complete, I can mention it in passing, changing the outcome. If I get to do ensemble programming, I do a lot of this. I don't (yet) with my current team. I drop examples in dailies and on our Teams channel, and I watch for reactions. When the time is right - defined by the fuzzy criteria of my limits of tracking - I turn the examples into written bug reports for those that still need addressing. 

I put significant effort into using the examples as training for future avoidance, over treating them as bugs we just need to address. But with continuous work on building my ideas of what results we may be missing, I help my team leak as little uncertainty from changes and features as I can. 

In recent weeks, I have sensed management concern over things taking longer than first expected, but we are now building the baseline of not leaking. Estimating under conditions of uncertainty creates a game where we incentivize building what we agreed over what we learned we need, and awareness of cost (in time) needs to become a continuous process. 

Refining what we build with rules and examples

I also do that "shifted left" testing and work with the product owner and the team on what we start working on. When the product owner writes their acceptance criteria, I too write acceptance criteria; I don't merely review theirs. I write my own as a model of what is in my head, and I compare my model to their model. I improve theirs by combining the two models. Seeing omissions by creating multiple models from different perspectives seems to be more effective than thinking I have some magic "tester mindset" that makes me good at reviewing other people's materials. 

Right now my day-to-day includes significant effort in figuring out the balance between specifying before implementing and adding missing details along the way. I'm learning again that the only model of rules and examples that turns into code is the one with the developers, and as a tester (or as PO), I am only adding finesse to that model. Too much finesse is overwhelming. And I really want to avoid testersplaining features. 

Finding new rules and examples is expected, and some of the best work I can do in this area is to help us stay true and honest on what insights are new - to manage the expectations of time and effort. 

Fixing problems

Finally, I fix problems. I create pull requests with fixes that the developers review. They are like bug reports, but taken a step further, into fixing. 

I'm sure I do something else too, but these seemed like the services I have recently seen myself provide. 

Saturday, February 26, 2022

More to lifecycle for testing

With recruiting work, I have seen a lot of CVs recently. A lot of CVs mention "experience on SDLC" - software development lifecycle - and everyone has a varying set of experiences of what it really means in practice to work in agile or waterfall. So this week I've done my share of conversations with people, modeling differences in how we set up the "lifecycle" to test, and how others are doing it. Here's a sampling of what I have learned.

Sample 1. Oh So Many Roles. 

This team has embraced the idea of test automation, and defined their lifecycle around it. Per feature, they have a tester writing test cases, a test automation engineer implementing those written test cases in code, and an agreement in place where results of day-to-day app development belong to the developers to look at. 

My conclusion: not contemporary exploratory testing or even exploratory testing, but very much test-case driven. It leverages specialized skills, and while you need more people, specialization allows you to scale your efforts. Not my choice of style, but I can see how some teams would come to this. 

Sample 2. So many amigas

This team has embraced the idea of scoping and specifying before implementing, and has so many amigas participating in the four amigas sessions. Yes, some might call this three amigos, but a story refinement workshop can have more than three people, and they are definitely not all men. So we should go for a gender-neutral feminine expression, right? 

For every story refinement, there is the before and after thinking for every perspective, even if the session itself is all together and nicely collaborative. People aren't at their best when thinking on their feet. 

My conclusion: Too much before implementation, and too many helpers. Cut down the roles, lighten up the process. Make the pieces smaller. This fits my idea of contemporary exploratory testing and leaves documentation around as automation. 

Sample 3. Prep with test cases, then test

This team gets a project with many features in one go, and prepares by writing test cases. If the features come quicker than test cases can be written, the team writes a checklist, to be filled in as proper step-by-step test cases later. The star marks the focus of effort - in preparing and analyzing. 

My conclusion: not exploratory testing, not contemporary exploratory testing, not agile testing. A lot of wait and prep, and little learning time. Would not be my choice of mode, but I have worked in a mode like this. 

Sample 4. Turn prep to learn time

This team never writes detailed test cases; instead they create a lighter checklist (and are usually busy with other projects while the prep time is ongoing). Overall time and effort are lower, but otherwise this is very similar to sample 3. The star marks the focus of effort - during test execution, exploring around the checklists. 

My conclusion: exploratory testing, but not contemporary exploratory testing, and not agile testing. You can leave prep undone, but you can't make the tail at the end longer, and thus you are always squeezed for time. 

Conclusion overall

We have significantly different frames we do testing in, and when we talk about only the most modern ones at conferences, we have a whole variety of testers who aren't on board with the frame - and who, frankly, can be powerless in changing the frame they work from. We could do better. 


Core tasks for test positions

In the last weeks, I have been asking candidates in interviews and colleagues in conversations a question: 

How Would You Test This? 

I would show them the user interface. I would show them a single approval test against the settings API returning the settings. And I would show them the server-side configuration file. The thing to test is the settings. 

I have come to learn that I am not particularly happy with how people would test this. In most cases, the chosen test approach is best described as:

Doing what developers did already locally in end to end environment. 

For a significant portion, you would see the application of two additional rules. 

Try unexpected / disallowed values. 

Find value boundary and try too small / too large / just right. 
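Sketched against a hypothetical latitude field, those two rules amount to something like this. The validator below is an illustrative assumption of mine, not the actual system under test:

```python
# A minimal sketch of the two common rules, assuming a hypothetical
# latitude validator. The function name and range are my assumptions.
def is_valid_latitude(value) -> bool:
    """Accept anything that parses as a number between -90 and 90."""
    try:
        lat = float(value)
    except (TypeError, ValueError):
        return False
    return -90.0 <= lat <= 90.0

# Rule 1: try unexpected / disallowed values.
assert not is_valid_latitude("abc")
assert not is_valid_latitude(None)

# Rule 2: find the value boundary, try too small / too large / just right.
assert not is_valid_latitude(-90.001)
assert is_valid_latitude(-90.0)
assert is_valid_latitude(90.0)
assert not is_valid_latitude(90.001)
```

Notice how both rules stay at the level of the input field in isolation; neither asks what the value means in the world.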

I've been talking about resultful testing (contemporary exploratory testing), and from that perspective, I have been disappointed. None of these three approaches center results; they center routine. 

A significant portion of people centering automation would apply an additional rule. 

Random gives discovery a chance. 

I had a few shining lights amongst the many conversations. In the best ones, people ground what they see in the world they know ("Location, I'll try my home address") and seek understanding of concepts ("latitude and longitude, what do they look like?"). The better automation testers would have some ideas of how to know if it worked as it was supposed to for their random values, and in implementing that they may stand a chance of creating a way to reveal how it breaks, even if they could not explain it. 
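The "random gives discovery a chance" idea, and the oracle problem the better automation testers wrestle with, can be sketched roughly like this. The stand-in validator and its ranges are my assumptions, not the real settings API:

```python
# Sketch: random inputs against an assumed stand-in for the settings
# validation. Generation is easy; knowing "did it work?" is the hard part.
import random

def accepts_location(lat: float, lon: float) -> bool:
    """Stand-in oracle: assumed acceptance of on-globe coordinates."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

random.seed(7)  # a fixed seed keeps any failure reproducible
for _ in range(1000):
    lat = random.uniform(-90.0, 90.0)
    lon = random.uniform(-180.0, 180.0)
    # We can check that valid values are accepted, but without a second
    # source of truth we cannot check that the feature *displays* them
    # correctly - which is exactly where the interesting bugs hide.
    assert accepts_location(lat, lon), (lat, lon)
```

Random values stumbling onto a failure only helps if the assertion is strong enough to notice, and weak oracles are why randomness so often reports green.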

Looking at this from the point of view of having reported bugs on it, and my team having fixed many bugs, I know that most of the testers I have shown this to would have waited for a realistic customer, trying to configure this for their unfortunate location, to find out which locations in the world don't work.  

I've come to see that many professional testers overemphasize negative testing (input validation) and pay all too little attention to positive testing, which is much more than a single test with the default values. 

As what we discovered was essentially different, we also documented it. Whether we need to is another topic for another day.  

This experience of disappointment leads me into thinking about core tasks for positions. When I hire for a (contemporary exploratory) tester position, the core task I expect them to be able to do is resultful testing. Their main assignment is to find some of what others may have missed, and when they miss out on all information where there is information to find, I would not want to call them a tester. Their secondary assignment is to document in automation, to support discovery at scale over iterations. 

At the same time, I realize not all testers are contemporary exploratory testers. Some are manual testers. Their main assignment is to do what devs may have done locally, in a test environment, and document it in test cases. In later rounds they use the test cases again, as documented, to ensure no regression with changes. There is an inherent value in being the persistently last one to check things before delivering them forward, especially in teams with little to no test automation. 

Some testers are also traditional exploratory testers. Their main assignment is to find some of what others may have missed, but a lack of time and skills in programming leaves out the secondary assignment I require of a contemporary exploratory tester. 

We would be disappointed in a contemporary exploratory tester if they did not find useful insights in a proportion that helps us not leak all problems to production, and contribute to automation baseline. We would be disappointed in a manual tester if they did not leave behind evidence of systematically covering basic scenarios and reporting blockers on those. We would be disappointed in a traditional exploratory tester if they did not find a trustworthy set of results, providing some types of models to support the continued work in the area. 

What then are the core tasks for automation testers? If we are lucky, the same as for contemporary exploratory testers. Usually we are not lucky, though, and their main assignment is to document basic scenarios in automation in a test environment. Their secondary assignment is to maintain the automation and ensure the right reactions to the feedback automation gives; the resultful aspect is delayed to the first feedback on results we are missing. 

I find myself in a place where I hope to get all in one, yet I see potential in manual testers or automation testers growing into contemporary exploratory testers. 

I guess we still need to mention pay. I don't think the manual tester or the automation tester should be paid what developers are paid, unless the automation testers are developers choosing to specialize in the testing domain. A lot of automation testers are neither very strong developers nor strong testers. I have also heard a proposal for framing this differently: let's pay our people for the position we want them to be in, hire on potential, and guide on expectations to do a different role than what their current experience is. 

Sunday, February 20, 2022

How My Team Tests for Now

I'm with a new team, acting as the resident testing specialist. We're building a new product and our day to day work is fairly collaborative. We get a feature request (epic/story), developers take whatever time it takes to add it, and cycle through tasks of adding features, adding tests for the features and refactoring to better architecture. I, as the team's tester, review pull requests to know what is changing, note failing test automation to know what changes surprise us and test the growing system from user interfaces, APIs and even units, extending test automation either through mentions of ideas, issues or a pull request adding to the existing tests. 

For a feature that is ready on Wednesday, my kind of testing happens on the previous Friday, but I can show up any day in either pre-production or production environments and find information that makes changes to whatever we could be delivering the next week. While our eventual target is to be a day away from production ready, the reality now is two weeks. We have just started our journey of tightening our cycles. 

I tried drawing our way of working with testing into a picture. 

On the left, the "Improve claims" process is one of our continuously ongoing dual tracks. I personally work a lot with the product owner on ensuring we understand our next requested increments, increasingly with examples. As important as understanding the scope (and how we could test it) is asking how we can split it smaller. As we are increasingly adding examples, we are also increasingly making our requests smaller. We start with epics and stories, but are working towards merging the two, thus making stories something that we can classify into ongoing themes. 

In the middle are the four layers of perspectives that drive testing. Our developers and our pipelines test changes continuously, and developers document their intent in unit, API and UI tests at different scopes of integration. Other developers, including me as a developer specializing in testing, comment, and can take a look if seeing the integrated result helps as external imagination. For now at least, a PR is usually multiple commits, and the team has a gate at PR level expecting someone other than the original developer to look at it. All the tests we require as evidence of testing are already included at the PR level. 

The two top parts, change and change(s) in pull request, are the continuous flow. They include the mechanism of seeing whatever is there, any day. We support these with the two bottom parts. 

We actively move from a developer's intent and interpretation to a test specialist centering testing and information, to question and improve how well we did with the clarified claims ending up in the implementation. Looking at added features somewhere in the chain of changes and pull requests, we compare to the conversations we had while clarifying the claims - claims coverage testing. If we are lucky, developer intent matched. If not, conversations correct developer intent. As applying external imagination goes, you see different things when you think about the feature (the new value you made available) and the theme (how it connects with similar things). 

When the team thinks they have a version they want out, they promote a release candidate and work through the day of final tests - which we're minimizing - to make the release candidate a release, properly archived. 

With the shades of purple post-its showing where in the team the center of responsibility is, a good question is whether the tester (medium purple) is a gatekeeper in our process. The tester feeds into developer intent (deep purple) with added information, but often not at the end of it all - rather throughout, and not stopping at release. The work on omissions continues in production, exploring logs and feedback. There is also team work we have managed to truly share with all (light purple), supporting automations (light blue), and common decisions (black). 

There is no clearly defined time in this process. It's less an instruction on what exactly to do, and more a description of the perspectives we hold space for, for now. There are many changes on our road still: tightening the release cycle, keeping unfinished work under the hood, connecting requirements and some selection of tests with BDD, making smaller changes, timely and efficient refinement, growing capabilities of testing to models and properties, growing environments … the list will never be completely done. But where we are now is already good, and it can and should be better. 

Friday, February 18, 2022

The Positive Negative Split Leads Us Astray

As I teach testing to various groups in ensemble and pair formats, I have unique insight into what people do when they are asked to test. As I watch any of my students, I already know what I would do, and how many of my other students have done things. Noticing students miss out on something, I get to have those conversations:

"You did not test with realistic data at all. Why do you think you ended up with that?" 

"You focused on all the wrong things you can write into a number input, but very little on the numbers you could write. Why do you think you ended up with that?"

"You tested the slow and long scenario first, that then fails so you need to do it again. Why do you think you ended up with that?" 

As responses, I get to hear the resident theory of why - either why they did not do it yet but would if there was more time, or more often, why they don't need to do that, and how they think there are no options to what they do, as if they followed an invisible book of rules for proper testing. The most popular theory is that developers test the positive flows, so testers must focus only on negative tests - often without consulting the developers on what they actually focus on. 

I typically ask this question knowing that the tester I am asking is missing a bug. A relevant bug. Or doing testing in a way that will make them less efficient overall, delaying feedback on the bug that they may or may not see. 

I have watched this scenario unfold in so many sessions that today I am ready to call out a pattern: the ISTQB Test Design oversimplification of equivalence classes and boundary values hurts our industry.

Let me give you a recent example, from my work. 

My team had just implemented a new user interface that shows a particular identifier of an airport, called the ICAO code. We had created a settings API making this information available from the backends, and a settings file in which this code is defined. 

Looking at the user interface, this code was the only airport information we were asked to display for now. Looking at the settings API and the settings file, there was other information related to the airport in question, like its location in latitude and longitude values. Two numbers, each showing a value of 50.9 that someone had typed in. How would you test this?

I showed it around, asking people this. 

One person focused on the idea of random values you can place in automation, ones that would be different on every run, and mentioned the concept of valid and invalid values. They explained that the selection of values is an acceptance tester's job, even if the project does not have such a separation in product development. 

One person focused on the idea that you would try valid and invalid values, and identified that there are positive and negative values, and that the coordinates can have more than one decimal place. We tested together for a while, and they chose a few positive scenarios with negative values combined with decimal places before calling it done. 

I had started by asking myself what kinds of real locations and coordinates there are, and how I could choose a representative sample of real airport locations. I googled for ICAO codes to find a list of three examples, and without any real reason chose the third on the list, which happened to be an airport in Chicago. I can't reproduce the exact Google search that inspired me to pick that one, but it was one where the little info box on the page already showed me a few combos of codes and coordinates, from which I chose 41.978611, -87.904724. I also learned, googling, that latitudes range from -90 to 90 and longitudes from -180 to 180.

It turned out it did not work at all. A lucky accident brought me, with my first choice, to discover a combination of four things that needed to be put together to reveal a bug. 
  • The second number had to be negative
  • The second number had to have more than four digits
  • The second number had to be less than 90
  • The first number had to be positive
Serendipity took me to a bug that was high priority: a real use case that fails. Every effort at analyzing with the simple ISTQB-style equivalence classes and boundary values failed; you needed the BBST-style idea of risk-based equivalence and combination testing to identify this. The random numbers might have found it, but I am not sure whether that would have motivated an immediate fix the way a real airport location failing, in a functionality that describes airports, did. 

Our time is limited, and ISTQB-style equivalence classes overfocus us on the negative tests. When the positive tests fail, your team jumps. When the negative tests fail, they remember to add error handling, if nothing more important is ongoing. 
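The difference between the two sampling styles is easy to sketch in code. The following is a minimal illustration, not the team's actual code: `is_valid_location` is a hypothetical validator, the boundary list is the ISTQB-style sample, and the airport dictionary is the risk-based sample, where signs and decimal places vary together the way real locations make them vary.

```python
# Hypothetical validator standing in for the real feature under test.
def is_valid_location(lat: float, lon: float) -> bool:
    """Accept latitudes in [-90, 90] and longitudes in [-180, 180]."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

# ISTQB-style sample: boundaries of the partitions, plus one invalid value.
boundary_cases = [(0.0, 0.0), (90.0, 180.0), (-90.0, -180.0)]
invalid_case = (90.1, 0.0)

# Risk-based sample: real airports, where sign and precision combine naturally.
real_airports = {
    "KORD": (41.978611, -87.904724),   # Chicago O'Hare: positive lat, negative long
    "EFHK": (60.317222, 24.963333),    # Helsinki-Vantaa: both positive
    "SCEL": (-33.393056, -70.785833),  # Santiago: both negative
}

for lat, lon in boundary_cases:
    assert is_valid_location(lat, lon)
assert not is_valid_location(*invalid_case)
for code, (lat, lon) in real_airports.items():
    assert is_valid_location(lat, lon), code
```

Notice that none of the boundary cases combines a positive first number with a many-decimal negative second number below 90 - the very combination that hid the bug.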

After I had already made up my mind on the feature, I showed it to two more testers. One started with the coordinates of their home - real locations - and I am sure they would have explored their way to the three quarters of the globe Finnish coordinates would not cover. The other, put on the spot, fell into the negative-tests trap, disconnected from the information the values represent, but when I pointed this out, found additional scenarios of locations that are relevantly different; now I know some airports are below sea level - a third value to define that I had not personally focused on properly. 

Put the five of us together and we have the resultful testing I call for with contemporary exploratory testing. But first, unlearn the oversimplistic positive / negative split and overfocus on the negative. The power of testing well lies in your hands when you test. 

Sunday, February 13, 2022

Doing Security Testing

This week on Wednesday, as we were kicking off the BrowserStack Champions program with a meeting of program participants around a fireside chat, something in the conversation, in relation to all things going on at work, pushed an invisible button in me. We were talking about security testing as if it was something separate and new. At work, we have a separate responsibility for security, and I have come to experience over the years that a lot of people assume and expect that testers know little of security. Those who are testers love to box security testing separately from functional testing, and when asked for security testing, only think in terms of penetration testing. Those who are not testers love to make space for security by hiring specialists in that space, and by the Shirky Principle, the specialists will preserve the problem to which they are a solution. 

Security is important. But like other aspects of quality, it is too important to leave to specialists alone. And the ways we talk about it under the one term "security" or "security testing" are in my experience harmful for our intentions of doing better in this space. 

Like with all testing, with security we work with *risks*. With all testing, what we have at stake when we take a risk can differ. Saying we risk money is too straightforward. We risk:

  • other people's discretionary money, until we take corrective action. 
  • our own discretionary money, until we take corrective action.
  • money, lives, and human suffering where corrective actions don't exist
We live with the appalling software quality in production because a lot of the problems we imagine we have are about the first, and may escalate to the second - but while losing one customer is sad, we imagine others at scale. When we hear RISK, we hear REWARD in taking a risk, and this math works fine while corrective actions exist. Also, connecting testing with the bad decisions we make in this space feels like the way of the world, assuming that bug advocacy as part of testing would lead to companies doing the right things once they know the problems. Speaking from 25 years of watching this unfold, the bad problems we see out there weren't the result of insufficient testing, but of us choosing the RISK in hopes of REWARD. Because risk is not certain, we could still win. 

The third category of problems is unique. While I know of efforts to assign a financial number to a human life or suffering, those don't sit well with me. The 100 euros of compensation for the victims of cybercriminals stealing psychotherapy patient data is laughable. The existence of the company limiting liability to the company going bankrupt is unsettling. The amount of money the police spend investigating is out of our control. The fear of having your most private ideas out there will never start to spark joy. 

Not all security issues are in the third category, and what upsets me about the overemphasis on security testing is that we should instead be adding emphasis to all problems in the third category. 

A few years ago I stepped as far away as I possibly could from anyone associating with "security", after feeling attacked on a Finnish security podcast. Back then, I wrote a post discussing the irony of my loss / company's loss categories, proposing that my losses should be the company's losses by sending them a professional-level services bill. A select group of security folks decided that ridiculing my professionalism over running into a problem that was the company's loss was a worthwhile platform. While I did report this as slander (a crime) and learned it wasn't, the rift remains. Me losing money for a bug: a testing problem. The company losing money for a bug: a security problem. I care for both. 

As much as I can, I don't think in terms of security testing. But I have a very practical way of including the functional considerations of undesired actors. 

We test for having security controls. And since testing is not only about explicit requirements but also about ensuring we haven't omitted any, I find myself leading conversations about the timing of implementing security controls in incremental development from the perspective of risks. We need security controls - named functionalities to avoid, detect, counteract and minimize the impacts of undesired actors.
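To make "security controls as named functionalities" concrete, here is a hedged sketch of one such control: a login throttle that detects repeated failures and counteracts by locking further attempts. The class, names and threshold are illustrative, not from any product described here - the point is that a control like this is ordinary testable functionality.

```python
from dataclasses import dataclass, field

@dataclass
class LoginThrottle:
    """Illustrative security control: detect and counteract repeated failed logins."""
    max_failures: int = 5
    failures: dict = field(default_factory=dict)

    def record_failure(self, user: str) -> None:
        self.failures[user] = self.failures.get(user, 0) + 1

    def record_success(self, user: str) -> None:
        # A successful login resets the counter.
        self.failures.pop(user, None)

    def is_locked(self, user: str) -> bool:
        return self.failures.get(user, 0) >= self.max_failures

throttle = LoginThrottle()
for _ in range(5):
    throttle.record_failure("mallory")
assert throttle.is_locked("mallory")
assert not throttle.is_locked("alice")
```

Testing it asks the same questions as any functionality: does the lock trigger at the boundary, does a success reset it, and did we remember to implement it at all.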

We test for the software update mechanism. It connects with security tightly, with the idea that in a software world riddled with dependencies on 3rd party libraries, our efforts alone, without the connected ecosystem, are in vain. We all have late discoveries despite our best efforts, but we retain the power of reacting if only we are always able to update. Continuous delivery is necessary to protect customers from the problems we dropped at their doorstep along with our own lovely functionalities. 

We test for secure design and implementation. Threat modeling remains an activity that brings together security considerations and exploratory testing of the assumptions our threat modeling decisions rely on - a superb pair. Secure programming - avoiding the typical errors of a particular language - shows up as teams sharing lists of examples. Addressing something tangible, in readable code, is a lot more straightforward than trying to hold all the ideas in your head all the time. Thus we need both. And security is just one of the many perspectives where we have opportunities to explore patterns out of the body of code. 

We integrate tools into pipelines. Security scanners for static and dynamic perspectives exist, and some scanners you can use at the scale of the organization, not just a team. 

We interpret standards for proposals of controls and practices that the whole might entail. This alone, by the way, can be the work of a full-time person. So we make choices on standards, and we make choices on the level of detail of interpretation. 

We coordinate reactions to new emerging information, including both external and internal communication. 

We monitor the ecosystem to know that a reaction from us is needed. 

We understand legal implications as well as reasons for privacy as its own consideration, as it includes a high risk in the third category: irreversible impacts. 

And finally, we may do some penetration testing. Usually its purpose is less to find problems than to say we tried. In addition, we may organize a marketplace for legally hunting our bugs, selling bugs with high implications to us rather than to the undesired actors, through a bug bounty program. 

So you see, talking about security testing isn't helpful. We need more words rather than fewer. And we need to remove the convolution of assuming all security problems are important, just as much as we need to remove the convolution of assuming all functional problems aren't. 

Friday, January 28, 2022

Software Maintenance

This winter in Finland has not been kind to our roads. I got to thinking of this sitting in the passenger seat of a car, slowly moving on an ice-covered bumpy road, with potholes in the ice left from the piles of snow that did not get cleared when the weather was changing again. The good thing about those potholes is that they are temporary: given another change of weather, they either get filled or the ice melts. Meanwhile, driving is an act of risking your car. 

A similar phenomenon, of a more permanent type without action, happens for the very same weather reasons, creating potholes in the roads themselves. The impact for the user is the same: driving is an act of risking your car. Without maintenance, things will only get worse.

This led me into thinking about software maintenance and testing. Testing is about knowing which roads need attending to. Some roads are built weak and need immediate maintenance. Others degrade over time under conditions. Software does not keep running without maintenance any more than our roads can be safely driven on without maintenance.

Similarly, we have two approaches both to knowing which roads could use maintenance and to testing software:

  • automation: have someone drive through the road and identify what to fix out of selected types of things that could be off
  • thinking: recognize conditions that increase the risk of selected types of things, or that could introduce new categories of problems we'd recognize when we see them but may have trouble explaining
Knowing you need maintenance is the start of that maintenance. And having the machinery to drive through every road is an effort in itself, so we will be balancing the two. 
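The automation approach can be sketched as a sweep: drive through every file and flag a selected type of wear. The marker, file layout and function name here are illustrative assumptions; a real sweep would pick the problem types that matter in your codebase.

```python
from pathlib import Path
import tempfile

def sweep(root: Path, marker: str = "DEPRECATED") -> list:
    """Return (file name, line number) for every line containing the marker."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if marker in line:
                hits.append((path.name, lineno))
    return hits

# Demonstrate on a throwaway directory standing in for a codebase.
with tempfile.TemporaryDirectory() as tmp:
    road = Path(tmp)
    (road / "a.py").write_text("x = 1\n# DEPRECATED: old_api()\n")
    (road / "b.py").write_text("y = 2\n")
    assert sweep(road) == [("a.py", 2)]
```

The sweep only finds the wear types it was told about; the thinking approach is what decides which markers are worth driving for.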

We care about knowing. But as much as we care about knowing, we care about acting on the knowledge more. 

Friday, January 21, 2022

In Search of Contemporary Exploratory Tester

We had just completed our daily, and a developer in the team had mentioned they would demo the single integration test they had included. Input of values to Kafka, output a stream, and comparing the transformation in the black box between the input and the output. I felt a little silly confirming my ideas of what was included in the scope, thinking everyone else was most likely already absolutely clear on the architecture, but I asked anyway. And as soon as I understood, I knew I had a gem of a developer in the team. From that one test (including some helper functions and the entire dockerized environment), I had the perfect starting point for the exploratory testing magic I love. But also, I realized I could so easily just list the things while pairing with the dev, and we could fix and address whatever problems there might be together. Probably, I could also step away, see the developer do well, and just admire the work. That's when I realized: I had finally landed in a team where I would not get away with the traditional ideas of what testing would look like. 
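The shape of that black-box test can be sketched without the dockerized Kafka environment. In this sketch plain lists stand in for the input and output topics, and `transform` is my guess at the style of stream-processing logic, not the team's actual code:

```python
import json

def transform(record: bytes) -> bytes:
    """Illustrative stand-in: uppercase a field and drop internal keys."""
    data = json.loads(record)
    return json.dumps({"id": data["id"], "name": data["name"].upper()}).encode()

# In the real test these would be Kafka topics; here, plain lists.
input_topic = [json.dumps({"id": 1, "name": "ada", "_raw": "x"}).encode()]
output_topic = [transform(r) for r in input_topic]

# The black-box comparison: known input, expected transformed output.
expected = [json.dumps({"id": 1, "name": "ADA"}).encode()]
assert output_topic == expected
```

What made the real test such a starting point was everything around this comparison: the helpers for producing and consuming, and the environment that made the whole pipeline repeatable.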

This week I have been interviewing testers, and the experience forces me to again ponder what I search for, and how I would know I have found something that has potential. Potential is the ability and willingness to learn, and learning requires wanting to spend time with a particular focus. 

Interviewing reminded me of the forms of bad ideas: 

  • the developer who does not enjoy testing as the problem domain
  • the tester who has not learned to program, think in architectures nor work well with business people (the *established exploratory tester*)
  • the test automator who made a particular style of programming manual tests into automation scripts their career (the *unresultful automator*) 
  • the total newbie who wants to escape testing to real work as soon as they can
I don't want those. I want something else. 

So I came up with two forms of what I may be looking to fill, from the perspective of someone to hold space for testing. 

  1. Test systems engineer
  2. Contemporary exploratory testing specialist

Test systems engineer is a programmer who enjoys the testing domain and wants to solve problems in the testing domain with programming. They want to apply programming with various libraries and tools to create fast feedback mechanisms to support the teams they contribute to. They won't take in manual test cases and just turn them into automation; they will create an architecture that enables them to do the work smartly. For this role, I would recruit developers with various levels of experience, and grow them as developers. A true success for this role in a team looks like whole-team ownership of the systems they create, enabling the team with a mix of test systems and application programming over time.

Contemporary exploratory testing specialist is really hard to find. It is a tester who knows enough of programming to work with code at least in collaboration (pairing/ensembling), can figure out change by reading commits (as nothing changes without code changing, with infra as code), and can target testing at changes, combining attended and unattended testing. Put in a place where there is a choice of never looking at the integrated application, because of the appearance of being able to do it all in code, this person would choose to experiment on what we may have missed. Nudging left and whole-team testing would be things: not waiting for the end of the pipeline, but building and testing branches for some extra exploring, pairing on creating things, reviewing unit tests, and defining acceptance criteria and examples are some of the tactics, though everything can also be shared with the team. Meditating on what we might have missed in testing and understanding coverage are the go-to mechanisms. 

I think I will need to grow both. Neither is usually readily available in the market. And no one is ever ready in either of the two categories. 

The better your whole team, the more you need to be a chameleon that just adapts to the environment - and holds space for great testing to happen. 

Thursday, January 20, 2022

One Company, Five Testing Archetypes

In the last two years, I have had an exceptional opportunity to deepen my understanding of the context archetypes, working in a company that encompasses all five. But before we dive into that, let me share an experience.

Facilitating a workshop on what success looks like in test automation, I invite pairs of people to compare their experiences and collect insights and related experiences on what made things successful. The room is full of buzz as some of the finest minds in test automation compare what they do and what they have learned. The pair work time runs out, we summarize things, and here's what I learn: test automation is particularly tricky for embedded systems. 

For years, people could silence me by telling me how embedded systems are special. And sure they are, like all systems are special. But one thing that is special about working with embedded systems is how convoluted "testing" is. And today I wanted to write down what I have learned of the five archetypes of testing that make talking about testing and hiring "testers" very complex even within a single company. 

Software and Systems Testing is the archetype I mostly exist in. Coming to a problem software first, understanding that software runs on other software, which runs on hardware. That hardware can be built, as we build software, from components of various abstractions to integrate, or it can be a ready general-purpose computer, but it exists and matters to some degree. Like a Jenga tower, you will feel shaky on the top layers when the foundation is shaky, and that is usually the difference that embedded systems bring in. But guess what - that is how you feel about software foundations too, and there is a *remarkable* resemblance between embedded software development and cloud development in that sense. 

With software and systems testing, you are building a product in an assigned scope, overlaying your testing responsibility with the neighboring teams, and a good reminder for this archetype is that for testing, you will always include your neighbors even if you like to leave them out of scope for development purposes. Your product may have 34 teams contributing to it (as I had last year) or it might have a much simpler flow considering end to end. Frankly, I think the concept of end to end should be immediately retired looking at things from this archetype's perspective. 

Hardware (Unit - in isolation / in integration - and Compliance) Testing is something I have been asking a lot of questions about in the last two years, and I find it an immensely fascinating specialty. The stuff I loved about calculating needed resistor sizes is the bread and butter in hardware, but making it only about the electronics / mechanics would be an underappreciating way of describing the complexities. Hardware testing is a specializing field with many things worth standardizing, and standards compliance makes it such a unique combo. The integration of two hardware components has its own common sets of problems, and the physical constraints aren't making life simple. Hardware without software is a bit of a lifeless box, so hardware testing in integration, and particularly compliance testing, already brings in software and system perspectives, but from a very specific slice. 

Really great hardware testers think they don't know software testing, yet do well in helping figure out the risks of features. I can't appreciate enough the collaboration hardware designers and hardware testers are offering. The interface there, particularly for test automation success, is as close to magic as I have recently experienced. 

Production Testing is the archetype that surprised me. Where hardware testing looks at design problems with hardware, production testing looks at manufacturing problems with hardware. And again, since hardware without software is a lifeless box, it is testing of each component as well as of different scopes of the integrated systems. The way I have come to think of production testing, it is the most automation-system design and implementation oriented kind of testing we do. Time spent on each individual piece of hardware translates to manufacturing costs, and the certainty of knowing the piece you ship to the other side of the world has been quality controlled before sending is relevant. 

Being able to connect our production testing group to a great unit testing trainer was one of my ways of learning to appreciate this. Ensembling on their test cases, seeing how they are different, reading their specs. And finally, being on the product side to build test interfaces that enable the production testing work - I had an idea of it, but I did not understand it. 

Product Acceptance Testing is the archetype of testing I could also just call acceptance testing, and it is associated with the promise of a delivery of a project, not a product. If you need to tailor your product before it becomes the customer's system, you will probably have need of something like this. We call it FIT or FAT (in both cases it's the integrated system, but in either artificial or real physical environments), and the test cases have little in common with the kind of testing I do within Software and Systems Testing. This is demonstrating functionalities with open eyes, ready for surprises and really wishing there were none. 

IT Testing is the final archetype, focused on the business systems that run even software companies. There may well still be companies that don't have software products (even if the transformation is well on its way to making every company a software company), but there are no companies that have no IT. Your IT system may be that self-built Excel you use, or a tailored system you acquired, but when it runs *your business*, you want to test to see that your business can run with it. Not being able to send invoices, terminating your timely cash flow, has killed companies. 

The difference in IT Testing comes from the lack of power. The power is in the money starting projects. The power is in agreeing more precisely what is included, or agreeing to pay by the hour. Constraints are heavy on the IT systems you are using as a foundation, because this is not the thing you sell; this is something that enables you to sell. 

The archetypes matter when we try to discuss testing. Because it is all testing. It is just not working with the same rules, not even a bit.