A Seasoned Tester's Crystal Ball: May 2024

Friday, May 31, 2024

Taking Testing Courses with Login

When we talk about testing (programming) courses we've taken, I notice a longing feeling in me when I ask about the other's most recent experience. They usually speak of the tool ("Playwright!", "Robot Framework with Selenium and Requests Library!", "Selenium in Java!") whereas what I keep hoping they would talk about is what got tested and how well. From that feeling of mine, a question forms:

What did the course use as the target of testing in teaching you?

For a while now, I have centered my courses around targets of testing, and I have quite a collection. I feel what you learn depends on the target you test. And all too many courses leave me unsatisfied with the students with their certificates of completion, since what they really teach is operating a tool, not testing. Even for operating a tool, the target of testing determines the lessons you will be forced to focus on.

An overused example I find is a login page. Overused, yet undereducated.

In its simplest form, it is this idea of two text fields and a button. Username, password, login. Some courses make it simple and have lovely IDs for each of the fields. Some courses start off by making locators on the login page complicated, so clicking them takes a bit of puzzle solving. In the end, you manage to create test automation for successful and unsuccessful login, and enjoy the power of programming at your fingertips - now you can try *all the combinations* you can think of, and write them down once into a list.

I've watched hundreds of programmed testing newbies with a shine in their eyes, having done this for the first time. It's great, but it is an illustration of the tool; it's not what I would expect you to do when hired to do "testing".

Sometimes they don't come in the simplest form. On testing course targets, the added stuff screams education. Like this one.

From a general experience of having seen too many logins, here are things I don't expect to see in a login, and it's missing things that I might expect to see if a login flow gets embellished for real reasons. If your take on automating something like this is that you can automate it, not that it has stuff that never should be there in the first place, you are not the tester I am looking for.

Let's elaborate the bugs - or things that should make you #curious like Elizabeth Zagroba taught in her talk at NewCrafts just recently. You should be curious about:

Why is there a radio button to log in as admin vs. user, and why is Admin the default? There are some, but very few, cases where the user would have to know, and asking that in a login form like this is unusual at best. Only the minority of users who are both would naturally have a selection like this. For things where I could stretch my imagination to see this as useful, the default would be User. The judgmental me says this is there to illustrate how to programmatically select a radio button.
Why is there a dropdown menu? Is that like a role? While I incline to think this too is there to illustrate how to programmatically select from a list, I also defer my judgement to the moment of logging in. Maybe this is relevant. Well, it was not. This is either half of an aspired implementation or there for demo purposes. And it's missing a label explaining it, unlike the other fields.
Why are there terms and conditions to tick? I can already feel the weight of the mild annoyance of having to tick this every single time, with changing conditions hidden in there, and you promising your firstborn child is yet another Wednesday some week. The judgmental me says this is here to show the functional problem of not requiring ticking it when testing. And the judgmental me is not wrong: login works just fine without what appears to be compulsory acceptance of terms, this time with the default off to communicate a higher level of commitment when I log in.

The second level judgement I pass upon people through this is that testers end up overvaluing being able to click when they should focus on needing to click, and waste everyone's time with that. This is a trap. I could use this to rule out testers, except overcoming this level of shallowness can be taught in such a short time that we shouldn't gatekeep on this level of detail.

I don't want to have the conversation of not automating this either. Of course we automate this. In the time I am writing this, I could already have written a parametrized test with username and password as input that then clicks the button. However, I'd most likely not care to write that piece of code.

Login is a concept of having authentication and authorization to do stuff. Login is not interesting in its own right; it is interesting as a way of knowing I have or don't have access to stuff. Think about that for a moment. If your login page redirects you to an application like this one did, is login successful? I can only hope it was not on the course I did not take but got inspired by to write this.

I filled in the info, and got redirected to an e-store application. However, with the application URL and another browser, I get to use the very same application without logging in. I let out a deep sigh, worried for the outcome of the course for the students.

Truth be told, before I got to check this I already noted the complete absence of logout functionality. That too hinted that the login may be an app of its own for testing purposes only. Well, it does illustrate combinations you can so easily cover with programmatic tests. What a waste.

What does work in projects around login look like, really? We can hope it looks like taking something like Keycloak (an open source solution in this space), styling a login page to look like your application, and avoiding the thousands of ways you can do login wrong. You'll still face some challenges, but successful and failing login aren't the level you're expected to work on.

What you would work on with most of your programmatic testing is the idea that nothing in the application should work if you aren't authorized. You would be more likely to automate login by calling an API endpoint giving you a token you'd carry through the rest of your tests on the actual application functionality. You'd hide your login and roles into fixtures and setups, rather than create login tests.
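As a minimal sketch of that idea, assuming a made-up in-memory application rather than any real system: `FakeStore`, `authorized_session` and `is_rejected` are all hypothetical names invented for illustration. In a real project the login would more likely be an API call inside a test fixture, and the checks would run against the actual application functionality.

```python
import secrets


class FakeStore:
    """Hypothetical stand-in for the application under test."""

    def __init__(self):
        self._users = {"tester": "s3cret"}  # invented credentials
        self._tokens = set()

    def login(self, username, password):
        """Return a token for valid credentials, otherwise None."""
        if username not in self._users or self._users[username] != password:
            return None
        token = secrets.token_hex(8)
        self._tokens.add(token)
        return token

    def logout(self, token):
        """Invalidate the token so it can no longer be used."""
        self._tokens.discard(token)

    def list_orders(self, token=None):
        """Every feature checks authorization, not only the login page."""
        if token not in self._tokens:
            raise PermissionError("not authorized")
        return ["order-1"]


def authorized_session(app):
    """What a fixture would do: log in once and hand the token to the tests."""
    token = app.login("tester", "s3cret")
    assert token is not None, "setup failed: could not log in"
    return token


def is_rejected(call):
    """True when the call is refused with PermissionError."""
    try:
        call()
    except PermissionError:
        return True
    return False
```

The interesting tests are then about the application, not the login form: `is_rejected(lambda: app.list_orders())` should hold without a token, with a made-up token, and with a token that has been logged out, while `app.list_orders(authorized_session(app))` should work.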

The earlier post I linked above is based on a whole talk I did some years back on the things that were broken in our login beyond login.

Learn to click programmatically, by all means. You will need it. But don't think that what you were taught on that course was how to test login. Even if they said they did, they did not. I don't know about this particular one, but I have sampled enough to know the level of oversimplification students walk away with. And it leads me to thinking we really really would need to do better in educating testers.

Tuesday, May 28, 2024

Compliance is Chore, not Core

It's my Summer of Compliance. Well, that's what I call the work I kicked off and expect to support even when I switch jobs in the middle. I have held a role of doing chores in my team, and driving towards automating some of those chores. Compliance is a chore, and we'd love it if it was minimized to invisible while producing the necessary results.

There are, well, for me, three compliance chores.

There's the compliance to company process, process meaning this is required of development in this company. Let's leave that compliance for another summer.

Then there's the two I am on for this summer, open source license compliance and security vulnerability handling. Since a lot of the latter is from managing the supply chain of other people's software, these two kind of go hand in hand.

You write a few lines of code of your own. You call a library someone else created. You combine it with an operating system image and other necessary dependencies. And all of a sudden, you realize you are distributing 2000 pieces from someone else.

Dealing with the things others created for compliance is really a chore, not core to the work you're trying to get done. But licenses need attending to and use requires something of you. And even if you didn't write it, you are responsible for distributing other people's security holes.

Making things better starts with understanding what you got. And I got three kinds of things:

1. Application-like dependencies. These are things that aren't ours but essentially make up the bones of what is our application. It's great to know that if your application gets distributed as multiple images, each image is a licensing boundary. So within each image with application layer, you want to group things with copyleft (infectious license) awareness.

2. Middleware-like dependencies. These are things that your system relies on, but are applications of their own. In my case, things like RabbitMQ or Keycloak. We use them and configure them, but that's it. But we do distribute and deploy them, so compliance needs exist.

3. Operating system-like dependencies. Nothing runs without a layer in between. We have agreements on the licensing boundary between this and whatever sits on top.

So that gives us boundaries horizontally, but also vertically to a more limited degree.

Figuring out this, we can describe our compliance landscape.

The only formats in which this group in particular redistributes software are executables (the olden way) and images (the new way). Understanding how these get built up was part of the work to do. We identified the inputs, the outputs we would need, and the impacts we seek on us having to create those outputs.

I use the landscape picture to color code our options. Our current one "scripts" takes source code and dependencies as input, ignores base images and builds a license check list and license.txt - with a few manual compliance checks on knowing what you seek in the patterns of license change. It's not hard work, but it's tedious. Fails for us on two of the impacts -- we do chore work to create and maintain the scripts, unable to focus on core; requires manual work every single time.

We're toying with two open source options. ORT (Open Source Review Toolkit) shows promise to replace our scripts, and possibly to extend to image based scans, as it's an open source project. It does not really come best wrapped as a service. Syft+Grype+GoTemplates seems to do some of the tricks, but leaves things open in the outputs realm.

And then we're toying with an open source service offering, where money does buy you a solution, with FOSSA.

I use the word toying as a noncommittal way of discussing a problem I have come to understand in the last weeks.

Running a compliance scan for base images, there are significant differences in the numbers of components identified with Syft vs. Docker SBOM vs. Docker Scout vs. self-proclaimed at the source. There's a quality assessment tool for SBOMs that checks for many other things but not correctness, and it shows significant other differences. And that is just the quality of the SBOM piece of the puzzle.
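To make the difference concrete, here is a rough sketch of comparing component listings from several tools. It assumes each tool's output has already been reduced to a list of records with `name` and `version` keys, which is a simplification for illustration and not the actual output format of any of the tools mentioned. The tool names and data in the usage example are invented.

```python
def component_keys(components):
    """Reduce a component list to comparable (name, version) pairs."""
    return {(c["name"].lower(), c.get("version", "")) for c in components}


def compare_listings(listings):
    """Show what all tools agree on and what each tool misses.

    `listings` maps a tool name to its list of component records.
    """
    if not listings:
        return {"agreed": [], "missing": {}}
    keys = {tool: component_keys(items) for tool, items in listings.items()}
    everything = set().union(*keys.values())
    agreed = set.intersection(*keys.values())
    # Components some other tool reported but this tool did not.
    missing = {tool: sorted(everything - found) for tool, found in keys.items()}
    return {"agreed": sorted(agreed), "missing": missing}


# Invented example data, not real scan results.
example = {
    "tool_a": [
        {"name": "openssl", "version": "3.0.2"},
        {"name": "zlib", "version": "1.2.11"},
    ],
    "tool_b": [
        {"name": "OpenSSL", "version": "3.0.2"},
        {"name": "busybox", "version": "1.36.1"},
    ],
}
report = compare_listings(example)
```

A comparison like this only tells you that the tools disagree. It does not tell you which one is correct, and matching on lowercased name and version is itself an assumption that real data may break.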

We started off with the wrong question. It is no longer a question if we are "able to generate SBOMs" but instead we are asking:

should we care that different tools provide different listing for same inputs, as in, are we *really* responsible for quality or just the good faith effort
how we should make those available next to our releases
how we scale this in an organization where compliance is chore not core

This 'summer of compliance' is forcing me to know more of this than I am comfortable with. When quality matters, it becomes more difficult. *If* it matters.

Saturday, May 25, 2024

To 2004 and Back

Year 2004 was 20 years ago. Today I want to write about that year, for a few reasons.

It's the first year in which I have systematically saved up my talks on a drive
It's the year when I started doing joint talks with Erkki Pöyhönen and learned to protect access to materials by using the Creative Commons Attribution license
I have just finished taking my 2004 to my digital legacy project, meaning my slides from that year are no longer on my drive but available with the Creative Commons Attribution license on GitHub.

To put 2004 in context, I was a fairly new senior tester then. I became a senior because I became a consultant. There is no chance in hell I was actually a senior then. But I did wonders with the appreciation and encouragement of the title.

My topics in 2004 were Test Automation, Test Planning and Strategies, Test Process Improvement, and Agile Testing. Most of my material back then was in Finnish. I believed that we learn best in our native language. I was drawing from what was discussed in the world that I was aware of, and clearly I was aware of Cem Kaner.

Looking at my materials from then, I notice a few things:

Most of my effort went into explaining to people here what people elsewhere say about testing. I saw my role as a speaker as someone who would help navigate the body of knowledge, not create that body of knowledge.
The materials are still relevant and that is concerning. It's concerning because the same knowledge needs still exist and not enough has changed.
I would not deliver any of the talks that I did then with the knowledge and understanding I have now. My talk on test automation was on setting up a project around it, which I now believe is more of a way of avoiding it than doing it. Many people do it with a project like that, and it's a source of why they fail. My talk on concepts in planning and process makes sense but carries very little practical relevance to anything other than acquisition of vocabulary.
My real work reflected in those slides is attributed to ConformiQ, my then employer. I did significant work in going through Finnish companies assessing their testing processes, and creating a benchmark. That data would be really interesting reflection on what testing looked like in tens of companies in Finland back then, but it's proprietary except for the memories I hold that have since then impacted what I teach (and learn more on).

I talked about Will Testing Be Automated in the Future, and while I did not even answer the question in writing in my first version of the talk, the second one came with my words written down:

Automation will play a role in testing, especially when we see automation as something applicable wider than in controlling the program and asserting things.
Only when it is useful - moving from marketing promises to real substantial benefits
Automation is part of good testing strategy
Manual and automated aren't two ways of executing the same process, but it's a transformation of the process

None of this is untrue, but it is also missing the point. It's what I, knowing now what I did not know then, call feigned positivism. Words that sound like support, but are really a positive way of being automation aversive. It's easy to recognize feigned positivism when you have changed your mind, but it's harder to spot live.

Being able to go back and reflect is why I started this blog back in its day. I expected to be wrong, a lot. And I expected that practicing being wrong is a way of reinforcing a learning mindset.

That one still holds. 2004 was fun, but it was only one beginning.

Friday, May 17, 2024

Programming is for the juniors

It's the time of the year when I learn the most, because I have responsibility over a summer trainee. With four years at Vaisala, it's four summer trainees, and my commitment to follow through with the summer of success means I will coach and guide even when I will soon be employed by another company. With a bit of guidance, we bring out the remarkable in people, and there will not be seniors if we don't grow them from juniors. Because, let's face it, it's not like they come ready to do the work from *any* of the schools and training institutions. Work, and growing in the tasks and responsibilities you can take up, that is where you learn. And then teaching others is what solidifies what you learned.

This summer, I took special care in setting up a position that was a true junior one. While it might be something that someone with more experience gets done faster, the junior pay allows for them to struggle, to learn, to grow and still contribute. The work we do is one where success is most likely defined by *not leaving code behind for others to maintain* in any relevant scale, but reusing the good things open source community has to offer us in the space of open source compliance, licenses and vulnerabilities. I have dubbed this the "summer of compliance", perhaps because I'm the person who giggles my way through a lovely talk on malicious compliance considering something akin to that a real solution when we have corporate approaches to compliance, and forget learning why these things matter in the first place.

If the likely success is defined by leaving no code, why did I insist on searching for a programmer for this position? Why did I not allow people to do things a manual way, but guide with expectations from the start that we write code?

In the world of concepts and abstractions, code is concrete. Code grounds our conversation. If you as a developer disagree with your code, your code in execution wins that argument. Even when we love saying things like "it's not supposed to do THAT". Watch the behavior and you're not discussing a theory but something you have.

A post by Maaret Pyhäjärvi (@maaretp@mas.to) from May 16, 2024, 15:29: "You are counting OS components, show me the code" => "Your code is counting OSS components but it's a one line change". This is why I don't let trainees do *manual* analysis but I force them to write scripts for the analysis. Code is documentation. It's baseline. It is concrete. And it helps people of very different knowledge levels to communicate.

Sometimes reviewing the detailed steps of the work reveals mistakes like this: a subtle difference between OS (operating system) and OSS (open source software), for someone who is entirely new to this space. The wrong code was a small step to fix, whereas having done a similar analysis manually would have wasted far more effort on the same mistake.

Watching the conversation unfold, I recognize we have already moved through three bumps on our road of the programming way.

The first one was the argument on "I could learn this first by doing it manually". You can learn it by doing it with programming too, and I still remember how easy it is to let that bump ahead become the excuse that keeps you on the path of *one more manually done adjustment*, never reaching the scale. With our little analysis work, the first step to analyze components on one image felt like we could have done it manually. But the fact that getting 20 was 5 minutes of work, that we got from pushing ourselves through the automate bump.

We also ran into two other bumps on our road. The next one, with scale, is the realization that what works for 5 may not work for 6, 11, 12, 17, and 20. The thing you built that works fine for one, even for the second, will teach you surprising things at scale. Not because of performance, but because the generated data you are relying on is not always the same. For the exceptions bump, you extend the logic to deal with the real world outside what you immediately imagined.
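A small sketch of what extending the logic can look like, assuming component records where the license information is sometimes missing, sometimes a plain string, and sometimes a list of strings or of small records. These shapes and the `license_names` helper are hypothetical, made up to illustrate the exceptions bump, and not the format of any particular scanner.

```python
def license_names(component):
    """Return the license names of a component, tolerating uneven data."""
    raw = component.get("licenses")
    if raw is None:
        return ["UNKNOWN"]  # the field was not there at all
    if isinstance(raw, str):
        raw = [raw]  # a single license given as plain text
    names = []
    for entry in raw:
        if isinstance(entry, dict):
            # A small record; take whichever key carries the name.
            entry = entry.get("name") or entry.get("id") or ""
        entry = str(entry).strip()
        if entry:
            names.append(entry)
    return names or ["UNKNOWN"]  # the field was there but empty
```

The first version of such a helper probably handled only the list of strings. Each of the other branches is something the sixth or eleventh image would teach you.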

Testers love the exceptions bump so much that they often get stuck on it, and miss more relevant aspects of quality.

The third bump on our week is the quality bump, and oh my, the conversation on that one is not an easy one. So you run a scanner, and it detects X. You run a second scanner, and it does not detect X. Which one is correct? It could be correct as is (different tools, different expectations), but it could also be a false positive (it wasn't one for anyone to find anyway) or a false negative (it should be found). What if there is a Y to detect that neither does, how would you know that then? And in the end, how far do you need to go in the quality of your component identification so that you can say that you did enough not to be considered legally liable anymore?

The realizations that popular scanners have significant percentage differences in what they find, and that if legal was an attack vector against your company, the quality of your tools would matter for it: that's definitely sinking into the deep end with the quality bump.

I am sure our summer trainee did not expect how big a part of the work asking why and what then is. But I'm sure the experience differs from a young person's first job in the service industry. And it makes me think of those times for myself nostalgically, appreciating that there's an intersection on how our paths cross through time.

Back to the click-bait-y title - programming really is for the juniors. It is for us all, and we would do well not to let the bumpy road stop us from ever going that way. Your first hello world is programming, and there's a lifetime of learning still available.

Wednesday, May 15, 2024

Done with Making These Releases Routine

I blogged earlier on the experience of task analysis of what we do with releases, and shared that work for others to reflect on. https://visible-quality.blogspot.com/2024/02/making-releases-routine.html

Since then, we have moved from 2024.3 release to 2024.7 release, and achieved routine. Which is lovely in the sense of me changing jobs, which then tests if the routine can be held without me present.

In the earlier analysis, I did two things.

Categorized the actions
Proposed actions of low value to drop

Since then, I have learned that some of the things I would have dropped can't be, as they are so built in. For others that I would keep around, I have so little routine or traction from anyone other than me that, since they are compliance oriented, I could drop them too without impacting the end result.

Taking a snapshot of changes I have experimented through, for 2024.3 release I chose to focus on building up two areas of competence. I coached the team's tester very much on the continuous system testing, and learned that many of the responsibilities we allocate for continuous system testing are actually patching of product ownership in terms of creating compliance track record.

The image of the same categories but actual work done for 2024.3 versus 2024.7 shows the change. No more cleaning up other people's mess in Jira. No more manual compliance to insufficiently managed "requirements" to show a systematic approach that only exists with significant extra patching. The automation run is the tests done, and anything else, while invisible, is welcome without the track of it. Use time on doing testing, over creating compliance records.

While the same structure shows how much work one can throw out of a traditional release process, restructuring what remains does a better job describing the change. The thing we keep is the idea that master is ready for release any time, and developers test it. Nothing manual must go in between. This has been the key change that testers have struggled with on continuous releases. There is no space for their testing, and there is all the space for their testing.

In the end, creating a release candidate and seeing it got deployed could be fully automatic. Anything release compliance related can be done in continuous fashion, chosen to be dropped or at least heavily lightened, and essentially driven by better backlog management.

With this release, 30 minutes in, I call my work done. From 4 months of "releasing" to 30 minutes.

If only the release would reach further than our demo environment, I would not feel as much like I wasted a few years of my life as I am moving to a new job. But some things are still outside my scope of power and influence.

Monday, May 13, 2024

Using Vocabulary to Make Things Special

Whenever I talk to new testers about why they think the ISTQB Foundation course was worth their time, one theme above all is raised: it teaches the lingo. Lingo, noun, informal - often humorous: a foreign language or local dialect; the vocabulary or jargon of a particular subject or group of people. Lingo is a rite of passage. If your words are wrong, you don't belong.

And my oh my, do we love our lingo in testing. We identify the subcultures based on the lingo, and our attitude towards the lingo. We correct in less and more subtle ways. To a level that is sometimes ridiculous.

How dare you call a test a developer wrote a unit test if it is a unit-in-integration test, not a unit-in-isolation test? Don't you know that contract testing is essentially different from API testing? Even worse, how can you call something test automation when it does not do testing but checking? Some days it feels like these are the only conversations we have going on.

At the same time, folks such as me come up with new words to explain that things are different. My personal favorites of the words I am inflicting on the world are contemporary exploratory testing, to bring it back to its roots but modernize it with the current understanding of how we build software, and ensemble testing, because I just could not call it mob testing and be stuck with the forever loop of "do you know mobbing means attacking". Yes, I know. And I know that grooming is a word I would prefer not to use on backlog refinement, and that I don't want to talk about a black-white continuum as something where one is considered good and the other bad, and thus I talk about allowlisting and blocklisting, and I run Jenkins nodes not slaves.

Lingo, however, is a power move. And one of those exploratory testing power moves with lingo is the word sessions. I called it a "fancy word" only to realize that people who did some work in the exploratory testing space back in the days clearly read what I write enough to show up to correct me.

Really, let's think about the word sessions. What things other than testing do we do in sessions? What are the words we more commonly use if sessions aren't it?

We talk about work days. We talk about tasks. We talk about tasks that you do until you're done, and tasks you do until you run out of time. We talk about time-boxing. We talk about budgeting time. We talk about focus. But we don't end up talking about sessions.

On the topic of conversations about words, I watched a conversation between Emily Bache and Dave Farley on software engineering and software craftership. It was a good conversation, and created nice common ground in wanting to identify more with the engineering disciplines. It was a good conversation to watch because while on a shallow level it was about the right term to use, it was really talking about cultures and beliefs. The additional words used around those words were valuable to me, inspirational, and thought provoking.

We use words to compare and contrast, to belong, to communicate and to hopefully really hear what the others have to say. And we can do that centering curiosity.

Whatever words you use, use them. More words tend to help.