Monday, October 24, 2022

How to Maximize the Need for Testers

Back when I was growing up as a tester, one conversation was particularly common: the ratio of testers to developers in our teams. A particularly influential piece of writing was from Cem Kaner et al., from the Fall 2000 Software Test Managers Roundtable on 'Managing the Proportion of Testers to (Other) Developers'.

The industry followed the changes in the proportion: from having fewer testers than developers, to peaking at the famous 1:1 tester-to-developer ratio that Microsoft popularized, to again having fewer testers than developers, to an extent where it was considered good to have no developers with a testing emphasis (testers) at all, but to have everyone share the tester role.

If anything, the whole trend of looking for particular kinds of developers to be responsible for test systems added to the confusion of what we count as testers, especially when people are keen to give up the title when salary levels for essentially the same job end up different - and not in favor of the tester title.

The ratios - or a task analysis of what tasks and skills we have in the team that we should next hire for in a human-shaped unique individual - are still kind of core to managing team composition. Some go with the ratio of having at least ONE tester in each team. Others look at tasks and results, and bring in a tester to coach on recognising what we can target in the testing space. Others have it built into the past experiences of the developers they've hired. It is not uncommon to have developers who started off with testing, and later changed focus from specializing in creating feedback systems to creating customer-oriented general-purpose systems - test systems included.

As I was watching some good testing unfold in a team where testing happens by everyone, not only by the resident tester, I felt the need for a wry smile at how invisible the testing I do as the tester would be. Having ensured that no developer is expected to work alone, and having made space for that, I could tick off yet another problem I had suspected I might have to test for to find - but now, instead, I could most likely enjoy that it works, since others pointed out the problem.

To appreciate how small structural changes can make my work more invisible and harder to point at, I collected this *sarcastic* list of how to maximise the need for testers by ensuring there will be visible work for you. Here's my to-avoid list: avoiding these makes the testing I end up doing more straightforward, makes my need to report bugs very infrequent, and allows me to focus more of my tester energies on telling the positive stories of how well things work out in the team.

  1. Feature for Every Developer
    Make sure to support the optimising managers and directors who seek a single name for each feature. Surely we get more done when everyone works on a different feature. With 8 people in the team, we can take forward 8 things, right? That must be efficient. Except we should not optimize for direct translation of requirements to code, but for learning, when allocating developers to features. Pairing them up, or even better *single piece flow* of one feature for the whole team, would make them cross-test while building. Remember, we want to maximise the need for testers, and having developers do some of that testing isn't optimising for it. The developers would fix problems before we get our hands on them, and we are again one bug down on reporting! So make sure two developers work together as little as possible, each reviewing while busy running with their own feature, so that the only true second pair of eyes available is a tester. 

  2. Detail, and externalised responsibility
    Let's write detailed specifications: build exactly this [to a predetermined specification]. All other mandates of thinking belong with those who don't code, because developers are expensive. That leaves figuring out all the higher mandate levels to testers, and we can point out how we built the wrong thing (but as specified). When developer assumptions end up in the implementation, let's make sure they hold as many assumptions as possible, strongly, under an appearance of great answers in detail. *this model is from John Cutler (@johncutlefish)
    There's so much fun work in finding out how they went off the expected rails when you work on a higher mandate level. A wider mandate for testers, but let's not defend the developers' access to user research, learning, and seeing the bigger picture. That could take a bug away from us testers! Ensure developers hold on to their assumptions that could end up in production, and then send a tester to the rescue. Starting a fire just to put it out - why not. 

  3. Overwhelm with walls of text
    It is known that some people don't do so well with reading the essential parts of a text, so let's have a lot of text. Maybe we could even try an appearance of structure, with links and links, endless links to more information - where some of the links contain essential key pieces of that information. Distribute information so that only the patient survive. And if we testers by profession are something, we are patient. And we survive to find all the details you missed. And with our highlighter pens of finding every possible claim that may not be true, we do well when exceptional patience in reading and analyzing is required. That's what "test cases" are for - rewriting the bad documentation into claims with a step-by-step plan. And those must be needed, because concise shared documentation could make us less needed. 

  4. Smaller tasks for devs, and count the tasks all the time
    In the name of visibility and continuous tracking, let's make sure the developers have to give very detailed plans of what they will do and how long it will take. Also, make sure they feel bad when they use even an hour longer than they thought - that will train them to cut quality to fit the task into the time they allocated. Never leave time between tasks for looking at more effective ways of doing the same things, learning new tech, or making better use of current tech. Make sure the tasks focus on what needs *programming*, because it's not like they need to know how accessibility requirements come from the most recent legislation, how supply chain security attacks have played out in their tech, or what the common UX heuristics expect - all the more for the testers to point out what they missed! Given any freedom, developers would be gold-plating anyway, so better not give room for interpretation there. 

  5. Tell the developers they are too valuable to test, and we need to avoid overlapping work
    Don't lose out on the opportunities to tell developers how they were hired to develop and you were hired to test. Remember to mention that if they test it, you will test it anyway, and ensure the tone makes sure they don't really even try. You could also use your managers to create a no-talking zone between a development team and a testing team, with a very clear outline of everything the testing team will do, making it clear what the development team does not need to do. Make sure every change comes through you, and that you can regularly say it was not good enough. The less your developers opt in to test, the more you will be needed. Never mind that the time-consuming part is not overlapping testing work but delayed fixing - testing could be quite fast if and when everything works. But that wouldn't maximise the need for testers, so make sure the narrative makes them expect you to pick up the trash. 

  6. Belittle feedback and triage it all
    Make sure it takes a proper fight to get any fixes onto the developers' task lists. A great way of doing this is making management very, very concerned over change management and triaging bugs beforehand, so developers get only well-chewed, clear instructions. No mentioning bugs in passing so that they might get fixed without anyone noticing! And absolutely no ensemble programming, where you would mention a bug as it is about to be created - use that for collecting private notes to show later how you know where the bugs are. You may get as far as managers telling developers they are not allowed to fix bugs without manager consent. That is a great ticket to all the work of those triage meetings. Nothing is important anyway, make the case for it. 

  7. Branching policy to test in branch and for release
    Make sure to require every feature to be fully tested in isolation on a branch, manually, since automation is limited. Keep things in branches until they are tested. But also be sure to insist on a process where the same things get tested integrated with other recent changes, at latest at release time. Testing twice is more testing than testing once, and types of testing requiring patience are cut out for testers. Maximize the effect by making sure this branch testing cannot be done by anyone other than a tester, or the gate leaks bad quality. Gatekeep. And count how many changes you hold at the gates to scale up the testers. 

  8. Don't talk to people
    Never share your ideas of what might fail when you have them. Your ideas are less likely to find a problem when you use them if someone else got there first. It might also be good not only to skip PRs but to not talk about changes at all. Rely on interface documentation over any conversation, and when the documentation is off, write a Jira ticket. Remember to make the ticket clear and perfect with full info - that is what testers do, after all. A winning strategy is making sure people end up working on neighbouring changes that don't really like each other; the developers not talking bodes ill for the software talking, either. Incentivising people not to work together is really easy to do through management. 

Sadly, each of these is a behavior I still keep seeing in teams. 

In a world set up for failing in the usual ways, we need to pay special attention to doing the right thing.

It's not about maximising the need for testers. The world will take care of providing bigger and harder systems. Time will take care of us growing to work with the changing landscape of each project's expectation of what the best value from us is, today.

There is still a gap in results of testing. It requires focused work. Make space for the hard work. 

Saturday, October 22, 2022

Being a Part of the Solution

A software development team integrated a scanning tool that provides two lists: one about licenses in use, and another about supply chain vulnerabilities in all of the components the project relies on. So now we know. We know to drop one component so that the licenses list follows an established list of what we can use. And we know we have some vulnerabilities at hand. 

The team thinks of the most natural way forward: updating the components to their latest versions. Being realistic, they scan again, to realize the numbers are changing, and while the totals are down some, the list is far from empty. The list is, in fact, relevant enough that there is a good chance there are no new, more relevant vulnerabilities on it. 

Seeking guidance, the team talks to security experts. The sentiment is clear: the team has a problem and the team owns the solution. The experts reiterate the importance of the problem the team is well aware of. But what about the solution? How do we go about solving this? 

I find this same thing - saying fixing bugs is important - is what testers do too. We list all the ways the software can fail, old and new, and at best we help remind people that some of the things we are now mentioning are old, but their priority is changing. All too much, we work in the problem space, and we shy away from the solutions.

To fix the listing that security scanners provide, you need to make good choices. If you haven't made some bad choices and some better choices, you may not have the necessary information for experimenting your way into even better choices. Proposals of certainly effective choices are invaluable. 

To address those bugs, the context of use - acting as a proxy for the users to drive the most important fixes first - is important. 

Testers are not only information providers, but also information enrichers, and part of the teams making the better choices on what we react to. 

Security experts are not just holders of the truth that security is important, but also people who help teams make better choices - so that the people spending more time specializing aren't specializing only in knowing the problem, but also in possible solutions.

How we come across matters. Not knowing it all is a reality, but stepping away from sharing the responsibility of doing something about it is not an option. 

Monday, October 17, 2022

Test Automation with Python, an Ecosystem Perspective

Earlier this year, we taught 'Python for Testing' internally at our company. We framed it as four half-day sessions, ensemble testing on everyone's computers to move along on the same exercise, keeping everyone along for the ride. We started with installing and configuring VS Code and Git, Python and pytest, and worked our way through tests that look for patterns in logs, to tests on REST APIs, to tests on the MQTT protocol, to tests on a webUI. Each new type of test was just a matter of importing libraries, and we could have continued the list for a very long time. 

Incrementally, very deliberately growing the use of libraries works great when working with a new language and new people. We imported pytest for decorating with fixtures and parametrised tests. We imported assertpy for soft assertions. We imported approvaltests for push-results-to-file type comparisons. We imported allure-pytest for prettier reports. We imported requests for REST API calls. We imported paho-mqtt for dealing with MQTT messages. And finally, we imported selenium to drive webUIs. 
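To make the incremental-imports point concrete, the first step of that ladder might look roughly like this - a parametrised pytest test searching for patterns in log lines. The log content and the patterns here are made up for illustration, not the course's actual exercise:

```python
import re

import pytest

# Illustrative log lines; the course exercise worked on real logs.
LOG_LINES = [
    "2022-10-17 12:00:01 INFO service started",
    "2022-10-17 12:00:05 ERROR connection refused",
]

# pytest's parametrize decorator turns one test function into one
# test run per pattern - the first library feature we imported for.
@pytest.mark.parametrize("pattern", [
    r"INFO service started",
    r"ERROR \w+ refused",
])
def test_log_contains_pattern(pattern):
    assert any(re.search(pattern, line) for line in LOG_LINES)
```

Each later session swapped the imports (requests, paho-mqtt, selenium) while the pytest structure around the tests stayed the same.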

On the side of selenium, we built the very same tests importing playwright to drive webUIs, to have concrete conversations on the fact that while there are differences in the exact lines of code you need to write, we can do very much the same things. The only reason we ended up with this is the ten years one of our pair of teachers has on selenium, and the two years the other has on playwright. We taught on selenium. 

You could say we built our own framework. That is, we introduced the necessary fixtures, agreed on file structures and naming, and selected runners and reports - all for the purpose of what we needed to level up our participants. And they learned many things, even the ones with years of experience in the Python and testing space. 

Libraries and Frameworks

A library is something you call. A framework is something that sits on top, like a toolset you build within. 
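A minimal sketch of that distinction, with made-up names: you call a library whenever you want to; a framework calls your code and decides when.

```python
# A "library" function: your code calls it.
def slugify(title):
    return title.lower().replace(" ", "-")

# A toy "framework": you register code into it, and it runs the show.
class TinyRunner:
    def __init__(self):
        self.tests = []

    def register(self, func):
        self.tests.append(func)
        return func

    def run(self):
        # The framework, not you, decides the order and timing of calls.
        return {func.__name__: func() is None for func in self.tests}

runner = TinyRunner()

@runner.register
def test_slugify():
    assert slugify("WebUI Testing") == "webui-testing"

results = runner.run()  # the framework invokes the registered test
```

pytest works the same way at a larger scale: it discovers your test functions and calls them, while requests or paho-mqtt sit and wait to be called.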

The selenium library in Python is for automating browsers. As the project web page says, what you do with that power is entirely up to you. 

If you need other selections made on the side of choosing to drive webUIs with selenium (or playwright, for that matter), you make those choices. It would be quite a safe bet to say that the most popular Python framework for selenium is proprietary - each organization making its own choices. 

But what if you don't want to be making that many choices? You seem to have three general purpose selenium-included test frameworks to consider: 
  • Robot Framework with 7.2k stars on GitHub
  • Helium with 3.1k stars on GitHub (and less active maintainer on new requests) 
  • SeleniumBase with 2.9k stars on GitHub
Making sense of what these include is a whole other story. Robot Framework centers around its own language that you can extend with Python. Helium and SeleniumBase collect together Python ecosystem tools, and use conventions to streamline the getting-started experience. All three are a dependency that sets the frame for your other dependencies. If the framework does not (yet) include support for Selenium 4.5, then you won't be using Selenium 4.5. 

Many testers who use frameworks may not be aware of what exactly they are using, especially with Robot Framework. Also, Robot Framework is actively driving people from the selenium-based library for RF to the newer browser library for RF, which is based on playwright. 

I made a small comparison of framework features, comparing generally available choices to choices we have ended up with in our proprietary frameworks. 

Frameworks give you features you'd have to build yourself, and centralise and share maintenance of those features and dependencies. They also bind you to those choices. For new people they offer early productivity, sometimes at the expense of later understanding. 

That later understanding, particularly with Robot Framework being popular in Finland, may never become visible, and in some circles this has become a common way of recognising people stuck in an automation box we want to get out of. 

Friday, October 14, 2022

WebUI Testing

I don't know about you, but my recent years have seen a lot of systems where users are presented with a webUI. The embedded devices I've tested ship with a built-in web server serving pages. The complex data processing pipelines end up presenting chewed-up observations and insights on a webUI. Some are hosted in the cloud, others in their own server rooms. Even the Windows user interfaces turned into webUIs wrapped in some sort of frame that appears less of a browser, but is really just a specialized browser. 

With this in mind, it is no surprise that test automation tooling in this space is both evolving and a source of active buzz. The buzz is often about the new things, introducing something new and shiny. Be it a lovely API, the popularized 'self-healing', or the rebranded 'low-code/no-code' that has been around at least as long as I have been in this industry, there is buzz. 

And where there's buzz, there are sides. I have chosen one side intentionally: against Robot Framework and for libraries in the developer team's dominant language. On the libraries, I am trying very hard to be, as they say, Switzerland - neutral ground. But how could I be neutral ground, as a self-identified Playwright girl and a member of the Selenium Project Leadership Group? I don't really care for any of the tools, but I care for the problems. I want to make sense of the problems and how they are solved. 

In the organization I spend my work days in, we have variety. We have Robot Framework (with the Selenium library). We have Robot Framework (with the Playwright library). We have Python with pytest and Selenium, and Python with pytest and Playwright. We have JavaScript with Selenium, Playwright, TestCafe, and Cypress. The love and enthusiasm for doing well seem to matter more for success than the libraries, but the jury is still out. 

I have spent a significant amount of time trying to make sense of Playwright and Selenium. Cypress, with all the love it is getting in the world and in my org, seems to come with functional limitations; yet people always test what they can with the tools, and figure out ways of telling themselves that was the most important thing to do anyway. Playwright and Selenium, that pair is a lot trickier. The discussion seems to center around both testing *real browsers*. Playwright appears to mean a superset-engine browser that real users don't use and would not recognise as a real browser. Selenium appears to mean the real browser, the users-use-this browser, with all the hairy versions and stuff that add to the real-world complexity in this space. The one users download, install on their machines, and use. 

Understanding this difference in what Playwright and Selenium drive for you isn't making this easy for me.

I have a strong affinity for the idea of risk-based testing, and I build the need for it on top of experiences of maintaining cross-browser tests being more work than value. In many of the organizations I have tested in, we choose one browser to automate on, and cover other browsers by agreeing on a rotation based on days of the week in testing, by one-off runs of automation half-failing with significant analysis time, or by agreeing on different people using different browsers while we use our own webUI. Hearing the customer feedback and analyzing behaviors from telemetry, we have thought we have so few cross-browser problems that the investment in cross-browser testing has just not been worth it. 
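The day-of-the-week rotation mentioned above can be as simple as a lookup. This sketch is only an illustration; the browser names and the rotation order are hypothetical:

```python
from datetime import date

# Hypothetical weekday rotation for eyes-on browser coverage, Mon..Fri.
ROTATION = ["chrome", "firefox", "edge", "safari", "chrome"]

def browser_for(day=None):
    """Pick the browser that gets manual attention on a given day."""
    day = day or date.today()
    if day.weekday() > 4:  # weekend: nothing scheduled
        return None
    return ROTATION[day.weekday()]
```

Cheap tricks like this spread some coverage over the users-use-this browsers without the maintenance cost of a full cross-browser automation matrix.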

With a single-browser strategy in mind, it matters less if we use that superset-engine browser and automation never sees the users-use-this browser. There is the eyes-on-application work on our own computers that adds the users-use-this browsers, even if not as the continuous feedback on each change that automation can provide. The risk has appeared both low in likelihood and low in impact, as it rarely has hit a customer. We use the infamous words "try Chrome as a workaround" while we deliver the fix in the next release. 

The reality is that since we don't test across browsers, we believe this is true. It could be true. It could be untrue. The eyes-on sampling has not shown it to be untrue, but it is also limited in coverage. Users rarely complain; they just leave if they can. And recognising problems from telemetry is still very much a form of art. We don't know if there are bugs we miss in our applications when we rely on superset-engine browsers over users-use-this browsers. 

The browsers of today are not the browsers of the future. At least I am picking up a sense of differentiation emerging, where one seems to focus on privacy-related features, another is more strict on security, and so on. Even if superset-engine browsers are sufficient for testing today, are they sufficient for testing in five years, with the browsers in the stack becoming more and more different from one another? 

Yet that is not all. The answers you end up giving to these questions are going to be different depending on where your team's changes sit in the stack. Your team's contribution to the world of webUIs may be your very own application, and that is where we have the large numbers. Each of these application teams needs to test their very own application. Your team's contribution may also be the framework applications are built on. Be it WordPress or Drupal, or React or Vue, these exist to increase productivity in creating applications and come to an application team as a 3rd-party dependency. Your team's contribution could also be in the browser space, providing the platform webUIs run on. 

Picture. Ecosystem Stack

This adds to the trickiness of the question of how we test for the results we seek. Me on top of that stack, with my team of seven, will not want to inherit the testing of the framework and browser we rely on, when most likely there are bigger teams already testing those and we have enough of our own. But our customers using that webUI we give them have no idea if a problem is created by our code, the components we depend on, or the browser we depend on to run this all. They just know they saw a problem with us. That puts us in a more responsible spot, and when the foundation under us leaks and gives us a bad name, we try making new choices of platform when possible. And we try clear, timely reports, hoping our tiny voices are heard with that clarity in the game with mammoths. 

For the application teams, we have the scale that matters the most for the creators of web driver libraries. And with this risk profile and team size, we often need ease, even shortcuts. 

The story is quite different for the platforms that this scale of applications relies on. For both browsers and frameworks, it would be great if they lived with users-use-this browsers, with versions, variants and all, and did not shortcut to a superset-engine type of approach where figuring out that something is essentially different becomes a problem for their customers, the webUI development community. The browser and framework vendors won't have access (or the means to cover everything even if they had access) to all our applications, so they sample applications based on some sampling strategy to convince themselves that their contributions are tested and work. 

We need to test the integrated system, not only our own code, for our customers. Sitting on top of that stack puts our name on all the problems. But if it costs us extra time to maintain cross-browser tests for users-use-this browsers, we may just choose that we can't afford to - the cost and the value our customers would get are not in balance. I'm tired of testing for browser and framework problems in this ever-changing world because those organizations wouldn't test their own, while our customers will never understand the complexities of responsibilities across this ecosystem stack. 

We would love it if our teams could test what we have coded, and a whole category of cross-browser bugs were someone else's problem. 

Saturday, October 8, 2022

When do YOU start testing?

This week I was a guest on a podcast whose host and I found one another on Polywork. It's a development podcast, and we talked of testing, and the podcaster is an absolute delight to talk to. The end result will air sometime in December. 

One question he asked left me thinking. Paraphrasing, he asked: 

"When do you start testing?" 

I have been thinking about it since. I was thinking about it today while listening to TestFlix and the lovely talk by Seema Prabhu, and the vivid discussion her talk created on the sense of a wall between implementing and testing. And I am still thinking about it. 

The reason I think about this is that there is no single recipe I follow. 

I start testing when I join a project, and will test from whatever I have there at that moment. Joining a new product early on makes the testing I do look very different than joining a legacy product in support mode. But in both cases, I start with learning the product hands-on, asking myself: "why would anyone want to use this?" 

After being established in a project, I usually find myself working with a continuous flow of features and changes. It would be easy to say I start testing features as we hear about them, and that I test them from then on until they are retired. More concretely, I often take part in the handoff of the request for a feature, clarifying it with acceptance criteria that we write down and the product owner reviews to ensure it matches their idea. But not always, because I don't need to work on every feature, as long as we never leave anyone alone carrying the responsibility of (breaking) changes. 

When we figure out the feature, it would be easy to say that as a tester, I am part of architecture discussions. But instead, I have to say I am invited to be part of architecture discussions, and particularly recently I have felt that the learning and ownership that needs to happen in that space benefits from my absence, and my lovely team gives a few-sentence summary that makes me feel like I got everything from their 3 hours - well, everything that I needed anyway, in that moment. Sometimes me participating as a tester is great, but not always. 

When the first changes are merged without being integrated with the system, I can start testing them. And sometimes I do. Yet more often, I don't. I look at the unit tests, and engage with questions about them and may not execute anything. And sometimes I don't look at the change when it is first made.

When a series of changes becomes "feature complete", I can start testing it. And sometimes I don't. It's not like I am the only one testing it. I choose to look when it feels like there is a gap I could help identify. But I don't test all the features as they become feature complete, sometimes I test some of the features after they have been released. 

Recently, I have started testing of features we are planning for roadmap next year. I test to make sure we are promising on a level that is realistic and allows us to exceed expectations. As a tester here, I test before a developer has even heard of the features. 

In the past, I have specialized in vendor management and contracts. I learned I can test early, but that reporting the results of my testing early can double the price of a fixed-price contract without adding value. Early conversations about risks are delicate, and contracts have interesting features requiring a very special kind of testing. 

When people ask when I start testing, they seem to look for a recipe, a process. But my recipe is varied. I seek the work that serves my team best within the capabilities I have at that moment in time. I work with the knowledge that testing is too important to be left only to testers, but that does not mean testers would not add value in it. But to find value, we need to accept that it is not always located in the same place. 

Instead of asking when do I start testing, I feel like reminding that I never stop. And when you don't stop, you also don't start. 

When do YOU start testing?