Wednesday, September 11, 2024

Do Thee TDD?

Sampling many customer organizations, I can't help but notice a customer theme we aren't answering well: the question of whether we are doing test-driven development.

A lot of us know what it is. We usually have learned to recognize it as possibly two different patterns: 

  1. TDD while programming. Super-small loops (inside out,  'Chicago school'). Or small loops with mocks at play (outside in, 'London school'). 
  2. ATDD (BDD, SBE - lots of names for a similar idea) where examples characterize the feature before adding it. 
For a lot of the customers though, I realize these two are more intertwined. The conversation very often gets derailed into defining whether the test *really happened before*, and how often it made sense for each of the developers to write the test first ('isolating a bug is great test first') or to write it as part of the few hours to few days feature they are on ('easier to capture intent in the same pull request when I first figured out how to get it done'). At the scale the customer looks at, you can't really tell if it was before or after. At the scale of the developer, learning the test-driven development techniques, both Chicago and London styles to mix up, probably does a whole world of good for controlling and describing intent and for not missing relevant bits with short-loop-after. 

The customer's concern is not always whether the test came first. It is whether it came before (ATDD style) and whether it came with the change itself (included in the PR). 

I find myself characterizing the answers to this theme with slightly more granularity: 
  • Level -1. Test after with tester tests and bug reports. This happens a lot too. The 'nightly run' where analyzing the failures takes a week. We've all been there. Let's hope for a generation of developers who will look puzzled at that statement. 
  • Level 0. No Sign of TDD. When code is merged with pull request, significant effort of testing follows in subsequent pull requests. There could be test changes with the original pull request, but their intent tends to be to get old tests to pass. 
  • Level 1. Short-Loop-After. When code is merged, so are tests. Same pull request. Thus in the same repo, going into the pipeline. Little care whether it was a mix of before and after writing the implementation, because the loop is short enough. This is more driven and continuous than anything we used to have, and we should celebrate it. 
  • Level 1b. Disciplined TDD. When code is merged, so are tests. Mixing outside in and inside out, with and without mocks, but the developers consistently write tests first. 
  • Level 2. Acceptance criteria with examples. Examples from customers, illustrating core things that are different after the change, and introduction of the new behavior. Just having the examples around helps developers with a clearer definition of done, and less looping back to new information to learn. Things aren't obvious to everyone in the same way. 
  • Level 3. BDD automation before implementation. Examples passing one by one drive the sense of whether we are done with the change. 
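
As a rough illustration of Levels 2 and 3, customer examples can be written down as data before the implementation exists, and then checked one by one. This is only a sketch: the discount rule, the `order_discount` function and every value in `EXAMPLES` are hypothetical, invented here to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One customer-provided example: a situation and the expected outcome."""
    description: str
    order_total: float
    is_member: bool
    expected_discount: float


# Hypothetical acceptance examples, agreed with the customer before coding.
EXAMPLES = [
    Example("non-member, small order", 50.0, False, 0.0),
    Example("member, small order", 50.0, True, 2.5),
    Example("non-member, large order", 200.0, False, 10.0),
    Example("member, large order", 200.0, True, 20.0),
]


def order_discount(order_total: float, is_member: bool) -> float:
    """Hypothetical implementation, grown until every example passes."""
    rate = 0.0
    if is_member:
        rate += 0.05
    if order_total >= 100.0:
        rate += 0.05
    return round(order_total * rate, 2)


def failing_examples(examples):
    """Return descriptions of the examples the implementation does not satisfy yet."""
    return [
        ex.description
        for ex in examples
        if order_discount(ex.order_total, ex.is_member) != ex.expected_discount
    ]
```

An empty list from `failing_examples(EXAMPLES)` is one possible reading of "done with the change"; a non-empty one names exactly which example is still open.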

The first three teams I think of are on levels -1, 0 and 1. They all aspire to level 2. 

Smaller steps may make it more manageable as a change. Where are you, and where are you heading? 

Monday, September 9, 2024

Learning to test in Dynamics 365 projects

How do you become an expert in something you did not know yet? By learning about it. You have a foundation of knowledge you probably acquired on other products, and if the foundation is large enough, you will be bound to see similarities. This is how I feel about being thrown at my first Dynamics 365 project. 

Learning in public - explaining how my learning evolves and how my thinking evolves - gives me a chance of learning from people who know things I did not. And it provides the odd chance that my learning is something of use to someone else. 

What's this about?

Dynamics 365 is one of the (many) platforms. You may have, like me, experienced SAP. Or Salesforce. Or Guidewire. Or Odoo. And you can continue the listing. Essentially, these are things enabling reuse. I personally like to call them platform products. There is a lot of common functionality for all their users. Yet there is even more of their own data, configurations, integrations and changes, so that the resulting system looks different, is used differently, and most definitely holds the information and processes of highly different organizations. They are the epitome of modern reuse. If you could buy a product and use the product everyone else uses too, maybe you would not have to build your very own system. Meanwhile, tailoring enough means that the theory of reuse meets practice in this thing we lovingly call testing, where the rubber meets the road and good plans go to meet empirical evidence to ensure our business still runs with all the plans in place. 


This particular one is a product platform in the cloud, made by Microsoft. Organization count 1. It is usually configured, integrated and extended by an integration partner. Organization count 2. In integrations, there may be a load of other systems as data sources and data targets. Organization count 2+N. At the start of the chain is the organization that assigned responsibilities to all the other organizations: the owner of the system/service, the customer with their users. Organization count 3+N, plus the responsibilities of ownership.

Your usual testing vocabulary isn't helping me

Calling some of this testing acceptance testing isn't really helping me. And particularly, calling some of this unit testing isn't helping me, almost the opposite. Surely if we configure functionality (or decide not to configure it, and work with defaults), it makes sense to verify that the behavior I get is the behavior I want. Most often that testing, though, needs to happen with at least a partially integrated system, and it may very well be just partial. This drives the design: I would need the testing vocabulary to reflect testing of components/services, integrations, and flows across components, services and integrations. Instead of shifting left, here I need shifting down. I need to understand the smaller scope I can verify a functionality in. And if I succeed in that, the feedback granularity is better for the organization that is expected to react to the feedback. 

Theoretically speaking, it would be great if these platform products shipped with tests for the defaults. They rarely do. If they did, I could test with defaults, adapt and extend those tests to test with my configurations, and build a systematic feedback that tells the chain of responsibilities. 

However, I usually end up in these projects from the ownership organization perspective. For me to know if our business flows work, I approach this with the idea of testing core business flows with the application, targeting it with the knowledge of changes. It tends to be better if the chain works, and chaos ensues if the system is significantly broken. 

That Test Automation thing?

This comes along quite naturally. You have rolling updates (where you may not be able to delay the update at all), and you have quarterly updates (where staying without updating is not possible as an approach, for good reasons). But this means you have pretty much continuous responsibility for testing in the organization of ownership. 

Some people rely on staying close to defaults, and take the risk that if the product platform does not work with defaults, it gets rolled back and fixed by the product platform organization. The closer to defaults you are, the more likely you are to be able to play with timings so that the first wave of installers hits whatever was in your way. There's risk, but the risk may be manageable close to defaults. 

Yet usually we are not close to defaults. The further away from defaults we shift, the more there is functionality the product platform organization is unaware of, unable to test for, and thus responsibility for it surviving change is allocated later in the chain. 

You would usually invest in test automation for this. It could be component level, for things where you go furthest from the defaults. It could be process level, to catch things on the basic flows. Or it could be an intricate web of both of these. Plus the test automation that tells you when to point blame towards the product platform. 

In the whole chain, assigning the responsibilities to strategically design the necessary automation is on the organization of ownership. This is where the low code tools find their most lucrative points of entry. 

However, the "no code" approaches are just a visual programming language. If it diffs poorly, it is poorly maintainable. It's a balance, and a belief system. I don't think acceptance testers recording automation tests is the way to go. Shifting down for designing per component / service feedback is the way to go. Visibility of these tests is the way to go. 

Technologies, architectures - it all maps to common web / cloud

Scratching this just a little deeper, I come to realize I have very basic web / cloud things at scale. 

Web pages can be automated with Selenium, Playwright - well, any of the web driver libraries and related testing frameworks. The "scary" parts (shadow DOMs, dynamic IDs and deeply nested components) could perhaps use the help of a tool that hides some of that locator complexity. But if it's complex enough, hiding it also means taking away the power to maintain it. 

REST APIs can be automated with any of the language specific libraries. 
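
To show how little is needed, here is a minimal sketch of a REST check using only the Python standard library. Everything about the service is an assumption made for illustration: the `/api/accounts/42` resource, its JSON fields and the local stub server standing in for a real system are made up, and a real project would point `get_json` at its own endpoints and authentication.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Stand-in for a real service; the /api/accounts/42 resource is made up."""

    def do_GET(self):
        if self.path == "/api/accounts/42":
            body = json.dumps({"id": 42, "status": "active"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


def get_json(url: str):
    """Fetch a URL and return (status code, parsed JSON body)."""
    # An empty ProxyHandler bypasses any proxy settings from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


def check_account_endpoint():
    """Start the stub on a free local port, call it, and return what was observed."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return get_json(f"http://127.0.0.1:{port}/api/accounts/42")
    finally:
        server.shutdown()
        server.server_close()
```

The part a tester maintains is `get_json` plus a few assertions on what comes back; the stub only exists so the sketch runs without any real environment.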

Why did I want a commercial tool I would have to learn? Or why would I choose to teach that commercial tool to my fellow testers over teaching them the basics of programming for test automation purposes that I know even business testers are capable of learning? 

Let's say the jury is still out in this space. I'll write more when that makes sense to me.

The First Experience - A User's Experience

My first touches with these projects come from having used systems built with this - without really realizing I had. Connecting the realization to use examples, I also have examples of missing functionalities on Safari, and of being forced to use incognito mode and clean caches to be able to get some of these tools to work. 

The real question is: since the user's experience has directed me to not use Safari, would we care to use all browsers? And what drives the browser differences - I'll learn. 

The Lingo

Finally, in addition to testing vocabulary, there is the product lingo. D365FO, D365CE, feature names, change listings, scope of each project. I find myself classifying: product platform vs. configuration to make sense of it. 

Turns out there is D365RF - the common test automation keywords for Robot Framework. 


Is this how you take on new testing assignments too?

With a baseline thinking written down, I'll let you know how much more I know in a few weeks. 

Monday, September 2, 2024

Who Are You Paying to Learn AI with You?

Two years ago, Heini Ahven published a research paper (thesis) on AI in testing, concluding from her interviews that there are two particular hurdles for AI in testing: data, and a customer to pay for it. While in two years we have shifted away from needing data and now primarily discuss the use of generative models, the essential challenge of a customer to pay for it remains. Customers are making choices of who they bet on and pay to learn AI with them. 

It would seem to me that these things are good reasons to bet on us:

  • We have a track record of having created customer specific systems with AI in them
  • We have a track record of having new products with AI in them, most notably in modernization of legacy systems, but generally too many to list
  • We publicly say (and can back it up) that we have already invested 1M into AI in the last year, and built quite a platform of knowledge with it
  • We know software development, and we know testing. And we know these in scale. 
That's the high level. Yet I think that reframing the question from 'do we have solutions' to 'who are you paying to learn AI with you' is the way to go. And personally I think that you would do well learning AI with me and my crowd. 
  • We have looked at large numbers of testing tools with AI in them, and can help you sort out positioning of those tools
  • We have used tools with AI in creating test artifacts, and can help you sort out sociotechnical guardrails of use of these tools so that you can steer your learning
  • We're happy to pick up a tool you want to learn with even if we haven't yet, and amplify your learning with success in mind 
There is a lot going on at the scale I get to pull from. I chose six activities and four values that are core to the approach I work with. 

We need to know where we are to make sense of where we are heading. We expect improvement, and agreeing on how improvement can be recognized is key.

We experimented already, and we scale to experiment with more customers. Any solution in this space has learning at heart, and keeping learning at heart steers to benefits. 

We collected sociotechnical guardrails for different kinds of applications of AI in testing. A lot of what we have been learning we can feed into new organizations, and improve with ongoing learning that benefits us all. 

We rely on building new habits that are good habits, to instill and sustain a culture of learning. This usually means we need to work with people in working through the change rather than input material. 

What we learn, we teach on. Sharing is a way of continuously seeking improvement. 

Some of this we package and make available in scale that helps anyone. These are new tools and services emerging. We recognize attribution is also IP, and we recognize scale will have a mix of different kinds of IP. 

In these six activities, we value four things: 
  • Our approach is human-centric and we are learning best ways to have people in the loop
  • We seek to enhance what people already do, for the better
  • Our expectation is incremental, with controlled investments that can expect results
  • By focusing on many customers while carefully hearing each customer's specific challenges, we seek to make a helpful impact in the testing field at scale
This said, the question remains: who are you paying to learn AI with you, and could it be us? 

The author is Director, Consulting Expert at CGI Finland, focusing on AI-driven application testing. She usually writes about her work in ways that aren't specific to CGI and felt like making an exception today. She is primarily seeking Finnish customers to join in increasing the use of AI in testing, and believes open calls for collaboration are a preferable approach when seeking early adopters. The expectation for her new position at CGI is that she meets customers 130 times per year, and scheduling a short conversation on how she could help would be mutually beneficial, even if it is an unusual approach. She can be reached at maaret.pyhajarvi@cgi.com.

Do testers need to be devops engineers too?

At a point in my testing career, I specialized in understanding test environments. It started off with seeing connections between subsystems, and recognizing compatible data. Well, there was no other choice for testing effectively in the insurance industry, where a new IBM mainframe test environment (I got one of those too!) took 1 million and 1 year. I can't remember if it was the time when we still had the Finnish mark as the unit of money, or if it was already euro time; I just remember the overwhelming sense of responsibility for enabling a project with a million in spending. That time of my career added awareness to any environment I would test in since, and categorizing workstations and servers and versions became a routine. 

When cloud later landed on my world, the categorization and foundation of test environments was particularly useful. Recognizing locations for storage, compute and specialized services, and setting up connections and dependencies between them, all geographically distributed and provisioned as needed, with controls we have and controls we recognize but don't have, helped a lot in figuring out what it was that I was testing. 

It was easy to grasp that the new project I just started with will have a better working test environment early in the week, and that we could expect troubles as the week advances. I could see which symptoms are likely to be about having provisioned a smaller test environment and which are likely to be results of data and memory, and I can design the way I test particular things around the weekly and daily cadence that I recognize going on in the test environment. 

I was thinking about this today, as I saw someone asking if testers need to understand CI/CD, and what of it, and what specific skills we are supposed to have in that space. 

Many of my colleagues extending into the test automation space go the CI/CD pipelines route after they realize that programmatic tests run manually on their machine won't be much in the way of automation. There is significantly higher value if tests are run right after a change that could break things is made, and that requires designing the tests into a CI/CD pipeline. Many of those colleagues find barely a day a week for doing testing once running the tests nightly turns into optimized sets in the pipelines, and environments turn into dockerized orchestrated platforms where nothing changes in the infra without changing lines of code (Infrastructure as Code). 

I still work with testers who understand environments on the level I used to - recognizing that there is a different address for two different yet same test environments, with heuristics on what to pay attention to in each. They, like me before I integrated a lot of this CI/CD pipelines stuff into my thinking, use environments with specific timing patterns to control the version they are experiencing in testing. They may design environments on the level of not allowing change, because installing often means the environment is unavailable or outside the control we understand. These testers need to understand CI/CD as a mechanism of publishing and scheduling, but go no further. 

Increasingly, I work with testers who design and enhance pipelines. While they don't need to do all the changes themselves, they need to read pipelines to see what goes on. Red has a reason, and drives their days of working to see where red is coming from. The majority of people in this group configure new jobs within the same realm of examples, and don't really take things further. Only some bring in new tools. But the new tools part, that is something people seem to love doing.

Then there are people who live in pipelines. Tests are placeholders and boxes, but they rarely have time to go and think about their coverage themselves. Working for the pipelines is the work. Making them run on better infra. Adding new tools. Upgrading the existing tools. Building all the machinery that could support the teams. These people, even with testing background, tend to call themselves devops engineers, to emphasize their attention to infrastructure and pipelines. 

When hiring for a tester, you may expect any of these levels of knowledge. A lot of people search for the middle ground. More and more, we expect people to come with the knowledge of what control pipelines give to your test environments and options of testing. 

And more and more, what we seek is a balance where people know enough about pipelines yet still manage to test, not only build pipelines.


Saturday, August 31, 2024

Ethical Stress of the Consulting Work

For three months now, I have been adjusting into a new identity. With my new job, I am now a consultant, a contractor and a service provider. I work for managing a product of testing services, and provide some of those services. 

It's not my first time on this side of the table, but it's my first time on this side of the table knowing what I know now. 20 years ago when I was a senior consultant, I was far from senior. I was senior in a particular style of testing, driven to teach that style forward and learn as much as I could. And while I got to work with 30 something customers opening up new business of testing services back then, I was blessed with externally provided focus and bliss of ignorance.

I had a few reasons to stop being a consultant back then: 

  • Public speaking. I wanted to speak in public, and as a consultant your secondary agenda of sales was getting in the way. Not really for me, but in eyes of others. I got tired of explaining that I would not be able to go to my organization for sponsorship just so that I could speak, and that I was uncomfortable building that link when contents should drive the stage. I knew being in customer organizations for exactly the same work would change the story. And it did. And that mattered to me. 
  • Power structures. With customers and contractors, there is a distribution of power. When a major contractor in a multi-customer environment kicked the testing representatives of two out of three organizations out of the steering group, citing "competitive secrets", and I was the only one allowed in the room to block the play that was about to unfold, I learned a lesson: being in the customer organization was lending me power others lacked. Back then I had no countermeasures, and I do now, having been in boardrooms both as a testing expert advising board members and as a member of those boards. 
Thus 20 years later, I knew what I was doing when I joined consulting to solve the problem of testing competences in testing services at scale. I knew consultancies are the place where this can be solved at scale. I knew the numbers are not on the individual customer's side, and scale means I need to serve many customers. I knew I needed to level up the testing services to become the testing services I had so much difficulty purchasing when I was on the customer side. 

What I did not know is that the job of consulting would teach me about ethical stress. Because when you serve many yet invoice some, your life is a daily balancing with your sense of fairness. And you will feel the push of just working for one customer so that the overhead of context switching would not be on any of them. The teaching of ethics this particular organization offers adds to the stress. Working by the hour is an extra mental load. 

It's not that I can't deal with it. It's just that it is so much more than it was before that it sticks out, and I need to label it: 

Ethical stress is the continuous sense of having to balance the different perspectives. 

If it takes me an hour to report hours, who should pay for that hour? 
If I get interrupted with another customer while working for a different one, who pays for the reorientation time?
If I have to create a plan and actually have to follow that plan even though I know better, do I look worse even though I am better? 

Having recognized this, I now use it to discuss no estimates in agile teams. Because ethical stress is the big thing that estimating and following detailed hours brings to people in the team. It is an invisible motivation killer, making us shape how we do the tasks to make them trackable rather than letting them flow the best way we know how. 

Ethical stress costs us energy. And sometimes we are so focused on teaching the ethical part of things that we forget the stress part of the same. 


Thursday, August 22, 2024

Prepare to shift down

While the world of testing conferences is discussing shift left - a really important movement in how we make the efforts count instead of creating failure demand - we are noticing another shift: shift down. 

For years, we have been discussing the idea that sustainable success in programmatic testing space requires you to decompose testing differently. Great automation for testing purposes is rarely built from automating what you considered end to end flows a human experiences. Great automation optimizes for speed and accuracy of feedback, granularity in both time - immediately available for the developer who made an impacting change - and location. No speculation on the source of the problem saves up a whole lot of effort. 

As I talked about these two shifts today, as part of a talk on applying the AI of today in testing, I presented them as a foundational concept: I believe we should first shift left and down, and I have a personal suspicion that we might like what AI is doing for us after the shifts. I am also realizing that positioning organizations at these two locations helps make sense of the kinds of tools and workflow ideas they are automating from. 


What is this shift down that we talk about? Shift left is a commonplace term, and I prefer not having left or right but single commit delivery making things continuous. One can dream. But shift down, that is not as commonly discussed.


Shift down is this idea that test-driven development is great, yet limited by the intent of the individual developer. A lot of developers are good and getting better daily at expressing and capturing that intent, and having that intent is hugely beneficial in routinely accepting / rejecting generated code from modern tools and choosing to stay in control. From a sample set of projects most certainly not built with TDD and with an unknown level of unit testing, we have seen a report recounting that 77% of bugs that escaped to production after all our efforts could in hindsight be reproduced with a unit test, meaning there is a lot more potential for doing good work of testing on the unit testing level. I like to play with the term exploratory unit testing, which is kind of a way of stretching the intent of today with learning in the context of the code, to figure out some of this 77%.
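
One way to picture exploratory unit testing is as a small loop that records what a unit does with inputs nobody wrote down, instead of asserting up front. The sketch below is an assumption-laden illustration: `parse_quantity` is a hypothetical unit, and the `INTENDED` and `EXPLORED` inputs are invented for the example.

```python
def parse_quantity(text: str) -> int:
    """Hypothetical unit under test: parse a positive order quantity."""
    value = int(text.strip())
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


# The original intent, as a developer might have captured it test-first.
INTENDED = {"1": 1, " 12 ": 12}

# Exploration: inputs outside the written intent, tried to learn what happens.
EXPLORED = ["0", "-3", "", "1e3", "1_000", "+7"]


def explore(inputs):
    """Record the outcome for each input rather than deciding pass/fail in advance."""
    observations = {}
    for text in inputs:
        try:
            observations[text] = parse_quantity(text)
        except ValueError as error:
            observations[text] = f"ValueError: {error}"
    return observations
```

Reading the observations is where the testing happens: in this sketch `"1_000"` and `"+7"` are quietly accepted because Python's `int` allows them, which may or may not match what the business meant by a quantity. Whatever gets decided can then be kept as a new unit test.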

For a few crafters I have had the pleasure to work with, Test-Driven Development and Exploratory Unit Testing could be interchangeable. For others, the latter encourages us to take time to figure out the gap, especially with regards to legacy code where those who came before us left us a less than complete set of tests to capture their intent. 

Shifting down pushes conversations to unit tests; component tests; subsystem tests; and guides us to design for decomposed feedback. We've been on that shift as long as the other one. 

Tuesday, August 20, 2024

How to make testing fun and enjoyable?

I have believed and experienced that testing is fun and enjoyable for 27 years. I have had that experience enough to talk about my primary heuristic from stage:

Never be bored.


This confuses people, especially when their idea of testing is repetition growing over time. 

You keep replenishing the test results. Sort of same results. Except that while the tests may be the same, you don't have to be. You can vary things, and return to common baseline when the variation takes you to surprising information. Every change, every changer in the moment is different. And it's like a puzzle figuring out how to create a spider web of programmatic tests that tells you enough while not all, and yet look at each change with the curiosity of 'what might go wrong here'. 

If I feel bored, I introduce variation. I change the user I log in with. I change the colleague I pair with. I change the order in which I test. I write test automation that does not fit our existing patterns of how we automate. I write detailed public blog posts while I test, unlike normally. I experiment with separating programmatic tests that always run into suites where I run each suite every second day to save up replenishment resources. Well, the list of variations is kind of endless. 

I love testing. And today I have been testing a system under test that is new to me. 
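
The every-second-day experiment can be as small as picking a suite from the calendar date. A minimal sketch, assuming two hypothetical suite names and that the pipeline can ask a function which suite to run today:

```python
from datetime import date

# Hypothetical suite names; each holds tests that used to run on every build.
SUITES = ["suite_a", "suite_b"]


def suite_for(day: date, suites=SUITES) -> str:
    """Pick the suite for a given day by rotating through the list day by day."""
    return suites[day.toordinal() % len(suites)]
```

With two suites, consecutive days alternate, so each suite runs every second day; adding a third name would stretch the rotation to every third day.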

In order to be able to test the way I love testing, I have to be able to have a solid foundation of programmatic tests that we grow gradually as output of our testing, capturing the pieces of learning worth keeping around. Today, I want to recognize a few things that I need to keep testing fun and enjoyable. 

  1. Agency. You don't give me a test case to automate. You give me a feature to test, and out of that I will automate some test cases. But thinking you plan and I execute takes the fun out of my testing. Even the more junior folks do better starting with WHY not HOW. 
  2. Smart constraints. You don't tell me that programmatic tests need to mimic written step by step test cases. That makes me use my time in updating two documentation sets for same purpose, and doing busywork is not fun. 
  3. Test environment. You don't deny me access to explore an old version while I design and collect ideas for how we should test changes for the new version. External imagination - the product without the change - makes the task more productive, and it's fun to do good work. There needs to be enough of these to go around for us all, every day. 
Notice how my fun and enjoyment isn't asking for documentation or answers to all things about the product. Not having those around is sometimes a different kind of fun. Even if I prefer us starting with a better agreement, you can be sure I will discover things outside it. It also does not include great people and friendly developers, because today I choose to believe that people are good and want to do good. Us discovering exactly how many jokes going around that requires is part of the variation. 

A colleague inspired this post by wishing that we had common templates and a unified front on what test documentation looks like. Figuring out how I could ever do that, when I do decent plans and strategies but not to a template, should be fun. While it's fun and enjoyable, it is less impactful for the good results I would want out of my testing. Plans are more often ways for me to think about the big picture than the most relevant deliverable. 

That's my shortlist (today), what's yours?