A Seasoned Tester's Crystal Ball

Saturday, January 3, 2026

The Words are a Trap

Someone important to me was having a bad day at work, and sent me a text message to explain their troubles. Being in a completely different mindspace, working on some silly infographic whose link to their troubles may exist but comes with a longer leash than necessary, I sent them the infographic instead of responding to what I had every chance of understanding. They were upset, I apologized. We are good.

No matter how well we know each other, our words sometimes come off different than our intentions. Because communication is as much saying and meaning as it is hearing and understanding.

Observing text of people like those who are Taking testing! Seriously?!? and noting the emphasis they put on words leaves me thinking that no matter how carefully they choose their words, I will always read every sentence with three different intentions because I can control it, they can't. Words aren't protected by definitions, but they are open to the audience's interpretation.

I am thinking of this because today online I was again corrected. I should not say "manual testing", the kind of poor quality testing work that I describe is not testing, it's checking. And I wonder, again, why smart people in testing end up believing that correcting the words of the majority leads to the majority getting the difference between poor quality and good quality testing, and the factors leading up to it.

A lot of client representatives I meet also correct me. They tell me the thing I do isn't testing, it's quality assurance. Arguing over words does not matter, the meaning that drives the actions matters.

Over my career I have been worried about my choice of words. I have had managers I needed to warn ahead of time that someone, someday will take offense and escalate, even out of proportion. I have relied on remembering things like 'nothing changes if no one gets mad' (in Finnish: mikään ei muutu jos kukaan ei suutu - a lovely wordplay). Speaking your mind can stir a reaction that silence avoids. But the things I care for are too important to me to avoid the risk of conflict.

I have come to learn this: words are a trap. You can think about them so much you are paralyzed from taking action. You can correct them in others so much that the others don't want to work with you. Pay attention to behaviors, results, and impacts. Sometimes the same words from you don't work, but from your colleague they do.

We should pay more attention to listening, maybe listening deeper than the words, for that connection. And telling people that testing is QA or that testing is checking really doesn't move the world to a place where people get testing or are Taking testing... Seriously.

Wednesday, December 31, 2025

Routines of Reflection 2025

As I woke up to a vacation day 31.12.2025, a thought remained from sleep: I would need to rethink the strategies of how I use my time, and how I make my choices for the next year. I was trying to make sense of the year we are about to leave behind, and I knew that if there was a word I would use to describe it, it would most likely be consistent effort. On holidays and weekends, the consistent effort went into reading, and I have been through more books in a year than I have read in the last ten combined (fiction, 51 titles on Kindle finished in 2025 and 73 in 2024, starting on the week I turned 50). At work, it was whatever was the theme of the week / month / quarter, and I adjusted direction, learning so much throughout the year.

While efforts feel high and recognizable, I am not convinced by the strategies behind those efforts, and particularly the impact that I am experiencing or even aspiring to. I am, after all, in a lovely unique career position where I have a lot of power over choices we make on testing, in an organization where I have a lot of learning to do on how to work on power with people, and particularly power with other organizations. Consulting and my role in the AI-enhanced application testing transformation force every day to be one full of learning.

Describing the effort

As consultants, we track our hours used, leaving me with data of my year at work.

So I know that I used 7% of my annual work hours on receiving visible training. This included:

Participating in conferences I did not speak at: Agile Tampere, Oliopäivät Tampere, Krogerus Data Symposium
Classroom training on Sales (did not like this), Delivery framework (liked this), start of Growth training (loving this).
Ensemble learning for ISTQB Advanced Test Automation certification and completion of the full set of four Advanced certificates.
Ensemble learning for CPACC Accessibility certification and completion of the certification, with the start of accessibility advocacy that comes with holding the certificate without an exam every five years.

I was on sick leave for 3% of my annual used hours. This feels like more than usual, but it is still only 6 days. Two very classic bouts of flu, the kind my mom always said takes 2 weeks or 14 days to recover from, whichever comes first. One bout of back pain when forced to adjust back to the office and different ergonomics. I guess this is the investment of meeting more people face to face again.

39% of my hours I have done something where we specifically agreed on something I would deliver for the clients, and the customers would pay for it. Those hours were split between six distinct customers. Thematically people paid me to transform stale testing organizations, teach contemporary exploratory testing, introduce and improve test automation and AI, bring in throughput metrics, and to test in projects where others before me had not managed to find the problems customers would have to find.

That leaves 51% of my efforts in two categories, admin and proposal. Admin is when we run our own organization, and that took 20% of my time. I have direct reports, and a testing community of excellence to facilitate. I hired 4 people in 2025. Proposal is when we are creating partnerships with the clients but they aren't paying for that work, and it was 31% of my effort. On proposals, I worked with 72 organizations, of which 51 were clients, 17 partners in delivery for the customers, and 4 internal customers.
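The shares above can be cross-checked with a few lines of Python. This is only a sanity-check sketch using the figures quoted in this post; the variable names are made up for the example.

```python
# Shares of annual work hours as quoted in the post (percent).
shares = {
    "training": 7,
    "sick leave": 3,
    "paid client delivery": 39,
    "admin": 20,
    "proposal": 31,
}

# Admin and proposal together form the unbilled remainder.
unbilled = shares["admin"] + shares["proposal"]
total = sum(shares.values())

# Organizations met through proposal work, by type.
organizations = {"clients": 51, "delivery partners": 17, "internal customers": 4}
organization_count = sum(organizations.values())

print(f"unbilled share: {unbilled}%")
print(f"total accounted for: {total}%")
print(f"organizations: {organization_count}")
```

The categories add up to the full year, and the three organization types add up to the 72 mentioned.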

You can guess from the numbers that this was particularly challenging. I learned to take notes, categorize and model problems and solutions through consistent practice. And I appreciate the window into the challenges of testing that the work I hold now offers. If only I could figure out how to better turn it into impact - and I will.

The proposal category also included all of the public speaking I did that was not paid, which was 23 of the 25 sessions I delivered in 2025.

I spoke on:

AI, particularly Agents in GitHub Copilot for non-automation use cases
Python, teaching an 8-piece series on Python for testing at work instead of complaining for the same amount of time that some people did not know the basics - they do now.
Contemporary exploratory testing, seeing versatile problems in target applications and combining automation into it

A particular learning event of the year was making a typo worth 24M euros, a definite gem to my collection of costs of bugs. That story gets better as soon as I prove that without the typo, the 24M would have been ours and I may need another year for that. :D

Socials

The year on social media was interesting. I got to feel the split on multiple dimensions.

I was on Mastodon (a lot!) and LinkedIn, and I missed the time of Twitter. On Mastodon, I tallied 1.1K followers and 4.2K posts. On LinkedIn, I tallied 12,008 followers and 778,849 annual impressions (34.1% less than the previous year). I blogged less than before, and had 314K views taking the total now to 1,422,454 page views.

I was in Finnish and English on LinkedIn, and missed the possibility of just picking one.

I learned that a fairly popular blog with consistency showing up combined with AI makes AI answer my kinds of questions better, without referencing me. If you care about work (results) over the credit, best ideas win - never been more true.

Challenges

As we all should know, movement is not work. Effort alone does not bring impact. So I find myself making a tally of the challenges.

Level of testing skill
The controls at scale: organization, allocation and targets
Sense of agency with understanding of impacts

If there was a theme of insights that this year solidified, it was the insight on lack of testing skills in testers. There is a little too much reliance on external sources or serendipity, and a little too little of intentional search for relevant information and continuously improving testing systems. With the exceptional levels of educational materials available, the level of this information turned into practice continues to be a challenge. The issue has grown over time since testers are, due to various transformational reasons, more often one in a team of developers.

To address something at scale is a whole other problem. I learned about changing organizations in practice and how organizations, money allocation and target setting are tools of scale that I have been so bad at and need to learn more about. I have focused on problems of information and examples, but I need to solve the frame that allows people to change.

There's a lot of learned practices of weaponized helplessness and lack of seeing systems for own impact that means a lot of people find themselves smaller cogs in the system than I would believe they have power on. Our ideas of what is possible and what is safe, and what is mine to change or even comment on have a significant impact on what we are capable of achieving together.

Impacts

I'd like to think that some of the testing advice or inspirations I have provided this year have impacts that I only learn later on. Kind of like receiving a message this year from two people I worked with 10 years ago, one telling that I still impact their career on a regular basis due to the timing of when our professional paths crossed, and another telling me their organization has now better diversity mechanisms because our time together was one where I invested effort into letting people know I am not "guys" and that I would risk personal negative consequences for working for social justice.

So with all the reflection, I leave a call for myself and the community around me on finding out ways of fixing challenge 1 - skill of testing. While I have a sense of need of personal contribution in that space, I also know that the only way we solve problems in scale is democratization of knowledge and working together. So that is up next, going for 2026.

Closing off

I still think my reflection wins over what social-media-based AI tools can do. Top quote is my challenge 1.

And me on GitHub (screenshot on Mastodon). My work is not on GitHub.

Monday, December 15, 2025

Participant skills to retrospectives

I'm an avid admirer of retrospectives and the sentiment that if there was one agile practice we implement, let it be retrospectives. We need to think about how we are doing on a cadence, and we need to do it so that we ourselves get to enjoy the improvements. Thus retrospectives, not postmortems. Because let's face it: even if I learned today that we have a lessons learned from past projects database, for me to apply other people's lessons learned, it's likely that no amount of documentation on their context is sufficient to pass me the information. Retrospectives maintain a social context in which we are asked to learn.

Last week I argued that the best retros I have been in were due to great participants with an average (if any) facilitator. My experience spoken out loud resulted in a colleague writing a whole blog post on facilitator skills: https://www.linkedin.com/pulse/retrospectives-why-facilitators-skills-matter-more-than-spiik-mvp--fmduf/?trackingId=yc302cSWzR0ZhHTSNoyVmQ%3D%3D

Success with retrospectives appears to be less an issue of roles and responsibilities than of a learning culture. When in a team where each gig is short and you could be out any moment (consulting!), it takes courage to open up about something that could be improved. The facilitator does not make it safe. The culture makes it safe. And while the facilitator may hold space for that safety and point out the lack of it, so can anyone.

When someone speaks too much, we think it's the facilitator skills that matter in balancing the voices. Balancing could come from anyone in the team. Assuming the facilitator notices all things feels like too much on a single person.

Building cultures where work does not rely on a dedicated role is kind of what I'd like to see. Rotating the role on the way to such a state tends to be a better version than having someone consistently run retros.

Having facilitation skills correlates with having participation skills too. At least it changes the dynamic from passive participant already afraid to express their ideas to an active contributor.

Friday, November 28, 2025

Observations of a habit transformation

A month ago, I gave a colleague an assignment. They were to create TypeScript Playwright automation using GitHub Copilot and Playwright Agents. While making progress on the tests was important, learning to use agents to support with that work was just as important.

We had a scope for a test, which was one particular scenario previously created with a recording style automation tool. Recording usually took an hour, but there was no fixing the script. Whenever it would fail, a rerecording was the chosen form of maintenance. No one knew anymore if the thing that was recorded now matched what was recorded when the test was originally imagined. The format of the recording was an XML pudding where pulling out things to change took more effort than anyone had been willing to invest.

Halfway through the month, I checked with how the work was progressing to learn that it had seemed easier to work without agents due to familiarity. With a bit of direction that was no longer an option for continuing.

Three days before the deadline, I checked with how the work was progressing to learn the scope of the test had been forgotten and something new and shiny was being tested, mostly for playing with the Playwright Agents. With a bit of direction the scope was done by the review meeting.

Yes, I know I should be checking in more frequently. That option however was not a possibility.

Looking at what got done, I learned a few things though.

I learned that 134 LOC was added into 8 functions.

I learned that three new significant capabilities (env configuration, data separation and parametrization, and fixtures) were added, and that the intended design of the original test had been captured.

I learned that making the test reliable, by adding verification steps that wait to be at the right place before proceeding, had taken a significant amount of work.
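The waiting described above can be sketched in a tool-agnostic way. The helper below is a hypothetical illustration in Python rather than the TypeScript Playwright code from the assignment; `wait_until`, its parameters and the `FakePage` stand-in are all assumptions made up for the example. The idea is to poll for the condition that proves you are at the right place, and to fail loudly with a timeout instead of proceeding blindly.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.05) -> None:
    """Poll `condition` until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)


class FakePage:
    """Stand-in for an application page that is ready after a few checks."""

    def __init__(self, checks_until_ready: int) -> None:
        self._remaining = checks_until_ready

    def is_ready(self) -> bool:
        self._remaining -= 1
        return self._remaining <= 0


page = FakePage(checks_until_ready=3)
wait_until(page.is_ready, timeout=2.0)
reached_right_place = True  # only reached when the wait succeeded
```

In Playwright itself, retrying assertions such as `expect(locator).toBeVisible()` usually play this role, so a hand-written polling loop like this is more of a mental model than something to copy into a test suite.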

I learned that one type of element was never seen by the Playwright Record tool, and that required handcrafting the appropriate locators.

I learned that using agents comes with more context that I had not fully managed to pass on. If your agents out of the box are called planner, generator and healer, the idea that you might want to skip the planner or even write your own just following the existing as examples was not straightforward.

Seeing this unfold in hindsight from the pull requests, I modeled the process of how it was built.

First, things were either recorded or prompted out of AI. Recording was clearly the preferred, controllable way of starting.

Then things were made to work by adding things the recording did not capture.

Then a lot of work was done on structure and naming.

There were a few iterations of making it work and making it pretty.

So I compared notes with some of the other assignments like this that I have given to people.

There were five essentially different ideas of how work like this would get done.

1) Working through the steps of something, make it work, make it pretty, was a preferred method for a newer automator.

2) Writing it with the end in mind was usually a choice of a more seasoned automator.

3) Agenting ourselves through the steps was again a preferred method for a newer automator, producing insufficient results.

4) Agenting ourselves with the end in mind seemed to produce better results for an agentic style of writing tests (reading, reviewing and deciding on new direction).

5) Agenting ourselves with the test name in mind was the aspiration but steps to walk through have some more maturing to do.

So today, I wanted to make a note of a theory - how you frame your steps and how you decompose the work in your model will greatly impact the outcomes you get with this style.

Tuesday, November 25, 2025

Platform Product Testing

In the lovely world of consulting, I find myself classifying the types of engagements I work with. While Product / System Testing holds a special place in my heart with the close collaboration with teams building the product together, I often come to a table where the Product is a platform product, the building is configuring and specially constrained programming, and the purpose of the platform product is to enable reuse for similar kinds of IT system needs. I often call this IT Testing.

With IT Testing, a firm belief in previous experience on the platform product at hand runs strong. That makes me particularly fascinated in modeling the similarities and differences, and insisting I can learn to test across platform products. So today I decided to take a moment to explain what I have gathered so far on Platform Product Testing for Dynamics 365, SAP S/4HANA, Salesforce and Guidewire. The listing of platform products is more than these four, and I very intentionally excluded some of my lovely friends from work such as Infor and ServiceNow.

What makes testing of a platform product based system different is the experience of inability to tell what is a platform product problem (or feature), what is something you had control over in changing, and what comes from your rules combined with your data. For it to work for the business purpose it was acquired for, the culprit does not seem like a priority. If it does not run your business the way your business needs running, there is a problem. Recognizing a problem then starts the work of figuring out what to do with such problems. Acceptance testing with business experts is essential and critical, but also very disruptive to business as usual if it needs repeating regularly.

Since most of the functionality comes from the Platform Product, your integration project costs are usually optimized by focusing their testing on the things your contractor is changing and thus responsible for. This may mean that Acceptance testing sees an integrated end to end system, while other testing has been more isolated. Automation, if it exists, is the customer's choice to invest in essentially multivendor feedback where some of the parts are the product that, theoretically, was tested before being given to you - just not with your configurations, integrations and data that run your business.

Let's talk a bit about the platform products.

Dynamics 365, Power Platform, is a set of Microsoft Platform Products giving you ERP and CRM types of functionalities with lots of low-code promises.

Salesforce is primarily CRM-types of functionalities, and it's a cloud-based multi-tenant platform.

SAP S/4HANA comes with ERP-types of functionalities and enough history that the new and old mix.

Guidewire is an insurance-focused platform product.

My curiosity with these started with noting vocabulary. A thing we know well in testing is the concept of a test environment. They come in two forms: long-running (production-like) and ephemeral. For Salesforce the environments are called sandbox and scratch org. For SAP the matching concepts are testing environments, development environments and the supporting tooling of transport/ChaRM. For Dynamics 365 we talk about solution packages and expect understanding of containers. And for Guidewire we talk of bundles, patches and upgrades. While I recognize the dynamics of how things work in each, I get corrected a lot on use of wrong words.

Each of these lovely platform products comes with its own programming language. Salesforce gives us Apex. Guidewire introduces us to Gosu. Dynamics drives us to low-code Power Apps component configurations. SAP gives us ABAP and configurations. For someone who holds dear the belief that sufficiently complex configuration is programming, I find these just fascinating.

My highlights so far are:

Dynamics 365

Got to love containers as an approach and Azure DevOps makes this feel to me more like modern product development for deployment tooling side.
UI automation requires understanding of user-specific settings, and I hear UI can be fragile. Locators for test automation aren't straightforward.
Test from APIs and a bit from end to end in UI.
Pay attention to solution layering discipline, and automate deployment and data seeding.
Get started:

store solutions in source control, build import/export pipelines via Power Platform Build Tools, prefer API tests and small UI smoke suites

SAP S/4HANA

Automated change impact analysis relying on the structure of Transports/ChaRM is kind of cool given your tools of test management match the support. It is also generally not optional and trying other things can be trouble.
Config changes have impacts across modules and business process chain testing is essential
Get started:

map transports to test suites, automate test runs on transport promotion, use the SAP S/4HANA test automation tool where available and treat integration flows as first-class tests.
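The first two steps above can be sketched as a lookup that a pipeline would call when transports are promoted. This is a minimal sketch under stated assumptions: the transport IDs, suite names and the `suites_for_promotion` helper are hypothetical and made up for illustration, and nothing here calls a real SAP or ChaRM API.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from transport request IDs to the test suites they affect.
TRANSPORT_TO_SUITES: Dict[str, List[str]] = {
    "TR-0001": ["order-to-cash", "pricing"],
    "TR-0002": ["procure-to-pay"],
    "TR-0003": ["order-to-cash", "invoicing"],
}

# Fallback when a transport has no mapping yet: run the broad regression pack.
DEFAULT_SUITES = ["full-regression"]


def suites_for_promotion(transports: Iterable[str]) -> List[str]:
    """Return the de-duplicated, sorted suites to run for promoted transports."""
    selected = set()
    for transport in transports:
        selected.update(TRANSPORT_TO_SUITES.get(transport, DEFAULT_SUITES))
    return sorted(selected)


mapped = suites_for_promotion(["TR-0001", "TR-0003"])
unmapped = suites_for_promotion(["TR-0099"])
print(mapped)    # ['invoicing', 'order-to-cash', 'pricing']
print(unmapped)  # ['full-regression']
```

Falling back to the full regression pack for unmapped transports is a design choice in this sketch: an unknown change is treated as high risk rather than silently skipped.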

Salesforce

Multi-tenant means quota limits. Stay within the limits. Testing too big is trouble.
CI/CD and scratch orgs allow for lovely change-based test automation practice. Use mocks for integrations.
Smart scoping of data for test purposes helps, plan for data subsetting and refresh cadence.
Locators for test automation can be next level difficult for Shadow DOM and dynamic components.
Get started:

enforce Apex test coverage, minimize data creation in tests, use scratch orgs + CI for PR validation, monitor governor limits during pipeline runs.

Guidewire

Product model–driven testing: insurance product model serves as testing anchor
A collected open source toolset, the 'Guidewire Testing Framework', and enforced rules around sufficient use of it guide the ecosystem towards good practices like test automation and coverage.
Limitations on some contracts on use of AI can significantly limit hopes for use of AI
Get started:

create a policy lifecycle regression pack; adopt Guidewire Testing Framework; run regression against each product model drop; negotiate test environment refresh cadence with vendor/ops.

For all, reuse of test artifacts across clients is theoretically possible. For all, test data management is necessary but execution of its practice differs: Salesforce and D365 drive synthetic data and subsetting approaches, and SAP and Guidewire require larger production-like data sets; fast refresh capability and data masking are universal. All come with a CI/CD pipeline but each has a platform specific recommended one: Salesforce DX for Salesforce, Power Platform Build Tools + Azure DevOps for Dynamics 365, transport/ChaRM automation for SAP.

Universal truths, maybe:

Need for strong regression testing due to vendor-driven releases
Presence of custom code + configuration layer you must retest
Requirement for representative test data
Complexity of cross-module / cross-app business processes
Integration-heavy test design (APIs, services, middleware)
Organizational constraints around AI-generated artifacts
Upgrade regression risk as a consistent pain point

Decided this could be helpful - Platform testing capabilities and constraints comparison. At least it helps me with learning coverage as I venture further into these.

Capability / Constraint	Guidewire	Salesforce	Dynamics 365 / Power Platform	SAP (S/4HANA)
Metadata-driven development	(✔️) Config layers, product model	✔️ Core concept	✔️ Solutions + Dataverse	(✔️) Mostly configuration, less metadata-portable
Proprietary programming language	✔️ Gosu	✔️ Apex	— (PowerFx only for Canvas, but not core platform)	✔️ ABAP
Strict platform resource limits	(✔️) Some internal limits	✔️ Governor limits	(✔️) API limits & throttling	(✔️) Performance constraints by module
Vendor-controlled releases with required regression	✔️ Product model upgrades & patches	✔️ Seasonal releases	✔️ Wave updates	✔️ Transport-based releases & upgrade cycles
Automated test impact analysis supported	—	(✔️) Through metadata diffs + DX	(✔️) Via solution diffs & pipelines	✔️ Transport-level impact analysis
Native test automation tooling	✔️ Guidewire Testing Framework	(✔️) Apex tests + UI Test Builder (limited)	(✔️) EasyRepro / Playwright guidance but not “native”	✔️ SAP S/4HANA Test Automation Tool
UI layer highly changeable / automation fragile	(✔️) Angular-based UI, moderate	✔️ Lightning DOM changes often	✔️ Model-driven apps update frequently	(✔️) Fiori stable but customizable
Complex cross-module business processes	✔️ Policy ↔ Billing ↔ Claims	(✔️) Depends on org complexity	(✔️) Depends on app footprint	✔️ Core ERP complexity across modules
Strong CI/CD support from vendor	(✔️) Limited compared to others	✔️ Salesforce DX	✔️ Azure DevOps + Build Tools	(✔️) SAP CI/CD + ChaRM
Easy ephemeral environment creation	—	✔️ Scratch orgs	✔️ Dev/Test environments via Admin Center	— (environments heavy, transports rely on fixed landscapes)
Heavy dependency on realistic test data	✔️ Policy & claims data	(✔️) For integration flows	(✔️) For model-driven logic	✔️ Mandatory for end-to-end flows
Contractual constraints on AI-generated code/config	(✔️) Vendor & client contracts commonly restrictive	(✔️) Org policies vary	(✔️) Varies by tenant/governance	(✔️) Strong compliance usually restricts
Complex upgrade regression risk	✔️ High	(✔️) Medium	(✔️) Medium	✔️ Very high
Platform-driven integration patterns (APIs, services)	✔️ SOAP/REST internal services	✔️ REST/Bulk/Messaging	✔️ Dataverse APIs + Azure	✔️ BAPIs/IDocs/OData
Stable API layer for automation	(✔️) Good internal APIs	✔️ Strong API surface	✔️ Dataverse APIs stable	✔️ Strong API layer but complex

Out of all the differences, this one is the most defining: ephemeral test environments (Salesforce, Dynamics 365) and vendor-native automation tooling (SAP, Guidewire).
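That split can be read straight off two rows of the comparison table. The snippet below is a small sketch that encodes only those two rows, with "yes" standing for ✔️, "partial" for (✔️) and "no" for —; the `capabilities` structure and the `platforms_with` helper are names invented for this example.

```python
from typing import List

# Two rows of the comparison table: "yes" = full check, "partial" = bracketed
# check, "no" = dash.
capabilities = {
    "Easy ephemeral environment creation": {
        "Guidewire": "no",
        "Salesforce": "yes",
        "Dynamics 365": "yes",
        "SAP S/4HANA": "no",
    },
    "Native test automation tooling": {
        "Guidewire": "yes",
        "Salesforce": "partial",
        "Dynamics 365": "partial",
        "SAP S/4HANA": "yes",
    },
}


def platforms_with(capability: str, level: str = "yes") -> List[str]:
    """List platforms rated at `level` for a capability, sorted by name."""
    ratings = capabilities[capability]
    return sorted(name for name, rating in ratings.items() if rating == level)


ephemeral = platforms_with("Easy ephemeral environment creation")
native = platforms_with("Native test automation tooling")
print(ephemeral)  # ['Dynamics 365', 'Salesforce']
print(native)     # ['Guidewire', 'SAP S/4HANA']
```

With the table's ratings, the two groups do not overlap, which is what makes this difference a useful first question when starting on a new platform product.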

Habit of AI in Quality Engineering

It was June 2024, and I was preparing to meet a journalist for an interview due to my recent appointment as Director for AI-Driven Application Testing at CGI Finland. The journalist drove the interview with two requests:

Demo and show!
What is our approach at CGI on this

New to the company, it took me a bit of reflecting on what is our approach and if we have *one* or *many* approaches. A year and a half later is a good time to reflect on the ideas I put together back then, and what my reality ended up looking like.

For the demos I chose back then, I had four:

GenAI code reviews, and even if I had ended up with production experience, I still follow on a monthly basis the slow start of using genAI code reviews in customer projects. After all, the bottleneck with genAI moved from writing code to reading code, and some reports include numbers such as writing code 2-5x as fast, but code reviews being 90% slower.

GenAI pair testing and using commonly available genAI tools as external imagination for purposes of exploratory testing. After all, from screenshot to ideas and observations, I already had more help from the tools than from majority of tester colleagues too focused on empirical proofreading of requirements.

Digital twinning for a test expert was already on version 2 as people at CGI had used my CC-BY materials to experiment with genAI helper that would have my ideas of exploratory testing as context. While I might not need the answer of what would I advise, it was a fun demo of insights of having been building a base for being able to do this for two decades of open materials.

Copiloting test automation code with polyglot approach of various languages and driving forward relevant efforts towards automating tests. After all, we tend to want deterministic examples we can track rather than regenerated things moving control away from whoever is operating the quality signaling efforts. What made this interesting is the foundation of hands-on experience from 2021 onwards and the roman numerals example where humans outperform the genAI.

There were things at the company I had not become aware of back then, all of which later had an impact on how I would think about building the habits:

CGI DigiShore is an AI solution for modernizing legacy applications. I found a lot of value in generating artifacts to understand yesterday's code for testing purposes, and building concepts towards DigiShore Coverage for Testing.
CGI AppFactory is a delivery concept where we optimize delivery teams of humans and agents. While the official messaging might not say so, I learned from discussions with my fellow developers that regardless of titles, the delivery mode has a foundation in exploratory testing and understanding how we would continuously explore while documenting tests with modern automation.
CGI Navi is an artifact generator that can be run in hosted mode when all your data should not leave and ship to someone further away. With it I learned more on contractual trust relationships between organizations, and driving genAI use forward in practice even when that trust for some sets of data cannot be in place.

I didn't only look at CGI Intellectual Property, but the wider community with our partners, whatever was cooking in the ways AI was making its way to commercial testing tools for test repositories, test automation and test data, and what became available in open source as well as in Finland.

In hindsight, those 1.5 years were well used in modeling the AI black box for function and structure, making comparisons and recognizing the uniqueness of approaches easier. It allowed for recognizing the need of a choice to build a stronger habit of AI in quality engineering in practice, and thus now drive forward a significant agent-to-human ratio for contemporary exploratory testing.

Looking back at the slides and the reality that unfolded: I decided in June 2024 that my approach towards the change we need in testing would be four steps:

We have established our understanding with AI, and we continue to collaborate on observing and integrating the latest in the field. We have recognized the automation transformation needs significant work and is both a foundation for AI in testing and any modernized approaches to software development with worthwhile quality signaling. We have started moving selected public materials to our open source for test sharing (https://github.com/QE-at-CGI-FI) and adjusted my previously used CC-BY to CC-BY-NC-SA, an open yet more restrictive license. This reflects what brought me to CGI: industrialization at scale of resultful exploratory testing I learned to teach over the last three decades. And finally, we have done what we do with our clients and their various stakeholders, often in multivendor teams.

Building the habit of AI has required deeper understanding of data sensitivity classifications, isolating different sensitivity data, negotiating reasonable sandboxes for use of leading packaged genAI tooling both from a UI and programmatically from APIs, and having those sensitivity considerations leading to a hosted solution. It has driven my personal explorer's agent to human ratio to over 20, and while measuring time it saves isn't feasible, it is driving a real change in how I test with contemporary exploratory testing with a pipeline of task-based agents to capture some knowledge we have previously accepted as tacit.

Finally, the journalist asked me for the approach. For the approach, I combined all that is dear to me:

Resultful testing where the level of practice needs to be better than what we experience at scale
Scaling habits by democratizing knowledge, and allowing time to learn in layers
Open information while working towards business incentives that allow for sustainable work
Learning by doing, learning by teaching
Scaling by practices and tools
Better quality signals with metrics

The community, both at CGI and at large, has been on this journey with me. With a variety of clients more curious than ready for the change of habits, our habits require continuously balancing awareness of where in the journey we are with each client.

If back then I was unsure whether there needed to be one way or whether this way fits, I now know I work in a community of curious professionals who see value in cultivating many routes and integrating them for greater benefit. Some corporations are built as a community.

Friday, November 7, 2025

Dual-purpose Task Design

Dual-purpose thinking is big in the world right now. Dual-purpose technologies are innovations with application for both civilian and military uses, such as AI, drones and cybersecurity perspectives with any software we rely on. A few mentions of these at a conference I was listening in on, and my mind was building connections to using dual-purpose to explain exploratory testing.

In the last weeks, I have found myself explaining to my fellow consultants that our CVs do work for us when we sleep or work on something else. I started explaining this to help people see why they should update their CVs, and what I see in the background processes. We get requests to let our customers know the scale of certain experiences, and we collect that information from the CV collections. Those CVs have a dual purpose. They serve as your personal introduction to interesting gigs. But they also serve as a corporate entry to things that are bigger than the individual, usually without us having to actively work on them.

Today, as I was ensemble testing a message passed through an API and processed through stages, I realized that dual-use thinking is core to how I think about exploratory testing too. I design my testing tasks so that instead of being separate, they are dual use. I optimize time by actively seeing the overlap. And it saves me a ton of the trouble we were facing today.

I described this in a post on LinkedIn:

An ensemble testing session today showed the difference between manual testing, automated testing and contemporary exploratory testing in a fairly simple and concrete way.

Imagine you are testing with a message to API, with the three approaches.

Manual testing is when you run the message through Postman. You get confused about what values you changed last because you did not name your tests for whatever you tried just now, there is no version control, and the information on what value X means is held in your head. The baseline versions of these you carefully document in Jira Xray, but not with the message. That you create manually from templates whenever you need it.

Automated testing is when you spend significant time in creating the message with whatever keywords you have, and then verifying the right contents with whatever keywords you have. Because of the mess of keywords and rules around how carefully these must be crafted to qualify, from a combo of values in a message to automated testing, there's quite a distance.

Because the first is fast and the second is slow, you carefully keep the two of them separate by having separate tasks. Maybe even separate people, but even when they are both on you, intertwining the two is not an option.

I push for the third option that I keep calling contemporary exploratory testing. Edit your messages in version-controlled files in VS Code. Send them over with simple or complicated keywords. The main point is you can leave your new combinations behind in version control as you discover them. Structure them towards whatever has all the right checks in place when the time is right, but build ways of structuring and naming that help document the concepts your inputs and checks have. See that you are building something while exploring that helps you refresh your results.
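To make the third option a bit more concrete, here is a minimal sketch of what it could look like. It is not what we had in the session: I swap the keywords for plain Python functions, and the file names (baseline.json, variant_*.json), the message fields (order_id, quantity) and the fake_send rule are all assumptions made up for illustration. The point it tries to show is that each combination you discover is a small named file you can commit, and the same files later become the inputs for checks.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, Optional


def load_message(baseline: Path, variant: Optional[Path] = None) -> dict:
    """Read the baseline message and overlay one named variant on top of it."""
    message = json.loads(baseline.read_text(encoding="utf-8"))
    if variant is not None:
        message.update(json.loads(variant.read_text(encoding="utf-8")))
    return message


def explore(folder: Path, send: Callable[[dict], dict]) -> Dict[str, dict]:
    """Send the baseline overlaid with every variant file, keyed by file name."""
    baseline = folder / "baseline.json"
    results = {}
    for variant in sorted(folder.glob("variant_*.json")):
        # The file name documents the idea being tried, so it survives the session.
        results[variant.stem] = send(load_message(baseline, variant))
    return results


def fake_send(message: dict) -> dict:
    """Stand-in for the real API call. Hypothetical rule: quantity must be positive."""
    status = "accepted" if message["quantity"] > 0 else "rejected"
    return {"status": status}


def demo() -> Dict[str, dict]:
    """Create a throwaway folder of messages; in real use this is a git repository."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "baseline.json").write_text(
            json.dumps({"order_id": "A-1", "quantity": 1}), encoding="utf-8"
        )
        (folder / "variant_zero_quantity.json").write_text(
            json.dumps({"quantity": 0}), encoding="utf-8"
        )
        (folder / "variant_large_quantity.json").write_text(
            json.dumps({"quantity": 10000}), encoding="utf-8"
        )
        return explore(folder, fake_send)


if __name__ == "__main__":
    for name, response in demo().items():
        print(name, response["status"])
```

Replacing fake_send with a function that posts to the real API would keep the rest as is. Adding an expected status next to each variant is one way to grow the same files towards automated checks when the time is right.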

This all sums up as a picture with text on the ideas I was working through. I seek ideas that are beyond what I see in day-to-day, hands-on testing with people capable of "manual testing" and "test automation". I seek Contemporary Exploratory Testing with Programmatic Tests.