Thursday, February 5, 2026

Quality adventures vibe coding for light production

The infamous challenge to a geek with a tool: "we don't have a way of doing that". This time, I was that geek and decided it was time to vibe code an enrollment application for a women-to-women vibe coding event with a diversity quota for men. That was a week ago, and it's been in "production" since, bringing me tester-joy, allowing 38 people to enroll and 2 people to run into slight trouble with it. 

What we did not have was a way of enrolling in this event so that the participant list was visible, the quota of how many people had joined was monitored, and a diversity quota was possible. Honestly, we do. I just decided to fly with the problem description and go meta: vibe code an enrollment application for a vibe coding event. 

The first version emerged while watching an episode of Bridgerton. 

While letting the agent code it, I did some testing to discover: 

  • the added logo read top to bottom instead of left to right in Safari. I know because I tested. 
  • the boundary of 17 was 17, and the boundary of 3 was 4. Oh wait, no? Two things compared the same way, and one was different and wrong.
  • removing a feature, emails, was harder than it should be. When it was removed, there were three places to take it away from, and of course it was left in one and nothing worked momentarily. 
  • there was no retry logic for saving to the database - who needs that, right? Oh, the two people whose enrollments would get lost and who would tell me about it.
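
The missing retry logic is the kind of thing a short sketch makes concrete. This is a minimal Python illustration, not the app's actual code: `save` stands in for whatever function writes an enrollment to the database, and treating every exception as transient is a simplification a real application would narrow down.

```python
import time

def save_with_retry(save, payload, attempts=3, base_delay=0.5):
    """Try saving up to `attempts` times, backing off between tries.

    `save` is whatever function writes to the database. Here any
    exception counts as a transient failure; a real app would retry
    only on network/timeout errors and surface the rest immediately.
    """
    for attempt in range(attempts):
        try:
            return save(payload)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the UI tell the user
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Even this much would have kept the two lost enrollments, assuming their failures were transient, and raising on the final attempt means the user at least sees that the save did not happen.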
With that information, off to production we go. True to the vibe coding mentality, there is one environment: production. People enroll. Two people reach out, sure that they enrolled but that their names vanished. I was sure that was true, and it was lovely that they let me know, but they had managed to enroll, just a few places lower in the waiting list than what would have been their fair position. 

Less than a week later, the first event is almost fully booked and the queue is longer than what fits in an event. We decide a second session is in order, and that's back to more vibe coding. We add two features: selection of day from two options, and time-limited priority enrollment for those already in the queue for the first event who want to join the second. 


Armed with some extra motivation for the final episode of Bridgerton, I decide to go for the known bug too. Asking for a fix requires knowing what fix to ask for. And while at it, I asked for some programmatic tests too. And read the security warnings on Supabase. Turns out that combo made me lose all my production data in an 'oops': one browser held a full list of participants while another showed none, and as it happens the application deleted all rows from the database to insert new rows based on what was in local storage. For a moment I thought this was due to removing anonymous access to delete in the database. 

So I had more bugs that needed addressing while at this a second time: 
  • every user's local storage contents overwrote whatever was in the database by then. Almost as if it was a single-user application!
  • there were many error cases that needed handling when the write to the database failed in the first place, so as not to lose people's enrollment information. People did find it (not surprising) and got in touch about it (surprising). I guess DDoS day was not in my plans when expecting network reliability. 
  • security alerts (that I read) in Supabase alerted me to security misconfigurations in the out-of-the-box setup. 
  • I lost all the data by testing in production! At least I had a backup copy of all but one row of data. 
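
The local-storage overwrite bug is a classic destructive sync: delete everything in the database, then re-insert whatever one browser happened to have cached. A non-destructive merge avoids it. This is a hedged Python sketch with hypothetical field names (keying enrollments by email), not the actual application logic or the Supabase API:

```python
def merge_enrollments(db_rows, local_rows):
    """Merge a locally cached enrollment list with the database view
    without deleting anything: rows are keyed by email, the database
    wins on conflict, and local-only rows are kept. Contrast with the
    destructive 'delete all, insert from local storage' sync that
    wiped the production data.
    """
    merged = {row["email"]: row for row in local_rows}
    merged.update({row["email"]: row for row in db_rows})  # db wins
    return list(merged.values())
```

The real fix would additionally push only local changes to the server as upserts keyed on a unique column, so that an empty local cache can never erase rows other users have added.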
Funny how it took me two sessions to start missing the discipline of testing: a separate test environment; repeatable tests in the repo. 

I'd like to think that I am more thoughtful when turning from vibe coding on hobby time to doing AI-driven software development at work. The difference between those two feels a whole lot like great exploratory testing. 





Wednesday, January 28, 2026

The Box and The Arrow.

A lot of what I used to write in my blog, I find myself writing as a LinkedIn post. That is not the greatest of strategies given the lack of permanence of anything you post there, so I try to do better.  

My big insight last week: the box and the arrow. Well, this is actually D and R from the DSRP toolset for systems thinking, applied to value generation. In an earlier place of work, when I hired consultants, they came with very different ideals of hour reporting that then mapped to cost. 

One always invoiced full hours unless sick. They were the CEO of their own company, so I am sure their days included something (if nothing else, tiredness) other than our work, but I had their attention and was happy with the outcomes. I framed it as "pay for value". 

One always invoiced half hours, but produced as much value as the first one. They wanted to emphasize that the other work they did for service sales, upskilling themselves, and developing common tools they used but held the IP for wasn't ours to pay for. That was fine too, and I framed it as "unique access to top talent for our benefit". 

One invoiced 100% of their hours, including the hours their company used on managing them. No other responsibilities; their work existence was in service of us. That too was fine, and made it simple for them to make sense of shaping their capabilities. I framed that as "do you even want to keep them if they are not getting trained/coached". 

The cost to value for these three was very different, depending on what I got in the service for free. 

So I have now modeled the box and the arrow. The box is the service. The arrow is the relationship that transforms the service. One of the things in our arrow is how we collect information: in 2025 we interviewed more than 1,800 business and technology executives. This is not a "fill this questionnaire". This is us sending our top management to discuss things with a lot of people, every day of the year, systematically. Listing the things I consider our arrow that are easy to copy wouldn't be fair for me to do. But the list is long. And a lot of that work in the arrow is paid for from whatever the margins enable. 




We need better ways of comparing services. Maybe, just maybe, the question of "what you give us for free" is more insightful than I first thought. 

A simplified analogy to why people shop at Prisma or Citymarket: the smiling greeting. It does not change the sausage they buy, but it changes their experience while buying the sausage.

Friday, January 16, 2026

The Results Gap

Imagine you are given an application to test, with no particular instructions. Your task, implicitly, is to find some of what others have missed. If quality is great, you have nothing to find. If the testing done before is great, none of the things you find surprise anyone. Your work, given that application to test, is to figure out that results gap, and whether it exists in the first place.

You can think of the assignment as being given a paper with text written in invisible ink. The text is there, but it takes special skill to make it visible. If no one cares what is written on the paper, the intellectual challenge alone makes little sense. Finding some of what others have missed, of relevance to the audience asking you to find information, is key. Anything extra is noise.

Back in the days of some projects, the results gap that we testers got to work with was very significant, and we learned to believe developers are unable to deliver quality and test their own things. That was a self-fulfilling prophecy. The developers "saving time" by "using your time" did not actually save time; it was akin to a group of friends eating pizza and leaving the boxes around unless someone walked around pointing at the boxes and reminding them. We know we can do better on basic hygiene, and anyone can point out pizza boxes. It may be that there is other information everyone won't notice, but one reminder turned into a rule works nicely for making those agreements in our social groups. With that, the results gap got to be the surprises.

The results gap is the space between two groups having roughly the same assignment but providing different results. Use of time leads to the gap, because 5-minute unit testing and 50-minute unit testing tend to allow for different activity. Availability of knowledge leads to the gap, because even with time you might not notice problems without a specific context of knowledge. Access to production-like environments and experiences leads to the gap, both by not recognizing what is relevant for the business domain and by not even being able to see it due to missing integrations or data.

Working with the results gap can be difficult. We don't want to use so much time on testing that was already someone else's responsibility. Yet we don't want to leak the problems to production, and we expect the last group assigned responsibility for testing to filter out as much of what the others missed as possible. And we do this best by sizing the results gap and making it smaller, usually through coaching and team agreements.

For example, realizing that by testing and reporting bugs our group was feeding the existence of the results gap led to a systemic change. Reporting bugs by pairing to fix them helped fix the root cause of the bugs. It may have been extra testing effort for our group, but it saved significant time in avoiding rework.

The results gap is a framing for multiple groups' agreed responsibilities towards quality and testing. If no new information surprises you at production time, your layered feedback mechanisms bring you good enough quality (scoping and fixing enough) with good enough testing (testing enough). Meanwhile, my assignments as a testing professional are framed in contemporary exploratory testing, where I combine testing, programming and collaboration to create a system of people and responsibilities where quality and testing leave less of a results gap for us to deal with.

Finally, I want to leave you with this idea: bad testing, without results, is still testing. It just does not give much of any of the benefits you could get with testing. Exploratory testing and learning actively transform bad testing into better. Coverage is focused on walking with the potential to see, but for results, you really need to look and see the details that the sightseeing checklist did not detail.

Tuesday, January 6, 2026

Learning, and why agency matters

Some days Mastodon turns out to be a place of inspiration. Today was one of those. 

It started with me sharing a note from day-to-day work that I was pondering on. We had a 3-hour Basic and Advanced GitHub Copilot training organized at work that I missed, and I turned to my immediate team asking for 1-3 insights into what they learned as they were at the session. I knew they were at the session because I had approved hours that included being in that session. 

I asked as a curious colleague, but I can never help being also their manager at the same time. The question was met with silence. So I asked a few of the people one on one, to learn that they had been in the session but zoned out for various reasons. Some of the reasons included having a hard time relating to the content as it was presented, the French-English accent of the presenters, getting inspired by details that came in too slowly and taking time to search for information online on the side, and just that the content / delivery was not particularly good. 

I found it fascinating. People take 'training' and end up not being trained on the topic they were trained on, to a degree they can't share one insight the training brought them. 

For years, I have been speaking on the idea of agency, the sense of being in control, and how important that is for learning-intensive work like software testing. Taking hours for training and thinking about what you are learning is a great way of observing agency in practice. You have a budget you control, and a goal of learning. What do you do with that budget? How do you come out, having used that budget, as someone who now has learned? It is up to you.

In job interviews, when people don't know test automation, they always say "but I would want to learn". Yet when looking back at their past learning in the space of test automation, I often find that "I have been learning in the past six months" ends up meaning they have invested time in watching videos, without being able to change anything in their behaviors or attain knowledge. They've learned awareness, not skills or habits. My response to claims of learning in the past is to ask for something specific they have been learning, and then to ask to see if they now know how to do it in practice. The most recent example in this space was me asking four senior test automator candidates how to run Robot Framework test cases I had in an IDE; 50% did not know how. We should care a bit more about whether our approaches to learning are impactful. 

So these people, now including me, had the opportunity of investing 3 hours into learning GitHub Copilot. Their learning approach was heavily biased towards the course made available. But with a strong sense of agency, they could do more.

They could:

  • actively seek the 1-3 things to mention from their memories 
  • say they didn't do the thing, and that at the same time they did Y and learned 1-3 things to mention
  • not report the hours as training if the video was playing while they did something completely unrelated
  • stop watching the online session and wait for the video, to have control over speed and fast-forwarding to relevant pieces
  • ...

In the conversations on Mastodon, I learned a few things myself. I was reminded that information intake is a variable I can control from a high sense of agency in my learning process. And I learned there is a concept of 'knowledge exposure grazing', where you are snacking on information, and it is a deliberate strategy for a particular style of learning. 

Like with testing, being able to name our strategies and techniques gives us control over and explainability of what we are doing. And while I ask as a curious colleague / manager, what I really seek is more value for the time investment. If your learning, in a nutshell, teaches others, you are more valuable. If your learning does not even teach you, you are making poor choices. 

Because it's not your company giving you the right trainings; it's you choosing to take the kinds of trainings, in the style, that you know work for you. Through experimentation you learn which variables you should tweak. And that makes you a better learner, and a better tester. 



 

Saturday, January 3, 2026

The Words are a Trap

Someone important to me was having a bad day at work, and sent me a text message to explain their troubles. Being in a completely different mindspace, working on some silly infographic where the loop to their troubles may exist but comes with a longer leash than necessary, instead of responding to what I had every chance of understanding, I sent them the infographic. They were upset, I apologized. We are good.

No matter how well we know each other, our words sometimes come off different than our intentions. Because communication is as much saying and meaning as it is hearing and understanding. 

Observing the text of people like those who are Taking testing! Seriously?!? and noting the emphasis they put on words leaves me thinking that no matter how carefully they choose their words, I will always read every sentence with three different intentions, because I can control that and they can't. Words aren't protected by definitions; they are open to the audience's interpretation. 

I am thinking of this because today, online, I was again corrected. I should not say "manual testing"; the kind of poor quality testing work that I describe is not testing, it's checking. And I wonder, again, why smart people in testing end up believing that correcting the words of the majority leads to them getting the difference between poor quality and good quality testing, and the factors leading up to it. 

A lot of the client representatives I meet also correct me. They tell me the thing I do isn't testing, it's quality assurance. Arguing over words does not matter; the meaning that drives the actions matters. 

Over my career I have been worried about my choice of words. I have had managers I needed to warn ahead of time that someone, someday, will take offense and escalate, even out of proportion. I have relied on remembering things like 'nothing changes if no one gets mad' (in Finnish: mikään ei muutu jos kukaan ei suutu - a lovely wordplay). Speaking your mind can stir a reaction that silence avoids. But the things I care for are too important to me to avoid the risk of conflict. 

I have come to learn this: words are a trap. You can think about them so much you are paralyzed from taking action. You can correct them in others so much that the others don't want to work with you. Pay attention to behaviors, results, and impacts. Sometimes the same words from you don't work, but from your colleague they do. 

We should pay more attention to listening, maybe listening deeper than the words, for that connection. And telling people that testing is QA or that testing is checking really doesn't move the world to a place where people get testing, or are Taking testing... Seriously.



Wednesday, December 31, 2025

Routines of Reflection 2025

As I woke up to a vacation day on 31.12.2025, a thought remained from sleep: I would need to rethink the strategies of how I use my time, and how I make my choices for the next year. I was trying to make sense of the year we are about to leave behind, and I knew that if there was a word I would use to describe it, it would most likely be consistent effort. On holidays and weekends, the consistent effort went into reading, and I have been through more books in a year than I have read in the last ten combined (fiction, 51 titles on Kindle finished in 2025 and 73 in 2024, starting in the week I turned 50). At work, it was whatever was the theme of the week / month / quarter, and I adjusted direction, learning so much throughout the year. 

While the efforts feel high and recognizable, I am not convinced by the strategies behind those efforts, and particularly the impact that I am experiencing or even aspiring to. I am, after all, in a lovely unique career position where I have a lot of power over the choices we make on testing, in an organization where I have a lot of learning to do on how to work on power with people, and particularly power with other organizations. Consulting, and my role in the AI-enhanced application testing transformation, force every day to be one full of learning. 

Describing the effort

As consultants, we track our hours used, leaving me with data of my year at work. 

So I know that I used 7% of my annual work hours on receiving visible training. This included:

  • Participating in conferences I did not speak at: Agile Tampere, Oliopäivät Tampere, Krogerus Data Symposium
  • Classroom training on Sales (did not like this), Delivery framework (liked this), start of Growth training (loving this). 
  • Ensemble learning for the ISTQB Advanced Test Automation certification and completion of the full set of four Advanced certificates. 
  • Ensemble learning for the CPACC accessibility certification and completion of the certification, with the start of the accessibility advocacy that comes with holding the certificate without an exam every five years. 
I was on sick leave for 3% of my annual used hours. This feels like more than usual, but it is still only 6 days. Two very classic flus that my mom always said take 2 weeks or 14 days to recover from, whichever comes first. One bout of back pain when forced to adjust back to the office and different ergonomics. I guess this is the investment of meeting more people face to face again. 

For 39% of my hours, I did something where we specifically agreed on something I would deliver for the clients, and the customers would pay for it. Those hours were split between six distinct customers. Thematically, people paid me to transform stale testing organizations, teach contemporary exploratory testing, introduce and improve test automation and AI, bring in throughput metrics, and test in projects where others before me had not managed to find the problems the customers would have found. 

That leaves 51% of my efforts for two categories: admin and proposal. Admin is when we run our own organization, and that took 20% of my time. I have direct reports, and a testing community of excellence to facilitate. I hired 4 people in 2025. Proposal is when we are creating partnerships with the clients but they aren't paying for that work, and it was 31% of my effort. On proposals, I worked with 72 organizations, out of which 51 were clients, 17 partners in delivery for the customers, and 4 internal customers. 

You can guess from the numbers that this was particularly challenging. I learned to take notes, and to categorize and model problems and solutions, through consistent practice. And I appreciate the window into the challenges of testing that the work I hold now offers. If only I could figure out how to better turn it into impact - and I will. 

The proposal category also included all of the public speaking I did that was not paid, which was 23 of the 25 sessions I delivered in 2025. 

I spoke on: 
  • AI, particularly Agents in GitHub Copilot for non-automation use cases
  • Python, teaching an 8-part series of Python for testing at work instead of complaining for the same amount of time that some people did not know the basics - they do now. 
  • Contemporary exploratory testing, seeing versatile problems in target applications and combining automation into it
A particular learning event of the year was making a typo worth 24M euros, a definite gem in my collection of costs of bugs. That story gets better as soon as I prove that without the typo, the 24M would have been ours, and I may need another year for that. :D

Socials

The year on social media was interesting. I got to feel the split on multiple dimensions. 

I was on Mastodon (a lot!) and LinkedIn, and I missed the time of Twitter. On Mastodon, I tallied 1.1K followers and 4.2K posts. On LinkedIn, I tallied 12,008 followers and 778,849 annual impressions (down 34.1% from the previous year). I blogged less than before, and had 314K views, taking the total now to 1,422,454 page views. 

I wrote in Finnish and English on LinkedIn, and missed the possibility of just picking one. 

I learned that a fairly popular blog, showing up consistently, combined with AI, makes AI answer my kinds of questions better, without referencing me. If you care about work (results) over the credit, 'best ideas win' has never been more true. 

Challenges

As we all should know, movement is not work. Effort alone does not bring impact. So I find myself making a tally of the challenges.
  1. Level of testing skill
  2. The controls at scale organization, allocation and targets
  3. Sense of agency with understanding of impacts
If there was a theme of insights this year solidified, it was the insight on the lack of testing skills in testers. There is a little too much reliance on external sources or serendipity, and a little too little intentional search for relevant information and continuous improvement of testing systems. With the exceptional levels of educational materials available, the level of this information turned into practice continues to be a challenge. The issue has grown over time since testers are, due to various transformational reasons, more often the one tester in a team of developers. 

Addressing something at scale is a whole other problem. I learned about changing organizations in practice, and how organizations, money allocation and target setting are tools of scale that I have been so bad at and need to learn more about. I have focused on problems of information and examples, but I need to solve the frame that allows people to change. 

There are a lot of learned practices of weaponized helplessness, and a lack of seeing systems for one's own impact, that mean a lot of people find themselves smaller cogs in the system than I believe they have power over. Our ideas of what is possible and what is safe, and what is mine to change or even comment on, have a significant impact on what we are capable of achieving together. 

Impacts 

I'd like to think that some of the testing advice or inspirations I have provided this year have impacts that I only learn of later. Kind of like receiving a message this year from two people I worked with 10 years ago: one telling me that I still impact their career on a regular basis due to the timing of when our professional paths crossed, and another telling me their organization now has better diversity mechanisms because our time together was one where I invested effort into letting people know I am not "guys" and that I would risk personal negative consequences for working for social justice. 

So with all the reflection, I leave a call for myself and the community around me on finding ways of fixing challenge 1 - the skill of testing. While I have a sense of the need for personal contribution in that space, I also know that the only way we solve problems at scale is democratization of knowledge and working together. So that is up next, going for 2026. 

Closing off

I still think my reflection wins over what social-media-based AI tools can do. Top quote is my challenge 1. 


And me on GitHub (screenshot on Mastodon). My work is not on GitHub. 


Monday, December 15, 2025

Participant skills to retrospectives

I'm an avid admirer of retrospectives and the sentiment that if there is one agile practice we implement, let it be retrospectives. We need to think about how we are doing on a cadence, and we need to do it so that we ourselves get to enjoy the improvements. Thus retrospectives, not postmortems. Because let's face it: even if I learned today that we have a lessons-learned-from-past-projects database, for me to apply other people's lessons learned, it's likely that no amount of documentation on their context is sufficient to pass me the information. Retrospectives maintain a social context in which we are asked to learn. 

Last week I argued that the best retros I have been in were due to great participants with an average (if any) facilitator. My experience, spoken out loud, resulted in a colleague writing a whole blog post on facilitator skills: https://www.linkedin.com/pulse/retrospectives-why-facilitators-skills-matter-more-than-spiik-mvp--fmduf/?trackingId=yc302cSWzR0ZhHTSNoyVmQ%3D%3D

Success with retrospectives appears to be less an issue of roles and responsibilities than of a learning culture. When in a team where each gig is short and you could be out any moment (consulting!), it takes courage to open up about something that could be improved. The facilitator does not make it safe. The culture makes it safe. And while the facilitator may hold space for that safety and point out the lack of it, anyone can. 

When someone speaks too much, we think it's the facilitator's skills that matter in balancing the voices. Balancing could come from anyone in the team. Assuming the facilitator notices all things feels like too much to put on a single person. 

Building cultures where the work does not rely on a dedicated role is kind of what I'd like to see. Rotating the role on the way to such a state tends to be a better version than having someone consistently run retros. 

Having facilitation skills correlates with having participation skills too. At least it changes the dynamic from a passive participant afraid to express their ideas to an active contributor.