Thursday, July 31, 2025

Code Reviews Have Already Changed

Reading just about anything these days, your mind fills with doubt: is the thing I am reading written by a person, or generated with AI from a person's prompt? And that matters, because if writing takes 1 minute, 600 people reading it for 10 seconds each takes 100 minutes. Producing text can be automated. The world has changed that way.

Reading was a problem already before generative AI. In my 2022 TSQA talk 'Something in the way we test', I addressed the ridiculous notion of writing 5000 test cases, or spending 11 working days just reading them through. In the two years that followed, I learned over and over again that the 5000 test cases I chose to set aside held no relevant answers to any of the realistic queries we had with business representatives.

Reading is an even more significant problem now with generative AI. That makes reviewing things before putting them in front of readers more essential than ever.

Six months ago I started drafting a talk (without a stage to present it on) with the title:
Code Reviews Have Already Changed

This was a talk that built on the TSQA talk, adding a genAI perspective from recent experiences and a call to action: really learning to do what another talk of mine had formulated experiences with:

RAGified Exploratory Testing Notetaking

That talk was built on years of experience taking notes, and on how those notes were supercharged when used with genAI in a RAG framing.
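
To make the RAG framing concrete, here is a minimal sketch of the pattern, with chromadb as the retrieval store; the note contents, collection name, and prompt assembly are my illustration of the idea, not the exact setup from the talk.

```python
# rag_notes.py - a minimal sketch of RAG over exploratory testing notes,
# using chromadb as the retrieval store (illustrative; any vector store works).
import chromadb

client = chromadb.Client()
notes = client.create_collection("testing_notes")

# Session notes as short documents; in practice these come from years of files.
notes.add(
    ids=["2023-05-11", "2024-01-20"],
    documents=[
        "Charter: explore invoice rounding. Found off-by-one cents on refunds.",
        "Charter: retest refunds after fix. Rounding holds, new timeout on bulk.",
    ],
)

# Retrieval step: pull the most relevant notes for a question...
hits = notes.query(query_texts=["What do we know about refund rounding?"], n_results=2)

# ...and hand them to a genAI chat as grounding context (prompt assembly only;
# the actual model call depends on whichever assistant is approved for use).
context = "\n".join(hits["documents"][0])
prompt = f"Using only these notes:\n{context}\n\nAnswer: what do we know about refund rounding?"
print(prompt)
```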

During summer break I came to realize that I have no need to do talks, so I might just as well write selected pieces into my blog.

So here are my two selected pieces: 

1) The Selenium open source project's year of CodiumAI / Qodo

While active as a member of the Selenium project leadership group, I had fun watching dynamics that I wanted to revisit with a research perspective. An AI review assistant was in place, and it had fascinating impacts.

Dehumanized feedback was easier to dismiss. Emotion management for giving feedback in code reviews is a real thing, and having genAI generate code reviews produced a stream of silent dismissals UNTIL it found something relevant, which revealed that people do read. The combination of PRs and chats provides a fascinating record of this, showing that most AI feedback does not warrant a reaction and is clearly easier to dismiss than a real person's review.

Simultaneously, you could see AI lowering the bar for people to try contributing without always knowing what they were doing. Some open source projects went as far as refusing AI-assisted contributions. Others, like Selenium, saw an increased load of attending to people's emotions when they got feedback from reviews.

Code reviews have changed: 

  • There are more of them to do, and the first reviewer after the AI does not seem to exercise sufficient critical thinking about context
  • Knowing something is AI-generated is valuable information to save time on emotion management labour

2) Ownership of generated code / text is hard at work

A colleague between projects was working with me on a proof of concept for a particular combination of platform product and test automation. People between projects have a lot of time and focus, while those of us in projects have other things going on. In place of my assistance, genAI was present for a month.

When the day came that I reviewed the code, the result was deleting most of it as unnecessary. The same capabilities were already available, and you'd know that if you read the first page of the tool's tutorial.

"GenAI made me do it" was poor excuse for not reading a definitive tutorial and creating something to delete for a month. 

Similarly, more recently, someone worked for a week before I was available. In the week I became available, I spent two days reading what had been written and trying, with a lot of effort, to preserve some of the work that came before mine. Today I learned that I had spent those two days preserving AI-generated text, because you just don't throw away a reported week of someone else's work.

Ownership has changed:
  • Your agency is essential. Accepting delays in feedback still causes trouble. 
  • Declaring your text as AI-generated (prompts included) would be a more helpful way of contributing to next steps than merely generating the text. Protecting feelings costs time. 
  • Generated text, with knowledge of it being generated, acts as external imagination and sets you up to compete to do better. 

If I took the time to put this together into the story I would tell on stage, it would still make a great talk. For that to happen, someone needs to do the work of inviting me, because I will not do the work of participating in calls for proposals. And I won't pay to speak.



Wednesday, July 30, 2025

Accepting and constraining leadership

As I leave my summer vacation 2025 behind, I have learned a bit about leadership and decision making. Particularly, I have learned that in my role as service director responsible for testing services and applying AI in application testing, I have hit a threshold on my capability for making decisions.

That meant the entire summer vacation was an exercise in distributed decision-making in the family, requiring my teenage children to take their fair share of deciding what it is we are having for lunch today.

For someone who spent decades fighting against the idea of being called a "leader", with an identity of rather being the first follower for ideas worth amplifying, the number of decisions required has been overwhelming. Overwhelming enough that I no longer want to decide what to eat or when and where to meet, and I have come to hate the fact that so few people step up to do their share.

Reading some of the things people write about management roles isn't exactly helping. The idea of being handed power over people as a blank slate bothers me. So I looked around for models that could help me make sense of it all. And I wanted to share three favorites.

The first model is about splitting power: power over, power with, and power to. It is a reminder that we always have the power to self-organize, an expectation of agency, autonomy and empowerment. It is kind of a foundation for why I did not want to go into leadership: I was comfortable with decisions that impacted my own results, immediate and growth alike. With the new role, I apply more power-with kinds of approaches, and keep to a minimal scope the things where I have to apply power over anything or anyone.

What helped me make sense at the team level was the mandate levels model by John Cutler. Recognizing how assignments arrive and are framed, and clarifying that framing, has been invaluable.

And finally, the third one is more oriented to how I communicate decisions, since the framing I have is clearly less than obvious to others. I need more words to explain where I am coming from.

For the lunch decisions, we went with "What is the issue and how should it be dealt with?". Out of all the degrees of consultation, it was the only one that helped with my sense of overwhelm.

So I have a new frame for the leadership I exercise: I continue to reject power over, and lead towards power with and power to. It may be more work, but it is work that fits my ideas of how the world should be. 

Saturday, June 28, 2025

The Art of Framing

There's an actively harmful institutionalized practice in the world of testing: ISTQB certifications. At this point of my career, it is one of those institutions where I risk more than I can ever hope to gain by speaking against it, and my realistic option for speaking against it is to make it extra personal.

I'm well aware that in my work, I have customers who, much to my dislike, require ISTQB certifications from our consultants. I have lots of colleagues with those certifications: I just calculated that we have 30 individuals with Advanced level certifications, and that while I am one of those 30, I now hold 4/4 of the Advanced level certifications.

Yesterday I added the ISTQB Advanced Test Automation Specialist certification to my name, because yesterday it awarded my employer one more point on a bid with a 4pm deadline. I got mine at 2pm, doing everything in my power to ensure that my lovely colleagues and I have a chance to work for the next five years.

I wasn't the only one working to get the certificate; 8 colleagues worked on it too. They supported my learning by showing up for five 1-hour ensemble learning sessions where we learned to answer as the test requires, even when that is incorrect advice for real projects. Being a social learner, those 5 hours of failing to answer right in front of my peers were how I became certified. I was particularly proud of a colleague who, while an excellent tester and test automator with 5 years of experience, struggles with the type of learning the certification requires. They spent full working days for four weeks in intense study to get through both Foundation and Advanced Test Automation, and while they failed the latter, they will get through it soon.

ISTQB certifications are a harmful institutionalized practice, because none of us who want work for the next five years are powerful enough to walk away. If our customers choose us based on them, we jump through the hoops of getting them. Our customers choose what training our people get, and it is harmful that they choose this, because this does not create good testers. There are real options that create good testers, but not only do customers not choose those: with the system of limited training budgets, their choice means people don't get the right courses.

What are the right courses then? Well, anything that I have taken that has grown me into the guru I am now. I still say guru with tongue in cheek, because while I know that I do good testing and design good groups of people who do testing, I'm not a guru. I am someone who, being rather unremarkable, got the chance to be remarkable.

Education and learning is my gist: I promote it, I work for it. Because we need something better institutionalized than this ISTQB certification stuff. I am where I am because I learn. And I will become more deserving of the guru label because I am not done learning.

All of the above is how I frame why I choose to show up every now and then to share a perspective against ISTQB, while establishing that I know the contents, not just from research but from being able to pass the mark. I quote my credentials because the customers who I hope will learn to ask for something better don't know me as a guru; they know me as a tester. And testers aren't high and mighty. Even the ones who are gurus are bottom feeders in the overall hierarchy of roles.

I remind people that: 

  • I am one of the authors of the ISTQB Foundation Syllabus - and my efforts at making it better were undermined for the business reasons of established training providers
  • I hold copyright to my work on the ISTQB Foundation syllabus that I never passed on. They sent me a contract, and I never signed it. No matter what they write in the forewords, it is a lie and reflects how they operate
  • I have limited time, and I choose to be for, not against: I am for better use of our learning time. Go to conferences. Take BBST courses. Support independent course providers by taking their courses. Your people will do better work if you make better choices. 
  • I hold ISTQB Foundation certificate (CTFL), ISTQB Advanced Test Manager, ISTQB Advanced Test Analyst, ISTQB Advanced Technical Test Analyst and ISTQB Advanced Test Automation Engineer certificates. I passed them by learning to answer as they wanted, not what is right. 

Credentials are there to establish that I have done my fair share of learning what I speak about. And I speak about giving a month of learning time for my colleagues to do better courses.

There is something I have that many of my colleagues don't, and it's not the guru status. It is a support network that allows me to first do the work and then do the equivalent of learning on my own time. I am not bound by the same budget limits, because 20 years ago I made the choice to use my holidays and free time for speaking at conferences. I consider reading professional books a fun pastime. And I acknowledge that this choice, made as a single mom of two kids, coming from a family of a single mom and six kids, required an exceptional support network in my extended family.

Most of us get one course a year, and we need to make good choices. With ISTQB institutionalized as it is, the choice is harmful for our industry. 

All of this is a long response to a comment from a colleague who this time was brave enough to leave the comment visible long enough for me to read it. The two previous ones, I can only guess, had a similar sentiment of hating me and framing me with much less cause than I frame myself.

We need better institutions. And while I may use myself to get the message across, I also chose my current position to get the message across to more people who matter. My colleagues can get daily education, and I use my power, the bit I hold internally, on answering all the questions from non-tester managers asking how to support their brilliant testers on their growth paths.

No matter the platform I am on, I am dragging people to stand there with me if they want it. And generally speaking, we want to do well at our work. My metric is not how far I have climbed, but how many people I get to climb with me.


Friday, May 2, 2025

Name Three Smells for This Week

The world of testing is not ready and done. We are testing, sure, yet there is so much work on doing better that we would need. I wanted to share three stories that blew my mind this week. Remember, while I see a lot of the projects my company works with, I also see a lot of projects my company does not work with but the community at large does.

These are my smells of the week. 

1. Manual automated testing

Testers have been automating testing for some years, and have some hundreds of test cases automated. Sounds great. But something feels a little off: for quite some time now, no new tests have been added, yet the effort of testing stays high, with maintenance quoted as the reason.

In a discussion, we dig a little deeper. The tests are not reliably green, which means they are run in an isolated pipeline step that is executed manually.

Asking another team, there isn't even a pipeline. The automated tests are executed manually on a tester's computer, on demand.

Necessary cleanup: quarantine the unreliable tests as a route to moving your tests into actual pipelines that run on change.
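
As a sketch of that route, assuming pytest: mark the unreliable tests, keep them out of the blocking run, and keep a separate non-blocking step exercising them until they are fixed or deleted. The marker name, example test, and pipeline split are my illustration.

```python
# conftest.py (illustrative) - register the quarantine marker so pytest
# does not warn about an unknown mark.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "quarantine: unreliable test, kept out of the blocking pipeline"
    )

# test_checkout.py (illustrative)
import pytest

@pytest.mark.quarantine
def test_bulk_discount_total():
    """Known flaky (timing-dependent); quarantined until stabilized."""

# Pipeline split (illustrative):
#   blocking step, runs on every change:   pytest -m "not quarantine"
#   non-blocking step, tracks the flaky:   pytest -m quarantine
```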


2. Leaky regression testing

A new release went to production, a month of regression testing was done before it, and yet 20+ problems were reported within days of going to production. Maybe a month is not sufficient?

In a discussion, we dig a little deeper. We learn that regression testing in this case means not looking at the changes that were made, but protecting basic business scenarios by repeating the same tests every time. The trouble is, the changes are different every time, and they tend to break things not in the basic business scenarios but wherever the changes were made. Yet we insist on more blind regression testing over opening our eyes to try to follow the changes.

Necessary cleanup: stop pretending regression tests without change-based analysis are the solution. Implement everything as code, follow the changes, and you are likely to do better.
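
A minimal sketch of what following the changes can look like in code, assuming tests live in the same git repository. The release tag, paths, and mapping are my illustration of the idea, not a full test impact analysis.

```python
# select_tests.py - illustrative change-based test selection. Maps files
# changed since a release tag to the test modules that cover them, and
# runs those first, instead of a blind full regression sweep.
import subprocess

# Hypothetical mapping from production packages to their test directories.
COVERAGE_MAP = {
    "src/checkout/": "tests/checkout/",
    "src/invoicing/": "tests/invoicing/",
}

def changed_files(since: str = "last-release") -> list[str]:
    # "last-release" is a hypothetical git tag marking the previous release.
    out = subprocess.run(
        ["git", "diff", "--name-only", since, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def tests_for_changes(files: list[str]) -> set[str]:
    # Collect the test directories whose covered source files changed.
    selected = set()
    for f in files:
        for src, tests in COVERAGE_MAP.items():
            if f.startswith(src):
                selected.add(tests)
    return selected

if __name__ == "__main__":
    targets = tests_for_changes(changed_files())
    # Run the change-targeted tests before (or instead of) the blind sweep.
    subprocess.run(["pytest", *sorted(targets)], check=False)
```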


3. Test case obsession

A new senior tester joins a product team with a history of high automation and success with continuous integration and deployment. The tester concludes that not having a manual test repository is a problem, and works to manually transform automated tests into test cases, because test cases in Jira Xray or an equivalent have always been the core of what good testing is to them.

Necessary cleanup: understand that code is documentation too. Be careful with the kind of experiences you recruit for. 
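
To make "code is documentation too" concrete: an automated test can carry the intent, steps, and expectations a Jira test case would, directly in its name and docstring. The page objects and `app` fixture below are my illustration.

```python
# test_login.py - illustrative: the test reads as its own test case,
# so a parallel manual test-case repository adds nothing.
def test_registered_user_can_log_in_and_sees_their_dashboard(app):
    """Steps: open login page, submit valid credentials, verify dashboard greeting."""
    login_page = app.open_login_page()            # step 1: open login page
    dashboard = login_page.log_in(                # step 2: submit valid credentials
        username="registered_user", password="valid-password"
    )
    assert dashboard.greets("registered_user")    # expectation: personal greeting
```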



Wednesday, April 16, 2025

Resultful Testing

All of March, I did not manage to make time to blog. There was a lot going on:

  • Modeling and understanding how to describe the differences in results of testing for different setups of testing
  • Modeling and finding people with a contemporary exploratory tester profile, where people would need to know how to test, program for test purposes, and collaborate with various stakeholders
  • Experimenting with an exercise for people to test and program for test purposes to see if they fit the contemporary exploratory tester profile. 
  • Selenium & Appium Conference in Valencia, Spain
  • Usual work

A lot of my thinking is around the idea that to recognize resultful testing (testing that produces the results it would be fair to expect of testing), you need to test and know testing. There are a lot of experiences where people in various roles believe they can't test at a resultful scale. They can test, kind of like everyone can sing, but not test, kind of like not getting on stage to sing for an audience of hundreds. Resultful is limited by attitudes, by the effort put into testing, and to a degree, by the abilities of people. Believing in a growth mindset, however, ability and skill follow effort.

There are many teams without testers who do a good job on resultful testing. Some of these teams rely heavily on acceptance testers in customer organizations to complete the results, but others have a more product-development default of enjoying the results without acceptance testing.

There are also many teams with testers who do a good job on resultful testing. And there are separate testing teams that glue resultful testing on, kind of like acceptance testers would, representing the same or a different organization, but at least being a separate, independent team that still collaborates.

This is nothing new, but a movement that the entire industry has been in for years. Testing tends to be more valuable integrated with development. It's feedback that, when trusted in the teams, is a productivity boost, not just a time saving on testing by automating it. 

I find myself going back to the model I created a few years ago on the results gap, and classifying it by splitting projects into two categories.


Sometimes the results gap I work with as a tester makes my work feel like garbage collecting, and coaching a team not to litter. Other times, I work for real quality and surprises.

My assignment, regardless, is this: find (some of) what others may have missed. Present it in consumable chunks, prioritized and enriched with context information for the decisions.
 



Tuesday, February 25, 2025

Habit of AI in Quality Engineering

I did a talk today and published the slides. I encouraged actively seeking uses to learn from, rather than seeking the one tool that does the thing for you.

https://speakerdeck.com/maaretp/habit-of-ai-in-quality-engineering

Someone from my audience provided an AI-generated example of exactly that one-up attitude to the use of your time: speech to text, text to a summary of my messages with a basic approved genAI chatbot.

Overlaid in red are the things where either I or the AI missed the mark on the message.

  • Introduction to AI in Quality Engineering:
    • The document begins with an introduction to the speaker's experiences with AI in quality engineering, highlighting the relevance of AI tools in improving testing processes.
    • The speaker shares their initial encounter with GitHub Copilot in 2021, emphasizing its potential to assist in coding and testing tasks.
  • Job Interview Experience:
    • The speaker recounts a job interview where they used GitHub Copilot during a pair programming session. This experience demonstrated how AI could assist in testing and coding tasks.
    • The interview involved converting integers to English words, and the speaker used GitHub Copilot to generate multiple solutions, showcasing the tool's capabilities and limitations.
  • Evolution of GitHub Copilot:
    • The document details the evolution of GitHub Copilot from its early access days to its current state in 2025. Initially, the tool had limitations, but it has since seen significant improvements and new features.
    • The speaker highlights the introduction of features like multi-file context, chat interfaces, and agent-type functionalities, which have enhanced the tool's usability and effectiveness.
  • Role of AI in Testing:
    • AI is portrayed as a valuable pair programmer and testing assistant, capable of generating code and test cases. However, human oversight and decision-making remain crucial to ensure quality and accuracy.
    • The speaker emphasizes the importance of evaluating AI-generated outputs and making informed decisions based on the context and requirements of the task at hand.
  • Building AI Habits:
    • The document discusses the importance of integrating AI tools into daily tasks to build effective habits. The speaker encourages using AI tools like GitHub Copilot and Edge Copilot to enhance productivity and learning.
    • Practical advice is given on how to start using AI tools in everyday work, such as seeking second opinions, refining prompts, and continuously learning from AI interactions.
  • Challenges and Considerations:
    • The speaker addresses the challenges of balancing automation with manual testing, noting that while AI can assist in many tasks, it is essential to maintain a human touch to ensure comprehensive testing.
    • Considerations are given to the quality and accuracy of AI-generated outputs, emphasizing the need for thorough evaluation and validation.
  • Future of AI in Quality Engineering:
    • The document concludes with a forward-looking perspective on the potential of AI to transform testing practices. The speaker envisions a future where AI tools are seamlessly integrated into quality engineering processes.
    • Continuous learning and adaptation are highlighted as key factors in leveraging AI effectively, with an emphasis on staying updated with new features and advancements in AI technology.

The summary layers on a tone that isn't mine. My tone is one of *focusing on today*, of considering learning a journey, and of concluding that in anything and everything we do, we are either learning or contributing. It's that learning that saves the percentages of time that allow us to do more, not the autogeneration of contributions. Learning is curation of AI contributions, deciding who holds the reins.

Friday, February 21, 2025

Remembering job interviews

As I delivered a talk today on 'Lessons Learned from Landing a Job Offer with GenAI', someone from the audience wrote a comment on the impressive learning journey that a serendipitous chain of events started for me. The comment dwelled in the back of my mind, bringing out the realization that I have quite a collection of stories about what job interviews can be like for a senior. Some of the jobs I ended up taking, others not, as interviews are two-sided explorations of needs and aspirations. I wanted to draft together a view into some of these.

The interview when GitHub Copilot was new

I had gone through a few rounds of conversations, and the last step of the process was a pair programming interview. It was my first pair programming interview, and I approached the interview with concerns: 

  • Testing ME on programming skill? I'm a tester, and while I write code and pair on all kinds of tasks, writing code is not where I shine. Layering feedback on top of code as it's being written or after it has been written, that's my ballpark. 
  • Pairing in an interview? Watching me do without working with me is not pairing. That alone was enough to feel wary. 

The instructions were to come with an IDE and setup of my own, and we'd work from there.

The serendipitous event that made this a story worth a stage was that GitHub Copilot had just been released to the public. October 29th 2021 was the release date, and December 10th 2021 was my interview date. I was also lucky to get access early on.

While I could have prepared for the interview without generative AI, I just could not resist using it and taking it along with me. I also had an idea: if I was to show what I do as a *tester* on code, having a programmer who does not refuse to pair with me would be a good idea.

I practiced with the Roman Numerals kata. I learned a lot about how it would be tested by doing the usual moves I do for exploratory testing. I read up on some "specs". I generated a selection of outputs programmatically for a range of inputs. I compared the generated outputs against the outputs of other programs, particularly a web application praised for how good it was at roman numerals, and Excel.
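
A sketch of those two moves combined, assuming an implementation under test and a CSV exported from another oracle such as Excel's ROMAN function; the file name and functions are my illustration of the pattern.

```python
# compare_oracles.py - illustrative cross-oracle check for the Roman numerals kata.
# Generates outputs for a range of inputs and diffs them against a reference
# produced by another program (for example Excel's ROMAN function, saved as CSV).
import csv

def to_roman(n: int) -> str:
    """Implementation under test."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    result = []
    for value, symbol in numerals:
        while n >= value:
            result.append(symbol)
            n -= value
    return "".join(result)

def diff_against_reference(path: str = "excel_roman.csv") -> list[tuple[int, str, str]]:
    """Rows of (input, reference output, our output) where the oracles disagree."""
    disagreements = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            n, expected = int(row[0]), row[1]
            actual = to_roman(n)
            if actual != expected:
                disagreements.append((n, expected, actual))
    return disagreements
```

Every disagreement is a lead to explore: sometimes a bug in the implementation, sometimes a bug in the oracle.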

While I did not yet become the Roman Numerals expert I am now, I got started. I am the expert I am now because I turned that into an exercise I did with some hundred people, crowdsourcing learnings and collating those into a talk known as 'Let's Do a Thing and Call It Foo'.

Showing up to the interview, I found it a little funny how the exercise I was expected to work on was not Roman numerals (1 --> I) but numbers to text (1 --> One). These are very similar problems. 

While I recognized that my pair's idea was to get me to write examples TDD-style and grow the application, we did only a little bit of that. We ended up writing a line of comment, selecting our chosen implementation from GitHub Copilot's list of 10 options, and then focusing on writing example tests and approval tests, and talking about my choices of those.
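
For the approval-test part, the ApprovalTests library packages the pattern, but a hand-rolled sketch shows the idea against the interview's numbers-to-text problem (implementation limited to 0..99 for brevity; the approved-file name is my choice):

```python
# test_number_to_words_approval.py - a hand-rolled approval-style check
# (the ApprovalTests library packages this pattern with less ceremony).
from pathlib import Path

APPROVED = Path("number_to_words.approved.txt")

UNITS = ["Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven",
         "Eight", "Nine", "Ten", "Eleven", "Twelve", "Thirteen", "Fourteen",
         "Fifteen", "Sixteen", "Seventeen", "Eighteen", "Nineteen"]
TENS = ["", "", "Twenty", "Thirty", "Forty", "Fifty", "Sixty", "Seventy",
        "Eighty", "Ninety"]

def number_to_words(n: int) -> str:
    """Implementation under test, limited to 0..99 for the sketch."""
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    return TENS[tens] + ("" if unit == 0 else " " + UNITS[unit])

def test_number_to_words_approval():
    # Capture the received output for a whole range of inputs at once.
    received = "\n".join(f"{n} -> {number_to_words(n)}" for n in range(100))
    if not APPROVED.exists():
        APPROVED.write_text(received)  # first run: approve by inspection
    assert received == APPROVED.read_text()
```

One approved file covers a hundred examples at once, which is exactly why the approach suits reviewing generated implementations.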

Fairly soon after, I let the company know I would not be joining. Being the only tester and the only woman, and expecting my life there to include bringing that perspective in, did not feel like the right personal choice. They would not have understood the extra load that places on me.

The interview where we tested together for the whole day

I had again gone through multiple layers of conversations, and even a full-day psychological evaluation to qualify me as a possible leader for the organization. I had given them the option of seeing me in action: I was training a course concept on exploratory testing where we could teach my future colleagues (and I could meet them), with us testing their application. We set that up.

The course is fairly standard; I have run it for a lot of different companies and knew what I was up against. The experience was also fairly standard: with my facilitation, we found significant bugs in the latest version of their software, bugs they considered important and had not found on their own. I learned my future colleagues would have been lovely, and they would have welcomed me to the organization.

I ended up not accepting the position. I felt they should have offered to compensate me for the day I taught them, and I did not like the way one of the hiring managers challenged testing when they should have shown support.

The interview that was a workshop on creating my own job description

This interview was for a company that I knew already I wanted to work at. That is a lovely starting point, and the whole interview experience was really built around allowing me a good start at the right place of the company. 

The two interview sessions set up for me were with colleagues whom I interviewed about what they'd like from someone like me, to incorporate that into my job description. I wrote my own job description, which then became a part of the offer I accepted. It was a great way of landing me, with support, into a fairly big and complicated, even siloed, organization.

Exceptional, and I worked there for multiple years. 

The interview that was psych evaluation telling me I am not fit for my career

This one was me applying for a position through a standard process. I went through interviews that I don't remember in particular, but the half-day of testing at an evaluation center, that one I remember. It was my first experience of those, and I had significantly more than 10 years of a tester career behind me by then. Unlike the other psych evaluation, which assessed my strengths and weaknesses as a manager, this one tested my intellectual abilities and created a profile of my preferences with a questionnaire of some sort.

I will always remember how hilarious I found it that I got a paper telling me I am not likely to be successful in a tester career. Not only was I successful then, I continued to be after it, but at least now I have it on paper that no one should let me test. Apparently I am not cut out for it. Or, more likely, they don't know who would be cut out to do a good job on testing.

The organization said they had a decision of principle not to hire without this service provider's recommendation. I did not get the job, and I am not convinced I would have taken it even if it had been offered to me.

The interview where we went through an improvement plan whose texts were written by me (CC licensed)

On this one, the recruiting manager came in with a TPI (test process improvement) assessment report, to discuss my approach to helping them improve testing while doing it. The conversation was lovely, but the report made it memorable: I had written most of the texts in that report. Not that the recruiting manager knew that beforehand.

It soon became clear that I knew the structure of the report and its likely conclusions, and could enhance them in the moment. I could even correct some of the mistakes from my public Creative Commons materials that had helped write that report.

I took the job, and loved my time there. 

The interview where they made me test a text field

This one was a fun one. It was my second time joining the same organization, a result of people I had worked with before inviting me to interview with them. While I had been gone, things had changed: in the interview I had an architect who wanted me to show I know how to test, because that is apparently how you test testers.

They asked me to test a text field. And I told them I was around when that assessment exercise was created, and that I had talked about the exercise on conference stages since, enhancing it to actually having a text field we could test, with real context to it.

They then asked me to test a chair, or to tell them how. I refused to play along, politely. I did not consider it a worthwhile test of my skills, but more of a humorous conversation.

I got the job, took the job, and absolutely loved my time there. More than anywhere else, even if I have loved working wherever I have been.

The interview where they made me test notepad

This one was my foundational interview, for the first job I ever had in testing. I had no clue what testing was. They invited me to a classroom setup with many other people, sat me in front of a computer, and told me to report discrepancies between the English and Finnish versions of Notepad. Some bugs had been seeded into the Finnish version, and I was expected to report them systematically.

This is how I became a tester. 

The interview where we talked about meaningful work

To conclude, I need to talk about my latest job interview, the one that landed me the position I am in now. It was a pleasant experience of meeting people twice to talk about my aspirations, my search for meaningful work and meaningful systems, and their organization.

It felt painless, collaborative and appreciative. Then again, the people interviewing me were aware of me and my work, even if they did not know me. 

I'm very happy I accepted the position, and I am even more happy that they made the position something I could not have known to ask for. Carving the right shape for me is what I appreciate the most.

Others I don't remember specifically

I'm sure there are others. After all, I have been around a while. I have been loyal to my employer for the time I am there, and open about my ideas of what I want to spend my limited time on next. Knowing I will commit to a minimum of two years and work to leave my places of work in a better state than I found them has generally been helpful.

Are your stories as varied as mine?