Tuesday, April 7, 2020

Developer-centric way of working with three flight levels

The way we have been working for the last few years can best be described as a developer-centric way of working.

Where other people draw processes as filters from customer idea to the development machinery, I illustrate our process the way I have always illustrated exploratory testing: by putting the main actor in the center. With the main actor, good things either become reality or fail to do so.


In the center of software development are two people: 
  • the customer willing to pay for whatever value the software is creating for them
  • the developer turning ideas into code
Without code, the software isn't providing value. Without good ideas of value, we are likely to miss the mark for the customer. 

Even with developers in the center, they are not alone. There are other developers, and other roles: testers, UX, managers, sales, and support, just to mention a few. But with the developer, the idea either gets turned into code and moved through a release machinery (also code) or it doesn't, and only through making changes to what we already have can something new be done with our software. 

As I describe our way of working, I often focus on the idea of having no product owner. We have product experts and specialists crunching through data of a versatile customer base, but given those numbers, a product owner isn't deciding what the next feature to implement is. The data drives the team to make those decisions. At the Brewing Agile conference, I realized there was another way of modeling our approach that would benefit people trying to understand what we do and how we get it done: flight levels.

Flight levels is the idea that when describing our (agile) process, we need to address it from three perspectives:
  1. Doing the work - the team-level perspective, often the focus in how we describe agile teams working and finding ways to work together
  2. Getting the work coordinated - the cross-team level of getting things done at a scale larger than a single team
  3. Leading the business - the level where we define why the organization exists and what value it exists to create and turn into positive finances
As I say "no product owner", I have previously explained all of this dynamic at the level of the team doing the work, leaving out the two other levels. But I have come to realize that the two other levels are perhaps more insightful than the first.

For getting work coordinated, we build a network from every single team member to other people in the organization. When I recognize that a colleague is often transmitting messages from another development team on test automation, I recognize they fill that gap, and I serve my team by focusing on another connection. We share what our connections give us, and I think of this as "going fishing and bringing the fish home". The fluency and trust of not having to be part of all conversations but tapping into a distributed conversation model is the engine that enables a lot of what we achieve. 

For leading the business, we listen to our company's high-level management and our business metrics, comparing them to telemetry we can create from the product. Even if the mechanism is more broadcast-and-verify than co-create, we seem to think of the management as a group serving the important purpose of guiding the frame in which we all work toward a shared goal. This third level is like connecting networks that serve different purposes. 

The three levels being in place, even implicitly, has enabled us to be more successful than others around us. Not just the sense of ownership and excellence of skills, but the system that supports them - one quite different from what you might usually expect to see. 

Saturday, March 28, 2020

A Python Koans Learning Experiment

I'm curious by nature. And when I say curious, I mean I have a hard time sticking to what I was doing, because I keep discovering other things.

When I'm curious while I test, I call it exploratory testing. It leads me to discover information other people benefit from, and would go without if I didn't share my insights.

When I'm curious while I learn a programming language, I find myself having trouble completing what I intended, and I come off a learning activity with a thousand more things to learn. And having a good plan isn't half the work done; it is work not yet started.

On my list of activities I want to complete on learning Python, I have had Python Koans. Today I want to complete that activity by reporting on its completion and what I learned with it.

Getting Set Up

The Python Koans I wanted to do were ones created by Felienne Hermans. On this round of learning yet another programming language (I survived many with passing grades at Helsinki University of Technology as a Computer Science major), I knew what I wanted to do. I picked Koans as the learning mechanism because:
  • Discovery learning: learning sticks with me much better when, instead of handing me theory to read, I get examples illustrating something and I discover the topic myself
  • Small steps: making steady progress through material over getting stuck on a concept - while Koans grow, they are usually more like a flashlight pointed at topics than a significant step between one and the next
  • Test first: as failing test cases, they motivate a tester like myself to discover puzzles in a familiar context
  • Great activity paired: social learning, and learning through another person's eyes in addition to one's own, is highly motivating
  • Exploratory programming: you do what you need to do, but you can also do what you learn you need to do. Experimenting away from whatever you have toward deeper understanding works for me. 
This time I found a pair and a mechanism that worked to get us through it. Searching for another learner with a similar background (other languages, tester) on Twitter paired me up with Mesut Durukal, and we worked pretty consistently an hour a day until we completed the whole thing in 15 hours. 

The way we worked together was sharing a screen and solving the Koans actively together. After completing each koan, we would explore around the concept with different values, or extend it with lessons we had learned earlier in the Koans, testing if what we thought was true was true. And we wrote down our lessons after each Koan in a shared document.
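For a flavor of what a koan looks like, here is a minimal sketch in the style of the exercises - my own illustration, not one of the actual Koans:

    # A koan-style puzzle (my own illustration, not from the actual set):
    # replace the __ placeholder so the failing assertion passes, and you
    # discover how Python slicing behaves along the way.

    __ = None  # fill me in

    def test_slicing_beyond_the_end_does_not_raise():
        numbers = [1, 2, 3]
        assert numbers[1:10] == __  # slicing past the end just stops at the end

Running it with a test runner shows the red assertion, and filling in [2, 3] turns it green - exactly the small discovery loop that kept us going.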

The Learning

Being able to look back on doing this, with the document as well as tweets, two months after we completed the exercise is interesting. I picked up some key insights from Twitter.

Looking at our private document, the numbers are fascinating: 382 observations of learning something.

With 15 hours, that gives us an average of about 25 things an hour.

On top of those 15 hours, I had a colleague wanting to discuss our learning activity, and multiple whiteboarding sessions to discuss the differences between languages that the learning activity inspired.

Next up, I have so many options for learning activities. Better not make promises, because no matter how publicly I promise, the only thing keeping me accountable is activities that we complete together. Thanks for the super fun learning together, Mesut!

Users test your code

In a session on introduction to testing (not testers), I simplified my story to having two kinds of testing:

  • Your pair of eyes on seeing problems
  • Someone else's pair of eyes on seeing problems
My own experience, in 99% of what I have ended up doing over my 25-year career, is that I'm providing that second pair of eyes, and working as that has made me a tester by profession.

Sometimes the second pair of eyes spends only a moment on your code as they make their own changes adding features (another developer), and you do what you do for testing yourself. Sometimes it becomes more of a specialty (tester). And while the second pair of eyes is often used to bring in perspectives you may be lacking (domain knowledge), there is nothing preventing that second pair of eyes from having as strong or stronger programming knowledge than you do. 

You may not even notice your company has a second pair of eyes, as there's you and then production. Then whatever you did not test gets tested in production, by the users. And it is only a problem if they complain about the quality, feeling strongly enough to act. 

To avoid the complaining, or extensive testing done slowly after making changes, modern developers write tests as code. When any second pair of eyes notices something is missing, we add tests as code while adding what was missing. And we run them, all the time. Being able to rely on them is almost less a thing about testing and quality, and more a thing about the peace of mind to move faster. 
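As a sketch of what that can look like - a hypothetical example assuming a pytest-style setup, not code from our product:

    # Hypothetical example: a guard and its test landing together, after a
    # second pair of eyes noticed non-numeric input was accepted as money.
    import pytest

    def parse_money(text: str) -> float:
        """Parse user input into an amount of money, rejecting nonsense."""
        try:
            return float(text)
        except ValueError:
            raise ValueError(f"not a valid amount of money: {text!r}")

    def test_accepts_plain_numbers():
        assert parse_money("100") == 100.0

    def test_rejects_words_pretending_to_be_money():
        with pytest.raises(ValueError):
            parse_money("hundred")

The test stays behind after the second pair of eyes moves on, and it runs every time anyone changes anything near it.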

In the last year or so, my team's developers have gotten to a point where they no longer benefit from having a tester around - assuming a non-programmer tester covering the features like a user would. While one is around, it is easy not to do the work yourself, creating the self-fulfilling prophecy of needing one. 

Over and over again, I go back to thinking of one of my favorite quotes:
"The future is already here - it's just not evenly distributed." - William Gibson
I believe the future is without testers, with programmers co-creating both application software and the software that tests it. I believe I live with at least one foot in that future. It does not mean that I might not find myself using 80% of my time testing and creating testing systems. It means that the division is more fluid, and we are all encouraged to grow to contribute beyond our abilities of the day.

The past was without testers but also without testing. To see the difference between past and future, you need to see how the customer perceives value and speed. Testing (not testers) is the way to improve both.

Wednesday, March 25, 2020

One Eight Fraction of a Tester


As I was browsing through LinkedIn, I spotted a post with an important message. With appropriate emphasis, the post delivered its intended point: TEST AUTOMATION IS A FULL TIME JOB. I agree. 

The post, however, brought me in touch with a modeling problem I was working through for work. How would I explain that the four testers we had were all valuable yet so very different? The difference was not in their seniority - all four are seniors, with years and years of experience. But it is in where we focus. Because TEST AUTOMATION IS A FULL TIME JOB. But also because OTHER TESTING IS A FULL TIME JOB. 

As part of pondering this all, I posted on Twitter: 

The post started a lively discussion on where (manual) testers are moving, naming the two directions: quality coaches teaching others to build and test quality, and product owners confirming features they commenced. 

The Model of One Eight Fraction of a Tester

Taking the concepts I was using to clarify my thinking about different testers, a discussion with Tatu Aalto, over a lovely refreshing beverage enjoyed remotely together, drew up the mental image of a model I could use to explain what we have. With two dimensions of 4x2 boxes, I'm naming the model "One Eight Fraction of a Tester".

1st Data Point

In our team, we have six developers and only one full-time manual tester. I use the word manual very intentionally, to emphasize that they don't read or write code. They are too busy with other work! The other work comes from the 6 super-fast developers (who also test their own things, and do it well!) and the 50+ other developers working in the same product ecosystem. Just listing what goes on as changes on a daily basis is a lot of work, let alone seeing those changes in action - even when you leave all regression testing to automation. 

The concern here is that story and release testing could, in our context, both be intertwined with creating test automation. Level 1 testing, to see features with human eyes, could also happen while creating automation. 

Yet as the context goes, it is really easy to find oneself in the wheel, chipping away at level 1 story testing - "I saw it work, maybe even a few times" - story after story, and then repeating pieces of it with releases. 

2nd Data Point 

A full-time exploratory tester in the team, taking a long hard look at where their time goes, is now confessing that the amount of testing they get done is small, and the testing is level 1 in nature. The coverage of stories and releases is far from that of a tester focusing there full time. Instead, their time goes to enabling others in building the right thing incrementally (the product owner perspective) and creating space for great testing to happen (the quality coach perspective). While they read code, they struggle to find time to write it, and they use code for targeted coaching rather than automating or testing.

The concern here is that no testing is getting done by them. Even if they could do deeper story testing, they never practically find the time. 

As the context goes, they are in a wheel that they aren't escaping, even if they recognize they are in it.  

3rd Data Point

A most valued professional in the team, the spine of most things testing, is the test automation specialist. They find themselves recognizing tests we don't yet have and turning those ideas into code. While they have found - with the support of the whole team, particularly the developers - time to add to coverage, not only keep things functional, the maintenance of tests and coordinating it is a significant chunk of their work. While they automate, they will test the same thing manually. While they run the automation, they watch it run to spot visual problems that programmatic checks are hard to create for. That is their form of "manual testing" - watch it run and focus on things other than what the script does. 


The concern here is that all their testing is level 1. Well, with the number of stories flying around, even if every group of developers had someone like this writing executable documentation of expectations, they would still have a lot of work as is.

As context goes, they too are in a wheel of their own with their idea of priorities that make sense.

4th Data Point

Automation and infrastructure is a significant enabler, and it does not stay around any more than any other software unless it is maintained and further developed. The test automation programmer creates and maintains a script here and there, tests a thing here and there, but finds that creating that new functionality we all could benefit from needs someone to volunteer for it. Be it turning a manually configured Jenkins into code in a repository, or our most beloved test automation telemetry to deal with the scale, there is work to be done. As frameworks are at their best when used by many, they make their way to sharing and enabling others too.


The concern here is that no testing gets done with a framework alone. But without the framework, testing is also slower and more difficult than it should be. There are always at least three main infrastructure contributions they could make when they can fit one into their schedule, like any developer. 

They have a wheel of their own they are spinning and involving everyone in. 

Combining the data points

In a team of 10 people, we have 10 testers, because every single developer is a tester. With the four generalizing-specializing testers, we cover quite many of the eighths.

The concern here is that we are not always intentional in how we design this to work; it is more a product of being lucky with very different people.

The question remains for me: is the "Story Testing lvl 10" as necessary and needed as I would like to believe it is? Is the "Story Testing lvl 1" as unnecessary to separate from automation creation as I believe it is? And how do things change when one is pulled out - who will step up to fill the gaps?

How do you model your team's testing?

Monday, February 10, 2020

Business Value Game - What if You Believed Value is Defined by Customer, Delivery-time?

Over the years of watching projects unfold, I've grown uneasy with the difficulty of understanding that while we can ask the customer what they want in advance, we really know the value they experience only after we have delivered. All too often "agile" has ended up meaning we optimize for being busy and doing something, anything, and find it difficult to focus on learning about the value. To teach this through experience, I've been creating a business value game that moves the focus to learning about value.

We played this game at European Testing Conference, and it reminded me that even I forget how to run the game after some months of not doing it. Better write about it then!

Crystal Mbanefo took a picture of us playing at European Testing Conference. 

The Game Setup

You need:

  • 5 colors of tokens, 25 tokens of each color
  • "Customer secrets" - value rules for the customer's eyes only, where some values are 
    • Positive
    • Negative
    • Changing for a color 
    • Dependent on another color
  • A precalculated "project max budget" that is the best value the team can achieve by learning the rules of how the customer values things
  • Placeholders for each month of value delivered on the table
  • A timer to run multiple rounds of 3-minute (6-month) projects plus reflection/redesign time, 60-90 minutes in total. 30 seconds is a month, reflected by the placeholders on the table. 
More specific setup:
  • Create 5 batches of "work", each batch with 5 tokens of each of the 5 colors
  • Place post-its in front of where the customer is sitting so that work can be delivered 
  • Hand the "customer secrets" to the customer and allow them to clarify with you how their value calculation rules work
  • Post the "project max budget" on a whiteboard as a reference
  • Explain the rules:
    • 6 people will need to do the work
    • The work is flipping a chip around with the left hand
    • The work is passed forward in batches; the starting batch size is 25
    • After one person finishes the work, the next can start
    • Only value at the customer is paid for, and the customer is available at the end of the 6-month project to announce and make the payment. 
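To make the customer's side concrete, below is a minimal sketch of how the secret value rules might be scored. All the rule values here are invented for illustration; they are not the secrets I actually hand out:

    # A sketch of scoring "customer secrets" - every value invented for
    # illustration. One rule per color: positive, negative, changing by
    # month, and dependent on another color.
    def value_of(color, month, delivered):
        if color == "red":
            return 100_000                  # plain positive value
        if color == "blue":
            return -200_000                 # negative: delivering this hurts
        if color == "green":
            return 50_000 * month           # value changes with the month
        if color == "white":
            # depends on another color: valuable only alongside reds
            return 150_000 if delivered.get("red", 0) > 0 else 0
        return 0

    # chips delivered per color, announced at the end of month 3
    delivered = {"red": 3, "blue": 1, "green": 2, "white": 1}
    total = sum(value_of(c, 3, delivered) * n for c, n in delivered.items())
    print(total)  # 550000 - the kind of number the customer announces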


First round:

Usually on the first round, the focus is on the work and on getting as much of it done under the given constraints as possible. With a large batch size moving through the system, it takes a long time before the team starts delivering. The usual conclusion: smaller batch size. 
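A rough back-of-the-envelope shows why. Assuming, for illustration, that a flip takes about a second and that a batch only moves forward once the whole batch is flipped:

    # Time until the FIRST chips reach the customer, assuming one flip
    # takes a second and a batch moves on only when complete.
    WORKERS = 6
    for batch_size in (25, 5, 1):
        first_delivery = WORKERS * batch_size  # seconds
        print(f"batch of {batch_size:>2}: first delivery after ~{first_delivery}s")
    # batch of 25: ~150s - most of a 180-second "6-month" project gone
    # batch of  5: ~30s
    # batch of  1: ~6s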

During the retrospective, we identify what they can and cannot change:
  • They cannot pre-arrange the chips; the work can only start when the project starts. 
  • They can ask the customer to be available for announcing and making payments earlier. Monthly payments are easy to ask for. 
  • They can do the work of 6 flips in any order, but all 6 people doing the work must be involved in each work item before it is acceptable to the customer.
  • They can do smaller batches and order the chips in any way they want - after the project has started.
  • They can use only one hand, but do not need to limit themselves to the left hand only. 
Second round:


Different groups take the batch size idea to different scales on round 2. While a batch size of 1 would seem smart and obvious, a lot of teams bring things down to batch size 5 first. It does not really matter; with either smaller batch size, what usually happens on round 2 is that the team delivers with a lot of energy, and all chips end up on the customer's side. The customer is overwhelmed with saying how much anything is worth, so even if the team agreed on monthly payments, the customer is able to announce the value of a month only towards the end of the project. With the focus on delivery, even if the customer manages to announce the value, the team does not listen and react.

At the end, the team earns less than the maximum value, regardless of their hard, focused work. We introduce the concept of the importance of learning, and how that takes time in the project.

During the retrospective, they can identify the ways of working they agree on as a team to change the dynamic. Here I see a lot of variance in rules. Usually batch size goes down, but teams struggle to control how the batches get delivered to the customer, or to listen to the customer's feedback. Often a single point of control gets created, and a lot of the workers stay idle while one does the thinking.

Third-Fourth-Fifth rounds:

Depending on the team, we run more projects to allow experimenting with rules that work. We let the customer secrets stay the same, so each "project" fails to be unique and is yet another 6 months of doing the work with the same value rules. Many teams fail at creating a learning process, achieving only a learning outcome for the rules at hand.

Final round:

On the last round, the customer can create new value rules, and the team tests whether their process now enables them to learn during the project.

The hard parts: facilitation and the right secrets

I'm still finalizing the game design, and creating better rules for myself for facilitating it. A key part I still seem to struggle with is the right values for the "customer secrets". The negative values need to be large enough for the team to realize they are losing value by delivering things that take value away, and the dependent and changing values can't be so complex that the customer is unable to do the math.

I've usually used values in multiples of 100,000 euros because large numbers sound great, but fewer zeros would make the customer's life easier.

I play with poker chips because they have a nice heavy feel for "work", but since carrying around 10 kg of poker chips isn't exactly travel friendly, I have also created impromptu chips from 5 colors of post-it notes.

I'm also still optimizing the process of how to combine delivering and learning. There is more than one way to set this stuff up.

Let me know if you play this - I would love to hear your experiences.

Saturday, January 18, 2020

Say It Out Loud - it's Testing


Sitting in front of their computer, with a focused expression on their face, the tester is testing a new feature. Armed with their notes from all the whiteboard sessions, from the design review and passing-by comments of what we're changing, and whatever requirements documentation they have, they've built their own list of functions they are about to verify exist and work as expected.

"Error handling" says one of the lines in the functions list. Of course, every feature we implement should have error handling. Into the user interface fields where a sum of money is expected, they type away things that aren't numbers and make no sense as money. With typing of "hundred" being ok'd and just saved away to be reviewed later, it is obvious that whatever calculations we were planning to do later to add things up will not work, and armed with their trusty Jira bug reporting tool, they breathe in an out to create an objective step by step bug report explaining that the absence of error reporting is indeed a bug.

Minutes later, the developer sharing the same room pings back saying the first version did not yet have error handling implemented. The tester breathes some more.

---
The thing is, errors of omitting complete features are very common finds for us testers. Having found some thousands of them over my tester career, I'm also imagining I see a pattern. The reactions to errors of omitting complete features very often indicate that this did not come as a surprise to the developer. They were giving you a chance of seeing something they build incrementally, but weren't guessing *you* would start your testing where they would go last in their development.

A Better Way

When you build your list of functions you will verify, how about sharing that list with the developer? Having a discussion of which of these they expect to work would save you a lot of mental energy and allow you to direct it at their claims, going deeper than just the function. With that list, you would most likely learn with them that "Error handling" for this feature won't yet be in Wednesday's builds, because they planned to work on it only from Friday on.

You could also ask in a way that makes them jump into showing you where the function is in the code. Even if you don't understand code, you understand the sizes of things. Seeing that something is conceptually one block of code, that another is sprinkled around when they show it, or that something is very big and makes you want to fall asleep just looking at it - these all give you hints on how you would explore in relation to what your developer just showed you.

If you read code, go find some of that stuff yourself. But still, drag your developer into the discussion as soon as you suspect something might not be there.

Code Reviews vs. Testing

When organizations review code through, for example, pull requests, errors of function omission are hard to spot without someone triggering this particular perspective. If you have a list of things that you expect to see implemented and one of them is missing, there is no way that functionality could end up working in testing.

Sometimes, when you have a hunch that something discussed in the design meetings was forgotten from the implementation, the way to go about figuring it out isn't to install and test - it is to ask about the feature. Say your idea out loud, see a developer go check it, and learn that something not implemented has no chance of working.

Jumping always to testing isn't the only tool you have as a tester, even if you didn't write (or read) code.

Sunday, January 5, 2020

Hundreds of hours Mob Programming over Four Years - Is it Still Worth It?

With four years of mob programming (and testing) with various groups, I feel it is time to reflect a little.
  • I get to work with temporary mobs
  • I often teach (and enable learning) in a mob
As people have spent the majority of their careers learning to work well apart, supported by other people, learning to work well together is not something I can expect as a new mob comes together. 

Mob programming is a powerful learning tool. It has helped me learn about team dynamics and enabled addressing patterns that people keep quiet and hidden. It has helped me learn how people test, and how different skillsets and approaches interact, and come to appreciate that it is a totally unacceptable way of learning and working for some people, uncomfortable for others, while some people just love it. Most people accept it for the two days we spend together, but would opt out should the mechanism find its way to their offices. 

One thing remains through the four years - people are curious about how five people doing the work of one could be productive. What does it really mean if we say that working together, at the same time, we get the best out of everyone into the work we're doing? 

Contributing and Learning

There are two outputs of value for us working, individually or in a group. We can be contributing, taking the work forward. Or we can be learning, improving ourselves and becoming better at solving work problems later on. 

Contributing enables our business to distribute copies of the software we are creating, and in the short and medium term, it scales nicely in value if we manage to avoid the pitfalls of technical debt dragging us down, or of building the wrong things no one cares to scale. There's a lot of value not just in delivering the maximum contribution over a longer period of time, but in being able to turn an idea into software in use fast. Delivering when the need is recognized rather than a year later, and distributing copies of value at scale for an extra year, turns into money for the company. We're OK paying a little more in effort as long as we receive more over the timeline of it paying itself back. 

Learning enables the people to be better at doing the work. And as the work is creative problem solving, there's a lot of value in seeing how others do things in action to help us learn. Over time, learning is powerful.

If my efforts in learning allow me to become 1% better every single day of the year, I am 37.8 times the version of myself in a year. That allows for a significant use of time today, to continuously keep things improving for the future me. 
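The arithmetic behind that number is plain compounding:

    # 1% better every day, compounded over a year
    print(1.01 ** 365)  # ~37.78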

There's a lot of value in effectively contributing the best work from us all to what we are doing. Removing mistakes as they are being made. Caring for quality as we're building things. Avoiding technical debt. Avoiding bottlenecks of knowledge, so that support can happen even when most of the team is on vacation and just little me is left to hold the fort. 

Mob Programming helps with this. And it helps a lot. 

Question Queue Time

Have you ever been working away at something, and then realized you need a clarification? It's that busiest person on your team who at least knows the answer, but since they are the busiest, you take the minute to type your problem into a team chat, hoping someone answers. Sometimes you need to wait for that one busiest person. With a culture like ours, responding to others pulling information is considered a priority, and others stop doing what they were doing to get you to an answer that is not immediate.

If getting that answer takes you 10 minutes, it takes 10 minutes for someone else too. With that chat channel, it probably takes 10 minutes of attention from a lot of people, including the ones curious about the answer they too find themselves not having (but also not needing right now). 

If they put that question into an email, the wait time is more like a day. And the work waits, even if other work may happen. 

If your mob includes people with the answers, getting to a place where you have no question queue time could be possible. 

It's not just the questions, though. It is any form of waiting. Slow builds to get something to test. Finding a set of tools when you need it. Getting started on a new microservice. Discussing rather than trying. 

When the whole mob waits, the wasted time is multiplied. This seems to drive groups to innovate on the ways they work, and to take the time to do work that removes the waiting: in places where individuals suffer in silence, mobs take action. 

If a mob works with something boring, they often end up automating it. If a mob works with something an individual alone has a hard time solving, they get it done. And usually they get it done by trying out multiple approaches, deciding through action which way they do things. 

What I find, though, is that even in mobs we don't have all the answers. For my work, the over-reliance on waiting for an answer we already had confirmed by a product owner led us to having no product owner - just to emphasize that we don't need to wait for an answer, as waiting costs just as much as potentially making a mistake. The removal of the product owner revealed that there are answers not available from a person, ones that require a discovery process. 

Growing Your Wizard

Mobbing is a challenge, but also rewarding. It has amplified my learning. Knowing it exists and not being able to use it, I look at the juniors who learned a lot - but not as much as they could have if we were mob programming - as a wasted opportunity to optimize for the convenience of our seniors. 

We need to help our different team members level up to find their unique superpowers. We need to grow our wizards, and not just expect them to somehow either get through or give up trying. And answering their questions when they don't know what they don't know is just not enough.

In four years, I haven't ended up with a team that would try mobbing for a significant portion of their work. The year of mobbing once every two weeks at my previous place of work is the furthest I got with a single team. But I have spent hundreds of hours mobbing, and even if I have more to try, I have learned a lot about how to get started.