Wednesday, December 12, 2018

An Evening Detour to TCR

At an unusually early time for me, I waved goodbye at the office, announcing I was heading to a workshop on TCR, and was greeted with a bit of rolling eyes and quick googling fingers. It was already established that I was the one to volunteer for all sorts of learning activities. Aki Salmi hosted a workshop session on TCR with the Tech Excellence meetup, which I just so happen to facilitate, so I had all the excuses I could need to get over myself and over there.

TCR - Test && Commit || Revert - is a programming workflow, or a test-driven development flavor. Reactions have ranged from "well, got to try it" to "why are we confusing TDD more" to "will TDD replace TCR", and it felt like a worthwhile thing to try in great company.

Aki introduced the thing we were about to experiment with. Test-Driven Development (TDD) as we've come to know it has three steps, RED - GREEN - REFACTOR, and Test && Commit || Revert (TCR) removes the RED. If your tests aren't green when run, you lose what you worked on. If they are green, your changes get committed as the next baseline to work from.
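In practice the workflow is usually wired up as a single chained command you run after every small change. Here is a minimal sketch of that loop in Python; the pytest suite, the git commands, and the file name are my assumptions about a typical setup, not what Aki prescribed:

# tcr.py - a minimal sketch of the Test && Commit || Revert loop
# (assumes a pytest suite and a git repository; not the workshop's exact setup)
import subprocess

def run(command):
    # Run a shell command and report whether it succeeded.
    return subprocess.run(command, shell=True).returncode == 0

if run("pytest -q"):
    # Green: commit, and this state becomes the new baseline.
    run('git commit -am "tcr: green"')
else:
    # Red: throw away every uncommitted change.
    run("git reset --hard")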

Beyond the focus on experiencing TCR, the session was framed by three 25-minute rounds of paired work with sharing of impressions in between, the Lift (Elevator) kata, and whatever language each pair ended up choosing.

I paired with one of my favorite people, whom I had never paired with before. They came in with things set up on their computer, so the choice of language and editor was settled: Python in Vim.

Over the three 25-minute sessions, the promise of having fun was well delivered.
  • The most common reason for us to revert was syntax - we had missed a piece of formatting. This made me aware of how much I prefer relying on PyCharm as my IDE, keeping my focus free from the details of the syntax. We also had a great little discussion on the feeling of control that comes from having to know and do every bit yourself in Vim, a feeling I wasn't appreciating. 
  • With a fuller IDE, I find it relieving to work with intent and generate frames for the code; Vim as the editor made me aware of how much I appreciate that other tooling. 
  • Differences between Python and Java felt evident when we ran tests with the same name: Python just dealt with it for us, while we would have had a couple more reverts had we worked in Java (see the sketch after this list). 
  • One pair "cheated" by commenting out the failing tests, and I'm still confused by it. The easiest way to stay always green, if cheating is encouraged, is to never have much in the way of tests, and that cannot be what Kent Beck means with "Cheating is encouraged, as long as you don’t stop there."
  • With the workflow, I missed seeing the test fail to test the test before implementing. I disliked thinking like the computer when I would much rather see the test fail first, but I wasn't willing to let the test be reverted just for that. 
  • Chaining the commands together so that your changes are gone on red increased the sense of risk of losing your changes, and introduced a language of betting a small amount of work. 
  • Being painfully aware of the risk of losing changes keeps changes small. Identifying designs that would drive us to even smaller steps, though, would require next-level abilities compared to what we were working with. 
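On the Python versus Java point above, here is a minimal sketch (the file and test names are made up, not from our session) of how Python quietly tolerates two tests sharing a name:

# test_lift.py - hypothetical example, not from the workshop
def test_lift_moves_up():
    assert 1 + 1 == 2

def test_lift_moves_up():  # same name: this definition silently replaces the first
    assert 2 + 2 == 4

# pytest only collects the surviving definition, so the suite stays green with no
# complaint; in Java, two methods with the same name and parameters in one test
# class would not even compile, forcing a revert under TCR.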
Overall, I think I was just as bothered with "losing my IDE" as "losing the RED".

In the discussions afterwards, there was speculation about whether something of this sort would be more necessary with trunk-based development, where folks might commit tests while they are red, but to me that sounds like a different problem from the programming workflow itself.

I find all these things useful as ways of learning about what you're comfortable with and how constraints of all sorts impact your way of working.

All in all, this just felt like a relaxed version of Adi Bolboaca's baby steps constraint, where you revert if you're not green in 3 minutes. With that style, you can see red, but you get to a similar practice - making changes intentionally small - without losing the feedback of a test failing first to know you're actually testing what you intended.

Tuesday, December 4, 2018

Testing a Modify Sprite Toolbar

I've been teaching hands-on exploratory testing on a course I called "Exploratory Testing Work Course" for quite a few years. At first, I taught my courses based on slides. I would tell stories, stuff I've experienced in projects, things I consider testing folklore. A lot of how we learn testing is folklore.

The folklore we tell can be split into the core of testing - how we really approach a particular testing problem - and the things around testing - the conditions making testing possible, easy, or difficult, as none of it exists in a vacuum. I find agile testing still talks mostly about the things around testing, and those things - like the fact that testing is too important to be left only to testers, and that testing is a whole-team responsibility - are great to share and learn about.

All too often we diminish the core of testing into test automation. Today, I want to try describing one small piece of the core of testing, using my current favorite application under test for teaching, Dark Function Editor.

Dark Function Editor is an open source tool for editing spritesheets (collections of images) and creating animations out of those spritesheets. Over time of using it as my test target, I've come to think of it as serving two main purposes:
  • Create animated gifs
  • Create spritesheets with computer readable data defining how images are shown in a game
To test the whole application, you could easily spend a work week or a few. The courses I run are 1-2 days, and we make choices of what and how we test to illustrate the lessons I have in mind: 
  • Testing sympathetically to understand the main use cases
  • Intentional testing
  • Tools for documenting & test data generation
  • Labeling and naming
  • Isolating bugs and testing to understand issues deeper
  • Making notes vs. reporting bugs

Today, I had 1.5 hours at an Aalto University course to do some testing with students. We tested sympathetically to understand the main use cases, and then went into an exercise of labeling and naming for better discussion of coverage. Let's look at what we tested. 

Within Dark Function Editor, there is a big (pink) canvas that can hold one or more sprites (images) for each individual frame in an animation. To edit an image on that canvas, the program offers a Modify Sprite toolbar. 


How would you test this? 

We approached the testing with labeling and naming. I guided the students to create a mindmap that would describe what they see and test. 

They named each functionality that could be seen on the toolbar: Delete, Rotate x2, Flip x2, Angle and Z-Order. To name the functionalities, they looked at the tooltips of some of these, in particular the green arrows. And they made a note of the first bug. 
  • The green arrows look like undo / redo, knowing how other applications use similar imagery. 
They did not label and name the tooltips, nor the actual undo/redo that they found in a separate menu, vaguely realizing it was a functionality that belonged to this group yet lived elsewhere in the application. Missing a label and a name, it became a thing they would have needed to intentionally rediscover later. They also missed labeling and naming the little x-mark in the corner that would close the toolbar, and thus would need to discover the toggle for the Modify Sprite toolbar later, given they had the discipline. 

The fields where you can write drew their attention the most. They started playing with the Z-order, giving it different values for two images - someone in the group knew without googling that this would affect which of the images was on top. They quickly ran into the usual confusion. The bigger number would mean that the image is in the background, and they noted their second bug:
  • The chosen convention for Z-order is the opposite of what we're used to seeing in other applications.
I guided the group to label and name every idea they tried on the field. They labeled numbers, positive and negative. As they typed in a number, they pressed enter. They missed labeling and naming the enter, and if they had, they would have realized that in addition to enter, they had the arrow keys and moving the cursor out of focus to test. They added decimals under positive numbers, and a third category of input values: text. 

They repeated the same exercise on Angle. They quickly went for symmetry with Z-order, and remembered from the earlier sympathetic testing that they had already seen the positive value 9 work in the angle. They were quick to call the category of positive numbers covered, so we talked about what we had actually tested.

We had changed two images at once to 9 degree angle. 
We had not looked at 9 degrees in relation to any other angle, if it would appear to match our expectations. 
We had not looked at positive angles with values where it would be easy to see correctness. 
We had not looked at positive angles with images that would make it easy to see correctness. 
We had jumped to assuming that one positive number would represent all positive numbers, and yet we had not looked at the end result with a critical eye. 

We talked about how the label and name could help us think critically around what we wanted to call tested, and how specific we want to be on what ideas we've covered. 

As we worked through the symmetry, the group tried a decimal number. Decimal numbers had been flat out rejected for the Z-order, which is what we expected here too. Instead, we found that when changing the angle from value 1 to value 5.6, the angle ended up as 5 when we pressed enter. Changing value 4 to 4.3 still showed 4.3 after pressing enter, and would go to 4 only when moving focus away from the toolbar. We noted another bug:
  • Input validation for decimal numbers worked differently when the value stayed within the same digit vs. changed to another digit.
As we were isolating this bug, part of the reason it was so evident was that the computer we were testing with was connected to a projector that amplified sounds. The error buzz was very easy to spot, and someone in the group realized there was an asymmetry in those sounds between the Angle field and the Z-order field. We investigated further and realized that the two fields, appearing very similar and sitting side by side, dealt with wrong inputs in an inconsistent manner. This bug we did not only note, but spent significant time writing a proper report on, only to realize how hard that was. 
  • Input validation was inconsistent between two similar-looking fields.
I guided the group to review the tooltips they had not labeled and named, and as they noticed one of the tooltips was incorrect, they added the label to the model and noted a bug. 
  • The tooltip for Angle was the same as the Z-order description. 
In an hour, we barely scratched the surface of this area of functionality. We concluded with a discussion of what matters and who decides. If no one mentions any of the problems, most likely people will imagine there are none. Thinking back to a developer giving a statement about me exploring their application on the Cucumber podcast:
She's like "I want to exploratory test your ApprovalTests" and I'm like "Yeah, go for it", cause it's all written test first and its code I'm very proud of. And she destroyed it in like an hour and a half.
You can think your code is great and your application works perfectly, until someone teaches you otherwise.

I should know, I do this for a living. And I just learned that the things I tested work 50% in production. But that, my friends, is a story for another time. 






It's not What Happens at the Keyboard

"What if we built a tool that records what you do when you test?", they asked. "We want to create tooling to help exploratory testing.", they continued. "There's already some tools that record what you do, like as an action tree, and allow you to repeat those things."

I wasn't particularly excited about the idea of recording my actions on the keyboard. I fairly regularly record my actions on the keyboard in the form of video, and some of those videos are the most useless pieces of documentation I can create. They help me backtrack what I was doing, especially when there are many things that are hard to observe at once and watching a video is a better use of my time than trying the same things again on the keyboard - which is not very often. Or when I'm trying to figure out a pesky condition I created and did not even realize was connected. But even for that, 25 years of testing has brought me better mechanisms for reconnecting with what just happened, and I've learned to ask (even demand!) for logs that help us all when my memory fails, as the users are worse at remembering than I will be.

So, what if I had that in writing, or in an executable format? It's not like I am looking for record-and-playback automation, so the value those recordings would provide must be elsewhere. Perhaps they could save me from typing details down? But since I'm writing for an audience, I would need to clean the recording up into just the right thing, or not mind the extra fluff there might be.

I already know from recording videos and blogging while testing, that the tool changes how I test. I become more structured, more careful, more deliberate in my actions. I'm more on a script just so that I - or anyone else - could have a chance of following later. I unfold layers I'm usually comfortable with, to make future me and my audience comfortable. And I prefer to do this after rehearsal, as I know more than I usually do when I first start learning and exploring.

A model of exploratory testing starts to form in my head, as I'm processing the idea of tooling from the collection of data of the activity. I soon realize that the stuff the computer could collect data on is my actions on the computer. But most of exploratory testing happens in my head.


The action on the computer is what my hands end up doing, and what ends up happening with the software - the things we could see and model there. It could be how a page renders, captured precisely as it is, so that in the future I have an approved golden master to compare against. It could be recognizing elements, and what is active. It could be the paths I take. 
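As a side note on that golden master idea, here is a minimal sketch of what such a comparison could look like; the file names and the helper function are my own invention for illustration, not from any tool mentioned here:

# golden_master.py - hypothetical sketch of an "approved golden master" check
from pathlib import Path

def verify_against_golden_master(received: str, approved_file: str = "page.approved.html"):
    approved = Path(approved_file)
    if not approved.exists():
        # First run: save what we received so a human can review and approve it.
        Path("page.received.html").write_text(received)
        raise AssertionError(f"No approved file yet; review page.received.html and rename it to {approved_file}")
    if received != approved.read_text():
        # Keep the differing output around for inspection, then fail loudly.
        Path("page.received.html").write_text(received)
        raise AssertionError("Rendered page differs from the approved golden master")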

It would not know my intent. It would not know the reasons why I do what I do. And you know, sometimes I don't know that either. If you ask me why I do something, you're asking me to invent a narrative that makes sense to me but may be a result of the human need to rationalize. But the longer I've been testing, the more I work with intentional testing (and programming), saying what I want so that I would know when I'm not doing what I wanted. With testing, I track intent because it changes uncontrollably unless I choose to control it. With programming, I track intent because if I'm not clear on what I'm implementing, chances are the computer won't be doing it either. 

As I explore with the software as my external imagination, there are many ways I can get it to talk to me. What looks like repetitive steps could be observing different factors, in isolation and in chosen combinations. What looks like repetitive steps could be me making space in my mind to think outside the box I've placed myself in, inviting my external imagination to give me ideas. Or what looks like repetitive steps could be me being frustrated with the application not responding, and just trying again. 

Observation is another thing the human side of exploratory testing brings. We can have tools, like a magnifying glass, to enhance our ability to observe. But ideas of what we want to observe, and their multidimensional nature, are hard to capture as data points, and even harder to capture as rules. 

Many times the way we feel, our emotions, gives another dimension to our observations. We don't see things just with our eyes, but also through how we experience them. Feeling annoyed or frustrated is an important data point in exploratory testing. I find myself often thinking that the main tool I've developed over the years comes from psychology books, helping me name emotions, notice when they come into play, and see the reasons for them. My emotions make me brave enough to speak about problems others dismiss. 

Finally, this is all founded on who I am today: the skills, habits, and knowledge I build upon. We improve every day as we learn. We know a little more (knowledge), we can do a little more (skills), and we can routinely do a little more (habits). In all of these we both learn and unlearn. 

I don't think any of the four human side parts of exploratory testing can be seen from looking at the action data alone. There's a lot of meaning to codify before tooling in this area is helpful. 

Then again, we start somewhere. I look forward to seeing how things unfold. 



Friday, November 30, 2018

Open Source and the New World of Creator Responsibility

As I'm minding my own business, doing some testing and enjoying myself, I get brutally interrupted by an incoming message. Someone somewhere has a problem. With the product I work with. I know that I care a lot, deep down, but the timing of the message is just so inconvenient. Reluctantly I offload from the work I was doing, preparing myself mentally to dig into the intrusion.

As I read the message and the related bug report, I feel the frustration in me growing. The steps to reproduce the problem are missing. The description of the problem is vague. It feels like the reporter did not even understand what they were doing, let alone what the product was doing. Reading the text again and again, I come to the conclusion that they have an actual problem, but their report, without the right details and logs, does not help me do anything about it.

The tester that I am, I start testing the situation myself. Sure enough, it is not like I have tested our product together with the Google Ads management interface. As I set up the environment, I realize I have to have an actual campaign set up so that I can even reproduce the problem. I toy between "going to ask for the company credit card" and "use my own credit card, here's the opportunity I always wanted to try ads for my own stuff", go for the easy route, and build a little campaign to advertise my conference. I confirm the bug, test more to figure out the exact steps to isolate it and the minimal-impact workaround I can suggest, collect the logs, and attach all of my investigation to the records I started with.

The fix is ready in 10 minutes. The release of the fix takes a little longer.

All of this happened in a project where I get paid for my time. Yet I am frustrated. But what about when we find ourselves working on our pet projects, sharing them as open source?

First of all, the code, whether closed or open source, does what its creators make it do.

This means there is always creator responsibility of what can be done with the software you created.

With software, it's not just about being a creator as in writing some lines of code; it is also about being an owner of what you started creating. You can be responsible for the code you wrote to an extent, but when open source projects move to the idea of being responsible for code others created on a codebase you started, we're piling a lot of work and responsibility on someone who did not really opt into it.

There was a great example of that this week.
It wasn't *mining*, it was *stealing*, as many pointed out. But the really peculiar thing to look at was the formation of camps assuming different responsibilities for the owner as the original contributor.

Finally, posting this was motivated by a good friend of mine, who runs a relevant open source test automation tool project, tweeting this:
Going back to my story at the start of this blog post, you can't really expect that your users - and that is what they are, even if your project is free and open source - will always take the time to isolate and report problems. And when you make it so, at worst you end up like another open source maintainer who goes around bragging about the quality of their code when the reality is that their users just don't bother reporting and instead walk away.
  
I know how to write good bug reports, yet I rarely do. I only do it when I get paid for it, or when as a result I get something I need. I want to optimize the time I spend, and a quick report on problems is the smallest possible contribution I can choose to make.

Information is valuable for a project that wishes for wider-scale adoption. While there may not be direct money coming in from a free open source project, I find that many of the relevant creators have found a way of turning their thing into some income flow.

Say thank you to the people making you *aware* that you have work you could think about doing, and stop blaming anyone who works for free. I know it is hard to do, even when you have a paying customer.





Thursday, November 29, 2018

Forced Consistency Across Teams

The first thing I taught our latest 15-year-old trainee was that we believe rules and processes need to be built with a core principle in mind: trust.

If someone might commit a crime, we don't put everyone in jail.

Many of the corporate processes and principles are an illustration of people really not getting this idea.

We make people clock in and out of work so that we know they worked. Except in this industry, time at the place of work is a ridiculous metric. I should know, I've just had two weeks of motivational issues where I was at work and performed really badly (by my standards).

We make people write test cases and tick the box as they complete them because, quoting an old manager of mine, "no one in their right mind would test if they were not monitored in this detail". Well, I did. And still do. And I love it. The devs loved it as soon as it wasn't tick-the-box.

We introduce common practices and processes for teams with very different skill sets and backgrounds, even when there is no common problem for them to solve.

When I look at processes and practices, I note that I have personal preferences. I prefer no estimates, no Jira, end-to-end visibility, a sense of ownership, and understanding and solving problems. I recognize that everyone in my team has their own personal preferences, and I respect their preferences as I expect them to respect mine. I make compromises, like spending my time suffering with Jira just because they still haven't figured out that it isn't making them better. And they experiment with whatever ideas we all interject into the effort of trying to make things better.

What inspired me to write this is a discussion about my personal dislike for definition of done.

I believe definition of done is a great tool for building a common understanding for a team on what they try to mean when they say done. I've used it, multiple times.

I've come to think of it as "definition of done for now".

I've learned that a deeper version of it is a risk-based definition of done for now, even within one team. Cookie-cutter templates rarely work for anything other than getting started.

I've experienced over and over again how forcing a definition of done across many teams in the name of consistency is short-sighted. First you have to understand whether the teams actually are consistent, or whether some are steps ahead of others and approaching the same problem with a different solution could actually improve things more.

As with any practice or process, I don't accept that it would be our only option for improvement. Using time on one thing is time away from something else - the idea of opportunity cost. If a Definition of Done would help us make sense of a messy multi-team setting, would other approaches work too? Could you redesign the team compositions to force the architecture you aspire to, driving down the dependencies, leveraging Conway's law? Could you, instead of a Definition of Done (there are plenty of examples of what one contains), describe your team's responsibilities in some other format that would let you see a dimension DoD misses?

Using Jira states in a different way is hardly the reason developers find it hard to start working on a new component. Looking at the code and its structures is a much more likely reason. Lack of documentation and training is a much more likely reason.

Go for consistency when it solves a problem without introducing bigger ones. Putting everyone in jail because one might rob the bank tends to be a bigger problem.



The Broken Phone Effect

I was totally upset with a colleague of mine, and to ease my heart, I ranted about them to my manager. As that colleague reminded me just today, it is so hard to understand people like me who just talk about problems without expecting a solution. And this rant was like that. I wasn't expecting action. I just needed to talk.

Like so many times, I found that the metadata of what type of request was coming in from me did not go through. While my communication headers included the metadata asking for a sympathetic ear and a mirror to bounce things off, they only received the default: when presented with a problem, a solution is in order.

The solution, however, was particularly good this time. They suggested that I go and talk to my colleague. I did. It felt overwhelming and difficult. But it was the start of many great conversations where we built trust in one another, now knowing that we could constructively talk about anything and everything.

So I distilled my lesson. When I had something I felt strong enough to rant to another person, perhaps taking the extra step and talking to them directly, not about what *they do* but about what *I feel*, was a path worth taking.

With this lesson in mind, I've asked many people to talk to me directly when I'm involved in how *they feel*. I believe they cannot tell me what to do, but they can help me understand how what I do makes them feel, and I may choose to work on my behaviors, or at least help them understand how I make sense of the world. Remembering how hard those steps are, I appreciate that many people choose avoidance. I still choose avoidance in righteous anger when I feel neither my status nor past agreements justify me taking the first step. The term "emotional labor" comes to mind. Bridging disagreements requires it, and I'm tired of the expectation that it is my duty to perform it for others.

Over the years as I have been blogging, people have reached out to tell me they've been through what I write about. While I write for myself, I also write for people like myself. People who aspire to change themselves, change the results they contribute to, change the world in some small way. My stories are not factual representations of events; they are personal intertwinings of many experiences that allow me to shine a light on a relevant experience.

When my manager calls to tell me I should not say "Google me", I wish the person I offended had had the guts to talk about this without the broken phone effect. I could have explained that what I mean is that you will find articles I've written and research I've done, and that I did google your background enough to see that you are talking to someone who knows something of this stuff. Assume good intent. I rarely say things to insult.

If you have something to say, talk to me about it. If you don't but changed your ways for the better anyway, I'm fine with you being annoyed with me and avoiding me. Mutual loss, but we both have options.

A great option is to break the broken phone effect and just deal with your own stuff instead of sending a messenger. It might have a second order positive impact.

I tried, I failed, I succeeded. I learned. Can you say the same? FAIL is a first attempt in learning, and it takes a significant amount of courage.

Tuesday, November 27, 2018

Pay to Speak and Why Non-Profit Does Not Mean What You Think

Four years ago, I started a conference as an experiment in figuring out how a tech conference could be more fair to speakers. I had experienced many sleepless nights trying to figure out what my values were, as a mother of two, in allocating the family money to pay to speak at conferences, just because it was something I personally aspired to.

Looking at the way I felt about the immediate choices I had to make within my family, and the greater scheme of things in the community, I started actively looking at a theme I dubbed #PayToSpeak. It was the observation that while conferences sell the possibility to come hear speakers teach what they have learned, I found myself short of money, as showing up at conferences meant I was expected to pay my own travel and accommodation. I paid, just like everyone else, building barely enough of a name to get to a point where I could have a choice.

In the greater scheme of things, people like myself, or with less privilege than I had as a single mother of two, suffer more from #PayToSpeak. Where you work also matters - some companies seek visibility at your conferences and are willing to pick up the travel bill, while I personally have chosen to work in product companies that invest their visibility euros in other conferences than the ones I want to speak at.

So I created a sheet, and very recently upgraded it to a website with a link to the sheet. It will move forward when I feel I have the time and energy. But it already serves a purpose as it is. You can check out http://paytospeak.org to learn about the theme.

Fairly regularly, I get conference representatives asking me to present them in a more positive light. CAST chairperson Maria Kedemo asking for improvements in the documents is not unusual.
However, I have not yet found a good way of presenting information like this. As you may guess from the title of my post, being a non-profit is not as obvious a tick box as you might think.

The difference between a non-profit and a company as the organizer is hard to describe. Both can organize the same conference. Both can organize it at the same price for participants. Both can choose to use most of the money to pay salaries for the organizers and all expenses for the speakers. Both can choose to be #PayToSpeak. The only real difference is in what they can do with the profits.

Remember, profit is what is left after all the costs. Salaries are costs. So even as a company, you don't have to end up making a profit.

What non-profits do with the profit they raise is run their cause. The cause for AST is admirable. They use the money they make from conferences to financially support small testing communities that need that money to bring in speakers, pay event organizing costs, and start new events. AST played a core role in financially supporting my conference in year one, saving me from some of the financial stress of taking the risk, as I went into organizing with them rather than all by myself.

My conference uses the profits to support new speakers traveling to other #PayToSpeak conferences, and to enable people who aspire to speak to experience a conference they couldn't otherwise pay for. With a cost structure of always paying the speakers' expenses while being uncertain about the number of paying participants, the profits from the conference available for the cause have not been very large.

Instead, my conference has served as an experimentation platform. I can now say that while speakers are important, the sales and marketing effort matters more to a conference's success. I have found new respect for people who manage to run a series of conferences with volunteers only, and for conferences that pay their organizers. The choices of what work and costs are worth paying for from the conference budget are not easy, and they are varied and hard to describe.

So I choose to describe in my sheet only the immediate impact for the speaker - what money out of pocket they are expected to find, or what financial support they can expect, should they volunteer as speakers.

I dream of a world where we'd also have the money to compensate speakers for the time they put in. That means that the audience - all of us - needs to have money to pay for those services. The world is more than half-full of people whose companies have never paid for a single tech conference. They might never get to go, or if they go, it is on their personal time off.