Tuesday, December 4, 2018

Testing a Modify Sprite Toolbar

I've been teaching hands-on exploratory testing on a course I called "Exploratory Testing Work Course" for quite many years. At first, I taught my courses based on slides. I would tell stories, stuff I've experienced in projects, things I consider testing folklore. A lot of how we learn testing is folklore.

The folklore we tell can be split to the core of testing - how we really approach a particular testing problem - and the things around testing - conditions making testing possible, easy or difficult as none of it exists in a vacuum. I find agile testing still talks mostly about things around testing, and the things around testing, like the fact that testing is too important to be left only for testers and that testing is a whole team responsibility, those are some great things to share and learn on. 

All too often we diminish the core of testing into test automation. Today, I want to try out describing one small piece in the core of testing from my current favorite application under test while teaching, Dark Function Editor.

Dark Function Editor is an open source tool for editing spritesheets (collections of images) and creating animations out of those spritesheets. Over time of using it as my test target, I've come to think of it as serving two main purposes:
  • Create animated gifs
  • Create spritesheets with computer readable data defining how images are shown in a game
To test the whole application,  you can easily spend a work week or few. The courses I run are 1-2 days, and we make choices of what and how we test to illustrate lessons I have in mind. 
  • Testing sympathetically to understand the main use cases
  • Intentional testing
  • Tools for documenting & test data generation
  • Labeling and naming
  • Isolating bugs and testing to understand issues deeper
  • Making notes vs. reporting bugs

Today, I had 1.5 hours at Aalto University course to do some testing with students. We tested sympathetically to understand the main use cases, and then went into an exercise of labeling and naming for better discussion of coverage. Let's look at what we tested. 

Within Dark Function Editor, there is a big (pink) canvas that can hold one or more sprites (images) for each individual frame in an animation. To edit image on that canvas, the program offers a Modify Sprite Toolbar. 


How would you test this? 

We approached the testing with Labeling and naming. I guided the students into creating a mindmap that would describe what they see and test. 

They named each functionality that can be seen on the toolbar: Delete, Rotate x2, Flip x2, Angle and Z-Order. To name the functionalities, they looked at the tooltips of some of these, in particular the green arrows. And they made notes of the first bug. 
  • The green arrows look like undo / redo, knowing how other application use similar imagery. 
They did not label and name tooltips nor the actual undo/redo that they found from a separate menu, vaguely realizing it was a functionality that belonged in this group yet was elsewhere in the application. Missing label and name, it became a thing they would have needed to intentionally rediscover later. They also missed label and name of the little x-mark in the corner that would close the toolbar, and thus would need to discover the toggle for Modify sprite -toolbar later, given they had the discipline. 

The fields where you can write drew their attention the most. They started playing with the Z-order, giving it different values for two images - someone in the group knew without googling that this would have impact on which of the images were on top. They quickly run into the usual confusion. The bigger number would mean that the image is in the background, and they noted their second bug:
  • The chosen convention of Z-order is opposite to what we're used to seeing in other applications
I guided the group to label and name every idea they tried on the field. They labeled numbers, positive and negative. As they typed in the number, they pressed enter. They missed label and name for the enter, and if they had, they would have realized that in addition to enter, they had the arrow keys and moving cursor out of focus to test. They added decimals under positive numbers, and a third category of input values of text. 

They repeated the same exercise on Angle. They quickly went for symmetry with Z-order, and remembered from earlier sympathetic testing they had seen positive value 9 in the angle work already. They were quick to call the category of positive covered, so we talked about what we had actually tested on it.

We had changed two images at once to 9 degree angle. 
We had not looked at 9 degrees in relation to any other angle, if it would appear to match our expectations. 
We had not looked at numbers of positive angles where it would be easy to see correctness. 
We had not looked at positive angles with images that would make it easy to see correctness. 
We had jumped to assuming that one positive number would represent all positive numbers, and yet we had not looked at the end result with a critical eye. 

We talked about how the label and name could help us think critically around what we wanted to call tested, and how specific we want to be on what ideas we've covered. 

As we worked through the symmetry, the group tried a decimal number. Decimal numbers were flat out rejected for the Z-order, which is what we expected here too. But instead, we found that when changing angle from value 1 to value 5.6, the angle ended up as 5 as we press enter. Changing value 4 to 4.3 showed 4.3 still after pressing enter, and would go to 4 only with moving focus away from the toolbar. We noted another bug:
  • Input validation for decimal numbers would work differently when within same vs. other digits.
As we were isolating this bug, part of the reason why it was so evident was that the computer we were testing with was connected to a projector that would amplify sounds. The error buzz sound was very easy to spot, and someone in the group realized there was asymmetry of those sounds on the angle field and the Z-order field. We investigated further and realized that the two fields, appearing very similar and side by side would deal with wrong inputs in an inconsistent manner. This bug we did not only note, but spent a significant time writing a proper report on, only to realize how hard it was. 
  • Input validation was inconsistent between two similar looking fields.
I guided the group to review the tooltips they did not label and name, and as they noticed one of the tooltips was incorrect they added the label in model, and noted a bug. 
  • Tooltip for Angle was same as for Z-order description. 
In an hour, we barely scratched the surface of this area of functionality. We concluded with discussion of what matters and who decides. If no one mentions any of the problems, most likely people will imagine there are none. Thinking back to a developer giving a statement about me exploring their application in Cucumber podcast:
She's like "I want to exploratory test your ApprovalTests" and I'm like "Yeah, go for it", cause it's all written test first and its code I'm very proud of. And she destroyed it in like an hour and a half.
You can think your code is great and your application works perfectly, unless someone teaches you otherwise.

I should know, I do this for a living. And I just learned the things I tested works 50% in production. But that, my friends, is a story for another time. 






It's not What Happens at the Keyboard

"What if we built a tool that records what you do when you test?", they asked. "We want to create tooling to help exploratory testing.", they continued. "There's already some tools that record what you do, like as an action tree, and allow you to repeat those things."

I wasn't particularly excited about the idea of recording my actions on the keyboard. I fairly regularly record my actions on the keyboard, in form of video, and some of those videos are the most useless pieces of documentation I can create. They help me backtrack what I was doing, especially when there are many things that are hard to observe at once, and watching a video is better use of my time than trying the same things again on the keyboard - not very often. Or trying to figure out a pesky condition I created and did not even realize was connected. But even on that, 25 years of testing has kind of brought me better mechanisms of reconnecting with what just happened, and I've learned to ask (even demand!) for logs that help us all when my memory fails as the users are worse at remembering than I will be.

So, what if I had that in writing. Or executable format. It's not like I am looking for record-and-playback automation, so the idea of what value those would provide must be elsewhere. Perhaps it could save me from typing details down? But from typing just the right thing - after all, I'm writing for an audience - I would need to clean up to the right thing or not mind the extra fluff there might be.

I already know from recording videos and blogging while testing, that the tool changes how I test. I become more structured, more careful, more deliberate in my actions. I'm more on a script just so that I - or anyone else - could have a chance of following later. I unfold layers I'm usually comfortable with, to make future me and my audience comfortable. And I prefer to do this after rehearsal, as I know more than I usually do when I first start learning and exploring.

A model of exploratory testing starts to form in my head, as I'm processing the idea of tooling from the collection of data of the activity. I soon realize that the stuff the computer could collect data on is my actions on the computer. But most of exploratory testing happens in my head.


The action on the computer is what my hands end up doing, and what ends up happening with the software - the things we could see and model there. It could be how a page renders to be displayed precisely as it is, so that for future, I can have an approved golden master to compare against. It could be recognizing elements, what is active. It could be the paths I take. 

It would not know my intent. It would not know the reasons of why I do what I do. And you know, sometimes I don't know that either. If you ask me why I do something, you're asking me to invent a narrative that makes sense to me but may be a result of the human need of rationalizing. But the longer I've been testing, the more I work with intentional testing (and programming), saying what I want so that I would know when I'm not doing what I wanted. With testing, I track intent because it changes uncontrollably unless I choose to control it. With programming, I track intent because if I'm not clear on what I'm implementing, chances are the computer won't be doing it either. 

As I explore with the software as my external imagination, there are many ways I can get it to talk to me. What looks like repetitive steps, could be observing different factors, in isolation and chosen combinations. What looks like repetitive steps, could be me making space in my mind to think outside the box I've placed myself in, inviting my external imagination to give me ideas. Or, what looks like repetitive steps, could be me being frustrated with the application not responding, and me just trying again. 

Observation is another thing human side of exploratory testing brings. We can have tools, like magnifier glass, to enhance our abilities to observe. But ideas of what we want to observe, and its multidimensional nature are hard to capture as data points, and even harder to capture as rules. 

Many times the way we feel, our emotion is what gives another dimension to our observations. We don't see things just with our eyes, but also with how we experience things. Feeling annoyed or frustrated is an important data point in exploratory testing. I find myself often thinking that the main tool I've developed over years comes from psychology books, helping me name emotions, pick up when they come to play and notice reasons for them. My emotions make me brave to speak about problems others dismiss. 

Finally, this is all founded on who I am today. What are my skills, habits and knowledge I build upon. We improve every day, as we learn. We know a little more (knowledge), we can do a little more (skills) and can routinely do things a little more (habits). In all of these we both learn, and unlearn. 

I don't think any of the four human side parts of exploratory testing can be seen from looking at the action data alone. There's a lot of meaning to codify before tooling in this area is helpful. 

Then again, we start somewhere. I look forward to seeing how things unfold.