Saturday, February 8, 2025

Evolving scoring criteria for To Do App for recruiting

I have used various applications for assessing people's (exploratory) testing skills in recruiting. While it's already a high-stress scenario, I feel there needs to be a way to *test something* if that is the core of the job you'd get hired for. I may believe this because I, too, was tested with a test of testing 28 years ago.

Back when I was tested, the application was two versions of Notepad: one the original English, the other a localized version with seeded bugs. My setting for doing the test was one hour in front of a computer in a classroom, being observed. There were 8-12 of us in total for each scheduled session; we were all nervous, did not know each other, and most likely never even introduced ourselves. We did our thing, reported the discrepancies we identified as clearly as we could, and got a call back or not. I remember they ran these tests at scale. The testing teams we had weren't small, and we were all total newbies.

This week I assigned the To Do app as the thing to test. For newbies, I make it a take-home assignment and recommend spending under two hours on it. For experienced testers, I spend half an hour face to face out of the max two hours of interviewing time we allocate. Their work is not back yet, but my own work of looking at the application got done.

The most recent form of this take-home assignment is one where I give two implementations of the To Do app, and explain they are what developers use to show off their best practices for applying front-end frameworks.
  1. Elm: https://todomvc.com/examples/elm/
  2. Angular: https://todolist.james.am/#/

I ask for the following outputs:

  • A clearly catalogued listing of identified issues you’d like to give as feedback to whoever authored the version
  • Listing of features you recognize while you test
  • Description of how you ended up doing the assignment
  • (optional) example of test automation in a language and framework of your choice
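
For the optional part, even one small, honest check communicates a lot. A minimal sketch of a starting point in TypeScript with Playwright, where the .new-todo and .todo-list class names are assumptions based on the common TodoMVC markup, not verified against either implementation:

import { test, expect } from '@playwright/test';

// Minimal smoke check: an added item shows up in the list.
// Selector assumption: common TodoMVC markup (.new-todo, .todo-list).
test('added item appears in the list', async ({ page }) => {
  await page.goto('https://todomvc.com/examples/elm/');
  await page.locator('.new-todo').fill('buy milk');
  await page.locator('.new-todo').press('Enter');
  await expect(page.locator('.todo-list li')).toHaveText(['buy milk']);
});
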
The previous post listed the issues I am aware of, and today I created a scoring grid for the homework, in case you have your own version or would like to talk about mine. I am still hoping for the day when some of the people doing the assignment *surprise me* by reading about how to approach these exercises from my blog.

I can't expect complete newbies to get to all of it, but what worries me the most is that too many seasoned testers don't even scratch the surface. We still expect testers to learn testing by testing, often without training or feedback.

To Do Application - Assessment Grid

ESSENTIAL INSIGHTS
Architecture: frontend only
Same spec for both implementations
Material online to reuse
Reading the room, clarifying assumptions
Optional is chance to show more
Presenting your work is not just description of doing

ESSENTIAL ACTIONS
Research: find the spec as it is online
Research: ask questions
Meta: explain what and why you do
Learning: showing something changed in knowledge while testing
Bias to action: balance explaining and results
Modeling function, data, environments
Recognizing tools of environment
Choosing a constraint to control perspective
Stopping criteria: time or coverage
Classifying and prioritizing
Clarity of reporting issues
Reporting per implementation and common for both
TL;DR - expect lazy readers
Using and explaining a heuristic
Awareness of classes of data (e.g. naughty strings - sketched after the grid)
Surprise me (e.g. screenshot to genAI)

RESULTS
Functional problems (e.g. off by one count, select all, tooltip)
Functional problems, browser dimension (e.g. persistence, icon corruption)
Usability problems (e.g. light colors, lack of instructions)
Implementation problems visible on the console (e.g. messages in code, errors in the console)
Data-related problems: creating empty items
Data-related problems: trim whitespace
Data-related problems: special characters
Missing features (e.g. order items)
Typos
In-app consistency (e.g. always visible functionality that does not always work)

AUTOMATION
Working with selectors
Reading error messages
Scenario selection
Locators
Structure and naming
Describing choices
Readme for getting tests to run

MISTAKES THAT NEED EXPLAINING
Overfocus on locators while the application is unknown and automation is not in play
Wanting to input SQL injection strings
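
To make one of the grid items concrete: awareness of classes of data can show up as a small data-driven check. A sketch in the spirit of the Big List of Naughty Strings, with a hand-picked sample (the real list is far longer, and the selectors are again the common TodoMVC markup, an assumption for each implementation):

import { test, expect } from '@playwright/test';

// A hand-picked sample in the spirit of the Big List of Naughty Strings.
const naughtyStrings = [
  '<script>alert(1)</script>', // markup should show as text, not execute
  '🤔🤔🤔',                     // emoji outside the basic multilingual plane
  'Ω≈ç√∫˜µ≤≥÷',                // special characters
];

for (const input of naughtyStrings) {
  test(`item survives input: ${input}`, async ({ page }) => {
    await page.goto('https://todomvc.com/examples/elm/');
    await page.locator('.new-todo').fill(input);
    await page.locator('.new-todo').press('Enter');
    await expect(page.locator('.todo-list li')).toHaveText([input]);
  });
}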

I ended up cleaning this all and making it available at GitHub: https://github.com/exploratory-testing-academy/todoapp-solution 

Monday, February 3, 2025

That Pesky ToDo app

While I am generally of the opinion that we don't need injected problems on applications that are already target-rich as is, today I went for three versions of a well-known test target, namely the ToDo MVC app.

Theoretically this is where a group of developers show how great they are at using modern JavaScript frameworks. There is a spec defining the scope, and the scope includes a requirement for this to work on modern browsers (latest Chrome, Firefox, Opera, Safari, IE11/Edge).

So I randomly sampled one today - the Elm version, https://todomvc.com/examples/elm/

I took that one since it looked similar in styles to what Playwright uses as their demo, https://demo.playwright.dev/todomvc/, while the latest React version already has the extra-light styles updated to something you are more likely to be able to read.

I also took that one since it looked similar to the version flying around as a target of testing with intentionally injected bugs, https://todolist.james.am/.

My idea was simple: 

  • start with the app, to explore the features
  • loop to documenting with test automation
  • switch over implementations to see if the automation is portable across versions of the app (sketched below)
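
For that third step, the idea was to parameterize the same checks over the implementations; whether the shared TodoMVC class names hold across versions is exactly what is under test. A sketch:

import { test, expect } from '@playwright/test';

// The same check pointed at each implementation in turn.
const implementations = {
  elm: 'https://todomvc.com/examples/elm/',
  seeded: 'https://todolist.james.am/#/',
  playwright: 'https://demo.playwright.dev/todomvc/',
};

for (const [name, url] of Object.entries(implementations)) {
  test(`[${name}] completed items can be cleared`, async ({ page }) => {
    await page.goto(url);
    await page.locator('.new-todo').fill('done and gone');
    await page.locator('.new-todo').press('Enter');
    await page.locator('.todo-list li .toggle').check();
    await page.locator('.clear-completed').click();
    await expect(page.locator('.todo-list li')).toHaveCount(0);
  });
}
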
I had no idea of the rabbit hole I was about to fall into. 

The good-elm-version was less good than I expected:
  1. Select all does not work
  2. Edit mode cannot be escaped with Esc
  3. An unsaved new item is not removed on refresh
  4. Edit to empty leaves the item while it should be removed
  5. Edit to empty messes up the layout, which I should not even see since 4) should hold
So I looked at the good-latest-react version, only to learn persistence is not implemented. 
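
Persistence is the kind of claim automation pins down nicely. A sketch, again assuming the common markup, that a version without persistence will fail:

import { test, expect } from '@playwright/test';

// The spec asks for items to survive a reload.
test('items persist over a page reload', async ({ page }) => {
  await page.goto('https://todomvc.com/examples/elm/');
  await page.locator('.new-todo').fill('survive a refresh');
  await page.locator('.new-todo').press('Enter');
  await page.reload();
  await expect(page.locator('.todo-list li')).toHaveText(['survive a refresh']);
});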

And that is where the rabbit hole went deep. I researched the project materials a bit, and explored the UI to come up with an updated list of claims. The list contains 40 claims. With four of them failing, that would let me know the good-elm-version was 90% good, 10% not good.

Looking at the bugs-seeded version, there's plenty more to complain about:

  1. Typos, so many typos: "need's" in the placeholder, "active" uncapitalized, "toodo" in the instructions
  2. "Clear" is visible even when there are no completed items to clear
  3. "Clear" does not clear, because it is really "Clear completed"
  4. Counter has an off-by-one error (pinned down in a sketch after this list)
  5. Placeholder text vanishes as you add an item, but returns on refresh
  6. A sideways "a" as the icon for "mark all as complete" is not the visual I would expect, nor is the "A" with a tilde on top for deleting - this happened on Chrome, after using it enough, but the state normalized on a forced refresh.
  7. Select all does not unselect all on second click
  8. Whitespace trim is not in place if one edits items to contain whitespace, only when items are shown
  9. <!-- STUPID APP --> in comments is probably intentionally added for fun
  10. The "ToDo: Remove this eventually" tooltip is probably added for fun as well
  11. Errors about missing resources on the console are probably added for fun too
  12. "Clear" is missing the counter after it the spec asks for
  13. Usability of clear completed: since its functionality only works on the all and completed filters, does it really need to be visible on the active filter?
  14. The URL does not follow the technology pattern you would expect for the demo apps.
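
The counter bug is the kind of thing a check documents exactly. A sketch against the seeded version, assuming it keeps the usual .todo-count element:

import { test, expect } from '@playwright/test';

// With exactly one active item, the counter should read "1 item left";
// the seeded version is off by one here.
test('counter matches the number of active items', async ({ page }) => {
  await page.goto('https://todolist.james.am/#/');
  await page.locator('.new-todo').fill('one item');
  await page.locator('.new-todo').press('Enter');
  await expect(page.locator('.todo-count')).toContainText('1 item');
});
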
In the statistics of the feature listing, though, the pretty list of capabilities is hard to map to the messiness of the issues:

✓ should show placeholder text
✓ should allow to clear the completion state of all items
✓ should trim entered text
✓ should display the current number of todo items
✓ should display the number of completed items
✓ should be hidden when there are no items that are completed
✓ should allow for route #!/
 
7/40 (17.5%) does not feel essentially worse, but then again, there are many types of problems that the list of functional capabilities does not lead one to.

There is also the usability-improvement-conversation type of feedback that is true for both versions:
  1. The annoyingly light colors make seeing the UI and instructions hard
  2. None of these allow reordering items, and it feels like an omission even if intentional
  3. None of these support word wrapping
  4. Usability of the concepts "active" and "completed" for to-do items is a conversation: are there better words that everyone would understand more clearly?
  5. Usability with a mouse: there's no adding with a mouse, even if that feels by design
  6. Usability of the whole router/filter design can be confusing, as you may have a filter active that does not show the item you add
  7. The stacked shadow effect at the bottom makes it seem like there are multiple layers, which does not connect well with the filters/routing functionality
  8. Delete, edit and select all options take some discovering.

You could also compare to what you get from a nicely set-up demo screenshot of the bugged version.

The pesky realization remains: seeding bugs is unfortunately unnecessary. While I got "lucky" with the elm version's four bugs, I also got lucky with the refactored react version that is missing the implementation of persistence.

There's also an idea that keeps coming up with experienced testers, one we really need to stop throwing around at random: SQL injection. For a frontend-only application without a database, it makes very little sense, unless you can continue your story with an imagined future feature where the local storage JSON gets saved and used with an integration. Separating what is true now from risks for the future is rather relevant in communicating your results.
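
One way to ground that conversation is to show where the data actually lives. A sketch that feeds in a classic injection string and then dumps the browser's local storage instead of imagining a database (the storage key name, like todos-elm, is the project's convention and an assumption here):

import { test } from '@playwright/test';

// No server, no SQL: the string just lands as JSON in local storage.
test('the only persistence is local storage JSON', async ({ page }) => {
  await page.goto('https://todomvc.com/examples/elm/');
  await page.locator('.new-todo').fill("Robert'); DROP TABLE todos;--");
  await page.locator('.new-todo').press('Enter');
  const storage = await page.evaluate(() => ({ ...localStorage }));
  console.log(storage); // e.g. a todos-elm key holding a JSON array of items
});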

Playing more with the automation is left for another day. The nine tests of today were just scratching the surface, even if they pass 100% on the Playwright practice version and don't on any of the others.