Saturday, November 5, 2011

Management: 80 % feels better than 50 %

I never feel too good going back to Quality Center for the tests we just did, but this time we managed to use it so that it did not cause us too much trouble.

Our "test cases" in the test plan are types of data. On average, a test case seems to have 5 subdata attached to it. All in all, a lot of our planning was related to understanding what kinds of data we handle in production, what is essentially different and how we get our hands on that kind of subset.

We had two types of template tests in use for the steps:
  • Longer specific process flow descriptions: basically we rewrote the changes that were introduced in the specification in detail in a minimum number of workflows, and ended up with about 40 steps and 5 main flows.
  • Short reminders of common process flow parts: these had 5 steps and just basically mention the main parts that the system needs to go through, little advice for the testers.
Our idea was that for the first tests we run, we support the incoming testers with the longer flows. And when they get familiar (they have the business background, so it doesn't take that much), we enable them to do things in an exploratory fashion, with the feeling of being in a service role towards their own colleagues - it's an internal system.

Before we started testing, I took a look at the documentation, numbers and allocated time, and made a prediction: we'd get to 50 % measured against the plan in the timeframe. After the first of five weeks, we were at 4 %, and I was sure we'd stay far behind my pessimistic prediction.

We also needed to split the tests in Plan to 5 instances of the test in Lab, to make sure we would not end up showing no progress or being unable to share the work in our team, as one of the things to test was potentially a week's worth of work. We added prioritization at this point, and split things so that one of the original test cases could actually be started on week one to be completed on week 5 - if it would ever be completed.

We did our fine-tuning while testing: we moved from the detailed cases (creating more detailed documentation on what and how we were checking) to the general ones. With face-to-face time spent talking about risks, we cut down each actual test run in a different way, building together an understanding of the types of things we had already seen and of where we could accept the risk that something may not work, since there was stuff that was more relevant to know of right now.

With notetaking, I encouraged the testers to write down stuff they need, but taught that blank means nothing relevant to add. They ran the tests in QC Lab and made notes there in the steps. The notes we write and need are nowhere near the examples that go around for session-based test management.

The funny part came from talking with some of the non-testing managers, when they compared my "50 % is where we will get" to the end result of 80 %, as the metric was taken out of Quality Center. It was (and still is) difficult to explain that we actually did end up pretty much on the level I had assumed and needed to cut about half of what we had planned. We had just decided to cut random amounts out of each test, instead of cutting something that they had decided to assume was comparable.
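A minimal sketch of how the two figures can both be true. Every number below is made up for illustration (the post gives only the 50 % and 80 % figures), and the assumption is that the tool counts an instance as run even when its content was trimmed.

```python
from fractions import Fraction

# Hypothetical plan: all numbers here are illustrative assumptions.
planned_tests = 10
instances_per_test = 5  # each Plan test split into 5 Lab instances
total_instances = planned_tests * instances_per_test

# An instance counts as "run" in the tool even if it was trimmed.
instances_run = 40
# Assumed share of the originally planned work done inside a run instance.
depth_per_run_instance = Fraction(5, 8)

# What the tool reports: run instances against planned instances.
reported_coverage = Fraction(instances_run, total_instances)
# What was actually done: reported coverage scaled by how deep each run went.
work_done = reported_coverage * depth_per_run_instance

print(f"reported against plan: {float(reported_coverage):.0%}")
print(f"share of planned work: {float(work_done):.0%}")
```

With these invented inputs the reported figure is 80 % while the share of planned work done is 50 %, which is the kind of gap the managers found hard to accept.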

Yet another great exploratory testing frame with the right amount of preparation and notetaking. And yet, with all the explaining of what really happened, management wants to think there was a detailed plan, that we executed as planned, and that the metric of coverage against plan is relevant.

Friday, November 4, 2011

Ending thoughts of an acceptance test

I spent over a year in my current organization to get to the point where my team completed a project's acceptance test for the first time. Now, a little over two years in, a second project's acceptance test is ending.

I'm a test manager in quite a number of projects that are long enough that I feel most of my time goes into waiting to actually get into testing. Two more to complete in the upcoming months, and another longer project just getting started.

I wanted to share a few bits on the now-about-to-be-finished acceptance testing.

Having reached this point, I can reflect back a month and admit that I believed our contractor would not be able to find the relevant problems. I had reviewed some of their results, and noted that
  • system testing defects were mostly minor and there were not that many of them
  • it took about 4 days to plan & execute a test on average, 1 day to fix a bug when found during system testing
  • it took about 2 days to plan & execute a test on average, 0,7 days to fix during integration testing
On setting up the project, I had tried convincing the steering group to let the customer side testers participate in the contractor testing phases in a testing role, but was refused. We wanted to try out how things go when we allocate the responsibility over to the contractor side. So I wasn't convinced there would not be a number of "change requests": bugs that the contractor can't or doesn't care to find, since they are problems that you just can't read directly from the specifications.

Now that we're almost done with our testing, I can outline our results. We found a small yet relevant portion of problems during the project. I had estimated that if I could count the number of bugs to fix on the fingers of one hand, we would be able to do this in the allocated one-month timeframe, and that's where we ended up. Another batch was must-have change requests, bugs with just another name. And about 80 % of the issues we logged reflected either bugs in current production (and mostly those we don't fix, although it's not so straightforward) or problems in setting up the test scenario for the chain of synchronized data. I just counted that we used on average 4,6 days to find an issue. If I take out the ones that did not result in changes, that is 20 days per relevant bug. I have no right to complain about the contractor's efficiency. And I had better do right by them, and make our management clearly aware of this too.
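The two averages above imply what share of the logged issues led to a change. A small sketch of that arithmetic, assuming both averages were computed over the same total effort (the post does not say so, so treat this as an assumption):

```python
# Figures from the post, written with decimal points.
days_per_issue = 4.6          # average effort per logged issue
days_per_relevant_bug = 20.0  # average effort per issue that led to a change

# If both averages share the same total effort, their ratio is the
# share of logged issues that resulted in a change.
share_relevant = days_per_issue / days_per_relevant_bug

print(f"implied share of issues leading to a change: {share_relevant:.0%}")
```

The implied share is roughly in line with the "about 80 %" of issues that did not lead to changes.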

So, I guess I have to admit it. The contractor succeeded even though I remained sceptical up until the very end. I had a really good working relationship with the customer side test manager, and at this point I'm convinced she is a key element in this experience of success, and in other projects still going on that are over schedule, with still relevant concerns on whether the testing we do on the customer side will ever be enough.

Just wanted to share, amongst all the not-so-fortunate stories I have, that this one went really well. The only glitch during the project was at the time when our own project manager needed planning support to take the testing through. After that, she was also brilliant and allowed me to work on multiple projects in a less managerial role and more as a contributor to the actual testing.

Saturday, August 20, 2011

Tester to testing is NOT like surgeon to surgery

I wrote a post earlier (quite some time ago). Today I realized I had comments on that post and other ones that I did not know of.

The main point that I was trying to write about is that some people may be right in saying you don't need testers - as in full time test specialist team members - in scrum teams, you just need testing - the skills in team members that don't identify as testers.

James Bach made a good point in saying he has not met any / many people who would make serious commitments towards learning the skills needed in testing other than those who identify themselves as testers. I've met some, but too few.

However, this post is about arguing against the point he made, which I placed in the title:
"It could also be said that you don't need surgeons -- only surgery"
or the milder version of the same that Ru Cindrea, a respected colleague in Finland, made: that you don't need developers either, only development.

I would argue that testing to development is not quite the same thing as surgeon to surgery. I mean that in the sense that surgeons are not in the information providing industry as testers are to developers and stakeholders, but are more self-contained. Perhaps there would be a second surgeon in surgery who looks to provide a service to the other, but without knowing much of a surgeon's life, I would still guess that person is called a surgeon too.

If there was no development, there would be little testing related to the development that was never done.

We need people trained in the skills of testing. Well trained. Both those who identify as developers and those who identify as testers. The essential difference to me is that it is hard to keep up with the skills of testing with the limited amount of hours available, let alone having to use half - or more - of my time on development skills.

I just don't want to go with the assumption that developers can't do testing, when I've witnessed in numbers more testers who can't test than developers. It may not matter what you're called.

Wednesday, August 17, 2011

Testing Micromanagement

I spent the last week in CAST2011 in Seattle, talking with testing people and listening to the most interesting-sounding tracks I could find. One that left me thinking for a long time was by Carsten Feilberg on Managing Testing based on Sessions.

After his quite interesting talk, I had to ask if he felt like his style of SBTM had too much micromanagement included, and naturally he did not feel that way. I wrote in my notes that a manager's responsibility is not to manage, but to make sure it is managed, and have been pondering my strong reaction to what I felt was micromanagement.

We briefly talked about the contextual differences, but way too shallowly to actually know the determining factors yet. One of the differences we identified is how we felt responsibility in our organizations is allocated, e.g. is the test manager responsible for coverage or can that responsibility lie with the team members. My team members are subject-matter experts and I would not dream of controlling their work in chunks of less than 2 hours, I just teach them to do that themselves. Thus I have sessions that are "private" and sessions that are "shared", so I do heavy sampling to guide the team and test if they are on the right track as I understand it.

I went to Wikipedia to check what micromanagement is: a management style where a manager closely observes or controls the work of his or her subordinates or employees.

Having thought about it for a week, I still feel SBTM as it has been described originally is a form of micromanagement. It's less granular than detailed test cases, but it was intended for high accountability - that comes with a cost. I most often don't feel I need that.

In a local discussion on whether SBTM is micromanagement or not, a lesson I picked up in agile circles several years back came to mind. In some material I read, there was a typical quadrants picture of two dimensions of building a self-organized team. One axis was Willing vs. Unwilling: whether there was attitude building to do with the individuals in the team. The other was Capable vs. Incapable: whether they had deep enough skills to do the work that needed doing. For now, I think assumptions on where my team is on these scales are significant in deciding how often I'd need to control to keep the notes good enough.

I feel lucky to have a team that is willing and mostly capable. And that my capabilities complement those that the team already has.

Saturday, March 5, 2011

Timing is essential!

In recent projects, I've had the pleasure of working with an organized frame for testing. The frame includes very traditional elements: first you design your tests and write them out for experts in the matter to review, to receive comments that, as per your documentation, they are not sure if what you do is enough, and how you could improve. After most of the comments have been ignored for various reasons, the test execution phase starts, in which you're supposed to report which tests now pass, which fail, and which you have even tried out.

I work within an organization that provides contents to this frame of testing. We have testers who supposedly could do the testing blindfolded, or with the help of other subject-matter experts. The frame is just for showing if we've done what we were supposed to, and if the schedule can hold up to its promises.

I'm having a small-scale rebellion within the frame.

For a particular area to test, I just did not manage to motivate myself into writing the tests as requested in the planning phase. It was a prioritization decision with too many things to do, to drop out something that would most likely be of little value. For testing in a different way, I have significant organizational support within my organization. The organization guiding the testing work is external, but my good fortune is that the somewhat-of-an-expert-in-test-positioning allows me more flexibility than others.

We skipped the "test planning phase". When "test execution phase" started, I trusted the 3 month timeframe would be enough for what we had thought and discussed on a very high level, checking a sample of production data, some hundreds of items. I assumed the people I work with knew how this particular thing is supposed to work, at least in some scale as they had participated in defining what we'd want.

We started executing tests as the version became available. We ran tests on a selected sample of production data, selected against business-oriented criteria of commonality and essential differences that we'd run into. We had selected a corner to start from, so this was just the first step - we were not sure what other steps would be required, but there was an idea that perhaps about 4 different rounds of attacking the software would be needed. The first round included 61 samples of data - 61 SOAP messages where the input data was the only variable.

Half of the response messages included wrong answers. We dug deeper in our analysis, and identified 13 separate issues, which we reported. The same problems would happen in a lot of our samples, at least that was our assumption. Now, about a month later, I know that 5 out of our 13 issues have resulted in a code change to get the result we expected. One is in the queue. Others were due to us setting up some of our data incorrectly - a typical problem in our domain, looking for the stuff in the wrong environment. With every step of our testing, we learned together, and adapted what we'd want our next steps to be. This was possible and easy, as we had not yet written the test specification that would fix the contents of our tests.

However, as was expected, the organized frame would make its comeback. I received polite reminders of delivering our test specification, up until the point I felt the "last responsible moment" had arrived. I reviewed examples of what the test spec should be like, and made the decision of not taking the advice from examples, which would have required me to write 61 test cases of the first round we did - documentation that no-one really needs. Instead, I wrote four test cases and no-one can argue those are not test cases. One just happens to handle multiple data samples.

At this point, we know from running the first round that there's another round just like this one, with a tweak of one of the most essential variables in addition to data. And we're safer to assume, knowing our skills and understanding the application better, that four samples of different changes to variables would cover the ground that we need to cover for us to be safe with the most significant risks. Thus, four test cases.

Yesterday evening, two days after having delivered the test specification for review, I ran the second batch of tests. There are other people to do some more detailed analyses, but already as I was running them, I looked at all the results in a matrix to spot trends and increase my understanding of what this would take to test as far as we want to go. I realized that if I had run my tests one by one, as management wished for reporting purposes, there would have been problems I could not have spotted. They were obvious in a larger sample, but would not have been noted in single samples.
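One way to picture the matrix view is a check across all samples at once. The sketch below is only an illustration: the field names and values are invented, and flagging a field that never varies is just one of many trends a matrix can reveal.

```python
def constant_columns(rows):
    """Return the output fields whose value is identical in every row."""
    fields = rows[0].keys()
    return sorted(f for f in fields if len({row[f] for row in rows}) == 1)


# Hypothetical responses for three different input samples.
# Each row looks plausible when read on its own.
rows = [
    {"status": "OK", "amount": 120, "start_year": 2011},
    {"status": "OK", "amount": 340, "start_year": 2011},
    {"status": "OK", "amount": 95, "start_year": 2011},
]

# Fields that never vary are worth a second look when the inputs did vary.
suspects = constant_columns(rows)
print(suspects)
```

A constant field is not a bug in itself. It is a prompt to ask whether the inputs should have produced different values, a question that never arises when the samples are reviewed one at a time.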

As I assumed, I also received the "how do we put these tests into our reporting" question, with a kind request to write the tests as they had thought out for reporting purposes. I suggested two options:

1) four test cases, where one has been started but state is fail until bugs are fixed (which hides our progress, but serves as a rough way of seeing our progress)

2) add categorizing numbers to the test cases after we've run them, as we don't exactly know the contents before we're running them. They'd metrics-wise be the same as others, after the tests have been run but not before. We just don't want to do extra work that does not create value, and them being able to follow our progress in detail provides little value if any.

Looking forward to how this turns out, and will write a better experience report later. I need these to change the status quo and allow good people to do good testing in our segment. There should be more consideration of timing, of when to do what, and of interleaving test design and execution. To allow my colleagues to work efficiently with better results than before, I need to help in creating a better frame. Time is right for that.

Tuesday, September 28, 2010

A story on requirement traceability

Inspired by a thread on the software-testing list, I shared a story I'll also post here.

Not very long ago, I was working in a project, contractor side, with responsibility on testing the changes with a team of testers. The change was adding a common new feature to a number of applications, built with various technologies.
As is typical in the sector (insurance / pension), we had lots of documentation: requirements / functional specifications / technical specification for each application, going to the detail where there was not much room for interpretation. We also had the Way-Testing-Must-Be-Done, including traceability to detailed test cases. Since someone had thought of requirements being different, they came up with the concept of test requirements, where you'd create yet another level of documentation as part of the specification project that puts all others together from the point of view of testing.

The test requirements were created per application. They detailed what should be tested - whatever the specification maker had come up with. As the Way-Testing-Must-Be-Done stated, we carefully linked each test case to requirement, and for a lot of the requirements, there were several test cases. Huge effort.

On the side, we did a little exercise regrouping the requirements on a list that was formatted towards the overall change and the risks related to that in particular ways. Just for fun, we traced our tests to this list too. Previously we had 100 % coverage, as the Way-Testing-Must-Be-Done required of us. From this point of view, the coverage measure was 13 %. We did not add more test cases.
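A minimal sketch of how the same tests can score so differently against two lists. The identifiers and counts are invented for illustration and do not reproduce the project's real data. Coverage is taken here to mean the share of list items with at least one test traced to them.

```python
def coverage(items, traces):
    """Share of items that have at least one test traced to them."""
    covered = set()
    for linked_items in traces.values():
        covered.update(linked_items)
    return len(covered & set(items)) / len(items)


# Hypothetical lists: per-application test requirements vs. a risk-oriented list.
spec_requirements = ["REQ-1", "REQ-2", "REQ-3"]
risk_list = [f"RISK-{n}" for n in range(1, 9)]

# The same three tests, traced once to each list.
traces_to_spec = {"TC-1": {"REQ-1"}, "TC-2": {"REQ-2"}, "TC-3": {"REQ-2", "REQ-3"}}
traces_to_risks = {"TC-1": {"RISK-4"}, "TC-2": {"RISK-4"}, "TC-3": set()}

spec_coverage = coverage(spec_requirements, traces_to_spec)
risk_coverage = coverage(risk_list, traces_to_risks)

print(f"against test requirements: {spec_coverage:.1%}")
print(f"against the risk list:     {risk_coverage:.1%}")
```

The tests did not change between the two calculations. Only the list they are measured against did.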

Eventually, we tested. We ran out of schedule with less than half of the planned tests executed, and had to pass on the software anyway. It was tested by yet another group, with very few problems to note. No complaints in production (they still might not know it's not working...). The unfortunate part was that our group wasn't doing too well results-wise in our testing: we found only a handful of problems.

I've written down some metrics during the project. The size of the overall effort was about 5 man-years, and 16,7 % of it was reserved for testing. We logged 5 bugs. A big part of the testing was talking to people, 75 people listed if you wanted to talk to all that were significantly involved in making it happen.

In my past projects in a completely different sector (software products), this testing would have been considered quite a failure. Documentation was expensive, it did not help us in the future, and it did not help us in finding problems (there weren't many) or in making sure we would have tested before passing it on in the chain.

Lessons I actively took from this:
- I will not compromise my beliefs in what makes good testing for the Way-Testing-Must-Be-Done without a good discussion again
- Requiring and managing traceability isn't providing much value this way - we can use the requirements (some of them at least) as session charters instead of creating more useless documentation. I knew it before, now I know how much it took in effort with little value provided.
- The traceability concept we were using missed an essential part: the level of quality committed developers could produce without support from a traditional testing group in testing of their own and ways of building the software to avoid some of the problems.

In my current projects, I guide contractors from customer side. Traceability is the magical proof that the contractor did what the customer required, and that sending extra invoice on anything unclear is allowed. I'd prefer cheaper ways of doing that, and getting a system to production that serves at least a significant part of the expectations that were included in setting up the project. I don't want full coverage. It's way too expensive. And when the cost is needed, I'd prefer responsible ways of covering risks instead of the requirements.

Wednesday, February 24, 2010

Tester scope and authority

Some weeks back, there was a discussion on the yahoogroups software-testing list, on which I dared to comment.

The discussion, briefly summarized, dealt with testers' scope and authority, and what a tester should do that is within her authority. There was a comment related to the idea of separation of concerns emphasized in agile, where the "what" questions belong to business and "how" questions belong to the team. And a tester is part of the team.

I find myself and a significant portion of my colleagues in test to be people who are somewhere between business and the team. I've intentionally chosen to be a tester, and focus most of my energy into testing type of activities. I could be a project manager. I could be a product owner. But, I'm a tester.

If I would choose to be e.g. a product owner, I could still test. I could take the bits from XP and interpret that acceptance testing, at least the tip of it after all the automation, belongs to the customer role.

I see the potential personal benefits in focusing on one or the other of what / how – getting to actually be good at one instead of trying to do both. But again, mostly from a personal point of view, do I really have to choose between the sides I live by "living up to my role"?