The irony of being asked to estimate how long testing is going to take is twofold: not only is it like asking how long it will take to find all the Easter eggs at Easter, but often it's not really up to us to decide when to stop.
Testing is the empirical technical investigation performed to allow stakeholders to make important quality-related decisions (Cem Kaner). Therefore, the question of "how long will testing take?" is really the question, "How long is it going to take for a decision-maker to have enough information to inform a risk-based decision?"
The first estimation question, then, is: "How much information do we need to gather to inform a risk-based decision within the context of this project?"
The second estimation question is: "How long will it take to gather this information and communicate it?" The answer, of course, depends on the product, its complexity, the tester's understanding of the product, other stakeholders' understanding of the product, access to oracles, and the presence of obstacles to testing. Many of the key factors that affect how long testing takes are therefore completely outside the tester's control, and the amount of information that needs to be gathered is more a function of the decision-maker than of the tester. Yet testers are the ones asked to provide the estimate.
So how have I handled the testimation problem? Here is one way I've done estimation in the past:
Considering the multiple estimates involved, and the multiple factors outside my control that affect how long the testing task will take, I make clear that I'm giving a 'ball-park' estimate, and that it should be treated as a heuristic for determining the resources potentially required, to be continually analysed and adjusted. It's important for me to be as transparent as possible, lest a manager be misled into assuming that the number I give is a scientifically derived answer.
- Visual Testing Coverage Map (VTCM): A visual representation of our model of test coverage (more info here & here & here.)
- Test Coverage Areas: Functions or features of the product, or test activities we could dedicate testing time to cover, based on our model of test coverage.
- Test Session: A time-boxed uninterrupted test session dedicated to covering test coverage areas.
- The VTCM is divided into test sessions; that is, groups of test coverage areas that I believe could each be covered in an ideal 90-minute uninterrupted testing session.
- I assume that a tester can complete three test sessions per day. Time spent investigating difficult-to-reproduce bugs, reporting bugs, and setting up environments is not included in a session, as it is highly variable and impossible to predict. Emails, meetings, learning activities, and other interruptions are also not included, hence the estimate of only three test sessions per day.
- The product is well understood by the people around us, and we have access to their knowledge in some form (documents, conversations, etc.).
- The test environment meaningfully reflects the production environment.
- Login details, account names, passwords, IP addresses, URLs, FTP details, Database details and any other information needed to access and test the product are available to the tester.
- The tester has sufficient permissions within their environment and within the application to be able to perform testing to a satisfactory level. (This usually means admin access to the application and to the tester's machine.)
- Where good testing relies on production or production-like data, that data will be present, or the ability to create and populate it will be available (at a level I can perform).
- Communication between testers, PMs, developers, BAs and other project team members will be unimpeded by geographical, political, or other constraints.
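The definitions and working assumptions above boil down to a small piece of arithmetic. The following is only a minimal sketch of it: the names `Session` and `estimate_days` and the coverage areas in the usage example are hypothetical, and the only figures taken from the text are the 90-minute session and the three sessions per day.

```python
import math
from dataclasses import dataclass, field
from typing import List

SESSION_MINUTES = 90   # ideal uninterrupted session length (from the text)
SESSIONS_PER_DAY = 3   # assumed tester throughput (from the text)


@dataclass
class Session:
    """One time-boxed session and the coverage areas it is meant to cover."""
    name: str
    coverage_areas: List[str] = field(default_factory=list)


def estimate_days(sessions, sessions_per_day=SESSIONS_PER_DAY):
    """Ball-park working days needed to run every session, rounded up."""
    return math.ceil(len(sessions) / sessions_per_day)


# Hypothetical coverage map with five sessions (names are illustrative only).
example_map = [
    Session("S1", ["Installation", "First launch"]),
    Session("S2", ["Login"]),
    Session("S3", ["Browsing", "Search"]),
    Session("S4", ["Settings"]),
    Session("S5", ["Error handling"]),
]

example_days = estimate_days(example_map)  # five sessions at three per day
```

Rounding up is a design choice: a part-used day is still a day that has to be booked. The result is only as good as the assumptions listed above.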
Example: This is a VTCM for Digg for Desktop which I created in a recent Weekend Testers ANZ exercise:
(view full size here: http://i.imgur.com/6zW6RZp.jpg)
Each shaded area represents the Test Coverage Areas of one 90-minute session. Therefore, according to this model, each shaded area is an equivalent amount of work. There are 12 sessions, and 12 / 3 = 4, so I estimate it will take four days to cover this model...
...which we should immediately distrust.
However, we now have a benchmark that we can use to monitor how our testing is tracking based on the assumptions we have made. Where reality and our estimate differ, we can adjust our estimate on the fly.
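One possible way to adjust on the fly is to re-project the total from the throughput actually observed so far. This is a sketch under stated assumptions rather than a prescribed formula: the function name `reproject_days` and the linear extrapolation are my own illustration, and it assumes the remaining sessions will go at the same pace as the completed ones.

```python
import math


def reproject_days(total_sessions, sessions_done, days_elapsed, planned_per_day=3):
    """Re-estimate the total number of days using observed throughput.

    Falls back to the planning assumption while there is no evidence yet.
    """
    if days_elapsed <= 0 or sessions_done <= 0:
        return math.ceil(total_sessions / planned_per_day)
    remaining = total_sessions - sessions_done
    # Days still needed = remaining sessions / (sessions_done / days_elapsed)
    days_remaining = math.ceil(remaining * days_elapsed / sessions_done)
    return days_elapsed + days_remaining


baseline = reproject_days(12, 0, 0)   # before testing starts
on_track = reproject_days(12, 6, 2)   # three sessions a day, as planned
slipping = reproject_days(12, 4, 2)   # only two sessions a day so far
```

Comparing `slipping` with `baseline` after a couple of days gives an early signal that one of the assumptions behind the estimate is not holding.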
I'm making no claim that this method is any more accurate (if such a thing is even possible) than any other method. What it does provide is visibility. My experience with this method is that I can show someone the VTCM and they can say, "You're dreaming if you think it's going to take only 90 minutes to cover X." My estimate and the model I used to derive it are contained together.
"How did you come up with four days?" "Let me show you" I can reply. People are then able to critique my estimate because the testing coverage areas, AND my model of the product are available and visible for anyone to see. And when we start testing, we should hopefully begin seeing right away whether there's something systemically wrong with the estimate. Turns out there's more bugs than anticipated and I'm only getting through two sesions a day? We can adjust. I have a map that stakeholders can use to select which areas they may want to descope, since my model contains areas I intend to test and areas I don't intend to test. It's kind of like a menu, with prices next to them. A stakeholder can then select what items on the menu they'd like, and if they have a fixed budget, be able to select the best course for them.