Not Sure About Uncertainty

By James Thomas

Risk-based testing (RBT) is defined in terms of the prioritisation of possible test activities based on perceived risks. These will often cover factors such as:

  • Usage: perhaps we’ll test more on platforms where we have more customers.
  • Impact: if component X failed, all customer data would be lost, so perhaps we should test it.
  • Type: we could consider risks to users or customers or shareholders or profits.

Frequently, identifying the risks will be an exercise in speculation, intuition, experience, happenstance, circumstance and plain old luck. We might have data on which to base a strategy and some of it may be considered reliable but, even with it, there’s still the second-order problem of creating a framework with which to compare the risks. A model of the risk space, if you like.

Let’s consider a product, a mobile app. A new release just arrived on our desk and, as it’s got history, we have the opportunity to gather some data on previous versions, development cycles, our customer base, income streams and so on.

A simple model for the risk space might suggest testing iOS and Android proportional to the sizes of their user bases. A more complex model might attempt to target testing towards areas that have the greatest potential impact on revenue. This model includes an explicit risk type and could include information about how the impact and probabilities of the identified risks are calculated and combined.
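As a rough illustration of the two models just described, here is a minimal Python sketch. Everything in it is an assumption made for the example: the user counts, the feature names, the probabilities and impact figures, and the choice to combine probability and impact by simple multiplication. A real model could combine them in many other ways.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Risk:
    """A hypothetical identified risk (all numbers are invented)."""
    name: str
    probability: float  # assumed chance of failure, between 0 and 1
    impact: float       # assumed revenue at stake, in arbitrary units

    @property
    def exposure(self) -> float:
        # One common way to combine the two factors; an assumption here.
        return self.probability * self.impact


def allocate(budget_hours: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split a testing budget in proportion to the given weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: budget_hours * w / total for name, w in weights.items()}


# Simple model: test effort proportional to the size of each user base.
user_base = {"iOS": 6000, "Android": 4000}
simple_plan = allocate(40, user_base)

# More complex model: test effort proportional to revenue exposure.
risks = [
    Risk("checkout", probability=0.25, impact=400),
    Risk("onboarding", probability=0.5, impact=100),
    Risk("settings", probability=0.125, impact=400),
]
complex_plan = allocate(40, {r.name: r.exposure for r in risks})

print(simple_plan)
print(complex_plan)
```

Even in a toy like this, the two models direct the same 40 hours to quite different places, and every number fed into them is itself a guess.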

However you do it, in order to evaluate risks consistently it’s important to be clear about the kind of risk you are targeting. Prioritising on risk to a release date being met might result in a very different outcome to prioritising on usability, which would be different again to prioritising on the bottom line.

Yet even taking this into account, and even using an arbitrarily sophisticated model, we can’t in general be sure how accurate or useful it is. As so often in testing, this kind of approach to modelling is necessarily heuristic: something that we believe is a reasonable, practical approach and which leads to a good enough result, taking account of context, but which is ultimately fallible.

This means that there is also risk associated with our risk model, and this should not be discounted when considering the problem we’re working on. We might hope that more time spent on the model would yield more accuracy (in whatever dimensions we’ve decided are important). But we also have to weigh the cost of model-building against the budget for the investigation that the model is being created to serve. Further, the model is an abstraction, and abstractions are always leaky, which counsels careful thought about the effort and time spent finessing the model.

In practice, we’ll often refine the model as we learn more during the investigation. In this way, the perceived risks change as knowledge about the problem space changes.

John Stevenson presents a brief summary of psychological research that shows people are biased to prefer a risk they can evaluate over one they cannot. He ties this to RBT by suggesting that it is more likely to prioritise the investigation of quantifiable risks over more open-ended questions. From his piece:

“The majority of our testing is spent on testing based upon risk, with outcomes that are statistically known … however does it have more value than testing against uncertainty?”

Leaving aside the question of whether the majority of testing is or can be statistically known, the interesting distinction between risk and uncertainty that he’s drawing on is taken from the work of economist Frank Knight. Loosely, Knight says that:

  • Risk should be reserved for scenarios which can be measured, or quantified in some way, say with a probability (distribution).
  • Uncertainty is for those scenarios that cannot be quantified or measured in any way that we know of.

Most definitions of RBT that I’ve found don’t make this distinction, which means that while they do not rule out including uncertainties as risks, neither do they highlight the possibility of doing so.

Uncertainty is a particularly useful concept in testing. As testers, we are often in a position of uncertainty about any number of factors, including the extent to which it is reasonable to be uncertain about them. For example:

  • Where we have some numerical basis for a risk model the testing task might become one of determining the uncertainty associated with it. If we are confident in our risk assessment, perhaps we won’t (need to) test much or at all.
  • Where we have no numerical basis for our assessment – no possibility of risk calculation in Knight’s terms – we again have uncertainty and the testing task might become one of determining risk to some level of confidence.

This kind of thinking suggests that the risk/uncertainty distinction might align well with the one between known and unknown, exemplified by this kind of matrix:

Could we consider it this way?

Our view of some factor as a risk includes an assumption that we understand it well enough to assign some kind of probability to it: it’s a known known. Where we are aware of some factor but there is uncertainty about how to quantify it, we have a known unknown.

We will often know much about a product, its requirements, surrounding context, the target customer and the numerous other factors that contribute to the understanding of the testing problem without ever enumerating them. This implicit knowledge falls into the class of unknown knowns – until and if it is made explicit.

Finally, in those cases where we aren’t even aware that there may be something to consider we’ve got unknown unknowns.
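The four categories above can be sketched as a small classification, shown here in Python. The two yes/no questions used to sort a factor (are we explicitly aware of it, and do we understand it well enough to quantify it) are one reading of the text, not a definitive scheme, and the names `Quadrant` and `classify` are hypothetical.

```python
from enum import Enum


class Quadrant(Enum):
    KNOWN_KNOWN = "known known"
    KNOWN_UNKNOWN = "known unknown"
    UNKNOWN_KNOWN = "unknown known"
    UNKNOWN_UNKNOWN = "unknown unknown"


def classify(aware: bool, understood: bool) -> Quadrant:
    """Place a factor in the matrix.

    aware:      have we explicitly identified the factor?
    understood: do we (perhaps only implicitly) understand it well
                enough to put some kind of probability on it?
    Both questions are simplifying assumptions for this sketch.
    """
    if aware and understood:
        return Quadrant.KNOWN_KNOWN      # treated as a risk
    if aware:
        return Quadrant.KNOWN_UNKNOWN    # identified but not quantifiable
    if understood:
        return Quadrant.UNKNOWN_KNOWN    # implicit knowledge, never enumerated
    return Quadrant.UNKNOWN_UNKNOWN      # not even on our radar


def is_risk(quadrant: Quadrant) -> bool:
    # Only the known knowns count as risk in this reading.
    return quadrant is Quadrant.KNOWN_KNOWN


print(classify(aware=True, understood=False).value)
```

The crisp boolean inputs are the weakest part of the sketch: in practice our answers to both questions are themselves uncertain.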

Let’s cast the testing task in terms of identification and understanding of uncertainty. On this view, risk would constitute the top-left of the matrix and the remainder would fall into uncertainty. But no classification is likely to be as clean as this. The red arc represents the notion that there is uncertainty about our assessment of risk against uncertainty.

With this model, the goal of testing would be to surface potential issues and place them into the top half of the problem space.

Nothing lodges in the bottom half of the model. Once a potential factor is identified, we may be able to say which of the two bottom squares it belonged to, but it is then known and so we perceive it in the top half.

But how are these issues found? Exploration, questioning and exposing assumptions can highlight them, pluck them from the bottom and place them somewhere above. Experimentation, new data and answers to our questions could move our perceived location for them from top right to top left.

But there can be migration in other directions too. Experimentation might reveal that what we thought we knew, we actually didn’t. At this point, a previously known known is found to be a known unknown – our risk turns out to be an uncertainty. Digging into our reasons for making something a known known might expose that we only know it to some level of granularity and the rest was, in fact, an unknown known.

The usual focus of testing is to expose uncertainty and to classify it as risk where possible. Once a risk is understood, then we have arguably completed our work on it. The testing task becomes reviewing what we know for flaws, reviewing what we don’t know for things we could and should know and clarifying the things we know about but don’t understand.

The distinction between testing and checking is potentially interesting here. Checking can only ever operate in the top left quadrant, on known factors with (perceived) understood outcomes. The testing task is the bigger, broader, uncertain task.

A meta-challenge of testing is to balance the quest for new unknowns against the investigation of existing known unknowns on the basis of whatever data is available. Although the matrix is drawn as a square with four sections of identical area in the figures shown, there’s always additional uncertainty about their relative size and shape.

We should feed our risk model with the best data we have about that topography – more time spent on research might give us confidence that the unknown unknowns are small relative to the known knowns. An inexperienced team might have the opposite effect.

And, although it might be undesirable, this might mean that testing results in an increase in uncertainty: we discover that what we knew about the product was even less accurate than we thought, or we discover that what the users want from the product is so different to what we thought we knew that our previous knowledge, planning, activities and so on were useless.

Perhaps we could go further?

Is it too much of a stretch to say that testing is inherently uncertainty-based and that this subsumes RBT? Well, it’s certainly not true that all testing is or can be uncertainty-based. For example, in some circumstances statutes or policy mandate that particular kinds of tests will be carried out. But it doesn’t feel particularly controversial to say that, while some strategies of testing focus more on some sections of the matrix, the vast majority of testing should have a focus on uncertainty.

When we employ RBT we are making assumptions about what are probably objective uncertainties to turn them into subjective risks, so that we can move forward. As above, it’s heuristic – we’ll make the best decisions we can in the context. In this spirit, RBT is a “good enough” form of prioritisation that assumes whatever it feels is reasonable about uncertainty to make it into risk.

But, if Stevenson is right, RBT has blind spots and it’s advantageous to be aware of them, not least because it makes the output from our testing – information – fuller, more balanced, more rounded. Good examples of that output will include some assessment of risk, some enumeration of the assumptions we made to generate it, a measure of the confidence in it and some idea of the areas in which there is still uncertainty.

And you can probably be sure that there is still uncertainty.

With enormous thanks to Joshua Lorden Raine without whom, I can state with certainty, this article would have been considerably poorer. And also to John Stevenson who was kind enough to talk through his article with me, and let me disagree with chunks of it in a very agreeable manner.

About the author

James is one of the founders of Linguamatics, the world leader in innovative natural language-based text mining. Over the years he’s had many roles in the company and is currently the test manager, a position in which he strives to provide an environment where his testers have an opportunity to do their best work. Find him: @qahiccupps and Hiccupps.