Why unhandled exceptions are useful

Posted by Simon Cooper on Simple Talk See other posts from Simple Talk or by Simon Cooper
Published on Mon, 03 Jun 2013 11:44:04 +0000 Indexed on 2013/06/24 16:33 UTC
Read the original article Hit count: 292

Filed under:

Uncategorized

It’s the bane of most programmers’ lives – an unhandled exception causes your application or webapp to crash, an ugly dialog gets displayed to the user, and they come complaining to you. Then, somehow, you need to figure out what went wrong. Hopefully, you’ve got a log file, or some other way of reporting unhandled exceptions (obligatory employer plug: SmartAssembly reports an application’s unhandled exceptions straight to you, along with the entire state of the stack and variables at that point). If not, you have to try and replicate it yourself, or do some psychic debugging to try and figure out what’s wrong.

However, it’s good that the program crashed. Or, more precisely, it is correct behaviour. An unhandled exception in your application means that, somewhere in your code, there is an assumption that you made that is actually invalid.

Coding assumptions

Let me explain a bit more. Every method, every line of code you write, depends on implicit assumptions that you have made. Take this following simple method, that copies a collection to an array and includes an item if it isn’t in the collection already, using a supplied IEqualityComparer:

public static T[] ToArrayWithItem(
    ICollection<T> coll, T obj, IEqualityComparer<T> comparer)
{
    // check if the object is in collection already
    // using the supplied comparer
    foreach (var item in coll)
    {
        if (comparer.Equals(item, obj))
        {
            // it's in the collection already
            // simply copy the collection to an array
            // and return it
            T[] array = new T[coll.Count];
            coll.CopyTo(array, 0);
            return array;
        }
    }

    // not in the collection
    // copy coll to an array, and add obj to it
    // then return it
    T[] array = new T[coll.Count+1];
    coll.CopyTo(array, 0);
    array[array.Length-1] = obj;
    return array;
}

What’s all the assumptions made by this fairly simple bit of code?

coll is never null
comparer is never null
coll.CopyTo(array, 0) will copy all the items in the collection into the array, in the order defined for the collection, starting at the first item in the array.
The enumerator for coll returns all the items in the collection, in the order defined for the collection
comparer.Equals returns true if the items are equal (for whatever definition of ‘equal’ the comparer uses), false otherwise
comparer.Equals, coll.CopyTo, and the coll enumerator will never throw an exception or hang for any possible input and any possible values of T
coll will have less than 4 billion items in it (this is a built-in limit of the CLR)
array won’t be more than 2GB, both on 32 and 64-bit systems, for any possible values of T (again, a limit of the CLR)
There are no threads that will modify coll while this method is running

and, more esoterically:

The C# compiler will compile this code to IL according to the C# specification
The CLR and JIT compiler will produce machine code to execute the IL on the user’s computer
The computer will execute the machine code correctly

That’s a lot of assumptions. Now, it could be that all these assumptions are valid for the situations this method is called. But if this does crash out with an exception, or crash later on, then that shows one of the assumptions has been invalidated somehow.

An unhandled exception shows that your code is running in a situation which you did not anticipate, and there is something about how your code runs that you do not understand. Debugging the problem is the process of learning more about the new situation and how your code interacts with it. When you understand the problem, the solution is (usually) obvious. The solution may be a one-line fix, the rewrite of a method or class, or a large-scale refactoring of the codebase, but whatever it is, the fix for the crash will incorporate the new information you’ve gained about your own code, along with the modified assumptions.

When code is running with an assumption or invariant it depended on broken, then the result is ‘undefined behaviour’. Anything can happen, up to and including formatting the entire disk or making the user’s computer sentient and start doing a good impression of Skynet. You might think that those can’t happen, but at Halting problem levels of generality, as soon as an assumption the code depended on is broken, the program can do anything. That is why it’s important to fail-fast and stop the program as soon as an invariant is broken, to minimise the damage that is done.

What does this mean in practice?

To start with, document and check your assumptions. As with most things, there is a level of judgement required. How you check and document your assumptions depends on how the code is used (that’s some more assumptions you’ve made), how likely it is a method will be passed invalid arguments or called in an invalid state, how likely it is the assumptions will be broken, how expensive it is to check the assumptions, and how bad things are likely to get if the assumptions are broken.

Now, some assumptions you can assume unless proven otherwise. You can safely assume the C# compiler, CLR, and computer all run the method correctly, unless you have evidence of a compiler, CLR or processor bug. You can also assume that interface implementations work the way you expect them to; implementing an interface is more than simply declaring methods with certain signatures in your type. The behaviour of those methods, and how they work, is part of the interface contract as well.

For example, for members of a public API, it is very important to document your assumptions and check your state before running the bulk of the method, throwing ArgumentException, ArgumentNullException, InvalidOperationException, or another exception type as appropriate if the input or state is wrong. For internal and private methods, it is less important. If a private method expects collection items in a certain order, then you don’t necessarily need to explicitly check it in code, but you can add comments or documentation specifying what state you expect the collection to be in at a certain point. That way, anyone debugging your code can immediately see what’s wrong if this does ever become an issue. You can also use DEBUG preprocessor blocks and Debug.Assert to document and check your assumptions without incurring a performance hit in release builds.

On my coding soapbox…

A few pet peeves of mine around assumptions. Firstly, catch-all try blocks:

try
{
    ...
}
catch { }

A catch-all hides exceptions generated by broken assumptions, and lets the program carry on in an unknown state. Later, an exception is likely to be generated due to further broken assumptions due to the unknown state, causing difficulties when debugging as the catch-all has hidden the original problem. It’s much better to let the program crash straight away, so you know where the problem is. You should only use a catch-all if you are sure that any exception generated in the try block is safe to ignore. That’s a pretty big ask!

Secondly, using as when you should be casting. Doing this:

(obj as IFoo).Method();

or this:

IFoo foo = obj as IFoo;
...
foo.Method();

when you should be doing this:

((IFoo)obj).Method();

or this:

IFoo foo = (IFoo)obj;
...
foo.Method();

There’s an assumption here that obj will always implement IFoo. If it doesn’t, then by using as instead of a cast you’ve turned an obvious InvalidCastException at the point of the cast that will probably tell you what type obj actually is, into a non-obvious NullReferenceException at some later point that gives you no information at all. If you believe obj is always an IFoo, then say so in code! Let it fail-fast if not, then it’s far easier to figure out what’s wrong.

Thirdly, document your assumptions. If an algorithm depends on a non-trivial relationship between several objects or variables, then say so. A single-line comment will do. Don’t leave it up to whoever’s debugging your code after you to figure it out.

Conclusion

It’s better to crash out and fail-fast when an assumption is broken. If it doesn’t, then there’s likely to be further crashes along the way that hide the original problem. Or, even worse, your program will be running in an undefined state, where anything can happen. Unhandled exceptions aren’t good per-se, but they give you some very useful information about your code that you didn’t know before. And that can only be a good thing.

Developer IT