Scenarios and invariants

How does a developer come up with code ? How does he know what to write, and if it’s a correct thing to do ? Spoiler alert: it’s much more art than science.

Most developers have a rough idea of what their code has to do, and what results are considered valid, but it’s usually very informal. As a result, developers tend to forget things and write part of their code as an afterthought, sometimes with side effects leading to regressions. Despite Edsger Dijkstra’s best efforts, formal verification remains largely unused in the software industry, essentially because of its cost in relation to most of the tools currently in use. So, to be perfectly honest, most developers don’t write correct code. There are ways however to reduce the number of mistakes we make, and the cost of producing and maintaining large code bases without formal verification. This article presents a lightweight formal method that can help verify whether a solution seems to be a good fit for a problem, before writing a single line of code. When I apply it, I consider my development time to be roughly 70% design and 30% coding. Needless to say, I spend a lot of time looking at the sky with a blank expression, sometimes talking to myself for improved feedback… To begin with, a developer would start the process as usual, a bit like a scientist would come up with a theory: by guessing. Nothing special there. But that’s when this method kicks in.

Scenarios

The first thing to do is to come up with a list of scenarios. Usually, the problem concerns a functionality with a set of code paths that may be taken, and I’ll define a scenario here as one of these paths. This is similar to use case driven development, although at a much lower level. Note that, at this point, the code doesn’t exist, so what I’m referring to only exists in the mind of the developer.

A common situation, especially in concurrent computing, is the combination of different operations, potentially in any order. For example, assume A and B are two atomic, mutually exclusive operations, that may occur in any order. There are two code paths to consider, two scenarios : A and then B, or B and then A. As a side note, this is what makes concurrent computing through shared memory inherently more difficult: critical sections combine, increasing complexity, and there is no real way to reduce it.

Invariants

While building the list of scenarios, the beginning of a solution should emerge in the creative mind of the developer. Patterns appear and seem true throughout the thought experiment of imagining the solution. They had better remain true at all times, or all hell breaks loose. These are invariants, and the developer should list them as well, since they will be the foundation on which the code will be built.

As an example of invariant, the cbuf module of my basic module library for C requires that the size of a circular buffer, defined as the difference between the end and start indexes, never exceeds the capacity of the buffer. This invariant is later used to build functions that check or update indexes. Invariants also provide a convenient way to check the code at runtime, for example using assertions.

The method

It is likely that the first version of these lists will be incomplete, as new scenarios and inconsistencies are identified. Remember that the purpose of this method is to help verify a solution, assuming we are mere mortals who often fail before succeeding. The idea is then to fix the solution until all invariants remain true for all scenarios. Here is some pseudo code :

def check_solution(solution):
    for scenario in solution.scenarios:
        for invariant in solution.invariants:
            if not check_invariant(invariant, scenario):
                return False

    return True

solution = guess_solution()

while not check_solution(solution):
    solution = improve_solution(solution)

write_solution(solution)

Conclusion

As with many formal methods, this may seem quite tedious at first. But I wouldn’t be talking about it if I didn’t feel it saved me a lot of time. And as formal methods go, this is a particularly lightweight one.

In the end, however, I want to insist on the importance of seeing the big picture, while minding the details, as well as imagination and creativity. As David Hilbert bluntly put it when a student of his had dropped out of mathematics to study poetry, “good, he did not have enough imagination to become a mathematician” ! I strongly feel the same applies for the development of quality code.