A common sequence of steps we mortal software developers frequently find ourselves implementing goes something like this:

  1. Perform some sort of setup or acquire some sort of resource.
  2. Carry out some arbitrary sequence of actions.
  3. Tear down things we set up or release resources we acquired in step 1.

There are well-known patterns for implementing this scenario robustly, but things get more complicated when the setup phase consists of multiple sub-steps, each of which can fail individually. This article presents a concise, self-documenting and robust way to handle these more complicated cases. A follow-up article will extend this further to improve some performance characteristics, ending up with something that has a lot in common with the ScopeGuard11 pattern described in various places online.

The multi-step setup problem

Conceptually, the problem we want to solve can be described as follows:

  1. For each setup sub-step:
    • Perform sub-step.
    • If sub-step fails, stop and release/clean up after all previous setup sub-steps.
  2. Carry out some arbitrary sequence of actions.
  3. Tear down things we set up or release resources we acquired in all sub-steps of step 1.

In the rest of this article, we will use a specific example scenario to highlight the different techniques discussed. The scenario is to implement a function which runs some kind of test on a device and then returns the device to its original state. The device requires multiple actions to be performed on it before the test can run and each of those actions can fail, leading to the test being aborted. In addition, the test can throw exceptions or fail in other ways. The specific set of steps we will consider is as follows:

  1. Connect to the device.
  2. Configure the device for test mode.
  3. Initiate the test.
  4. Monitor the device during the test to ensure correct operation.
  5. Upon successful completion of the test, reset the device back to normal mode.

The naive approach

To highlight the problems this scenario raises, let’s look first at a naive implementation. We will discuss its robustness issues shortly.

// Return true on success, false on failure
bool testDevice(Device& dev)
{
    // Setup sub-step A
    if (!dev.connect())
        return false;

    // Setup sub-step B
    if (!dev.setupForTest())
    {
        dev.close();
        return false;
    }

    // Setup sub-step C
    if (!dev.startTest())
    {
        dev.setupForNormalUse();
        dev.close();
        return false;
    }

    bool testPassed = false;
    // Monitor test status for a while
    // and set value of testPassed to true
    // if the test was successful.
    // ...

    // Cleanup. We assume none of these
    // calls fail or throw!
    dev.stopTest();
    dev.setupForNormalUse();
    dev.close();

    return testPassed;
}

Looking at the above, we can see that for each sub-step, we have to include code to perform cleanup for each previous sub-step in the event of failure. The more sub-steps we have, the more out of control this duplication of cleanup code becomes. A more serious issue, however, is that this code is not exception-safe. If an exception is thrown at any point, none of the cleanup steps will be performed.
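To make the exception problem concrete, here is a minimal sketch. FakeDevice is a hypothetical stand-in for Device which merely records the calls made on it, and the failDuringMonitoring flag is an assumption added purely to simulate an exception being thrown while the test is being monitored.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for Device: records every call made on it.
struct FakeDevice
{
    std::vector<std::string> calls;

    bool connect()           { calls.push_back("connect");           return true; }
    bool setupForTest()      { calls.push_back("setupForTest");      return true; }
    bool startTest()         { calls.push_back("startTest");         return true; }
    void stopTest()          { calls.push_back("stopTest"); }
    void setupForNormalUse() { calls.push_back("setupForNormalUse"); }
    void close()             { calls.push_back("close"); }
};

// Same shape as the naive testDevice(), with the monitoring phase
// reduced to an optional throw.
bool naiveTestDevice(FakeDevice& dev, bool failDuringMonitoring)
{
    if (!dev.connect())
        return false;

    if (!dev.setupForTest())
    {
        dev.close();
        return false;
    }

    if (!dev.startTest())
    {
        dev.setupForNormalUse();
        dev.close();
        return false;
    }

    if (failDuringMonitoring)
        throw std::runtime_error("monitoring failed");

    // Cleanup is only reached if nothing above threw
    dev.stopTest();
    dev.setupForNormalUse();
    dev.close();
    return true;
}

// Runs the naive function with a simulated failure and reports
// which device calls were actually made.
std::vector<std::string> callsAfterMonitoringFailure()
{
    FakeDevice dev;
    try
    {
        naiveTestDevice(dev, true);
    }
    catch (const std::runtime_error&)
    {
        // Swallow the exception so we can inspect the call log
    }
    return dev.calls;
}
```

With the simulated failure, the recorded calls stop at startTest: none of stopTest, setupForNormalUse or close is ever invoked, so the device is left connected and in test mode.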

A less naive approach: RAII

To address the issues of code duplication and robustness to exceptions, we would typically turn to RAII (Resource Acquisition Is Initialization). In simple terms for our purposes here, the RAII technique boils down to creating a local object which acquires a resource in its constructor and releases that resource again in its destructor. Performing a setup step can be thought of as acquiring a resource, and executing a cleanup step as releasing one.

A common example of RAII is the use of a locker object to control the state of a mutex for some block of code. The locker object’s constructor locks the mutex, while its destructor unlocks that mutex again. For example:

std::mutex mutex;

void func()
{
    // Step 1: The lock_guard constructor will
    // lock the mutex
    std::lock_guard<std::mutex> locker(mutex);

    // Step 2: Execute some code which might throw
    // an exception or return at any point
    // ...

    // Step 3: mutex is automatically unlocked by
    // locker's destructor when it goes out
    // of scope
}

In the above example, if an exception is thrown in step 2 or when the function returns for any other reason, locker’s destructor will always unlock the mutex. Hence, the locker object is robustly ensuring that the resource release is always performed.

Applying this technique to our more complex example scenario, one possible implementation might look something like this:

// Return true on success, false on failure
bool testDevice(Device& dev)
{
    RAII_Connect connect(dev);
    if (!connect.success())
        return false;

    RAII_SetupForTest setup(dev);
    if (!setup.success())
        return false;

    RAII_StartTest startTest(dev);
    if (!startTest.success())
        return false;

    bool testPassed = false;
    // Monitor test status for a while
    // ...

    // The RAII destructors will take
    // care of all required cleanup
    return testPassed;
}

This now robustly handles exceptions being thrown at any point and we have avoided duplicating the cleanup code for each sub-step. But wait, we don’t have any actual code which performs each sub-step, nor the associated cleanup for them! All we’ve done is hide the code behind three new classes which have to be defined elsewhere, which we’ve conveniently omitted from the above. Furthermore, each class is defined separately from the rest, so we lose the flow of the actual code being executed.
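To give a feel for what is being hidden, one of those omitted classes might look something like the following sketch. The Device shown here is a hypothetical minimal stand-in with only connect() and close(), and the details of RAII_Connect are assumptions rather than a definitive implementation.

```cpp
// Hypothetical minimal Device, just enough to show the pattern.
struct Device
{
    bool connected = false;

    bool connect() { connected = true; return true; }
    void close()   { connected = false; }
};

// Connects in the constructor and, if that worked,
// closes the connection again in the destructor.
class RAII_Connect
{
public:
    explicit RAII_Connect(Device& dev) :
        m_dev(dev),
        m_success(dev.connect())
    {
    }

    ~RAII_Connect()
    {
        // Only undo the setup step if it actually succeeded
        if (m_success)
            m_dev.close();
    }

    // Prevent copying so the cleanup only runs once
    RAII_Connect(const RAII_Connect&) = delete;
    RAII_Connect& operator=(const RAII_Connect&) = delete;

    bool success() const { return m_success; }

private:
    Device& m_dev;
    bool    m_success;
};
```

RAII_SetupForTest and RAII_StartTest would follow the same shape, which means roughly this much boilerplate again for every sub-step.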

This highlights one of the drawbacks to RAII when applied in this way, namely that it can result in distributing the real code across helper classes and obscure what is really going on. But C++11 gives us some tools which lead to better solutions.

Keep code in place using lambdas

Rather than shifting the code for each sub-step out to separate classes, we can use lambda functions to define the code directly in place. We then use a single helper class to control when those lambdas are invoked to give us the acquire/release or setup/cleanup behaviour we want. A common name for this helper class is ScopeGuard and you can find plenty of material online about it, including a talk by Andrei Alexandrescu covering the topic. In this article, we will use the name OnLeavingScope for reasons which will become clear shortly.

The basic idea is that we pass the lambda as an argument to OnLeavingScope’s constructor. A copy of that lambda is stored and only invoked in OnLeavingScope’s destructor. When creating an OnLeavingScope object, the code then naturally reads in a way which says exactly what the code is setting up. Showing how our example scenario looks with this approach should make it clearer:

bool testDevice(Device& dev)
{
    if (!dev.connect())
        return false;
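    // Capture dev by reference so the cleanup acts on
    // the caller's device rather than on a copy of it
    OnLeavingScope disconnect([&dev]
    { dev.disconnect(); });

    if (!dev.setupForTest())
        return false;
    OnLeavingScope restoreNormalMode([&dev]
    { dev.setupForNormalUse(); });

    if (!dev.startTest())
        return false;
    OnLeavingScope stopTest([&dev]
    { dev.stopTest(); });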

    bool testPassed = false;
    // Monitor test status for a while
    // ...

    // Cleanup is done automatically by
    // OnLeavingScope objects' destructors.
    return testPassed;
}

This is very concise and expresses exactly what we want to happen with minimal extra code getting in the way. The name we give the local variable is like a summary name for the action to be performed when it goes out of scope. The lambda we give its constructor is then the implementation of that action. The pattern is thus OnLeavingScope summary(cleanup-code).

The above example has three OnLeavingScope objects. When the function returns or an exception is thrown at any point, each OnLeavingScope object created up to that point will be destroyed, in the reverse of the order in which they were created. This is perfect for us, since it ensures we perform cleanup in the reverse order of our setup sub-steps.
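The destruction order relied upon here is a general language rule for local objects rather than anything specific to OnLeavingScope. The following sketch illustrates it with a hypothetical Tracer class which simply appends its label to a log when destroyed. The runSteps() function and its stepsToRun parameter are likewise made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends its label to a log when destroyed.
class Tracer
{
public:
    Tracer(std::vector<std::string>& log, const std::string& label) :
        m_log(log),
        m_label(label)
    {
    }

    ~Tracer() { m_log.push_back(m_label); }

    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;

private:
    std::vector<std::string>& m_log;
    std::string               m_label;
};

// Creates up to three tracers, returning early once
// stepsToRun of them have been created.
void runSteps(int stepsToRun, std::vector<std::string>& log)
{
    Tracer first(log, "first");
    if (stepsToRun < 2)
        return;

    Tracer second(log, "second");
    if (stepsToRun < 3)
        return;

    Tracer third(log, "third");
}
```

Calling runSteps(3, log) leaves the log holding third, second, first, while runSteps(2, log) produces only second, first. Objects which were never created are never destroyed, which is why an early return only cleans up after the sub-steps that actually completed.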

So, we’ve achieved our goal of avoiding code duplication and ensuring robustness. As a bonus, we also improved readability along the way. Now we just need to define the OnLeavingScope class and we have all the pieces of the puzzle.

OnLeavingScope implementation

If you look at the discussion around ScopeGuard online, the material can get a bit involved. The focus of this article is to provide a robust, flexible approach which is not too onerous to learn. We will get into all the nitty-gritty details in a follow-up article, so don’t worry, you won’t be missing out!

For a basic implementation, C++11 gives us some further syntactic sugar which makes defining OnLeavingScope relatively easy. A robust and flexible, yet fairly simple implementation looks something like this:

#include <functional>

class OnLeavingScope
{
public:
    // Use std::function so we can support
    // any function-like object
    using Func = std::function<void()>;

    // Prevent copying
    OnLeavingScope(const OnLeavingScope&) = delete;
    OnLeavingScope& operator=(const OnLeavingScope&) = delete;

    OnLeavingScope(const Func& f) :m_func(f) {}
   ~OnLeavingScope() { m_func(); }

private:
    Func m_func;
};

The constructor accepts a function object, a copy of which we store and then invoke in the destructor. Normally, we’d expect client code to pass a lambda to the constructor, but we can accept any function object without any loss of simplicity or robustness. The stored type is std::function<void()>, which is already pretty self-documenting: anything that can be called with no arguments can be used, whether a plain function, a function object or a lambda, and any value it returns is ignored.
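As a rough illustration of that flexibility, the sketch below passes a plain function, a hand-written function object and a lambda to the same class. The OnLeavingScope definition is repeated only so that the example compiles on its own, and cleanupLog(), freeFunctionCleanup(), FunctorCleanup and useAllThree() are hypothetical names invented for the demonstration.

```cpp
#include <functional>
#include <string>
#include <vector>

// OnLeavingScope as defined in the article, repeated
// here so that this example compiles on its own.
class OnLeavingScope
{
public:
    using Func = std::function<void()>;

    OnLeavingScope(const OnLeavingScope&) = delete;
    OnLeavingScope& operator=(const OnLeavingScope&) = delete;

    OnLeavingScope(const Func& f) :m_func(f) {}
   ~OnLeavingScope() { m_func(); }

private:
    Func m_func;
};

// Hypothetical log recording which cleanup actions ran
std::vector<std::string>& cleanupLog()
{
    static std::vector<std::string> log;
    return log;
}

// 1. A plain function
void freeFunctionCleanup()
{
    cleanupLog().push_back("free function");
}

// 2. A hand-written function object
struct FunctorCleanup
{
    void operator()() const { cleanupLog().push_back("functor"); }
};

void useAllThree()
{
    OnLeavingScope a(freeFunctionCleanup);

    FunctorCleanup functor;
    OnLeavingScope b(functor);

    // 3. A lambda
    OnLeavingScope c([] { cleanupLog().push_back("lambda"); });
}
```

After useAllThree() returns, the log holds lambda, functor, free function, in that order, since the three objects are destroyed in the reverse of their creation order.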

The other key point to observe is that we ensure OnLeavingScope objects can never be copied. If we allowed copying, the function object would be invoked in the destructor of each copy, but this class is meant to encapsulate a function to be performed only once when the OnLeavingScope object is destroyed.
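One way to confirm that copying really is prevented is to ask the compiler directly using the standard type traits, as in the following sketch. The class is again repeated so the example stands alone, and invocationsForOneGuard() is a hypothetical helper added for illustration.

```cpp
#include <functional>
#include <type_traits>

// OnLeavingScope as defined in the article, repeated
// here so that this example compiles on its own.
class OnLeavingScope
{
public:
    using Func = std::function<void()>;

    OnLeavingScope(const OnLeavingScope&) = delete;
    OnLeavingScope& operator=(const OnLeavingScope&) = delete;

    OnLeavingScope(const Func& f) :m_func(f) {}
   ~OnLeavingScope() { m_func(); }

private:
    Func m_func;
};

// Compilation fails here if copying ever becomes possible
static_assert(!std::is_copy_constructible<OnLeavingScope>::value,
              "OnLeavingScope must not be copy constructible");
static_assert(!std::is_copy_assignable<OnLeavingScope>::value,
              "OnLeavingScope must not be copy assignable");

// Counts how many times the cleanup runs for a single guard
int invocationsForOneGuard()
{
    int count = 0;
    {
        OnLeavingScope guard([&count] { ++count; });

        // The following line would not compile, because
        // the copy constructor is deleted:
        // OnLeavingScope copy(guard);
    }
    return count;
}
```

Since no copies can exist, the cleanup function runs exactly once per guard.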

And that’s all we need to be able to start using OnLeavingScope in our code to encapsulate cleanup tasks to be done in a robust, straightforward way.
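Putting the pieces together, the sketch below runs the device scenario end to end. FakeDevice is a hypothetical stand-in for Device which records the calls made on it, and its failSetup flag is an assumption used to force the second setup sub-step to fail. The lambdas capture dev by reference so that cleanup acts on the caller's device, and the monitoring phase is reduced to simply reporting success.

```cpp
#include <functional>
#include <string>
#include <vector>

// OnLeavingScope as defined in the article, repeated
// here so that this example compiles on its own.
class OnLeavingScope
{
public:
    using Func = std::function<void()>;

    OnLeavingScope(const OnLeavingScope&) = delete;
    OnLeavingScope& operator=(const OnLeavingScope&) = delete;

    OnLeavingScope(const Func& f) :m_func(f) {}
   ~OnLeavingScope() { m_func(); }

private:
    Func m_func;
};

// Hypothetical stand-in for Device: records every call made
// on it and can be told to fail the second setup sub-step.
struct FakeDevice
{
    bool failSetup = false;
    std::vector<std::string> calls;

    bool connect()           { calls.push_back("connect");      return true; }
    bool setupForTest()      { calls.push_back("setupForTest"); return !failSetup; }
    bool startTest()         { calls.push_back("startTest");    return true; }
    void stopTest()          { calls.push_back("stopTest"); }
    void setupForNormalUse() { calls.push_back("setupForNormalUse"); }
    void disconnect()        { calls.push_back("disconnect"); }
};

bool testDevice(FakeDevice& dev)
{
    if (!dev.connect())
        return false;
    OnLeavingScope disconnect([&dev] { dev.disconnect(); });

    if (!dev.setupForTest())
        return false;
    OnLeavingScope restoreNormalMode([&dev] { dev.setupForNormalUse(); });

    if (!dev.startTest())
        return false;
    OnLeavingScope stopTest([&dev] { dev.stopTest(); });

    // Pretend the monitoring phase ran and the test passed
    return true;
}
```

On a device where everything succeeds, the recorded calls are connect, setupForTest, startTest, stopTest, setupForNormalUse, disconnect. With failSetup set to true, the function returns false and the calls are just connect, setupForTest, disconnect: only the cleanup for the sub-step that completed is performed.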

But there’s more to the story…

As presented, the OnLeavingScope implementation can still be improved further. We have not explored adding a move constructor or differentiating between lvalue and rvalue function objects passed to the constructor. Furthermore, the use of std::function comes with its own advantages and disadvantages, some of which may be more or less desirable in certain situations. These and related matters are fairly advanced topics and will be discussed in an upcoming article. That article introduces a slightly different way of creating the OnLeavingScope objects, but is otherwise just an optimized version of the material presented here, and it ends up being very similar to the ScopeGuard implementations.