Skip to main content

Lua-Style Coroutines in C++

Lua's implementation of coroutines is one of my all-time favorite features of the language. This (short) paper explains the whole reasoning behind the Lua's coroutine implementation and also a little about the history of coroutines.

Sadly, coroutines are not supported out-of-the box by many modern languages, C++ included. Which brings me to the subject of this post: Lua-style coroutines in C++! For those who don't know (or were too lazy to read the paper!), Lua's coroutines support three basic operations:
  • Create: Create a new coroutine object
  • Resume: Run a coroutine until it yields or returns
  • Yield: Suspend execution and return to the caller
To implement these three operations, I'll use a great header file: ucontext.h.
#include <vector>
#include <ucontext.h>

class Coroutine {
public:
    typedef void (*Function)(void);

    Coroutine(Function function);
    void resume();
    static void yield();

private:
    ucontext_t context_;
    std::vector&;t;char> stack_;

    static ucontext_t caller_;
    static ucontext_t callee_;
};

To start us off: a Coroutine class definition. Don't worry about the instance variables yet, we'll get to those later. I've chosen to map Lua's coroutine.create to the constructor, and coroutine.yield to Coroutine::yield.
Coroutine::Coroutine(Function function) :
    stack_(4096)
        
    getcontext(&context_);
    context_.uc_stack.ss_sp = &stack_[0];
    context_.uc_stack.ss_size = stack_.size();
    context_.uc_link = 0;
    makecontext(&context_, function, 0);
}
Here's the constructor. Notice how a 4K stack is allocated, and used to create a new context for our coroutine. Each coroutine needs to have its own stack, because the yield operation can occur at any time, even within a function that was called by the coroutine. Compare this to Python's generators, which can only yield in the top-level stack frame.

Now for the fun part! To resume a coroutine A, we need to save the context of the coroutine that called A first. Then we can run A until it yields or returns, and restore the previous context. The function swapcontext copies the current context into the first argument, and then swaps to the context specified by the second argument. In the code below, caller_ always stores the caller of the current context, and the callee_ is used to store the context of a coroutine right before it yields.
void Coroutine::resume() {
    ucontext_t save = caller_;
    // The caller context becomes the 
    // current context
    swapcontext(&caller_, &context_);
    context_ = callee_;
    caller_ = save;
}
The yield function is pretty easy. It copies the current context into callee_, and then restores the caller's context:
void Coroutine::yield() {
    swapcontext(&callee_, &caller_);
}
Done! Well almost...how about a quick demo:
void bar() {
    std::cout << "x" << std::endl;
    Coroutine::yield();
    std::cout << "z" << std::endl;
}

void foo() {
    // Start a nested coroutine from inside the 
    // current one
    Coroutine co2(bar);
    co2.resume();

    std::cout << "a" << std::endl;
    Coroutine::yield();
    std::cout << "b" << std::endl;
    Coroutine::yield();
    std::cout << "c" << std::endl;

    co2.resume();
}

int main(int argc, char** argv) {
    Coroutine co1(foo);
    co1.resume();
    co1.resume();
    co1.resume();
    std::cout << "DONE" << std::endl;
    return 0;
}

Here's the output for this short example:
x
a
b
c
z
DONE
Notice how the coroutine co2 is actually started inside of co1. In fact, the coroutines can even be passed around, returned from functions, and copied! (However, the copying is pretty expensive unless you use a copy-on-write stack). I think it might even be possible to implement software transactional memory with this coroutine class and some atomic test-and-set magic.

That's it!  Pretty simple.  Now if only Javascript had coroutines, maybe we could avoid all that crazy callback nesting.

Source: https://github.com/mfichman/lua-style-coroutines

EDIT: Coroutines are also possible to implement without ucontext.h, but you need to save the registers, flags, and stack pointer yourself. It's not really that hard, it just takes a bit of assembly.

EDIT: True Lua-style coroutines can pass parameters through the yield() and resume() calls, but that's pretty simple to implement with some template code, and I didn't want to clutter post.

Comments

  1. Sweet.

    There is however one big question -- how does it cooperates with C++ exception handling and all related stack magic?

    ReplyDelete
  2. It doesn't cooperate right now. I think that if I added a function that called the user's function indirectly, I could put a catch-all block there to save and then re-propagate the exception back to the user. Thanks for the comment!

    ReplyDelete
  3. uC++ addresses this issue pretty well: http://plg.uwaterloo.ca/~usystem/uC++.html

    ReplyDelete
  4. Nice! I'd never heard of uC++. It looks pretty interesting.

    ReplyDelete
  5. Is there any way to kill coroutine?

    ReplyDelete
  6. I run your code.
    and gdb in your code , I got

    x
    a
    b
    c
    z

    Program exited with code 0340.

    there seems to be a problem here.

    ReplyDelete
  7. it seems
    before each resume function return
    you have to call yield to swapcontext back.

    add Coroutine::yield() at the end of foo() and bar()
    the program would be fine .

    please mail me if I am wrong.

    ReplyDelete
  8. Apparently there is not mechanism to set status to FINISHED

    ReplyDelete

Post a Comment

Popular posts from this blog

Criterium: Road Screenshots

I finally got around to making the road tool for Criterium.  The tool has two parts: a Java application that lets you paint roads on the 2D heightmap texture, and a Ogre-based tool that automatically converts a 2D path into a 3D mesh.  The Ogre-tool queries the heightmap to get the height of the road, and performs smoothing so there are no discontinuous road segments.  I've posted a screenshot below.  Also, I've got my GIMP terrain shown in the screenshot.  I generated it using random noise and the GIMP lightmap filter.

Jet: Particle Systems

Here's a demo of the new particle systems I've implemented in OpenGL.  Performance is much improved over the DirectX version.  Particles are initialized in C++ rather than in Lua.  Also, I use two particle buffers and swap between them, rather than using one buffer per particle system.  Anyway, here's a video capture: