Skip to main content

Lua-Style Coroutines in C++

Lua's implementation of coroutines is one of my all-time favorite features of the language. This (short) paper explains the whole reasoning behind the Lua's coroutine implementation and also a little about the history of coroutines.

Sadly, coroutines are not supported out-of-the box by many modern languages, C++ included. Which brings me to the subject of this post: Lua-style coroutines in C++! For those who don't know (or were too lazy to read the paper!), Lua's coroutines support three basic operations:
  • Create: Create a new coroutine object
  • Resume: Run a coroutine until it yields or returns
  • Yield: Suspend execution and return to the caller
To implement these three operations, I'll use a great header file: ucontext.h.
#include <vector>
#include <ucontext.h>

class Coroutine {
public:
    typedef void (*Function)(void);

    Coroutine(Function function);
    void resume();
    static void yield();

private:
    ucontext_t context_;
    std::vector&;t;char> stack_;

    static ucontext_t caller_;
    static ucontext_t callee_;
};

To start us off: a Coroutine class definition. Don't worry about the instance variables yet, we'll get to those later. I've chosen to map Lua's coroutine.create to the constructor, and coroutine.yield to Coroutine::yield.
Coroutine::Coroutine(Function function) :
    stack_(4096)
        
    getcontext(&context_);
    context_.uc_stack.ss_sp = &stack_[0];
    context_.uc_stack.ss_size = stack_.size();
    context_.uc_link = 0;
    makecontext(&context_, function, 0);
}
Here's the constructor. Notice how a 4K stack is allocated, and used to create a new context for our coroutine. Each coroutine needs to have its own stack, because the yield operation can occur at any time, even within a function that was called by the coroutine. Compare this to Python's generators, which can only yield in the top-level stack frame.

Now for the fun part! To resume a coroutine A, we need to save the context of the coroutine that called A first. Then we can run A until it yields or returns, and restore the previous context. The function swapcontext copies the current context into the first argument, and then swaps to the context specified by the second argument. In the code below, caller_ always stores the caller of the current context, and the callee_ is used to store the context of a coroutine right before it yields.
void Coroutine::resume() {
    ucontext_t save = caller_;
    // The caller context becomes the 
    // current context
    swapcontext(&caller_, &context_);
    context_ = callee_;
    caller_ = save;
}
The yield function is pretty easy. It copies the current context into callee_, and then restores the caller's context:
void Coroutine::yield() {
    swapcontext(&callee_, &caller_);
}
Done! Well almost...how about a quick demo:
void bar() {
    std::cout << "x" << std::endl;
    Coroutine::yield();
    std::cout << "z" << std::endl;
}

void foo() {
    // Start a nested coroutine from inside the 
    // current one
    Coroutine co2(bar);
    co2.resume();

    std::cout << "a" << std::endl;
    Coroutine::yield();
    std::cout << "b" << std::endl;
    Coroutine::yield();
    std::cout << "c" << std::endl;

    co2.resume();
}

int main(int argc, char** argv) {
    Coroutine co1(foo);
    co1.resume();
    co1.resume();
    co1.resume();
    std::cout << "DONE" << std::endl;
    return 0;
}

Here's the output for this short example:
x
a
b
c
z
DONE
Notice how the coroutine co2 is actually started inside of co1. In fact, the coroutines can even be passed around, returned from functions, and copied! (However, the copying is pretty expensive unless you use a copy-on-write stack). I think it might even be possible to implement software transactional memory with this coroutine class and some atomic test-and-set magic.

That's it!  Pretty simple.  Now if only Javascript had coroutines, maybe we could avoid all that crazy callback nesting.

Source: https://github.com/mfichman/lua-style-coroutines

EDIT: Coroutines are also possible to implement without ucontext.h, but you need to save the registers, flags, and stack pointer yourself. It's not really that hard, it just takes a bit of assembly.

EDIT: True Lua-style coroutines can pass parameters through the yield() and resume() calls, but that's pretty simple to implement with some template code, and I didn't want to clutter post.

Comments

  1. Sweet.

    There is however one big question -- how does it cooperates with C++ exception handling and all related stack magic?

    ReplyDelete
  2. It doesn't cooperate right now. I think that if I added a function that called the user's function indirectly, I could put a catch-all block there to save and then re-propagate the exception back to the user. Thanks for the comment!

    ReplyDelete
  3. uC++ addresses this issue pretty well: http://plg.uwaterloo.ca/~usystem/uC++.html

    ReplyDelete
  4. Nice! I'd never heard of uC++. It looks pretty interesting.

    ReplyDelete
  5. Is there any way to kill coroutine?

    ReplyDelete
  6. I run your code.
    and gdb in your code , I got

    x
    a
    b
    c
    z

    Program exited with code 0340.

    there seems to be a problem here.

    ReplyDelete
  7. it seems
    before each resume function return
    you have to call yield to swapcontext back.

    add Coroutine::yield() at the end of foo() and bar()
    the program would be fine .

    please mail me if I am wrong.

    ReplyDelete
  8. Apparently there is not mechanism to set status to FINISHED

    ReplyDelete

Post a Comment

Popular posts from this blog

Password Generator for Chrome

This week, I finally got fed up with typing in/managing passwords on a billion different sites. Since things like OpenID haven't really taken off, I decided to take matters into my own hands...and write a password generator extension for Google Chrome. There are actually a ton of such apps on the Chrome web store, but I'm paranoid about security, so I wrote my own and open-sourced it. By virtue of being open source, perhaps people will trust my version a bit more. Anyway, the extension is available here , and the source code is hosted at github . May all your online transactions be secure! UPDATE: Fixed github link.

Sparse Voxel Octrees

Terrain implementation for games is a subject with a lot of depth. At the surface, it's very easy to get rudimentary terrain working via a noise function and fixed triangle grid. As you add more features, though, the complexity builds quickly. Overhangs, caves Multiple materials Destructive terrain Collision detection Persistence Dynamic loading Regions, biomes Very large scale Smooth features Sharp features Dynamic level of detail Extremely long view distances In isolation, each of these features can make terrain difficult to implement. Taken together, it takes a lot of care and attention to make it all work. Minecraft is probably the canonical example for terrain generation in the last generation of games. In fact, I'd say Minecraft's terrain is the killer feature that makes the game so great. Minecraft does an excellent job at 1-8, but for a recent planetary renderer project I was working on, I really wanted 9-12 too. In a series of articles, I'm planning to break do