How does the stack-copying technique (as used in lthread) compare to segmented stacks (as used in Go)? The swap technique is very simple, but I would assume that the stack copy shows up in profiles and is significantly slower than the segmented-stack technique. Are there any benchmarks?
Segmented stacks will take much more space. One of the reasons I went with stack copy is that it lets me create a million lthreads if I want to, without worrying about memory.
Lthread is in C, an environment where you control and manage your own memory. Using slab allocators and a high-performance malloc like jemalloc, you can avoid allocating a lot of your variables on the stack, and the stack copy becomes minimal. In most of the production code I have running on lthread, the stack copy averages around 300 bytes.
Lthread is a cool experiment, but coroutines in C are a dubious idea. If you're writing a program in C/C++, presumably you want ultimate control and the fastest possible execution. You wouldn't introduce extra runtime complexity and a speed hit for easier-to-use APIs (blocking I/O functions). You'd just use epoll and non-blocking sockets (or libuv). That's not to say that coroutines/green threads/goroutines aren't useful in high-level languages, where you've already decided to take a speed hit for ease of use. Said another way: if your concern is easy-to-read code, C is not the right choice; if your concern is speed, copying a stack on every I/O function is not the right choice.
When you said segmented stacks, I read it as a stack per lthread. Ignore my previous reply.
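For reference, here is a minimal sketch of the "epoll and non-blocking sockets" style mentioned above. It is Linux-only, uses a plain file descriptor rather than a real socket, and the helper names `set_nonblocking` and `wait_and_read` are made up for illustration. A real server would keep one long-lived epoll instance and dispatch many descriptors from it instead of creating one per call.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode.
 * Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then issue a single read(). Returns the byte count from read(), or -1
 * on error. A timeout is reported as -1 with errno set to EAGAIN. */
static ssize_t wait_and_read(int fd, void *buf, size_t len, int timeout_ms)
{
    int ep = epoll_create1(0);
    if (ep == -1)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event out;
    int ready = -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0)
        ready = epoll_wait(ep, &out, 1, timeout_ms);
    close(ep);

    if (ready == 1)
        return read(fd, buf, len);
    if (ready == 0)
        errno = EAGAIN; /* nothing became readable in time */
    return -1;
}
```

With a non-blocking descriptor, a bare `read()` on an empty pipe or socket fails immediately with `EAGAIN` instead of sleeping, which is the point where a callback-style program returns to its event loop.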
Segmented stacks aren't possible in C, because you don't have control over the stack's growth and the MMU when a function is running.
Coroutines simplify your code a lot compared to callbacks (via libuv or epoll), and with lthread you barely take any performance hit if you don't have sizable variables on the stack. I've written web servers and proxies in lthread that perform as fast as nginx (and sometimes faster). I think that's a pretty good deal.
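To illustrate why coroutine code reads sequentially, here is a small sketch using the POSIX `ucontext` calls. Note that this is not lthread's API and not its mechanism: it gives the coroutine its own fixed stack instead of copying a shared one, and every name in it (`worker`, `run_worker`, `yield_to_scheduler`) is invented for the example.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t sched_ctx, co_ctx;
static char co_stack[64 * 1024]; /* dedicated stack for the one coroutine */
static int steps_wanted, steps_done, finished;

/* Suspend the coroutine and switch back to the scheduler loop. */
static void yield_to_scheduler(void)
{
    swapcontext(&co_ctx, &sched_ctx);
}

/* Reads as straight-line code: the loop counter lives on the coroutine's
 * stack and survives every yield, so no callback state is needed. */
static void worker(void)
{
    for (int i = 0; i < steps_wanted; i++) {
        steps_done++;          /* stand-in for "handle one I/O event" */
        yield_to_scheduler();  /* stand-in for "the socket would block" */
    }
    finished = 1;
}

/* Run worker to completion; returns how many times it was resumed,
 * or -1 if the context could not be initialized. */
static int run_worker(int steps)
{
    steps_wanted = steps;
    steps_done = 0;
    finished = 0;

    if (getcontext(&co_ctx) == -1)
        return -1;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &sched_ctx; /* where to go when worker returns */
    makecontext(&co_ctx, worker, 0);

    int resumes = 0;
    while (!finished) {
        swapcontext(&sched_ctx, &co_ctx); /* resume the coroutine */
        resumes++;
    }
    return resumes;
}
```

The callback equivalent would have to keep `i` in a heap-allocated state struct and re-enter a handler function on each event, which is the bookkeeping coroutines remove.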
> You wouldn't introduce extra runtime complexity and a speed hit for easier to use APIs (blocking I/O functions).
Your program is running on top of a kernel. You have already lost a lot of performance, and the complexity has already been introduced. The kernel does a gazillion things while running your program, so if you really want raw performance, run your code on a bare-metal CPU without kernel intervention. But that's not convenient, so you compromise, pay a penalty, and run your program on a kernel. Does that mean you might as well use a high-level language because you compromised on performance? No.
> if your concern is about easy to read code, C is not the right choice
I disagree. They aren't mutually exclusive.
> if your concern is speed: copying a stack on every i/o function is not the right choice.
lthread doesn't copy a stack on every I/O function call. The lthread stack is copied only after the socket blocks or the lthread has used up its fair share. Sometimes this means an lthread can make 2 to 5 calls before it yields. When it does yield, the stack is copied from wherever you are in your call chain, so the size depends on how deep you are. If you don't have big variables on your stack, the stack copy can be just a few bytes. Note that it's not the whole scheduler stack that's copied (4MB by default), but only what was consumed by the lthread (usually from a few bytes to 300 bytes; it varies with the code).
You are oversimplifying your choices to this: if you want performance, use C and callbacks; otherwise, use a high-level language. There's a wide spectrum that you are missing in this oversimplification.
Grow up. C programs are ported painfully to each operating system. This is the reality of programming in C. If the author does not support your system, he simply has not spent the time to port it. Proper autoconf tests aren't going to magically make software run on the BSDs, because the proper test won't be apparent until the author attempts to compile on the BSDs. Does it suck? Yes. Has it ever been any different? No.
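The "copy only what was consumed" idea can be sketched roughly like this. It is a simplified model, not lthread's actual code: it assumes a downward-growing shared stack, takes the stack pointer as an ordinary argument instead of reading it from the CPU, and the type and function names (`saved_stack`, `stack_save`, `stack_restore`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-coroutine record of the bytes it had on the shared stack.
 * Must start zero-initialized so that data is NULL on first save. */
struct saved_stack {
    unsigned char *data; /* heap copy of the used stack region */
    size_t used;         /* number of bytes in that copy */
};

/* On yield: copy only the used region [sp, top) of the shared stack.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int stack_save(struct saved_stack *s,
                      const unsigned char *sp, const unsigned char *top)
{
    assert(sp <= top); /* stack grows down, so sp sits at or below top */
    size_t used = (size_t)(top - sp);

    unsigned char *p = realloc(s->data, used ? used : 1);
    if (p == NULL)
        return -1;
    memcpy(p, sp, used);
    s->data = p;
    s->used = used;
    return 0;
}

/* On resume: copy the saved bytes back to the same addresses just below
 * top, and return the stack pointer value to continue from. */
static unsigned char *stack_restore(const struct saved_stack *s,
                                    unsigned char *top)
{
    unsigned char *sp = top - s->used;
    memcpy(sp, s->data, s->used);
    return sp;
}
```

In this model, a coroutine that yields with 300 bytes of live stack costs a 300-byte `memcpy` each way and a 300-byte heap buffer while parked, regardless of how large the shared stack is.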
I suspect you're going to get downvoted for saying this, but I'm going to upvote you and give people a reason to think about this.
If someone goes through the time to program in C and release their source code for free, shouldn't we be thankful, rather than being upset at their autoconf tests they haphazardly threw together? I can't imagine how people can see these as being completely necessary.
This is the metaprogramming we used to build OpenAMQ. It was very successful in technology terms (1M lines of real code generated from 20k lines of metacode) but a failure in social terms (too high a barrier to community participation).
https://github.com/joyent/libuv