I'm not sure that the original description is precisely correct, but yours isn't correct either.
Basically, you can't treat green threads just like "a multi-threaded runtime" and have it just work. That is, a 1:1 mapping between green threads and OS threads is just OS threads.
So fundamentally, if you move your green-thread stacks off of the actual OS thread stack, they're going to need to go somewhere... and that place must be the heap.
There are pluses and minuses to this implementation, but the biggest minus is that it makes FFI very complicated. C# has an extremely rich native-interop history (having historically been used to integrate closely with Windows C++ applications) and therefore this approach raised some serious challenges.
In some sense, async is the cost of clean interop with the C/system ABI. Transitioning across OS threads requires something like async.
I meant that you can have a multi-threaded runtime that executes your green threads in a multi-threaded fashion. In Go, for example, you have (by default) as many worker OS threads as CPUs, and the Go runtime takes care of scheduling your green threads on those worker OS threads (plus creating threads as needed for blocking syscalls, if I remember correctly, but that's getting way too deep into the details). And this will, in fact, "just work" from the user's perspective.
And yes, as both you and I said (I did at the end of my previous comment), the main hurdle of green threads imo is FFI, but that's not what the article mentions, which is what surprised me.
Ah, I see. You were saying that green threads can usually be scheduled on multiple OS threads and take advantage of parallelism. Yup, I agree. Apologies for the confusion.