
Even more impressive (to my way of thinking) is c4:

https://github.com/rswier/c4/blob/master/c4.c

c4 is what you might call a microscopic C compiler; it's surprising how much it does for the code it has.



c4 handles a similar subset of C to what selfie does, but is a full order of magnitude smaller: about 600 lines. And it is very readable despite the terseness!


One thing I love is how it reads the whole source file into RAM and operates on it as an array. So many compilers (including my own...) avoid this, which I guess ostensibly lets them handle huge machine-generated source files (but that's an extreme outlier, and really I'd be inclined to let people depending on that use something else or work around it). The first compiler I saw that read the whole file in was Wouter van Oortmerssen's AmigaE compiler [1], which demonstrated that it was a viable strategy even when your compiler is expected to run on a machine with 512KB RAM. It's one example of where we often over-abstract for dubious benefits.

[1] http://strlen.com/amiga-e


On a machine with virtual memory, couldn't it fix the source size issue by mmapping the file instead of reading it into RAM, and letting the OS page it in and out of memory?


Yes, that should work, as long as you make sure it is zero-terminated so you don't have to do length checks all over the place - that'd lose a lot of the benefit. For files whose size is not a multiple of the page size that's guaranteed, since the rest of the last page is zero-filled. I've never tried to request mapping of more bytes than a file takes, though... I'm going to guess that still does something sane, but I've not tested it.


> I've never tried to request mapping of more bytes than a file takes, though... I'm going to guess that still does something sane, but I've not tested it.

Sadly, the Open Group and Linux man pages say that

> Memory access within the mapping but beyond the current end of the underlying objects may result in SIGBUS signals being sent to the process.


Joy... Oh, well, I think on modern systems it'd be reasonable to just refuse to deal with source files that are too large to load into RAM, and leave it at that.


Wait, are you saying that one can't mmap a file larger than physical memory? Isn't this exactly what mmap is for? I think what masklinn is saying is that if you mmap a file, you can't read past the end or over-allocate a single mapping. One could have another output mmap region.


SIGREFUSE is a lamentable omission in POSIX.


It doesn't reliably do something sane. IIRC, LLVM in that case (file size an exact multiple of the page size) just falls back to reading the file into an array the ordinary way. With 4096-byte pages that only happens one time in 4096, so there's no practical difference to performance.
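
A rough sketch of that strategy in C, assuming a POSIX system. This is not LLVM's code; `struct source`, `source_open` and `source_close` are hypothetical names for illustration. If the file size is not a multiple of the page size, the tail of the last mapped page is zero-filled and serves as the terminator. Otherwise the code reads the file into a `malloc`'d buffer and terminates it by hand. It also assumes nobody truncates the file while it is mapped.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct source {
    char  *text;    /* zero-terminated file contents */
    size_t len;     /* length without the terminator */
    int    mapped;  /* 1 if text came from mmap, 0 if from malloc */
};

/* Returns 0 on success, -1 on failure. */
int source_open(struct source *s, const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }

    size_t len  = (size_t)st.st_size;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* Common case: the last page has spare room, which is zero-filled,
     * so text[len] is readable and equals '\0'. */
    if (len > 0 && len % page != 0) {
        void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            close(fd);                    /* the mapping outlives the fd */
            s->text = p; s->len = len; s->mapped = 1;
            return 0;
        }
    }

    /* Fallback: empty file, exact page multiple, or mmap failed. */
    char *buf = malloc(len + 1);
    if (!buf) { close(fd); return -1; }
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n <= 0) break;
        got += (size_t)n;
    }
    close(fd);
    if (got != len) { free(buf); return -1; }
    buf[len] = '\0';
    s->text = buf; s->len = len; s->mapped = 0;
    return 0;
}

void source_close(struct source *s)
{
    if (s->mapped) munmap(s->text, s->len);
    else free(s->text);
}
```

Either way the caller gets a zero-terminated `text`, so the scanner does not need to know which path was taken.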


There's a bunch of other stuff in Selfie beyond the C compiler so you can't really compare them by line count.


c4.c author here - selfie seems to fill a niche between c4 and my other, more full-featured compiler/OS, swieros. Very cool!

https://github.com/rswier/swieros/blob/master/00README.txt


Yep, I just meant to point out they're different. Selfie is something educators wrote for their purposes. c4.c is more like a piece of code poetry.


That's mind-boggling to me. My Lisp interpreter is hardly shorter than that, though granted, it was designed with extensibility as a goal. Still, incredible.



