Hacker News
The Nitty Gritty of “Hello World” on OS X (reinterpretcast.com)
48 points by joesavage on Nov 8, 2014 | 7 comments


I don't want to explain each of the 8548 output bytes in detail here

I wonder how many people these days (including developers) would think "that's not so big for a Hello World program", and then change their mind after watching some 4k demos... 8KB might not sound like much in this era of gigabytes and terabytes, but eight thousand bytes is still, in absolute terms, quite a bit of data, and enough to do plenty more interesting things. Executable formats have become more complex with their headers, which are mostly unavoidable, but seeing empty space in the majority of the file is somewhat sad.

Here are some smaller Hello World programs to examine...

http://seriot.ch/hello_macho.php - OS X

http://timelessname.com/elfbin/ - Linux

...but they're still somewhat larger than the 20 bytes of the DOS version (95 ba 07 01 cd 21 c3 48 65 6c 6c 6f 20 57 6f 72 6c 64 21 24).
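For anyone who wants to check the byte count, here is a minimal Python sketch that splits those 20 bytes into code and data. The instruction annotations are my own reading of the 8086 encodings rather than anything stated above, and the variable names are made up for illustration.

```python
# The 20 bytes quoted above, treated as a DOS .COM image.
COM_BYTES = bytes.fromhex(
    "95 "          # xchg ax, bp  (assumption: relies on BP's startup value to set AH=9)
    "ba 07 01 "    # mov dx, 0x0107  (address of the string)
    "cd 21 "       # int 0x21        (DOS call; AH=9 prints a '$'-terminated string)
    "c3 "          # ret
    "48 65 6c 6c 6f 20 57 6f 72 6c 64 21 24"  # "Hello World!$"
)

LOAD_OFFSET = 0x100  # .COM programs are loaded at offset 0x100 in their segment

# The operand of `mov dx, imm16` is little-endian and sits in bytes 2..3.
string_addr = int.from_bytes(COM_BYTES[2:4], "little")
string_start = string_addr - LOAD_OFFSET

code = COM_BYTES[:string_start]
message = COM_BYTES[string_start:]

print(len(COM_BYTES))            # 20
print(len(code), len(message))   # 7 13
print(message.decode("ascii"))   # Hello World!$
```

So the split is 7 bytes of code and 13 bytes of text, with no header at all.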


I'm curious how much space would be saved if the binary were stripped, a step most people forget to run when they're trying to see what a bare-minimum binary looks like. I'm not very familiar with the Mach-O format, but for ELF binaries it can take a 4K hello world from GCC down to a couple hundred bytes.


Truly curious here: what's the tradeoff? What did we gain with bigger, more sophisticated executable formats?


Page-aligning the segments makes it (much) cheaper to load pieces of the executable on demand (demand paging), which reduces the amount of physical memory required to launch the program. Note that most of that size is zeroes padding the file out. Also remember that most file allocations are quantized to 512- or 4096-byte blocks.
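To make the quantization point concrete, here is a rough Python sketch. It assumes a simple file system that hands out whole blocks and ignores optimizations such as inline data or tail packing; `allocated_bytes` is a hypothetical helper, not a real API.

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Bytes occupied on disk when space is allocated in whole blocks."""
    blocks = (file_size + block_size - 1) // block_size  # round up
    return blocks * block_size

# 20 = the DOS version quoted upthread, 8548 = the article's OS X binary.
for size in (20, 8548):
    for block in (512, 4096):
        print(size, block, allocated_bytes(size, block))
# 20 512 512
# 20 4096 4096
# 8548 512 8704
# 8548 4096 12288
```

Under that assumption even the 20-byte program occupies a full block on disk, so the on-disk gap is smaller than the raw byte counts suggest.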

Dynamic linking makes it possible for executables to share a single copy of system frameworks; this saves physical memory, improves performance (due to reduced cache traffic), reduces the size of downloads, etc.

The UUID allows the debugger to reliably associate an executable with the dSYM bundle containing debug information.

Other features not discussed in the article allow a single program to run on multiple architectures (PPC, I386, X86_64, etc), allow the system to reliably determine whether a program has changed, to decide what capabilities the program has been granted, etc.

It's worth putting these size numbers into perspective; yes, EIGHT THOUSAND BYTES sounds big and scary next to twenty, but the overhead scales linearly with the number of executables and their functional complexity, not geometrically. Also, you have on the order of ONE TRILLION BYTES of storage for these headers, rather than three hundred and sixty thousand...
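A back-of-the-envelope version of that comparison, as a Python sketch. The sizes are the ones quoted in this thread; the constant names are mine, and treating "one trillion" as exactly 10**12 is an assumption.

```python
MODERN_HELLO = 8548        # bytes, the OS X binary from the article
DOS_HELLO = 20             # bytes, the DOS version quoted upthread
MODERN_DISK = 10**12       # "one trillion bytes"
FLOPPY = 360_000           # "three hundred and sixty thousand"

modern_share = MODERN_HELLO / MODERN_DISK   # fraction of the disk used
dos_share = DOS_HELLO / FLOPPY              # fraction of the floppy used

print(f"{modern_share:.2e}")   # 8.55e-09
print(f"{dos_share:.2e}")      # 5.56e-05
print(dos_share > modern_share)  # True
```

Relative to the storage available, the 20-byte program took up a few thousand times more of its floppy than the 8548-byte one takes of a modern disk.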


Page-aligning the segments makes it (much) cheaper to load pieces of the executable on demand (demand paging) which reduces the amount of physical memory required to launch the program.

In this case the entire executable, headers and all, could fit in one page. Instead of having that one page read into memory and executed immediately, you have to read another one, for a total of two pages. For a small executable, this overhead can result in up to twice as many pages being read in. For larger ones that need tens of pages or more, the overhead shrinks proportionally, but it still doesn't make sense to add otherwise useless bytes that have to be read in. Disks (and SSDs) are several orders of magnitude slower than memory, so if the block containing the header is going to be read anyway, it might as well include some more data (like the code of the program) that would need to be read later in any case.


Out of curiosity, what kind of impact does having executables with so much zero-padding have on code being cached into the L1/L2/LN caches of a CPU? Sure it might be more efficient to load sections of an executable into memory from the file system due to alignment, but does the inflated size cause code to get booted out of a cache more often, or does it not really have any effect? We might have nigh infinite amounts of cheap storage these days, but that still doesn't hold true for the small caches next to the CPU where things run truly fast, and not just adequately fast.


It has almost no effect on L1/L2 cache usage. CPU caches tend to work on 64-128 byte cache lines. Most of the lines in a mostly zero page will never get touched, so they will never enter the cache.
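A toy Python model of that, assuming 64-byte lines and ignoring hardware prefetching; the 100-byte code size and the function name are hypothetical.

```python
def cache_lines_touched(accesses, line_size=64):
    """Return the set of cache-line indices covered by (offset, length) accesses."""
    lines = set()
    for offset, length in accesses:
        first = offset // line_size
        last = (offset + length - 1) // line_size
        lines.update(range(first, last + 1))
    return lines

PAGE_SIZE = 4096
LINE_SIZE = 64

# Hypothetical page: 100 bytes of code at the start, the rest is
# zero padding that the program never reads.
touched = cache_lines_touched([(0, 100)], LINE_SIZE)
print(len(touched), "of", PAGE_SIZE // LINE_SIZE)   # 2 of 64
```

In this model only the two lines holding the code would be fetched; the other 62 lines of padding in the page never compete for cache space.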



