Patchelf is almost always broken, and when fixes get merged they don't release a new version right away. Between 2014 and 2019, running patchelf + strip would corrupt binaries; after that was fixed, running patchelf twice on the same binary would corrupt it. And patchelf still doesn't work on all binaries: there's a bugfix (https://github.com/NixOS/patchelf/pull/230, merged Nov 19, 2020), but no release since then.
To be fair, patching existing binaries isn't all that easy to begin with. Once you need to start moving structures around, a lot of bets are off. In the case of NixOS, I'm sure the long Nix store paths often overflow structures and require workarounds to avoid breaking images. Hacking around Win32 PE images, I've run into a lot of tricky, subtle issues over time...
Yeah, it wasn't meant as criticism of the project as a whole; I'm happy that at least someone took on the project of "growing" the rpath section... My only point of critique is that they should immediately tag a new version after a bug fix, because many distros ship buggy versions of patchelf.
patchelf is a lifesaver for handling Android/iOS with libraries compiled outside the regular toolchain (both platforms can be very finicky about soname/rpath).
On one occasion, however, I did spend days chasing a crash that turned out to be a bug when using an older version of patchelf on ARM binaries.
So great tool, but definitely be wary of introducing weird flakiness.
ELF files ARE hard; the documentation is best described as "adequate". You can often learn a lot by looking at how GCC generates ELF files, things that might not be obvious from reading the ELF spec.
> why don't we just use base64-encoded JSON or something?
Base64 would be counterproductive, increasing both space and parsing time... presumably it's proposed out of fear that JSON would contain a naughty byte. For a greenfield file format, it would be much better to just design the format not to have any naughty bytes.
Parsing time for executables and libraries is definitely on the critical startup path. You really want a length-delimited format, or better yet, one where offsets to the various structures are stored at fixed positions so you can find everything in O(1) time with a tiny constant factor.
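You can see this directly in ELF itself: the fixed-size header at offset 0 records where the program and section header tables live (a sketch, assuming binutils' readelf is installed):

    # the ELF header stores the offsets of the program/section header tables,
    # so a loader can jump straight to them without scanning the file
    readelf -h /bin/ls | grep -E 'Start of (program|section) headers'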
Compiler writers and tool authors are perfectly comfortable working with binary file formats. There's nothing more inherently future-compatible about JSON than a forward-compatible binary format like flatbuffers. Having to escape and then unescape naughty bytes is a huge downside for text-based formats that are hardly ever read by humans.
On a side note, zlib DEFLATE / gzip / LZMA etc. aren't magic for getting rid of space overheads. Try gzip -9'ing your system's wordlist, then convert it to UTF-16 and gzip -9 it again. You'll see an increase in size of several percent, despite an entropy change of at most a small, constant number of bits (-log2(P(UTF-16)/P(UTF-8))). I've frequently seen JSON proponents use hand-wavy arguments that gzip will reduce any size differences to zero.
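If you want to reproduce this (a sketch assuming the usual /usr/share/dict/words wordlist and iconv being available):

    gzip -9 -c /usr/share/dict/words | wc -c                                # UTF-8, compressed size
    iconv -f UTF-8 -t UTF-16LE /usr/share/dict/words | gzip -9 -c | wc -c   # UTF-16, compressed size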
It's also nice if the file format is very close to being directly mmap()able into the process's address space, requiring only minimal patching of a minimal number of pages to become an optimized in-memory representation.
Also, there's a huge amount of momentum behind existing executable formats. Incrementally adding new features in new segment types, or appending new fields to old data structures (where there's no ambiguity), is much preferred to a wholly new format.
So, creating a new debugging symbol section that's just flatbuffers is workable. Replacing the whole ecosystem with base64-'d JSON would have way more downsides than upsides.
On another side note, you need to be very careful with JSONifying floating point values. Many libraries don't give you bit-perfect round-tripping of IEEE-754 double precision values.
That's a good point about the critical path - I was thinking it would be a bit bigger and slower since you'd have to decode it, but I hadn't realized what an impact that would probably have. No bit-perfect round trips is also absolutely horrifying and I would never have even thought that would be a thing.
>Compiler writers and tool authors are perfectly comfortable working with binary file formats. There's nothing more inherently future-compatible about JSON than a forward-compatible binary format like flatbuffers. Having to escape and then unescape naughty bytes is a huge downside for text-based formats that are hardly ever read by humans.
Well, the problem isn't binary or non-binary, the problem is that these formats like ELF are, apparently, really annoying to deal with, have weird limitations, and are difficult to extend. The reason I thought of JSON in particular is because it doesn't really need to be extended to encode anything (unless you include escaping or base64 encoding binary data you want to put inside of a JSON document as "extending" it). You can encode all of the fields in ELF (or any other format) inside of JSON, while it doesn't make sense to consider the converse because ELF has fixed fields with fixed meanings.
That's the problem with these bespoke binary formats like ELF - they're not designed to encode arbitrary schemas of data, they're designed for very specific tasks, and when they get used outside of their intended environment we get problems like the ones described in this thread. Nobody has ever had these problems with a JSON document - maybe with something that consumed one, but the file format itself simply does not have the same kind of limitations that ELF does. It has different limitations, but they're not of a fundamental and semantic nature like they are in a more rigid format.
You're right that it would be a problem to have to escape/unescape every section every time you wanted to run something, because that's very slow, but I think that's basically the only problem these bespoke binary formats solve. If that's the case, I wonder why something like Matroska wouldn't work for binaries? My understanding is that it's basically binary XML and allows for an almost completely arbitrary dictionary structure. It doesn't have nice tooling like JSON or XML do, but there are no weird restrictions on things like field length that I'm aware of. I guess it doesn't exactly have any "momentum", though, but maybe the NixOS people will get sick enough of ELF to consider such a drastic solution :P
> That's the problem with these bespoke binary formats like ELF - they're not designed to encode arbitrary schemas of data, they're designed for very specific tasks, and when they get used outside of their intended environment we get problems like the ones described in this thread. Nobody has ever had these problems with a JSON document - maybe with something that consumed one, but the file format itself simply does not have the same kind of limitations that ELF does. It has different limitations, but they're not of a fundamental and semantic nature like they are in a more rigid format
This is nothing specific to binary formats, but specific to insufficiently extensible formats. Note that I specifically mentioned flatbuffers, which provide for extensibility while keeping parsing latency low.
Also, ELF was designed to be extensible by adding new sections. You could totally add functionality by adding a new section holding JSON data.
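For instance, here's a rough sketch with standard binutils (the section name and JSON payload are made up):

    # stash an arbitrary JSON blob in a new, non-loaded ELF section
    echo '{"built-by": "ci", "schema": 1}' > metadata.json
    objcopy --add-section .note.metadata=metadata.json \
            --set-section-flags .note.metadata=noload,readonly \
            ./app ./app-with-metadata
    # dump it back out
    readelf -p .note.metadata ./app-with-metadata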
Don't confuse JSON with extensibility. I've seen plenty of headaches with poorly thought out JSON schemas where forward compatibility wasn't sufficiently well thought out. There are also tons of elegantly extensible binary formats. ELF is just old; much older than JSON. A new binary format would probably be more elegantly extensible.
Part of the excellent "nix pills" guide on why things are the way they are in Nix, and why this tool was created.
The whole series is a pretty easy evening read if you have experience in functional languages (like Haskell), and it really helps you appreciate what Nix is doing.
One of my main reservations about Nix and introducing it at work is basically requiring that other developers wrap their minds around functional concepts like partial application.
This might just be an imagined problem, but it's not one I want to spend cycles explaining to the higher-ups.
I'm working on it now at my work, partly following some advocacy in the last Nix thread on here a month or so back. I think the biggest barrier for me is that you can't really be only partially in: like, you sort of can, but you lose a lot of the benefits if you have impure builds linking against a bunch of stuff on the filesystem.
So yeah, you need sufficient buy-in that you can spend the effort required to basically port your entire system to a new operating system and packaging scheme. And depending on how big your system is, that might be a lot of work that has to happen up front before any real value is delivered.
I highly recommend using the Nix package manager alongside whatever you're comfortable with. That way you can `nix-shell -p foobar` when you need a package quickly, or fall back to brew/apt/etc. if you're not yet comfortable addressing the situation in Nix.
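For example (hello is just a stand-in package name):

    # get a throwaway shell with the package available, without installing it system-wide
    nix-shell -p hello --run hello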
I built a (research) library a few years ago to rewrite ELF binaries; our research projects ran into a lot of limitations with doing incremental patches to a binary (ELF has a lot of redundant representations of the same data). For us, parsing the binary into a normalized representation, modifying that, and re-serializing worked: we could make more intrusive changes to the binary, and (almost? I don't recall anything breaking) everything in the Debian repos still ran after the binaries had been rewritten.
I expect the library is now woefully out of date, and documentation is mostly in the form of conference talk slides:
One or two years ago I stumbled across this wonderful tool while searching for a fix to a VSCode Remote SSH problem. The story begins with using VSCode Remote SSH on an HPC system running CentOS 6. VSCode Remote SSH ships with a Node binary that is dynamically linked against a glibc not available on older OSes such as CentOS 6 (https://code.visualstudio.com/docs/remote/linux). I am not the system administrator on the HPC and could not update the system myself, and the VSCode team is too arrogant to support those old OSes. Although I have a Homebrew environment with a newer glibc built there, the Node binary shipped with VSCode does not pick it up from environment variables. Manually updating the strings in the node binary did at least give me a solution: https://github.com/microsoft/vscode-remote-release/issues/10...
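For anyone hitting the same wall, this is roughly the kind of thing patchelf makes possible (the glibc and node paths below are placeholders for whatever you have built/installed):

    # point VSCode's bundled node at a locally built loader and glibc
    patchelf --set-interpreter "$HOME/glibc/lib/ld-linux-x86-64.so.2" \
             --set-rpath "$HOME/glibc/lib" \
             path/to/vscode-server/node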
As a researcher, I am really thankful to the developer who created this wonderful tool.
Another reason I had to come up with this is that although VSCode is open source, VSCode's Remote SSH extension is proprietary. I had to dig into the extension module's uglified source code to figure out what was happening in there… An alternative solution would involve editing the uglified JS file: https://github.com/microsoft/vscode-remote-release/issues/10...
Needing an LD_PRELOAD on a system where LD_PRELOAD is blocked.
Sure, it still requires you to have the rights to create and run an executable.
But it's a much better model than LD_PRELOAD, as it can only be used on executables you can write to, which normally means your own user's/group's executables but not those owned by other users/groups. That's especially relevant with suid/sgid.
You can also use this to patch in a library where LD_PRELOAD doesn't work, like with suid binaries. Though you potentially have to make a copy and set the suid bit again.
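A minimal sketch of that copy-and-repatch workflow (the binary name and library path are made up; restoring the suid bit needs root):

    # work on a copy, since patchelf rewrites the file in place
    cp /usr/bin/some-suid-program ./patched
    patchelf --set-rpath /opt/replacement-libs ./patched
    # restore ownership and the setuid bit on the copy
    sudo chown root:root ./patched
    sudo chmod u+s ./patched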
Being able to add or update the rpath on apps and shared objects is pretty handy, particularly when the target would be annoying or problematic to rebuild.
To add, there's chrpath(1), but that can't grow the field in the ELF file, so you can only change the rpath to something shorter than it is now, and usually it's empty.
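In practice the difference looks like this (./app is a stand-in binary):

    chrpath -l ./app            # show the existing rpath; chrpath can only shrink or replace it in place
    patchelf --set-rpath '/a/much/longer/rpath/than/before' ./app   # patchelf can grow the entry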
We use this at my workplace to add $ORIGIN-relative rpaths to third-party dependencies of internal code, since we want to distribute/version those along with the code, not with the OS, and we also don't want to distribute them in the medium of OS packages (so you don't have to be root, so we don't risk messing up the base OS, etc.). In many senses that's just a "non-FHS distro," but it's arguably not a distro at all.
Even for easy-to-compile third-party code, squeezing -Wl,-rpath into all the right places is more of a pain than you'd hope, so we just run patchelf on everything in the third-party directory at the end of the build, regardless of whether the "build" is an actual build or just an untar of proprietary software.
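Roughly like this (a sketch, assuming each third-party package unpacks with bin/ and lib/ side by side):

    for f in third_party/*/bin/*; do
        # non-ELF files (scripts etc.) just fail harmlessly
        patchelf --set-rpath '$ORIGIN/../lib' "$f" || true
    done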
You don't even need to patch in a specific absolute path; you can use (and I have used) it to put $ORIGIN or $ORIGIN/../lib in programs' RPATHs.
Be careful about when this is and isn't okay though. When used wrong this can be a security risk, such as when your browser auto-downloads files to a Downloads folder, and the program you are patching is also in there.
Sure, but only if you have write access to the binary. If you have that you can already replace it with anything you like, so I am not seeing the privilege escalation there.
There should be no situation where a malformed or unexpected PT_INTERP leads to privilege escalation. If you believe you have found such a situation, please report it.
Lots of Python packages containing native code are optionally distributed with pre-compiled libraries that have been modified with patchelf. I believe both auditwheel (https://pypi.org/project/auditwheel/) and conda-build do this.
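For example, auditwheel's repair step bundles external shared libraries into the wheel and rewrites their rpaths, using patchelf under the hood (the wheel filename here is made up):

    auditwheel repair dist/mypkg-1.0-cp39-cp39-linux_x86_64.whl -w wheelhouse/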
Sounds like something that would make binary patching easier. Lack of simple binary patching is one of the key missing pieces that would make switching to static binaries more viable. Currently, one of the big upsides to dynamic libraries is the bandwidth they save in distribution, and binary patches would go a long way toward addressing that.
Also patchelf exposed a bug in ldconfig noted 11 years ago and only fixed in glibc 2.31: https://nix-dev.science.uu.narkive.com/q6Ww5fyO/ldconfig-pro...
And currently patchelf still has a bug: https://github.com/NixOS/patchelf/pull/275
Apparently elf files are hard.