Small, sharp tools (2014)

codedokode · on July 20, 2022

While the idea that there can be a lot of small and composable tools sounds good in theory, in practice the implementation is very poor. To use tools in a script, they should have standartized machine-friendly input and output formats, but instead they have arbitrary and difficult to parse formats. It means that they are not intended for use in scripts.

- spaces or newlines in file names are often not handled properly. Some linux software based on shell scripts requires that a user install it to a path that has no spaces.

- the output format is often arbitrary and not machine-friendly. For example, `ls -l` output is difficult to parse properly. Some programs might output a file size with different suffixes, like 10k or 10M, that you have to parse.

- output format often depends on locale settings and scripts can break if used with different locale

So the programs that rely on line-oriented inputs/outputs like `grep`, `wc`, `ls` (without -l), `sort` work relatively good, but other programs that produce more complex output, are unusable for scripting.

Scripts often use ugly hacks like parsing output with `sed` or, even worse, `awk`, a 40 years old cryptic legacy tool. Instead they could use something like JSONpath, for example `parse $.*.size -- ls -l` to get sizes of all files (although I don't think that using JSON is a good idea; it's hard to read by human. I think something indentation-based like YAML would be better; however, YAML itself is overengineered and I recommend against it).

Also the shells are an awful choice for scripting. For example, there is no syntax to write a glob pattern to select all files including dot-files, but excluding pseudo-files like . and ..

To conclude, "Unix philosophy" of composing utilities using shell scripts is often advertised as a superior and innovative approach but in practice it is a bunch of inconvenient to use programs glued together with unrealiable buggy scripts written in outdated language. It is easier to use Python and ignore all this "philosophy".

By the way, `/proc` filesystem also is not scripting-friendly and not standartized. So it is usable only as a debugging tool for humans.

Fnoord · on July 20, 2022

Because ls -l is long format, meant for human readable instead of machine readable. It might be ls -s is suffice.

Instead of parsing ls you may be better off with find or the quick Rust equivalent fd. Instead of awk you can use sed or cut + tr + grep (but I find awk easier to use awk). Nice tools to finish it off are sort and uniq.

Though I suppose nowadays everything becomes json.

unsafecast · on July 20, 2022

I've always thought the same, with one exception:

You can (and should) use lightweight formats that are hard to read by a human and have the shell format it to look nice. You can even format the same datum differently depending on context.

I don't like the idea of using the same format for machine and human consumption, they have different requirements.

tpoacher · on July 20, 2022

those examples seem a bit contrived to me.

i could say the same about python if I mandated no imports or sth

yakshaving_jgt · on July 20, 2022

I’m currently reading The Great Mental Models. The part I’m reading now is called The Map Is Not The Territory, and it is about formulating abstractions.

In this chapter, the author reminds us that we run into problems when our knowledge is that of the map we’re reading, rather than that of the territory it describes.

I don’t think a network of micro-services is analogous to a chain of small UNIX programs piped together. Those aren’t pipes between micro-services; they’re web servers and network requests. It’s a whole different ball game.

The abstraction is an attractive one, and perhaps it’s why micro-services are all we’ve been able to talk about for the past decade. But it’s still not a good abstraction. It’s an oversimplification.

coconuthacker42 · on July 20, 2022

I dont get it, care to elaborate?

yakshaving_jgt · on July 20, 2022

I mean I don't agree with the analogy that the author has drawn between a micro-service architecture and UNIX programs piped together, because I don't believe inter-process communication through streams is comparable to HTTP requests over the internet.

Put another [less accurate] way: a function call is not the same as a network request.

The analogy the author has drawn in the article is a kind of mental model — an intentional oversimplification — intended to convey an idea. It's useful to use mental models like this, but only if you have sufficient understanding of the underlying mechanics that you're abstracting over in order to determine whether the abstraction is useful, or if it's harmful.

In my opinion, because the underlying mechanics are so different, the mental model isn't actually useful.

The author writes:

> The idea can be applied to the web in a similar way. Many modern web applications are choosing to build out their architecture as a set of services that communicate over the network layer, which has some parallels with the Unix model of programs that communicate via OS primitives.

I just don't agree with this.

The author appropriately recognises that this kind of architecture often makes it considerably harder for us to manage shared context, but elides that other serious drawback that I've described.

a_c · on July 21, 2022

The implication goes way beyond code. Software writing is miniature organization design. Making clear interface is easier said than done. Zooming out a bit, a software team should be modular and composable. Zooming even more out, the whole company should be modular and composable. The whole cliche of "communicate more, collaborate more" is just a euphemism of bad interface design of organization. If you can get things done without communicating back and forth, who wants to spend 20 hours a day in meeting/brainstorm/catch-up?

Can we imagine having to email the author of `ls` to clarify the usage of flag `-l` in the name of breaking the silo? In reality it does happen but most of us just happily accept it or consult the documentation if needed. Now, can we imagine having to email the author of an internal library in your own company to clarify usage? I bet it happens all the time. Interface design is hard. Code-wise or org-wise.

heywoodlh · on July 20, 2022

Tiny tweaks and improvements (almost daily) to my Unix-related workflows makes a huge difference for me.

This is a good reminder for me to read The Art of Unix Programming.

coconuthacker42 · on July 20, 2022

yeah, over time I've had probably a hundred little scripts/shortcuts/commands that do one small thing and using them all together is amazing. most of them are in python

dotancohen · on July 20, 2022

I'd love to see some examples.

heywoodlh · on July 24, 2022

Can't speak for parent but a lot of my tiny improvements are in my dotfiles repo:

https://github.com/heywoodlh/conf

Particularly the shell aliases/functions I have in the .bash.d directory.

coconuthacker42 · on July 20, 2022

well there's one that gets comic books. basically downloads the pages from an online service, packages them as pdfs, sends to my phone

foogazi · on July 20, 2022

TIL

manularity [prob. fr. techspeak manual + granularity] A notional measure of the manual labor required for some task, particularly one of the sort that automation is supposed to eliminate.

codedokode · on July 20, 2022

Regarding the article, I think that "small" tools are better for scripting (but they should be made scripting-friendly; many Linux CLI tools are not), and complicated tools are better for usage by humans.

For example, take a word processor. You probably don't want to use one tool to type the text, then copy it into another tool for spell checking, into third tool to add equations, fourth tool to find and download clipart images, and so on. You want to have everything integrated.

meheleventyone · on July 20, 2022

The answer to the latter is making the application plugin/module based. Same philosophy with a different front end.

makemintco · on July 20, 2022

Is these are really useful ?

coconuthacker42 · on July 20, 2022

Why's the title personally attacking me?