> should we continue to uphold "everything is a file" POSIX doesn't do a very go...

AnthonyMouse · on Feb 11, 2017

> POSIX is full of entities (e.g. notably processes and threads) which can't be manipulated via file descriptors.

Well, there is /proc

skissane · on Feb 11, 2017

Which isn't a standard part of POSIX. It is an implementation-specific extension that only some implementations have (OS X most notably lacks it), and even those that have it implement it in incompatible ways (e.g. /proc on Linux is quite different from /proc on Solaris).

Also, Linux /proc doesn't really allow you to treat processes as file descriptors like FreeBSD's pdfork does. Sure, I can open /proc/123, but operations on the resulting file descriptor have no relationship with the process – if I pass the FD into a select() I won't get any notifications about the process lifecycle.

jstimpfle · on Feb 11, 2017

I think this discussion misses the point of what "everything is a file" is about.

The standard way to exchange (read, write) information should be through the Unix file interface where possible. Note that "file" here means "FILE *", i.e. an "opened file" or "stream", not a file on a filesystem, which is just one way to make one kind of stream.

In other words that mantra is about simplicity. Simplicity is what enables software to exist.

skissane · on Feb 11, 2017

> I think this discussion misses the point of what "everything is a file" is about.

I don't agree. POSIX systems provide APIs to wait on events on file descriptors – select, poll, etc. I can use those APIs to wait on an event on a pipe, a network socket, a device file, etc.: on anything that is represented by a file descriptor.

But what if I want to wait for a child process to exit? I can't use select/poll for that since processes are not represented by file descriptors in standard POSIX – FreeBSD is a notable exception, but FreeBSD's facilities are non-standard.

If I want to simultaneously wait on both a child process to terminate and a message to arrive on a socket, I am forced to use multiple threads. For example, I could have a thread to execute waitpid() and then have it write to a pipe, and have another thread select/poll on both that pipe and the network socket. If processes were represented by file descriptors, I could have a single thread doing a select/poll on both the network socket and the child process.
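The thread-plus-pipe workaround described above could look roughly like the following. This is only a minimal sketch in Python, assuming a POSIX system; the names `serve_until_child_exits`, `reaper`, and `demo` are made up for illustration, and the child here simply sends one message and exits with status 7.

```python
import os
import select
import socket
import threading


def serve_until_child_exits(child_pid, sock):
    """Multiplex a socket and a child's exit using a helper thread and a pipe."""
    pipe_r, pipe_w = os.pipe()

    def reaper():
        # Blocks in waitpid(), then converts the exit into a readable fd event.
        _, status = os.waitpid(child_pid, 0)
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else 255
        os.write(pipe_w, bytes([code]))

    thread = threading.Thread(target=reaper)
    thread.start()

    events = []
    watching = [pipe_r, sock]
    while pipe_r in watching:
        # One select() call covers both the socket and the "child exited" pipe.
        readable, _, _ = select.select(watching, [], [])
        if sock in readable:
            data = sock.recv(4096)
            if data:
                events.append(("socket", data))
            else:
                watching.remove(sock)  # peer closed its end
        if pipe_r in readable:
            events.append(("child", os.read(pipe_r, 1)[0]))
            watching.remove(pipe_r)

    thread.join()
    os.close(pipe_r)
    os.close(pipe_w)
    return events


def demo():
    ours, theirs = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: send one message, then exit with a recognizable status.
        try:
            theirs.sendall(b"hello")
        finally:
            os._exit(7)
    theirs.close()
    try:
        return serve_until_child_exits(pid, ours)
    finally:
        ours.close()
```

The extra thread and pipe exist only to turn "child exited" into something select() can see, which is the complexity being complained about.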

> In other words that mantra is about simplicity. Simplicity is what enables software to exist.

Yes, and POSIX fails to be simple here. By failing to sufficiently unify its abstractions, it forces application code to be more complex than it should be.

jstimpfle · on Feb 12, 2017

> But what if I want to wait for a child process to exit? I can't use select/poll for that since processes are not represented by file descriptors in standard POSIX – FreeBSD is a notable exception, but FreeBSD's facilities are non-standard.

And Linux has signalfd (also not POSIX).

I agree with you: signals are a pain, and could probably be replaced by fds on the handling side (except, of course, for KILL, STOP, CONT...). The overhead of polling vs asynchronous signal delivery could be optimized away with the VDSO.

> If I want to simultaneously wait on both a child process to terminate and a message to arrive on a socket, I am forced to use multiple threads.

As a matter of fact, you are not. I'm sure you know about EINTR.
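For readers unfamiliar with the single-threaded, signal-based alternative: in C, a SIGCHLD handler installed without SA_RESTART makes a blocking select() fail with EINTR when the child exits, after which the program can call waitpid(). Python (3.5 and later) retries interrupted system calls itself, so the sketch below uses the closely related "self-pipe" variant, where the handler writes a byte to a pipe that select() is already watching. It is an illustration under those assumptions, not a reference implementation; `demo` and `on_sigchld` are hypothetical names, and it must run in the main thread.

```python
import os
import select
import signal
import socket


def demo():
    sig_r, sig_w = os.pipe()
    os.set_blocking(sig_w, False)

    def on_sigchld(signum, frame):
        # Keep the handler trivial: just make sig_r readable.
        os.write(sig_w, b"x")

    # Process-wide state: this replaces whatever handler was installed before.
    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    ours, theirs = socket.socketpair()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: send one message, then exit with a recognizable status.
            try:
                theirs.sendall(b"hello")
            finally:
                os._exit(7)
        theirs.close()

        events = []
        watching = [sig_r, ours]
        while sig_r in watching:
            readable, _, _ = select.select(watching, [], [])
            if ours in readable:
                data = ours.recv(4096)
                if data:
                    events.append(("socket", data))
                else:
                    watching.remove(ours)  # peer closed its end
            if sig_r in readable:
                os.read(sig_r, 64)  # drain wake-up bytes
                reaped, status = os.waitpid(pid, os.WNOHANG)
                if reaped == pid:
                    events.append(("child", os.WEXITSTATUS(status)))
                    watching.remove(sig_r)
        return events
    finally:
        # Restore the earlier handler (None means it was not set from Python).
        restore = previous if previous is not None else signal.SIG_DFL
        signal.signal(signal.SIGCHLD, restore)
        ours.close()
        os.close(sig_r)
        os.close(sig_w)
```

No second thread is needed, but the code now depends on owning the process-wide SIGCHLD handler.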

Anyway, I was more referring to your original parent ('should we continue to uphold "everything is a file"'), and it seems we both agree that "everything is a file" as a pure ideology is very worthwhile.

skissane · on Feb 12, 2017

> As a matter of fact, you are not. I'm sure you know about EINTR.

The problem with using signals to manage child processes comes when child processes are started by libraries. If the library installs a handler for SIGCHLD, it could conflict with other libraries or with the application which wants to do the same thing. The great thing about FreeBSD-style process descriptors is that no signal handlers are needed – a library can start a child process and then wait for it to complete (and wait for other things such as network sockets simultaneously) without relying on signal handlers, which don't work well when an application is composed of numerous independently developed libraries (as many modern applications are).
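The conflict is easy to reproduce. In this small sketch, `lib_a_init` and `lib_b_init` stand in for two hypothetical, independently written libraries that each install a SIGCHLD handler; the signal is raised by hand as a stand-in for a child exiting. It assumes a POSIX system and must run in the main thread.

```python
import os
import signal
import time


def lib_a_init(log):
    # Hypothetical library A: wants to hear about its own children exiting.
    signal.signal(signal.SIGCHLD, lambda signum, frame: log.append("lib_a"))


def lib_b_init(log):
    # Hypothetical library B: does the same, unaware of library A.
    signal.signal(signal.SIGCHLD, lambda signum, frame: log.append("lib_b"))


def demo_conflict():
    log = []
    previous = signal.getsignal(signal.SIGCHLD)
    try:
        lib_a_init(log)
        lib_b_init(log)  # silently replaces A's handler
        os.kill(os.getpid(), signal.SIGCHLD)  # stand-in for a child exiting
        # Give the interpreter a moment to run the Python-level handler.
        deadline = time.monotonic() + 1.0
        while not log and time.monotonic() < deadline:
            time.sleep(0.01)
        return log
    finally:
        restore = previous if previous is not None else signal.SIG_DFL
        signal.signal(signal.SIGCHLD, restore)
```

Only the handler installed last ever runs, so library A never learns that anything happened unless B cooperates by chaining to the previous handler.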

jstimpfle · on Feb 12, 2017

I agree. Signals have process scope, so they are a bad idea in general (they work against composability), no matter what the delivery mechanism is. Signal FDs don't help here vs EINTR.

DaiPlusPlus · on Feb 12, 2017

> and it seems we both agree that "everything is a file" as a pure ideology is very worthwhile.

I don't understand that ideology - I don't believe everything can be abstracted as a file considering that ioctl() breaks that abstraction. For example, what is it to "read from /proc/{pid for Chrome}"? Would you be reading from the process' memory space? Reading from a rendered bitmap of the process' main framebuffer? Reading a human-readable text file of process metadata? If it's labelled metadata then how do you deal with UX localization? Same thing when "reading from a /dev device" which isn't a storage device, what should it mean to "read from a GPU"?

skissane · on Feb 12, 2017

> I don't understand that ideology

The OS kernel needs to provide user space processes with access to kernel-managed resources such as files, devices, network sockets, processes, threads, etc. In order to do so, there needs to be some way of identifying these individual resources to the kernel. And then the question is, should every type of resource have a distinct type of identifier? Or should we have a single type of identifier which could refer to an instance of any one of those types of resources?

The pure "everything is a file descriptor" ideology (or its Windows NT equivalent, "everything is a handle") says we should have a single type of identifier, the file descriptor (or handle), which can represent resources of any type for which the process can invoke kernel services – processes, threads, files, network sockets, etc.

Standardised POSIX does a poor job of living up to this ideology, since APIs for managing processes (kill, waitpid, fork, etc.) take and return PIDs, not file descriptors. Since a process is not a file descriptor, I can't select()/poll() on it. Using PIDs is also prone to race conditions, whereas file descriptors are less prone to this problem (although not completely immune to it).

The process descriptor functions such as pdfork provided by FreeBSD do a much better job of living up to the "everything is a file descriptor" ideology than pure standardised POSIX does.

> I don't believe everything can be abstracted as a file considering that ioctl() breaks that abstraction. For example, what is it to "read from /proc/{pid for Chrome}"?

Why must every file support read() and write()? Some device files or other special files might only support ioctl(), and maybe also select() and poll(), and I see nothing wrong with that in principle.

AnthonyMouse · on Feb 12, 2017

You're asking the question backwards. It isn't a question of which one the file is -- "/proc/{pid for Chrome}" isn't even a file, it's a directory.

The idea is rather that there should exist a file you can read the process' memory space from, which there is. The path to and specific format of that file are specified by the system documentation.
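As a concrete illustration, on Linux that file is /proc/<pid>/mem; the path is Linux-specific and is an assumption here, since the comment above does not name it, and other systems differ or lack /proc entirely. The sketch below reads a few bytes of the current process's own memory through it. `read_process_memory` and `demo` are hypothetical helper names, and the function returns None where the file is missing or unreadable.

```python
import ctypes
import os


def read_process_memory(address, length, pid="self"):
    """Read `length` bytes at `address` via /proc/<pid>/mem (Linux-specific)."""
    path = f"/proc/{pid}/mem"
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None  # no /proc, or no permission
    try:
        # The file offset is interpreted as a virtual address in the target.
        return os.pread(fd, length, address)
    except OSError:
        return None  # address not mapped or not readable
    finally:
        os.close(fd)


def demo():
    # Put known bytes at a known address in our own address space.
    buf = ctypes.create_string_buffer(b"everything is a file")
    return read_process_memory(ctypes.addressof(buf), len(buf.value))
```

Where the file is available, `demo()` should return the same bytes that were placed in the buffer, read back through an ordinary file descriptor.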