I think it's as good as it gets. The rough principle is there: You have one file, and you can derive the documentation and the program from it with different compilers.
The idea of reading code like a book seems extremely flawed to me, and any attempts to create such languages either look like regular source code with slightly different syntax or are barely comprehensible (e.g. TeX).
Early C compilers, prior to prototypes, were rigid about the order of the various functions and includes, and that interfered with the exposition of the design. Literate programming was created by Knuth to address that.
TeX (and Metafont, and the original Tangle and Weave) were written in Pascal, which is much more rigid with respect to ordering than early C was. The CWEB used for the Stanford GraphBase came later. (Silvio Levy did the initial adaptation of Pascal WEB to C, and Knuth took it over when he switched to programming in C.)
Well, I have neither read through the entire 500 pages of tex.pdf[1] nor the 25k lines of tex.web[2], but I certainly have an impression of both.
I think it is very hard to comprehend what one single part of a given algorithm does and it's imho almost impossible to get a good picture of how all these pieces fit together.
Ok, tex.pdf is the same as TeX: The Program, but obviously rendered into a PDF instead of a hardcover book.
The chapters of the book each cover one aspect of the whole system. Sometimes it is a data type (such as boxes, hash tables, token lists, stacks, etc), other times it is functions that operate on them (creating/printing/copying/destroying boxes, for instance), other times it is an algorithm. Giving each chapter a coherent topic like this allows the human reader to comprehend them in isolation, without distractions from unrelated code. Modern languages have more facilities for high-level abstractions than Pascal did, so you could argue that this is no longer necessary, but I disagree.
Chapter 38, "Breaking paragraphs into lines", is a good example; it is all about the algorithm for breaking paragraphs into lines that fit into whatever width is currently available. The first thing he does is give a very high-level overview of the purpose of this code (line breaking of paragraphs), what primitives it operates on (boxes and vertical lists), the source of the algorithm (a paper he coauthored a few years previously), plus the improvements made to that algorithm (less memory usage, less likely to encounter numeric overflow).
He then discusses how this algorithm interacts with the rest of the system. It is called with one explicit argument, and relies on this and that global state. It makes certain changes to that global state. It also adds a single global variable not previously mentioned, and here is where we start to see the benefit of WEB over plain Pascal. Pascal requires that all global variables be declared ahead of time, before any of the functions are declared and before the block of code that forms the body of the program. If we were reading the Pascal source code directly, we would see dozens or hundreds of these declarations before we even know what part of the program they are for, and before we even know what parts the program has. Instead, we see here that a single declaration is added to section 13, Global Variables. All the other globals are hidden from view, because they aren’t relevant to the task of line breaking a paragraph:
〈 Global variables 13 〉 +≡
just_box : pointer ; { the hlist node for the last line of the new paragraph }
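To make the `+≡` concrete: it means "append this fragment to the section that already has this name". Here is a minimal Rust sketch of that accumulation. The `Chunks` type and its methods are hypothetical names of mine, not anything from WEB, and the first declaration in the example is an invented placeholder; only the `just_box` line comes from the snippet above.

```rust
use std::collections::BTreeMap;

/// Hypothetical store of named sections; each name maps to the code
/// fragments contributed to it, in the order they were written.
#[derive(Default)]
struct Chunks {
    parts: BTreeMap<String, Vec<String>>,
}

impl Chunks {
    /// Mirrors the idea behind `+≡`: add a fragment to a named section,
    /// creating the section if this is its first definition.
    fn append(&mut self, name: &str, code: &str) {
        self.parts
            .entry(name.to_string())
            .or_default()
            .push(code.to_string());
    }

    /// Concatenate every fragment of a section, one per line.
    /// An unknown section yields an empty string.
    fn body(&self, name: &str) -> String {
        self.parts
            .get(name)
            .map(|fragments| fragments.join("\n"))
            .unwrap_or_default()
    }
}

/// Builds a two-fragment example. The first declaration is a placeholder
/// invented for illustration; the second is the one quoted above.
fn example_globals() -> Chunks {
    let mut chunks = Chunks::default();
    chunks.append("Global variables", "placeholder_global : integer ;");
    chunks.append("Global variables", "just_box : pointer ;");
    chunks
}
```

Calling `example_globals().body("Global variables")` gives both declarations in the order they were contributed, which is roughly what the Pascal compiler ends up seeing, while the reader of the book only ever sees the one that matters for the current chapter.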
Next, we are given an outline of the whole line_break procedure, where we can see that it has four basic steps:
〈 Declare subprocedures for line break 826 〉
procedure line_break (final_widow_penalty : integer );
label done , done1 , done2 , done3 , done4 , done5 , continue ;
var 〈 Local variables for line breaking 862 〉
begin pack_begin_line ← mode_line ; { this is for over/underfull box messages }
〈 Get ready to start line breaking 816 〉;
〈 Find optimal breakpoints 863 〉;
〈 Break the paragraph at the chosen breakpoints, justify the resulting lines to the correct widths, and append them to the current vertical list 876 〉;
〈 Clean up the memory by removing the break nodes 865 〉;
pack_begin_line ← 0;
end;
The third of those steps is rather more complex than the others, which is perhaps a little unusual. Also, the second step was the subject of a whole paper. Regardless, this is a pretty good overview of the process: Identify the optimal breakpoints, then rebuild the data structures so that instead of one line of text that doesn’t fit we have a vertical list of lines of text that do fit.
If we immediately saw a few hundred lines of code here, we would have no idea what any of them were doing, or why.
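For anyone wondering how an outline like that turns back into something a compiler accepts: the named sections are expanded in place, recursively, until no references remain. Below is a rough Rust sketch of that expansion step. It assumes a noweb-style `<<name>>` reference syntax rather than WEB's real notation, the `expand` function is my own name, and the section bodies are placeholders rather than real TeX code.

```rust
use std::collections::HashMap;

/// Replace every `<<name>>` reference in `text` with the expansion of the
/// section called `name`. Unknown names are left in place so the gap stays
/// visible. Assumes the definitions are not cyclic; a real tool would check.
fn expand(text: &str, sections: &HashMap<&str, &str>) -> String {
    let mut out = String::new();
    let mut rest = text;
    while let Some(start) = rest.find("<<") {
        // Offset of the closing marker, relative to `start`.
        let len = match rest[start..].find(">>") {
            Some(len) => len,
            None => break, // unterminated reference: copy the tail verbatim
        };
        let name = rest[start + 2..start + len].trim();
        out.push_str(&rest[..start]);
        match sections.get(name) {
            // Section bodies may themselves contain references.
            Some(body) => out.push_str(&expand(body, sections)),
            None => out.push_str(&rest[start..start + len + 2]),
        }
        rest = &rest[start + len + 2..];
    }
    out.push_str(rest);
    out
}

/// Two section names are borrowed from the outline above; the third name
/// and all of the bodies are invented placeholders.
fn example_program() -> String {
    let mut sections = HashMap::new();
    sections.insert("Get ready to start line breaking", "prepare;");
    sections.insert("Find optimal breakpoints", "<<Hypothetical inner step>>");
    sections.insert("Hypothetical inner step", "search;");
    let outline =
        "begin <<Get ready to start line breaking>> <<Find optimal breakpoints>> end";
    expand(outline, &sections)
}
```

The point of the sketch is only that the human-facing order (overview first, details later) and the compiler-facing order are decoupled by this substitution step.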
Of course, modern programming languages are much more expressive than Pascal. If this program had been written in Rust, for example, then the code would have started out much more readable. The boxes would be defined in a crate of their own, so that no global definitions would need to be inserted into the main crate. The main crate would have a single `use` line for the box crate, and would probably refer to box-related types and functions using prefixed names like `box::short_display` or `box::fast_delete_glue_ref`. The `line_break` function would only be as long as this overview here, because all of the small tasks that it does would be packaged up into iterators and such. That third subtask in the overview might literally be `current_vlist = break_paragraph(para, breakpoints).iter().map(justify_line).collect()`. The overview would almost not be necessary; the code is approaching the expressiveness of the overview.
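Here is a toy version of that one-liner, just to show the shape would hold up. Everything in it is an assumption for illustration: `Line`, `JustifiedLine`, and `build_vlist` are invented names, a paragraph is reduced to a slice of words, breakpoints are word indices, and `justify_line` only joins words with spaces instead of doing any real glue setting.

```rust
/// Hypothetical stand-ins for TeX's data structures; none of these types
/// come from tex.web.
#[derive(Debug)]
struct Line {
    words: Vec<String>,
}

#[derive(Debug, PartialEq)]
struct JustifiedLine {
    text: String,
}

/// Split the paragraph's words at the given indices, where each index is
/// the first word of a new line. Breakpoints are assumed to be sorted and
/// within range.
fn break_paragraph(para: &[&str], breakpoints: &[usize]) -> Vec<Line> {
    let mut lines = Vec::new();
    let mut start = 0;
    // The end of the paragraph acts as one final breakpoint.
    let ends = breakpoints.iter().copied().chain(std::iter::once(para.len()));
    for end in ends {
        let words = para[start..end].iter().map(|w| w.to_string()).collect();
        lines.push(Line { words });
        start = end;
    }
    lines
}

/// Placeholder for the real work of setting glue: it just joins the words
/// with single spaces.
fn justify_line(line: &Line) -> JustifiedLine {
    JustifiedLine { text: line.words.join(" ") }
}

/// The third step of the outline, written as the one-liner from the text.
fn build_vlist(para: &[&str], breakpoints: &[usize]) -> Vec<JustifiedLine> {
    break_paragraph(para, breakpoints).iter().map(justify_line).collect()
}
```

With the words `["the", "quick", "brown", "fox", "jumps"]` and breakpoints `[2, 4]`, `build_vlist` produces three lines. The real algorithm obviously hides far more behind those two function names, which is where prose explanation would still earn its keep.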
I think that this shows that the literate programming style that Knuth uses does not make it “very hard to comprehend what one single part of a given algorithm does”, nor is it “almost impossible to get a good picture of how all these pieces fit together”. I think it enables Knuth to explain everything in a much clearer way than he ever could have done with comments in the source code. I think that many programs could benefit from a similar treatment, even when they are written in better languages than Pascal.
RStudio supports this for R. You can sprinkle in other languages too, although unsurprisingly they are second-class citizens. Emacs has org-mode with Babel which, combined with polymode, gets you every feature you'd expect from the major mode of the language you're writing in. Emacs also supports R Markdown/Sweave quite well.
Good question!
I always think it would be nice to just write Markdown sprinkled with code, but without IDE/editor support, it's dead in the water :(