No, client-side rendering is not intrinsically safer than server-side rendering,...

furstenheim · on Feb 10, 2022

Sorry, what do you mean by parsed identically? In CSR you can have data displayed in the front end without it ever being parsed as HTML. You make an HTTP call to the backend, get JSON back, read a property and do element.textContent = myData. If that's unsafe, that would be a bug in the browser, wouldn't it?

chrismorgan · on Feb 10, 2022

I was going to use optional start tags and tbody as my example, but on checking the spec it turns out that tr is actually valid as a direct child of table, even if the HTML syntax will prevent you from creating it by inserting a tbody around it. (XHTML 1.0 validation also confirms that tbody is genuinely optional there.) This actually undermines my “as is the case in valid HTML”—but never mind, I’ll demonstrate what the point was, and what is at least generally the case.

So let’s go with a more egregious invalidity: nested links. Which browsers do actually support, but HTML syntax doesn’t. Suppose you produce this DOM tree (server side or client side, I don’t care):

  p
  ├ #text "Look at this "
  ├ a href="https://a.example"
  │ ├ #text "link with "
  │ ├ a href="https://b.example"
  │ │ └ #text "nesting"
  │ └ #text " like so"
  └ #text "!"

(Client-side, you could generate it like this:

  let p = document.createElement("p");
  let a1 = document.createElement("a");
  let a2 = document.createElement("a");
  a1.href = "https://a.example";
  a2.href = "https://b.example";
  a2.append("nesting");
  a1.append("link with ", a2, " like so");
  p.append("Look at this ", a1, "!");

)

That serialises to this in both HTML and XML syntaxes:

  <p>Look at this <a href="https://a.example">link with <a href="https://b.example">nesting</a> like so</a>!</p>

(Client-side, `p.outerHTML`; `new XMLSerializer().serializeToString(p)` shows the XML syntax, which is the same modulo an xmlns attribute for XML reasons. Incidentally, `p.outerHTML` gives you HTML syntax for an HTML-syntax document and XML syntax for an XML-syntax document, which mostly means if you served the file with the application/xhtml+xml MIME type.)
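The reason the nested links come out untouched is that serialisation only walks the tree and writes tags; it never asks whether the nesting is valid. A minimal sketch of that, assuming a hypothetical plain-object tree (`el` and `text` are made-up helpers, not DOM APIs) and simplified escaping rules:

```javascript
// Hypothetical plain-object model of the tree above; not the DOM API.
const text = (data) => ({ type: "text", data });
const el = (name, attrs, ...children) => ({ type: "element", name, attrs, children });

// Simplified escaping, loosely modelled on HTML fragment serialisation.
const escapeText = (s) =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
const escapeAttr = (s) =>
  s.replace(/&/g, "&amp;").replace(/"/g, "&quot;");

// Walk the tree and emit tags. Note that nothing here checks whether
// an <a> is allowed inside another <a>.
function serialize(node) {
  if (node.type === "text") return escapeText(node.data);
  const attrs = Object.entries(node.attrs)
    .map(([key, value]) => ` ${key}="${escapeAttr(value)}"`)
    .join("");
  const inner = node.children.map(serialize).join("");
  return `<${node.name}${attrs}>${inner}</${node.name}>`;
}

const p = el("p", {},
  text("Look at this "),
  el("a", { href: "https://a.example" },
    text("link with "),
    el("a", { href: "https://b.example" }, text("nesting")),
    text(" like so")),
  text("!"));

const html = serialize(p);
console.log(html);
```

Any validity check has to be a separate step, because the writer itself has no opinion on what may contain what.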

But parse that with the HTML syntax, and the nested links break (e.g. `document.body.innerHTML = p.outerHTML`):

  p
  ├ #text "Look at this "
  ├ a href="https://a.example"
  │ ├ #text "link with "
  ├ a href="https://b.example"
  │ └ #text "nesting"
  └ #text " like so!"

And that is the steady state (meaning you can round-trip it again as much as you like and it will no longer change):

  <p>Look at this <a href="https://a.example">link with </a><a href="https://b.example">nesting</a> like so!</p>
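One way to check for that steady state is to keep re-parsing and re-serialising until the string stops changing. A rough sketch, assuming you supply the `reparse` function yourself (`passesUntilStable` is a made-up name, and the browser version in the comment is only one possible way to build `reparse`):

```javascript
// Re-run `reparse` until the markup stops changing, and report how many
// passes actually changed it.
function passesUntilStable(html, reparse, maxPasses = 10) {
  let current = html;
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = reparse(current);
    if (next === current) return { passes: pass, html: current };
    current = next;
  }
  throw new Error(`not stable after ${maxPasses} passes`);
}

// In a browser, `reparse` could be built on a <template> element
// (an assumption for illustration, not something from the thread):
//
//   const reparse = (html) => {
//     const t = document.createElement("template");
//     t.innerHTML = html;
//     return t.innerHTML;
//   };
```

Markup that is already stable reports zero passes; the nested-link string above should need exactly one.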

Returning to the initial remark you’re asking about: I wrote that with more than just HTML in mind. That’s partly why I brought HTTP into it later on, and other formats like Markdown may be in use as well; the parent comment had also mentioned SQL parameters, which are another good example of the issue at hand. It was a general remark about stability and safety: interpolating raw strings is just dangerous, and you should parse and serialise instead, provided the format has been designed so that that’s a safe operation. As it happens, the typical DOM tree representation of HTML doesn’t protect you enough, so you need to work with valid HTML for it to be fully robust.

Actually, I’ve just thought of the perfect example of why valid HTML is important when you’re crafting a tree for serialisation, because it actually would introduce an injection vulnerability: comments. Contemplate this:

  document.createComment('--><script>alert("pwnd")</script><!--')

  #comment "--><script>alert("pwnd")</script><!--"

  <!----><script>alert("pwnd")</script><!---->

Or you could break scripts by injecting </script> or stylesheets by injecting </style>, given that they don’t use HTML entity escaping. I think these are the only cases where invalid HTML could actually be harmful; most places (not that there are many—optional start tags, link nesting and paragraph nesting are just about it) it’ll just shuffle the DOM slightly.
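Both cases can be caught with plain string checks before serialising. A sketch under assumptions: `isSafeCommentData` and `isSafeRawText` are made-up names, the comment check follows the restrictions HTML syntax places on comment text, and the script/style check is deliberately conservative (it rejects anything that looks like the start of the closing tag, and ignores the extra `<!--` rules that apply to scripts):

```javascript
// Comment data that HTML syntax does not allow, and which therefore
// will not round-trip as a single comment.
function isSafeCommentData(data) {
  return !(
    data.startsWith(">") ||
    data.startsWith("->") ||
    data.includes("<!--") ||
    data.includes("-->") ||
    data.includes("--!>") ||
    data.endsWith("<!-")
  );
}

// Script and style contents are not entity-escaped, so anything that
// looks like their closing tag ends the element early.
function isSafeRawText(tagName, content) {
  return !content.toLowerCase().includes(`</${tagName.toLowerCase()}`);
}

console.log(isSafeCommentData('--><script>alert("pwnd")</script><!--')); // false
console.log(isSafeRawText("script", 'var s = "</SCRIPT>";'));            // false
```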

Y’know what? I’m starting to think even the tree form is rather dangerous to work in for HTML. XML syntax protects you from almost all inconsistency, but doesn’t guard against that comment attack (that’s literally the only thing it’ll miss) and loses the <noscript> element.

I’m tempted to retract my position that client-side rendering is not intrinsically safer than server-side, but so long as you have a step that validates your HTML before you serialise it, you’re still OK (and even the breakages depend on injecting arbitrary content into a comment, script tag or style tag, which are all extremely unlikely), so I retain my position, now hanging precariously from that delicate thread of the word “intrinsically”. I think there’s a gaping chasm below me. Hopefully there’s something soft to land on.
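As a sketch of what one piece of such a validation step might look like, here is a check for the nested-link case only, run over a hypothetical plain-object tree (the `{ type, name, attrs, children }` shape and `findNestedLinks` are made up for illustration; a real validator would cover far more of the content model):

```javascript
// Hypothetical plain-object tree, not the DOM API.
const tree = {
  type: "element", name: "p", attrs: {}, children: [
    { type: "text", data: "Look at this " },
    { type: "element", name: "a", attrs: { href: "https://a.example" }, children: [
      { type: "text", data: "link with " },
      { type: "element", name: "a", attrs: { href: "https://b.example" }, children: [
        { type: "text", data: "nesting" },
      ] },
      { type: "text", data: " like so" },
    ] },
    { type: "text", data: "!" },
  ],
};

// Collect the href of every <a> that sits inside another <a>.
function findNestedLinks(node, insideLink = false, found = []) {
  if (node.type !== "element") return found;
  const isLink = node.name === "a";
  if (isLink && insideLink) found.push(node.attrs.href);
  for (const child of node.children) {
    findNestedLinks(child, insideLink || isLink, found);
  }
  return found;
}

const problems = findNestedLinks(tree);
if (problems.length > 0) {
  console.log("refusing to serialise, nested links:", problems);
}
```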

scambier · on Feb 11, 2022

quick unrelated question: do you use a tool to draw the indent levels?

chrismorgan · on Feb 11, 2022

I typed it out manually in Vim using its built-in digraphs to get the box drawing characters: <C-K>vv for the vertical line, <C-K>vr for the vertical-and-right line, <C-K>ur for the up-and-right line.

See also https://news.ycombinator.com/item?id=30273299 for related discussion yesterday and one popular tool that helps with related things, if not this specific style of illustration.