No, client-side rendering is not intrinsically safer than server-side rendering, provided all outputs of serialisation are parsed identically (as is the case for valid HTML trees).
The problems start when you try to manipulate serialised data, which is not safe to do in the general case. You should instead construct a proper representation of what you desire, and then serialise that, depending on the serialiser to take care of escaping and all of that sort of stuff. This approach has always been fairly popular in compiled languages and languages that like types, but dynamic languages have historically strongly preferred to manipulate strings, I suspect because the other approach has poor ergonomics there and is probably slower in interpreted languages. You’ll note that React felt the need to extend JavaScript to make its approach acceptable to people.
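A minimal sketch of the "construct, then serialise" approach, assuming a made-up node shape (`text` and `element` are hypothetical helpers, not any real library's API) and ignoring void elements, comments and raw-text elements like script and style:

```javascript
// Hypothetical node constructors for illustration only.
const text = (data) => ({ type: "text", data });
const element = (name, attrs = {}, children = []) => ({ type: "element", name, attrs, children });

// Escaping lives in exactly one place: the serialiser.
const escapeText = (s) => s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
const escapeAttr = (s) => s.replace(/&/g, "&amp;").replace(/"/g, "&quot;");

function serialise(node) {
  if (node.type === "text") return escapeText(node.data);
  const attrs = Object.entries(node.attrs)
    .map(([key, value]) => ` ${key}="${escapeAttr(value)}"`)
    .join("");
  const children = node.children.map(serialise).join("");
  return `<${node.name}${attrs}>${children}</${node.name}>`;
}

// Untrusted data goes into the tree as data, never into the markup string.
const userInput = "<script>alert(1)</script>";
const tree = element("p", { title: 'say "hi"' }, [text("Hello, "), text(userInput)]);
const html = serialise(tree);
// html: <p title="say &quot;hi&quot;">Hello, &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The calling code never has to think about escaping, which is the whole point.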
Most JavaScript tooling that supports server-side rendering now works this way, crafting a DOM tree and then serialising that. Svelte is a notable exception in that it takes a declarative DOM tree and essentially serialises what it can at compile time, thereby still retaining the required safety guarantees.
There are definitely downsides to strict adherence to the model of crafting a data structure and then serialising it; most significantly, you can’t start streaming a response until you’re done. The solution for this is to use an append-only data structure (or possibly one that allows you to “commit” the document up to a given point, while still allowing mutations in anything that occurs later in the document); thus serialisation can begin before you finish writing the document.
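A rough sketch of the append-only idea, assuming a hypothetical `AppendOnlyHtmlWriter` class and a `sink` callback that stands in for the response stream. Each call emits its serialised chunk straight away, so nothing already written can be mutated later:

```javascript
// Hypothetical append-only writer, for illustration only.
class AppendOnlyHtmlWriter {
  constructor(sink) {
    this.sink = sink; // called with each serialised chunk as soon as it exists
    this.open = [];   // stack of currently open element names
  }
  start(name) {
    this.open.push(name);
    this.sink(`<${name}>`);
    return this;
  }
  text(data) {
    this.sink(data.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;"));
    return this;
  }
  end() {
    const name = this.open.pop();
    if (name === undefined) throw new Error("no open element to close");
    this.sink(`</${name}>`);
    return this;
  }
}

// Collect chunks in an array here; a server would write them to the socket.
const chunks = [];
const writer = new AppendOnlyHtmlWriter((chunk) => chunks.push(chunk));
writer.start("ul");
for (const item of ["one", "two & three"]) {
  writer.start("li").text(item).end();
}
writer.end();
```

The first chunk (`<ul>`) is available before any list item has been produced, and the stack of open elements keeps the output balanced.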
You know the old favourite about parsing HTML with regular expressions? <https://stackoverflow.com/questions/1732348/regex-match-open...> (If not, enjoy!) This is the thing people need to understand and realise in the general case: serialised data should be treated as opaque, and only interacted with after real parsing and before real serialisation.
HTTP headers aren’t strings; "Date: Tue, 15 Nov 1994 08:12:31 GMT" is a serialised HTTP header, representing the actual header that’s more like {Date, 1994-11-15T08:12:31Z}. And that latter is the form you should interact with it in.
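To make that concrete, here is a small sketch that handles only the Date header. `parseDateHeader` and `serialiseDateHeader` are hypothetical names, and the parsing leans on the JavaScript `Date` constructor accepting the same format that `toUTCString()` produces:

```javascript
// Hypothetical helpers: parse the serialised header into a structured value,
// work on that value, and only serialise again at the very end.
function parseDateHeader(line) {
  const separator = line.indexOf(":");
  if (separator === -1) throw new Error("not a header line");
  const name = line.slice(0, separator).trim();
  const value = new Date(line.slice(separator + 1).trim());
  if (Number.isNaN(value.getTime())) throw new Error("unparseable date");
  return { name, value };
}

function serialiseDateHeader({ name, value }) {
  // toUTCString() yields the "Tue, 15 Nov 1994 08:12:31 GMT" form.
  return `${name}: ${value.toUTCString()}`;
}

const header = parseDateHeader("Date: Tue, 15 Nov 1994 08:12:31 GMT");

// Manipulate the structured form (add one hour), not the string.
const oneHourLater = {
  name: header.name,
  value: new Date(header.value.getTime() + 60 * 60 * 1000),
};
const reserialised = serialiseDateHeader(oneHourLater);
// reserialised: "Date: Tue, 15 Nov 1994 09:12:31 GMT"
```

Adding an hour by editing the string would mean handling rollovers of the day, the weekday name, the month and the year by hand; on the structured value it is one addition.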
HTML isn’t strings; "<p>Hello, world!</p>" is the serialised form of a paragraph element containing a text node with data “Hello, world!”. And that’s the form you should interact with it in.
Yes, I am presenting a strongly-opinionated position that lacks any shade of pragmatism. Yes, my website is generated with templates that manipulate serialised HTML. Eventually I’ll replace it with something more sound.
One last note: at the start I said valid HTML, because it’s not enough to just serialise an arbitrary HTML DOM tree, as you can easily craft invalid HTML DOM trees, like nesting hyperlinks. In most regards, the XML syntax of HTML (still a thing) is actually a safer target to serialise to because then you don’t even need to validate your tree to be confident it won’t get mangled by the serialise/parse round-trip.
Sorry, what do you mean by parsed identically? In CSR you can have data displayed in the front-end without it ever being parsed as HTML. You make some HTTP call to the backend, get some JSON back, read a property and do `element.textContent = myData`. If that's unsafe, it would be a bug in the browser, wouldn't it?
I was going to use optional start tags and tbody as my example, but on checking the spec it turns out that tr is actually valid as a direct child of table, even if the HTML syntax will prevent you from creating it by inserting a tbody around it. (XHTML 1.0 validation also confirms that tbody is genuinely optional there.) This actually undermines my “as is the case in valid HTML”—but never mind, I’ll demonstrate what the point was, and what is at least generally the case.
So let’s go with a more egregious invalidity: nested links. Which browsers do actually support, but HTML syntax doesn’t. Suppose you produce this DOM tree (server side or client side, I don’t care):
p
├ #text "Look at this "
├ a href="https://a.example"
│ ├ #text "link with "
│ ├ a href="https://b.example"
│ │ └ #text "nesting"
│ └ #text " like so"
└ #text "!"
(Client-side, you could generate it like this:
let p = document.createElement("p");
let a1 = document.createElement("a");
let a2 = document.createElement("a");
a1.href = "https://a.example";
a2.href = "https://b.example";
a2.append("nesting");
a1.append("link with ", a2, " like so");
p.append("Look at this ", a1, "!");
)
That serialises to this in both HTML and XML syntaxes:
<p>Look at this <a href="https://a.example">link with <a href="https://b.example">nesting</a> like so</a>!</p>
(Client-side, `p.outerHTML`; `new XMLSerializer().serializeToString(p)` shows the XML syntax, which is the same modulo an xmlns attribute for XML reasons. Incidentally, `p.outerHTML` gives you HTML syntax for an HTML-syntax document and XML syntax for an XML-syntax document, which mostly means if you served the file with the application/xhtml+xml MIME type.)
But parse that with the HTML syntax, and the nested links break (e.g. `document.body.innerHTML = p.outerHTML`):
p
├ #text "Look at this "
├ a href="https://a.example"
│ ├ #text "link with "
├ a href="https://b.example"
│ └ #text "nesting"
└ #text " like so!"
And that is the steady state (meaning you can round-trip it again as much as you like and it will no longer change):
<p>Look at this <a href="https://a.example">link with </a><a href="https://b.example">nesting</a> like so!</p>
Returning to the initial remark you’re asking about: I wrote that with more than just HTML in mind. That’s partly why I brought HTTP into it later on; other formats like Markdown may also be in play, and SQL parameters, mentioned in the parent comment, are another good example of the issue at hand. It was a general remark about stability and safety: interpolating strings raw is just dangerous, and you should parse and serialise instead, provided the format has been designed so that that’s a safe operation. As it happens, the typical DOM tree representation of HTML doesn’t protect you enough, so you need to work with valid HTML for it to be fully robust.
Actually, I’ve just thought of the perfect example of why valid HTML is important when you’re crafting a tree for serialisation, because it actually would introduce an injection vulnerability: comments. Contemplate this:
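A minimal sketch of that kind of comment injection, assuming a hypothetical comment node shape and a serialiser that emits comment data verbatim (which is all a serialiser can do, since comments have no escaping mechanism). The `isSafeCommentData` check is my guess at what validating before serialising could look like, roughly following the restrictions the HTML syntax places on comment text:

```javascript
// Hypothetical comment node; the data is whatever the application put there.
const comment = (data) => ({ type: "comment", data });

// Comment data cannot be escaped, so it is written out as-is.
const serialiseComment = (node) => `<!--${node.data}-->`;

// Assumed pre-serialisation check: reject data that a parser would read
// as ending (or confusing) the comment.
function isSafeCommentData(data) {
  return !(
    data.startsWith(">") ||
    data.startsWith("->") ||
    data.includes("-->") ||
    data.includes("--!>") ||
    data.includes("<!--") ||
    data.endsWith("<!-")
  );
}

const hostile = comment("--><script>alert(1)</script><!--");
const serialised = serialiseComment(hostile);
// serialised: <!----><script>alert(1)</script><!---->
// A parser ends the first comment at the first "-->", so the script element
// that was inert comment data in the tree becomes live markup.
```

In the browser, `document.createComment()` accepts such data without complaint, so the tree looks harmless right up until it is serialised and parsed again.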
Or you could break scripts by injecting </script> or stylesheets by injecting </style>, given that they don’t use HTML entity escaping. I think these are the only cases where invalid HTML could actually be harmful; most places (not that there are many—optional start tags, link nesting and paragraph nesting are just about it) it’ll just shuffle the DOM slightly.
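The script and style cases could be guarded in a similar way. This is a deliberately conservative sketch with a hypothetical `breaksOutOfRawText` helper; it does not attempt to cover the extra subtleties of script content (such as `<!--` followed by `<script`):

```javascript
// Raw-text elements (script, style) are not entity-escaped, so their content
// must not contain anything the parser would read as the closing tag.
// Conservative: flags any case-insensitive "</tagname" occurrence.
function breaksOutOfRawText(tagName, content) {
  return content.toLowerCase().includes(`</${tagName.toLowerCase()}`);
}

const scriptBody = 'var s = "</SCRIPT><img src=x onerror=alert(1)>";';
const styleBody = "p { color: red }";

const scriptIsDangerous = breaksOutOfRawText("script", scriptBody); // true
const styleIsDangerous = breaksOutOfRawText("style", styleBody);    // false
```

A validation step would refuse to serialise (or would transform) any script or style element for which this returns true.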
Y’know what? I’m starting to think even the tree form is rather dangerous to work in for HTML. XML syntax protects you from almost all inconsistency, but doesn’t guard against that comment attack (that’s literally the only thing it’ll miss) and loses the <noscript> element.
I’m tempted to retract my position that client-side rendering is not intrinsically safer than server-side, but so long as you have a step that validates your HTML before you serialise it, you’re still OK (and even the breakages depend on injecting arbitrary content into a comment, script tag or style tag, which are all extremely unlikely), so I retain my position, now hanging precariously from that delicate thread of the word “intrinsically”. I think there’s a gaping chasm below me. Hopefully there’s something soft to land on.
I typed it out manually in Vim using its built-in digraphs to get the box drawing characters: <C-K>vv for the vertical line, <C-K>vr for the vertical-and-right line, <C-K>ur for the up-and-right line.
See also https://news.ycombinator.com/item?id=30273299 for related discussion yesterday and one popular tool that helps with related things, if not this specific style of illustration.