Yes, I completely understand the reason for the dialects. I’ve been working on compilers for decades, and what’s interesting about MLIR is that it’s a bit of an anachronistic approach. People used to write optimizers with multiple levels of IR, often based on the same data structures (but sometimes not), and they were effectively different dialects in the same sense as MLIR.
That fell out of favor because you end up having to choose and commit to the phase ordering fairly early on and writing separate lowering steps to convert between dialects. So the tide turned toward having a single mid-level IR (and sometimes a single high-level IR for things like specific loop optimizations that was then lowered to that mid-level IR).
> So the tide turned toward having a single mid-level IR (and sometimes a single high-level IR for things like specific loop optimizations that was then lowered to that mid-level IR).
You realize this is only feasible if you have one team working on a compiler for one domain right? E.g., Rust's MIR is probably a good target for a systems language like Rust but a bad target for a SQL-like language.
> phase ordering fairly early on and writing separate lowering steps to convert between dialects.
I don't see how a single IR solves the phase ordering problem? LLVM IR is a single IR (not talking about backends) and yet you still have phase ordering problems.
> You realize this is only feasible if you have one team working on a compiler for one domain right?
It sounds like you think I’m advocating for something. I’m not. These are all just engineering trade-offs that depend on your goals.
Regarding phase ordering: A single IR allows you to freely reorder passes rather than having to reimplement them if you want to move them earlier or later in the phase order.
From an optimization perspective, such dialects are pretty much like the intermediate data structures the "single-IR" style passes build internally anyway (e.g. various loop analyses), just in a sharable and more consistent (if less performant) form.
Single IR passes from that perspective are roughly equivalent to MLIR-style `ir = loop_to_generic_dialect(my_loop_optimization(generic_to_loop_dialect(ir)))`.
This assumes the existence of bidirectional dialect transformations. Note that even LLVM IR, while a single IR, is technically multi-level as well, e.g. for instruction selection, it needs to be canonicalized & expanded first, and feeding arbitrary IR into that pass will result in an exception (or sometimes even a segfault, considering it is C++).
Also, even though passes for single IR can theoretically be run in an arbitrary order, they are generally run in an order that can re-use (some) intermediate analysis results. This is, again, equivalent to minimizing the number of inter-dialect transformations in a multi-dialect IR.
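To make the round trip above concrete, here is a minimal Python sketch. Everything in it is a made-up toy: the tuple-based "generic" IR, the `Loop` node, and the three function bodies are assumptions for illustration, not MLIR or LLVM APIs. It only shows the shape of the argument: a "single-IR" loop pass behaves like raising to a loop dialect, optimizing there, and lowering back.

```python
from dataclasses import dataclass

# Hypothetical toy IRs (assumptions for illustration only).
# "Generic" IR: a flat list of tuples; loops are delimited by begin/end markers.
# "Loop" dialect: the same program with each loop as a structured Loop node.


@dataclass
class Loop:
    trip_count: int
    body: list


def generic_to_loop_dialect(ir):
    """Raise flat loop_begin/loop_end markers into nested Loop nodes."""
    root = []
    stack = [root]  # stack[-1] is the body currently being filled
    for op in ir:
        if op[0] == "loop_begin":
            loop = Loop(trip_count=op[1], body=[])
            stack[-1].append(loop)
            stack.append(loop.body)
        elif op[0] == "loop_end":
            stack.pop()
        else:
            stack[-1].append(op)
    return root


def loop_to_generic_dialect(ir):
    """Lower nested Loop nodes back to flat begin/end markers."""
    flat = []
    for op in ir:
        if isinstance(op, Loop):
            flat.append(("loop_begin", op.trip_count))
            flat.extend(loop_to_generic_dialect(op.body))
            flat.append(("loop_end",))
        else:
            flat.append(op)
    return flat


def my_loop_optimization(ir):
    """Drop loops that never run; inline loops that run exactly once."""
    out = []
    for op in ir:
        if not isinstance(op, Loop):
            out.append(op)
            continue
        body = my_loop_optimization(op.body)
        if op.trip_count == 0:
            continue
        if op.trip_count == 1:
            out.extend(body)
        else:
            out.append(Loop(op.trip_count, body))
    return out


def single_ir_pass(ir):
    """What a 'single-IR' loop pass does, spelled out as a dialect round trip."""
    return loop_to_generic_dialect(my_loop_optimization(generic_to_loop_dialect(ir)))


program = [
    ("const", 1),
    ("loop_begin", 1), ("add", 2), ("loop_end",),
    ("loop_begin", 0), ("mul", 3), ("loop_end",),
    ("loop_begin", 4), ("add", 5), ("loop_end",),
]
optimized = single_ir_pass(program)
```

In this toy, the raise and lower steps are exact inverses, which is the "bidirectional" assumption; in a real compiler the raise step is usually the hard part and may lose or fail to recover structure.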
I have no idea what you mean by “how any of this works”.
I didn’t bring up solving the phase ordering problem, you did.
I’m simply pointing out the following. Say you have a compiler with multiple IRs or dialects of IR, a pass that is written to work on IR “X”, and at some point after that pass a translation to IR “Y”. If you want to move your pass after that point of translation, you either need to rewrite your pass so that it operates on “Y”, or you need to translate back to “X” again.
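A small Python sketch of that situation, with invented toy IRs: “X” is a nested expression tree, “Y” is a flat stack-machine form, and `fold_constants_x`, `lower_x_to_y`, and `raise_y_to_x` are hypothetical names, not taken from any real compiler. The pass only understands “X”, so running it after the lowering means translating back first.

```python
# Hypothetical toy IRs (assumptions for illustration only).
# IR "X": nested tuples, e.g. ("add", ("const", 2), ("var", "a"))
# IR "Y": flat stack-machine ops, e.g. [("push", 2), ("load", "a"), ("add",)]


def fold_constants_x(expr):
    """A pass written against IR X: fold additions of two constants."""
    if expr[0] == "add":
        lhs = fold_constants_x(expr[1])
        rhs = fold_constants_x(expr[2])
        if lhs[0] == "const" and rhs[0] == "const":
            return ("const", lhs[1] + rhs[1])
        return ("add", lhs, rhs)
    return expr


def lower_x_to_y(expr):
    """Translate IR X to IR Y (post-order walk of the tree)."""
    if expr[0] == "const":
        return [("push", expr[1])]
    if expr[0] == "var":
        return [("load", expr[1])]
    return lower_x_to_y(expr[1]) + lower_x_to_y(expr[2]) + [("add",)]


def raise_y_to_x(ops):
    """Translate IR Y back to IR X by symbolically executing the stack code."""
    stack = []
    for op in ops:
        if op[0] == "push":
            stack.append(("const", op[1]))
        elif op[0] == "load":
            stack.append(("var", op[1]))
        else:  # ("add",)
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(("add", lhs, rhs))
    return stack.pop()


expr = ("add", ("add", ("const", 2), ("const", 3)), ("var", "a"))

# Original phase order: run the X pass, then lower to Y.
early = lower_x_to_y(fold_constants_x(expr))

# Moving the pass after the lowering: it cannot consume Y directly,
# so either rewrite it for Y or translate back to X, as done here.
lowered = lower_x_to_y(expr)
late = lower_x_to_y(fold_constants_x(raise_y_to_x(lowered)))
```

Here both orders give the same result only because the toy `raise_y_to_x` recovers the tree exactly; the alternative is a second implementation of the fold that pattern-matches on the stack ops in “Y”.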