> and if it's analytics why were you using a row store instead of a columnar database?
This point confuses me: the parent is telling us how they went from a row store (Postgres) to a columnar DB (ClickHouse), and you're asking them why they didn't?
There are many, many ways to avoid a full scan: fixing a dumb query, proper indexing (partial indexes if a full index is too expensive), materialized views, and more. It's very common that developers either resist or simply aren't aware of how to use a database well. A full scan is dumb: it reads every row of a big table without any filtering or even ordering. This is highly suspicious.
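To illustrate the partial-index point: a minimal sketch using SQLite as a stand-in for Postgres (both support `CREATE INDEX ... WHERE`). The table and column names here are hypothetical, not from the project being discussed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, status TEXT, ts INTEGER)")

# Index only the rows the hot query actually touches, so the index stays small
# even when the table is huge.
cur.execute("CREATE INDEX idx_pending_ts ON events (ts) WHERE status = 'pending'")

# The planner can now satisfy this filtered, ordered query from the partial
# index instead of scanning the whole table.
plan = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM events WHERE status = 'pending' ORDER BY ts"
).fetchall()
print(plan)  # the plan detail should mention idx_pending_ts
```

In Postgres you'd check the same thing with `EXPLAIN`; the win is that a partial index over a small, frequently queried subset can be orders of magnitude smaller than a full index on the same column.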
And in the very, very rare case where a full scan is absolutely necessary, then Postgres might not be a good fit. So why is it being bashed? I read "Postgres was 12h slow", not "we picked the wrong tool and of course Postgres was slow". Don't blame the hammer.
IME, it's common for programmers not to be educated on what modern databases can do. This was made worse by all the NoSQL mania, culminating in "MongoDB is web scale". We are still dealing with the fallout of that.
FWIW, this project was one I came into at the deployment stage, and the original requirements were for something about 10x smaller. Budget gone. Shit happens.
There were many places where the right choices weren't made, and things could have been done better. But even with schema optimizations speeding things up by a factor of 3, we would still have been looking at something that didn't hit the performance needs. I'm guessing a reorg of the full DB would have taken O(week). Effective indexes are great, but when one optimized index is tens of GB and takes a day to create, you start to look elsewhere.