A hot/cold tier architecture might work for setups with a predictable, sharp cutoff from hot to cold, e.g. a logging setup where the current day's index gets all the indexing and queries, and cold data only sees occasional, low-concurrency queries.
Our workload is rather different. Indexing and document updates follow a roughly exponential decay pattern with index age, and so do queries, so there is no sharp cutoff.
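A quick way to see why no cutoff works: under exponential decay, any fixed age threshold still leaves a large fraction of the load on the "cold" side. This is a hypothetical illustration with a made-up half-life, not measured numbers from our workload.

```python
import math

half_life_days = 7  # assumed decay half-life, purely illustrative

def load_fraction_older_than(age_days):
    """Fraction of total load hitting indexes older than age_days,
    when load is proportional to exp(-lam * age)."""
    lam = math.log(2) / half_life_days
    return math.exp(-lam * age_days)

for cutoff in (1, 7, 30):
    print(f"cutoff {cutoff:>2}d -> {load_fraction_older_than(cutoff):.0%} of load is 'cold'")
```

With a 7-day half-life, a 7-day cutoff still sends 50% of all indexing and query load to the cold tier; even a 30-day cutoff leaves about 5% there.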
If we ran a hot/cold architecture we'd run into a few issues:
Within each tier the workload would be imbalanced across nodes, since it still varies greatly with the age of the indexes.
We use AWS i3 NVMe SSD instances. d2 instances with HDDs, or EBS volumes, have too high IO latency and too little throughput/IOPS even for our "cold" data workload. So a cold tier would scale with storage needs while wasting lots of compute capacity, and a hot tier would scale with compute needs while wasting tons of storage capacity.
By running both hot and cold workloads on the same set of nodes we get much more cost-effective utilization: the hot workload uses most of the aggregate compute capacity, and the cold data uses most of the storage.
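The cost argument can be sketched with a toy sizing model. All numbers here are made up for illustration (node capacities loosely i3-shaped, demand figures invented): each node ships a fixed ratio of compute to storage, so a tier scales on whichever resource binds first and strands the other, while a single mixed pool lets the two workloads share.

```python
import math

NODE_CPU, NODE_STORAGE_TB = 16, 15  # per-node capacity, illustrative

def nodes_needed(cpu, storage_tb):
    # A pool must satisfy both its compute demand and its storage demand;
    # it scales on whichever resource runs out first.
    return max(math.ceil(cpu / NODE_CPU), math.ceil(storage_tb / NODE_STORAGE_TB))

hot = {"cpu": 300, "storage_tb": 60}    # compute-bound workload
cold = {"cpu": 40, "storage_tb": 400}   # storage-bound workload

tiered = nodes_needed(hot["cpu"], hot["storage_tb"]) + nodes_needed(cold["cpu"], cold["storage_tb"])
single = nodes_needed(hot["cpu"] + cold["cpu"], hot["storage_tb"] + cold["storage_tb"])
print(f"tiered: {tiered} nodes, single pool: {single} nodes")
```

With these numbers the hot tier sizes on CPU (wasting storage), the cold tier sizes on disk (wasting CPU), and the separate tiers need 46 nodes where a single pool needs 31.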
But this then necessitates using Shardonnay to spread the workload optimally across the clusters. The more evenly we can spread it, the higher total utilization we can run the clusters at without individual nodes overloading.
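The balancing idea can be sketched as a greedy placement (this is not Shardonnay itself, just a minimal stand-in with invented shard loads): place the heaviest shards first onto the currently least-loaded node. Since the cluster's safe utilization is capped by its hottest node, the closer the hottest node is to the mean, the harder the whole cluster can be driven.

```python
def place_shards(shard_loads, n_nodes):
    """Greedy balancing: heaviest shard first, onto the least-loaded node."""
    nodes = [0.0] * n_nodes
    for load in sorted(shard_loads, reverse=True):
        i = min(range(n_nodes), key=lambda n: nodes[n])  # pick least-loaded node
        nodes[i] += load
    return nodes

loads = [9, 7, 6, 5, 4, 3, 2, 1]  # hypothetical per-shard load scores
nodes = place_shards(loads, 3)
print(f"hottest node: {max(nodes)}, perfectly even would be: {sum(loads) / 3:.2f}")
```

Here the hottest node ends at 13 against a perfect spread of about 12.33, so headroom lost to imbalance is small; a skewed placement would force the whole cluster to run at the level its single hottest node can tolerate.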
A hot/cold architecture would be much more costly for our workload, since we'd be paying for unused storage on the hot tier and unused compute capacity on the cold tier. A single tier just makes much more sense for our particular use case.