Interesting! We've built similar support for decoupling compute from storage into Elasticsearch and, as coincidence would have it, just shared some performance numbers today:

https://www.elastic.co/blog/querying-a-petabyte-of-cloud-sto...

It works just like any regular Elasticsearch index (with full Kibana support, etc.).

Because the data is indexed by Lucene, queries can use the index structures and return results orders of magnitude faster than a full table scan.
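To illustrate why an index beats a scan, here is a toy sketch in Python. It is not how Lucene is implemented; the documents and the function names (`full_scan`, `build_inverted_index`, `indexed_lookup`) are made up for illustration. It only shows the general idea of an inverted index: a term lookup touches one posting list instead of every document.

```python
from collections import defaultdict

# Hypothetical sample documents, keyed by document id.
docs = {
    1: "disk latency spike on node a",
    2: "login failure from unknown host",
    3: "disk full on node b",
}

def full_scan(term):
    # Reads every document, so the cost grows with the total data size.
    return sorted(doc_id for doc_id, text in docs.items() if term in text.split())

def build_inverted_index(documents):
    # Maps each term to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.split():
            index[token].add(doc_id)
    return index

index = build_inverted_index(docs)

def indexed_lookup(term):
    # One dictionary lookup, independent of how many documents exist.
    return sorted(index.get(term, set()))
```

Both functions return the same ids for a term such as "disk", but only the scan has to read all the text to get there.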

It is complemented by various caching layers to make repeat queries fast.

We expect this new functionality to be used for less frequently queried data (e.g. operational or security investigations, legal discoveries, or historical performance comparisons on older data), trading query speed for cost.

It supports Google Cloud Storage, Azure Blob Storage, Amazon S3 (and S3-compatible stores), HDFS, and shared file systems.