A comparison of advanced, modern cloud databases (brandur.org)
39 points by troydavis on May 23, 2017 | 16 comments


I don't want to defame the author, but I must ask: are they in any way affiliated with Amazon, or do they resell Amazon services?

I ask because there appear to be errors in the comparison matrix, and Amazon Aurora is listed as receiving full marks in every category, as if it were the greatest of all the solutions!

Some issues I see:

Amazon Aurora has checkboxes for horizontal scalability (it has none) and automatic data sharding (it has none), and its low latency is simply assumed. But Aurora is backed by disks (or provisioned-IO SSDs) and is not at all comparable to memcached in terms of latency. (See the >5ms writes in [1], which puts Amazon Aurora about where I'd expect any SQL database to be: 1-10ms for reads and writes.)

CosmosDB has no checkbox for ACID or Automatic Data Sharding. "All JavaScript logic is executed within an ambient ACID transaction with snapshot isolation. During the course of its execution, if the JavaScript throws an exception, then the entire transaction is aborted." [2], and "Partition management is fully managed by Azure Cosmos DB, and you do not have to write complex code or manage your partitions. Cosmos DB containers are unlimited in terms of storage and throughput." [3]. The "partition key" might be something of a misnomer. For a multi-tenant service, the partition key can be used to limit the performance of a tenant by assigning them one partition key, but if there is no desire to restrict performance, the partition key can be a randomly generated identifier.
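To make the partition key point concrete, here is a minimal Python sketch. It assumes items are placed on a partition by hashing the partition key; the hash function, the partition count, and the names `partition_for` and `distribution` are all hypothetical and are not Cosmos DB's actual internals.

```python
import hashlib
import uuid
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count, chosen for illustration


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key to a partition by hashing it (illustrative only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribution(keys) -> Counter:
    """Count how many items land on each partition."""
    return Counter(partition_for(k) for k in keys)


# One tenant, one partition key: every item lands on the same partition,
# so that tenant's load is confined to a single partition.
tenant_items = distribution(["tenant-42"] * 1000)

# A random identifier per item: items spread across the partitions.
random_items = distribution(str(uuid.uuid4()) for _ in range(1000))

print("tenant key :", dict(tenant_items))
print("random keys:", dict(sorted(random_items.items())))
```

With a single tenant key, all 1000 items hash to one partition; with random keys they spread out, which is the "no desire to restrict performance" case.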

MongoDB has no checkbox for automatic data sharding, but practically speaking the hash-based sharding is fine. It's really curious that the author gives Aurora a checkbox here, but not MongoDB. Is it not normal for `distributed` tables to be used in MongoDB deployments? I would guess it is.

Lastly, I just don't think Aurora is the same type of database as Google's Cloud Spanner and Microsoft's Cosmos DB. The latter are, really, a new kind of database, and Aurora is a managed MySQL or Postgres instance.

[1] https://pagely.com/blog/2015/08/we-benchmarked-amazons-new-a...

[2] https://docs.microsoft.com/en-us/azure/documentdb/documentdb...

[3] https://docs.microsoft.com/en-us/azure/cosmos-db/partition-d...


Citus should be marked as Open Source as well: https://github.com/citusdata/citus


No mention of Greenplum or Vitess?


Sorry, there are many cool databases out there, but I trimmed the list to the ones that I know a little about and find most pragmatic/interesting. I'll freely admit that the selection methodology is imperfect and somewhat skewed.


Aerospike should be considered if you need low latency and have relatively small row sizes.


The recommendations about MongoDB should carry a strong note about recent updates: MongoDB 3.4 has seen significant improvements.

https://www.mongodb.com/mongodb-3.4-passes-jepsen-test


I disagree that Mongo is a poor choice. It's very portable, it scales well, and it is fast. It doesn't feel like a cloud instance is necessary, so it may be a poor fit for this list.


> It's very portable, it scales well, and it is fast

How about some numbers to back you up?

I think the author's main gripe with MongoDB was the lack of ACID transactions.


That's largely accurate, but the core point I try to convey is that Mongo's key sales pitch is sharding and scalability, bought at the cost of features like ACID transactions, foreign keys, consistency checks, etc. Today we have a great number of options that don't have to make these Pyrrhic trade-offs and are just as scalable as Mongo.

If you want something that's portable, pretty scalable, and fast, go with Postgres. If you need something that's really scalable, there are plenty of other good databases on that list.
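As a rough illustration of what those features buy you, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a relational database (SQLite rather than Postgres is an assumption made only so the example runs without a server). The `accounts` and `transfers` tables and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency check
)
conn.execute(
    "CREATE TABLE transfers ("
    " id INTEGER PRIMARY KEY,"
    " src INTEGER NOT NULL REFERENCES accounts(id),"  # foreign keys
    " dst INTEGER NOT NULL REFERENCES accounts(id),"
    " amount INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 0)")
conn.commit()


def transfer(src: int, dst: int, amount: int) -> bool:
    """Move money atomically; any constraint violation undoes every step."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "INSERT INTO transfers (src, dst, amount) VALUES (?, ?, ?)",
                (src, dst, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(1, 2, 30)          # valid transfer
overdraft = transfer(1, 2, 500)  # violates the CHECK constraint
dangling = transfer(1, 99, 10)   # account 99 does not exist: foreign key violation

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
transfer_count = conn.execute("SELECT COUNT(*) FROM transfers").fetchone()[0]
print(ok, overdraft, dangling, balances, transfer_count)
```

The third call is the interesting one: the debit from account 1 succeeds, but the later foreign key failure rolls the whole transaction back, so the debit never becomes visible.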


That's a poor argument and the comparison is just unfair. Actually, I am disappointed the author mentions Aurora but not Amazon DynamoDB. Half of the contenders are RDBMSs, so it is totally unfair to suggest moving away from MongoDB because it lacks ACID. Furthermore, the article really lacks an in-depth technical comparison. This is more of a notebook. Don't get me wrong, I have used Mongo and I know some of the common problems beginners will run into. As with every database, there are gotchas. Postgres out of the box is great, but that doesn't mean it is the right fit for every problem, and you should still fine-tune your database.


DynamoDB is just not good. It has a document size limit, and throttling. It gets very expensive very quickly. And it's not fast. Writes are slow. And it's difficult to query.


Yes, DynamoDB is indeed quite expensive, and increasingly so as data size and traffic grow. The whole point of a distributed datastore like DynamoDB is to scale out massively by compromising on many database features and flexibility, but what's the point if it becomes so expensive at large scale?


PostgreSQL is the overall recommendation.


Apparently not if you're at 'Uber' scale, which makes no sense. PostgreSQL is perfectly fine at 'Uber' scale; it's just not a fit for the architectural decisions Uber made.

This was posted 12 hours ago and mentions 'Aurora', except that it uses MySQL as a backend and is currently in beta with PostgreSQL.

It also mentions that PostgreSQL is on Azure, but says 'hopefully soon' for GCE, even though... it's in beta, just like on Azure.

https://cloud.google.com/sql/docs/features#postgres

Unless it's only referring to high availability, which is hard to tell from what's written.

------

As a side note, the author needs to update the font used on his site; it's rather hard to read on Chrome / Windows 10.


I'm surprised there was no mention of Amazon DynamoDB.


What benefit is there to DynamoDB?



