Reflections on MongoDB

rubyrescue · on June 16, 2010

> I found out a few months down the road that the mongodb master/slave replication is very delicate, and can become corrupted if the perfect storm happens, which seems to be a lot around me.

every time i read mongodb blog posts i read things like this in the comments and i decide to pass...

bkeepers · on June 16, 2010

I’ve heard a few other stories like that. I'd like to talk with one of the companies that has a lot of experience managing MongoDB and see if the horror stories are just a result of poorly managed servers, or if it is really a problem with MongoDB itself.

If it is with MongoDB, I’m hopeful that it is a result of MongoDB's immaturity, and that it will become more stable over time.

stingraycharles · on June 16, 2010

Maybe this is something that interests you: http://blog.boxedice.com/2010/02/28/notes-from-a-production-...

bl4k · on June 16, 2010

> When should I use MongoDB?
>
> Always.

I am a big fan of MongoDB, but this is bad advice. The answer should be 'when I need a document-based data store'. You determine what type of database you need (RDBMS, KV, document, graph, column-based, etc.) based on the type of data you are storing and retrieving. You don't want to use Mongo for an accounting database, for example. That said, Mongo is the most mature, stable and feature-rich of the open source document-based DBs.

michaelfairley · on June 16, 2010

You should read the three paragraphs after that comment.

Groxx · on June 16, 2010

Does anyone else have slow framerate while scrolling on that site? I'm too unfamiliar with javascript profilers to figure it out entirely, but it looks like it's doing a bunch of recalculations on every scroll event, including almost a thousand eval calls per second.

jpcx01 · on June 16, 2010

Yep, I noticed that too. I think it's because of that retarded photo popin / popout. When you are at the top of the page, the author's photo slides out.

alexpopescu · on June 16, 2010

> When should you use MongoDB? Always

NO, NO, NO. If you do that, then you’re definitely doing it wrong! Again! http://nosql.mypopescu.com/post/703837964/when-should-i-use-...

jister · on June 16, 2010

...and he said at the bottom of that line:

"No, seriously!?

OK, I think MongoDB makes sense with most web applications...."

ryanwaggoner · on June 16, 2010

I was just going for a run and thinking about the whole NoSQL thing...can someone clear something up for me? Do document stores still associate records with one another via foreign keys (but with multiple queries rather than joins), or do they store the same info repeated across multiple records?

For example, let's say you have a blog that's using a key-value store, and posts can have many tags. Does each post document have the tags in the record itself, or does the post document have the ids for the tag records, which the system would retrieve in a subsequent query? Or am I missing it completely? Links to any helpful articles on transitioning from a relational database to a document store would be awesome.

epochwolf · on June 16, 2010

Here's an example schema.

Users

    {
      _id:string //doubles for user's name
      password: {
        hash:string
        salt:string
      }
      website:string
      bio:string
      following:array of {
        username:string
        articles:boolean
        comments:boolean
      }
    }

Articles

    {
      _id:ObjectID //auto id
      title:string
      contents:string
      owner:string //references Users
      tags:array of strings //no reference, tags are either unfiltered or selected from a global tag list. 
    }

Comments

    {
      _id:ObjectID //auto id
      contents:string
      owner:string //references Users
      root:ObjectID //references Articles instead of Comments
      parents:array of ObjectIDs //using materialized path for threading of comments
    }

Config

    {
      _id:string
      value:? //untyped can store anything in json format
    }

Note: The references are defined entirely in software, the database is not aware of them.
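A minimal Python sketch of what "defined entirely in software" could look like, using plain dicts as stand-ins for the collections above. The sample data, the integer ids (used here in place of ObjectIDs) and the helper names `article_with_owner` and `thread` are made up for illustration; none of this is a MongoDB API.

```python
# In-memory stand-ins for the Users, Articles and Comments collections.
users = {"alice": {"_id": "alice", "bio": "writes about databases"}}

articles = {
    1: {"_id": 1, "title": "Hello", "owner": "alice", "tags": ["mongodb", "nosql"]},
}

# Deliberately stored out of display order.
comments = [
    {"_id": 13, "contents": "second top-level", "owner": "alice", "root": 1, "parents": []},
    {"_id": 12, "contents": "nested reply", "owner": "alice", "root": 1, "parents": [10, 11]},
    {"_id": 10, "contents": "first", "owner": "alice", "root": 1, "parents": []},
    {"_id": 11, "contents": "reply", "owner": "alice", "root": 1, "parents": [10]},
]


def article_with_owner(article_id):
    """Follow the owner reference by hand: a second lookup, not a join."""
    article = articles[article_id]
    return {**article, "owner": users[article["owner"]]}


def thread(article_id):
    """Return an article's comments in threaded (depth-first) order."""
    in_article = [c for c in comments if c["root"] == article_id]
    # Sorting on "ancestor path + own id" puts each comment after its ancestors.
    return sorted(in_article, key=lambda c: c["parents"] + [c["_id"]])


for c in thread(1):
    # The depth of a comment is just the length of its materialized path.
    print("  " * len(c["parents"]) + c["contents"])
```

The database never checks that `owner` or `root` point at anything real, so a dangling reference only shows up when the application tries to follow it.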

andrewtj · on June 16, 2010

I've found the resources the folks at Basho (makers of Riak) have put together on their wiki and blog quite useful. The following would be a good starting point:

http://blog.basho.com/2010/03/19/schema-design-in-riak---int...

http://blog.basho.com/2010/03/25/schema-design-in-riak---rel...

http://wiki.basho.com/

http://blog.basho.com/

bkudria · on June 16, 2010

Well, you can structure it both ways. You can specify IDs, and perform extra queries, or you can "inline" the data, duplicating it across the hierarchy, but saving yourself extra queries.

It all depends on your use case.

m_eiman · on June 16, 2010

> duplicating it across the hierarchy

Do the data stores do de-duplication internally to make this less expensive storage-wise?

ryanwaggoner · on June 16, 2010

Thanks for the info. Can you elaborate a bit? I guess I'm asking more which is the "right" way to do it most of the time. You can just json all your data into a giant mysql table, but that's usually not the preferred way to use a relational database. So what's the preferred way to use a document store?

michaelfairley · on June 16, 2010

The preferred way is to place a tags array in the post document, and also create an index across this array (allowing efficient lookup by tag). There's actually a specific documentation page for this usage: http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mong...
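To give a feel for why an index across the tags array makes lookup by tag cheap, here is a rough pure-Python model of the idea: one index entry per array element, each pointing back at the post. The sample posts and the names `build_tag_index` and `find_by_tag` are hypothetical, and this only imitates the concept; it is not how MongoDB implements its indexes.

```python
from collections import defaultdict

# Each post carries its tags inline, as suggested above.
posts = [
    {"_id": 1, "title": "Intro to MongoDB", "tags": ["mongodb", "nosql"]},
    {"_id": 2, "title": "Riak schema design", "tags": ["riak", "nosql"]},
    {"_id": 3, "title": "MySQL tuning", "tags": ["mysql"]},
]


def build_tag_index(posts):
    """Map every tag to the ids of the posts that contain it."""
    index = defaultdict(set)
    for post in posts:
        for tag in post["tags"]:  # one index entry per array element
            index[tag].add(post["_id"])
    return index


def find_by_tag(index, tag):
    """Look up post ids by tag without scanning every post."""
    return sorted(index.get(tag, set()))


tag_index = build_tag_index(posts)
print(find_by_tag(tag_index, "nosql"))
```

The cost is paid on writes: every time a post's tags change, the index entries for that post have to be updated as well.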

ryanwaggoner · on June 16, 2010

Cool...thank you very much!

bl4k · on June 16, 2010

The shortcut when designing a document-based schema is this: what would be a one-to-one or one-to-many relationship in a traditional RDBMS becomes embedded data in the document. What would be a many-to-many relationship is stored as a separate document and referenced with a foreign key.
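As a sketch of that rule of thumb, with plain Python dicts standing in for documents: an order embeds its line items (one-to-many), while students and courses (many-to-many) live in separate documents and point at each other by id. All of the data and the `courses_for` helper are invented for the example.

```python
# One-to-many: line items belong to exactly one order, so they are embedded.
order = {
    "_id": "order-1",
    "customer": "alice",
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

# Many-to-many: students and courses exist independently of each other,
# so each is its own document and the link is a list of ids.
students = {
    "s1": {"_id": "s1", "name": "Alice", "course_ids": ["c1", "c2"]},
    "s2": {"_id": "s2", "name": "Bob", "course_ids": ["c2"]},
}
courses = {
    "c1": {"_id": "c1", "name": "Databases"},
    "c2": {"_id": "c2", "name": "Networks"},
}


def courses_for(student_id):
    """Resolve the references in application code (an extra lookup per id)."""
    return [courses[cid]["name"] for cid in students[student_id]["course_ids"]]


print(len(order["items"]))  # the whole order arrives in one read
print(courses_for("s1"))    # the course names need follow-up lookups
```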

spuz · on June 16, 2010

We're very seriously considering moving from MySQL to a document database for our application but when people say "don't use MongoDB if you need transactions" does that not rule out the vast majority of CRUD applications? For example, if a user adds an item to their shopping basket, or tops up their online credit, does that not require a read-update-write process to be executed within a transaction? Is this kind of operation not recommended when using something like MongoDB?

michaelfairley · on June 16, 2010

MongoDB does support a number of atomic read-update-write operations, but not full scale transactions: http://www.mongodb.org/display/DOCS/Atomic+Operations

rythie · on June 16, 2010

I don't think those need to be transactions; they can be done with atomic operations:

- Adding to a shopping basket means adding a row to a table, which is atomic (the same goes for deletions).

- Topping up an online credit is "UPDATE credit_balance SET credit=new_value WHERE credit=expected_old_value AND user_id=this_user", which is atomic; then check that the number of affected rows is 1.

houseabsolute · on June 16, 2010

Yeah, you don't need ACID . . . provided there are no relationships between any of your records. Provided something that is false, in other words.

petewarden · on June 16, 2010

Or provided that your application can cope with minor inconsistencies. When I'm writing Twitter mining tools, I don't want to pay for financial-transaction-level reliability, either in terms of performance, cost or complexity.

michaelfairley · on June 16, 2010

Or you can denormalize the relationships and commit a single document atomically.

Document-based DBs are not RDBMSs, and consequently you can't (or rather, shouldn't) just apply your traditional database design methodology to them.
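To make "denormalize and commit a single document atomically" a bit more concrete, here is a toy Python model of a shopping basket kept as one document. The update spec borrows the names of MongoDB's `$push` and `$inc` operators, but `apply_update` is a hypothetical stand-in written for this sketch, not a real driver call, and the basket fields are invented.

```python
import copy


def apply_update(doc, update):
    """Toy single-document update: build the new document in full, then return it.

    If anything raises part-way through, the original document is untouched,
    so readers never see the items list and the total out of step.
    """
    new_doc = copy.deepcopy(doc)
    for field, value in update.get("$push", {}).items():
        new_doc.setdefault(field, []).append(value)
    for field, amount in update.get("$inc", {}).items():
        new_doc[field] = new_doc.get(field, 0) + amount
    return new_doc


# The basket, its items and the running total all live in one document.
basket = {"_id": "basket-42", "user": "alice", "items": [], "total_cents": 0}

# Adding an item and bumping the total are a single update to that document.
update = {
    "$push": {"items": {"sku": "A1", "price_cents": 1999}},
    "$inc": {"total_cents": 1999},
}
basket = apply_update(basket, update)
print(basket["total_cents"], len(basket["items"]))
```

The trade-off is that anything outside the document, such as stock levels in another collection, is not covered by that single write and still needs its own handling.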