mongoDB: 9 months on – conclusion
This article is part of my "mongoDB 9 months" series:
- Setting the stage – introducing why and how I started to use mongoDB, which provides some context for what is discussed afterwards,
- The good – stuff I really appreciate with mongoDB,
- The bad – hurdles which could/should be fixed – mostly minor yet irritating points about the mongoDB stack,
- mongoDB "Take it or leave it" technical choices
- The good to know – side discoveries which don’t change the world but are better known beforehand,
- Conclusion – a small attempt at drawing conclusions from all of this while taking a step back – the current article.
First of all, if you made it this far, congratulations. I had a lot on my mind and I fear I wrote it too quickly; even after some updates to different parts of the series, I’m still unhappy with the way they’re written. I guess I’ll have to improve them over time. Hence, I would be eager to know how readers feel about it, so please, let me know!
Back to mongoDB: it’s a powerful tool, whose document orientation really helps reduce the infamous object-relational impedance mismatch, thus greatly reducing mapping issues. Its indexing abilities, especially for lists embedded in documents, are also impressive, as is its overall performance.
This comes at the price of some radical technical choices: no joins and (mostly) no transactions.
Hence my own "rules" on when to use it:
- Domain logic which can be mapped to documents without needing relationships: go for it, mongoDB is the perfect match. Its high performance will do wonders. And since documents can embed so much, this covers more cases than one might expect.
- Domain logic with relationships and few if any transactional needs, where performance matters: mongoDB should be considered. One should first think carefully about how relationships will be handled; a proof of concept is a must-have, I think. The transactional needs are a different beast: one should make sure they can somehow fit with mongoDB’s non-existent support for such things. And remember it’s highly unlikely 10gen will ever introduce new features there, due to their (potential) impact on performance. This actually leads me to the third point:
- Domain logic with heavy needs for transactional operations: mongoDB doesn’t fit.
Of course, these are only my humble opinions, and I’m eager to hear others.
Some side notes come to mind as well:
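To make the first rule a bit more concrete, here is a minimal sketch of what "documents can embed so much" means in practice. It uses a plain Python dictionary as a stand-in for a mongoDB document; the blog post, its fields and the two helper functions are hypothetical examples of mine, not anything taken from a driver.

```python
# Hypothetical blog post stored as ONE document: tags and comments are
# embedded instead of living in separate tables linked by foreign keys.
post = {
    "_id": "post-1",
    "title": "mongoDB: 9 months on",
    "tags": ["mongodb", "nosql"],
    "comments": [
        {"author": "alice", "text": "Nice series"},
        {"author": "bob", "text": "What about transactions?"},
    ],
}


def add_comment(document, author, text):
    """Change a single document; no other record has to stay in sync."""
    document["comments"].append({"author": author, "text": text})


def comment_authors(document):
    """Everything needed for this read is already inside the document."""
    return [comment["author"] for comment in document["comments"]]


add_comment(post, "carol", "Looking forward to the conclusion")
print(comment_authors(post))
```

As long as a use case only ever touches one such document at a time, neither a join nor a multi-record transaction is needed, which is why this kind of domain fits so well.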
- If a real need for transactions ever pops up, it won’t be solved for free. A solution will have to be found. For example, it could be the limited two-phase commit (more on it in the No transaction & limited ACID chapter). It could also be some other mechanism for ensuring transactional behaviour (some people went to MySQL for this, but one could also consider something like JTA or, rather, Multiverse). Finally, one could split the data in two, between a transaction-capable DB and mongoDB. Anyway, the bottom line stays: transactions won’t come for free, whereas they are a feature usually taken for granted due to the massive use of "traditional" RDBMS. This leads me to the next point:
- Plan for quite some explanation about the choice of mongoDB. From JSON content and queries to the document approach, all this without joins and transactions, mongoDB differs in many ways, which will, for sure, unsettle and disturb. It’s true for developers leaving their safe RDBMS land ("ugh, where is my transaction?"), for admins ("what, I can’t export to a CSV file?") and for business analysts or product leads with a past in IT or some knack for it. They’ll all be surprised! Take the time to listen to them and win their hearts and minds properly.
- Plan to keep an eye on mongoDB’s progress: new features come often and bugs get fixed. One has to follow closely so as not to miss any of them.
- Finally, a point which I haven’t really covered but which matters as well: watch your driver/mapping framework. It would be a pity for it to be the limiting factor, yet at the same time the limitations (mostly regarding type information) and flexibility of JSON make the issue pretty hard, at least on the mapping side. On top of this, refinements like first- and second-level caches are still welcome performance boosters. Some support for relationships and their management would also be welcome. In the end, the driver business isn’t straightforward.
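Since the limited two-stage (two-phase) commit mentioned in the side notes above is easier to grasp with something to look at, here is a rough sketch of the idea. It is plain Python with in-memory dictionaries standing in for an accounts collection and a transactions collection; the account names, the `pending` field and the function names are all assumptions of mine. A real implementation would rely on the fact that each single-document update is atomic, which this sketch only imitates.

```python
# In-memory stand-ins for two mongoDB collections (hypothetical data).
accounts = {
    "A": {"balance": 100, "pending": []},
    "B": {"balance": 50, "pending": []},
}
transactions = {}


def begin(txn_id, source, destination, amount):
    """Record the intent first, so an interrupted transfer can be resumed."""
    transactions[txn_id] = {
        "source": source,
        "destination": destination,
        "amount": amount,
        "state": "initial",
    }


def apply_to_account(txn_id, account_id, delta):
    """Apply one side of the transfer at most once (idempotent on retry)."""
    account = accounts[account_id]
    if txn_id not in account["pending"]:
        account["balance"] += delta
        account["pending"].append(txn_id)


def run(txn_id):
    """Drive a transaction to 'done'; safe to call again after a failure."""
    txn = transactions[txn_id]
    if txn["state"] == "initial":
        txn["state"] = "pending"
    if txn["state"] == "pending":
        apply_to_account(txn_id, txn["source"], -txn["amount"])
        apply_to_account(txn_id, txn["destination"], txn["amount"])
        txn["state"] = "applied"
    if txn["state"] == "applied":
        # Clean up the markers once both sides are known to be applied.
        for account_id in (txn["source"], txn["destination"]):
            if txn_id in accounts[account_id]["pending"]:
                accounts[account_id]["pending"].remove(txn_id)
        txn["state"] = "done"


begin("t1", "A", "B", 30)
run("t1")
run("t1")  # running it again changes nothing
```

The point of the sketch is the bookkeeping: every step leaves a trace (the transaction state, the marker on each account), so a recovery job can find half-finished transfers and call `run` again without debiting anybody twice. It gives no isolation, though: between two steps, other readers can see the money gone from one account and not yet arrived on the other, hence "limited".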
On a more personal note, in retrospect, I’m also wondering whether putting both writes and reads into (one) mongoDB is the right approach. Indeed, denormalization means that, quickly, some decisions will be taken based on denormalized data which could be stale. Even without going that far, documents can quickly end up pretty messy, with denormalized content for specific views and the like. And the lack of transactions is still lurking in the background: how can one be sure that several updates in a row, each of which may fail, are all handled properly?
As such, my current interest in CQRS and EDA, which advocate separating the view database from the write one, feels very relevant. Indeed, mongoDB makes a perfect fit for the view database: it can handle both full text search and complex queries, while being quite flexible in terms of mapping for your views. On the other hand, the write database could stick to some RDBMS capable of joins and transactions, where needed (which in CQRS should be less often than in the traditional "dump it all into the RDBMS" approach). Sure, it may involve extra work, but if you chose mongoDB for writes, you most likely weren’t afraid of extra work anyway. And still, the clarity and flexibility gained might well pay off quickly. Yet these are just wild thoughts: I haven’t had any occasion to test them, even if Akka/Scala for events, some RDBMS for writes and mongoDB for views feels truly appealing to me.
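To give these wild thoughts a shape, here is a small sketch of the split, again in plain Python and entirely in memory. `WriteSide` plays the role of the transactional store and `CustomerOrdersView` the role of a mongoDB view collection; the order domain, the event classes and every name below are hypothetical, and a real setup would put a message bus between the two sides instead of a simple loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    customer: str
    total: float


@dataclass(frozen=True)
class OrderShipped:
    order_id: str


class WriteSide:
    """Stand-in for the transactional store: checks rules, emits events."""

    def __init__(self):
        self.orders = {}  # order_id -> status
        self.events = []

    def place_order(self, order_id, customer, total):
        if order_id in self.orders:
            raise ValueError("order already exists")
        self.orders[order_id] = "placed"
        self.events.append(OrderPlaced(order_id, customer, total))

    def ship_order(self, order_id):
        if self.orders.get(order_id) != "placed":
            raise ValueError("only a placed order can be shipped")
        self.orders[order_id] = "shipped"
        self.events.append(OrderShipped(order_id))


class CustomerOrdersView:
    """Stand-in for a view collection: one denormalized document per customer."""

    def __init__(self):
        self.documents = {}  # customer -> document
        self._owner = {}  # order_id -> customer

    def handle(self, event):
        if isinstance(event, OrderPlaced):
            doc = self.documents.setdefault(
                event.customer, {"customer": event.customer, "orders": []}
            )
            doc["orders"].append(
                {"id": event.order_id, "total": event.total, "status": "placed"}
            )
            self._owner[event.order_id] = event.customer
        elif isinstance(event, OrderShipped):
            doc = self.documents[self._owner[event.order_id]]
            for order in doc["orders"]:
                if order["id"] == event.order_id:
                    order["status"] = "shipped"


write_side = WriteSide()
view = CustomerOrdersView()
write_side.place_order("o1", "alice", 42.0)
write_side.place_order("o2", "alice", 8.0)
write_side.ship_order("o1")
for event in write_side.events:
    view.handle(event)
```

What appeals to me in this layout is that the messy, denormalized documents are confined to the view side and can be thrown away and rebuilt by replaying the events, while the decisions are taken on the write side, against data that is not stale.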
Actually, I would also love a document-oriented DB for the write part. Basically, mongoDB with better relationship support (from integrity constraints to joins) and transactions would be perfect. Yet, while the relationship support can and is likely to improve over time, the transaction aspect feels far more remote. It simply doesn’t match the current performance-minded approach and, furthermore, would imply a massive amount of changes… Pity!
Before we part, let me thank codesmell, my tech lead, a hell of a lot: he has always been willing to endure my lengthy questions and talks on mongoDB and related matters. Without him this series wouldn’t have seen the light of day, it’s as simple as that!
And don’t forget: your views on this series are more than welcome!