mongoDB 9 months on – the bad
This article is part of my "mongoDB 9 months" series:
- Setting the stage – introducing why and how I started to use mongoDB, which provides some context for what is discussed afterwards,
- The good – stuff I really appreciate with mongoDB,
- The bad – hurdles which could/should be fixed (the current article): mostly minor yet irritating points about the mongoDB stack,
- mongoDB "Take it or leave it" technical choices – you have to be well aware of them,
- The good to know – side discoveries which don’t change the world but are better known beforehand,
- Conclusion – a small attempt at concluding over all of this while taking a step back.
I guess our project is a bit more complex than what most people do with mongoDB. Indeed, we quickly needed quite complex (and dynamic) queries, which were impeded by various limitations in the current query language:
- $or cannot be nested.
- no $and operator, only an implicit one, with a subtle and quite hidden behaviour: if the same property path is used twice, only one of the two conditions is taken into account
- the $ positional operator doesn’t handle nested collections
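The implicit "and" pitfall is easy to reproduce outside the database, because a query document is just a map keyed by property path. Here is a minimal sketch, using a plain Python dict as a stand-in for the query object; the field name `age` and the bounds are made up for illustration:

```python
# A query document maps each property path to a condition, so a path
# written twice keeps only the last condition: the first vanishes silently.
lost = {"age": {"$gt": 18}, "age": {"$lt": 65}}

# Both bounds survive when they share one sub-document under a single key.
kept = {"age": {"$gt": 18, "$lt": 65}}

print(lost)  # {'age': {'$lt': 65}}
print(kept)  # {'age': {'$gt': 18, '$lt': 65}}
```

Merging like this only helps when the two conditions use different operators; two conditions with the same operator on the same path would clash again inside the sub-document.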
In the end, not all the query logic was expressible in queries, which had many time-consuming consequences:
- some queries were reconsidered, either by changing them slightly or, even, adding more denormalized content to make them possible,
- for the more complex ones, we had to resort to partial queries on the database which are then "completed" on the application server side. This is pretty bad since the logic is then scattered around and ends up quite hard to follow.
- dynamic queries, built on top of some kind of query object, can easily end up failing (not applying all the logic they contain) without us having any clue about it, apart from some testers/users spotting broken data…
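One way to make a dynamic query builder fail loudly instead of silently is to route every addition through a small guard. The sketch below is only an illustration, not what our project used: `add_condition` is a hypothetical helper, the query is a plain Python dict, and every condition is assumed to be a dict of operators (plain equality values are not handled).

```python
def add_condition(query, path, condition):
    """Add `condition` (a dict of operators) for `path` without losing data.

    Operators for a path that is already present are merged into its
    sub-document; if the same operator is already set, raise instead of
    overwriting it.
    """
    existing = query.get(path)
    if existing is None:
        query[path] = dict(condition)
        return query
    clashes = set(existing) & set(condition)
    if clashes:
        raise ValueError(f"conflicting conditions on {path!r}: {sorted(clashes)}")
    existing.update(condition)
    return query


query = {}
add_condition(query, "age", {"$gt": 18})
add_condition(query, "age", {"$lt": 65})
print(query)  # {'age': {'$gt': 18, '$lt': 65}}

try:
    add_condition(query, "age", {"$gt": 21})
except ValueError as error:
    print(error)  # conflicting conditions on 'age': ['$gt']
```

The point of raising is that a lost condition shows up as an exception in development rather than as broken data spotted by a tester.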
- all JS runs on only one thread per server
- it is hard to debug: the console is picky about the characters it accepts (no tabs, for example) and the scripts/batch/db.eval have no proper development environment to speak of: figuring out what goes wrong is really hard, be it a dumb syntax error or a more serious logic issue which would have required debugging (which isn’t possible)
The only good point there is, after all, that splitting your application logic between different servers (database and application) as well as languages isn’t really recommended anyway. Still, it could have been handy sometimes, especially to work around some query language restrictions.
First of all, the Java Driver code isn’t the best I’ve ever seen.
It’s rather confused IMHO. Debugging through it isn’t always a good or easy experience. Similarly, the driver has its rough edges, for example the "Make the Mongo class proxy safe" issue, which triggered some unexpected behaviour at some point.
In the end, it sometimes feels like the code was written by good "low/kernel level" developers who aren’t Java developers. This can sometimes be a bit unsettling for someone like me, used to many higher-level Java frameworks.
A big good point though: when we had to dig in for performance matters (hence this post a while ago: Multithreaded performance testing checklist), no contention was spotted, nor any big gotcha, so the driver feels like it is doing its job.
One last point about it: we weren’t thrilled by its performance. When reading some text content, it was in the same ballpark as the mysql driver. Request times degrade linearly with the number of concurrent threads up to the number of cores of the machine running the driver, and then get way worse. Most likely some compression of the wire protocol would help, but somehow we were expecting better. The memcached Java driver, for example, behaves differently: requests take the same time whatever the number of threads, up to the number of cores. On top of that, a single request was significantly faster than with the mysql/mongoDB drivers (with the same text content).
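For readers who want to run this kind of comparison themselves, here is a rough sketch of a harness measuring mean request latency against the number of concurrent threads. It is not the code behind the observations above: `fake_request` is a hypothetical stand-in that just sleeps and should be swapped for a real driver read, and the thread counts and request counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fake_request():
    """Hypothetical stand-in for one driver read; replace with a real call."""
    time.sleep(0.001)


def timed(request):
    """Return the wall-clock duration of a single request, in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def mean_latency(request, threads, requests_per_thread=20):
    """Run `request` from `threads` concurrent workers; return mean latency."""
    total = threads * requests_per_thread
    with ThreadPoolExecutor(max_workers=threads) as pool:
        durations = list(pool.map(lambda _: timed(request), range(total)))
    return mean(durations)


if __name__ == "__main__":
    for threads in (1, 2, 4, 8, 16):
        latency_ms = mean_latency(fake_request, threads) * 1000
        print(f"{threads:2d} threads: {latency_ms:.2f} ms per request")
```

A flat curve up to the number of cores would match what is described for memcached, while a steadily rising one would match the mysql/mongoDB behaviour.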
Still, it doesn’t say much about the overall performance of mongoDB, where its architecture helps a lot. I’ll discuss this point more in the "About performance" chapter of the "The good to know" article, which isn’t ready yet.
That’s all for now folks, the next part of the series, mongoDB "Take it or leave it" technical choices, will be published soon!