MongoDB Performance for more data than memory

I've recently been having a play around with MongoDB and it's really cool. One of the common messages I see all over is that you should only use it if your dataset fits into memory. I've not yet seen any benchmarks on what happens when it doesn't though. So, here ... Read more

Django SQL Sampler becomes Django Sampler (with Mongo support)

In a previous post I described Django SQL Sampler as a tool that helps you find the SQL queries that are consuming the most time on a production site. I've now renamed Django SQL Sampler to Django Sampler because it now does much more. It now has a plugin architecture ... Read more

Getting the Size of a Specific Index in MongoDB

Spent a little while trying to find the size of a specific index this morning and couldn't find any documentation on how to do it. Eventually stumbled on it in db.collection_name.stats():
> db.content.stats()
{ "ns" : "conversocial.content", "sharded" : false, "primary" : "main01", "ns" : "conversocial.content", "count" : 1924859, "size" ... Read more
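
As a rough sketch of the idea in the excerpt above: the stats document for a collection carries an indexSizes field that maps each index name to its size in bytes, so a single index can be looked up by name. The helper name index_size_bytes, the index names and all the numbers below are made up for illustration; only the indexSizes field is taken from MongoDB's collection stats output.

```python
def index_size_bytes(stats, index_name):
    """Return the size in bytes of one index from a collection stats document.

    `stats` is expected to look like the document returned by
    db.collection_name.stats(), which has an "indexSizes" sub-document.
    """
    index_sizes = stats.get("indexSizes", {})
    if index_name not in index_sizes:
        known = ", ".join(sorted(index_sizes)) or "none"
        raise KeyError(f"no index named {index_name!r} (known indexes: {known})")
    return index_sizes[index_name]


# Hypothetical stats document: index names and sizes are invented for this example.
sample_stats = {
    "ns": "example.content",
    "totalIndexSize": 3000,
    "indexSizes": {"_id_": 1000, "created_1": 2000},
}

if __name__ == "__main__":
    print(index_size_bytes(sample_stats, "created_1"))
```

In the mongo shell the equivalent lookup is db.content.stats().indexSizes. With PyMongo, a stats document of this shape can be fetched with db.command("collstats", "content"), assuming the server version in use still supports that command.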

MongoDB - Collection Per User Performance

Theory: On the MongoDB site there is a suggestion that collections can be used to cluster data and get better performance as a result. The idea is that a different collection could be used for each user's data. Internally, MongoDB will use different extents for each collection (an extent is ... Read more

MongoDB - Strategies when hitting disk

I gave a lightning talk on this at the London MongoDB User Group and thought I'd write it up here. MongoDB sucks when it hits disk (ignoring SSDs). The general advice is to never hit disk. What if you have to hit disk? Conversocial's new metrics infrastructure will allow people ... Read more

Considerations when Sharding

Whilst this post is about our use of MongoDB, it is relevant to any sharding discussion. We currently use MongoDB at Conversocial for our main content store. We're now starting to think about how we shard, as the main store is getting pretty large (150 million documents across 300 GB). ... Read more

Powering Conversocial's Analytics

We recently released our new analytics functionality for our customers. It allows them to see stats like: number of messages received each day; messages processed by each agent; response times split into buckets (less than 30 minutes, less than 1 hour, etc.); sentiment breakdown. All of this ... Read more

Colin Howe

I'm Colin. I like coding, ultimate frisbee and startups. I am VP of engineering at Conversocial.