Posted to user@cassandra.apache.org by S Ahmed <sa...@gmail.com> on 2010/05/12 23:57:31 UTC

how does cassandra compare with mongodb?

I tried searching mail-archive, but the search feature is a bit wacky (or
more probably I don't know how to use it).

What are the key differences between Cassandra and MongoDB?

Is there a particular use case where each solution shines?

Re: how does cassandra compare with mongodb?

Posted by Benjamin Black <b...@b3k.us>.
Mongo has a rich query API and a weak distribution/replication story.
Cassandra has a narrow (read: weak) query API and a strong
distribution/replication story.  If you want a really shallow learning
curve and easy querying, won't have that much data, and are comfortable
with the typical master/slave replication model, Mongo is a fine
choice.  If you want to have billions of rows and tens or hundreds of
terabytes of data spread over tens or hundreds of machines in multiple
datacenters, and can manage the additional effort of maintaining your
own indices, Cassandra is the better choice.
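
A minimal sketch of what "maintaining your own indices" can mean in practice. Plain Python dicts stand in for two column families here; this is not a Cassandra client, and the names `users`, `users_by_city`, `put_user` and `find_by_city` are hypothetical. The point is only that every write has to update both the data row and the index row.

```python
from collections import defaultdict

# Plain dicts stand in for two column families; this is not a Cassandra client.
users = {}                          # row key -> columns
users_by_city = defaultdict(set)    # index row key (city) -> user row keys


def put_user(user_id, columns):
    """Write the data row and keep the hand-built index in step with it."""
    old = users.get(user_id)
    if old is not None and "city" in old and old["city"] != columns.get("city"):
        # Remove the stale index entry before adding the new one.
        users_by_city[old["city"]].discard(user_id)
    users[user_id] = dict(columns)
    if "city" in columns:
        users_by_city[columns["city"]].add(user_id)


def find_by_city(city):
    """Answer 'which users live in this city?' from the index row."""
    return sorted(users_by_city.get(city, ()))


put_user("u1", {"name": "Ann", "city": "Austin"})
put_user("u2", {"name": "Bob", "city": "Austin"})
put_user("u1", {"name": "Ann", "city": "Boston"})  # move u1; index must follow
print(find_by_city("Austin"))  # ['u2']
print(find_by_city("Boston"))  # ['u1']
```

In a real cluster the two writes are not atomic, so the application also has to decide what happens when one of them fails.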

All my opinions, apply grain of salt...


b

On Wed, May 12, 2010 at 2:57 PM, S Ahmed <sa...@gmail.com> wrote:
> I tried searching mail-archive, but the search feature is a bit wacky (or
> more probably I don't know how to use it).
> What are the key differences between Cassandra and Mongodb?
> Is there a particular use case where each solution shines?

Re: how does cassandra compare with mongodb?

Posted by Gary Dusbabek <gd...@gmail.com>.
Cassandra has always enforced the tiniest bit of schema.  You
basically define how you want your columns and subcolumns to be sorted
within column families.  You also name the column families and
keyspaces.  That's all though.

The part that is changing is that the keyspaces and column families
will no longer have to be defined statically before startup.  [column
families | keyspaces] may be [added | dropped] on a live cluster.  Think
of it as [CREATE|DROP|ALTER] TABLE for Cassandra.

Gary.

On Thu, May 13, 2010 at 14:48, Steve Lihn <st...@gmail.com> wrote:
> What is changing? A more flexible schema or no need to restart (some kind of
> hot-reboot)?
>
> Mongo guys claims that Mongo's advantage is a schema-less design. Basically
> you can have any data structure you want and you can change them anyway you
> want. This is done in the name of "flexibility", but I am not sure this is a
> good practice. People argued for years that perl is bad because it is
> typeless and java is strong typed and is better. Now the java community is
> developing a database like Mongo that is schema-less. How does this
> complements the strong-type argument?
>
> The less requirement is put on database schema design, the more burden is
> put on the application to maintain data integrity. Why is this a good trend?
> Can someone kindly explain?
>
> Steve
>
>
>
> On Thu, May 13, 2010 at 1:22 PM, Vijay <vi...@gmail.com> wrote:
>>
>> "Cassandra requires the schema to be defined before the database starts,
>> MongoDB can have any schema at run-time just like a normal database."
>> This is changing in 0.7
>> Regards,
>> </VJ>
>>
>

Re: how does cassandra compare with mongodb?

Posted by Steve Lihn <st...@gmail.com>.
Thanks for pointing this out. I was wrong to think Mongo is another
Java-based database, a mistake I would probably have discovered at the
Mongo conference I am attending in a week.

On Fri, May 14, 2010 at 4:45 AM, David Strauss <da...@fourkitchens.com> wrote:

> On 2010-05-13 19:48, Steve Lihn wrote:
> > Now the java community is developing a database like Mongo that is
> > schema-less.
>
> Mongo is written in C++.
>
> --
> David Strauss
>   | david@fourkitchens.com
> Four Kitchens
>   | http://fourkitchens.com
>   | +1 512 454 6659 [office]
>   | +1 512 870 8453 [direct]
>
>

Re: how does cassandra compare with mongodb?

Posted by David Strauss <da...@fourkitchens.com>.
On 2010-05-13 19:48, Steve Lihn wrote:
> Now the java community is developing a database like Mongo that is
> schema-less.

Mongo is written in C++.

-- 
David Strauss
   | david@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]


Re: how does cassandra compare with mongodb?

Posted by Yen Pai <ye...@gmail.com>.
Great thread -- thanks for pointing it out!

I was referring to consistency without considering durability, but
durability should probably not be overlooked or, if it is, then as you
suggested it should be a conscious decision to take on risk.


2010/5/14 Peter Schüller <sc...@spotify.com>

> > Not sure if this was mentioned, but MongoDB is strongly consistent while
> > Cassandra is eventually consistent -- at least about a month ago when I
> > looked at it in more detail, though with vector clocks in 0.7, this may be
> > less of an issue.
>
> Did Mongo switch away from the "fsync() every now and then, let's hope
> broken values don't parse as bson" consistency semantics?
>
> Last time I looked mongodb, which was admittedly more than a single
> month ago, you definitely did not want it if you cared about strong
> consistency in the sense of durability:
>
> http://groups.google.com/group/mongodb-user/browse_thread/thread/460dbd49a5b6b267/daca5d5155d7e66e
>
> --
> / Peter Schuller aka scode
>

Re: how does cassandra compare with mongodb?

Posted by Peter Schüller <sc...@spotify.com>.
> Not sure if this was mentioned, but MongoDB is strongly consistent while
> Cassandra is eventually consistent -- at least about a month ago when I
> looked at it in more detail, though with vector clocks in 0.7, this may be
> less of an issue.

Did Mongo switch away from the "fsync() every now and then, let's hope
broken values don't parse as bson" consistency semantics?

Last time I looked at MongoDB, which was admittedly more than a single
month ago, you definitely did not want it if you cared about strong
consistency in the sense of durability:
http://groups.google.com/group/mongodb-user/browse_thread/thread/460dbd49a5b6b267/daca5d5155d7e66e
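
To make the durability concern concrete, independent of either database: a write that has only been handed to the operating system can still be lost on power failure, while a write followed by fsync() has been pushed to the device before it is acknowledged. The sketch below uses only the Python standard library; `append_record` is a hypothetical helper and says nothing about how MongoDB or Cassandra actually write their files.

```python
import os
import tempfile


def append_record(path, record, durable):
    """Append one line; with durable=True, wait until the OS reports it on disk."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()                 # push Python's buffer to the OS
        if durable:
            os.fsync(f.fileno())  # ask the OS to flush its cache to the device


fd, path = tempfile.mkstemp()
os.close(fd)
append_record(path, b"acknowledged but maybe only in page cache", durable=False)
append_record(path, b"acknowledged after fsync", durable=True)
with open(path, "rb") as f:
    lines = f.read().splitlines()
os.remove(path)
print(len(lines))  # 2
```

Both records are readable afterwards on a healthy machine; the difference only shows up if the machine loses power between the write and the next flush.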

-- 
/ Peter Schuller aka scode

Re: how does cassandra compare with mongodb?

Posted by ypai <yp...@kompany.org>.
Not sure if this was mentioned, but MongoDB is strongly consistent while
Cassandra is eventually consistent -- at least about a month ago when I
looked at it in more detail, though with vector clocks in 0.7, this may be
less of an issue.

As for "schema-less" and coupling of database/application, etc.:

Using solutions like iBatis/Hibernate allows the application to be abstracted
from the database in theory, but in practice, it is very rare that a schema
will undergo changes without corresponding changes in the ORM layer.

MongoDB's flexibility with regard to schema does allow the application
developer more freedom/control, but even though MongoDB doesn't enforce a
schema in the traditional sense, a logical schema still exists:
applications still need to know what Mongo "collections" (analogous to
tables) to reference, what keys to use, what fields to retrieve -- just
like a typical "enterprise" Java application needs to know what tables,
keys, and columns to retrieve.  It's just that with MongoDB, if a
column/field name is wrong in the application, it is likely to fail
silently, rather than throw an exception.  In any case, the term
"schema-less" is a little deceptive -- one still has to think carefully
about how to structure and store one's data in order to leverage the
strengths of a particular datastore.

Also, I would say that a MongoDB document (think JSON) in a collection
called "user" such as:
{  "_id": "primaryKeyEquivalent",
   "addresses": [ {"address1"}, {"address2"}, {"address3"} ]
}

is about as self-documenting as a table "user" and a table "user_addresses"
with a one to many FK relationship.
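
A small sketch of the "fails silently" point above. A list of dicts stands in for the "user" collection, and `find` is a hypothetical helper, not a MongoDB driver call; the address field `city` is also made up for illustration. A lookup on a misspelled field simply matches nothing instead of raising an error.

```python
# In-memory stand-in for a "user" collection; field names are hypothetical.
users = [
    {"_id": "u1", "addresses": [{"city": "Austin"}, {"city": "Boston"}]},
    {"_id": "u2", "addresses": [{"city": "Austin"}]},
]


def find(collection, field, value):
    """Match documents whose field equals value; a missing field never matches."""
    return [doc for doc in collection if doc.get(field) == value]


print(len(find(users, "_id", "u1")))   # 1: the field exists
print(len(find(users, "id", "u1")))    # 0: misspelled field, no error raised
```

With a relational table, the equivalent typo in a column name would be rejected by the database when the statement is parsed.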




On Thu, May 13, 2010 at 7:31 PM, Steve Lihn <st...@gmail.com> wrote:

> Not sure how to comment on this concept. I guess it infers that the
> database and application are no longer loosely coupled, but now strongly
> coupled.
> I guess too, that java developers will vote yes, while database architect
> and DBA will vote no.
>
> In the "traditional" sense, enterprise data is the soul of a business. Data
> has to stand by itself with reasonable information (primary key and foreign
> key) to interpret itself. But it appears now with the Mongo approach, the
> data store (I won't even call it database) is a byproduct of the mapping
> class. Without reading the mapping classes, one can barely understand the
> data. How is this going to be accepted by large enterprises? I put a big
> question mark on it.
>
> On top of it, if you follow Hibernate's suggestion of using numeric as
> primary keys, your data will be as cryptic as hell.
>
>
>
> On Thu, May 13, 2010 at 8:10 PM, philip andrew <ph...@gmail.com> wrote:
>
>> MongoDB encourages you to define your schema in your application code by
>> using mapping classes. This logically infers that it makes no sense to
>> define the schema twice, in the database and in your application code.
>>
>>

Re: how does cassandra compare with mongodb?

Posted by Paul Prescod <pa...@prescod.net>.
On Thu, May 13, 2010 at 7:31 PM, Steve Lihn <st...@gmail.com> wrote:
> Not sure how to comment on this concept. I guess it infers that the database
> and application are no longer loosely coupled, but now strongly coupled.
> I guess too, that java developers will vote yes, while database architect
> and DBA will vote no.

It's a very old debate that has been had over and over on the
Internet. It stems primarily from a clash of cultures.

> In the "traditional" sense, enterprise data is the soul of a business. Data
> has to stand by itself with reasonable information (primary key and foreign
> key) to interpret itself. But it appears now with the Mongo approach, the
> data store (I won't even call it database) is a byproduct of the mapping
> class. Without reading the mapping classes, one can barely understand the
> data. How is this going to be accepted by large enterprises? I put a big
> question mark on it.

I don't think that large enterprises (by which I mean IT departments
of non-technical companies) are the first wave of likely customers for
NoSQL databases.

I also don't think that Cassandra and MongoDB differ enough on this
issue to think that an enterprise IT department would prefer one or
the other on the basis of it. Neither has foreign keys or
transactions. Both shift work from the datastore to the application.
If that's not what you want, neither is a good choice.

 Paul Prescod

Re: how does cassandra compare with mongodb?

Posted by Steve Lihn <st...@gmail.com>.
Not sure how to comment on this concept. I guess it implies that the database
and application are no longer loosely coupled, but now strongly coupled.
I guess, too, that Java developers will vote yes, while database architects
and DBAs will vote no.

In the "traditional" sense, enterprise data is the soul of a business. Data
has to stand by itself with reasonable information (primary key and foreign
key) to interpret itself. But it appears now with the Mongo approach, the
data store (I won't even call it database) is a byproduct of the mapping
class. Without reading the mapping classes, one can barely understand the
data. How is this going to be accepted by large enterprises? I put a big
question mark on it.

On top of it, if you follow Hibernate's suggestion of using numeric
primary keys, your data will be as cryptic as hell.


On Thu, May 13, 2010 at 8:10 PM, philip andrew <ph...@gmail.com> wrote:

> MongoDB encourages you to define your schema in your application code by
> using mapping classes. This logically infers that it makes no sense to
> define the schema twice, in the database and in your application code.
>
>

Re: how does cassandra compare with mongodb?

Posted by philip andrew <ph...@gmail.com>.
MongoDB encourages you to define your schema in your application code by
using mapping classes. This logically implies that it makes no sense to
define the schema twice, in the database and in your application code.

On Fri, May 14, 2010 at 3:48 AM, Steve Lihn <st...@gmail.com> wrote:

> What is changing? A more flexible schema or no need to restart (some kind
> of hot-reboot)?
>
> Mongo guys claims that Mongo's advantage is a schema-less design. Basically
> you can have any data structure you want and you can change them anyway you
> want. This is done in the name of "flexibility", but I am not sure this is a
> good practice. People argued for years that perl is bad because it is
> typeless and java is strong typed and is better. Now the java community is
> developing a database like Mongo that is schema-less. How does this
> complements the strong-type argument?
>
> The less requirement is put on database schema design, the more burden is
> put on the application to maintain data integrity. Why is this a good trend?
> Can someone kindly explain?
>
> Steve
>
>
>
>
> On Thu, May 13, 2010 at 1:22 PM, Vijay <vi...@gmail.com> wrote:
>
>> "Cassandra requires the schema to be defined before the database starts,
>> MongoDB can have any schema at run-time just like a normal database."
>>
>> This is changing in 0.7
>>
>> Regards,
>> </VJ>
>>
>>
>>

Re: how does cassandra compare with mongodb?

Posted by Steve Lihn <st...@gmail.com>.
What is changing? A more flexible schema or no need to restart (some kind of
hot-reboot)?

The Mongo guys claim that Mongo's advantage is a schema-less design. Basically
you can have any data structure you want and you can change it any way you
want. This is done in the name of "flexibility", but I am not sure this is a
good practice. People argued for years that Perl is bad because it is
typeless and Java is strongly typed and is better. Now the Java community is
developing a database like Mongo that is schema-less. How does this
complement the strong-typing argument?

The fewer requirements are put on database schema design, the more burden is
put on the application to maintain data integrity. Why is this a good trend?
Can someone kindly explain?

Steve



On Thu, May 13, 2010 at 1:22 PM, Vijay <vi...@gmail.com> wrote:

> "Cassandra requires the schema to be defined before the database starts,
> MongoDB can have any schema at run-time just like a normal database."
>
> This is changing in 0.7
>
> Regards,
> </VJ>
>
>
>

Re: how does cassandra compare with mongodb?

Posted by Roger Schildmeijer <sc...@gmail.com>.
In a perfect world, the aim is a new major Cassandra release every 2-3 months.

// Roger Schildmeijer 


On 13 May 2010, at 19:43, Sandeep Kalidindi wrote:

> Any idea about how far the 0.7 release is ??
> 
> Cheers,
> Deepu.
> 
> On Thu, May 13, 2010 at 10:52 PM, Vijay <vi...@gmail.com> wrote:
> "Cassandra requires the schema to be defined before the database starts, MongoDB can have any schema at run-time just like a normal database."
> 
> This is changing in 0.7
> 
> Regards,
> </VJ>
> 
> 
> 
> 
> On Wed, May 12, 2010 at 7:25 PM, Jonathan Shook <js...@gmail.com> wrote:
> You can choose to have keys ordered by using an
> OrderPreservingPartitioner with the trade-off that key ranges can get
> denser on certain nodes than others.
> 
> On Wed, May 12, 2010 at 7:48 PM, philip andrew <ph...@gmail.com> wrote:
> >
> > Hi,
> > From my understanding, Cassandra entities are indexed on only one key, so
> > this can be a problem if you are searching for example by two values such as
> > if you are storing an entity with a x,y then wish to search for entities in
> > a box ie x>5 and x<10 and y>5 and y<10. MongoDB can do this, Cassandra
> > cannot due to only indexing on one key.
> > Cassandra can scale automatically just by adding nodes, almost infinite
> > storage easily, MongoDB requires database administration to add nodes,
> > setting up replication or allowing sharding, but not too complex.
> > MongoDB requires you to create sharded keys if you want to scale
> > horizontally, Cassandra just works automatically for scale horizontally.
> > Cassandra requires the schema to be defined before the database starts,
> > MongoDB can have any schema at run-time just like a normal database.
> > In the end I choose MongoDB as I require more indexes than Cassandra
> > provides, although I really like Cassandras ability to store almost infinite
> > amount of data just by adding nodes.
> > Thanks, Phil
> >
> > On Thu, May 13, 2010 at 5:57 AM, S Ahmed <sa...@gmail.com> wrote:
> >>
> >> I tried searching mail-archive, but the search feature is a bit wacky (or
> >> more probably I don't know how to use it).
> >> What are the key differences between Cassandra and Mongodb?
> >> Is there a particular use case where each solution shines?
> >
> 
> 


Re: how does cassandra compare with mongodb?

Posted by Sandeep Kalidindi <de...@gmail.com>.
Any idea how far off the 0.7 release is?

Cheers,
Deepu.

On Thu, May 13, 2010 at 10:52 PM, Vijay <vi...@gmail.com> wrote:

> "Cassandra requires the schema to be defined before the database starts,
> MongoDB can have any schema at run-time just like a normal database."
>
> This is changing in 0.7
>
> Regards,
> </VJ>
>
>
>
>
> On Wed, May 12, 2010 at 7:25 PM, Jonathan Shook <js...@gmail.com> wrote:
>
>> You can choose to have keys ordered by using an
>> OrderPreservingPartitioner with the trade-off that key ranges can get
>> denser on certain nodes than others.
>>
>> On Wed, May 12, 2010 at 7:48 PM, philip andrew <ph...@gmail.com>
>> wrote:
>> >
>> > Hi,
>> > From my understanding, Cassandra entities are indexed on only one key, so
>> > this can be a problem if you are searching for example by two values such as
>> > if you are storing an entity with a x,y then wish to search for entities in
>> > a box ie x>5 and x<10 and y>5 and y<10. MongoDB can do this, Cassandra
>> > cannot due to only indexing on one key.
>> > Cassandra can scale automatically just by adding nodes, almost infinite
>> > storage easily, MongoDB requires database administration to add nodes,
>> > setting up replication or allowing sharding, but not too complex.
>> > MongoDB requires you to create sharded keys if you want to scale
>> > horizontally, Cassandra just works automatically for scale horizontally.
>> > Cassandra requires the schema to be defined before the database starts,
>> > MongoDB can have any schema at run-time just like a normal database.
>> > In the end I choose MongoDB as I require more indexes than Cassandra
>> > provides, although I really like Cassandras ability to store almost infinite
>> > amount of data just by adding nodes.
>> > Thanks, Phil
>> >
>> > On Thu, May 13, 2010 at 5:57 AM, S Ahmed <sa...@gmail.com> wrote:
>> >>
>> >> I tried searching mail-archive, but the search feature is a bit wacky (or
>> >> more probably I don't know how to use it).
>> >> What are the key differences between Cassandra and Mongodb?
>> >> Is there a particular use case where each solution shines?
>> >
>>
>
>

Re: how does cassandra compare with mongodb?

Posted by Vijay <vi...@gmail.com>.
"Cassandra requires the schema to be defined before the database starts,
MongoDB can have any schema at run-time just like a normal database."

This is changing in 0.7

Regards,
</VJ>



On Wed, May 12, 2010 at 7:25 PM, Jonathan Shook <js...@gmail.com> wrote:

> You can choose to have keys ordered by using an
> OrderPreservingPartitioner with the trade-off that key ranges can get
> denser on certain nodes than others.
>
> On Wed, May 12, 2010 at 7:48 PM, philip andrew <ph...@gmail.com>
> wrote:
> >
> > Hi,
> > From my understanding, Cassandra entities are indexed on only one key, so
> > this can be a problem if you are searching for example by two values such as
> > if you are storing an entity with a x,y then wish to search for entities in
> > a box ie x>5 and x<10 and y>5 and y<10. MongoDB can do this, Cassandra
> > cannot due to only indexing on one key.
> > Cassandra can scale automatically just by adding nodes, almost infinite
> > storage easily, MongoDB requires database administration to add nodes,
> > setting up replication or allowing sharding, but not too complex.
> > MongoDB requires you to create sharded keys if you want to scale
> > horizontally, Cassandra just works automatically for scale horizontally.
> > Cassandra requires the schema to be defined before the database starts,
> > MongoDB can have any schema at run-time just like a normal database.
> > In the end I choose MongoDB as I require more indexes than Cassandra
> > provides, although I really like Cassandras ability to store almost infinite
> > amount of data just by adding nodes.
> > Thanks, Phil
> >
> > On Thu, May 13, 2010 at 5:57 AM, S Ahmed <sa...@gmail.com> wrote:
> >>
> >> I tried searching mail-archive, but the search feature is a bit wacky (or
> >> more probably I don't know how to use it).
> >> What are the key differences between Cassandra and Mongodb?
> >> Is there a particular use case where each solution shines?
> >
>

Re: how does cassandra compare with mongodb?

Posted by Jonathan Shook <js...@gmail.com>.
You can choose to have keys ordered by using an
OrderPreservingPartitioner, with the trade-off that key ranges can get
denser on certain nodes than others.
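
A rough sketch of that trade-off, under simplifying assumptions: three hypothetical nodes, made-up key names, fixed split points for the ordered case, and a plain modulo of an MD5 digest for the hashed case (a real ring assigns token ranges rather than using modulo). Ordered placement keeps neighbouring keys together, which helps range scans, but skewed key names pile up on one node.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

# Hypothetical 3-node cluster; key names are made up for illustration.
NODES = ["node-a", "node-b", "node-c"]
keys = ["user:%04d" % i for i in range(900)] + ["admin:%02d" % i for i in range(10)]


def ordered_owner(key):
    """Order-preserving placement: split the key space at fixed boundaries."""
    boundaries = ["h", "q"]   # keys < "h" -> node-a, < "q" -> node-b, else node-c
    return NODES[bisect_right(boundaries, key)]


def hashed_owner(key):
    """Hash placement: position is derived from an MD5 digest of the key."""
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return NODES[digest % len(NODES)]


ordered = Counter(ordered_owner(k) for k in keys)
hashed = Counter(hashed_owner(k) for k in keys)
print("order-preserving:", dict(ordered))  # node-c holds all 900 "user:" keys
print("hashed:", dict(hashed))             # roughly even across the three nodes
```

With these keys the ordered layout leaves node-b empty, which is the "denser on certain nodes" effect in miniature.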

On Wed, May 12, 2010 at 7:48 PM, philip andrew <ph...@gmail.com> wrote:
>
> Hi,
> From my understanding, Cassandra entities are indexed on only one key, so
> this can be a problem if you are searching for example by two values such as
> if you are storing an entity with a x,y then wish to search for entities in
> a box ie x>5 and x<10 and y>5 and y<10. MongoDB can do this, Cassandra
> cannot due to only indexing on one key.
> Cassandra can scale automatically just by adding nodes, almost infinite
> storage easily, MongoDB requires database administration to add nodes,
> setting up replication or allowing sharding, but not too complex.
> MongoDB requires you to create sharded keys if you want to scale
> horizontally, Cassandra just works automatically for scale horizontally.
> Cassandra requires the schema to be defined before the database starts,
> MongoDB can have any schema at run-time just like a normal database.
> In the end I choose MongoDB as I require more indexes than Cassandra
> provides, although I really like Cassandras ability to store almost infinite
> amount of data just by adding nodes.
> Thanks, Phil
>
> On Thu, May 13, 2010 at 5:57 AM, S Ahmed <sa...@gmail.com> wrote:
>>
>> I tried searching mail-archive, but the search feature is a bit wacky (or
>> more probably I don't know how to use it).
>> What are the key differences between Cassandra and Mongodb?
>> Is there a particular use case where each solution shines?
>

Re: how does cassandra compare with mongodb?

Posted by philip andrew <ph...@gmail.com>.
Hi,

From my understanding, Cassandra entities are indexed on only one key, so
this can be a problem if you are searching by two values, for example
if you are storing entities with x,y values and then wish to search for
entities in a box, i.e. x>5 and x<10 and y>5 and y<10. MongoDB can do this;
Cassandra cannot, because it indexes on only one key.
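
A minimal sketch of how the box query looks when only one dimension is indexed. A list sorted by x stands in for the single index; the data and the helper `box_query` are hypothetical and no database is involved. The x range is served from the index, and the y condition has to be applied by the application to every candidate.

```python
from bisect import bisect_left, bisect_right

# Hypothetical entities; a list sorted by x stands in for the single index.
points = [(2, 7), (6, 3), (6, 8), (7, 6), (9, 9), (9, 12), (11, 6)]
by_x = sorted(points)
xs = [x for x, _ in by_x]


def box_query(x_lo, x_hi, y_lo, y_hi):
    """Return points with x_lo < x < x_hi and y_lo < y < y_hi."""
    # Step 1: range scan on the one indexed dimension (x).
    start = bisect_right(xs, x_lo)
    stop = bisect_left(xs, x_hi)
    candidates = by_x[start:stop]
    # Step 2: the second condition has no index, so filter in the application.
    return [(x, y) for x, y in candidates if y_lo < y < y_hi]


print(box_query(5, 10, 5, 10))  # [(6, 8), (7, 6), (9, 9)]
```

The cost of step 2 grows with the number of rows in the x range, not with the size of the final result, which is why a second index matters for this kind of query.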

Cassandra can scale automatically just by adding nodes, giving almost
infinite storage easily. MongoDB requires database administration to add
nodes (setting up replication or enabling sharding), but it is not too complex.

MongoDB requires you to create shard keys if you want to scale
horizontally; Cassandra just scales horizontally automatically.

Cassandra requires the schema to be defined before the database starts,
MongoDB can have any schema at run-time just like a normal database.

In the end I chose MongoDB as I require more indexes than Cassandra
provides, although I really like Cassandra's ability to store an almost
infinite amount of data just by adding nodes.

Thanks, Phil

On Thu, May 13, 2010 at 5:57 AM, S Ahmed <sa...@gmail.com> wrote:

> I tried searching mail-archive, but the search feature is a bit wacky (or
> more probably I don't know how to use it).
>
> What are the key differences between Cassandra and Mongodb?
>
> Is there a particular use case where each solution shines?
>