Posted to user@hbase.apache.org by Chris Tarnas <cf...@email.com> on 2011/08/30 08:02:05 UTC

HBase and Cassandra on StackOverflow

Someone with better knowledge than I might be interested in helping answer this question over at StackOverflow:

http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra

-chris

Re: HBase and Cassandra on StackOverflow

Posted by highpointe <hi...@gmail.com>.
My bad. I was looking at the date of the link, not the post. Please ignore.

Sent from my iPhone

On Aug 30, 2011, at 12:02 AM, Chris Tarnas <cf...@email.com> wrote:

> Someone with better knowledge than I might be interested in helping answer this question over at StackOverflow:
> 
> http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
> 
> -chris

Re: HBase and Cassandra on StackOverflow

Posted by highpointe <hi...@gmail.com>.
This is rather dated. I would love to see a side-by-side justification if anyone has made the transition lately.

Sent from my iPhone

On Aug 30, 2011, at 12:02 AM, Chris Tarnas <cf...@email.com> wrote:

> Someone with better knowledge than I might be interested in helping answer this question over at StackOverflow:
> 
> http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
> 
> -chris

Re: HBase and Cassandra on StackOverflow

Posted by Ryan Rawson <ry...@gmail.com>.
While data is not fsynced to disk immediately, it is acked by three
different DataNodes (assuming a replication factor of 3) before HBase acks the client.

-ryan

On Tue, Aug 30, 2011 at 1:04 PM, Joseph Boyd
<jo...@cbsinteractive.com> wrote:
> On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal <se...@yahoo.com> wrote:
>>
>> Will the write call to HBase block until the record written is fully
>> replicated ?
>
> no. data isn't written to disk immediately
>
>> If not (since it is happening at the block level), then isn't
>> there a window where a region server goes down, the data might not be
>> available anywhere else, until it comes back up ?
>
> the data would be in the write ahead log.
>
>
> ...joe
>
>
>> On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <ap...@apache.org> wrote:
>>
>> > > Is the replication strategy for HBase completely reliant on HDFS' block
>> > > replication pipelining ?
>> >
>> > Yes.
>> >
>> > > Is this replication process asynchronous ?
>> >
>> >
>> > No.
>> > Best regards,
>> >
>> >
>> >        - Andy
>> >
>> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > (via Tom White)
>> >
>> >
>> > >________________________________
>> > >From: Sam Seigal <se...@yahoo.com>
>> > >To: user@hbase.apache.org; Andrew Purtell <ap...@apache.org>
>> > >Cc: "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org>
>> > >Sent: Tuesday, August 30, 2011 7:35 PM
>> > >Subject: Re: HBase and Cassandra on StackOverflow
>> > >
>> > >A question inline:
>> > >
>> > >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <ap...@apache.org>
>> > wrote:
>> > >
>> > >> Hi Chris,
>> > >>
>> > >> Appreciate your answer on the post.
>> > >>
>> > >> Personally speaking, however, the endless Cassandra vs. HBase discussion
>> > is
>> > >> tiresome and rarely do blog posts or emails in this regard shed any
>> > light.
>> > >> Often, Cassandra proponents mis-state their case out of ignorance of
>> > HBase
>> > >> or due to commercial or personal agendas. It is difficult to find clear
>> > eyed
>> > >> analysis among the partisans. I'm not sure it will make any difference
>> > >> posting a rebuttal to some random thing jbellis says. Better to focus on
>> > >> improving HBase than play whack-a-mole.
>> > >>
>> > >>
>> > >> Regarding some of the specific points in that post:
>> > >>
>> > >> HBase is proven in production deployments larger than the largest
>> > publicly
>> > >> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But
>> > basically
>> > >> this is the same order of magnitude, with HBase having a slight edge. I
>> > >> don't see a meaningful difference here. Stating otherwise is false.
>> > >>
>> > >> HBase supports replication between clusters (i.e. data centers). I
>> > believe,
>> > >> but admit I'm not super familiar with the Cassandra option here, that
>> > the
>> > >> main difference is HBase provides a simple mechanism and the user must
>> > build a
>> > >> replication architecture useful for them; while Cassandra attempts to
>> > hide
>> > >> some of that complexity. I do not know if they succeed there, but large
>> > >> scale cross data center replication is rarely one size fits all so I
>> > doubt
>> > >> it.
>> > >>
>> > >> Cassandra does not have strong consistency in the sense that HBase
>> > >> provides. It can provide strong consistency, but at the cost of failing
>> > any
>> > >> read if there is insufficient quorum. HBase/HDFS does not have that
>> > >> limitation. On the other hand, HBase has its own and different scenarios
>> > >> where data may not be immediately available. The differences between the
>> > >> systems are nuanced and which to use depends on the use case
>> > requirements.
>> > >>
>> > >>
>> > >I have a question regarding this point. Is the replication strategy for
>> > >HBase completely reliant on HDFS' block replication pipelining ? Is this
>> > >replication process asynchronous ? If it is, then is there not a window,
>> > >where when a machine is to die and the replication pipeline for a
>> > particular
>> > >block has not started yet, that block will be unavailable until the
>> > machine
>> > >comes back up ? Sorry, if I am missing something important here.
>> > >
>> > >
>> > >> Cassandra's RandomPartitioner / hash based partitioning means efficient
>> > >> MapReduce or table scanning is not possible, whereas HBase's distributed
>> > >> ordered tree is naturally efficient for such use cases, I believe
>> > explaining
>> > >> why Hadoop users often prefer it. This may or may not be a problem for
>> > any
>> > >> given use case. Using an ordered partitioner with Cassandra used to
>> > require
>> > >> frequent manual rebalancing to avoid blowing up nodes. I don't know if
>> > more
>> > >> recent versions still have this mis-feature.
>> > >>
>> > >> Cassandra is no less complex than HBase. All of this complexity is
>> > "hidden"
>> > >> in the sense that with Hadoop/HBase the layering is obvious -- HDFS,
>> > HBase,
>> > >> etc. -- but the Cassandra internals are no less layered. An impartial
>> > >> analysis of implementation and algorithms will reveal that Cassandra's
>> > >> theory of operation in its full detail is substantially more complex.
>> > >> Compare the BigTable and Dynamo papers and this is clear. There are
>> > actually
>> > >> more opportunities for something to go wrong with Cassandra.
>> > >>
>> > >> While we are looking at codebases, it should be noted that HBase has
>> > >> substantially more unit tests.
>> > >>
>> > >> With Cassandra, all RPC is via Thrift with various wrappers, so actually
>> > >> all Cassandra clients are second class in the sense that jbellis means
>> > when
>> > >> he states "Non-Java clients are not second-class citizens".
>> > >>
>> > >> The master-slave versus peer-to-peer argument is larger than Cassandra
>> > vs.
>> > >> HBase, and not nearly as one sided as claimed. The famous (infamous?)
>> > global
>> > >> failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a
>> > single
>> > >> flipped bit in a gossip message demonstrates how in peer to peer systems
>> > >> every node can be a single point of failure. There is no obvious winner,
>> > >> instead, a series of trade offs. Claiming otherwise is intellectually
>> > >> dishonest. Master-slave architectures seem easier to operate and reason
>> > >> about in my experience. Of course, I'm partial there.
>> > >>
>> > >> I have just scratched the surface.
>> > >>
>> > >>
>> > >> Best regards,
>> > >>
>> > >>
>> > >>        - Andy
>> > >>
>> > >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > >> (via Tom White)
>> > >>
>> > >>
>> > >> >________________________________
>> > >> >From: Chris Tarnas <cf...@email.com>
>> > >> >To: hbase-user@hadoop.apache.org
>> > >> >Sent: Tuesday, August 30, 2011 2:02 PM
>> > >> >Subject: HBase and Cassandra on StackOverflow
>> > >> >
>> > >> >Someone with better knowledge than I might be interested in helping
>> > answer
>> > >> this question over at StackOverflow:
>> > >> >
>> > >> >
>> > >>
>> > http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
>> > >> >
>> > >> >-chris
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> > >
>> >
>

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
> > Will the write call to HBase block until the record written is fully
> > replicated ?
> no. data isn't written to disk immediately

Not so black and white.

Full replication in HDFS != writes to disk. Full replication means every DataNode in the pipeline has acknowledged its replica, and with rack-aware placement at least one of those replicas is off-rack. In practice this is good enough to give HDFS five or six nines of data availability; Hortonworks had a blog post about that recently.

In our production deployment we have patched our DataNodes to call fsync() when a block write completes. This provides some marginal improvement over the default for the case where power is suddenly lost to the entire datacenter, but marginal is the key word here.
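The ack semantics above can be pictured with a toy model (plain Python for illustration; the pipeline/buffer structure here is invented, not HDFS source). A packet counts as "fully replicated" once every DataNode in the pipeline has buffered it and acked, even though nothing has been fsynced:

```python
# Toy model of an HDFS write pipeline -- illustration only, not HDFS code.
# "Fully replicated" = every DataNode in the pipeline buffered the packet
# and acked upstream; no fsync to disk is implied.

def pipeline_write(packet, datanodes):
    """Forward `packet` down the pipeline; return True once all nodes ack."""
    acks = 0
    for dn in datanodes:
        dn["buffer"].append(packet)  # held in the OS page cache, not fsynced
        acks += 1                    # each node acks after buffering
    return acks == len(datanodes)    # client sees success only on a full ack chain

pipeline = [{"name": n, "buffer": []} for n in ("dn1", "dn2-other-rack", "dn3")]
assert pipeline_write(b"wal-edit-1", pipeline)   # acked by all three replicas
assert all(dn["buffer"] == [b"wal-edit-1"] for dn in pipeline)
```

With rack-aware placement one of the three replicas (here the hypothetical "dn2-other-rack") lives off-rack, which is why losing a single rack does not lose acked data.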

 
Best regards,


- Andy


Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Joseph Boyd <jo...@cbsinteractive.com>
>To: user@hbase.apache.org
>Sent: Wednesday, August 31, 2011 4:04 AM
>Subject: Re: HBase and Cassandra on StackOverflow
>
>On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal <se...@yahoo.com> wrote:
>>
>> Will the write call to HBase block until the record written is fully
>> replicated ?
>
>no. data isn't written to disk immediately
>
>> If not (since it is happening at the block level), then isn't
>> there a window where a region server goes down, the data might not be
>> available anywhere else, until it comes back up ?
>
>the data would be in the write ahead log.
>
>
>...joe

Re: HBase and Cassandra on StackOverflow

Posted by Joseph Boyd <jo...@cbsinteractive.com>.
On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal <se...@yahoo.com> wrote:
>
> Will the write call to HBase block until the record written is fully
> replicated ?

No, the data isn't written to disk immediately.

> If not (since it is happening at the block level), then isn't
> there a window where a region server goes down, the data might not be
> available anywhere else, until it comes back up ?

The data would be in the write-ahead log.


...joe



Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
> Will the write call to HBase block until the record written is fully replicated ?

At the HDFS layer, hflush on the write-ahead log will block until the data is fully replicated.

At the HBase layer, whether the writer (client) blocks until those HDFS-layer actions complete depends on how the write-ahead log is configured -- for example, whether deferred flushing is enabled on the table or globally -- and on whether the particular op has writeToWAL set to false.
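As a rough decision table, this can be sketched as follows (hedged Python sketch, not HBase source; the flag names mirror the writeToWAL and deferred-flush options discussed above):

```python
# Toy sketch of when an HBase put() blocks on WAL replication -- illustration
# only, not HBase source code. Flag names mirror the options discussed above.

def put_blocks_on_wal_sync(write_to_wal=True, deferred_log_flush=False):
    """Does the client put() wait for the WAL edit to be hflush'd through
    the HDFS pipeline before returning?"""
    if not write_to_wal:
        # Edit goes only to the memstore; it is lost if the regionserver
        # dies before the next flush.
        return False
    if deferred_log_flush:
        # Edit is appended to the WAL, but a background thread syncs it
        # later, so the client does not wait.
        return False
    # Default path: the WAL sync blocks until every replica in the pipeline acks.
    return True

assert put_blocks_on_wal_sync() is True                       # default: durable
assert put_blocks_on_wal_sync(write_to_wal=False) is False    # fastest, riskiest
assert put_blocks_on_wal_sync(deferred_log_flush=True) is False
```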

> then isn't there a window where a region server goes down, the data
> might not be available anywhere else, until it comes back up

When a regionserver process fails or the node upon which it is running crashes or is partitioned or whatever, store files, flush files, and write-ahead log data are all fully available by way of HDFS to any regionserver in the cluster taking over regions from the failed regionserver.

There is a window of time where the data in the regions of the failed regionserver will not be available, until those regions are redeployed to live regionservers. This is because in the BigTable model, access to a region goes exclusively through its assigned regionserver; it is what I was alluding to when I said "On the other hand, HBase has its own and different scenarios where data may not be immediately available."
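That window can be pictured with a toy assignment table (hypothetical Python, not HBase code; the region and server names are invented). Each region is served by exactly one regionserver, so a dead server's regions are unreadable until the master reassigns them:

```python
# Toy model of BigTable-style single region assignment -- illustration only.
assignments = {"region-A": "rs1", "region-B": "rs1", "region-C": "rs2"}

def is_available(region, live_servers):
    """A region is readable only through its one assigned, live server."""
    return assignments.get(region) in live_servers

def reassign_from_failed(failed, live_servers):
    # The master moves the failed server's regions to survivors; the WAL in
    # HDFS lets the new server replay any un-flushed edits, so no data is lost.
    for region, rs in assignments.items():
        if rs == failed:
            assignments[region] = next(iter(live_servers))

live = {"rs2"}                                  # rs1 has crashed
assert not is_available("region-A", live)       # unavailable during the window
reassign_from_failed("rs1", live)
assert is_available("region-A", live)           # served again after reassignment
```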


Best regards,


      - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Sam Seigal <se...@yahoo.com>
>To: user@hbase.apache.org; Andrew Purtell <ap...@apache.org>
>Cc: "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org>
>Sent: Wednesday, August 31, 2011 3:22 AM
>Subject: Re: HBase and Cassandra on StackOverflow
>
>Will the write call to HBase block until the record written is fully
>replicated ? If not (since it is happening at the block level), then isn't
>there a window where a region server goes down, the data might not be
>available anywhere else, until it comes back up ?

Re: HBase and Cassandra on StackOverflow

Posted by Sam Seigal <se...@yahoo.com>.
Will the write call to HBase block until the written record is fully
replicated? If not (since replication happens at the block level), isn't
there a window where, if a region server goes down, the data might not be
available anywhere else until it comes back up?

On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <ap...@apache.org> wrote:

> > Is the replication strategy for HBase completely reliant on HDFS' block
> > replication pipelining ?
>
> Yes.
>
> > Is this replication process asynchronous ?
>
>
> No.
> Best regards,
>
>
>        - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>
> >________________________________
> >From: Sam Seigal <se...@yahoo.com>
> >To: user@hbase.apache.org; Andrew Purtell <ap...@apache.org>
> >Cc: "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org>
> >Sent: Tuesday, August 30, 2011 7:35 PM
> >Subject: Re: HBase and Cassandra on StackOverflow
> >
> >A question inline:
> >
> >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <ap...@apache.org>
> wrote:
> >
> >> Hi Chris,
> >>
> >> Appreciate your answer on the post.
> >>
> >> Personally speaking however the endless Cassandra vs. HBase discussion
> is
> >> tiresome and rarely do blog posts or emails in this regard shed any
> light.
> >> Often, Cassandra proponents mis-state their case out of ignorance of
> HBase
> >> or due to commercial or personal agendas. It is difficult to find clear
> eyed
> >> analysis among the partisans. I'm not sure it will make any difference
> >> posting a rebuttal to some random thing jbellis says. Better to focus on
> >> improving HBase than play whack a mole.
> >>
> >>
> >> Regarding some of the specific points in that post:
> >>
> >> HBase is proven in production deployments larger than the largest
> publicly
> >> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But
> basically
> >> this is the same order of magnitude, with HBase having a slight edge. I
> >> don't see a meaningful difference here. Stating otherwise is false.
> >>
> >> HBase supports replication between clusters (i.e. data centers). I
> believe,
> >> but admit I'm not super familiar with the Cassandra option here, that
> the
> >> main difference is HBase provides simple mechanism and the user must
> build a
> >> replication architecture useful for them; while Cassandra attempts to
> hide
> >> some of that complexity. I do not know if they succeed there, but large
> >> scale cross data center replication is rarely one size fits all so I
> doubt
> >> it.
> >>
> >> Cassandra does not have strong consistency in the sense that HBase
> >> provides. It can provide strong consistency, but at the cost of failing
> any
> >> read if there is insufficient quorum. HBase/HDFS does not have that
> >> limitation. On the other hand, HBase has its own and different scenarios
> >> where data may not be immediately available. The differences between the
> >> systems are nuanced and which to use depends on the use case
> requirements.
> >>
> >>
> >I have a question regarding this point. Is the replication strategy for
> >HBase completely reliant on HDFS' block replication pipelining ? Is this
> >replication process asynchronous ? If it is, then is there not a window,
> >where, when a machine dies and the replication pipeline for a
> particular
> >block has not started yet, that block will be unavailable until the
> machine
> >comes back up ? Sorry, if I am missing something important here.
> >
> >
> >> Cassandra's RandomPartitioner / hash based partitioning means efficient
> >> MapReduce or table scanning is not possible, whereas HBase's distributed
> >> ordered tree is naturally efficient for such use cases, I believe
> explaining
> >> why Hadoop users often prefer it. This may or may not be a problem for
> any
> >> given use case. Using an ordered partitioner with Cassandra used to
> require
> >> frequent manual rebalancing to avoid blowing up nodes. I don't know if
> more
> >> recent versions still have this mis-feature.
> >>
> >> Cassandra is no less complex than HBase. All of this complexity is
> "hidden"
> >> in the sense that with Hadoop/HBase the layering is obvious -- HDFS,
> HBase,
> >> etc. -- but the Cassandra internals are no less layered. An impartial
> >> analysis of implementation and algorithms will reveal that Cassandra's
> >> theory of operation in its full detail is substantially more complex.
> >> Compare the BigTable and Dynamo papers and this is clear. There are
> actually
> >> more opportunities for something to go wrong with Cassandra.
> >>
> >> While we are looking at codebases, it should be noted that HBase has
> >> substantially more unit tests.
> >>
> >> With Cassandra, all RPC is via Thrift with various wrappers, so actually
> >> all Cassandra clients are second class in the sense that jbellis means
> when
> >> he states "Non-Java clients are not second-class citizens".
> >>
> >> The master-slave versus peer-to-peer argument is larger than Cassandra
> vs.
> >> HBase, and not nearly as one sided as claimed. The famous (infamous?)
> global
> >> failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a
> single
> >> flipped bit in a gossip message demonstrates how in peer to peer systems
> >> every node can be a single point of failure. There is no obvious winner,
> >> instead, a series of trade offs. Claiming otherwise is intellectually
> >> dishonest. Master-slave architectures seem easier to operate and reason
> >> about in my experience. Of course, I'm partial there.
> >>
> >> I have just scratched the surface.
> >>
> >>
> >> Best regards,
> >>
> >>
> >>        - Andy
> >>
> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >> (via Tom White)
> >>
> >>
> >> >________________________________
> >> >From: Chris Tarnas <cf...@email.com>
> >> >To: hbase-user@hadoop.apache.org
> >> >Sent: Tuesday, August 30, 2011 2:02 PM
> >> >Subject: HBase and Cassandra on StackOverflow
> >> >
> >> >Someone with better knowledge than I might be interested in helping
> answer
> >> this question over at StackOverflow:
> >> >
> >> >
> >>
> http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
> >> >
> >> >-chris
> >> >
> >> >
> >>
> >
> >
> >
>

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
> Is the replication strategy for HBase completely reliant on HDFS' block
> replication pipelining ?

Yes.

> Is this replication process asynchronous ? 


No.

Best regards,


       - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Sam Seigal <se...@yahoo.com>
>To: user@hbase.apache.org; Andrew Purtell <ap...@apache.org>
>Cc: "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org>
>Sent: Tuesday, August 30, 2011 7:35 PM
>Subject: Re: HBase and Cassandra on StackOverflow
>
>A question inline:
>
>On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <ap...@apache.org> wrote:
>
>> Hi Chris,
>>
>> Appreciate your answer on the post.
>>
>> Personally speaking however the endless Cassandra vs. HBase discussion is
>> tiresome and rarely do blog posts or emails in this regard shed any light.
>> Often, Cassandra proponents mis-state their case out of ignorance of HBase
>> or due to commercial or personal agendas. It is difficult to find clear eyed
>> analysis among the partisans. I'm not sure it will make any difference
>> posting a rebuttal to some random thing jbellis says. Better to focus on
>> improving HBase than play whack a mole.
>>
>>
>> Regarding some of the specific points in that post:
>>
>> HBase is proven in production deployments larger than the largest publicly
>> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But basically
>> this is the same order of magnitude, with HBase having a slight edge. I
>> don't see a meaningful difference here. Stating otherwise is false.
>>
>> HBase supports replication between clusters (i.e. data centers). I believe,
>> but admit I'm not super familiar with the Cassandra option here, that the
>> main difference is HBase provides simple mechanism and the user must build a
>> replication architecture useful for them; while Cassandra attempts to hide
>> some of that complexity. I do not know if they succeed there, but large
>> scale cross data center replication is rarely one size fits all so I doubt
>> it.
>>
>> Cassandra does not have strong consistency in the sense that HBase
>> provides. It can provide strong consistency, but at the cost of failing any
>> read if there is insufficient quorum. HBase/HDFS does not have that
>> limitation. On the other hand, HBase has its own and different scenarios
>> where data may not be immediately available. The differences between the
>> systems are nuanced and which to use depends on the use case requirements.
>>
>>
>I have a question regarding this point. Is the replication strategy for
>HBase completely reliant on HDFS' block replication pipelining ? Is this
>replication process asynchronous ? If it is, then is there not a window,
> >where, when a machine dies and the replication pipeline for a particular
>block has not started yet, that block will be unavailable until the machine
>comes back up ? Sorry, if I am missing something important here.
>
>
>> Cassandra's RandomPartitioner / hash based partitioning means efficient
>> MapReduce or table scanning is not possible, whereas HBase's distributed
>> ordered tree is naturally efficient for such use cases, I believe explaining
>> why Hadoop users often prefer it. This may or may not be a problem for any
>> given use case. Using an ordered partitioner with Cassandra used to require
>> frequent manual rebalancing to avoid blowing up nodes. I don't know if more
>> recent versions still have this mis-feature.
>>
>> Cassandra is no less complex than HBase. All of this complexity is "hidden"
>> in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase,
>> etc. -- but the Cassandra internals are no less layered. An impartial
>> analysis of implementation and algorithms will reveal that Cassandra's
>> theory of operation in its full detail is substantially more complex.
>> Compare the BigTable and Dynamo papers and this is clear. There are actually
>> more opportunities for something to go wrong with Cassandra.
>>
>> While we are looking at codebases, it should be noted that HBase has
>> substantially more unit tests.
>>
>> With Cassandra, all RPC is via Thrift with various wrappers, so actually
>> all Cassandra clients are second class in the sense that jbellis means when
>> he states "Non-Java clients are not second-class citizens".
>>
>> The master-slave versus peer-to-peer argument is larger than Cassandra vs.
>> HBase, and not nearly as one sided as claimed. The famous (infamous?) global
>> failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a single
>> flipped bit in a gossip message demonstrates how in peer to peer systems
>> every node can be a single point of failure. There is no obvious winner,
>> instead, a series of trade offs. Claiming otherwise is intellectually
>> dishonest. Master-slave architectures seem easier to operate and reason
>> about in my experience. Of course, I'm partial there.
>>
>> I have just scratched the surface.
>>
>>
>> Best regards,
>>
>>
>>        - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>>
>>
>> >________________________________
>> >From: Chris Tarnas <cf...@email.com>
>> >To: hbase-user@hadoop.apache.org
>> >Sent: Tuesday, August 30, 2011 2:02 PM
>> >Subject: HBase and Cassandra on StackOverflow
>> >
>> >Someone with better knowledge than I might be interested in helping answer
>> this question over at StackOverflow:
>> >
>> >
>> http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
>> >
>> >-chris
>> >
>> >
>>
>
>
>

Re: HBase and Cassandra on StackOverflow

Posted by Ryan Rawson <ry...@gmail.com>.
The HDFS write pipeline is synchronous, so there is no window.
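To make that concrete, here is a toy simulation (illustrative Python, not the actual HDFS code): the client is acked only after every datanode in the pipeline has accepted the block, so by the time the write call returns, replicas already exist on the other nodes.

```python
# Toy model of a synchronous replication pipeline.
# Illustrative only -- not the real HDFS implementation.

class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, block_id, data):
        # Persist locally before acking upstream.
        self.blocks[block_id] = data
        return True

def pipelined_write(pipeline, block_id, data):
    """Ack the client only after every node in the pipeline has the block."""
    for node in pipeline:
        if not node.write(block_id, data):
            raise IOError("pipeline write failed at %s" % node.name)
    return True  # the client unblocks here, with all replicas in place

nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
acked = pipelined_write(nodes, "blk_1", b"edit")

# After the ack, losing any single node still leaves replicas elsewhere.
survivors = [n for n in nodes[1:] if "blk_1" in n.blocks]
```

Losing any single node after the ack therefore cannot make the block unavailable, which is why there is no window.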
On Aug 30, 2011 4:35 AM, "Sam Seigal" <se...@yahoo.com> wrote:
> A question inline:
>
> On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <ap...@apache.org>
wrote:
>
>> Hi Chris,
>>
>> Appreciate your answer on the post.
>>
>> Personally speaking however the endless Cassandra vs. HBase discussion is
>> tiresome and rarely do blog posts or emails in this regard shed any
light.
>> Often, Cassandra proponents mis-state their case out of ignorance of
HBase
>> or due to commercial or personal agendas. It is difficult to find clear
eyed
>> analysis among the partisans. I'm not sure it will make any difference
>> posting a rebuttal to some random thing jbellis says. Better to focus on
>> improving HBase than play whack a mole.
>>
>>
>> Regarding some of the specific points in that post:
>>
>> HBase is proven in production deployments larger than the largest
publicly
>> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But
basically
>> this is the same order of magnitude, with HBase having a slight edge. I
>> don't see a meaningful difference here. Stating otherwise is false.
>>
>> HBase supports replication between clusters (i.e. data centers). I
believe,
>> but admit I'm not super familiar with the Cassandra option here, that the
>> main difference is HBase provides simple mechanism and the user must
build a
>> replication architecture useful for them; while Cassandra attempts to
hide
>> some of that complexity. I do not know if they succeed there, but large
>> scale cross data center replication is rarely one size fits all so I
doubt
>> it.
>>
>> Cassandra does not have strong consistency in the sense that HBase
>> provides. It can provide strong consistency, but at the cost of failing
any
>> read if there is insufficient quorum. HBase/HDFS does not have that
>> limitation. On the other hand, HBase has its own and different scenarios
>> where data may not be immediately available. The differences between the
>> systems are nuanced and which to use depends on the use case
requirements.
>>
>>
> I have a question regarding this point. Is the replication strategy for
> HBase completely reliant on HDFS' block replication pipelining ? Is this
> replication process asynchronous ? If it is, then is there not a window,
> where, when a machine dies and the replication pipeline for a
particular
> block has not started yet, that block will be unavailable until the
machine
> comes back up ? Sorry, if I am missing something important here.
>
>
>> Cassandra's RandomPartitioner / hash based partitioning means efficient
>> MapReduce or table scanning is not possible, whereas HBase's distributed
>> ordered tree is naturally efficient for such use cases, I believe
explaining
>> why Hadoop users often prefer it. This may or may not be a problem for
any
>> given use case. Using an ordered partitioner with Cassandra used to
require
>> frequent manual rebalancing to avoid blowing up nodes. I don't know if
more
>> recent versions still have this mis-feature.
>>
>> Cassandra is no less complex than HBase. All of this complexity is
"hidden"
>> in the sense that with Hadoop/HBase the layering is obvious -- HDFS,
HBase,
>> etc. -- but the Cassandra internals are no less layered. An impartial
>> analysis of implementation and algorithms will reveal that Cassandra's
>> theory of operation in its full detail is substantially more complex.
>> Compare the BigTable and Dynamo papers and this is clear. There are
actually
>> more opportunities for something to go wrong with Cassandra.
>>
>> While we are looking at codebases, it should be noted that HBase has
>> substantially more unit tests.
>>
>> With Cassandra, all RPC is via Thrift with various wrappers, so actually
>> all Cassandra clients are second class in the sense that jbellis means
when
>> he states "Non-Java clients are not second-class citizens".
>>
>> The master-slave versus peer-to-peer argument is larger than Cassandra
vs.
>> HBase, and not nearly as one sided as claimed. The famous (infamous?)
global
>> failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a
single
>> flipped bit in a gossip message demonstrates how in peer to peer systems
>> every node can be a single point of failure. There is no obvious winner,
>> instead, a series of trade offs. Claiming otherwise is intellectually
>> dishonest. Master-slave architectures seem easier to operate and reason
>> about in my experience. Of course, I'm partial there.
>>
>> I have just scratched the surface.
>>
>>
>> Best regards,
>>
>>
>> - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>>
>>
>> >________________________________
>> >From: Chris Tarnas <cf...@email.com>
>> >To: hbase-user@hadoop.apache.org
>> >Sent: Tuesday, August 30, 2011 2:02 PM
>> >Subject: HBase and Cassandra on StackOverflow
>> >
>> >Someone with better knowledge than I might be interested in helping answer
>> this question over at StackOverflow:
>> >
>> >
>>
http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
>> >
>> >-chris
>> >
>> >
>>
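The strong-consistency trade-off in the quoted text boils down to Dynamo-style quorum arithmetic: a read and a write are guaranteed to overlap only when R + W > N, and a quorum read must fail outright when fewer than R replicas respond. A rough sketch of that rule (illustrative Python, not Cassandra's implementation):

```python
# Dynamo-style quorum rules, sketched for illustration.
# Not Cassandra code; simple R/W/N semantics are assumed.

def is_strongly_consistent(n, r, w):
    """Overlapping read and write sets guarantee a read sees the latest write."""
    return r + w > n

def quorum_read(live_replicas, r):
    """A quorum read fails outright if fewer than R replicas respond."""
    if live_replicas < r:
        raise RuntimeError("quorum not met")
    return "latest-value"

# N=3 with QUORUM reads and writes (R=W=2) gives strong consistency...
strong = is_strongly_consistent(3, 2, 2)

# ...but with two of three replicas down, the read cannot proceed at all.
try:
    quorum_read(live_replicas=1, r=2)
    failed = False
except RuntimeError:
    failed = True
```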

Re: HBase and Cassandra on StackOverflow

Posted by Sam Seigal <se...@yahoo.com>.
A question inline:

On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <ap...@apache.org> wrote:

> Hi Chris,
>
> Appreciate your answer on the post.
>
> Personally speaking however the endless Cassandra vs. HBase discussion is
> tiresome and rarely do blog posts or emails in this regard shed any light.
> Often, Cassandra proponents mis-state their case out of ignorance of HBase
> or due to commercial or personal agendas. It is difficult to find clear eyed
> analysis among the partisans. I'm not sure it will make any difference
> posting a rebuttal to some random thing jbellis says. Better to focus on
> improving HBase than play whack a mole.
>
>
> Regarding some of the specific points in that post:
>
> HBase is proven in production deployments larger than the largest publicly
> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But basically
> this is the same order of magnitude, with HBase having a slight edge. I
> don't see a meaningful difference here. Stating otherwise is false.
>
> HBase supports replication between clusters (i.e. data centers). I believe,
> but admit I'm not super familiar with the Cassandra option here, that the
> main difference is HBase provides simple mechanism and the user must build a
> replication architecture useful for them; while Cassandra attempts to hide
> some of that complexity. I do not know if they succeed there, but large
> scale cross data center replication is rarely one size fits all so I doubt
> it.
>
> Cassandra does not have strong consistency in the sense that HBase
> provides. It can provide strong consistency, but at the cost of failing any
> read if there is insufficient quorum. HBase/HDFS does not have that
> limitation. On the other hand, HBase has its own and different scenarios
> where data may not be immediately available. The differences between the
> systems are nuanced and which to use depends on the use case requirements.
>
>
I have a question regarding this point. Is the replication strategy for
HBase completely reliant on HDFS' block replication pipelining? Is this
replication process asynchronous? If it is, then is there not a window
where, when a machine dies and the replication pipeline for a particular
block has not started yet, that block will be unavailable until the machine
comes back up? Sorry if I am missing something important here.


> Cassandra's RandomPartitioner / hash based partitioning means efficient
> MapReduce or table scanning is not possible, whereas HBase's distributed
> ordered tree is naturally efficient for such use cases, I believe explaining
> why Hadoop users often prefer it. This may or may not be a problem for any
> given use case. Using an ordered partitioner with Cassandra used to require
> frequent manual rebalancing to avoid blowing up nodes. I don't know if more
> recent versions still have this mis-feature.
>
> Cassandra is no less complex than HBase. All of this complexity is "hidden"
> in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase,
> etc. -- but the Cassandra internals are no less layered. An impartial
> analysis of implementation and algorithms will reveal that Cassandra's
> theory of operation in its full detail is substantially more complex.
> Compare the BigTable and Dynamo papers and this is clear. There are actually
> more opportunities for something to go wrong with Cassandra.
>
> While we are looking at codebases, it should be noted that HBase has
> substantially more unit tests.
>
> With Cassandra, all RPC is via Thrift with various wrappers, so actually
> all Cassandra clients are second class in the sense that jbellis means when
> he states "Non-Java clients are not second-class citizens".
>
> The master-slave versus peer-to-peer argument is larger than Cassandra vs.
> HBase, and not nearly as one sided as claimed. The famous (infamous?) global
> failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a single
> flipped bit in a gossip message demonstrates how in peer to peer systems
> every node can be a single point of failure. There is no obvious winner,
> instead, a series of trade offs. Claiming otherwise is intellectually
> dishonest. Master-slave architectures seem easier to operate and reason
> about in my experience. Of course, I'm partial there.
>
> I have just scratched the surface.
>
>
> Best regards,
>
>
>        - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>
> >________________________________
> >From: Chris Tarnas <cf...@email.com>
> >To: hbase-user@hadoop.apache.org
> >Sent: Tuesday, August 30, 2011 2:02 PM
> >Subject: HBase and Cassandra on StackOverflow
> >
> >Someone with better knowledge than I might be interested in helping answer
> this question over at StackOverflow:
> >
> >
> http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
> >
> >-chris
> >
> >
>
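The partitioning point in the quoted text can be illustrated with toy placement functions (hypothetical helpers, not the real Cassandra or HBase logic): under hash partitioning a contiguous key-range scan touches nodes scattered across the cluster, while an ordered partitioner confines it to a small contiguous set.

```python
# Toy comparison of hash vs. ordered partitioning for range scans.
# Illustrative only; not the actual Cassandra/HBase placement logic.
import hashlib

NODES = 4
keys = ["row-%04d" % i for i in range(100)]

def hash_partition(key):
    # RandomPartitioner-style placement: a hash of the key picks the node.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NODES

def ordered_partition(key):
    # Ordered placement: contiguous key ranges map to contiguous nodes.
    return int(key[4:]) * NODES // len(keys)

# A key-range scan over a quarter of the keyspace.
scan = [k for k in keys if "row-0025" <= k <= "row-0049"]
hash_nodes = {hash_partition(k) for k in scan}
ordered_nodes = {ordered_partition(k) for k in scan}
```

With ordered placement the scan lands on a single node's range, so MapReduce splits and table scans stay local and sequential; with hash placement the same scan fans out across the cluster.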

Re: HBase and Cassandra on StackOverflow

Posted by Ryan Rawson <ry...@gmail.com>.
I really like the theory of operation stuff. People say that
centralized operation is a flaw, but I say it's a strength. In a
single datacenter, with ping times of 0.1 ms or less, there is
no need for a fully decentralized architecture, which can be really
hard to debug.

-ryan


On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <ap...@apache.org> wrote:
> Hi Chris,
>
> Appreciate your answer on the post.
>
> Personally speaking however the endless Cassandra vs. HBase discussion is tiresome and rarely do blog posts or emails in this regard shed any light. Often, Cassandra proponents mis-state their case out of ignorance of HBase or due to commercial or personal agendas. It is difficult to find clear eyed analysis among the partisans. I'm not sure it will make any difference posting a rebuttal to some random thing jbellis says. Better to focus on improving HBase than play whack a mole.
>
>
> Regarding some of the specific points in that post:
>
> HBase is proven in production deployments larger than the largest publicly reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But basically this is the same order of magnitude, with HBase having a slight edge. I don't see a meaningful difference here. Stating otherwise is false.
>
> HBase supports replication between clusters (i.e. data centers). I believe, but admit I'm not super familiar with the Cassandra option here, that the main difference is HBase provides simple mechanism and the user must build a replication architecture useful for them; while Cassandra attempts to hide some of that complexity. I do not know if they succeed there, but large scale cross data center replication is rarely one size fits all so I doubt it.
>
> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.
>
> Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case. Using an ordered partitioner with Cassandra used to require frequent manual rebalancing to avoid blowing up nodes. I don't know if more recent versions still have this mis-feature.
>
> Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. An impartial analysis of implementation and algorithms will reveal that Cassandra's theory of operation in its full detail is substantially more complex. Compare the BigTable and Dynamo papers and this is clear. There are actually more opportunities for something to go wrong with Cassandra.
>
> While we are looking at codebases, it should be noted that HBase has substantially more unit tests.
>
> With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".
>
> The master-slave versus peer-to-peer argument is larger than Cassandra vs. HBase, and not nearly as one sided as claimed. The famous (infamous?) global failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a single flipped bit in a gossip message demonstrates how in peer to peer systems every node can be a single point of failure. There is no obvious winner, instead, a series of trade offs. Claiming otherwise is intellectually dishonest. Master-slave architectures seem easier to operate and reason about in my experience. Of course, I'm partial there.
>
> I have just scratched the surface.
>
>
> Best regards,
>
>
>        - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
>
>>________________________________
>>From: Chris Tarnas <cf...@email.com>
>>To: hbase-user@hadoop.apache.org
>>Sent: Tuesday, August 30, 2011 2:02 PM
>>Subject: HBase and Cassandra on StackOverflow
>>
>>Someone with better knowledge than I might be interested in helping answer this question over at StackOverflow:
>>
>>http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
>>
>>-chris
>>
>>

Re: HBase and Cassandra on StackOverflow

Posted by Bernd Fondermann <be...@googlemail.com>.
On Tue, Aug 30, 2011 at 11:47, Andrew Purtell <ap...@apache.org> wrote:
> Hi Chris,
>
> Appreciate your answer on the post.
>
> Personally speaking however the endless Cassandra vs. HBase discussion is tiresome and rarely do blog posts or emails in this regard shed any light. Often, Cassandra proponents mis-state their case out of ignorance of HBase or due to commercial or personal agendas. It is difficult to find clear eyed analysis among the partisans. I'm not sure it will make any difference posting a rebuttal to some random thing jbellis says. Better to focus on improving HBase than play whack a mole.
>
>
> Regarding some of the specific points in that post:
>
> HBase is proven in production deployments larger than the largest publicly reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But basically this is the same order of magnitude, with HBase having a slight edge. I don't see a meaningful difference here. Stating otherwise is false.
>
> HBase supports replication between clusters (i.e. data centers). I believe, but admit I'm not super familiar with the Cassandra option here, that the main difference is HBase provides simple mechanism and the user must build a replication architecture useful for them; while Cassandra attempts to hide some of that complexity. I do not know if they succeed there, but large scale cross data center replication is rarely one size fits all so I doubt it.
>
> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.
>
> Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case. Using an ordered partitioner with Cassandra used to require frequent manual rebalancing to avoid blowing up nodes. I don't know if more recent versions still have this mis-feature.
>
> Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. An impartial analysis of implementation and algorithms will reveal that Cassandra's theory of operation in its full detail is substantially more complex. Compare the BigTable and Dynamo papers and this is clear. There are actually more opportunities for something to go wrong with Cassandra.
>
> While we are looking at codebases, it should be noted that HBase has substantially more unit tests.
>
> With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".
>
> The master-slave versus peer-to-peer argument is larger than Cassandra vs. HBase, and not nearly as one sided as claimed. The famous (infamous?) global failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a single flipped bit in a gossip message demonstrates how in peer to peer systems every node can be a single point of failure. There is no obvious winner, instead, a series of trade offs. Claiming otherwise is intellectually dishonest. Master-slave architectures seem easier to operate and reason about in my experience. Of course, I'm partial there.
>
> I have just scratched the surface.

+1, insightful.

Thanks for posting this.

  Bernd

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
Hi Chris,

> Would you mind if I paraphrase your responses on StackOverflow?


Go right ahead.


Best regards,


        - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Chris Tarnas <cf...@email.com>
>To: Andrew Purtell <ap...@apache.org>; user@hbase.apache.org
>Sent: Wednesday, August 31, 2011 1:19 AM
>Subject: Re: HBase and Cassandra on StackOverflow
>
>Hi Andrew,
>
>Would you mind if I paraphrase your responses on StackOverflow?
>
>-chris
>

Re: HBase and Cassandra on StackOverflow

Posted by Chris Tarnas <cf...@email.com>.
Hi Andrew,

Would you mind if I paraphrase your responses on StackOverflow?

-chris

On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote:

> Hi Chris,
> 
> Appreciate your answer on the post.
> 
> Personally speaking however the endless Cassandra vs. HBase discussion is tiresome and rarely do blog posts or emails in this regard shed any light. Often, Cassandra proponents mis-state their case out of ignorance of HBase or due to commercial or personal agendas. It is difficult to find clear eyed analysis among the partisans. I'm not sure it will make any difference posting a rebuttal to some random thing jbellis says. Better to focus on improving HBase than play whack a mole.
> 
> 
> Regarding some of the specific points in that post:
> 
> HBase is proven in production deployments larger than the largest publicly reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But basically this is the same order of magnitude, with HBase having a slight edge. I don't see a meaningful difference here. Stating otherwise is false.
> 
> HBase supports replication between clusters (i.e. data centers). I believe, but admit I'm not super familiar with the Cassandra option here, that the main difference is HBase provides a simple mechanism and the user must build a replication architecture useful for them; while Cassandra attempts to hide some of that complexity. I do not know if they succeed there, but large scale cross data center replication is rarely one size fits all so I doubt it.
> 
> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.
> 
> Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case. Using an ordered partitioner with Cassandra used to require frequent manual rebalancing to avoid blowing up nodes. I don't know if more recent versions still have this mis-feature.
> 
> Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. An impartial analysis of implementation and algorithms will reveal that Cassandra's theory of operation in its full detail is substantially more complex. Compare the BigTable and Dynamo papers and this is clear. There are actually more opportunities for something to go wrong with Cassandra.
> 
> While we are looking at codebases, it should be noted that HBase has substantially more unit tests.
> 
> With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".
> 
> The master-slave versus peer-to-peer argument is larger than Cassandra vs. HBase, and not nearly as one sided as claimed. The famous (infamous?) global failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a single flipped bit in a gossip message demonstrates how in peer to peer systems every node can be a single point of failure. There is no obvious winner, instead, a series of trade offs. Claiming otherwise is intellectually dishonest. Master-slave architectures seem easier to operate and reason about in my experience. Of course, I'm partial there.
> 
> I have just scratched the surface.
> 
> 
> Best regards,
> 
> 
>        - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
> 
> 
>> ________________________________
>> From: Chris Tarnas <cf...@email.com>
>> To: hbase-user@hadoop.apache.org
>> Sent: Tuesday, August 30, 2011 2:02 PM
>> Subject: HBase and Cassandra on StackOverflow
>> 
>> Someone with better knowledge than I might be interested in helping answer this question over at StackOverflow:
>> 
>> http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
>> 
>> -chris
>> 


Re: HBase and Cassandra on StackOverflow

Posted by Ryan Rawson <ry...@gmail.com>.
On Tue, Aug 30, 2011 at 10:42 AM, Joe Pallas <jo...@oracle.com> wrote:
>
> On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote:
>
>> Better to focus on improving HBase than play whack a mole.
>
> Absolutely.  So let's talk about improving HBase.  I'm speaking here as someone who has been learning about and experimenting with HBase for more than six months.
>
>> HBase supports replication between clusters (i.e. data centers).
>
> That’s … debatable.  There's replication support in the code, but several times in the recent past when someone asked about it on this mailing list, the response was “I don't know of anyone actually using it.”  My understanding of replication is that you can't replicate any existing data, so unless you activated it on day one, it isn't very useful.  Do I misunderstand?
>
>> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.
>
> That's fair enough, although I think your first two sentences nearly contradict each other :-).  If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior to HBase/HDFS with respect to consistency and availability ("strong" consistency and reads do not fail if any one copy is available).

This is on the surface true, but there are a few hbase use cases that
cass has a harder time supporting:
- increment counter
- CAS calls

some people find these essential to building systems
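
The two primitives Ryan mentions can be illustrated with a toy sketch. This is a hedged model of the *semantics* of HBase's incrementColumnValue and checkAndPut (single-row atomic operations), not the actual client API:

```python
# Toy single-copy key-value store modeling atomic increment and
# check-and-put (CAS). In HBase these are row-level atomic operations
# served by the one region that owns the row.

class ToyStore:
    def __init__(self):
        self.cells = {}  # (row, column) -> value

    def increment(self, row, col, amount=1):
        """Atomically add to a counter cell, returning the new value."""
        key = (row, col)
        self.cells[key] = self.cells.get(key, 0) + amount
        return self.cells[key]

    def check_and_put(self, row, col, expected, new_value):
        """Write new_value only if the cell currently holds `expected`.
        Returns True if the write happened, False otherwise."""
        key = (row, col)
        if self.cells.get(key) != expected:
            return False
        self.cells[key] = new_value
        return True

store = ToyStore()
store.increment("page1", "hits")        # -> 1
store.increment("page1", "hits", 5)     # -> 6
store.check_and_put("lock1", "owner", None, "worker-a")  # True: acquired
store.check_and_put("lock1", "owner", None, "worker-b")  # False: already held
```

Both primitives lean on there being one authoritative copy of the row; emulating them on an eventually consistent store requires coordination across replicas, which is why they are harder to support there.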

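Joe's N=3, W=3, R=1 example above is an instance of the standard quorum overlap rule: reads see the latest acknowledged write whenever R + W > N. A tiny illustrative sketch (not Cassandra code):

```python
def is_strongly_consistent(n, w, r):
    """Quorum overlap rule: every read set of size r must intersect
    every write set of size w among n replicas."""
    return r + w > n

def read_failure_tolerance(n, r):
    """A read needing r replies tolerates n - r replica failures."""
    return n - r

# Joe's example: N=3, W=3, R=1
assert is_strongly_consistent(3, 3, 1)     # consistent, like HBase
assert read_failure_tolerance(3, 1) == 2   # reads succeed with any one copy
# ...but W=3 means a single down replica fails every write.
# The usual QUORUM/QUORUM compromise:
assert is_strongly_consistent(3, 2, 2)     # still consistent
```

The trade-off surfaces immediately: W=3 buys cheap, always-available reads at the cost of write availability, which is exactly the nuance being argued over in this thread.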
>
> A more important point, I think, is the one about storage.  HBase uses two different kinds of files, data files and logs, but HDFS doesn't know about that and cannot, for example, optimize data files for write throughput (and random reads) and log files for low latency sequential writes.  (For example, how could performance be improved by adding solid-state disk?)

I think "HDFS doesn't know about that and cannot... optimize" is a bit
of an overstatement... While it is TRUE that currently HDFS does not
do anything here, there is no reason why it couldn't do something better.
Adding SSD in an intelligent way would be nice.  Probably not for logs
though.

Will HDFS ever focus on these things?  Probably in the mid-term, I'm
guessing we'll start to see attention on this towards the end of 2012,
or possibly not at all (after all, these things don't help MapReduce, so
why bother?)

If an alternate DFS was able to work on these issues, they could very
quickly differentiate themselves over HDFS in terms of HBase support.


>
>> Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case.
>
> I don't think you can make a blanket statement that random partitioning makes efficient MapReduce impossible (scanning, yes).  Many M/R tasks process entire tables.  Random partitioning has definite advantages for some cases, and HBase might well benefit from recognizing that and adding some support.
>
>> Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered.
>
> Operationally, however, HBase is more complex.  Admins have to configure and manage ZooKeeper, HDFS, and HBase.  Could this be improved?
>
>> With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".
>
> That's disingenuous.  Thrift exposes all of the Cassandra API to all of the wrappers, while HBase clients who want to use all of the HBase API must use Java.  That can be fixed, but it is the status quo.
>
> joe
>
>
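
The partitioning point quoted above can be made concrete: under hash partitioning a contiguous key range scatters across the cluster, while ordered partitioning keeps it on one or a few nodes. A sketch of the idea, not either system's real placement logic (split points are made up):

```python
import hashlib

NODES = 4

def hash_partition(key):
    # RandomPartitioner-style: place by hash of the key.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % NODES

def range_partition(key, split_points=("g", "n", "t")):
    # HBase-style: place by position in the sorted key space.
    for node, split in enumerate(split_points):
        if key < split:
            return node
    return len(split_points)

keys = ["row%03d" % i for i in range(100)]  # a contiguous key range

# A scan of these keys touches almost every node under hashing...
print(len({hash_partition(k) for k in keys}))   # typically all 4 nodes
# ...but only the node owning that range under ordered partitioning.
print(len({range_partition(k) for k in keys}))  # 1 node
```

The flip side, as noted in the thread, is that ordered partitioning concentrates load, which is why it historically needed manual rebalancing, while hashing spreads it for free.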

Re: HBase and Cassandra on StackOverflow

Posted by Edward Capriolo <ed...@gmail.com>.
On Tue, Aug 30, 2011 at 1:42 PM, Joe Pallas <jo...@oracle.com>wrote:

>
> On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote:
>
> > Better to focus on improving HBase than play whack a mole.
>
> Absolutely.  So let's talk about improving HBase.  I'm speaking here as
> someone who has been learning about and experimenting with HBase for more
> than six months.
>
> > HBase supports replication between clusters (i.e. data centers).
>
> That’s … debatable.  There's replication support in the code, but several
> times in the recent past when someone asked about it on this mailing list,
> the response was “I don't know of anyone actually using it.”  My
> understanding of replication is that you can't replicate any existing data,
> so unless you activated it on day one, it isn't very useful.  Do I
> misunderstand?
>
> > Cassandra does not have strong consistency in the sense that HBase
> provides. It can provide strong consistency, but at the cost of failing any
> read if there is insufficient quorum. HBase/HDFS does not have that
> limitation. On the other hand, HBase has its own and different scenarios
> where data may not be immediately available. The differences between the
> systems are nuanced and which to use depends on the use case requirements.
>
> That's fair enough, although I think your first two sentences nearly
> contradict each other :-).  If you use N=3, W=3, R=1 in Cassandra, you
> should get similar behavior to HBase/HDFS with respect to consistency and
> availability ("strong" consistency and reads do not fail if any one copy is
> available).
>
> A more important point, I think, is the one about storage.  HBase uses two
> different kinds of files, data files and logs, but HDFS doesn't know about
> that and cannot, for example, optimize data files for write throughput (and
> random reads) and log files for low latency sequential writes.  (For
> example, how could performance be improved by adding solid-state disk?)
>
> > Cassandra's RandomPartitioner / hash based partitioning means efficient
> MapReduce or table scanning is not possible, whereas HBase's distributed
> ordered tree is naturally efficient for such use cases, I believe explaining
> why Hadoop users often prefer it. This may or may not be a problem for any
> given use case.
>
> I don't think you can make a blanket statement that random partitioning
> makes efficient MapReduce impossible (scanning, yes).  Many M/R tasks
> process entire tables.  Random partitioning has definite advantages for some
> cases, and HBase might well benefit from recognizing that and adding some
> support.
>
> > Cassandra is no less complex than HBase. All of this complexity is
> "hidden" in the sense that with Hadoop/HBase the layering is obvious --
> HDFS, HBase, etc. -- but the Cassandra internals are no less layered.
>
> Operationally, however, HBase is more complex.  Admins have to configure
> and manage ZooKeeper, HDFS, and HBase.  Could this be improved?
>
> > With Cassandra, all RPC is via Thrift with various wrappers, so actually
> all Cassandra clients are second class in the sense that jbellis means when
> he states "Non-Java clients are not second-class citizens".
>
> That's disingenuous.  Thrift exposes all of the Cassandra API to all of the
> wrappers, while HBase clients who want to use all of the HBase API must use
> Java.  That can be fixed, but it is the status quo.
>
> joe
>
>
Hooked into another Cassandra hbase thread...

Cassandra's RandomPartitioner / hash based partitioning means efficient
MapReduce or table scanning is not possible, whereas HBase's distributed
ordered tree is naturally efficient for such use cases, I believe explaining
why Hadoop users often prefer it. This may or may not be a problem for any
given use case.

Many people can and do benefit from this property of HBase.  Efficient
map/reduce still strikes me as an oxymoron :) Yes, you can 'push down'
something like 'WHERE key > x AND key < y', which is pretty nifty. That does
not really bring you all the way to complex queries.  Cassandra now has
support for built-in secondary indexes, and I think soon users will be able
to 'push down' where clauses for 'efficient' map/reduce. Also, you can
currently range scan on columns (in both directions) in C*, which is
efficient. So if you can turn a key-ranging design into a column-ranging
design you can get the same effect. With both systems, HBase and Cassandra,
you likely end up needing to design data around your queries.
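
The key-range-to-column-range idea can be sketched with a wide-row model: one row per entity, columns named by timestamp, so a column slice stands in for a key-range scan. Illustrative only (the data and helper are made up; in Cassandra this would be a get_slice with a SliceRange over comparator-ordered columns):

```python
from bisect import bisect_left

# Wide-row model: row key -> sorted list of (column_name, value).
# Column names sort the way comparator-ordered columns do.
events = {
    "sensor-1": [("2011-08-30T00", 17), ("2011-08-30T06", 21),
                 ("2011-08-30T12", 25), ("2011-08-30T18", 19)],
}

def column_slice(row_key, start, end):
    """Return columns in [start, end) -- the column-level
    analogue of a key-range scan."""
    cols = events[row_key]
    names = [name for name, _ in cols]
    return cols[bisect_left(names, start):bisect_left(names, end)]

print(column_slice("sensor-1", "2011-08-30T06", "2011-08-30T18"))
# -> [('2011-08-30T06', 21), ('2011-08-30T12', 25)]
```

The row key ("sensor-1") can still be placed by hash; the range query happens *within* the row, where columns are ordered, which is the design move Edward is describing.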

Cassandra is no less complex than HBase. All of this complexity is "hidden"
in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase,
etc. -- but the Cassandra internals are no less layered.

*This is an opinion*.  I will disagree on this one. For example, the
Cassandra gossip protocol exchanges two facts (IMHO): the UP/DOWN state of
the ring, and the token ownership of nodes. This information only changes
when nodes join or leave the cluster. On the HBase side of things, many small
regions are splitting and moving often, and this involves communication
between several components, say master, ZK, and region servers.  One-time
setup complexity is one factor; monitoring and troubleshooting is another.
You also have to consider:

1) making your NameNode actually redundant, which means depending on Linux-HA
or multiple NFS servers
2) some way of protecting your masters/ZK nodes from processor/disk
starvation (i.e. they need their own machine)
3) Java's semi-piggish memory usage profile, the fact that it rarely gives
memory back to the OS, so sharing a system with multiple Java processes is not
ideal because each process tends to bubble up to higher than Xmx!
(DataNode, RegionServer, TaskTracker on the same box).

The one-JVM-per-node Cassandra stack is less complex architecturally.
I would argue administratively as well, but I do not know of anyone with ROI
numbers on ten-node Cassandra vs. HBase clusters :)

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
> So the setup starts by recommending rolling your own hadoop (pain in the
> ass). OR using a beta ( :(  ).


CDH3 is not in beta. The latest version is release, CDH3U1.


I think most people at this point will just use CDH, so all of that about rolling your own compile of Hadoop sources -- that is hard? ("ant") -- is a non-issue.

> First, you have to learn:

> 1) Linux HA
> 2) DRBD
> 
> Right out of the gate just to have a redundant name node.


Likewise an HA namenode. Most don't do that I suspect. However, we did. Having a modicum of Linux system administration experience, we were already familiar with DRBD and the RHEL Cluster Suite, so this was not anything we had not seen before.

 
Maybe you are arguing Cassandra is easier for noobs to set up? I guess that's great. But I would not want such a person running my production, and I can't see how any serious person would.

> *Fud ALARM* "Cassandra is rife with cascading cluster failure 
> scenarios."
> ....and hbase never has issues apparently. (remember I am on both lists)

What Ryan said regarding this, I agree completely. I've had occasion over the years to wrangle both master-slave and peer-to-peer systems in various failure modes. In many cases a master gives you a single point of control to regain control of an errant system. There is no such thing in a P2P system, you have to shut down everything and reinitialize. However, refer to my response to the mail that started this thread. Whether master-slave or P2P architecture is appropriate for a given use case involves a series of trade offs. There is no simple answer. Neither is superior to the other.

Best regards,


       - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


----- Original Message -----
> From: Edward Capriolo <ed...@gmail.com>
> To: user@hbase.apache.org
> Cc: 
> Sent: Friday, September 2, 2011 1:53 AM
> Subject: Re: HBase and Cassandra on StackOverflow
> 
> On Wed, Aug 31, 2011 at 1:34 AM, Time Less <ti...@gmail.com> wrote:
> 
>>  Most of your points are dead-on.
>> 
>>  > Cassandra is no less complex than HBase. All of this complexity is
>>  > "hidden" in the sense that with Hadoop/HBase the layering is 
> obvious --
>>  > HDFS, HBase, etc. -- but the Cassandra internals are no less layered.
>>  >
>>  > Operationally, however, HBase is more complex.  Admins have to 
> configure
>>  > and manage ZooKeeper, HDFS, and HBase.  Could this be improved?
>>  >
>> 
>>  I strongly disagree with the premise[1]. Having personally been involved in
>>  the Digg Cassandra rollout, and spent up until a couple months ago being in
>>  part-time weekly contact with the Digg Cassandra administrator, and having
>>  very close ties to the SimpleGeo Cassandra admin, I know it is a fickle
>>  beast. Having also spent a good amount of time at StumbleUpon and Mozilla
>>  (and now Riot Games) I also see first-hand that HBase is far more stable
>>  and
>>  -- dare I say it? -- operationally more simple.
>> 
>>  So okay, HBase is "harder to set up" if following a step-by-step 
> guide on a
>>  wiki is "hard,"[2] but it's FAR easier to administer. 
> Cassandra is rife
>>  with
>>  cascading cluster failure scenarios. I would not recommend running
>>  Cassandra
>>  in a highly-available high-volume data scenario, but don't hesitate to 
> do
>>  so
>>  for HBase.
>> 
>>  I do not know if this is a guaranteed (provable due to architecture)
>>  result,
>>  or just the result of the Cassandra community being... how shall I say...
>>  hostile to administrators. But then, to me it doesn't matter. Results 
> do.
>> 
>>  --
>>  Tim Ellis
>>  Data Architect, Riot Games
>>  [1] That said, the other part of your statement is spot-on, too. It's
>>  surely
>>  possible to improve the HBase architecture or simplify it.
>>  [2] I went from having never set up HBase nor ever used Chef to having
>>  functional Chef recipes that installed a functional HBase/HDFS cluster in
>>  about 2 weeks. From my POV, the biggest stumbling point was that HDFS by
>>  default stores critical data in the underlying filesystem's /tmp 
> directory
>>  by default, which is, for lack of a better word, insane. If I had to
>>  suggest
>>  how to simplify "HBase installation," I'd ask for sane HDFS 
> config files
>>  that are extremely common and difficult-to-ignore.
>> 
> 
> Why are you quoting "harder"? What was said was "more complex". Setting
> up N things is more complex than setting up a single thing.
> 
> First, you have to learn:
> 1) Linux HA
> 2) DRBD
> 
> Right out of the gate just to have a redundant name node.
> 
> This is not easy, fast, or simple. In fact this is quite a pain.
> http://docs.google.com/viewer?a=v&q=cache:9rnx-eRzi1AJ:files.meetup.com/1228907/Hadoop%2520Namenode%2520High%2520Availability.pptx+linux+ha+namenode&hl=en&gl=us&pid=bl&srcid=ADGEESig5aJNVAXbLgBwyc311sPSd88jUJbKHx4z2PQtDKHnmM1FuCJpg2IUyqi5JrmUL3RbCb8QRYsjHnP74YuKQfOQXoUZxnhrCy6N1kVpiG1jNi4zhqoKlUTmoDaqS1NegCFb6-WM&sig=AHIEtbQbjN1Olwxui5JmywdWzhqv4Hq3tw&pli=1
> 
> Doing it properly involves setting up physical wires between servers or link
> aggregation groups. You can't script having someone physically run crossover
> cables. You need your switching engineer to set up LAG's.
> Also you may notice that everyone who describes this setup is also
> describing it using Linux-HA v1, which has been deprecated for over 2 years.
> This also demonstrates how the process is so complicated that people tend to
> touch it once and never touch it again because of how fragile it is.
> 
> You are also implying that following the wiki is easy. Personally, I find
> that the wiki has fine detail, but it is confusing.
> Here is why.
> 
> "1.3.1.2. hadoop
> 
> This version of HBase will only run on Hadoop 0.20.x. It will not run on
> hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an
> HDFS that has a durable sync. Currently only the branch-0.20-append branch
> has this attribute[1]. No official releases have been made from this branch
> up to now so you will have to build your own Hadoop from the tip of this
> branch. Michael Noll has written a detailed blog, Building an Hadoop 0.20.x
> version for HBase 0.90.2, on how to build an Hadoop from branch-0.20-append.
> Recommended.
> 
> Or rather than build your own, you could use Cloudera's CDH3. CDH has the
> 0.20-append patches needed to add a durable sync (CDH3 betas will suffice;
> b2, b3, or b4)."
> 
> So the setup starts by recommending rolling your own hadoop (pain in the
> ass). OR using a beta ( :(  ).
> 
> Then when it gets onto HBase, the guide branches into “Standalone HBase” and
> Section 1.3.2.2, “Distributed”. That branches into "pseudo distributed" and
> "fully distributed", and then the zookeeper section offers you two options:
> "1.3.2.2.2.2. ZooKeeper" and "1.3.2.2.2.2.1. Using existing ZooKeeper
> ensemble".
> 
> Not to say this is hard or impossible, but it is a lot of information to
> digest, and all the branching decisions are hard to understand for a
> first-time user.
> 
> Uppercasing the word FAR does not prove to me that HBase is easier to
> administer, nor does your employment history or secondhand stories
> from unnamed people you know. I can tell you why I think Cassandra is easier
> to manage:
> 
> 1) There is only one log file /var/log/cassandra/system.log
> 2) There is only one configuration folder
> /usr/local/cassandra/conf/cassandra.yaml cassandra-env.sh
> 3) I do not need to keep a chart or post it notes where all these 1 off
> components are.  zk server list, hbase master server list, namenode,
> 4) No need to configure auxiliary stuff such as DRBD or Linux-HA
> 
> *Fud ALARM* "Cassandra is rife with cascading cluster failure 
> scenarios."
> ....and hbase never has issues apparently. (remember I am on both lists)
> 
> Also...
> [2] I went from having never set up HBase nor ever used Chef to having
> functional Chef recipes that installed a functional HBase/HDFS cluster in
> about 2 weeks.
> 
> It took me about one hour to accomplish the same result with puppet +
> cassandra.
> http://www.jointhegrid.com/highperfcassandra/?p=62
>
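
On the /tmp point in the quoted footnote [2]: HDFS derives its NameNode and DataNode storage locations from hadoop.tmp.dir, which defaults under /tmp. A minimal hdfs-site.xml that moves them somewhere durable might look like this (the paths are illustrative; the property names are the Hadoop 0.20-era ones):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- NameNode metadata: by default derived from hadoop.tmp.dir (/tmp). -->
  <property>
    <name>dfs.name.dir</name>
    <value>/var/lib/hadoop/dfs/name</value>
  </property>
  <!-- DataNode block storage: same /tmp-derived default. -->
  <property>
    <name>dfs.data.dir</name>
    <value>/var/lib/hadoop/dfs/data</value>
  </property>
</configuration>
```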

Re: HBase and Cassandra on StackOverflow

Posted by Time Less <ti...@gmail.com>.
> > Can't we just all get along? :-)
> ...
> So, no, frankly we cannot get along. If this were a passionless technical
> discussion that would not be the case. However, from my perspective, which I
> believe is shared by others on the HBase side, the Cassandra project is run
> by asshats and some of their boosters share that unfortunate trait.
>

I have an even more unfortunate position than Andy in this. I started out as
perhaps the one administrator who was most qualified to be an Apache
Cassandra proponent. I was the data director at the flagship premier
big-data shop that was to be using this newly open-sourced Apache Cassandra.

When the FUD started flying, I had the unfortunate circumstance to decide:
should I continue to be part of this community that Andy so accurately
portrays as led by a bunch of asshats, or should I move on to a community
of professionals that care about technology and big data, rather than self
image? I guess in hindsight the decision wasn't so hard.

> At this point I'd like to go back to ignoring Cassandra, and hopefully will
> not have occasion to deal with "Cassandra vs. HBase" again for many months.
>

Hear hear!

-- 
Tim Ellis
Data Architect, Riot Games

RE: HBase and Cassandra on StackOverflow

Posted by Michael Segel <mi...@hotmail.com>.
Ok...

You can look at it this way...

You have Cassandra that hasn't gotten a lot of traction nor has it created enough of a critical mass to be considered long term viable... unlike Hadoop/HBase which has enough critical mass and is viable long term. 

So do you get in to a tit for tat battle or do you say 'whatever' and move on. 
The point is that Hadoop is on everyone's radar while Cassandra isn't. (Cassandra is just another NoSQL database, not a framework.) So to make Cassandra relevant, it picks a fight with the biggest kid on the block.
(Ok not the biggest kid but you get the idea.)

I think it would be a good idea to just ignore Cassandra. The Hadoop community has nothing to gain in getting in to it with someone making unsubstantiated comments.

JMHO...

-Mike


> Date: Thu, 1 Sep 2011 19:27:36 -0700
> From: apurtell@apache.org
> Subject: Re: HBase and Cassandra on StackOverflow
> To: user@hbase.apache.org
> 
> > From: Michael Segel <mi...@hotmail.com>
> 
> > Can't we just all get along? :-)
> 
> My personal introduction to Cassandra came maybe in the 2009 timeframe. We evaluated it and HBase at the time and chose HBase. No point to discuss why, the world has changed many times over.
> 
> From there, my involvement in the HBase project grew and I didn't think of or hear about Cassandra for a long time.
> 
> Then began an aggressive marketing campaign by Cassandra proponents that spoke negatively about HBase at every opportunity. It was everywhere whether one cared about such things or not. There was also an untrue (but easy to fudge with "marketing" given the technology differences are complex and nuanced) and quite insulting assertion that Cassandra is a superset of HBase. I believe this persists even today.
> 
> So, no, frankly we cannot get along. If this were a passionless technical discussion that would not be the case. However, from my perspective, which I believe is shared by others on the HBase side, the Cassandra project is run by asshats and some of their boosters share that unfortunate trait.
> 
> At this point I'd like to go back to ignoring Cassandra, and hopefully will not have occasion to deal with "Cassandra vs. HBase" again for many months.
> 
> Best regards,
> 
>    - Andy

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
> From: Michael Segel <mi...@hotmail.com>

> Can't we just all get along? :-)

My personal introduction to Cassandra came maybe in the 2009 timeframe. We evaluated it and HBase at the time and chose HBase. No point to discuss why, the world has changed many times over.

From there, my involvement in the HBase project grew and I didn't think of or hear about Cassandra for a long time.

Then began an aggressive marketing campaign by Cassandra proponents that spoke negatively about HBase at every opportunity. It was everywhere whether one cared about such things or not. There was also an untrue (but easy to fudge with "marketing" given the technology differences are complex and nuanced) and quite insulting assertion that Cassandra is a superset of HBase. I believe this persists even today.

So, no, frankly we cannot get along. If this were a passionless technical discussion that would not be the case. However, from my perspective, which I believe is shared by others on the HBase side, the Cassandra project is run by asshats and some of their boosters share that unfortunate trait.

At this point I'd like to go back to ignoring Cassandra, and hopefully will not have occasion to deal with "Cassandra vs. HBase" again for many months.

Best regards,

   - Andy

RE: HBase and Cassandra on StackOverflow

Posted by Michael Segel <mi...@hotmail.com>.


> Date: Thu, 1 Sep 2011 15:13:13 -0700
> Subject: Re: HBase and Cassandra on StackOverflow
> From: timelessness@gmail.com
> To: user@hbase.apache.org
> 
[BIG SNIP]

While you guys are going back and forth... a simple reminder.
Not everyone has the same base level of experience so their ability to 'cookbook' an install will vary.
Not every company is trying to solve the same problem so that in using either beast, their experiences will vary.

Can't we just all get along? :-)

Both HBase and Cassandra are tools. While similar they have a different fit within the portfolio of options.
Its like being on a golf course and trying to decide if you should use a 7 iron or an 8 iron ...

-Mike



Re: HBase and Cassandra on StackOverflow

Posted by Time Less <ti...@gmail.com>.
> Why are you quoting "harder"? What was said was "more complex". Setting up N
> things is more complex than setting up a single thing.
>

Okay. Sorry for misinterpreting your meaning. You're right, it's more
complex to set up.

> You are also implying that following the wiki is easy. Personally, I find
> that the wiki has fine detail, but it is confusing.
>

True. Running a world-class distributed database isn't trivial. And yeah,
sorry for implying following the wiki is easy. It was for me, but that may
not be for others.


> Uppercasing the word FAR does not prove to me that HBase is easier to
> administer, nor does your employment history or secondhand stories
> from unnamed people you know.


A lot of people think credentials are important, especially in this
particular debate of Cassandra vs. HBase, where obviously technical details
are ignored. My point is, I've worked extremely closely with the flagship
deploys of both (Apache) Cassandra and HBase and continue to work closely
with the people who still have to run this stuff at volume today. I'm sorry
you don't find these details important.

> [2] I went from having never set up HBase nor ever used Chef to having
> functional Chef recipes that installed a functional HBase/HDFS cluster in
> about 2 weeks.
>
> It took me about one hour to accomplish the same result with puppet +
> cassandra.
> http://www.jointhegrid.com/highperfcassandra/?p=62
>

Something being easy to set up is entirely different than it working at
scale. Note I don't mention how long it took me to set up SQLite or write
Chef recipes for it. The whole point of Puppet and Chef is to manage
complexity, which you'll need when running a world-class distributed
database.

-- 
Tim Ellis
Data Architect, Riot Games

Re: HBase and Cassandra on StackOverflow

Posted by Jacques <wh...@gmail.com>.
Don't forget that Gluster just released a beta open source Hadoop connector.
Their "we'll just dip a toe in the hadoop community" approach doesn't
inspire confidence.  On the other hand, they have a decent track record
regarding larger HA file system setups and offer many things that MapR
offers (e.g. NFS, built as a distributed file system since day 1, etc.)
along with open source.

I agree with Joe that there aren't great options with regards to HA.

   1. If you're okay with closed source and a big price tag (list is like
   4k/node), MapR is probably your best option.
   2. If you're Facebook or Yahoo you can make a solution work because you
   have the manpower.
   3. If you're not either 1 or 2, you're kinda stuck on the Hadoop side of
   things--you use the best hardware you can for the namenode and use either
   DRBD or a redundant SAN (which can be had for much less than 250k)

I strongly believe that things have the potential to change substantially
within the next 12 months.  (More optimistic than Ryan-- maybe because he
has seen the Hadoop community thrashing for longer).

And yes, ceph is getting closer all the time.

Re: HBase and Cassandra on StackOverflow

Posted by Ryan Rawson <ry...@gmail.com>.
On Fri, Sep 2, 2011 at 10:27 AM, Joseph Pallas <jo...@oracle.com> wrote:
> Drifting off topic a bit …
>
> On Sep 1, 2011, at 12:12 PM, Ryan Rawson wrote:
>
>>> First, you have to learn:
>>> 1) Linux HA
>>> 2) DRBD
>>>
>>> Right out of the gate just to have a redundant name node.
>>
>> Eh, no one would do that.  If you want a redundant name node your only
>> choice is to use Mapr, which I would def recommend since you get a
>> better nn "fail-over" w/o service interruption and significantly
>> higher performance than hdfs.
>
> Really?  People running offline analytics may be fine with an hour of downtime [<http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html> <http://www.hortonworks.com/data-integrity-and-availability-in-apache-hadoop-hdfs/>] for their M/R jobs, but people running interactive services do not find that acceptable.
>
> Is my only option to avoid significant downtime in the event of a name node failure a closed-source offering that has already demonstrated at least one serious data-loss issue <http://answers.mapr.com/questions/415/hbase-table-disappear-after-failover-attempt-and-fall-back>?

Well, actually... yes.  An HA/DRBD flip will take at the very least 10-30
seconds, and possibly 10 minutes or longer if your cluster is really
big.  Avatar node presumes a $250k netapp, and still has a 10-30
second flip time once you trigger it.  The NN-HA work is still WIP.

You could always use ceph, right?

>
> I don’t really mean to criticize MapR: they were victims of a hidden dependency, but that’s what happens when you replace part of an integrated stack.  And that is why I find your suggestion that I should not expect to use the integrated stack a little unnerving, because I'm looking at HBase for an online application.
>

Re: HBase and Cassandra on StackOverflow

Posted by Joseph Pallas <jo...@oracle.com>.
Drifting off topic a bit …

On Sep 1, 2011, at 12:12 PM, Ryan Rawson wrote:

>> First, you have to learn:
>> 1) Linux HA
>> 2) DRBD
>> 
>> Right out of the gate just to have a redundant name node.
> 
> Eh, no one would do that.  If you want a redundant name node your only
> choice is to use Mapr, which I would def recommend since you get a
> better nn "fail-over" w/o service interruption and significantly
> higher performance than hdfs.

Really?  People running offline analytics may be fine with an hour of downtime [<http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html> <http://www.hortonworks.com/data-integrity-and-availability-in-apache-hadoop-hdfs/>] for their M/R jobs, but people running interactive services do not find that acceptable.

Is my only option to avoid significant downtime in the event of a name node failure a closed-source offering that has already demonstrated at least one serious data-loss issue <http://answers.mapr.com/questions/415/hbase-table-disappear-after-failover-attempt-and-fall-back>?

I don’t really mean to criticize MapR: they were victims of a hidden dependency, but that’s what happens when you replace part of an integrated stack.  And that is why I find your suggestion that I should not expect to use the integrated stack a little unnerving, because I'm looking at HBase for an online application.

joe


Re: HBase and Cassandra on StackOverflow

Posted by Ryan Rawson <ry...@gmail.com>.
On Thu, Sep 1, 2011 at 10:53 AM, Edward Capriolo <ed...@gmail.com> wrote:
> On Wed, Aug 31, 2011 at 1:34 AM, Time Less <ti...@gmail.com> wrote:
>
>> Most of your points are dead-on.
>>
>> > Cassandra is no less complex than HBase. All of this complexity is
>> > "hidden" in the sense that with Hadoop/HBase the layering is obvious --
>> > HDFS, HBase, etc. -- but the Cassandra internals are no less layered.
>> >
>> > Operationally, however, HBase is more complex.  Admins have to configure
>> > and manage ZooKeeper, HDFS, and HBase.  Could this be improved?
>> >
>>
>> I strongly disagree with the premise[1]. Having personally been involved in
>> the Digg Cassandra rollout, and spent up until a couple months ago being in
>> part-time weekly contact with the Digg Cassandra administrator, and having
>> very close ties to the SimpleGeo Cassandra admin, I know it is a fickle
>> beast. Having also spent a good amount of time at StumbleUpon and Mozilla
>> (and now Riot Games) I also see first-hand that HBase is far more stable
>> and
>> -- dare I say it? -- operationally more simple.
>>
>> So okay, HBase is "harder to set up" if following a step-by-step guide on a
>> wiki is "hard,"[2] but it's FAR easier to administer. Cassandra is rife
>> with
>> cascading cluster failure scenarios. I would not recommend running
>> Cassandra
>> in a highly-available high-volume data scenario, but don't hesitate to do
>> so
>> for HBase.
>>
>> I do not know if this is a guaranteed (provable due to architecture)
>> result,
>> or just the result of the Cassandra community being... how shall I say...
>> hostile to administrators. But then, to me it doesn't matter. Results do.
>>
>> --
>> Tim Ellis
>> Data Architect, Riot Games
>> [1] That said, the other part of your statement is spot-on, too. It's
>> surely
>> possible to improve the HBase architecture or simplify it.
>> [2] I went from having never set up HBase nor ever used Chef to having
>> functional Chef recipes that installed a functional HBase/HDFS cluster in
>> about 2 weeks. From my POV, the biggest stumbling point was that HDFS
>> stores critical data in the underlying filesystem's /tmp directory
>> by default, which is, for lack of a better word, insane. If I had to
>> suggest
>> how to simplify "HBase installation," I'd ask for sane HDFS config files
>> that are extremely common and difficult-to-ignore.
>>
>
> Why are you quoting "harder"? What was said was "more complex". Setting up N
> things is more complex than setting up a single thing.
>
> First, you have to learn:
> 1) Linux HA
> 2) DRBD
>
> Right out of the gate just to have a redundant name node.

Eh, no one would do that.  If you want a redundant name node your only
choice is to use Mapr, which I would def recommend since you get a
better nn "fail-over" w/o service interruption and significantly
higher performance than hdfs.

>
> This is not easy, fast, or simple. In fact this is quite a pain.
> http://docs.google.com/viewer?a=v&q=cache:9rnx-eRzi1AJ:files.meetup.com/1228907/Hadoop%2520Namenode%2520High%2520Availability.pptx+linux+ha+namenode&hl=en&gl=us&pid=bl&srcid=ADGEESig5aJNVAXbLgBwyc311sPSd88jUJbKHx4z2PQtDKHnmM1FuCJpg2IUyqi5JrmUL3RbCb8QRYsjHnP74YuKQfOQXoUZxnhrCy6N1kVpiG1jNi4zhqoKlUTmoDaqS1NegCFb6-WM&sig=AHIEtbQbjN1Olwxui5JmywdWzhqv4Hq3tw&pli=1
>
> Doing it properly involves setting up physical wires between servers or link
> aggregation groups. You can't script having someone physically run crossover
> cables. You need your switching engineer to set up LAGs.
> Also you may notice that everyone who describes this setup is
> describing it using linux-ha V1, which has been deprecated for over 2 years. That
> also demonstrates how this process is so complicated that people tend to touch it
> once and never touch it again, because of how fragile it is.
>
> You are also implying that following the wiki is easy. Personally, I find
> that the wiki has fine detail, but it is confusing.
> Here is why.
>
> "1.3.1.2. hadoop
>
> This version of HBase will only run on Hadoop 0.20.x. It will not run on
> hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an
> HDFS that has a durable sync. Currently only the branch-0.20-append branch
> has this attribute[1]. No official releases have been made from this branch
> up to now so you will have to build your own Hadoop from the tip of this
> branch. Michael Noll has written a detailed blog, Building an Hadoop 0.20.x
> version for HBase 0.90.2, on how to build an Hadoop from branch-0.20-append.
> Recommended.
>
> Or rather than build your own, you could use Cloudera's CDH3. CDH has the
> 0.20-append patches needed to add a durable sync (CDH3 betas will suffice;
> b2, b3, or b4)."
>
> So the setup starts by recommending rolling your own hadoop (pain in the
> ass). OR using a beta ( :(  ).
>
> Then when it gets onto HBase, it branches into “Standalone HBase” and Section
> 1.3.2.2, “Distributed”.
> Then it branches into "pseudo distributed" and "full distributed", then the
> zookeeper section offers you two options: "1.3.2.2.2.2. ZooKeeper",
> "1.3.2.2.2.2.1. Using existing ZooKeeper ensemble".
>
> Not to say this is hard or impossible, but it is a lot of information to
> digest and all the branching decisions are hard to understand to a first
> time user.

Moving forward, my plan is to only deploy HBase on top of mapr for
real-time situations where at all possible. HDFS isn't there yet, 2.5
years ago I was optimistic, and they still have more years to go.  In
the mean time, with mapr you get yourself HA, better performance, and
hopefully better error recovery.

>
> Uppercasing the word FAR does not prove to me that hbase is easier to
> administer, nor does your employment history or second-hand stories
> from unnamed people you know. I can tell you why I think Cassandra is easier
> to manage:
>
> 1) There is only one log file /var/log/cassandra/system.log
> 2) There is only one configuration folder
> /usr/local/cassandra/conf/cassandra.yaml cassandra-env.sh
> 3) I do not need to keep a chart or post-it notes of where all these one-off
> components are: zk server list, hbase master server list, namenode.
> 4) No need to configure auxiliary stuff such as DRBD or Linux-HA

Just as an aside, no one does #4.

As for #3, what you are really saying is "I don't want to have good
sysadmin/automation practices" - sure a lot of people don't, but if
you do, #3 is a non-issue.  Chef can help.

>
> *Fud ALARM* "Cassandra is rife with cascading cluster failure scenarios."
> ....and hbase never has issues apparently. (remember I am on both lists)

This is not FUD, its a legitimate concern.  The issue isn't if one
system has failures or not, because all fail, but HOW they fail. And
that also leads to HOW you determine what the root cause is, and HOW
you recover.  This sounds like a difference of opinion, but there are
practicalities of how you admin and deal with 3am failure modes.  I
think this is the place where HBase shines very well, but this is a
story you can't tell without people crying "FUD" since it's complex
and thus doesn't translate well.

I also would posit that the HBase master is a _good_ thing.  It
provides a management point, it doesn't participate in the query path,
and is not a major scaling issue.  It lets you give definitive answers
to things like "how busy is my cluster" and "what is online/offline"
"what tables are there" etc etc.  It handles failures in a highly
explicit manner, which is good.


>
> Also...
> [2] I went from having never set up HBase nor ever used Chef to having
> functional Chef recipes that installed a functional HBase/HDFS cluster in
> about 2 weeks.
>
> It took me about one hour to accomplish the same result with puppet +
> cassandra.
> http://www.jointhegrid.com/highperfcassandra/?p=62
>

Re: HBase and Cassandra on StackOverflow

Posted by Arun C Murthy <ac...@hortonworks.com>.
On Sep 1, 2011, at 10:53 AM, Edward Capriolo wrote:

> "1.3.1.2. hadoop
> 
> This version of HBase will only run on Hadoop 0.20.x. It will not run on
> hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an
> HDFS that has a durable sync. Currently only the branch-0.20-append branch
> has this attribute[1]. No official releases have been made from this branch
> up to now so you will have to build your own Hadoop from the tip of this
> branch. Michael Noll has written a detailed blog, Building an Hadoop 0.20.x
> version for HBase 0.90.2, on how to build an Hadoop from branch-0.20-append.
> Recommended.
> 
> Or rather than build your own, you could use Cloudera's CDH3. CDH has the
> 0.20-append patches needed to add a durable sync (CDH3 betas will suffice;
> b2, b3, or b4)."
> 
> So the setup starts by recommending rolling your own hadoop (pain in the
> ass). OR using a beta ( :(  ).

It's early days yet, but we seem to be converging towards having a grand unification of security and append patchsets for hadoop-0.20.205 release from Apache.

Arun


Re: HBase and Cassandra on StackOverflow

Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Aug 31, 2011 at 1:34 AM, Time Less <ti...@gmail.com> wrote:

> Most of your points are dead-on.
>
> > Cassandra is no less complex than HBase. All of this complexity is
> > "hidden" in the sense that with Hadoop/HBase the layering is obvious --
> > HDFS, HBase, etc. -- but the Cassandra internals are no less layered.
> >
> > Operationally, however, HBase is more complex.  Admins have to configure
> > and manage ZooKeeper, HDFS, and HBase.  Could this be improved?
> >
>
> I strongly disagree with the premise[1]. Having personally been involved in
> the Digg Cassandra rollout, and spent up until a couple months ago being in
> part-time weekly contact with the Digg Cassandra administrator, and having
> very close ties to the SimpleGeo Cassandra admin, I know it is a fickle
> beast. Having also spent a good amount of time at StumbleUpon and Mozilla
> (and now Riot Games) I also see first-hand that HBase is far more stable
> and
> -- dare I say it? -- operationally more simple.
>
> So okay, HBase is "harder to set up" if following a step-by-step guide on a
> wiki is "hard,"[2] but it's FAR easier to administer. Cassandra is rife
> with
> cascading cluster failure scenarios. I would not recommend running
> Cassandra
> in a highly-available high-volume data scenario, but don't hesitate to do
> so
> for HBase.
>
> I do not know if this is a guaranteed (provable due to architecture)
> result,
> or just the result of the Cassandra community being... how shall I say...
> hostile to administrators. But then, to me it doesn't matter. Results do.
>
> --
> Tim Ellis
> Data Architect, Riot Games
> [1] That said, the other part of your statement is spot-on, too. It's
> surely
> possible to improve the HBase architecture or simplify it.
> [2] I went from having never set up HBase nor ever used Chef to having
> functional Chef recipes that installed a functional HBase/HDFS cluster in
> about 2 weeks. From my POV, the biggest stumbling point was that HDFS
> stores critical data in the underlying filesystem's /tmp directory
> by default, which is, for lack of a better word, insane. If I had to
> suggest
> how to simplify "HBase installation," I'd ask for sane HDFS config files
> that are extremely common and difficult-to-ignore.
>

Why are you quoting "harder"? What was said was "more complex". Setting up N
things is more complex than setting up a single thing.

First, you have to learn:
1) Linux HA
2) DRBD

Right out of the gate just to have a redundant name node.

This is not easy, fast, or simple. In fact this is quite a pain.
http://docs.google.com/viewer?a=v&q=cache:9rnx-eRzi1AJ:files.meetup.com/1228907/Hadoop%2520Namenode%2520High%2520Availability.pptx+linux+ha+namenode&hl=en&gl=us&pid=bl&srcid=ADGEESig5aJNVAXbLgBwyc311sPSd88jUJbKHx4z2PQtDKHnmM1FuCJpg2IUyqi5JrmUL3RbCb8QRYsjHnP74YuKQfOQXoUZxnhrCy6N1kVpiG1jNi4zhqoKlUTmoDaqS1NegCFb6-WM&sig=AHIEtbQbjN1Olwxui5JmywdWzhqv4Hq3tw&pli=1

Doing it properly involves setting up physical wires between servers or link
aggregation groups. You can't script having someone physically run crossover
cables. You need your switching engineer to set up LAGs.
Also you may notice that everyone who describes this setup is
describing it using linux-ha V1, which has been deprecated for over 2 years. That
also demonstrates how this process is so complicated that people tend to touch it
once and never touch it again, because of how fragile it is.

You are also implying that following the wiki is easy. Personally, I find
that the wiki has fine detail, but it is confusing.
Here is why.

"1.3.1.2. hadoop

This version of HBase will only run on Hadoop 0.20.x. It will not run on
hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an
HDFS that has a durable sync. Currently only the branch-0.20-append branch
has this attribute[1]. No official releases have been made from this branch
up to now so you will have to build your own Hadoop from the tip of this
branch. Michael Noll has written a detailed blog, Building an Hadoop 0.20.x
version for HBase 0.90.2, on how to build an Hadoop from branch-0.20-append.
Recommended.

Or rather than build your own, you could use Cloudera's CDH3. CDH has the
0.20-append patches needed to add a durable sync (CDH3 betas will suffice;
b2, b3, or b4)."

So the setup starts by recommending rolling your own hadoop (pain in the
ass). OR using a beta ( :(  ).

Then when it gets onto HBase, it branches into “Standalone HBase” and Section
1.3.2.2, “Distributed”.
Then it branches into "pseudo distributed" and "full distributed", then the
zookeeper section offers you two options: "1.3.2.2.2.2. ZooKeeper",
"1.3.2.2.2.2.1. Using existing ZooKeeper ensemble".

Not to say this is hard or impossible, but it is a lot of information to
digest and all the branching decisions are hard to understand to a first
time user.

Uppercasing the word FAR does not prove to me that hbase is easier to
administer, nor does your employment history or second-hand stories
from unnamed people you know. I can tell you why I think Cassandra is easier
to manage:

1) There is only one log file /var/log/cassandra/system.log
2) There is only one configuration folder
/usr/local/cassandra/conf/cassandra.yaml cassandra-env.sh
3) I do not need to keep a chart or post-it notes of where all these one-off
components are: zk server list, hbase master server list, namenode.
4) No need to configure auxiliary stuff such as DRBD or Linux-HA

*Fud ALARM* "Cassandra is rife with cascading cluster failure scenarios."
....and hbase never has issues apparently. (remember I am on both lists)

Also...
[2] I went from having never set up HBase nor ever used Chef to having
functional Chef recipes that installed a functional HBase/HDFS cluster in
about 2 weeks.

It took me about one hour to accomplish the same result with puppet +
cassandra.
http://www.jointhegrid.com/highperfcassandra/?p=62

Re: HBase and Cassandra on StackOverflow

Posted by Time Less <ti...@gmail.com>.
Most of your points are dead-on.

> Cassandra is no less complex than HBase. All of this complexity is
> "hidden" in the sense that with Hadoop/HBase the layering is obvious --
> HDFS, HBase, etc. -- but the Cassandra internals are no less layered.
>
> Operationally, however, HBase is more complex.  Admins have to configure
> and manage ZooKeeper, HDFS, and HBase.  Could this be improved?
>

I strongly disagree with the premise[1]. Having personally been involved in
the Digg Cassandra rollout, and spent up until a couple months ago being in
part-time weekly contact with the Digg Cassandra administrator, and having
very close ties to the SimpleGeo Cassandra admin, I know it is a fickle
beast. Having also spent a good amount of time at StumbleUpon and Mozilla
(and now Riot Games) I also see first-hand that HBase is far more stable and
-- dare I say it? -- operationally more simple.

So okay, HBase is "harder to set up" if following a step-by-step guide on a
wiki is "hard,"[2] but it's FAR easier to administer. Cassandra is rife with
cascading cluster failure scenarios. I would not recommend running Cassandra
in a highly-available high-volume data scenario, but don't hesitate to do so
for HBase.

I do not know if this is a guaranteed (provable due to architecture) result,
or just the result of the Cassandra community being... how shall I say...
hostile to administrators. But then, to me it doesn't matter. Results do.

-- 
Tim Ellis
Data Architect, Riot Games
[1] That said, the other part of your statement is spot-on, too. It's surely
possible to improve the HBase architecture or simplify it.
[2] I went from having never set up HBase nor ever used Chef to having
functional Chef recipes that installed a functional HBase/HDFS cluster in
about 2 weeks. From my POV, the biggest stumbling point was that HDFS
stores critical data in the underlying filesystem's /tmp directory
by default, which is, for lack of a better word, insane. If I had to suggest
how to simplify "HBase installation," I'd ask for sane HDFS config files
that are extremely common and difficult-to-ignore.
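For reference, the /tmp default mentioned above is controlled by a couple of hdfs-site.xml properties. A minimal sketch of an override (the paths are illustrative placeholders, not recommendations — point them at durable local disks):

```xml
<!-- hdfs-site.xml: keep NameNode metadata and DataNode block data off /tmp.
     The /srv/hadoop/* paths below are examples only. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/srv/hadoop/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/srv/hadoop/data</value>
  </property>
</configuration>
```

Without these, 0.20-era HDFS derives its storage locations from hadoop.tmp.dir, which defaults to a directory under /tmp.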

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.

> http://www.quora.com/How-does-HBase-write-performance-differ-from-write-performance-in-Cassandra-with-consistency-level-ALL

Thanks, that was what I was referring to earlier in this thread. Now bookmarked.
Comments there from those more knowledgeable about Cassandra than I seem to indicate that N=3,W=3,R=1 is not practical (one commenter I know to be an expert characterizes it as "suicidal"), and the comments in the collapsed answer indicate there are corner cases known to Cassandra experts where HBase-equivalent strong consistency cannot be maintained even with that setting.
 
So it seems that claims that Cassandra can provide consistency equivalent to HBase are erroneous.


Best regards,


       - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Gary Helmling <gh...@gmail.com>
>To: user@hbase.apache.org
>Sent: Thursday, September 1, 2011 2:21 AM
>Subject: Re: HBase and Cassandra on StackOverflow
>
>> Since this is fairly off-topic at this point, I'll keep it short. The
>> simple
>> rule for Dynamo goes like this: if (R+W>N && W>=Quorum), then you're
>> guaranteed a consistent result always. You get eventual consistency if
>> W>=Quorum. If W<Quorum, then you can get inconsistent data that must be
>> detected/fixed by readers (often using timestamps or similar techniques).
>> Joe is right, enforcing (W=3, R=1, N=3) on a Dynamo system gives the same
>> (provably identical?) behaviour as HBase, with respect to consistency.
>>
>>
>For those interested in a comparison of the consistency behavior, there's an
>older, but really excellent thread on quora with detailed analysis:
>http://www.quora.com/How-does-HBase-write-performance-differ-from-write-performance-in-Cassandra-with-consistency-level-ALL
>
>Don't miss the last answer in the thread.  It's unfortunately collapsed
>due to some quora policy, but it contains some of the best details.
>
>
>

Re: HBase and Cassandra on StackOverflow

Posted by Gary Helmling <gh...@gmail.com>.
> Since this is fairly off-topic at this point, I'll keep it short. The
> simple
> rule for Dynamo goes like this: if (R+W>N && W>=Quorum), then you're
> guaranteed a consistent result always. You get eventual consistency if
> W>=Quorum. If W<Quorum, then you can get inconsistent data that must be
> detected/fixed by readers (often using timestamps or similar techniques).
> Joe is right, enforcing (W=3, R=1, N=3) on a Dynamo system gives the same
> (provably identical?) behaviour as HBase, with respect to consistency.
>
>
For those interested in a comparison of the consistency behavior, there's an
older, but really excellent thread on quora with detailed analysis:
http://www.quora.com/How-does-HBase-write-performance-differ-from-write-performance-in-Cassandra-with-consistency-level-ALL

Don't miss the last answer in the thread.  It's unfortunately collapsed
due to some quora policy, but it contains some of the best details.

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
> > > If you use N=3, W=3, R=1 in Cassandra, you
> > > should get similar behavior
> > > to HBase/HDFS with respect to consistency
> > > and availability
> >
> > My understanding is that R=1 does not guarantee that you won't see
> > different versions of the data in different reads, in some scenarios.
>
> Since this is fairly off-topic at this point, I'll keep it short.
> The simple rule for Dynamo goes like this: if (R+W>N && W>=Quorum),
> then you're guaranteed a consistent result always.

Ok, I'll concede this point rather than go really off-topic with conjecture about corner cases, especially given I'm not a Cassandra expert by any means and could simply be mistaken.

However this is still not quite HBase-equivalent consistency. HBase can provide CAS operations and atomic counters because only one regionserver at a time can mediate operations on a given row.
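The single-mediator point can be illustrated with a toy sketch (my own illustration, not HBase code): one lock per row serializes the compare and the write, which is what makes check-and-put and atomic counters straightforward when exactly one server owns a row.

```python
import threading

class Row:
    """Toy row with a single arbiter: one lock serializes all mutations."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cells = {}

    def check_and_put(self, qualifier, expected, new_value):
        """Atomically set qualifier to new_value iff it currently equals expected."""
        with self._lock:
            if self._cells.get(qualifier) != expected:
                return False
            self._cells[qualifier] = new_value
            return True

row = Row()
print(row.check_and_put("count", None, 1))  # True: empty cell matches None
print(row.check_and_put("count", None, 2))  # False: the cell is now 1
print(row.check_and_put("count", 1, 2))     # True: expected value matches
```

Two clients racing on the same row cannot both observe the old value and both succeed, because the compare and the write happen under the same lock. A leaderless quorum system has to reach agreement across replicas to get the same guarantee.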
 
Best regards,

    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

    From: Time Less <ti...@gmail.com>
    To: user@hbase.apache.org; Andrew Purtell <ap...@apache.org>
    Sent: Wednesday, August 31, 2011 1:47 PM
    Subject: Re: HBase and Cassandra on StackOverflow

    > > If you use N=3, W=3, R=1 in Cassandra, you
    > should get similar behavior
    > > to HBase/HDFS with respect to consistency
    > and availability
    >
    > My understanding is that R=1 does not guarantee that you won't see
    > different versions of the data in different reads, in some scenarios. There
    > was an excellent Quora answer in this regard, I don't remember it offhand,
    > perhaps you can find the link to it or someone can provide it to you.
    >

    Since this is fairly off-topic at this point, I'll keep it short. The simple
    rule for Dynamo goes like this: if (R+W>N && W>=Quorum), then you're
    guaranteed a consistent result always. You get eventual consistency if
    W>=Quorum. If W<Quorum, then you can get inconsistent data that must be
    detected/fixed by readers (often using timestamps or similar techniques).
    Joe is right, enforcing (W=3, R=1, N=3) on a Dynamo system gives the same
    (provably identical?) behaviour as HBase, with respect to consistency.

    -- 
    Tim Ellis
    Data Architect, Riot Games

Re: HBase and Cassandra on StackOverflow

Posted by Time Less <ti...@gmail.com>.
> > If you use N=3, W=3, R=1 in Cassandra, you
> should get similar behavior
> > to HBase/HDFS with respect to consistency
> and availability
>
> My understanding is that R=1 does not guarantee that you won't see
> different versions of the data in different reads, in some scenarios. There
> was an excellent Quora answer in this regard, I don't remember it offhand,
> perhaps you can find the link to it or someone can provide it to you.
>

Since this is fairly off-topic at this point, I'll keep it short. The simple
rule for Dynamo goes like this: if (R+W>N && W>=Quorum), then you're
guaranteed a consistent result always. You get eventual consistency if
W>=Quorum. If W<Quorum, then you can get inconsistent data that must be
detected/fixed by readers (often using timestamps or similar techniques).
Joe is right, enforcing (W=3, R=1, N=3) on a Dynamo system gives the same
(provably identical?) behaviour as HBase, with respect to consistency.

-- 
Tim Ellis
Data Architect, Riot Games

Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
Hi Joe,

> > HBase supports replication between clusters (i.e. data centers).
> 
> That’s … debatable.  There's replication support in the code, but
> several times in the recent past when someone asked about it on this
> mailing list, the response was “I don't know of anyone actually
> using it.”


I believe SU uses it.

Anyway I think this is really the point I was making here:

> > the main difference is HBase provides simple mechanism and the user
> > must build a replication architecture useful for them; while
> > Cassandra attempts to hide some of that complexity

So I don't think you nor I are debating this point really, except this:

> My understanding of replication is that you can't replicate any
> existing data, so unless you activated it on day one, it isn't very
> useful.

That was a design choice. Existing data should be transferred in advance, or in the background as a one-shot job, with a utility that chooses on an application-specific basis what is useful to replicate. There is also a generic utility provided as an MR job for this purpose.

> If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior
> to HBase/HDFS with respect to consistency and availability

My understanding is that R=1 does not guarantee that you won't see different versions of the data in different reads, in some scenarios. There was an excellent Quora answer in this regard, I don't remember it offhand, perhaps you can find the link to it or someone can provide it to you.

> Random partitioning has definite advantages for some cases, and HBase
> might well benefit from recognizing that and adding some support.

Or just use salted keys? 

Random partitioning in a distributed ordered tree sounds like an impedance mismatch to me.
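The salted-keys idea mentioned above can be sketched as follows (my own illustration, not an HBase API): prefix each row key with a small hash-derived bucket so that monotonically increasing keys spread across regions, at the cost of needing one scan per bucket to read the data back in order.

```python
import hashlib

# Assumption: bucket count chosen to roughly match the number of regions.
NUM_BUCKETS = 16

def salt(row_key: bytes) -> bytes:
    """Prefix row_key with a deterministic 2-digit bucket derived from its hash."""
    bucket = int(hashlib.md5(row_key).hexdigest(), 16) % NUM_BUCKETS
    return b"%02d-%s" % (bucket, row_key)

# Sequential keys no longer all land in the same region:
print(salt(b"event-000001"))
print(salt(b"event-000002"))
```

Because the bucket is derived from the key itself, point gets still work (re-salt the key before lookup); only ordered full-table scans pay the per-bucket fan-out.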

> HBase uses two different kinds of files, data files and logs, but
> HDFS doesn't know about that and cannot, for example, optimize data
> files for write throughput

You are assuming that HDFS is a shrinkwrapped static thing here, no?

Anyway, your point is valid, in the past features that HBase requires of HDFS have not received the level of support in the HDFS developer community that we would have liked. However this is now rapidly changing for the better.

> Operationally, however, HBase is more complex.  Admins have to configure
> and manage ZooKeeper, HDFS, and HBase.  Could this be improved?

Sure, there is room for improvement for hiding some of the complexity for evaluators or single system developers or other users who want e.g. a three step quickstart.

Personally I prefer having the ability to tune those layers independent of each other.

And, while complexity may be more "hidden" operationally in the Cassandra case relative to HBase, when there is a problem on your cluster, I don't know if that buys you anything. I suppose it depends on the nature of the problem. I do not believe there is a guarantee that operationally Cassandra is really simpler than HBase when it's 2 am and there is a bug and nodes are going down.


Best regards,


        - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Joe Pallas <jo...@oracle.com>
>To: user@hbase.apache.org
>Sent: Wednesday, August 31, 2011 1:42 AM
>Subject: Re: HBase and Cassandra on StackOverflow
>
>
>On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote:
>
>> Better to focus on improving HBase than play whack a mole.
>
>Absolutely.  So let's talk about improving HBase.  I'm speaking here as someone who has been learning about and experimenting with HBase for more than six months.
>
>> HBase supports replication between clusters (i.e. data centers).
>
Re: HBase and Cassandra on StackOverflow

Posted by Joe Pallas <jo...@oracle.com>.
On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote:

> Better to focus on improving HBase than play whack a mole.

Absolutely.  So let's talk about improving HBase.  I'm speaking here as someone who has been learning about and experimenting with HBase for more than six months.

> HBase supports replication between clusters (i.e. data centers).

That’s … debatable.  There's replication support in the code, but several times in the recent past when someone asked about it on this mailing list, the response was “I don't know of anyone actually using it.”  My understanding of replication is that you can't replicate any existing data, so unless you activated it on day one, it isn't very useful.  Do I misunderstand?

> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.

That's fair enough, although I think your first two sentences nearly contradict each other :-).  If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior to HBase/HDFS with respect to consistency and availability ("strong" consistency and reads do not fail if any one copy is available).
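A toy sketch of the quorum arithmetic behind this point (Dynamo-style N/W/R reasoning; not Cassandra client code — the numbers and function names are illustrative only):

```python
# "Strong" consistency holds when every read set must overlap every
# write set, i.e. r + w > n for n replicas.

def strongly_consistent(n, w, r):
    # A read of r replicas is guaranteed to observe the latest write
    # to w replicas iff the two sets must intersect.
    return r + w > n

def write_available(n, w, nodes_up):
    # A write succeeds only if at least w replicas are reachable.
    return nodes_up >= w

# The N=3, W=3, R=1 setting above: reads always see the latest write
# and succeed while any one copy is up...
assert strongly_consistent(3, 3, 1)
# ...but a single down node blocks every write:
assert not write_available(3, 3, 2)

# The usual quorum trade: N=3, W=2, R=2 keeps strong consistency and
# tolerates one down node for writes, at the cost of 2-replica reads.
assert strongly_consistent(3, 2, 2)
assert write_available(3, 2, 2)
```

This is exactly the tension being described: you can dial Cassandra to HBase-like consistency, but the write-availability cost moves with the dial.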

A more important point, I think, is the one about storage.  HBase uses two different kinds of files, data files and logs, but HDFS doesn't know about that and cannot, for example, optimize data files for write throughput (and random reads) and log files for low latency sequential writes.  (For example, how could performance be improved by adding solid-state disk?)

> Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case. 

I don't think you can make a blanket statement that random partitioning makes efficient MapReduce impossible (scanning, yes).  Many M/R tasks process entire tables.  Random partitioning has definite advantages for some cases, and HBase might well benefit from recognizing that and adding some support.

> Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. 

Operationally, however, HBase is more complex.  Admins have to configure and manage ZooKeeper, HDFS, and HBase.  Could this be improved?

> With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".

That's disingenuous.  Thrift exposes all of the Cassandra API to all of the wrappers, while HBase clients who want to use all of the HBase API must use Java.  That can be fixed, but it is the status quo.

joe


Re: HBase and Cassandra on StackOverflow

Posted by Andrew Purtell <ap...@apache.org>.
Hi Chris,

Appreciate your answer on the post.

Personally speaking, however, the endless Cassandra vs. HBase discussion is tiresome, and rarely do blog posts or emails in this regard shed any light. Often, Cassandra proponents misstate their case out of ignorance of HBase or due to commercial or personal agendas. It is difficult to find clear-eyed analysis among the partisans. I'm not sure it will make any difference posting a rebuttal to some random thing jbellis says. Better to focus on improving HBase than play whack-a-mole.


Regarding some of the specific points in that post:

HBase is proven in production deployments larger than the largest publicly reported Cassandra cluster: ~1K nodes versus 400 or 700 or so. But basically this is the same order of magnitude, with HBase having a slight edge. I don't see a meaningful difference here. Stating otherwise is false.

HBase supports replication between clusters (i.e. data centers). I believe, though I admit I'm not super familiar with the Cassandra option here, that the main difference is HBase provides a simple mechanism and the user must build a replication architecture that suits them, while Cassandra attempts to hide some of that complexity. I do not know if it succeeds there, but large scale cross-data-center replication is rarely one size fits all, so I doubt it.

Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.

Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, which I believe explains why Hadoop users often prefer it. This may or may not be a problem for any given use case. Using an ordered partitioner with Cassandra used to require frequent manual rebalancing to avoid blowing up nodes. I don't know if more recent versions still have this mis-feature.
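A toy illustration of the layout difference (made-up keys, splits, and node counts; not real Cassandra or HBase code):

```python
# Hash partitioning scatters adjacent keys across nodes; ordered
# partitioning keeps them together, which is what makes range scans
# cheap on an ordered layout. Full-table MapReduce works under either,
# since every map task still sees its node's whole share.
import hashlib

def hash_partition(key, num_nodes):
    # RandomPartitioner style: place each row by a hash of its key.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_nodes

def ordered_partition(key, split_points):
    # HBase style: each region owns a contiguous, sorted key range.
    for region, split in enumerate(split_points):
        if key < split:
            return region
    return len(split_points)

keys = ["row_%05d" % i for i in range(100)]
splits = ["row_00025", "row_00050", "row_00075"]

# A scan of row_00030..row_00040 touches exactly one region under the
# ordered layout; under hashing the same keys spread across nodes, so
# a "scan" degenerates into a full scatter/gather.
scan = keys[30:41]
ordered_regions = {ordered_partition(k, splits) for k in scan}
hashed_nodes = {hash_partition(k, 4) for k in scan}
print(ordered_regions)   # {1} -- one region serves the whole range
```

The rebalancing pain mentioned above is the flip side: an ordered layout can concentrate load on one region when keys are written in order, which hashing avoids by construction.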

Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. An impartial analysis of implementation and algorithms will reveal that Cassandra's theory of operation in its full detail is substantially more complex. Compare the BigTable and Dynamo papers and this is clear. There are actually more opportunities for something to go wrong with Cassandra.

While we are looking at codebases, it should be noted that HBase has substantially more unit tests.

With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".

The master-slave versus peer-to-peer argument is larger than Cassandra vs. HBase, and not nearly as one-sided as claimed. The famous (infamous?) global failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a single flipped bit in a gossip message, demonstrates how in peer-to-peer systems every node can be a single point of failure. There is no obvious winner; instead, a series of trade-offs. Claiming otherwise is intellectually dishonest. Master-slave architectures seem easier to operate and reason about in my experience. Of course, I'm partial there.

I have just scratched the surface.


Best regards,


       - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Chris Tarnas <cf...@email.com>
>To: hbase-user@hadoop.apache.org
>Sent: Tuesday, August 30, 2011 2:02 PM
>Subject: HBase and Cassandra on StackOverflow
>
>Someone with better knowledge than mine might be interested in helping answer this question over at StackOverflow:
>
>http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
>
>-chris
>
>