Posted to user@cassandra.apache.org by Keith Stevens <fo...@gmail.com> on 2010/08/16 01:24:01 UTC

The stability of Hadoop jobs outputting to Cassandra

Hello,

I'm currently working on a project that uses HBase and Hadoop, but I'm
looking into alternatives to HBase.  Cassandra seems to be the
next best replacement, or perhaps a better one, except that the
stable release lacks support for Hadoop jobs writing to Cassandra.

I found CASSANDRA-1101 <https://issues.apache.org/jira/browse/CASSANDRA-1101>
and wanted to know how stable that update is.  Will it be made part of a
release any time soon?  Has anyone been using the update regularly?

Thanks!
--Keith

Re: The stability of Hadoop jobs outputting to Cassandra

Posted by Bill Hastings <bl...@gmail.com>.
I am curious to know the reasons you are moving away from HBase. It would be
great if you could state them.




-- 
Cheers
Bill

Re: The stability of Hadoop jobs outputting to Cassandra

Posted by Keith Stevens <fo...@gmail.com>.
Thanks for this update.

After another day at work, and more reading about Cassandra's underlying
model, I think the problem I am encountering is due less to HBase and more
to user error and a highly faulty cluster.

Cassandra's clean API and integration with Thrift were the two biggest
factors that attracted me to it, in addition to personal recommendations
from people at my university.  Another attraction is that several people have
remarked that it is much simpler to set up than HBase, which has been our
main point of failure.  I have read, though, that Cassandra is not as focused
on large-scale analysis of documents via Hadoop as HBase is.

I'm going to try playing around with Cassandra over the next few days and
see if it's more stable on our often-failing cluster when combined with
Hadoop.  I'll definitely try the simple solution of having a Thrift
connection to Cassandra in the reducer.

Thanks!
--Keith

On Sun, Aug 15, 2010 at 6:05 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> Status: Fixed, Fix version: 0.7 beta 1 means it's in the beta1 that
> was just released, although
> https://issues.apache.org/jira/browse/CASSANDRA-1315 is open to change
> the API slightly.  Either way, it won't be backported to 0.6.
>
> But you can write to Cassandra from the Hadoop job just fine w/o an
> OutputFormat.  Just create a Thrift connection in your reduce job.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

Re: The stability of Hadoop jobs outputting to Cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
Status: Fixed, Fix version: 0.7 beta 1 means it's in the beta1 that
was just released, although
https://issues.apache.org/jira/browse/CASSANDRA-1315 is open to change
the API slightly.  Either way, it won't be backported to 0.6.

But you can write to Cassandra from the Hadoop job just fine w/o an
OutputFormat.  Just create a Thrift connection in your reduce job.
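
A minimal sketch of that pattern, written against plain Java so it runs
without a cluster. The ColumnWriter interface and the RecordingWriter class
are hypothetical stand-ins introduced for illustration, not Cassandra or
Thrift APIs. In a real job, the assumption is that ColumnWriter would wrap
the Thrift-generated Cassandra client, opened once per reduce task (for
example in the reducer's setup()) and closed in cleanup().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReducerWriteSketch {

    /** Hypothetical stand-in for a Thrift-backed Cassandra client (not a real API). */
    interface ColumnWriter extends AutoCloseable {
        void insert(String rowKey, String column, String value) throws Exception;

        @Override
        void close() throws Exception;
    }

    /** In-memory writer that records inserts, so the sketch needs no cluster. */
    static class RecordingWriter implements ColumnWriter {
        final List<String> rows = new ArrayList<>();
        boolean closed = false;

        @Override
        public void insert(String rowKey, String column, String value) {
            if (closed) {
                throw new IllegalStateException("connection closed");
            }
            rows.add(rowKey + "/" + column + "=" + value);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Mirrors the setup / reduce / cleanup lifecycle of a Hadoop reduce task. */
    static class CountReducer {
        private final ColumnWriter writer;

        // "setup": one connection per task, not one per key.
        CountReducer(ColumnWriter writer) {
            this.writer = writer;
        }

        // "reduce": aggregate the values for a key and write the result directly.
        void reduce(String key, Iterable<Integer> values) throws Exception {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            writer.insert(key, "count", Integer.toString(sum));
        }

        // "cleanup": release the connection when the task finishes.
        void cleanup() throws Exception {
            writer.close();
        }
    }

    public static void main(String[] args) throws Exception {
        RecordingWriter writer = new RecordingWriter();
        CountReducer reducer = new CountReducer(writer);
        reducer.reduce("apple", Arrays.asList(1, 2, 3));
        reducer.reduce("pear", Arrays.asList(4));
        reducer.cleanup();
        System.out.println(writer.rows); // prints [apple/count=6, pear/count=4]
    }
}
```

Holding one connection for the lifetime of the task avoids reconnecting for
every key. Since the job output bypasses the OutputFormat machinery, a retried
reduce task may repeat its writes, so writes that are safe to repeat are the
easier case.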




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com