Posted to user@cassandra.apache.org by Andrey Ilinykh <ai...@gmail.com> on 2012/10/18 20:46:59 UTC

hadoop consistency level

Hello, everybody!
I'm thinking about running Hadoop jobs on top of the Cassandra
cluster. My understanding is that Hadoop jobs read data from local
nodes only. Does that mean the consistency level is always ONE?

Thank you,
  Andrey

Re: hadoop consistency level

Posted by Michael Kjellman <mk...@barracuda.com>.
Honestly, I think what they did re Brisk development is fair. They left
the code for any of us in the community to improve and make compatible
with newer versions, and they need to make money as a company as well.
They already contribute so much to the Cassandra community in general,
and they are certainly not trying to stop people from continuing to
develop Brisk. Hadoop jobs that read from and write to Cassandra will
work without it. If you need the features of CFS and don’t want to
maintain HDFS, then yes, you'll have to pay for DSE.

If you are having issues with data not on the particular node that you are
reading from with Hadoop I'd go ahead and set the consistency level in
your job configuration as I recommended previously. Note there is also a
cassandra.consistencylevel.write setting if you are using either the
ColumnFamilyOutputFormat or BulkOutputFormat classes.
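Putting those two settings together, a minimal sketch of the job setup might look like the following. This is an illustration only: the class and job names are placeholders, and the property keys are the ones quoted in this thread for the 1.1.x line.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Sketch: wiring read/write consistency levels into a MapReduce job that
// uses ColumnFamilyInputFormat / ColumnFamilyOutputFormat. Job name is a
// placeholder; mapper/reducer and input/output setup omitted.
public class CassandraJobSetup {
    public static Job configure() throws Exception {
        Job job = new Job(new Configuration(), "cassandra-mr-example");

        // Consistency for reads via ColumnFamilyInputFormat;
        // defaults to ONE if left unset.
        job.getConfiguration().set("cassandra.consistencylevel.read", "QUORUM");

        // Consistency for writes via ColumnFamilyOutputFormat or
        // BulkOutputFormat.
        job.getConfiguration().set("cassandra.consistencylevel.write", "QUORUM");

        return job;
    }
}
```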

In terms of performance, I have a MR job that reads in 30 million rows
at QUORUM consistency on 1.1.6 with RandomPartitioner, and the mapper
takes about 11 minutes across 3 Hadoop nodes (our Cassandra cluster is
obviously larger, but we haven't fully scaled out our Hadoop cluster
yet). Hardware is two 7200 RPM drives plus an SSD for the commit log,
32GB of RAM, and 12 cores per node. Hope this helps.

Best,
michael

On 10/18/12 12:24 PM, "Jean-Nicolas Boulay Desjardins"
<jn...@gmail.com> wrote:

>I am surprised that it was abandoned this way. So if I want to use
>Brisk on Cassandra 1.1 I have to use the DataStax Enterprise service...
>
>On Thu, Oct 18, 2012 at 3:00 PM, Michael Kjellman
><mk...@barracuda.com> wrote:
>>
>> Unless you have Brisk (however as far as I know there was one fork that
>> got it working on 1.0 but nothing for 1.1 and is not being actively
>> maintained by Datastax) or go with CFS (which comes with DSE) you are
>>not
>> guaranteed all data is on that hadoop node. You can take a look at the
>>forks
>> if interested here: https://github.com/riptano/brisk/network but I'd
>> personally be afraid to put my eggs in a basket that is certainly not
>>super
>> supported anymore.
>>
>> job.getConfiguration().set("cassandra.consistencylevel.read", "QUORUM");
>> should get you started.
>>
>>
>> Best,
>>
>> michael
>>
>>
>>
>> From: Jean-Nicolas Boulay Desjardins <jn...@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Date: Thursday, October 18, 2012 11:49 AM
>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Subject: Re: hadoop consistency level
>>
>> Why don't you look into Brisk:
>> http://www.datastax.com/docs/0.8/brisk/about_brisk
>>
>> On Thu, Oct 18, 2012 at 2:46 PM, Andrey Ilinykh <ai...@gmail.com>
>> wrote:
>>>
>>> Hello, everybody!
>>> I'm thinking about running hadoop jobs on the top of the cassandra
>>> cluster. My understanding is - hadoop jobs read data from local nodes
>>> only. Does it mean the consistency level is always ONE?
>>>
>>> Thank you,
>>>   Andrey
>>
>>
>>
>> ----------------------------------
>> 'Like' us on Facebook for exclusive content and other resources on all
>> Barracuda Networks solutions.
>> Visit http://barracudanetworks.com/facebook





Re: hadoop consistency level

Posted by Jean-Nicolas Boulay Desjardins <jn...@gmail.com>.
I am surprised that it was abandoned this way. So if I want to use
Brisk on Cassandra 1.1 I have to use the DataStax Enterprise service...

On Thu, Oct 18, 2012 at 3:00 PM, Michael Kjellman
<mk...@barracuda.com> wrote:
>
> Unless you have Brisk (however as far as I know there was one fork that
> got it working on 1.0 but nothing for 1.1 and is not being actively
> maintained by Datastax) or go with CFS (which comes with DSE) you are not
> guaranteed all data is on that hadoop node. You can take a look at the forks
> if interested here: https://github.com/riptano/brisk/network but I'd
> personally be afraid to put my eggs in a basket that is certainly not super
> supported anymore.
>
> job.getConfiguration().set("cassandra.consistencylevel.read", "QUORUM");
> should get you started.
>
>
> Best,
>
> michael
>
>
>
> From: Jean-Nicolas Boulay Desjardins <jn...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Thursday, October 18, 2012 11:49 AM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: hadoop consistency level
>
> Why don't you look into Brisk:
> http://www.datastax.com/docs/0.8/brisk/about_brisk
>
> On Thu, Oct 18, 2012 at 2:46 PM, Andrey Ilinykh <ai...@gmail.com>
> wrote:
>>
>> Hello, everybody!
>> I'm thinking about running hadoop jobs on the top of the cassandra
>> cluster. My understanding is - hadoop jobs read data from local nodes
>> only. Does it mean the consistency level is always ONE?
>>
>> Thank you,
>>   Andrey
>
>
>

Re: hadoop consistency level

Posted by Bryan Talbot <bt...@aeriagames.com>.
I believe that reading with CL.ONE will still cause read repair to be run
(in the background) 'read_repair_chance' of the time.

-Bryan
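Bryan's point can be sketched numerically. This toy simulation (my illustration, not Cassandra code) treats each CL.ONE read as an independent trial that triggers a background repair with probability read_repair_chance; over many reads, roughly that fraction of them kick off a repair:

```java
import java.util.Random;

// Toy model of probabilistic read repair: each read independently triggers
// a background repair with probability `chance` (read_repair_chance).
public class ReadRepairChance {
    static int repairsTriggered(double chance, int reads, long seed) {
        Random rng = new Random(seed); // seeded so the sketch is repeatable
        int repairs = 0;
        for (int i = 0; i < reads; i++) {
            if (rng.nextDouble() < chance) {
                repairs++;
            }
        }
        return repairs;
    }

    public static void main(String[] args) {
        // With chance = 0.1, expect roughly 1000 repairs per 10000 reads.
        System.out.println(repairsTriggered(0.1, 10000, 42L));
    }
}
```

So even a job run entirely at CL.ONE gradually repairs inconsistent replicas in the background, which is why a second run can see rows the first one missed.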


On Thu, Oct 18, 2012 at 1:52 PM, Andrey Ilinykh <ai...@gmail.com> wrote:

> On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman
> <mk...@barracuda.com> wrote:
> > Not sure I understand your question (if there is one..)
> >
> > You are more than welcome to do CL ONE and assuming you have hadoop nodes
> > in the right places on your ring things could work out very nicely. If
> you
> > need to guarantee that you have all the data in your job then you'll need
> > to use QUORUM.
> >
> > If you don't specify a CL in your job config it will default to ONE (at
> > least that's what my read of the ConfigHelper source for 1.1.6 shows)
> >
> I have two questions.
> 1. I can benefit from data locality (and Hadoop) only with CL ONE. Is
> it correct?
> 2. With CL QUORUM cassandra reads data from all replicas. In this case
> Hadoop doesn't give me any  benefits. Application running outside the
> cluster has the same performance. Is it correct?
>
> Thank you,
>   Andrey
>

Re: hadoop consistency level

Posted by Andrey Ilinykh <ai...@gmail.com>.
On Thu, Oct 18, 2012 at 2:31 PM, Jeremy Hanna
<je...@gmail.com> wrote:
>
> On Oct 18, 2012, at 3:52 PM, Andrey Ilinykh <ai...@gmail.com> wrote:
>
>> On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman
>> <mk...@barracuda.com> wrote:
>>> Not sure I understand your question (if there is one..)
>>>
>>> You are more than welcome to do CL ONE and assuming you have hadoop nodes
>>> in the right places on your ring things could work out very nicely. If you
>>> need to guarantee that you have all the data in your job then you'll need
>>> to use QUORUM.
>>>
>>> If you don't specify a CL in your job config it will default to ONE (at
>>> least that's what my read of the ConfigHelper source for 1.1.6 shows)
>>>
>> I have two questions.
>> 1. I can benefit from data locality (and Hadoop) only with CL ONE. Is
>> it correct?
>
> Yes and at QUORUM it's quasi local.  The job tracker finds out where a range is and sends a task to a replica with the data (local).  In the case of CL.QUORUM (see the Read Path section of http://wiki.apache.org/cassandra/ArchitectureInternals), it will do an actual read of the data on the node closest (local).  Then it will get a digest from other nodes to verify that they have the same data.  So in the case of RF=3 and QUORUM, it will read the data on the local node where the task is running and will check the next closest replica for a digest to verify that it is consistent.  Information is sent across the wire and there is the latency of that, but it's not the data that's sent.
>
>> 2. With CL QUORUM cassandra reads data from all replicas. In this case
>> Hadoop doesn't give me any  benefits. Application running outside the
>> cluster has the same performance. Is it correct?
>
> CL QUORUM does not read data from all replicas.  Applications running outside the cluster have to copy the data from the cluster, a much more copy/network intensive operation than using CL.QUORUM with the built-in Hadoop support.
>

Thank you very much, guys! I have a much clearer picture now.

Andrey

Re: hadoop consistency level

Posted by Jeremy Hanna <je...@gmail.com>.
On Oct 18, 2012, at 3:52 PM, Andrey Ilinykh <ai...@gmail.com> wrote:

> On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman
> <mk...@barracuda.com> wrote:
>> Not sure I understand your question (if there is one..)
>> 
>> You are more than welcome to do CL ONE and assuming you have hadoop nodes
>> in the right places on your ring things could work out very nicely. If you
>> need to guarantee that you have all the data in your job then you'll need
>> to use QUORUM.
>> 
>> If you don't specify a CL in your job config it will default to ONE (at
>> least that's what my read of the ConfigHelper source for 1.1.6 shows)
>> 
> I have two questions.
> 1. I can benefit from data locality (and Hadoop) only with CL ONE. Is
> it correct?

Yes and at QUORUM it's quasi local.  The job tracker finds out where a range is and sends a task to a replica with the data (local).  In the case of CL.QUORUM (see the Read Path section of http://wiki.apache.org/cassandra/ArchitectureInternals), it will do an actual read of the data on the node closest (local).  Then it will get a digest from other nodes to verify that they have the same data.  So in the case of RF=3 and QUORUM, it will read the data on the local node where the task is running and will check the next closest replica for a digest to verify that it is consistent.  Information is sent across the wire and there is the latency of that, but it's not the data that's sent.
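The replica-count arithmetic behind that description can be made concrete. A small sketch (mine, not Cassandra source): QUORUM requires floor(RF/2) + 1 replicas to respond, of which one serves the full data (locally, if the task runs on a replica) and the rest return digests only:

```java
// Replica-count arithmetic behind a QUORUM read (illustration only).
public class QuorumMath {
    // Number of replicas that must respond for CL.QUORUM.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Replicas contacted for a digest only, beyond the single full data read.
    static int digestReads(int replicationFactor) {
        return quorum(replicationFactor) - 1;
    }

    public static void main(String[] args) {
        // RF=3: quorum of 2 -> one local data read + one remote digest.
        System.out.println("quorum=" + quorum(3) + " digests=" + digestReads(3));
    }
}
```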

> 2. With CL QUORUM cassandra reads data from all replicas. In this case
> Hadoop doesn't give me any  benefits. Application running outside the
> cluster has the same performance. Is it correct?

CL QUORUM does not read data from all replicas.  Applications running outside the cluster have to copy the data from the cluster, a much more copy/network intensive operation than using CL.QUORUM with the built-in Hadoop support.

> 
> Thank you,
>  Andrey


Re: hadoop consistency level

Posted by Michael Kjellman <mk...@barracuda.com>.
1. Yes, you can absolutely benefit from data locality, and the InputSplits
will theoretically schedule the map task on Cassandra+Hadoop nodes that
have the data locally. If your application doesn't need to worry about
that one pesky row that should be local to that node (the node is
responsible for it, but for some reason the data isn't there), then go
ahead and run it with CL ONE. In a perfect world all of the rows would be
there, but any seasoned Cassandra user knows that exceptions happen.

If what Bryan says is right, then on your first MR job the mapper would
miss that row, but a subsequent run would contain the data, because the
read repair would have been triggered in the background. Once again: how
important is it that you get all your data 100% of the time?

2. I would think a little more about your project if you are planning on
using Hadoop only for data locality. It depends on whether your workload
would benefit from Hadoop and distributed processing. Hadoop provides
many benefits, but if you require QUORUM consistency and you don't have
a workload that lends itself to an input > output distributed model,
then Hadoop might not be the right tool for the job.

Best,
Michael

On 10/18/12 1:52 PM, "Andrey Ilinykh" <ai...@gmail.com> wrote:

>On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman
><mk...@barracuda.com> wrote:
>> Not sure I understand your question (if there is one..)
>>
>> You are more than welcome to do CL ONE and assuming you have hadoop
>>nodes
>> in the right places on your ring things could work out very nicely. If
>>you
>> need to guarantee that you have all the data in your job then you'll
>>need
>> to use QUORUM.
>>
>> If you don't specify a CL in your job config it will default to ONE (at
>> least that's what my read of the ConfigHelper source for 1.1.6 shows)
>>
>I have two questions.
>1. I can benefit from data locality (and Hadoop) only with CL ONE. Is
>it correct?
>2. With CL QUORUM cassandra reads data from all replicas. In this case
>Hadoop doesn't give me any  benefits. Application running outside the
>cluster has the same performance. Is it correct?
>
>Thank you,
>  Andrey





Re: hadoop consistency level

Posted by Andrey Ilinykh <ai...@gmail.com>.
On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman
<mk...@barracuda.com> wrote:
> Not sure I understand your question (if there is one..)
>
> You are more than welcome to do CL ONE and assuming you have hadoop nodes
> in the right places on your ring things could work out very nicely. If you
> need to guarantee that you have all the data in your job then you'll need
> to use QUORUM.
>
> If you don't specify a CL in your job config it will default to ONE (at
> least that's what my read of the ConfigHelper source for 1.1.6 shows)
>
I have two questions.
1. I can benefit from data locality (and Hadoop) only with CL ONE. Is
that correct?
2. With CL QUORUM, Cassandra reads data from all replicas. In this case
Hadoop doesn't give me any benefits; an application running outside the
cluster has the same performance. Is that correct?

Thank you,
  Andrey

Re: hadoop consistency level

Posted by Michael Kjellman <mk...@barracuda.com>.
Not sure I understand your question (if there is one..)

You are more than welcome to do CL ONE and assuming you have hadoop nodes
in the right places on your ring things could work out very nicely. If you
need to guarantee that you have all the data in your job then you'll need
to use QUORUM.

If you don't specify a CL in your job config it will default to ONE (at
least that's what my read of the ConfigHelper source for 1.1.6 shows)

On 10/18/12 1:29 PM, "Andrey Ilinykh" <ai...@gmail.com> wrote:

>On Thu, Oct 18, 2012 at 1:24 PM, Michael Kjellman
><mk...@barracuda.com> wrote:
>> Well there is *some* data locality, it's just not guaranteed. My
>> understanding (and someone correct me if I'm wrong) is that
>> ColumnFamilyInputFormat implements InputSplit and the getLocations()
>> method.
>>
>> http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/InputSplit.html
>>
>> ColumnFamilySplit.java contains logic to do its best to determine
>> which node holds the data for that mapper.
>>
>But no guarantee local data is in sync with other nodes. Which means
>you have CL ONE. If you want CL QUORUM you have to make remote call,
>no matter if data is local or not.





Re: hadoop consistency level

Posted by Andrey Ilinykh <ai...@gmail.com>.
On Thu, Oct 18, 2012 at 1:24 PM, Michael Kjellman
<mk...@barracuda.com> wrote:
> Well there is *some* data locality, it's just not guaranteed. My
> understanding (and someone correct me if I'm wrong) is that
> ColumnFamilyInputFormat implements InputSplit and the getLocations()
> method.
>
> http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/InputSplit.html
>
> ColumnFamilySplit.java contains logic to do its best to determine
> which node holds the data for that mapper.
>
But there is no guarantee local data is in sync with other nodes, which
means you effectively have CL ONE. If you want CL QUORUM you have to
make a remote call, no matter whether the data is local or not.

Re: hadoop consistency level

Posted by Michael Kjellman <mk...@barracuda.com>.
Well there is *some* data locality, it's just not guaranteed. My
understanding (and someone correct me if I'm wrong) is that
ColumnFamilyInputFormat implements InputSplit and the getLocations()
method.

http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/InputSplit.html

ColumnFamilySplit.java contains logic to do its best to determine which
node holds the data for that mapper.

But obviously it isn't guaranteed that all data will be on that node.

Also, for the sake of completeness, we have RF=3 on the Keyspace in
question.
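The scheduling behavior described above can be illustrated with a toy model (not the real ColumnFamilySplit/JobTracker code): each split advertises the replicas holding its token range via getLocations(), and the scheduler prefers a Hadoop node in that list, falling back to a non-local node when none matches:

```java
import java.util.Arrays;
import java.util.List;

// Toy locality-aware assignment: prefer a task-tracker node that is also a
// replica for the split; otherwise pick any node (data then crosses the wire).
public class SplitScheduling {
    static String pickNode(List<String> splitLocations, List<String> hadoopNodes) {
        for (String node : hadoopNodes) {
            if (splitLocations.contains(node)) {
                return node; // data-local task
            }
        }
        return hadoopNodes.get(0); // remote read: no replica runs a tracker
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList("node2", "node3");
        List<String> trackers = Arrays.asList("node1", "node2");
        System.out.println(pickNode(replicas, trackers)); // node2: local
    }
}
```

The fallback branch is exactly the "not guaranteed" case in the message above: the task still runs, but the row data has to be fetched from a remote replica.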

On 10/18/12 1:15 PM, "Andrey Ilinykh" <ai...@gmail.com> wrote:

>On Thu, Oct 18, 2012 at 12:00 PM, Michael Kjellman
><mk...@barracuda.com> wrote:
>> Unless you have Brisk (however as far as I know there was one fork that
>>got
>> it working on 1.0 but nothing for 1.1 and is not being actively
>>maintained
>> by Datastax) or go with CFS (which comes with DSE) you are not
>>guaranteed
>> all data is on that hadoop node. You can take a look at the forks if
>> interested here: https://github.com/riptano/brisk/network but I'd
>>personally
>> be afraid to put my eggs in a basket that is certainly not super
>>supported
>> anymore.
>>
>> job.getConfiguration().set("cassandra.consistencylevel.read", "QUORUM");
>> should get you started.
>This is what I don't understand. With QUORUM you read data from at
>least two nodes. If so, you don't benefit from data locality. What's
>the point to use hadoop? I can run application on any machine(s) and
>iterate through column family. What is the difference?
>
>Thank you,
>  Andrey





Re: hadoop consistency level

Posted by Andrey Ilinykh <ai...@gmail.com>.
On Thu, Oct 18, 2012 at 12:00 PM, Michael Kjellman
<mk...@barracuda.com> wrote:
> Unless you have Brisk (however as far as I know there was one fork that got
> it working on 1.0 but nothing for 1.1 and is not being actively maintained
> by Datastax) or go with CFS (which comes with DSE) you are not guaranteed
> all data is on that hadoop node. You can take a look at the forks if
> interested here: https://github.com/riptano/brisk/network but I'd personally
> be afraid to put my eggs in a basket that is certainly not super supported
> anymore.
>
> job.getConfiguration().set("cassandra.consistencylevel.read", "QUORUM");
> should get you started.
This is what I don't understand. With QUORUM you read data from at
least two nodes. If so, you don't benefit from data locality. What's
the point of using Hadoop? I can run an application on any machine(s)
and iterate through the column family. What is the difference?

Thank you,
  Andrey

Re: hadoop consistency level

Posted by William Oberman <ob...@civicscience.com>.
A recent thread made it sound like Brisk was no longer a datastax supported
thing (it's DataStax Enterpise, or DSE, now):
http://www.mail-archive.com/user@cassandra.apache.org/msg24921.html

In particular this response:
http://www.mail-archive.com/user@cassandra.apache.org/msg25061.html

On Thu, Oct 18, 2012 at 2:49 PM, Jean-Nicolas Boulay Desjardins <
jnbdzjnbdz@gmail.com> wrote:

> Why don't you look into Brisk:
> http://www.datastax.com/docs/0.8/brisk/about_brisk
>
>
> On Thu, Oct 18, 2012 at 2:46 PM, Andrey Ilinykh <ai...@gmail.com> wrote:
>
>> Hello, everybody!
>> I'm thinking about running hadoop jobs on the top of the cassandra
>> cluster. My understanding is - hadoop jobs read data from local nodes
>> only. Does it mean the consistency level is always ONE?
>>
>> Thank you,
>>   Andrey
>>
>
>

Re: hadoop consistency level

Posted by Michael Kjellman <mk...@barracuda.com>.
Unless you have Brisk (however as far as I know there was one fork that got it working on 1.0 but nothing for 1.1 and is not being actively maintained by Datastax) or go with CFS (which comes with DSE) you are not guaranteed all data is on that hadoop node. You can take a look at the forks if interested here: https://github.com/riptano/brisk/network but I'd personally be afraid to put my eggs in a basket that is certainly not super supported anymore.


job.getConfiguration().set("cassandra.consistencylevel.read", "QUORUM"); should get you started.


Best,

michael


From: Jean-Nicolas Boulay Desjardins <jn...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Thursday, October 18, 2012 11:49 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: hadoop consistency level

Why don't you look into Brisk:  http://www.datastax.com/docs/0.8/brisk/about_brisk

On Thu, Oct 18, 2012 at 2:46 PM, Andrey Ilinykh <ai...@gmail.com> wrote:
Hello, everybody!
I'm thinking about running hadoop jobs on the top of the cassandra
cluster. My understanding is - hadoop jobs read data from local nodes
only. Does it mean the consistency level is always ONE?

Thank you,
  Andrey





Re: hadoop consistency level

Posted by Jean-Nicolas Boulay Desjardins <jn...@gmail.com>.
Why don't you look into Brisk:
http://www.datastax.com/docs/0.8/brisk/about_brisk

On Thu, Oct 18, 2012 at 2:46 PM, Andrey Ilinykh <ai...@gmail.com> wrote:

> Hello, everybody!
> I'm thinking about running hadoop jobs on the top of the cassandra
> cluster. My understanding is - hadoop jobs read data from local nodes
> only. Does it mean the consistency level is always ONE?
>
> Thank you,
>   Andrey
>