
Posted to user@cassandra.apache.org by Joe Stein <cr...@gmail.com> on 2011/10/27 15:37:44 UTC

Counter Experience (Performance)?

Hey folks, I am interested in what others have seen regarding the depth and
width (CFs, rows & columns) that they can/do write per batch and
concurrently, and where the inflection point is at which performance
degrades.  I have been expanding my use of counters and am finding some
interesting nuances, some related to my code and implementation but others
I can't yet quantify.

My batches are 1x5x5 (1 row in each of 5 column families, and 5 columns in
each of those rows).  I have 3 nodes, each with 100 connections, and a
separate thread pool of 100 threads rolling through 6,000,000 rows of data
and sending them out to Cassandra (the 1x5x5 matrix is constructed from each
line).  I am finding this to be my sweet spot right now, but it is still not
really performing fantastically (or at least not as well as I had hoped),
and I am wondering what else (if anything) I can do to tweak settings so
that I can push in more columns or rows.  I find that changing my pool
settings much from this causes errors in the client lib, but I will send an
email to that list separately, though I think I have that figured out on my
own for now.
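
For readers trying to picture the batch shape, here is a minimal sketch of
how one input line could be turned into a 1x5x5 batch of counter increments.
It is not the code used in this thread: the column family names, the
comma-separated line format, and the functions build_batch and
count_increments are all hypothetical, and the batch is a plain nested dict
rather than a call into any real client library.

```python
# Hypothetical column family names; the thread does not name the real ones.
COLUMN_FAMILIES = ["cf_a", "cf_b", "cf_c", "cf_d", "cf_e"]
COLUMNS_PER_ROW = 5


def build_batch(line):
    """Turn one input line into a 1x5x5 batch of counter increments.

    Assumes (hypothetically) a comma-separated line whose first field is the
    row key and whose next five fields are the column names to increment.
    Returns {column_family: {row_key: {column_name: increment}}}.
    """
    fields = line.strip().split(",")
    row_key = fields[0]
    columns = fields[1:1 + COLUMNS_PER_ROW]
    if len(columns) != COLUMNS_PER_ROW:
        raise ValueError("expected a row key plus %d columns" % COLUMNS_PER_ROW)
    # 1 row per column family, 5 columns per row, each incremented by 1.
    return {cf: {row_key: {col: 1 for col in columns}} for cf in COLUMN_FAMILIES}


def count_increments(batch):
    """Total number of counter increments carried by one batch."""
    return sum(len(cols) for rows in batch.values() for cols in rows.values())


batch = build_batch("user42,a,b,c,d,e")
total = count_increments(batch)  # 5 column families x 1 row x 5 columns
```

Each batch built this way carries 25 increments, so widening the matrix
(more columns or rows per column family) scales the per-batch work
multiplicatively.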

Thanks in advance!!!  I hope to get more work going on this in the next day
or so in a more methodical way to find the right count, so I can build a
sparse matrix that will perform best for the system and the business.

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
*/

Re: Counter Experience (Performance)?

Posted by Joe Stein <cr...@gmail.com>.

Thanks Jake. The bottleneck is the disk, I believe: each write is taking
50ms, probably because of EBS (I am doing the testing in EC2).
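
As a rough back-of-the-envelope check on that 50ms figure, the sketch below
estimates an upper bound on throughput. It rests on assumptions that are not
confirmed anywhere in this thread: that "each write" means one whole 1x5x5
batch, that each of the 100 client threads blocks on one write at a time,
and that latency stays flat under load. The function estimate_run is
hypothetical.

```python
def estimate_run(rows, threads, write_latency_ms, increments_per_batch):
    """Upper-bound estimate for a run of synchronous batch writes.

    Assumes one batch per input row and one in-flight write per thread.
    """
    # Each thread completes 1000 / write_latency_ms batches per second.
    batches_per_s = threads * 1000 / write_latency_ms
    return {
        "batches_per_s": batches_per_s,
        "increments_per_s": batches_per_s * increments_per_batch,
        "minutes": rows / batches_per_s / 60.0,
    }


est = estimate_run(
    rows=6_000_000,            # input rows, one batch each
    threads=100,               # client thread pool size
    write_latency_ms=50,       # observed time per write
    increments_per_batch=25,   # 5 column families x 1 row x 5 columns
)
```

Under those assumptions the ceiling is about 2,000 batches per second
(50,000 counter increments per second), or roughly 50 minutes for the
6,000,000 rows. Any contention on the disk would push the real numbers
below that.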

I will move my testing over to our production network and run it on some
nodes on real hardware, since that is where it will end up.

I am seeing things slow down linearly, with nothing dropping off
precipitously.  Glad to have the benchmarks I have; they are good for
comparing things.  Thanks!

On Thu, Oct 27, 2011 at 11:30 AM, Jake Luciani <ja...@gmail.com> wrote:

> What's your bottleneck?
> http://spyced.blogspot.com/2010/01/linux-performance-basics.html
>
>
> --
> http://twitter.com/tjake
>




Re: Counter Experience (Performance)?

Posted by Jake Luciani <ja...@gmail.com>.

What's your bottleneck?
http://spyced.blogspot.com/2010/01/linux-performance-basics.html




-- 
http://twitter.com/tjake