Posted to user@cassandra.apache.org by Raj N <ra...@gmail.com> on 2012/06/15 16:59:32 UTC

Unbalanced ring in Cassandra 0.8.4

Hi experts,
    I have a 6-node cluster across 2 DCs (DC1: 3, DC2: 3). I have assigned
tokens using the first strategy (adding 1) mentioned here:

http://wiki.apache.org/cassandra/Operations?#Token_selection
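For reference, that strategy (even spacing per DC, with the second DC's tokens offset by 1) can be sketched in a few lines of Python. This is only a sketch: the wiki-era token generators may round differently, so the low-order digits can differ from the tokens shown in the ring output below.

```python
# Sketch of the "adding 1" token strategy for two DCs on RandomPartitioner:
# give each DC its own evenly spaced ring, then shift the second DC's
# tokens by 1 so no two nodes share a token.
RING = 2 ** 127  # RandomPartitioner's token range is [0, 2**127)

def dc_tokens(node_count, offset=0):
    """Evenly spaced tokens for one DC, each shifted by `offset`."""
    return [i * RING // node_count + offset for i in range(node_count)]

dc1 = dc_tokens(3)            # [0, RING // 3, 2 * RING // 3]
dc2 = dc_tokens(3, offset=1)  # same spacing, one past each DC1 token

for t1, t2 in zip(dc1, dc2):
    print(f"DC1: {t1}\nDC2: {t2}")
```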

But when I run nodetool ring on my cluster, this is the result I get -

Address         DC  Rack  Status State   Load        Owns    Token
                                                            113427455640312814857969558651062452225
172.17.72.91    DC1 RAC13 Up     Normal  102.07 GB   33.33%  0
45.10.80.144    DC2 RAC5  Up     Normal  59.1 GB     0.00%   1
172.17.72.93    DC1 RAC18 Up     Normal  59.57 GB    33.33%  56713727820156407428984779325531226112
45.10.80.146    DC2 RAC7  Up     Normal  59.64 GB    0.00%   56713727820156407428984779325531226113
172.17.72.95    DC1 RAC19 Up     Normal  69.58 GB    33.33%  113427455640312814857969558651062452224
45.10.80.148    DC2 RAC9  Up     Normal  59.31 GB    0.00%   113427455640312814857969558651062452225


As you can see, the first node has considerably more load than the
others (almost double), which is surprising since all of these are replicas of
each other. I am running Cassandra 0.8.4. Is there an explanation for this
behaviour? Could https://issues.apache.org/jira/browse/CASSANDRA-2433 be
the cause?

Thanks
-Raj

Re: Unbalanced ring in Cassandra 0.8.4

Posted by aaron morton <aa...@thelastpickle.com>.
>  Does cleanup only clean up keys that no longer belong to that node?
Yes.

I guess it could be an artefact of the bulk load. It hasn't been reported previously, though. Try the cleanup and see how it goes.

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



Re: Unbalanced ring in Cassandra 0.8.4

Posted by Raj N <ra...@gmail.com>.
Nick, thanks for the response. Does cleanup only clean up keys that no
longer belong to that node? To add more color: when I bulk loaded all my
data into these 6 nodes, all of them had the same amount of data. After
the first nodetool repair, the first node started holding more data than the
rest of the cluster, and it has never come back down since. When I run
cfstats on that node, the amount of data for every column family is almost 2
times that of the other nodes, and the same is true for the estimated number
of keys. For one CF I see more than double the number of keys, and that is
also the largest CF, with 34 GB of data.

Thanks
-Rajesh


Re: Unbalanced ring in Cassandra 0.8.4

Posted by Nick Bailey <ni...@datastax.com>.
No. Cleanup will scan each sstable to remove data that is no longer
owned by that specific node. It won't compact the sstables together,
however.


Re: Unbalanced ring in Cassandra 0.8.4

Posted by Raj N <ra...@gmail.com>.
But won't that also run a major compaction, which is not recommended anymore?

-Raj


Re: Unbalanced ring in Cassandra 0.8.4

Posted by aaron morton <aa...@thelastpickle.com>.
Assuming you have been running repair, it can't hurt.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



Re: Unbalanced ring in Cassandra 0.8.4

Posted by Raj N <ra...@gmail.com>.
Nick, do you think I should still run cleanup on the first node?

-Rajesh


Re: Unbalanced ring in Cassandra 0.8.4

Posted by Raj N <ra...@gmail.com>.
I did run nodetool move, but that was when I was setting up the cluster,
which means I didn't have any data at that time.

-Raj


Re: Unbalanced ring in Cassandra 0.8.4

Posted by Nick Bailey <ni...@datastax.com>.
Did you start all your nodes at the correct tokens or did you balance
by moving them? Moving nodes around won't delete unneeded data after
the move is done.

Try running 'nodetool cleanup' on all of your nodes.
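One way to script that across the cluster is sketched below. This is hypothetical: it assumes passwordless SSH to each node with nodetool on the remote PATH, the IPs are simply the ones from the ring output, and it defaults to a dry run that only prints the commands.

```python
# Hypothetical sketch: run 'nodetool cleanup' on each node in turn over SSH.
# Cleanup is I/O-intensive, so running it one node at a time is gentler
# than firing it cluster-wide in parallel.
import subprocess

NODES = ["172.17.72.91", "172.17.72.93", "172.17.72.95",
         "45.10.80.144", "45.10.80.146", "45.10.80.148"]

def cleanup_command(host):
    """Build the remote cleanup invocation for one node."""
    return ["ssh", host, "nodetool", "cleanup"]

def run_cleanup(nodes, dry_run=True):
    """Print (dry run) or execute cleanup on each node sequentially."""
    for host in nodes:
        cmd = cleanup_command(host)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)

run_cleanup(NODES)  # dry run: prints one ssh command per node
```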


Re: Unbalanced ring in Cassandra 0.8.4

Posted by Raj N <ra...@gmail.com>.
Actually, I am not worried about the percentage; it's the data I am concerned
about. Look at the first node: it has 102.07 GB of data, while the other nodes
have around 60 GB (one has 69, but let's ignore that one). I don't
understand why the first node has almost double the data.

Thanks
-Raj


Re: Unbalanced ring in Cassandra 0.8.4

Posted by Nick Bailey <ni...@datastax.com>.
This is just a known problem with the nodetool output and multiple
DCs. Your configuration is correct. The problem with nodetool is fixed
in 1.1.1:

https://issues.apache.org/jira/browse/CASSANDRA-3412
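To see where the 0.00% figures come from, here is a sketch of a ring-wide ownership calculation, where each node owns the span between its predecessor's token and its own. Because each DC2 token sits one past a DC1 token, its ring-wide span is a single token, which rounds to 0.00% (the fixed nodetool computes ownership per DC instead).

```python
# Sketch: ring-wide ownership -- the fraction of the token range between
# a node's token and its predecessor's. The tokens below are the six
# from the ring output in this thread.
RING = 2 ** 127

tokens = [
    0,                                        # DC1
    1,                                        # DC2 (one past DC1)
    56713727820156407428984779325531226112,   # DC1
    56713727820156407428984779325531226113,   # DC2
    113427455640312814857969558651062452224,  # DC1
    113427455640312814857969558651062452225,  # DC2
]

def ownership(tokens):
    """Map each token to its fraction of the ring (predecessor wraps around)."""
    ordered = sorted(tokens)
    return {t: ((t - ordered[i - 1]) % RING) / RING
            for i, t in enumerate(ordered)}

for t, frac in ownership(tokens).items():
    print(f"{t}: {frac:.2%}")  # DC1 nodes ~33.33%, DC2 nodes ~0.00%
```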
