Posted to commits@cassandra.apache.org by "Arya Goudarzi (JIRA)" <ji...@apache.org> on 2010/07/15 02:24:49 UTC

[jira] Updated: (CASSANDRA-1221) loadbalance operation never completes on a 3 node cluster

     [ https://issues.apache.org/jira/browse/CASSANDRA-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arya Goudarzi updated CASSANDRA-1221:
-------------------------------------

    Attachment: system1.log
                system2.log
                system3.log

Cassandra system logs for nodes 1-3

> loadbalance operation never completes on a 3 node cluster
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-1221
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1221
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.7
>            Reporter: Gary Dusbabek
>            Assignee: Gary Dusbabek
>             Fix For: 0.7
>
>         Attachments: system1.log, system2.log, system3.log
>
>
> Arya Goudarzi reports:
> Please confirm whether this is an issue that should be reported or whether I am doing something wrong. I could not find anything relevant on JIRA:
> Playing with the 0.7 nightly (today's build), I set up a 3-node cluster this way:
>  - Added one node;
>  - Loaded default schema with RF 1 from YAML using JMX;
>  - Loaded 2M keys using py_stress;
>  - Bootstrapped a second node;
>  - Cleaned up the first node;
>  - Bootstrapped a third node;
>  - Cleaned up the second node;
> I got the following ring:
> Address       Status     Load          Range                                      Ring
>                                       154293670372423273273390365393543806425
> 10.50.26.132  Up         518.63 MB     69164917636305877859094619660693892452     |<--|
> 10.50.26.134  Up         234.8 MB      111685517405103688771527967027648896391    |   |
> 10.50.26.133  Up         235.26 MB     154293670372423273273390365393543806425    |-->|
> Now I ran:
> nodetool --host 10.50.26.132 loadbalance
> It has been running for a while. I checked the streams:
> nodetool --host 10.50.26.134 streams
> Mode: Normal
> Not sending any streams.
> Streaming from: /10.50.26.132
>   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-3-Data.db/[(0,22206096), (22206096,27271682)]
>   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-4-Data.db/[(0,15180462), (15180462,18656982)]
>   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-5-Data.db/[(0,353139829), (353139829,433883659)]
>   Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-6-Data.db/[(0,366336059), (366336059,450095320)]
> nodetool --host 10.50.26.132 streams
> Mode: Leaving: streaming data to other nodes
> Streaming to: /10.50.26.134
>   /var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), (366336059,450095320)]
> Not receiving any streams.
> These have been going for the past 2 hours.
> In the logs of the node at 10.50.26.134, I saw this:
> INFO [GOSSIP_STAGE:1] 2010-06-22 16:30:54,679 StorageService.java (line 603) Will not change my token ownership to /10.50.26.132
> As I understand it from the wiki, loadbalance is supposed to decommission the node, sending its tokens to other nodes, and then bootstrap it again. It has been stuck in streaming for the past 2 hours and the size of the ring has not changed. The log on the first node shows that it started streaming hours ago:
> INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 72) Beginning transfer process to /10.50.26.134 for ranges (154293670372423273273390365393543806425,69164917636305877859094619660693892452]
>  INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 82) Flushing memtables for Keyspace1...
>  INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,266 StreamOut.java (line 128) Stream context metadata [/var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), (366336059,450095320)]] 1 sstables.
>  INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 135) Sending a stream initiate message to /10.50.26.134 ...
>  INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 140) Waiting for transfer to /10.50.26.134 to complete
>  INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 359) LocationInfo has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1277249454413.log', position=720)
>  INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 622) Enqueuing flush of Memtable(LocationInfo)@1637794189
>  INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,370 Memtable.java (line 149) Writing Memtable(LocationInfo)@1637794189
>  INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,528 Memtable.java (line 163) Completed flushing /var/lib/cassandra/data/system/LocationInfo-d-9-Data.db
>  INFO [MEMTABLE-POST-FLUSHER:1] 2010-06-22 17:36:53,529 ColumnFamilyStore.java (line 374) Discarding 1000
> Nothing more after this line.
> Am I doing something wrong?
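
The uneven Load column in the ring listing above can be checked against the tokens themselves. The following is a minimal sketch, not part of the original report. It assumes the cluster uses RandomPartitioner with a token space of 2**127 (the report does not name the partitioner), and the function name `ownership` is made up for this illustration. Each node is taken to own the range from the previous token, exclusive, up to its own token, inclusive, wrapping around the ring.

```python
# Assumption: RandomPartitioner, whose token space is 0 .. 2**127.
# The report does not state which partitioner was configured.
RING_SIZE = 2 ** 127

# Tokens copied from the ring listing in the report.
tokens = {
    "10.50.26.132": 69164917636305877859094619660693892452,
    "10.50.26.134": 111685517405103688771527967027648896391,
    "10.50.26.133": 154293670372423273273390365393543806425,
}


def ownership(tokens, ring_size=RING_SIZE):
    """Return each node's share of the ring as a fraction between 0 and 1."""
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    shares = {}
    for i, (node, token) in enumerate(ordered):
        # For i == 0 the index -1 wraps to the node with the highest token.
        previous_token = ordered[i - 1][1]
        # A single-node ring has a span of 0 here, so treat that as the full ring.
        span = (token - previous_token) % ring_size or ring_size
        shares[node] = span / ring_size
    return shares


if __name__ == "__main__":
    for node, share in ownership(tokens).items():
        print(f"{node}: {share:.1%} of the ring")
```

Under that assumption, 10.50.26.132 owns about half of the ring and the other two nodes about a quarter each, which is in line with the reported loads of 518.63 MB, 234.8 MB and 235.26 MB.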

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.