
Posted to user@cassandra.apache.org by Carlos A <na...@gmail.com> on 2016/01/11 06:57:29 UTC

In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

Hello all,

I have a small dev environment with 4 machines. I removed one of them (.33)
from the cluster because I wanted to upgrade its HD to an SSD. I then
reinstalled it and tried to rejoin. It has been in UJ status for a week now
with no change.

I tried node-repair and similar steps, but nothing changed.

nodetool status output

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.1.30  16.13 MB   256          ?       0e524b1c-b254-45d0-98ee-63b8f34a8531  RAC1
UN  192.168.1.31  20.12 MB   256          ?       1f8000f5-026c-42c7-8189-cf19fbede566  RAC1
UN  192.168.1.32  17.73 MB   256          ?       7b06f9e9-7c41-4364-ab18-f6976fd359e4  RAC1
UJ  192.168.1.33  877.6 KB   256          ?       7a1507b5-198e-4a3a-a9fd-7af9e588fde2  RAC1

Note: Non-system keyspaces don't have the same replication settings,
effective ownership information is meaningless
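
For anyone who wants to watch for this condition from a script, the status listing above can be checked mechanically. The following is only a minimal sketch: the function name and the regular expression are illustrative assumptions, and it relies on nothing more than the two-letter status code and the address that start each node row (so it works whether or not the Host ID wraps onto a second line).

```python
import re

# First letter: U/D (up/down). Second letter: N/L/J/M
# (normal/leaving/joining/moving), as in the legend printed by nodetool status.
ROW = re.compile(r"^([UD][NLJM])\s+(\d{1,3}(?:\.\d{1,3}){3})\b")


def nodes_not_normal(status_text):
    """Return (code, address) pairs for every node that is not Up/Normal."""
    found = []
    for line in status_text.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(1) != "UN":
            found.append((match.group(1), match.group(2)))
    return found


# Two rows copied from the listing in this post.
sample = """\
UN  192.168.1.30  16.13 MB   256          ?
0e524b1c-b254-45d0-98ee-63b8f34a8531  RAC1
UJ  192.168.1.33  877.6 KB   256          ?
7a1507b5-198e-4a3a-a9fd-7af9e588fde2  RAC1
"""
print(nodes_not_normal(sample))
```

In practice the text would come from running nodetool status and capturing its output; here a pasted sample keeps the sketch self-contained.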

Any tips on fixing this?

Thanks,

C.

Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

Posted by Carlos Fernando Scheidecker Antunes <na...@gmail.com>.

DuyHai,

Nothing wrong in the logs either.



> nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                     0         0          11464         0                 0
> ViewMutationStage                 0         0              0         0                 0
> ReadStage                         0         0              0         0                 0
> RequestResponseStage              0         0             10         0                 0
> ReadRepairStage                   0         0              0         0                 0
> CounterMutationStage              0         0              0         0                 0
> MiscStage                         0         0              0         0                 0
> CompactionExecutor                0         0            683         0                 0
> MemtableReclaimMemory             0         0            357         0                 0
> PendingRangeCalculator            0         0              5         0                 0
> GossipStage                       0         0        2682208         0                 0
> SecondaryIndexManagement          0         0              0         0                 0
> HintsDispatcher                   0         0              0         0                 0
> MigrationStage                    0         0              0         0                 0
> MemtablePostFlush                 0         0            375         0                 0
> ValidationExecutor                0         0              0         0                 0
> Sampler                           0         0              0         0                 0
> MemtableFlushWriter               0         0            357         0                 0
> InternalResponseStage             0         0              0         0                 0
> AntiEntropyStage                  0         0              0         0                 0
> CacheCleanupExecutor              0         0              0         0                 0
> 
> Message type           Dropped
> READ                         0
> RANGE_SLICE                  0
> _TRACE                       0
> HINT                         0
> MUTATION                     0
> COUNTER_MUTATION             0
> BATCH_STORE                  0
> BATCH_REMOVE                 0
> REQUEST_RESPONSE             0
> PAGED_RANGE                  0
> READ_REPAIR                  0
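
A tpstats dump like the one above is easier to read if only the interesting rows are kept. The sketch below is an assumption-laden helper, not part of nodetool: the function names are made up, and it assumes the column order shown above (Active, Pending, Completed, Blocked, All time blocked). It reports pools with active, pending, or blocked tasks and message types with a non-zero dropped count.

```python
def _columns(line):
    # Strip any leading "> " email quoting, then split on whitespace.
    return line.lstrip("> ").split()


def busy_pools(tpstats_text):
    """Return {pool: (active, pending, blocked)} for pools that are not idle."""
    busy = {}
    for line in tpstats_text.splitlines():
        parts = _columns(line)
        # Pool rows are a single-word name followed by five integer columns.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            active, pending, blocked = int(parts[1]), int(parts[2]), int(parts[4])
            if active or pending or blocked:
                busy[parts[0]] = (active, pending, blocked)
    return busy


def dropped_messages(tpstats_text):
    """Return {message_type: count} for message types with dropped > 0."""
    dropped = {}
    for line in tpstats_text.splitlines():
        parts = _columns(line)
        # Dropped-message rows are a name followed by one integer.
        if len(parts) == 2 and parts[1].isdigit() and int(parts[1]) > 0:
            dropped[parts[0]] = int(parts[1])
    return dropped


# Hypothetical values for illustration only; every Active, Pending,
# Blocked and Dropped counter in the output above is 0.
sample = """\
> MutationStage                     0         0          11464         0                 0
> GossipStage                       2         5        2682208         0                 0
>
> Message type           Dropped
> READ                         0
> MUTATION                     3
"""
print(busy_pools(sample))
print(dropped_messages(sample))
```

Applied to the output posted above, both functions should return empty dictionaries, which matches the observation that nothing looks wrong on the joining node.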


On Tue, 2016-01-12 at 13:05 +0100, DuyHai Doan wrote:
> Oh, sorry, did not notice the version in the title. Did you check the
> system.log to verify if there isn't any Exception related to data
> streaming ? What is the output of "nodetool tpstats" ?

Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

Posted by DuyHai Doan <do...@gmail.com>.

Oh, sorry, I did not notice the version in the title. Did you check
system.log for any exception related to data streaming? What is the output
of "nodetool tpstats"?
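
One rough way to do the log check is to filter for lines that both mention streaming and look like a warning or error. This is a keyword heuristic under stated assumptions, not a definitive diagnostic: the patterns and the function name are illustrative, and the sample lines are made-up placeholders rather than real Cassandra log output.

```python
import re

STREAM_HINT = re.compile(r"stream", re.IGNORECASE)
PROBLEM = re.compile(r"\b(?:ERROR|WARN)\b|Exception")


def streaming_problems(lines):
    """Return lines that mention streaming and also look like a warning or error."""
    return [
        line.rstrip("\n")
        for line in lines
        if STREAM_HINT.search(line) and PROBLEM.search(line)
    ]


# Made-up placeholder lines, NOT real Cassandra log output.
sample = [
    "INFO  placeholder: node is starting\n",
    "ERROR placeholder: streaming session failed\n",
    "WARN  placeholder: unrelated warning\n",
    "INFO  placeholder: stream task completed\n",
]
for hit in streaming_problems(sample):
    print(hit)
```

To scan a real log, pass an open file object instead of the sample list. The log location depends on the installation; /var/log/cassandra/system.log is a common default for package installs, but that path is an assumption here.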

On Tue, Jan 12, 2016 at 1:00 PM, DuyHai Doan <do...@gmail.com> wrote:

> What is your Cassandra version ? In earlier versions there was some issues
> with streaming that can make the joining process stuck.

Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

Posted by DuyHai Doan <do...@gmail.com>.

What is your Cassandra version? Earlier versions had some streaming issues
that could leave the joining process stuck.

Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

Posted by daemeon reiydelle <da...@gmail.com>.

What do the logs say on the seed node (and on the UJ node)?

Look for timeout messages.

I have seen this problem when there was high network utilization between
the seed and the joining node, and also when there were routing issues.
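
A quick way to rule out the routing side is to test, from the joining node, whether a TCP connection to each peer's inter-node port opens at all. The sketch below assumes the default storage_port of 7000 (7001 when SSL is used); the function name is illustrative. A successful connection only shows that the port is reachable, so it cannot detect congestion or intermittent packet loss.

```python
import socket


def can_connect(host, port=7000, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def check_peers(hosts, port=7000):
    """Map each host to whether its inter-node port accepted a connection."""
    return {host: can_connect(host, port) for host in hosts}
```

From the .33 machine, something like check_peers(["192.168.1.30", "192.168.1.31", "192.168.1.32"]) would show whether the other three nodes are reachable on that port; running the same check from a seed toward .33 covers the reverse direction.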

On Sun, Jan 17, 2016 at 2:24 PM, Kai Wang <de...@gmail.com> wrote:

> Carlos,
>
> so you essentially replace the 33 node. Did you follow this
> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html?
> The link is for 2.x not sure about 3.x. What if you change the new node to
> .34?

Re: In UJ status for over a week trying to rejoin cluster in Cassandra 3.0.1

Posted by Kai Wang <de...@gmail.com>.

Carlos,

so you are essentially replacing the .33 node. Did you follow this procedure?
https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html
The link is for 2.x; I am not sure about 3.x. What if you change the new
node's address to .34?


