Posted to user@cassandra.apache.org by Ran Tavory <ra...@gmail.com> on 2010/05/18 13:55:43 UTC

ConcurrentModificationException in gossiper while decommissioning another node

While node 192.168.252.61 was being decommissioned, I saw this error on
two other nodes:

 INFO [Timer-1] 2010-05-18 06:01:12,048 Gossiper.java (line 179) InetAddress /192.168.252.62 is now dead.
 INFO [GMFD:1] 2010-05-18 06:04:00,189 Gossiper.java (line 568) InetAddress /192.168.252.62 is now UP
 INFO [Timer-1] 2010-05-18 06:11:45,311 Gossiper.java (line 401) FatClient /192.168.252.61 has been silent for 3600000ms, removing from gossip
ERROR [Timer-1] 2010-05-18 06:11:45,315 CassandraDaemon.java (line 88) Fatal exception in thread Thread[Timer-1,5,main]
java.lang.RuntimeException: java.util.ConcurrentModificationException
        at org.apache.cassandra.gms.Gossiper$GossipTimerTask.run(Gossiper.java:97)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
Caused by: java.util.ConcurrentModificationException
        at java.util.Hashtable$Enumerator.next(Hashtable.java:1031)
        at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:382)
        at org.apache.cassandra.gms.Gossiper$GossipTimerTask.run(Gossiper.java:91)
        ... 2 more


.61 is the node being decommissioned. .62 was under load at the time (data
was being streamed to it from .61).

I simply ran nodetool decommission on the .61 node, and then (about an hour
later, I guess) I saw this error on two other live nodes.

Does this ring a bell? Either it's a bug, or I wasn't running decommission
correctly...
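
For readers unfamiliar with this failure mode, the trace can be reproduced
in miniature. The following is a minimal sketch, not Cassandra code. It
assumes the exception arises because an entry is removed from a Hashtable
while its key set is being iterated, which is consistent with the
Hashtable$Enumerator.next frame and with the "removing from gossip" line
logged just before the error. The class, method, and map names are
hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;

public class HashtableCmeDemo {

    /**
     * Removes entries from a Hashtable while iterating over its key set.
     * Returns true if the iteration fails with a
     * ConcurrentModificationException, false if it completes.
     */
    static boolean removeWhileIterating() {
        // Hypothetical stand-in for a gossiper's endpoint-state map.
        Map<String, Long> lastHeardMillis = new Hashtable<>();
        lastHeardMillis.put("192.168.252.61", 0L);
        lastHeardMillis.put("192.168.252.62", 1L);
        lastHeardMillis.put("192.168.252.63", 2L);

        try {
            for (String endpoint : lastHeardMillis.keySet()) {
                // Structural change made through the map itself rather than
                // through the iterator; the fail-fast iterator detects it on
                // its next call to next().
                lastHeardMillis.remove(endpoint);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + removeWhileIterating());
    }
}
```

Note that no second thread is needed: a single thread that modifies the map
in the middle of its own iteration is enough to trip the fail-fast check.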

Re: ConcurrentModificationException in gossiper while decommissioning another node

Posted by Ran Tavory <ra...@gmail.com>.
That sounds like it, thanks.

On Tue, May 18, 2010 at 3:53 PM, roger schildmeijer <sc...@gmail.com> wrote:

> This is hopefully fixed in trunk (CASSANDRA-757 (revision 938597));
> "Replace synchronization in Gossiper with concurrent data structures and
> volatile fields."
>
> // Roger Schildmeijer

Re: ConcurrentModificationException in gossiper while decommissioning another node

Posted by roger schildmeijer <sc...@gmail.com>.
This is hopefully fixed in trunk (CASSANDRA-757 (revision 938597)); "Replace
synchronization in Gossiper with concurrent data structures and volatile
fields."

// Roger Schildmeijer
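
To illustrate the general idea behind "concurrent data structures", here is
a minimal sketch, assuming a periodic sweep that drops endpoints which have
been silent for longer than a timeout. It is not the CASSANDRA-757 patch;
the class, method, and parameter names are hypothetical. It relies only on
the documented behaviour of java.util.concurrent.ConcurrentHashMap, whose
iterators are weakly consistent and do not throw
ConcurrentModificationException when the map is modified during iteration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentMapSweepDemo {

    /**
     * Removes every endpoint whose last-heard timestamp is at least
     * timeoutMillis older than nowMillis. Returns the number removed.
     */
    static int sweepSilentEndpoints(ConcurrentMap<String, Long> lastHeardMillis,
                                    long nowMillis, long timeoutMillis) {
        int removed = 0;
        for (Map.Entry<String, Long> entry : lastHeardMillis.entrySet()) {
            if (nowMillis - entry.getValue() >= timeoutMillis) {
                // Safe during iteration: the iterator is weakly consistent.
                // The two-argument remove only succeeds if the value is
                // unchanged, so a fresher timestamp is not discarded.
                if (lastHeardMillis.remove(entry.getKey(), entry.getValue())) {
                    removed++;
                }
            }
        }
        return removed;
    }

    /** Builds a small hypothetical map with one silent endpoint. */
    static ConcurrentMap<String, Long> sampleState(long nowMillis) {
        ConcurrentMap<String, Long> state = new ConcurrentHashMap<>();
        state.put("192.168.252.61", nowMillis - 3_600_000L); // silent for an hour
        state.put("192.168.252.62", nowMillis - 1_000L);
        state.put("192.168.252.63", nowMillis - 2_000L);
        return state;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        ConcurrentMap<String, Long> state = sampleState(now);
        int removed = sweepSilentEndpoints(state, now, 3_600_000L);
        System.out.println("removed=" + removed + " remaining=" + state.keySet());
    }
}
```

The same loop run against a Hashtable could fail as in the trace above; with
a ConcurrentHashMap the sweep completes, at the cost of the iterator possibly
not reflecting updates made after it was created.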
