Posted to commits@cassandra.apache.org by "Wade Simmons (JIRA)" <ji...@apache.org> on 2010/07/16 22:29:52 UTC
[jira] Created: (CASSANDRA-1289) GossipTimerTask stops running if an Exception occurs
GossipTimerTask stops running if an Exception occurs
----------------------------------------------------
Key: CASSANDRA-1289
URL: https://issues.apache.org/jira/browse/CASSANDRA-1289
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.6.3, 0.6.2, 0.6.1, 0.6, 0.7
Reporter: Wade Simmons
The GossipTimerTask run() method has a try/catch around its body, but it re-throws every Exception as a RuntimeException. An unchecked exception escaping a TimerTask terminates the java.util.Timer thread, so the GossipTimerTask never runs again and the periodic gossip status checks stop.
Combine this problem with a bug like CASSANDRA-757 (not yet fixed in 0.6.x) and you get into a state where the server keeps running but gossip no longer occurs, which prevents node addition and removal.
I see two potential choices:
1) Log the error but don't re-throw it so that the GossipTimerTask will continue to run on its next interval.
2) Shut down the server, since continuing to run without gossip subtly breaks other functionality and knowledge of other nodes.
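The failure mode and choice 1 can be reproduced outside Cassandra with a small sketch. This is not the Cassandra code: the class name TimerFailureDemo, the method countRuns, the 20 ms period and the simulated failure are all made up for illustration. It schedules a task on a java.util.Timer that fails on its first run, and either re-throws the failure as a RuntimeException (the behaviour described above) or swallows it (choice 1).

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of Cassandra.
class TimerFailureDemo {
    /**
     * Schedules a periodic task whose first run fails, waits 300 ms,
     * and returns how many times the task was invoked in total.
     */
    static int countRuns(final boolean swallow) {
        final AtomicInteger runs = new AtomicInteger();
        Timer timer = new Timer("demo-timer", true); // daemon thread
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                int n = runs.incrementAndGet();
                try {
                    if (n == 1) {
                        throw new IllegalStateException("simulated gossip failure");
                    }
                } catch (Exception e) {
                    if (!swallow) {
                        // Behaviour from the report: the unchecked exception
                        // escapes run() and kills the Timer thread.
                        throw new RuntimeException(e);
                    }
                    // Choice 1: real code would log e here and carry on.
                }
            }
        }, 0, 20);
        try {
            Thread.sleep(300);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        timer.cancel();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("re-throwing: " + countRuns(false) + " run(s)");
        System.out.println("swallowing:  " + countRuns(true) + " run(s)");
    }
}
```

With re-throwing, the task is invoked exactly once and the JVM prints the uncaught exception's stack trace to stderr; with swallowing, the task keeps firing on each interval.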
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1289) GossipTimerTask stops running if an Exception occurs
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889554#action_12889554 ]
Jonathan Ellis commented on CASSANDRA-1289:
-------------------------------------------
Committed with changes, since it was simple:
1) uses .error instead of .warn
2) uses .error(message, exception) so that the entire stack trace is logged
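As a rough illustration of why the two-argument form matters, here is a sketch of a task body that logs at error level and passes the exception object along. It is not the committed patch: to stay dependency-free it swaps in java.util.logging, where Level.SEVERE and Logger.log(Level, String, Throwable) play the role of .error(message, exception). The names GossipLoggingSketch, doGossipRound, runOnce and capture are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch; not the Cassandra Gossiper.
class GossipLoggingSketch {
    static final Logger logger = Logger.getLogger("GossipLoggingSketch");

    /** Stand-in for one gossip round; always fails in this sketch. */
    static void doGossipRound() {
        throw new IllegalStateException("simulated gossip failure");
    }

    /** Periodic task body: log the failure with its throwable, never re-throw. */
    static void runOnce() {
        try {
            doGossipRound();
        } catch (Exception e) {
            // Passing e (not just e.getMessage()) keeps the stack trace in the log.
            logger.log(Level.SEVERE, "Gossip error", e);
        }
    }

    /** Runs the task once and returns the log records it produced. */
    static List<LogRecord> capture() {
        final List<LogRecord> records = new ArrayList<LogRecord>();
        Handler handler = new Handler() {
            @Override public void publish(LogRecord record) { records.add(record); }
            @Override public void flush() { }
            @Override public void close() { }
        };
        logger.addHandler(handler);
        try {
            runOnce();
        } finally {
            logger.removeHandler(handler);
        }
        return records;
    }

    public static void main(String[] args) {
        LogRecord record = capture().get(0);
        System.out.println(record.getLevel() + ": " + record.getMessage()
                + " / " + record.getThrown());
    }
}
```

Because runOnce() returns normally after logging, a Timer driving it would schedule the next run as usual.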
> GossipTimerTask stops running if an Exception occurs
> ----------------------------------------------------
>
> Key: CASSANDRA-1289
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1289
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.6, 0.6.1, 0.6.2, 0.6.3
> Reporter: Wade Simmons
> Assignee: Brandon Williams
> Fix For: 0.6.4
>
> Attachments: 1289.txt
[jira] Updated: (CASSANDRA-1289) GossipTimerTask stops running if an Exception occurs
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-1289:
----------------------------------------
Attachment: 1289.txt
[jira] Updated: (CASSANDRA-1289) GossipTimerTask stops running if an Exception occurs
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-1289:
----------------------------------------
Attachment: 1289.txt
Patch to catch the exception and log it, as suggested in CASSANDRA-757
> GossipTimerTask stops running if an Exception occurs
> ----------------------------------------------------
>
> Key: CASSANDRA-1289
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1289
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.6, 0.6.1, 0.6.2, 0.6.3, 0.7
> Reporter: Wade Simmons
> Attachments: 1289.txt
[jira] Updated: (CASSANDRA-1289) GossipTimerTask stops running if an Exception occurs
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-1289:
----------------------------------------
Attachment: (was: 1289.txt)