Posted to commits@cassandra.apache.org by "Anuj Wadehra (JIRA)" <ji...@apache.org> on 2015/09/04 14:44:46 UTC

[jira] [Commented] (CASSANDRA-8907) Raise GCInspector alerts to WARN

    [ https://issues.apache.org/jira/browse/CASSANDRA-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730733#comment-14730733 ] 

Anuj Wadehra commented on CASSANDRA-8907:
-----------------------------------------

Hi Joshua McKenzie, I have a different thought on this. I think this property should be enabled by default. That is, if the property is missing, we should fall back to a reasonably high GC warn limit, say 1000ms.
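
To illustrate the fallback described above, here is a minimal sketch, not the actual patch. The property name gc.warn.threshold.ms, the class name and both helper methods are hypothetical and are not taken from Cassandra; only the 1000ms default comes from the suggestion above.

```java
import java.util.Properties;

public class GcWarnThreshold {
    // Hypothetical property name, used for illustration only.
    public static final String PROP = "gc.warn.threshold.ms";
    // Fallback suggested above when the property is missing.
    public static final long DEFAULT_MS = 1000L;

    // Returns the configured threshold, or the default if the property is absent or blank.
    public static long resolveThreshold(Properties props) {
        String raw = props.getProperty(PROP);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MS;
        }
        return Long.parseLong(raw.trim());
    }

    // Picks the log level for a GC pause: WARN above the threshold, INFO otherwise.
    public static String levelFor(long pauseMs, long thresholdMs) {
        return pauseMs > thresholdMs ? "WARN" : "INFO";
    }

    public static void main(String[] args) {
        Properties props = new Properties(); // property deliberately left unset
        long threshold = resolveThreshold(props);
        System.out.println(levelFor(1835, threshold) + " GC pause of 1835 ms");
        System.out.println(levelFor(200, threshold) + " GC pause of 200 ms");
    }
}
```

With the property unset, a pause longer than 1000ms would be reported at WARN, so keyword-based log patrolling would pick it up even on clusters that never configured the limit.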

I think that, no matter what kind of application it is, there is always a GC WARN limit for it, considering the following factors:
1. Application throughput requirements / Service Level Agreements, OR
2. phi_convict_threshold: larger GC pauses may lead to nodes being marked down, which is unacceptable for any application. I think that no application can afford unreasonably high GC pauses, say >60 secs.

Many production systems patrol their logs by searching for keywords such as “ERROR” and “WARN”, so any unreasonably high GC pause should be logged at WARN level; otherwise nodes will be marked down without any warning.



> Raise GCInspector alerts to WARN
> --------------------------------
>
>                 Key: CASSANDRA-8907
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8907
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Adam Hattrell
>              Labels: patch
>         Attachments: cassnadra-8907.patch
>
>
> I'm fairly regularly running into folks wondering why their applications are reporting down nodes.  Yet they report that when they grepped the logs, no WARNs or ERRORs were listed.
> Nine times out of ten, when I look through the logs I see a ton of ParNew or CMS GC pauses similar to the following:
> INFO [ScheduledTasks:1] 2013-03-07 18:44:46,795 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 1835 ms for 3 collections, 2606015656 used; max is 10611589120
> INFO [ScheduledTasks:1] 2013-03-07 19:45:08,029 GCInspector.java (line 122) GC for ParNew: 9866 ms for 8 collections, 2910124308 used; max is 6358564864
> To my mind these should be WARNs, as they have the potential to significantly impact the cluster's performance as a whole.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)