Posted to commits@helix.apache.org by "Vinoth Chandar (JIRA)" <ji...@apache.org> on 2015/05/03 19:25:06 UTC

[jira] [Updated] (HELIX-594) Misleading NPE trying to reconnect, upon ZK Timeout

     [ https://issues.apache.org/jira/browse/HELIX-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HELIX-594:
---------------------------------
    Description: 
This is a safety feature: Helix detects GC and automatically disconnects from the cluster. Unfortunately, in some cases it surfaces as an NPE.

We should probably record the reason for disabling in the instance config. Currently we just disable the node; we could add an attribute such as DISABLE_CAUSE:"TOO MANY DISCONNECTS FROM ZK. CHECK JAVA GC LOG".
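
A minimal sketch of what recording the cause could look like, assuming the instance config's simple fields can be treated as a string-to-string map. The class DisableCauseSketch, the method disableWithCause, and the use of "HELIX_ENABLED" as the enabled flag are assumptions made for illustration; only the DISABLE_CAUSE attribute name and its message come from the proposal above. This is not the Helix API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DisableCauseSketch {
    // Assumed name of the enabled flag in the instance config (illustrative only).
    static final String ENABLED_FIELD = "HELIX_ENABLED";
    // Attribute name taken from the proposal in this issue.
    static final String DISABLE_CAUSE_FIELD = "DISABLE_CAUSE";

    /**
     * Returns a copy of the given simple fields with the instance marked
     * disabled and the reason recorded next to the flag.
     * The input map is left unmodified.
     */
    static Map<String, String> disableWithCause(Map<String, String> simpleFields, String cause) {
        Map<String, String> updated = new LinkedHashMap<>(simpleFields);
        updated.put(ENABLED_FIELD, "false");
        updated.put(DISABLE_CAUSE_FIELD, cause);
        return updated;
    }

    public static void main(String[] args) {
        // Stand-in for an instance config that is currently enabled.
        Map<String, String> config = new LinkedHashMap<>();
        config.put(ENABLED_FIELD, "true");

        Map<String, String> disabled =
            disableWithCause(config, "TOO MANY DISCONNECTS FROM ZK. CHECK JAVA GC LOG");
        System.out.println(disabled);
    }
}
```

Keeping the cause in the same record as the enabled flag means an operator inspecting the instance config sees both the state and the reason in one place, without having to correlate with the NPE in the logs.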


  was:
We always get the following errors on startup (#1 looks like the leader elector for the controller). Ours is a FULL_AUTO, embedded-controller Helix configuration.

1. org.apache.helix.manager.zk.ZkBaseDataAccessor.doCreate(ZkBaseDataAccessor.java:138)
Node already exists. path: /streamio/STATEMODELDEFS/STORAGE_DEFAULT_SM_SCHEMATA


2. org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:130) 
Skip processing callbacks for listener: org.apache.helix.messaging.handling.HelixTaskExecutor@1a9f9f09, path: /streamio/INSTANCES/datapipe11-sjc1-controller-/MESSAGES, expected types: [CALLBACK, FINALIZE] but was INIT


3. org.apache.helix.healthcheck.ParticipantHealthReportTask.stop(ParticipantHealthReportTask.java:67)
ParticipantHealthReportTimerTask already stopped
org.apache.helix.healthcheck.ParticipantHealthReportTask in stop at line 67


> Misleading NPE trying to reconnect, upon ZK Timeout
> ---------------------------------------------------
>
>                 Key: HELIX-594
>                 URL: https://issues.apache.org/jira/browse/HELIX-594
>             Project: Apache Helix
>          Issue Type: Improvement
>          Components: helix-core
>    Affects Versions: 0.6.5
>            Reporter: Vinoth Chandar
>            Priority: Minor
>             Fix For: master
>
>
> This is a safety feature: Helix detects GC and automatically disconnects from the cluster. Unfortunately, in some cases it surfaces as an NPE.
> We should probably record the reason for disabling in the instance config. Currently we just disable the node; we could add an attribute such as DISABLE_CAUSE:"TOO MANY DISCONNECTS FROM ZK. CHECK JAVA GC LOG".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)