Posted to issues@ignite.apache.org by "hahadada (JIRA)" <ji...@apache.org> on 2016/09/14 12:47:20 UTC
[jira] [Updated] (IGNITE-3898) Add more information for Debugging

     [ https://issues.apache.org/jira/browse/IGNITE-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hahadada updated IGNITE-3898:
-----------------------------
    Description: 
Hi,

We see the following warning in our grid quite frequently.


2016-09-09 09:36:12.458 BST [WARN] -
thread="tcp-disco-msg-worker-#2%%" -
class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi" - Timed out
waiting for message to be read (most probably, the reason is in long GC
pauses on remote node) [curTimeout=20000]

What is missing from this message are the details of the remote node (id/host/port).

I believe that if the node is trying to connect and times out, it also knows which remote node that is. Logging those details would let one go straight to that node and check its memory status. For a larger grid, checking the logs and memory status of every remote node is difficult.

Can we please add the node details, along with the user attributes of the remote node, to the same log message? This way it will be easier to identify which remote node caused the issue in a large grid.

As for the details, the default nodeid/hostname/port would suffice; however, on their own they would still make it less obvious which node is which.

Can we add a flag -DPrintTheseUserAttributesAboutRemoteNode="<comma separated list of attributes>" (apologies for the very simple name)
or -DPrintAlluserAttributesAboutRemoteNode? This way one can identify nodes more cleanly in the same log message.
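
To make the request concrete, here is a minimal sketch of how the two proposed flags could drive the extra text appended to the warning. Everything in it is an assumption for illustration: the class RemoteNodeLogDetails, its methods, and the output layout are hypothetical and are not part of Ignite, and the two property names are only the ones suggested above, not existing Ignite flags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Ignite. */
final class RemoteNodeLogDetails {
    // Property names come from the proposal in this issue; they are not existing Ignite flags.
    static final String SELECTED_ATTRS_PROP = "PrintTheseUserAttributesAboutRemoteNode";
    static final String ALL_ATTRS_PROP = "PrintAlluserAttributesAboutRemoteNode";

    private RemoteNodeLogDetails() {}

    /**
     * Builds a suffix with the remote node's id/host/port and, optionally, user attributes.
     * If printAll is true, every attribute is printed; otherwise only the names listed in
     * selectedAttrs (comma separated) that exist on the node are printed.
     */
    static String format(String nodeId, String host, int port,
                         Map<String, ?> userAttrs, String selectedAttrs, boolean printAll) {
        StringBuilder sb = new StringBuilder();
        sb.append("[rmtNodeId=").append(nodeId)
          .append(", rmtHost=").append(host)
          .append(", rmtPort=").append(port);

        List<String> names = new ArrayList<>();
        if (printAll) {
            names.addAll(userAttrs.keySet());
        } else if (selectedAttrs != null) {
            for (String name : selectedAttrs.split(",")) {
                String trimmed = name.trim();
                // Skip empty entries and attributes the remote node does not have.
                if (!trimmed.isEmpty() && userAttrs.containsKey(trimmed)) {
                    names.add(trimmed);
                }
            }
        }

        if (!names.isEmpty()) {
            sb.append(", attrs={");
            for (int i = 0; i < names.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(names.get(i)).append('=').append(userAttrs.get(names.get(i)));
            }
            sb.append('}');
        }
        return sb.append(']').toString();
    }

    /** Reads the proposed -D flags; a bare -DPrintAlluserAttributesAboutRemoteNode counts as set. */
    static String fromSystemProperties(String nodeId, String host, int port, Map<String, ?> userAttrs) {
        return format(nodeId, host, port, userAttrs,
            System.getProperty(SELECTED_ATTRS_PROP),
            System.getProperty(ALL_ATTRS_PROP) != null);
    }
}
```

With neither flag set, only the node id, host, and port are printed, which matches the "default" level of detail described above. The attribute names are filtered against what the remote node actually exposes, so a typo in the list does not produce empty entries in the log.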

Thanks.





> Add more information for Debugging
> ----------------------------------
>
>                 Key: IGNITE-3898
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3898
>             Project: Ignite
>          Issue Type: Improvement
>          Components: clients
>    Affects Versions: 1.6
>            Reporter: hahadada


