Posted to issues@ignite.apache.org by "Yakov Zhdanov (JIRA)" <ji...@apache.org> on 2017/05/03 18:59:04 UTC

[jira] [Commented] (IGNITE-5155) Need to improve stats dump on exchange timeout

    [ https://issues.apache.org/jira/browse/IGNITE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995440#comment-15995440 ] 

Yakov Zhdanov commented on IGNITE-5155:
---------------------------------------

[~zstan], since you are currently working on IGNITE-5125, I thought this issue would be interesting for you. Reassign it to me if you disagree.

Thanks!



> Need to improve stats dump on exchange timeout
> ----------------------------------------------
>
>                 Key: IGNITE-5155
>                 URL: https://issues.apache.org/jira/browse/IGNITE-5155
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Yakov Zhdanov
>            Assignee: Stanilovsky Evgeny
>             Fix For: 2.1
>
>
> Currently, on large topologies, the info dumped on "Failed to wait for partition map exchange" (org/apache/ignite/internal/processors/cache/GridCachePartitionExchangeManager.java:1713) floods the log, and we need to reduce the amount of information dumped.
> 1. Reduce the output for exchange futures that are already done. Keep the event, topology version, server count, and client count (more?).
> 2. Do not dump the whole communication stats. Instead, send a message to the exchange coordinator asking for its status and for the numbers of received and acked messages from the local node.
> 3. We can consider sending a new message from a cache node to the coordinator that may signal a problem on that node (e.g. unreleased tx locks or partitions that are still renting). The coordinator may include this info in its status, so that every Ignite node can point to the problem node in its logs.
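
Item 1 could be sketched roughly as follows. This is only an illustration of the idea, not Ignite code: ExchangeDumpSketch, ExchangeInfo, and all of their fields are hypothetical stand-ins for whatever the exchange future actually exposes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these types are illustrative stand-ins, not Ignite classes.
final class ExchangeDumpSketch {
    /** Simplified, hypothetical view of one exchange future. */
    static final class ExchangeInfo {
        final String event;
        final long topVer;
        final int servers;
        final int clients;
        final boolean done;
        final String details; // Stands in for the verbose state dumped today.

        ExchangeInfo(String event, long topVer, int servers, int clients, boolean done, String details) {
            this.event = event;
            this.topVer = topVer;
            this.servers = servers;
            this.clients = clients;
            this.done = done;
            this.details = details;
        }
    }

    /** Builds one log line: short form for done futures, full form for pending ones. */
    static String describe(ExchangeInfo x) {
        String brief = "evt=" + x.event + ", topVer=" + x.topVer
            + ", servers=" + x.servers + ", clients=" + x.clients;

        return x.done
            ? "done [" + brief + "]"
            : "pending [" + brief + ", details=" + x.details + "]";
    }

    /** Builds the whole dump, one line per exchange future, in the given order. */
    static List<String> dump(List<ExchangeInfo> futs) {
        List<String> lines = new ArrayList<>(futs.size());

        for (ExchangeInfo x : futs)
            lines.add(describe(x));

        return lines;
    }
}
```

With this shape, the log grows by one short line per completed exchange, and only the futures that are still pending carry their full details.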



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)