Posted to issues@ignite.apache.org by "Ivan Daschinskiy (Jira)" <ji...@apache.org> on 2020/11/03 16:46:00 UTC

[jira] [Updated] (IGNITE-13564) Improve SYSTEM_WORKER_BLOCKED reporting.

     [ https://issues.apache.org/jira/browse/IGNITE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Daschinskiy updated IGNITE-13564:
--------------------------------------
    Description: 
Currently, reporting of blocked system threads has two major drawbacks.

1. Because a blocked system worker is detected by another thread, the failure handler does not receive full information about the problem. {{FailureContext}} has only two fields: {{type}} and {{err}}. The Throwable {{err}} is created in the detector thread's flow, so the context of the original problem is lost.
2. Currently, due to the implementation, the stack trace of the blocked thread printed in {{org.apache.ignite.internal.worker.WorkersRegistry#onIdle}} is incomplete.


Together, these two drawbacks can lead to a complete loss of information about the blocked system thread.
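
To illustrate the first drawback, here is a minimal, self-contained sketch. It is not Ignite code: the class {{DetectorContextDemo}} and its methods are hypothetical names chosen for this example. It shows that a Throwable created on a detector thread carries the detector's stack trace, while the frames where the worker is actually stuck are only visible through the worker thread itself.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical demo; none of these names are Ignite API. */
final class DetectorContextDemo {
    /** Stands in for the code in which a system worker is stuck. */
    static void blockedSection(CountDownLatch entered, CountDownLatch release) {
        entered.countDown();

        try {
            release.await();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stands in for the detector: the Throwable is created on the calling (detector) thread. */
    static Throwable detectBlocked(Thread worker) {
        return new IllegalStateException("Blocked system worker detected: " + worker.getName());
    }

    /** Returns true if any frame of the trace belongs to the given method. */
    static boolean mentions(StackTraceElement[] trace, String method) {
        for (StackTraceElement frame : trace) {
            if (frame.getMethodName().equals(method))
                return true;
        }

        return false;
    }

    /** Returns {err mentions blockedSection, worker stack mentions blockedSection}. */
    static boolean[] run() {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread worker = new Thread(() -> blockedSection(entered, release), "sys-worker-0");
        worker.start();

        try {
            entered.await(); // The worker is now inside blockedSection.

            Throwable err = detectBlocked(worker);

            return new boolean[] {
                mentions(err.getStackTrace(), "blockedSection"),
                mentions(worker.getStackTrace(), "blockedSection")
            };
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException(e);
        }
        finally {
            release.countDown(); // Always let the worker finish.
        }
    }

    public static void main(String[] args) {
        boolean[] res = run();

        System.out.println("err mentions blockedSection: " + res[0]);
        System.out.println("worker stack mentions blockedSection: " + res[1]);
    }
}
```

In this sketch the first check is expected to be false and the second true: a handler that receives only {{err}} has no way to see where the worker is stuck.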

I suggest:
1. Add another parameter to {{FailureContext}}, namely {{worker}}.
2. Fix thread dump printing.
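
A rough sketch of what these two suggestions could look like, under stated assumptions. The class {{WorkerAwareFailureSketch}}, its nested {{Context}} type, and the use of a plain {{Thread}} for the {{worker}} field are all hypothetical and are not the actual Ignite API or the actual fix. The only point is that once the context carries the worker, a handler can print every frame of the blocked thread's stack.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical sketch; the names and types below are assumptions, not the Ignite API. */
final class WorkerAwareFailureSketch {
    /** Simplified stand-in for a failure context carrying the proposed extra field. */
    static final class Context {
        final String type;
        final Throwable err;
        final Thread worker; // Assumption: a plain Thread stands in for the worker.

        Context(String type, Throwable err, Thread worker) {
            this.type = type;
            this.err = err;
            this.worker = worker;
        }
    }

    /** Formats every frame of the worker's current stack, without truncation. */
    static String fullStack(Thread worker) {
        StringBuilder sb = new StringBuilder();

        sb.append("Thread [name=").append(worker.getName())
            .append(", state=").append(worker.getState()).append("]\n");

        for (StackTraceElement frame : worker.getStackTrace())
            sb.append("    at ").append(frame).append('\n');

        return sb.toString();
    }

    /** Stand-in for a failure handler that can report the blocked worker itself. */
    static String report(Context ctx) {
        return "Failure [type=" + ctx.type + ", err=" + ctx.err.getMessage() + "]\n"
            + fullStack(ctx.worker);
    }

    /** Builds a deep stack and then blocks, to imitate a stuck worker. */
    static void recurse(int depth, CountDownLatch entered, CountDownLatch release) {
        if (depth > 0) {
            recurse(depth - 1, entered, release);

            return;
        }

        entered.countDown();

        try {
            release.await();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Starts a worker blocked under the given recursion depth and returns the report. */
    static String demo(int depth) {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread worker = new Thread(() -> recurse(depth, entered, release), "sys-worker-0");
        worker.start();

        try {
            entered.await(); // The worker is now blocked at the bottom of the recursion.

            Throwable err = new IllegalStateException("Blocked system worker detected");

            return report(new Context("SYSTEM_WORKER_BLOCKED", err, worker));
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException(e);
        }
        finally {
            release.countDown(); // Always let the worker finish.
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(20));
    }
}
```

The stack is read at handling time through {{Thread#getStackTrace}}, so the report reflects where the worker is blocked rather than where the detector created {{err}}.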

  was:
Currently, reporting of blocked system threads has major drawbacks.

1. Because a blocked system worker is detected by another thread, the failure handler does not receive full information about the problem. {{FailureContext}} has only two fields: {{type}} and {{err}}. The Throwable {{err}} is created in the detector thread's flow, so the context of the original problem is lost.
2. Currently, due to the implementation, the stack trace of the blocked thread printed in {{org.apache.ignite.internal.worker.WorkersRegistry#onIdle}} is incomplete.
3. The current approach doesn't work when there is only one thread in the registry. This is not checked and can cause a single thread to loop infinitely, calling {{onIdle}}. Fixed in 

These drawbacks can lead to a complete loss of information about the blocked system thread.

I suggest:
1. Add another parameter to {{FailureContext}}, namely {{worker}}.
2. Fix thread dump printing.
3. Add an assertion for the case when there is only one system thread in the registry.


> Improve SYSTEM_WORKER_BLOCKED reporting.
> ----------------------------------------
>
>                 Key: IGNITE-13564
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13564
>             Project: Ignite
>          Issue Type: Improvement
>    Affects Versions: 2.9, 2.8.1
>            Reporter: Ivan Daschinskiy
>            Priority: Major
>             Fix For: 2.10
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, reporting of blocked system threads has two major drawbacks.
> 1. Because a blocked system worker is detected by another thread, the failure handler does not receive full information about the problem. {{FailureContext}} has only two fields: {{type}} and {{err}}. The Throwable {{err}} is created in the detector thread's flow, so the context of the original problem is lost.
> 2. Currently, due to the implementation, the stack trace of the blocked thread printed in {{org.apache.ignite.internal.worker.WorkersRegistry#onIdle}} is incomplete.
> Together, these two drawbacks can lead to a complete loss of information about the blocked system thread.
> I suggest:
> 1. Add another parameter to {{FailureContext}}, namely {{worker}}.
> 2. Fix thread dump printing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)