Posted to issues@ignite.apache.org by "Mikhail Cherkasov (JIRA)" <ji...@apache.org> on 2017/09/12 10:21:00 UTC

[jira] [Assigned] (IGNITE-6323) Ignite node not stopping after segmentation

     [ https://issues.apache.org/jira/browse/IGNITE-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Cherkasov reassigned IGNITE-6323:
-----------------------------------------

    Assignee: Mikhail Cherkasov

> Ignite node not stopping after segmentation
> -------------------------------------------
>
>                 Key: IGNITE-6323
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6323
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 2.0
>            Reporter: Mikhail Cherkasov
>            Assignee: Mikhail Cherkasov
>             Fix For: 2.3
>
>         Attachments: thread-dump-9-1.txt, thread-dump-9-2.txt, thread-dump-9-4.txt
>
>
> The problem was found by a user and described on the user list:
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-node-not-stopping-after-segmentation-td16773.html
> Copy of the message:
> """
> I have a follow-up question on segmentation from my previous post. The issue I am trying to resolve is that the Ignite node does not stop on the segmented member. Here is some brief information about my application.
>  
> I have embedded Ignite into my application and am using it for distributed caches. I am running an Ignite cluster with two nodes in my lab environment. In the current setup, the application receives about 1 million data points every minute. I put the data into an Ignite distributed cache using a data streamer. This way the data gets distributed among the members, and each member further processes the data. The application also uses other distributed caches while processing the data.
>  
> When a member node gets segmented, it does not stop. I get the BEFORE_NODE_STOP event, but nothing happens after that. The node hangs in some unstable state. I suspect that when the node is trying to stop, there is data in the streamer's buffers that needs to be sent to other members. Because the node is segmented, it is not able to flush or drop the data. The application is also trying to access caches while the node is stopping, which also causes a deadlock.
>  
> I have tried a few things to make it work:
> 1. Letting the node stop after segmentation, which is the default behavior. But the node gets stuck.
> 2. Setting the segmentation policy to NOOP. The plan was to stop the node manually after some cleanup. This way, when I get the segmented event, I first try to close the data streamer instance and the cache instance. But when I try to close the data streamer, the close() call gets stuck. I was calling close with true to drop everything in the streamer, but that did not help.
> 3. On receiving the segmentation event, restricting the application from accessing any caches, then stopping the node. Even then the node gets stuck.
>  
> I have attached a few thread dumps here. In each of them, one thread is trying to stop the node but gets into a waiting state.
> """
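
For readers who want to picture the manual shutdown sequence the quoted message describes (block cache access, close the streamer with cancel set to true, then stop the node), here is a minimal JDK-only sketch. It is not taken from the reporter's application and it does not use Ignite itself: Streamer and Node are hypothetical stand-ins for IgniteDataStreamer.close(true) and Ignition.stop(name, true), and the class name, the onSegmented helper, and the timeout are assumptions added for illustration. Running the sequence on a separate daemon thread with a timeout is one possible way to observe a hang like the one reported, not a fix for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class SegmentationShutdownSketch {
    /** Hypothetical stand-in for IgniteDataStreamer; only close(cancel) is modeled. */
    public interface Streamer { void close(boolean cancel); }

    /** Hypothetical stand-in for the node; stop(cancel) mirrors Ignition.stop(name, cancel). */
    public interface Node { void stop(boolean cancel); }

    /** Set when the segmentation event arrives; application code would check it before using caches. */
    public static final AtomicBoolean segmented = new AtomicBoolean(false);

    /**
     * Runs the manual shutdown off the calling (event) thread and waits at most timeoutMs.
     * Returns true if the whole sequence finished, false if it did not finish in time.
     */
    public static boolean onSegmented(Streamer streamer, Node node, long timeoutMs) throws Exception {
        segmented.set(true); // step 1: stop the application from touching caches

        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "segmentation-shutdown");
            t.setDaemon(true); // a stuck close() should not keep the JVM alive
            return t;
        });
        Future<?> shutdown = exec.submit(() -> {
            streamer.close(true); // step 2: cancel=true asks the streamer to drop buffered data
            node.stop(true);      // step 3: stop the node, cancelling running jobs
        });
        try {
            shutdown.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            shutdown.cancel(true); // interrupt the stuck thread and report the hang
            return false;
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> calls = new ArrayList<>();
        boolean finished = onSegmented(
            cancel -> calls.add("close(" + cancel + ")"),
            cancel -> calls.add("stop(" + cancel + ")"),
            1000);
        System.out.println(finished + " " + calls);
    }
}
```

In a real deployment the two lambdas would wrap the actual streamer and node, and a false return value would be the cue to take a thread dump like the ones attached to this issue.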


