Posted to user@ignite.apache.org by Biren Shah <Bi...@servicenow.com> on 2017/09/06 21:41:56 UTC

Ignite node not stopping after segmentation

Hi,

I have a follow-up question on segmentation from my previous post<http://apache-ignite-users.70518.x6.nabble.com/Cluster-segmentation-td16314.html#a16411>. The issue I am trying to resolve is that the Ignite node does not stop after it has been segmented. Here is some brief information on my application.

I have embedded Ignite into my application and am using it for distributed caches. I am running an Ignite cluster in my lab environment, with two nodes in the cluster. In the current setup, the application receives about 1 million data points every minute. I put the data into an Ignite distributed cache using a data streamer. This way the data gets distributed among the members, and each member processes it further. The application also uses other distributed caches while processing the data.

When a member node gets segmented, it does not stop. I get the BEFORE_NODE_STOP event, but nothing happens after that; the node hangs in some unstable state. I suspect that when the node is trying to stop, there is data in the streamer's buffers that still needs to be sent to other members. Because the node is segmented, it is not able to flush or drop that data. The application is also trying to access caches while the node is stopping, which also leads to a deadlock.

I have tried a few things to make it work:

  1.  Letting the node stop after segmentation, which is the default behavior. But the node gets stuck.
  2.  Setting the segmentation policy to NOOP. The plan was to stop the node manually after some cleanup.
     *   This way, when I get the segmented event, I first try to close the data streamer instance and the cache instance. But when I try to close the data streamer, the close() call gets stuck. I was calling close with true to drop everything in the streamer, but that did not help.
     *   On receiving the segmentation event, restrict the application from accessing any caches, then stop the node. Even then the node gets stuck.
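The second variant under item 2 (block cache access first, then stop the node) can be sketched in plain Java roughly as follows. This is only an illustration under assumptions, not the code from my application: the class name SegmentationShutdown and its methods are hypothetical, and the stop action is passed in as a Runnable, which in an Ignite application might wrap Ignition.stop(name, true). The sketch runs the stop on its own thread and waits for a bounded time, so a hung stop is reported instead of blocking the caller forever. It does not explain or fix the hang itself.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper: closes a gate on cache access, then runs the stop action with a timeout. */
final class SegmentationShutdown {
    private final AtomicBoolean segmented = new AtomicBoolean(false);
    private final Runnable stopNode; // assumption: wraps the real node stop call

    SegmentationShutdown(Runnable stopNode) {
        this.stopNode = stopNode;
    }

    /** Application code checks this before touching any cache or streamer. */
    boolean cacheAccessAllowed() {
        return !segmented.get();
    }

    /**
     * Closes the gate and runs the stop action on a separate daemon thread.
     * Blocks for at most the given timeout. Returns true only if the stop
     * action finished in time; false if it timed out, failed, or was already requested.
     */
    boolean onSegmented(long timeout, TimeUnit unit) {
        if (!segmented.compareAndSet(false, true))
            return false; // a stop was already requested

        ExecutorService stopper = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "node-stopper");
            t.setDaemon(true); // a hung stop must not keep the JVM alive
            return t;
        });
        Future<?> result = stopper.submit(stopNode);
        try {
            result.get(timeout, unit);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            stopper.shutdownNow(); // interrupts the stop thread if it is still running
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real stop call: it only prints a message.
        SegmentationShutdown s = new SegmentationShutdown(() -> System.out.println("stopping node"));
        System.out.println("access allowed before: " + s.cacheAccessAllowed());
        boolean stopped = s.onSegmented(5, TimeUnit.SECONDS);
        System.out.println("stopped in time: " + stopped
            + ", access allowed after: " + s.cacheAccessAllowed());
    }
}
```

Because onSegmented blocks for up to the timeout, it is probably better called from an application thread than directly inside the event listener.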

I have attached a few thread dumps here. In each of them, one thread is trying to stop the node but ends up in a waiting state.

Thanks,
Biren

Re: Ignite node not stopping after segmentation

Posted by luqmanahmad <lu...@gmail.com>.
See [1] for a free network segmentation plugin.

[1] https://github.com/luqmanahmad/ignite-plugins



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Ignite node not stopping after segmentation

Posted by Mikhail <mi...@gmail.com>.
Hi Biren,

Ignite <http://ignite.apache.org> has a good doc
<http://apacheignite.readme.io> for JVM tuning:
https://apacheignite.readme.io/docs/jvm-and-system-tuning

I think G1 can help to avoid long pauses; you can also increase the failure
detection timeout:
https://apacheignite.readme.io/docs/cluster-config#failure-detection-timeout
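As a rough illustration of the two suggestions above, here is a minimal sketch, assuming Ignite 2.x is on the classpath. The class name TunedNode and the 30-second value are purely illustrative and are not a recommendation; a suitable timeout depends on the GC pauses actually observed. G1 is selected on the JVM command line (for example with -XX:+UseG1GC), not through the Ignite configuration.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

// Launch with, for example: java -XX:+UseG1GC -cp <classpath> TunedNode
public class TunedNode {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Illustrative value only (milliseconds). A larger timeout tolerates longer
        // pauses before a node is considered failed, at the cost of slower detection
        // of nodes that really are down.
        cfg.setFailureDetectionTimeout(30_000L);

        Ignite ignite = Ignition.start(cfg);
        System.out.println("Failure detection timeout (ms): "
            + ignite.configuration().getFailureDetectionTimeout());
    }
}
```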

Thanks,
Mike.

Re: Ignite node not stopping after segmentation

Posted by Biren Shah <Bi...@servicenow.com>.
I updated the version, and so far I have not faced the issue, so it looks like it has been fixed in 2.1. But we are still trying to figure out why segmentation happens so frequently, and I would like to continue the discussion to find the cause.

We are using Ignite for distributed caches. The cache sizes are not huge; total cached data is < 2G. The data does not change frequently: we load the caches at application start and then just keep reading them.

So far, what we have noticed is that GC pauses cause the segmentation. The application receives a huge amount of data every minute, and with all the other processing we create lots of objects. Most of the allocation is outside Ignite, but even then the Ignite node segments out.
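One coarse way to check the GC observation from inside the application is to sample the JVM's own collector counters and compare the GC time accrued per interval against a budget. The following is a minimal sketch using only the standard java.lang.management API. The class name GcPauseReport and the 10-second budget are assumptions for illustration; the budget would normally be set to whatever failure detection timeout is configured. Note that getCollectionTime() reports approximate accumulated collection time, which for concurrent collectors is not the same as stop-the-world pause time, so GC logs remain the more precise source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: samples accumulated GC time reported by the JVM. */
final class GcPauseReport {
    /** Sum of accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 means the collector does not report it
            if (t > 0)
                total += t;
        }
        return total;
    }

    /** True if the GC time accrued between two samples is larger than the budget. */
    static boolean exceedsBudget(long beforeMs, long afterMs, long budgetMs) {
        return afterMs - beforeMs > budgetMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long budgetMs = 10_000; // assumption: replace with the configured failure detection timeout
        long prev = totalGcMillis();
        for (int i = 0; i < 3; i++) {
            Thread.sleep(1000);
            long now = totalGcMillis();
            System.out.println("GC time in last interval (ms): " + (now - prev)
                + (exceedsBudget(prev, now, budgetMs) ? "  <-- over budget" : ""));
            prev = now;
        }
    }
}
```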

We are planning to go to production in a month. In the current state we are not able to scale. Any suggestions on fine-tuning Ignite, or things we should watch out for when using Ignite?

Thanks,
Biren

From: Mikhail Cherkasov <mc...@gridgain.com>
Reply-To: "user@ignite.apache.org" <us...@ignite.apache.org>
Date: Sunday, September 10, 2017 at 10:27 AM
To: "user@ignite.apache.org" <us...@ignite.apache.org>
Subject: Re: Ignite node not stopping after segmentation

Biren, could you please try to reproduce the issue with 2.1?

On Fri, Sep 8, 2017 at 8:20 PM, Biren Shah <Bi...@servicenow.com> wrote:
We are on 2.0.

Thanks,
Biren

On 9/8/17, 9:51 AM, "Mikhail" <mi...@gmail.com> wrote:

    Hi all,

    I created a  ticket for further investigations:

    https://issues.apache.org/jira/browse/IGNITE-6323

     Biren, could you please confirm that you use the latest 2.1 version?

    Thanks,
    Michael.

--
Thanks,
Mikhail.

Re: Ignite node not stopping after segmentation

Posted by Mikhail Cherkasov <mc...@gridgain.com>.
Biren, could you please try to reproduce the issue with 2.1?

On Fri, Sep 8, 2017 at 8:20 PM, Biren Shah <Bi...@servicenow.com>
wrote:

> We are on 2.0.
>
> Thanks,
> Biren
>
> On 9/8/17, 9:51 AM, "Mikhail" <mi...@gmail.com> wrote:
>
>     Hi all,
>
>     I created a  ticket for further investigations:
>
>     https://issues.apache.org/jira/browse/IGNITE-6323
>
>      Biren, could you please confirm that you use the latest 2.1 version?
>
>     Thanks,
>     Michael.


-- 
Thanks,
Mikhail.

Re: Ignite node not stopping after segmentation

Posted by Biren Shah <Bi...@servicenow.com>.
We are on 2.0.

Thanks,
Biren

On 9/8/17, 9:51 AM, "Mikhail" <mi...@gmail.com> wrote:

    Hi all,
    
    I created a  ticket for further investigations:
    
    https://issues.apache.org/jira/browse/IGNITE-6323
    
     Biren, could you please confirm that you use the latest 2.1 version?
    
    Thanks,
    Michael.


Re: Ignite node not stopping after segmentation

Posted by Mikhail <mi...@gmail.com>.
Hi all,

I created a  ticket for further investigations:

https://issues.apache.org/jira/browse/IGNITE-6323

 Biren, could you please confirm that you use the latest 2.1 version?

Thanks,
Michael.

Re: Ignite node not stopping after segmentation

Posted by Yakov Zhdanov <yz...@apache.org>.
Thanks for the report. I think we need to file a ticket and start an
investigation. Can anyone from the community pick this up?

--Yakov