Posted to user@ignite.apache.org by Anil <an...@gmail.com> on 2017/02/17 15:04:00 UTC

Ignite Cache Stopped

Hi,

We noticed that whenever long-running queries are fired, nodes go out of
the topology and the entire Ignite cluster goes down.

In my case, a single filter criterion could match 5L (500,000) records, so
each API request fetches a page of 250 records. As the page number increases,
the query execution time grows and the entire cluster goes down.

Is https://issues.apache.org/jira/browse/IGNITE-4003 related to this?

Can we set separate thread pools for query execution, compute jobs and
other services instead of the common public thread pool?

Thanks
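
For reference, recent Ignite versions do expose separate pool sizes on
IgniteConfiguration. A minimal sketch follows (the sizes are illustrative,
and setQueryThreadPoolSize may not exist in older releases):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PoolSizingExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Public pool: compute jobs and other general closures.
        cfg.setPublicThreadPoolSize(16);

        // Query pool: SQL and scan query execution.
        cfg.setQueryThreadPoolSize(8);

        // System pool: internal cache messages.
        cfg.setSystemThreadPoolSize(16);

        try (Ignite ignite = Ignition.start(cfg)) {
            // Node runs with the pool sizes above; long-running queries then
            // compete only within the query pool, not with compute jobs.
        }
    }
}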

Re: Ignite Cache Stopped

Posted by Andrey Gura <ag...@apache.org>.
I think it is just the H2 wrapper for string values.

On Tue, Feb 21, 2017 at 8:21 AM, Anil <an...@gmail.com> wrote:
> Thanks Andrey.
>
> I see the node going down even though the GC log looks good. I will try to reproduce.
>
> May I know what the org.h2.value.ValueString objects in the attached
> screenshot are?
>
> Thanks.

Re: Ignite Cache Stopped

Posted by Anil <an...@gmail.com>.
Thanks Andrey.

I see the node going down even though the GC log looks good. I will try to reproduce.

May I know what the org.h2.value.ValueString objects in the attached
screenshot are?

Thanks.

On 20 February 2017 at 18:37, Andrey Gura <ag...@apache.org> wrote:

> Anil,
>
> No, it doesn't. Only the client should leave the topology in this case.
>

Re: Ignite Cache Stopped

Posted by Andrey Gura <ag...@apache.org>.
Anil,

No, it doesn't. Only the client should leave the topology in this case.
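
As a hedged sketch of the knobs involved (version dependent; the timeout
value is only illustrative): the client is started in client mode, and
clientFailureDetectionTimeout controls how long a client may stall, for
example in a GC pause, before the servers drop only that client from the
topology.

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ClientModeExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Client node: holds no data, so only it leaves the topology if it
        // becomes unresponsive during a long GC pause.
        cfg.setClientMode(true);

        // How long servers tolerate an unresponsive client, in milliseconds
        // (illustrative value; not available in very old Ignite versions).
        cfg.setClientFailureDetectionTimeout(30_000);

        try (Ignite client = Ignition.start(cfg)) {
            // Run queries and compute jobs from the client here.
        }
    }
}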

On Mon, Feb 20, 2017 at 3:44 PM, Anil <an...@gmail.com> wrote:
> Hi Andrey,
>
> Does GC on the Ignite client impact the Ignite cluster topology?
>
> Thanks
>

Re: Ignite Cache Stopped

Posted by Anil <an...@gmail.com>.
Hi Andrey,

Does GC on the Ignite client impact the Ignite cluster topology?

Thanks

On 17 February 2017 at 22:56, Andrey Gura <ag...@apache.org> wrote:

> In the GC logs, at the end of the files, I see Full GC pauses like this:
>
> 2017-02-17T04:29:22.118-0800: 21122.643: [Full GC (Allocation Failure)
>  10226M->8526M(10G), 26.8952036 secs]
>    [Eden: 0.0B(512.0M)->0.0B(536.0M) Survivors: 0.0B->0.0B Heap:
> 10226.0M(10.0G)->8526.8M(10.0G)], [Metaspace:
> 77592K->77592K(1120256K)]
>
> Your heap is exhausted. During such GC pauses, discovery doesn't receive
> heartbeats and nodes are stopped due to segmentation. Please check your
> nodes' logs for the NODE_SEGMENTED pattern. If that is your case, try to
> tune GC or reduce the load on it (see [1] for details).
>
> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
>

Re: Ignite Cache Stopped

Posted by Andrey Gura <ag...@apache.org>.
In the GC logs, at the end of the files, I see Full GC pauses like this:

2017-02-17T04:29:22.118-0800: 21122.643: [Full GC (Allocation Failure)
 10226M->8526M(10G), 26.8952036 secs]
   [Eden: 0.0B(512.0M)->0.0B(536.0M) Survivors: 0.0B->0.0B Heap:
10226.0M(10.0G)->8526.8M(10.0G)], [Metaspace:
77592K->77592K(1120256K)]

Your heap is exhausted. During such GC pauses, discovery doesn't receive
heartbeats and nodes are stopped due to segmentation. Please check your
nodes' logs for the NODE_SEGMENTED pattern. If that is your case, try to
tune GC or reduce the load on it (see [1] for details).

[1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
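
For reference, a minimal sketch of the server-side knob that interacts with
such pauses, assuming a reasonably recent Ignite version; the GC logs
themselves are usually captured with standard Java 8 flags such as
-Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps. The timeout below
is illustrative only; raising it merely masks long GC pauses, it does not
fix the heap pressure.

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class FailureDetectionExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Maximum time (ms) a node may be unresponsive (e.g. stuck in a Full
        // GC) before other nodes consider it failed and segment it out.
        cfg.setFailureDetectionTimeout(30_000);

        try (Ignite ignite = Ignition.start(cfg)) {
            // Node started with a more pause-tolerant failure detection timeout.
        }
    }
}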

On Fri, Feb 17, 2017 at 6:35 PM, Anil <an...@gmail.com> wrote:
> Hi Andrey,
>
> The query execution time is very high with limit 10000+250.
>
> There is 10 GB of heap memory for both the client and the servers. I have
> attached the GC logs of the 4 servers. Could you please take a look? Thanks.
>
>

Re: Ignite Cache Stopped

Posted by Anil <an...@gmail.com>.
Hi Andrey,

The query execution time is very high with *limit 10000+250*.

There is 10 GB of heap memory for both the client and the servers. I have
attached the GC logs of the 4 servers. Could you please take a look? Thanks.
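
For context, a minimal sketch (not taken from the thread; the cache, table
and column names are made up) of how such paging is often expressed with
SqlFieldsQuery. A large OFFSET forces the engine to scan and discard all
earlier rows, so iterating a single cursor with a page size is usually much
cheaper than re-running ever-deeper offset queries:

import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PagingExample {
    public static void main(String[] args) {
        try (Ignite node = Ignition.start(new IgniteConfiguration())) {
            IgniteCache<?, ?> cache = node.cache("personCache"); // hypothetical cache

            // Deep-offset paging (roughly what "limit 10000+250" does): the
            // server still scans and skips the first 10000 matching rows, so
            // the cost grows with the page number:
            //   select id, name from Person where status = ? limit 250 offset 10000

            // Cursor-based alternative: one query, results fetched from the
            // servers in pages of 250 and iterated only as far as needed.
            SqlFieldsQuery qry = new SqlFieldsQuery(
                "select id, name from Person where status = ?").setArgs("ACTIVE");
            qry.setPageSize(250);

            try (QueryCursor<List<?>> cur = cache.query(qry)) {
                for (List<?> row : cur) {
                    // consume rows page by page
                }
            }
        }
    }
}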


On 17 February 2017 at 20:52, Anil <an...@gmail.com> wrote:

> Hi Andrey,
>
> I checked the GC logs and everything looks good.
>
> Thanks
>

Re: Ignite Cache Stopped

Posted by Anil <an...@gmail.com>.
Hi Andrey,

I checked the GC logs and everything looks good.

Thanks

On 17 February 2017 at 20:45, Andrey Gura <ag...@apache.org> wrote:

> Anil,
>
> IGNITE-4003 isn't related to your problem.
>
> I think that nodes are going out of topology due to long GC pauses.
> You can easily check this using GC logs.
>

Re: Ignite Cache Stopped

Posted by Andrey Gura <ag...@apache.org>.
Anil,

IGNITE-4003 isn't related to your problem.

I think that nodes are going out of topology due to long GC pauses.
You can easily check this using GC logs.

On Fri, Feb 17, 2017 at 6:04 PM, Anil <an...@gmail.com> wrote:
> Hi,
>
> We noticed that whenever long-running queries are fired, nodes go out of
> the topology and the entire Ignite cluster goes down.
>
> In my case, a single filter criterion could match 5L (500,000) records, so
> each API request fetches a page of 250 records. As the page number increases,
> the query execution time grows and the entire cluster goes down.
>
> Is https://issues.apache.org/jira/browse/IGNITE-4003 related to this?
>
> Can we set separate thread pools for query execution, compute jobs and
> other services instead of the common public thread pool?
>
> Thanks
>
>