Posted to dev@impala.apache.org by Brock Noland <br...@phdata.io> on 2018/08/21 18:12:47 UTC

Impalad JVM OOM minutes after restart

Hi folks,

I've got an Impala CDH 5.14.2 cluster with a handful of users, 2-3, at
any one time. All of a sudden the JVM inside the Impalad started
running out of memory.

I got a heap dump, but the heap was 32GB (the host is 240GB), so it's
very large. Thus I wasn't able to get Memory Analyzer Tool (MAT) to
open it. I was able to get JHAT to open it by setting JHAT's heap to
160GB, but it's pretty unwieldy and much of the JHAT functionality
doesn't work.

I am spelunking around, but am really curious whether there are some
places I should check.

I am only an occasional reader of the Impala source, so I am just
pointing out things which felt interesting:

* Impalad was restarted shortly before the JVM OOM
* Joining Parquet on S3 with Kudu
* Only 13 instances of org.apache.impala.catalog.HdfsTable
* 176836 instances of org.apache.impala.analysis.Analyzer - this feels
odd to me. I remember a bug a while back in Hive where it would clone
the query tree until it ran OOM.
* 176796 of those Analyzers' _user fields point at the same user
* org.apache.impala.thrift.TQueryCtx@0x7f90975297f8 has 11048
org.apache.impala.analysis.Analyzer$GlobalState objects pointing at it.
* There is only a single instance of
org.apache.impala.thrift.TQueryCtx alive in the JVM, which appears to
indicate there is only a single query running. I've tracked that query
down in CM. The users do need to compute stats, but I don't feel that
is relevant to this JVM OOM condition.
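[Editorial aside: the instance counts above come from walking the heap dump by hand. As a rough sketch of how one might flag outlier classes automatically - assuming `jmap -histo`-style histogram output, which is not part of this thread, and with illustrative sample lines rather than the real dump:]

```python
import re

# Parse "jmap -histo" style output and flag classes with suspiciously
# high instance counts. Each data line looks like:
#   "   1:        176836       14146880  org.apache.impala.analysis.Analyzer"
HISTO_LINE = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")

def suspicious_classes(histo_text, min_instances=100_000):
    """Return (class_name, instance_count) pairs at or above the threshold,
    sorted by instance count descending."""
    hits = []
    for line in histo_text.splitlines():
        m = HISTO_LINE.match(line)
        if m:
            count, cls = int(m.group(1)), m.group(3)
            if count >= min_instances:
                hits.append((cls, count))
    return sorted(hits, key=lambda p: -p[1])

# Illustrative sample, mirroring the counts reported in this thread.
sample = """
 num     #instances         #bytes  class name
   1:        176836       14146880  org.apache.impala.analysis.Analyzer
   2:            13          41600  org.apache.impala.catalog.HdfsTable
"""
print(suspicious_classes(sample))
# -> [('org.apache.impala.analysis.Analyzer', 176836)]
```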

Any pointers on what I might look for?

Cheers,
Brock

Re: Impalad JVM OOM minutes after restart

Posted by Brock Noland <br...@phdata.io>.
Jeezy - yes, unfortunately I cannot share the query details at this
time. No hs_err file was generated.

Philip - Yeah that seems to be the way to go.


Re: Impalad JVM OOM minutes after restart

Posted by Philip Zeyliger <ph...@cloudera.com>.
Hi Brock,

If you want to make Eclipse MAT more usable, set JAVA_TOOL_OPTIONS="-Xmx2g
 -XX:+HeapDumpOnOutOfMemoryError" and you should see the max heap at 2GB,
thereby making Eclipse MAT friendlier. Folks have also been using
http://www.jxray.com/.

The query itself will also be interesting. If there's something like a
loop in analyzing it, you could imagine that showing up as an OOM. The heap
dump should tell us.

-- Philip


Re: Impalad JVM OOM minutes after restart

Posted by Jeszy <je...@gmail.com>.
Hm, that's interesting, because:
- I haven't yet seen query planning itself cause an OOM
- if it was catalog-related (to the tables involved in the query), the
initial topic size that followed would be bigger

Can you share diagnostic data, like the query text, definitions and
stats for the tables involved, the hs_err_pid written on crash, etc.?

Re: Impalad JVM OOM minutes after restart

Posted by Brock Noland <br...@phdata.io>.
Hi Jeezy,

Thanks, good tip.

The MS is quite small - even the mysqldump format is only 12MB. The
largest catalog-update I could find is only 1.5MB, which should be
easy to process with 32GB of heap. Lastly, it's possible we can
reproduce this by running the query the impalad was processing during
the issue; I'm going to wait until after the users head home to try.
But it doesn't appear reproducible via the method you describe: when
we restarted, it did not reproduce until users started running queries.

I0820 19:45:25.106437 25474 statestore.cc:568] Preparing initial
catalog-update topic update for impalad@XXX:22000. Size = 1.45 MB
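[Editorial aside: a tiny sketch for pulling the topic size out of statestore log lines like the one above. The regex and the helper name are my own, not from the Impala codebase; it assumes the "Size = N.NN MB" format shown in this thread.]

```python
import re

# Extract the initial catalog-update topic size from a statestore log line.
SIZE_RE = re.compile(
    r"Preparing initial catalog-update topic update for (\S+)\. "
    r"Size = ([\d.]+) (KB|MB|GB)"
)

def topic_update_size_mb(log_line):
    """Return (impalad, size in MB), or None if the line doesn't match."""
    m = SIZE_RE.search(log_line)
    if not m:
        return None
    impalad, size, unit = m.group(1), float(m.group(2)), m.group(3)
    factor = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}[unit]
    return impalad, size * factor

line = ("I0820 19:45:25.106437 25474 statestore.cc:568] Preparing initial "
        "catalog-update topic update for impalad@XXX:22000. Size = 1.45 MB")
print(topic_update_size_mb(line))
# -> ('impalad@XXX:22000', 1.45)
```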

Brock


Re: Impalad JVM OOM minutes after restart

Posted by Jeszy <je...@gmail.com>.
Hey,

If it happens shortly after a restart, there is a fair chance you're
crashing while processing the initial catalog topic update. Statestore
logs will tell you how big that was (it takes more memory to process
it than the actual size of the update).
If this is the case, it should also be reproducible, i.e. the daemon
will keep restarting and running OOM on the initial update until you
clear the metadata cache, either by restarting the catalog or via a
(global) invalidate metadata.

HTH