Posted to user@hive.apache.org by Norbert Burger <no...@gmail.com> on 2014/02/22 04:05:03 UTC

Metastore performance on HDFS-backed table with 15000+ partitions

Hi folks,

We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.

In Hive, we have an external table backed by HDFS which has a 3-level
partitioning scheme that currently has 15000+ partitions.

Within the last day or so, queries against this table have started failing.
 A simple query which shouldn't take very long at all (select * from ...
limit 10) fails after several minutes with a client OOME.  I get the same
outcome on count(*) queries (which I thought wouldn't send any data back to
the client).  Increasing heap on both client and server JVMs (via
HADOOP_HEAPSIZE) doesn't have any impact.

We were only able to work around the client OOMEs by reducing the number of
partitions in the table.

Looking at the MySQL querylog, my thought is that the Hive client is quite
busy making requests for partitions that don't contribute to the query.
Has anyone else had similar experience with tables this size?
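For concreteness, the shape of the setup looks roughly like this (table,
column, and path names below are made up for illustration; the real schema
is wider):

    # Hypothetical reconstruction of the table, not our actual DDL:
    hive -e "
      CREATE EXTERNAL TABLE events (event_id STRING, payload STRING)
      PARTITIONED BY (dt STRING, hr STRING, src STRING)  -- 3-level scheme
      LOCATION '/data/events';
    "
    # Both of these fail after several minutes with a client OOME:
    hive -e "SELECT * FROM events LIMIT 10;"
    hive -e "SELECT COUNT(*) FROM events;"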

Thanks,
Norbert

Re: Metastore performance on HDFS-backed table with 15000+ partitions

Posted by Stephen Sprague <sp...@gmail.com>.
Most interesting.  We had an issue recently with querying a table with 15K
columns and running out of heap, but not with 15K partitions.

15K partitions shouldn't be causing a problem, in my humble estimation.
Maybe a million, but not 15K. :)

So is there a traceback we can look at? Or is it not heap, but real memory?

And is this the local Hive client, or HiveServer?
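If it helps, something along these lines should capture both (a sketch
assuming the stock hive CLI wrapper scripts; paths and sizes are just
examples):

    # HADOOP_CLIENT_OPTS is passed to the hive CLI's JVM; dump the heap on OOME.
    export HADOOP_HEAPSIZE=2048   # client heap, in MB
    export HADOOP_CLIENT_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/hive-oome.hprof"
    hive -e "SELECT * FROM your_table LIMIT 10;" 2>/tmp/hive-client.log
    # The traceback ends up in the log; the .hprof shows what's on the heap
    # when opened in jhat or Eclipse MAT.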

Thanks,
Stephen.



On Fri, Feb 21, 2014 at 7:05 PM, Norbert Burger <no...@gmail.com> wrote:

> Hi folks,
>
> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
>
> In Hive, we have an external table backed by HDFS which has a 3-level
> partitioning scheme that currently has 15000+ partitions.
>
> Within the last day or so, queries against this table have started
> failing.  A simple query which shouldn't take very long at all (select *
> from ... limit 10) fails after several minutes with a client OOME.  I get
> the same outcome on count(*) queries (which I thought wouldn't send any
> data back to the client).  Increasing heap on both client and server JVMs
> (via HADOOP_HEAPSIZE) doesn't have any impact.
>
> We were only able to work around the client OOMEs by reducing the number
> of partitions in the table.
>
> Looking at the MySQL querylog, my thought is that the Hive client is quite
> busy making requests for partitions that don't contribute to the query.
> Has anyone else had similar experience with tables this size?
>
> Thanks,
> Norbert
>

RE: Metastore performance on HDFS-backed table with 15000+ partitions

Posted by java8964 <ja...@hotmail.com>.
That is good to know.
We are using Hive 0.9. Right now the biggest table contains 2 years of data, and we partition by hour, as the data volume is big.
So right now it has 2*365*24, around 17,500 partitions. So far we haven't seen much of a problem yet, but I do have some concerns about it.
We are using IBM BigInsights, which uses Derby as the Hive metastore, not MySQL, which is where most of my experience lies.
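With Derby we'd need different tooling, but against a MySQL-backed
metastore like the one in this thread, partition counts are easy to keep an
eye on (a sketch against the standard Hive metastore schema; database name
and credentials are placeholders):

    # Partitions per table, straight from the MySQL metastore tables:
    mysql -u hive -p metastore -e "
      SELECT t.TBL_NAME, COUNT(*) AS parts
      FROM PARTITIONS p JOIN TBLS t ON p.TBL_ID = t.TBL_ID
      GROUP BY t.TBL_NAME ORDER BY parts DESC;"

    # Or from Hive itself, one line per partition:
    hive -e "SHOW PARTITIONS the_big_table;" | wc -l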
Yong

> From: norbert.burger@gmail.com
> Date: Thu, 27 Feb 2014 07:57:05 -0500
> Subject: Re: Metastore performance on HDFS-backed table with 15000+ partitions
> To: user@hive.apache.org
>
> Thanks everyone for the feedback.  Just to follow up in case someone else
> runs into this: I can confirm that the local client works around the
> OOMEs, but it's still very slow.
>
> It does seem like we were hitting some combination of HIVE-4051 and
> HIVE-5158.  We'll try reducing partition count first, and then switch to
> 0.12.0 if that doesn't improve things significantly.
>
> Fwiw - http://www.slideshare.net/oom65/optimize-hivequeriespptx also has
> some good rules of thumb.
>
> Norbert
>
> On Sat, Feb 22, 2014 at 1:27 PM, Stephen Sprague <sp...@gmail.com> wrote:
>
>> Yeah. That traceback pretty much spells it out - it's metastore related,
>> and that's where the partitions are stored.
>>
>> I'm with the others on this. HiveServer2 is still a little janky on
>> memory management.  I bounce mine once a day at midnight just to play it
>> safe (and because I can.)
>>
>> Again, for me, I use the Hive local client for production jobs and the
>> remote client for ad hoc stuff.
>>
>> You may wish to confirm the local Hive client has no problem with your
>> query.
>>
>> Other than that, you can either increase the heap size on the HS2
>> process and hope for the best, and/or file a bug report.
>>
>> Bottom line: HiveServer2 isn't production bulletproof just yet, IMHO.
>> Others may disagree.
>>
>> Regards,
>> Stephen.
>>
>> On Sat, Feb 22, 2014 at 9:50 AM, Norbert Burger <no...@gmail.com> wrote:
>>
>>> Thanks all for the quick feedback.
>>>
>>> I'm a bit surprised to learn 15k is considered too much, but we can
>>> work around it.  I guess I'm also curious why the query planner needs
>>> to know about all partitions even in the case of simple select/limit
>>> queries, where the query might target only a single partition.
>>>
>>> Here's the client-side OOME with HADOOP_HEAPSIZE=2048:
>>> https://gist.githubusercontent.com/nburger/3286d2052060e2efe161/raw/dc30231086803c1d33b9137b5844d2d0e20e350d/gistfile1.txt
>>>
>>> This was from a CDH 4.3.0 client hitting HiveServer2.  Any idea what's
>>> consuming the heap?
>>>
>>> Norbert
>>>
>>> On Sat, Feb 22, 2014 at 10:32 AM, Edward Capriolo <ed...@gmail.com> wrote:
>>>
>>>> Don't make tables with that many partitions. It is an anti-pattern. I
>>>> have tables with 2000 partitions a day and that is really too many.
>>>> Hive needs to load that information into memory to plan the query.
>>>>
>>>> On Saturday, February 22, 2014, Terje Marthinussen <tm...@gmail.com> wrote:
>>>> > The query optimizer in Hive is awful on memory consumption. 15k
>>>> > partitions sounds a bit early for it to fail, though..
>>>> >
>>>> > What is your heap size?
>>>> >
>>>> > Regards,
>>>> > Terje
>>>> >
>>>> >> On 22 Feb 2014, at 12:05, Norbert Burger <no...@gmail.com> wrote:
>>>> >>
>>>> >> Hi folks,
>>>> >>
>>>> >> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
>>>> >>
>>>> >> In Hive, we have an external table backed by HDFS which has a
>>>> >> 3-level partitioning scheme that currently has 15000+ partitions.
>>>> >>
>>>> >> Within the last day or so, queries against this table have started
>>>> >> failing.  A simple query which shouldn't take very long at all
>>>> >> (select * from ... limit 10) fails after several minutes with a
>>>> >> client OOME.  I get the same outcome on count(*) queries (which I
>>>> >> thought wouldn't send any data back to the client).  Increasing
>>>> >> heap on both client and server JVMs (via HADOOP_HEAPSIZE) doesn't
>>>> >> have any impact.
>>>> >>
>>>> >> We were only able to work around the client OOMEs by reducing the
>>>> >> number of partitions in the table.
>>>> >>
>>>> >> Looking at the MySQL querylog, my thought is that the Hive client
>>>> >> is quite busy making requests for partitions that don't contribute
>>>> >> to the query.  Has anyone else had similar experience with tables
>>>> >> this size?
>>>> >>
>>>> >> Thanks,
>>>> >> Norbert
>>>> >
>>>>
>>>> --
>>>> Sorry this was sent from mobile. Will do less grammar and spell check
>>>> than usual.

Re: Metastore performance on HDFS-backed table with 15000+ partitions

Posted by Norbert Burger <no...@gmail.com>.
Thanks everyone for the feedback.  Just to follow up in case someone else
runs into this: I can confirm that the local client works around the OOMEs,
but it's still very slow.

It does seem like we were hitting some combination of HIVE-4051 and
HIVE-5158.  We'll try reducing partition count first, and then switch to
0.12.0 if that doesn't improve things significantly.

Fwiw - http://www.slideshare.net/oom65/optimize-hivequeriespptx also has
some good rules of thumb.
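In case it's useful to anyone, trimming the count looks roughly like this
(a sketch; the table name and partition values are illustrative, not our
real ones):

    # On an EXTERNAL table, dropping a partition removes only metastore
    # metadata - the files stay in HDFS.
    hive -e "ALTER TABLE events DROP IF EXISTS PARTITION (dt='2013-12-31');"
    # A partial spec like the above drops all matching hr/src subpartitions
    # in one statement.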

Norbert


On Sat, Feb 22, 2014 at 1:27 PM, Stephen Sprague <sp...@gmail.com> wrote:

> Yeah. That traceback pretty much spells it out - it's metastore related,
> and that's where the partitions are stored.
>
> I'm with the others on this. HiveServer2 is still a little janky on
> memory management.  I bounce mine once a day at midnight just to play it
> safe (and because I can.)
>
> Again, for me, I use the Hive local client for production jobs and the
> remote client for ad hoc stuff.
>
> You may wish to confirm the local Hive client has no problem with your
> query.
>
> Other than that, you can either increase the heap size on the HS2 process
> and hope for the best, and/or file a bug report.
>
> Bottom line: HiveServer2 isn't production bulletproof just yet, IMHO.
> Others may disagree.
>
> Regards,
> Stephen.
>
>
>
> On Sat, Feb 22, 2014 at 9:50 AM, Norbert Burger <no...@gmail.com> wrote:
>
>> Thanks all for the quick feedback.
>>
>> I'm a bit surprised to learn 15k is considered too much, but we can work
>> around it.  I guess I'm also curious why the query planner needs to know
>> about all partitions even in the case of simple select/limit queries, where
>> the query might target only a single partition.
>>
>> Here's the client-side OOME with HADOOP_HEAPSIZE=2048:
>>
>>
>> https://gist.githubusercontent.com/nburger/3286d2052060e2efe161/raw/dc30231086803c1d33b9137b5844d2d0e20e350d/gistfile1.txt
>>
>> This was from a CDH 4.3.0 client hitting HiveServer2.  Any idea what's
>> consuming the heap?
>>
>> Norbert
>>
>>
>> On Sat, Feb 22, 2014 at 10:32 AM, Edward Capriolo <ed...@gmail.com> wrote:
>>
>>> Don't make tables with that many partitions. It is an anti-pattern. I
>>> have tables with 2000 partitions a day and that is really too many. Hive
>>> needs to load that information into memory to plan the query.
>>>
>>>
>>> On Saturday, February 22, 2014, Terje Marthinussen <
>>> tmarthinussen@gmail.com> wrote:
>>> > The query optimizer in Hive is awful on memory consumption. 15k
>>> partitions sounds a bit early for it to fail, though..
>>> >
>>> > What is your heap size?
>>> >
>>> > Regards,
>>> > Terje
>>> >
>>> >> On 22 Feb 2014, at 12:05, Norbert Burger <no...@gmail.com>
>>> wrote:
>>> >>
>>> >> Hi folks,
>>> >>
>>> >> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
>>> >>
>>> >> In Hive, we have an external table backed by HDFS which has a 3-level
>>> partitioning scheme that currently has 15000+ partitions.
>>> >>
>>> >> Within the last day or so, queries against this table have started
>>> failing.  A simple query which shouldn't take very long at all (select *
>>> from ... limit 10) fails after several minutes with a client OOME.  I get
>>> the same outcome on count(*) queries (which I thought wouldn't send any
>>> data back to the client).  Increasing heap on both client and server JVMs
>>> (via HADOOP_HEAPSIZE) doesn't have any impact.
>>> >>
>>> >> We were only able to work around the client OOMEs by reducing the
>>> number of partitions in the table.
>>> >>
>>> >> Looking at the MySQL querylog, my thought is that the Hive client is
>>> quite busy making requests for partitions that don't contribute to the
>>> query.  Has anyone else had similar experience with tables this size?
>>> >>
>>> >> Thanks,
>>> >> Norbert
>>> >
>>>
>>> --
>>> Sorry this was sent from mobile. Will do less grammar and spell check
>>> than usual.
>>>
>>
>>
>

Re: Metastore performance on HDFS-backed table with 15000+ partitions

Posted by Stephen Sprague <sp...@gmail.com>.
Yeah. That traceback pretty much spells it out - it's metastore related,
and that's where the partitions are stored.

I'm with the others on this. HiveServer2 is still a little janky on memory
management.  I bounce mine once a day at midnight just to play it safe (and
because I can.)

Again, for me, I use the Hive local client for production jobs and the
remote client for ad hoc stuff.

You may wish to confirm the local Hive client has no problem with your
query.

Other than that, you can either increase the heap size on the HS2 process
and hope for the best, and/or file a bug report.

Bottom line: HiveServer2 isn't production bulletproof just yet, IMHO.
Others may disagree.
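Concretely, the two paths look like this (a sketch - host, port and table
are placeholders, and beeline assumes the HiveServer2 client bits are
installed):

    # Local (fat) client: the query is planned in this JVM, talking to the
    # metastore directly - HiveServer2 is not involved.
    hive -e "SELECT COUNT(*) FROM events;"

    # Remote client: the same query is planned inside the HS2 process.
    beeline -u jdbc:hive2://hs2-host:10000 -n hive -e "SELECT COUNT(*) FROM events;"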

Regards,
Stephen.



On Sat, Feb 22, 2014 at 9:50 AM, Norbert Burger <no...@gmail.com> wrote:

> Thanks all for the quick feedback.
>
> I'm a bit surprised to learn 15k is considered too much, but we can work
> around it.  I guess I'm also curious why the query planner needs to know
> about all partitions even in the case of simple select/limit queries, where
> the query might target only a single partition.
>
> Here's the client-side OOME with HADOOP_HEAPSIZE=2048:
>
>
> https://gist.githubusercontent.com/nburger/3286d2052060e2efe161/raw/dc30231086803c1d33b9137b5844d2d0e20e350d/gistfile1.txt
>
> This was from a CDH 4.3.0 client hitting HiveServer2.  Any idea what's
> consuming the heap?
>
> Norbert
>
>
> On Sat, Feb 22, 2014 at 10:32 AM, Edward Capriolo <ed...@gmail.com> wrote:
>
>> Don't make tables with that many partitions. It is an anti-pattern. I
>> have tables with 2000 partitions a day and that is really too many. Hive
>> needs to load that information into memory to plan the query.
>>
>>
>> On Saturday, February 22, 2014, Terje Marthinussen <
>> tmarthinussen@gmail.com> wrote:
>> > The query optimizer in Hive is awful on memory consumption. 15k
>> partitions sounds a bit early for it to fail, though..
>> >
>> > What is your heap size?
>> >
>> > Regards,
>> > Terje
>> >
>> >> On 22 Feb 2014, at 12:05, Norbert Burger <no...@gmail.com>
>> wrote:
>> >>
>> >> Hi folks,
>> >>
>> >> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
>> >>
>> >> In Hive, we have an external table backed by HDFS which has a 3-level
>> partitioning scheme that currently has 15000+ partitions.
>> >>
>> >> Within the last day or so, queries against this table have started
>> failing.  A simple query which shouldn't take very long at all (select *
>> from ... limit 10) fails after several minutes with a client OOME.  I get
>> the same outcome on count(*) queries (which I thought wouldn't send any
>> data back to the client).  Increasing heap on both client and server JVMs
>> (via HADOOP_HEAPSIZE) doesn't have any impact.
>> >>
>> >> We were only able to work around the client OOMEs by reducing the
>> number of partitions in the table.
>> >>
>> >> Looking at the MySQL querylog, my thought is that the Hive client is
>> quite busy making requests for partitions that don't contribute to the
>> query.  Has anyone else had similar experience with tables this size?
>> >>
>> >> Thanks,
>> >> Norbert
>> >
>>
>> --
>> Sorry this was sent from mobile. Will do less grammar and spell check
>> than usual.
>>
>
>

Re: Metastore performance on HDFS-backed table with 15000+ partitions

Posted by Norbert Burger <no...@gmail.com>.
Thanks all for the quick feedback.

I'm a bit surprised to learn 15k is considered too much, but we can work
around it.  I guess I'm also curious why the query planner needs to know
about all partitions even in the case of simple select/limit queries, where
the query might target only a single partition.

Here's the client-side OOME with HADOOP_HEAPSIZE=2048:

https://gist.githubusercontent.com/nburger/3286d2052060e2efe161/raw/dc30231086803c1d33b9137b5844d2d0e20e350d/gistfile1.txt

This was from a CDH 4.3.0 client hitting HiveServer2.  Any idea what's
consuming the heap?
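Partly answering my own question: with an explicit partition predicate the
planner should only have to fetch the matching partitions (a sketch; the
column names are from a made-up scheme, not anything standard):

    # Pruned: only the named partition is considered, not all 15000+.
    hive -e "SELECT * FROM events WHERE dt='2014-02-21' AND hr='00' LIMIT 10;"

    # Strict mode rejects queries on partitioned tables that carry no
    # partition filter at all, which fails fast instead of OOMing:
    hive -e "SET hive.mapred.mode=strict; SELECT * FROM events LIMIT 10;"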

Norbert


On Sat, Feb 22, 2014 at 10:32 AM, Edward Capriolo <ed...@gmail.com> wrote:

> Don't make tables with that many partitions. It is an anti-pattern. I have
> tables with 2000 partitions a day and that is really too many. Hive needs
> to load that information into memory to plan the query.
>
>
> On Saturday, February 22, 2014, Terje Marthinussen <
> tmarthinussen@gmail.com> wrote:
> > The query optimizer in Hive is awful on memory consumption. 15k
> partitions sounds a bit early for it to fail, though..
> >
> > What is your heap size?
> >
> > Regards,
> > Terje
> >
> >> On 22 Feb 2014, at 12:05, Norbert Burger <no...@gmail.com>
> wrote:
> >>
> >> Hi folks,
> >>
> >> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
> >>
> >> In Hive, we have an external table backed by HDFS which has a 3-level
> partitioning scheme that currently has 15000+ partitions.
> >>
> >> Within the last day or so, queries against this table have started
> failing.  A simple query which shouldn't take very long at all (select *
> from ... limit 10) fails after several minutes with a client OOME.  I get
> the same outcome on count(*) queries (which I thought wouldn't send any
> data back to the client).  Increasing heap on both client and server JVMs
> (via HADOOP_HEAPSIZE) doesn't have any impact.
> >>
> >> We were only able to work around the client OOMEs by reducing the
> number of partitions in the table.
> >>
> >> Looking at the MySQL querylog, my thought is that the Hive client is
> quite busy making requests for partitions that don't contribute to the
> query.  Has anyone else had similar experience with tables this size?
> >>
> >> Thanks,
> >> Norbert
> >
>
> --
> Sorry this was sent from mobile. Will do less grammar and spell check than
> usual.
>

Re: Metastore performance on HDFS-backed table with 15000+ partitions

Posted by Edward Capriolo <ed...@gmail.com>.
Don't make tables with that many partitions. It is an anti-pattern. I have
tables with 2000 partitions a day and that is really too many. Hive needs to
load that information into memory to plan the query.
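For example, one way to cut the count by 24x is to partition by day and
keep the hour as an ordinary column (a sketch; names are illustrative):

    # ~365 partitions per year instead of ~8760; the hour moves into the data.
    hive -e "
      CREATE EXTERNAL TABLE events_daily (event_id STRING, hr STRING)
      PARTITIONED BY (dt STRING)
      LOCATION '/data/events_daily';
    "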

On Saturday, February 22, 2014, Terje Marthinussen <tm...@gmail.com>
wrote:
> The query optimizer in Hive is awful on memory consumption. 15k partitions
sounds a bit early for it to fail, though..
>
> What is your heap size?
>
> Regards,
> Terje
>
>> On 22 Feb 2014, at 12:05, Norbert Burger <no...@gmail.com>
wrote:
>>
>> Hi folks,
>>
>> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
>>
>> In Hive, we have an external table backed by HDFS which has a 3-level
partitioning scheme that currently has 15000+ partitions.
>>
>> Within the last day or so, queries against this table have started
failing.  A simple query which shouldn't take very long at all (select *
from ... limit 10) fails after several minutes with a client OOME.  I get
the same outcome on count(*) queries (which I thought wouldn't send any
data back to the client).  Increasing heap on both client and server JVMs
(via HADOOP_HEAPSIZE) doesn't have any impact.
>>
>> We were only able to work around the client OOMEs by reducing the number
of partitions in the table.
>>
>> Looking at the MySQL querylog, my thought is that the Hive client is
quite busy making requests for partitions that don't contribute to the
query.  Has anyone else had similar experience with tables this size?
>>
>> Thanks,
>> Norbert
>

-- 
Sorry this was sent from mobile. Will do less grammar and spell check than
usual.

Re: Metastore performance on HDFS-backed table with 15000+ partitions

Posted by Terje Marthinussen <tm...@gmail.com>.
The query optimizer in Hive is awful on memory consumption. 15k partitions sounds a bit early for it to fail, though..

What is your heap size?
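You can check what the client JVM actually got with something like this
(jps ships with the JDK; the grep pattern is just an example):

    # Lists running Java processes with their full JVM arguments;
    # look for -Xmx on the Hive CLI (RunJar/CliDriver) entry.
    jps -lvm | grep -i hive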

Regards,
Terje

> On 22 Feb 2014, at 12:05, Norbert Burger <no...@gmail.com> wrote:
> 
> Hi folks,
> 
> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
> 
> In Hive, we have an external table backed by HDFS which has a 3-level partitioning scheme that currently has 15000+ partitions.
> 
> Within the last day or so, queries against this table have started failing.  A simple query which shouldn't take very long at all (select * from ... limit 10) fails after several minutes with a client OOME.  I get the same outcome on count(*) queries (which I thought wouldn't send any data back to the client).  Increasing heap on both client and server JVMs (via HADOOP_HEAPSIZE) doesn't have any impact.
> 
> We were only able to work around the client OOMEs by reducing the number of partitions in the table.
> 
> Looking at the MySQL querylog, my thought is that the Hive client is quite busy making requests for partitions that don't contribute to the query.  Has anyone else had similar experience with tables this size?
> 
> Thanks,
> Norbert