Posted to user@hive.apache.org by Ryan Harris <Ry...@zionsbancorp.com> on 2016/05/02 21:24:34 UTC

RE: Container out of memory: ORC format with many dynamic partitions

reading this:
"but when I add 2000 new titles with 300 rows each"
I'm thinking that you are over-partitioning your data.
I'm not sure exactly how that relates to the OOM error you are getting (it may not), but I'd test things out partitioning by date only, or maybe date + title_type. Adding 2000+ dynamic partitions that each hold only 300 rows of data is asking for problems in Hive, IMO.
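As a rough sketch of that coarser layout (illustrative only -- the table name coarse_partition_table is made up here, and the column list is abbreviated the same way as in the .q file quoted below), it would look something like:

CREATE EXTERNAL TABLE IF NOT EXISTS coarse_partition_table (
field1 string,
field2 string,
...
field26 string,
title string
)
PARTITIONED BY (dt string, title_type string)
STORED AS ORC;

INSERT OVERWRITE TABLE coarse_partition_table PARTITION (dt="2016-04-05", title_type)
SELECT
field1,
field2,
...
field26,
title,
title_type
FROM original_table
WHERE dt = "2016-04-05";

Here title stays an ordinary column, so the partition count per day drops from "titles x title_types" to just the handful of title_type values.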


From: Matt Olson [mailto:maolson42@gmail.com]
Sent: Friday, April 29, 2016 7:50 PM
To: user@hive.apache.org
Subject: Container out of memory: ORC format with many dynamic partitions

Hi all,

I am using Hive 1.0.1 and trying to do a simple insert into an ORC table, creating dynamic partitions. I am selecting from a table partitioned by dt and category, and inserting into a table partitioned by dt, title, and title_type. Other than the partitioning, the tables have the same schemas. Both title and title_type are fields in the first table, and when I insert into the second table, I am using them to create dynamic partitions. The .q file with the CREATE and INSERT statements is copied below.

SET hive.optimize.sort.dynamic.partition=true;
SET hive.exec.orc.memory.pool=1.0;
SET hive.exec.max.dynamic.partitions = 5000;
SET hive.exec.max.dynamic.partitions.pernode = 5000;
SET hive.merge.mapfiles = true;
SET mapred.min.split.size=134217728;
SET mapred.min.split.size.per.node=134217728;
SET mapred.min.split.size.per.rack=134217728;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
SET mapred.max.split.size=134217728;
SET hive.map.aggr.hash.percentmemory=0.125;
SET hive.exec.parallel=true;
SET hive.exec.compress.intermediate=true;
SET hive.exec.compress.output=true;
SET mapred.map.child.java.opts=-Xmx2048M;
SET mapred.child.java.opts=-Xmx2048M;
SET mapred.task.profile=false;

CREATE EXTERNAL TABLE IF NOT EXISTS dynamic_partition_table (
field1 string,
field2 string,
...
field26 string
)
PARTITIONED BY (dt string, title string, title_type string)
STORED AS ORC
LOCATION '/hive/warehouse/partitioned_table'
TBLPROPERTIES ("orc.compress.size"="16000");

INSERT OVERWRITE TABLE dynamic_partition_table PARTITION (dt="2016-04-05", title, title_type)
SELECT
field1,
field2,
...
title,
title_type
FROM original_table
WHERE dt = "2016-04-05";

The original table has about 250 GB of data for 2016-04-05, and about 260 different titles (some titles have very little data, some have ~20 GB). There is generally only one title_type per title. The INSERT action succeeds on that data set, but when I add 2000 new titles with 300 rows each to the original table, I get the following error during the INSERT:


Container [pid=6278,containerID=container_e26_1460661845156_49295_01_000244] is running beyond physical memory limits. Current usage: 2.2 GB of 2 GB physical memory used; 2.7 GB of 4.2 GB virtual memory used. Killing container.

I've found a couple questions online about this same error message for ORC files with lots of dynamic partitions, on an older version of Hive:
https://qnalist.com/questions/4836037/hive-0-12-orc-heap-issues-on-write

Based on that and the information about configuration properties at https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat, I have tried setting hive.exec.orc.memory.pool=1.0 in order to give as much heap space as possible to the ORC file writers. As you can see from the CREATE TABLE statement, I also decreased orc.compress.size from the default 256 KB to 16 KB. After making these changes, the INSERT is still failing with the "beyond physical memory limits" error.

I've tried inserting into a table stored as RCFile rather than ORC, and in that case the action succeeds even with the additional 2000 titles.

Can anyone explain how exactly the two ORC parameters above affect the writing of dynamic partitions in ORC files, and why I'm not getting the OOM error when I use the RCFile format instead?  I'd also appreciate any suggestions for other tuning I could do to fix the memory management when using ORC.

Thanks for any help,
Matt


Re: Container out of memory: ORC format with many dynamic partitions

Posted by Matt Olson <ma...@gmail.com>.
Yes, I've reverted to the default there. Thanks!

On Mon, May 2, 2016 at 7:47 PM, Prasanth Jayachandran <
pjayachandran@hortonworks.com> wrote:

> Hi Matt
>
> As Gopal mentioned below you might have to unset hive.exec.orc.memory.pool=1.0;
> since some memory is required for sorting. Try running with the defaults
> for hive.exec.orc.memory.pool.
>
> Thanks
> Prasanth

Re: Container out of memory: ORC format with many dynamic partitions

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Hi Matt

As Gopal mentioned, you might have to unset hive.exec.orc.memory.pool=1.0, since some memory is required for sorting. Try running with the default for hive.exec.orc.memory.pool.
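In the .q file that means deleting the override, or setting the property back to its default; a minimal sketch (0.5 is the documented default for hive.exec.orc.memory.pool, but double-check against your Hive version):

-- remove this override from the script:
-- SET hive.exec.orc.memory.pool=1.0;
-- or restore the default fraction of heap the ORC writers may use:
SET hive.exec.orc.memory.pool=0.5;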

Thanks
Prasanth

Re: Container out of memory: ORC format with many dynamic partitions

Posted by Matt Olson <ma...@gmail.com>.
Hi Prasanth,

Thank you for the helpful information. I have been using the default ORC
stripe size, which I believe is 67,108,864 bytes.

I was able to remove the constant value for dt as you suggested, and set
hive.optimize.sort.dynamic.partition=true. I saw in the new explain plan
that the partitions are now being sorted, and rather than a map-only job I
now have a map-reduce job as expected.

For some reason, though, I'm still getting the same error as before. The
logs say that the map is 100% complete and the reduce is 99% complete, and
then it fails with

Container [pid=8962,containerID=container_e26_1460661845156_55073_01_005311] is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.7 GB of 4.2 GB virtual memory used. Killing container.

This seems strange since there should be plenty of memory for 30 * 5 * 256
KB, which is only about 38 MB. I have tried reducing the compression buffer
size from 256 KB to 16 KB and even 1 KB by setting "orc.compress.size" in
the dynamic_partitioned_table properties, but it still fails with the same
error.
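For reference, a hedged sketch of how that table-property experiment can be applied without recreating the table (table name as in the CREATE statement earlier in the thread; the byte value is illustrative -- 16384 bytes = 16 KB):

-- shrink the ORC compression buffer on the target table
ALTER TABLE dynamic_partition_table SET TBLPROPERTIES ("orc.compress.size"="16384");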

Thanks again for your explanations so far. I'll keep working on this and
let you know if I work it out.

Matt


Re: Container out of memory: ORC format with many dynamic partitions

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Hi Matt

So it looks like you are hitting the issue that I had mentioned previously.
You might need to apply the patch from HIVE-12893. Alternatively, if dt has only one possible value then
it's better to remove the constant value for dt and the where condition. This will enable the sorted dynamic partition optimization, which
is more scalable when the combined partition count is huge.
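A sketch of that rewrite against the .q file from earlier in the thread -- dt becomes a dynamic partition column and the WHERE clause goes away (illustrative only; the column list is abbreviated as in the original .q file, and it assumes the source table only contains the dates you actually want to load, plus nonstrict dynamic partition mode since no static partition value is left):

SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.optimize.sort.dynamic.partition=true;

INSERT OVERWRITE TABLE dynamic_partition_table PARTITION (dt, title, title_type)
SELECT
field1,
field2,
...
field26,
dt,
title,
title_type
FROM original_table;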

What is the stripe size that you are using?

The reason it causes an OOM for ORC is this:

ORC needs to buffer the incoming rows in a columnar way before writing them to the file. It buffers until the configured
stripe size is reached, and then the entire stripe gets flushed. This is usually not a problem when there are few ORC writers.
When there are multiple concurrent writers, the available memory is shared across all of them. In the case of dynamic
partitioning, there will be 1 writer per partition and per bucket in each mapper/reducer.

If there are 100 partitions, 4 buckets, and 25 columns, then the memory requirement will be
100 * 4 * 25 * 5 (approx. number of internal streams per column) * 256 KB (compression buffer size).
This can get really huge as the number of partitions increases. The way around this memory requirement is to reduce
the number of ORC writers. hive.optimize.sort.dynamic.partition sorts the data on the partition columns and bucket number,
so there will be only 1 writer per mapper/reducer, reducing the memory requirement to 25 * 5 * 256 KB, which is much more
manageable. If this value needs to be reduced further, reduce the compression buffer size.
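For scale, plugging this thread's figures into the same formula (roughly 2,260 title partitions once the 2,000 new titles are added, a single bucket, and about 30 columns) gives, in the worst case where one task ends up writing to every partition:

2260 partitions * 1 bucket * 30 columns * 5 streams * 256 KB ≈ 83 GB of writer buffers

versus roughly 30 * 5 * 256 KB ≈ 38 MB once the sorted dynamic partition optimization leaves only a single writer open per task. These are back-of-the-envelope estimates from the numbers quoted in the thread, not measurements.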

Because of the bug outlined in HIVE-12893, in your case the optimization that keeps a single ORC writer is not kicking in, causing the OOM.

Thanks
Prasanth


Re: Container out of memory: ORC format with many dynamic partitions

Posted by Matt Olson <ma...@gmail.com>.
Hi Prasanth,

Here is the explain plan for the insert query:

OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
  Stage-4
  Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
  Stage-2 depends on stages: Stage-0
  Stage-3
  Stage-5
  Stage-6 depends on stages: Stage-5

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: original_table
            Statistics: Num rows: 44962613 Data size: 264560040271 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: ...
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23, _col24, _col25, _col26, _col27
              Statistics: Num rows: 44962613 Data size: 264560040271 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: true
                Statistics: Num rows: 44962613 Data size: 264560040271 Basic stats: COMPLETE Column stats: NONE
                table:
                    input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    name: pin.dynamic_partitioned_table

  Stage: Stage-7
    Conditional Operator

  Stage: Stage-4
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://ci-ocean/mnt/tmp/hive-molson/molson/9f6b1ce0-f71a-4c87-9440-77f09e3860eb/hive_2016-05-02_20-14-12_260_7512820923555713567-1/-ext-10000

  Stage: Stage-0
    Move Operator
      tables:
          partition:
            dt 2016-04-05
            title_id
            title_id_type
          replace: true
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: pin.dynamic_partitioned_table

  Stage: Stage-2
    Stats-Aggr Operator

  Stage: Stage-3
    Merge File Operator
      Map Operator Tree:
          ORC File Merge Operator
      merge level: stripe
      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat

  Stage: Stage-5
    Merge File Operator
      Map Operator Tree:
          ORC File Merge Operator
      merge level: stripe
      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat

  Stage: Stage-6
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://ci-ocean/mnt/tmp/hive-molson/molson/9f6b1ce0-f71a-4c87-9440-77f09e3860eb/hive_2016-05-02_20-14-12_260_7512820923555713567-1/-ext-10000

Thank you,
Matt


On Mon, May 2, 2016 at 12:48 PM, Prasanth Jayachandran <
pjayachandran@hortonworks.com> wrote:

> Hi
>
> Can you please post the explain plan for your insert query? I suspect the sorted
> dynamic partition optimization is bailing out because of
> the constant value for the 'dt' column. If you are not seeing a reducer then
> it's likely not using the sorted dynamic partition optimization.
> You are probably hitting this bug:
> https://issues.apache.org/jira/browse/HIVE-12893
> I can confirm if that's the case by looking at the explain plan.
>
> Thanks
> Prasanth
>

Re: Container out of memory: ORC format with many dynamic partitions

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Hi

Can you please post the explain plan for your insert query? I suspect the sorted dynamic partition optimization is bailing out because of
the constant value for the 'dt' column. If you are not seeing a reducer, then it's likely not using the sorted dynamic partition optimization.
You are probably hitting this bug: https://issues.apache.org/jira/browse/HIVE-12893
I can confirm whether that's the case by looking at the explain plan.
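
Roughly what to look for there (a sketch only; the exact operator layout varies by Hive version, and the column list is abbreviated as in the original statement):

EXPLAIN
INSERT OVERWRITE TABLE dynamic_partition_table PARTITION (dt="2016-04-05", title, title_type)
SELECT
field1,
field2,
...
title,
title_type
FROM original_table
WHERE dt = "2016-04-05";
-- With the sorted dynamic partition optimization active, the plan should include a reduce
-- stage whose Reduce Output Operator is keyed on the dynamic partition columns, so each
-- reducer writes one partition at a time with a single open ORC writer.
-- A map-only plan that goes straight to a File Output Operator suggests the optimization bailed out.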

Thanks
Prasanth
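
A minimal sketch of the kind of workaround HIVE-12893 points at, assuming the constant dt value really is what disables the optimization: keep all three partition columns dynamic so the PARTITION clause contains no constant, and emit dt from the SELECT instead (nonstrict dynamic partition mode is then required, and the field list is again abbreviated):

SET hive.exec.dynamic.partition.mode=nonstrict;  -- needed once no partition column is static
SET hive.optimize.sort.dynamic.partition=true;

INSERT OVERWRITE TABLE dynamic_partition_table PARTITION (dt, title, title_type)
SELECT
field1,
field2,
...
dt,            -- emitted as data so the partition spec contains no constant
title,
title_type
FROM original_table
WHERE dt = "2016-04-05";

The WHERE clause still restricts the insert to a single date, so the same set of partitions should be produced; with the optimization applied, rows reach the reducers sorted by the partition keys and each reducer holds only one ORC writer open at a time.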
