You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2013/08/29 22:07:51 UTC

[jira] [Updated] (HIVE-5169) Sorted Bucketed Partitioned Insert does not sort by dynamic partition column causing reducer OOMs/lease-expiry errors

     [ https://issues.apache.org/jira/browse/HIVE-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated HIVE-5169:
--------------------------

    Attachment: orc2.sql

Scale=2 ORC loader.

To generate TPC-h text tables, you can use https://github.com/t3rmin4t0r/tpch-gen

And for the text DDL, you can find it in the ddl/text.sql file.
                
> Sorted Bucketed Partitioned Insert does not sort by dynamic partition column causing reducer OOMs/lease-expiry errors
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-5169
>                 URL: https://issues.apache.org/jira/browse/HIVE-5169
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>         Environment: Ubuntu LXC, hadoop-2
>            Reporter: Gopal V
>         Attachments: orc2.sql
>
>
> When a bulk-ETL operation is in progress, the query plan only sorts based on the SORTED BY key.
> This means that the FileSinkOperator in the reducer has to keep all the dynamic partition RecordWriters open till the end of the reducer lifetime.
> A more MR-friendly approach would be to sort by <partition_col,sorted_col> so that the data entering the reducer will not require to keep exactly one partition and bucket open at any given time.
> As a test-case a partitioned insert for the TPC-h benchmark's lineitem table will suffice
> {code}
> create table lineitem
> (L_ORDERKEY INT,
> ...
> partitioned by (L_SHIPDATE STRING)
> clustered by (l_orderkey)
> sorted by (l_orderkey)
> into 4 buckets
> stored as ORC;
> explain from (select
> L_ORDERKEY ,
> ...) tbl 
> insert overwrite table lineitem partition (L_SHIPDATE)
> select *
> ;
> {code}
> The generated plan very clearly has 
> {code}
>          Reduce Output Operator
>                 key expressions:
>                       expr: _col0
>                       type: int
>                 sort order: +
>                 Map-reduce partition columns:
>                       expr: _col0
>                       type: int
>                 tag: -1
> {code}
> And col0 being L_ORDERKEY.
> In the FileSinkOperator over at the reducer side, this results in a larger than usual number of open files.
> This causes memory pressure due to the compression buffers used by ORC/RCFile and really slows down the reducers.
> A side-effect of this is that I had to pump 350Gb of TPC-h data through 4 reducers, which on occasion took > 1 hour to get from opening a file in the FS to writing the first ORC stripe.
> This caused HDFS lease expiry and the task dying from that error.
> All of these can be avoided by adding the partition column to the sort keys as well as the partition keys & keeping only one writer open in the FileSinkOperator.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira