Posted to issues@hive.apache.org by "Sungwoo Park (Jira)" <ji...@apache.org> on 2022/11/13 13:02:00 UTC

[jira] [Updated] (HIVE-26732) Iceberg uses "null" and does not use the configuration key "hive.exec.default.partition.name" for default partitions.

     [ https://issues.apache.org/jira/browse/HIVE-26732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sungwoo Park updated HIVE-26732:
--------------------------------
    Description: 
When an Iceberg table is populated from an existing ORC table with "insert overwrite", the directory for the default partition is named "null" instead of the value of the configuration key "hive.exec.default.partition.name".

For example, we create an Iceberg table from an existing ORC table tpcds_bin_partitioned_orc_1000.catalog_sales:
{code:java}
create table catalog_sales (
  cs_sold_time_sk bigint,
  cs_ship_date_sk bigint,
  cs_bill_customer_sk bigint,
  cs_bill_cdemo_sk bigint,
  cs_bill_hdemo_sk bigint,
  cs_bill_addr_sk bigint,
  cs_ship_customer_sk bigint,
  cs_ship_cdemo_sk bigint,
  cs_ship_hdemo_sk bigint,
  cs_ship_addr_sk bigint,
  cs_call_center_sk bigint,
  cs_catalog_page_sk bigint,
  cs_ship_mode_sk bigint,
  cs_warehouse_sk bigint,
  cs_item_sk bigint,
  cs_promo_sk bigint,
  cs_order_number bigint,
  cs_quantity int,
  cs_wholesale_cost double,
  cs_list_price double,
  cs_sales_price double,
  cs_ext_discount_amt double,
  cs_ext_sales_price double,
  cs_ext_wholesale_cost double,
  cs_ext_list_price double,
  cs_ext_tax double,
  cs_coupon_amt double,
  cs_ext_ship_cost double,
  cs_net_paid double,
  cs_net_paid_inc_tax double,
  cs_net_paid_inc_ship double,
  cs_net_paid_inc_ship_tax double,
  cs_net_profit double)
partitioned by (cs_sold_date_sk bigint)
STORED BY ICEBERG stored as orc;
insert overwrite table catalog_sales select * from tpcds_bin_partitioned_orc_1000.catalog_sales;
{code}
Iceberg then creates the following directory for the default partition:

/hive/warehouse/tpcds_bin_partitioned_orc_1000_iceberg.db/catalog_sales/data/cs_sold_date_sk=null

which should be:

/hive/warehouse/tpcds_bin_partitioned_orc_1000_iceberg.db/catalog_sales/data/cs_sold_date_sk=__HIVE_DEFAULT_PARTITION__
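
The mismatch can be checked from a Hive session along the following lines. This is only a sketch: it assumes the session permits "dfs" commands and that the warehouse path is the one shown above. The table names and the path are taken from this report, and nothing else is implied about the setup.
{code:sql}
-- Print the configured default partition name
-- (Hive's shipped default is __HIVE_DEFAULT_PARTITION__).
set hive.exec.default.partition.name;

-- Count the source rows whose partition key is NULL;
-- these are the rows that end up in the default partition.
select count(*) from tpcds_bin_partitioned_orc_1000.catalog_sales where cs_sold_date_sk is null;

-- List the partition directories written for the Iceberg table.
dfs -ls /hive/warehouse/tpcds_bin_partitioned_orc_1000_iceberg.db/catalog_sales/data/;
{code}
If the count is non-zero and the listing contains "cs_sold_date_sk=null" rather than a directory named after the configured value, the behavior described here is reproduced.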

 

  was:
When creating an Iceberg table from an existing ORC table with "insert overwrite", the directory corresponding to the default partition uses "null" instead of the value for the configuration key "hive.exec.default.partition.name".

For example, we create a Iceberg table from an existing ORC table tpcds_bin_partitioned_orc_1000.catalog_sales:
{code:java}
create table catalog_sales ( cs_sold_time_sk     bigint, cs_ship_date_sk     bigint, cs_bill_customer_sk   bigint, cs_bill_cdemo_sk    bigint, cs_bill_hdemo_sk    bigint, cs_bill_addr_sk     bigint, cs_ship_customer_sk   bigint, cs_ship_cdemo_sk    bigint, cs_ship_hdemo_sk    bigint, cs_ship_addr_sk     bigint, cs_call_center_sk   bigint, cs_catalog_page_sk    bigint, cs_ship_mode_sk     bigint, cs_warehouse_sk     bigint, cs_item_sk      bigint, cs_promo_sk     bigint, cs_order_number     bigint, cs_quantity     int, cs_wholesale_cost   double, cs_list_price     double, cs_sales_price    double, cs_ext_discount_amt   double, cs_ext_sales_price    double, cs_ext_wholesale_cost   double, cs_ext_list_price   double, cs_ext_tax      double, cs_coupon_amt     double, cs_ext_ship_cost    double, cs_net_paid     double, cs_net_paid_inc_tax   double, cs_net_paid_inc_ship  double, cs_net_paid_inc_ship_tax  double, cs_net_profit     double) partitioned by (cs_sold_date_sk bigint) STORED BY ICEBERG stored as orc;
insert overwrite table catalog_sales select * from tpcds_bin_partitioned_orc_1000.catalog_sales;
{code}
Iceberg creates a directory for the default partition like:

/hive/warehouse/tpcds_bin_partitioned_orc_1000_iceberg.db/catalog_sales/data/cs_sold_date_sk=null

which should be:

/hive/warehouse/tpcds_bin_partitioned_orc_1000_iceberg.db/catalog_sales/data/cs_sold_date_sk=__HIVE_DEFAULT_PARTITION__

 


> Iceberg uses "null" and does not use the configuration key "hive.exec.default.partition.name" for default partitions.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-26732
>                 URL: https://issues.apache.org/jira/browse/HIVE-26732
>             Project: Hive
>          Issue Type: Bug
>          Components: Iceberg integration
>    Affects Versions: 4.0.0-alpha-1
>            Reporter: Sungwoo Park
>            Priority: Major
>
> When creating an Iceberg table from an existing ORC table with "insert overwrite", the directory corresponding to the default partition uses "null" instead of the value for the configuration key "hive.exec.default.partition.name".
> For example, we create an Iceberg table from an existing ORC table tpcds_bin_partitioned_orc_1000.catalog_sales:
> {code:java}
> create table catalog_sales ( cs_sold_time_sk     bigint, cs_ship_date_sk     bigint, cs_bill_customer_sk   bigint, cs_bill_cdemo_sk    bigint, cs_bill_hdemo_sk    bigint, cs_bill_addr_sk     bigint, cs_ship_customer_sk   bigint, cs_ship_cdemo_sk    bigint, cs_ship_hdemo_sk    bigint, cs_ship_addr_sk     bigint, cs_call_center_sk   bigint, cs_catalog_page_sk    bigint, cs_ship_mode_sk     bigint, cs_warehouse_sk     bigint, cs_item_sk      bigint, cs_promo_sk     bigint, cs_order_number     bigint, cs_quantity     int, cs_wholesale_cost   double, cs_list_price     double, cs_sales_price    double, cs_ext_discount_amt   double, cs_ext_sales_price    double, cs_ext_wholesale_cost   double, cs_ext_list_price   double, cs_ext_tax      double, cs_coupon_amt     double, cs_ext_ship_cost    double, cs_net_paid     double, cs_net_paid_inc_tax   double, cs_net_paid_inc_ship  double, cs_net_paid_inc_ship_tax  double, cs_net_profit     double) partitioned by (cs_sold_date_sk bigint) STORED BY ICEBERG stored as orc;
> insert overwrite table catalog_sales select * from tpcds_bin_partitioned_orc_1000.catalog_sales;
> {code}
> Iceberg creates a directory for the default partition like:
> /hive/warehouse/tpcds_bin_partitioned_orc_1000_iceberg.db/catalog_sales/data/cs_sold_date_sk=null
> which should be:
> /hive/warehouse/tpcds_bin_partitioned_orc_1000_iceberg.db/catalog_sales/data/cs_sold_date_sk=__HIVE_DEFAULT_PARTITION__
>  


