Posted to user@pig.apache.org by Araceli Henley <3a...@gmail.com> on 2013/12/07 00:45:39 UTC
Does Pig support HCatalogStorer table with buckets
Hi
:::::::::
QUESTION:
:::::::::
Can anyone confirm whether HCatStorer works with a Hive table that was
declared with buckets?
:::::::::
DETAILS:
:::::::::
I have a table in Hive that was created with buckets, but when I try to
store data into it with HCatStorer, it fails with the following error:
Store into a partition with bucket definition from Pig/Mapreduce is not
supported.
The table declaration in Hive:
......
PARTITIONED BY(dtStr STRING)
CLUSTERED BY(sessionid) SORTED BY(timestr) INTO 32 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '1'
COLLECTION ITEMS TERMINATED BY '2'
MAP KEYS TERMINATED BY '3'
STORED AS ORC;
From Pig, I store the data with HCatStorer:
STORE sessnz_all INTO '$DB.allPocData' USING
org.apache.hcatalog.pig.HCatStorer();
Details at logfile:
/home/araceli/src/bigdata/projects/cisco_webanalytics_poc/src/server/pig/scripts/pig_1386373152479.log
[araceli@greenhost03 scripts]$ pig -version
Apache Pig version 0.11.2-mapr (rexported)
compiled Aug 27 2013, 13:50:32
[araceli@greenhost03 scripts]$ hive -version
Logging initialized using configuration in
jar:file:/opt/mapr/hive/hive-0.11/lib/hive-common-0.11-mapr.jar!/hive-log4j.properties
Hive history
Re: Does Pig support HCatalogStorer table with buckets
Posted by Alan Gates <ga...@hortonworks.com>.
No. HCat explicitly checks whether a table is bucketed and, if so, disables storing to it to avoid writing to the table in a destructive way.
Alan.
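To make the behavior described above concrete, here is a minimal toy model of such a guard in Python. It is only a sketch and not HCatalog's actual code: `TableLayout`, `StoreNotSupportedError`, `check_storable`, and the table name `allPocData_staging` are hypothetical names chosen for illustration. Only the error text and the column names are taken from the original message.

```python
from dataclasses import dataclass, field
from typing import List


class StoreNotSupportedError(Exception):
    """Raised when a store targets a table layout the writer cannot honor."""


@dataclass
class TableLayout:
    # Hypothetical stand-in for the table metadata a storer would read
    # from the metastore; this is not an HCatalog class.
    name: str
    partition_cols: List[str] = field(default_factory=list)
    bucket_cols: List[str] = field(default_factory=list)
    num_buckets: int = 0


def check_storable(table: TableLayout) -> None:
    """Refuse the store up front if the table declares bucket columns."""
    if table.bucket_cols:
        raise StoreNotSupportedError(
            "Store into a partition with bucket definition "
            "from Pig/Mapreduce is not supported"
        )


# Mirrors the table from the question: partitioned by dtStr,
# clustered by sessionid into 32 buckets.
bucketed = TableLayout("allPocData", ["dtStr"], ["sessionid"], 32)
# Hypothetical non-bucketed table with the same partitioning.
plain = TableLayout("allPocData_staging", ["dtStr"])

check_storable(plain)  # no bucket columns, so the store is allowed
try:
    check_storable(bucketed)
except StoreNotSupportedError as err:
    print(err)  # the store is rejected before any data is written
```

The point of the sketch is the ordering: the check runs before any output is produced, so a writer that cannot respect the bucket layout never touches the table.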