Posted to user@pig.apache.org by michael <mi...@gmail.com> on 2013/11/22 02:03:11 UTC
Script produces incorrect output unless ColumnMapKeyPrune
optimisation is disabled.
Hello,
My script gives incorrect output when the ColumnMapKeyPrune optimisation is
enabled (as it is by default).
Running "pig -x local myscript.pig" produces incorrect output in output/e.csv.
However, "pig -x local -t ColumnMapKeyPrune myscript.pig" works correctly.
I checked the bug list but couldn't find anything related.
Am I doing something wrong? Is this a known issue or should I raise a new
bug report?
I am running Apache Pig 0.11.1 on Linux.
Regards,
Michael
To reproduce, here are the script and data file contents:
---------------------------------------------------------------------------------------
myscript.pig
register /usr/local/pig/contrib/piggybank/java/piggybank.jar;
define CSVLoader org.apache.pig.piggybank.storage.CSVLoader;
define CSVExcelStorage org.apache.pig.piggybank.storage.CSVExcelStorage;
a1 = load 'test1.csv' using CSVExcelStorage(',') as (
    A:chararray,
    B:chararray,
    C:int,
    D:chararray,
    E:chararray,
    F:chararray,
    G:int,
    H:chararray);
split a1 into
    a2 if B != '' and F != '',
    e0 otherwise;
e1 = foreach e0 generate A, B, F, H, D, E, G;
x1 = foreach a2 generate A, G;
a4 = load 'test2.csv' using CSVExcelStorage(',') as (A:chararray);
x2 = foreach a4 generate A;
x3 = join x1 by A left, x2 by A;
STORE e1 into './output/e' USING PigStorage(',', '-schema');
STORE x3 INTO './output/x' USING PigStorage(',', '-schema');
fs -rm output/e/.pig_schema;
fs -getmerge output/e output/e.csv;
---------------------------------------------------------------------------------------
test1.csv
a,x,1,,,x
b,x,1,,,x
c,x,1,,,x
d,x,1,,,x
e,x,1,,,x
f,x,1,,,x
g,x,1,,,x
---------------------------------------------------------------------------------------
test2.csv
a
b
c
d
e
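---------------------------------------------------------------------------------------
For convenience, the steps above could be scripted roughly as follows. This is only a sketch: it assumes myscript.pig has been saved in the current directory exactly as listed, that pig is on the PATH, and that the piggybank jar is at the path the script registers. The names run_variant, e_default.csv and e_noprune.csv are made up for illustration. Note that it deletes ./output before each run.

```shell
#!/bin/sh
# Reproduction sketch. Assumes myscript.pig (as listed above) is in the
# current directory and `pig` is on PATH. Removes ./output before each run.

# Write the two input files exactly as listed in this post.
cat > test1.csv <<'EOF'
a,x,1,,,x
b,x,1,,,x
c,x,1,,,x
d,x,1,,,x
e,x,1,,,x
f,x,1,,,x
g,x,1,,,x
EOF

cat > test2.csv <<'EOF'
a
b
c
d
e
EOF

# run_variant LABEL [extra pig args...]
# Runs the script in local mode and keeps a copy of output/e.csv.
run_variant() {
    label=$1
    shift
    rm -rf output
    pig -x local "$@" myscript.pig && cp output/e.csv "e_${label}.csv"
}

if command -v pig >/dev/null 2>&1 && [ -f myscript.pig ]; then
    run_variant default                        # optimisation enabled
    run_variant noprune -t ColumnMapKeyPrune   # optimisation disabled
    if cmp -s e_default.csv e_noprune.csv; then
        echo "outputs match"
    else
        echo "outputs differ"
        diff e_default.csv e_noprune.csv
    fi
else
    echo "pig or myscript.pig not found; only the input files were written"
fi
```

If the two copies of e.csv differ, the diff should show what the pruning optimisation changes for this input.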