Posted to user@pig.apache.org by Alex Rovner <al...@gmail.com> on 2011/06/23 16:26:47 UTC

Pig ignores fields and dumps the whole tuple

I ran into a fairly strange issue with Pig the other day. I was trying
out the sample Swap UDF (attached to this email) from the Pig documentation:
http://wiki.apache.org/pig/UDFManual (Schema section)

I then tried to run the following script:

REGISTER /home/arovner/Documents/workspace-sts-2.6.0.RELEASE/pig-bank/target/pig-bank-0.0.1-SNAPSHOT-jar-with-dependencies.jar;

A = LOAD '/wec/incoming' USING PigStorage() AS (timestamp:chararray,
ip:chararray, country:chararray, state:chararray, event:chararray,
url:chararray, agent:chararray, geo_country:chararray, geo_dma:chararray,
geo_region:chararray, geo_city:chararray, geo_zip:chararray,
browser:chararray, os:chararray, uuid:chararray, segment_id:chararray,
guid:chararray, action:chararray);

B = FOREACH A {
    generate myudfs.Swap(geo_region, geo_zip).geo_zip;
}

STORE B INTO '/wec/output' USING PigStorage();


I would expect to see in the output only the information contained in
"geo_zip" but instead I see the following:

(10019,NY)
(10019,NY)
(10019,NY)

Why is Pig not selecting the specific field, and instead emitting the whole
tuple?
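
A minimal sketch of how this could be narrowed down, assuming the relations
defined in the script above. DESCRIBE prints the schema Pig infers for B, and
the second form flattens the UDF's output tuple into top-level fields before
projecting. The aliases S and C are hypothetical names, and whether the
flattened form avoids the problem on this Pig version is an assumption, not
something verified here:

-- Print the schema Pig infers for B; a single geo_zip field would be expected.
DESCRIBE B;

-- Flatten the tuple returned by Swap into top-level fields, then project one.
-- The AS clause assumes Swap returns (geo_zip, geo_region) in that order.
S = FOREACH A GENERATE FLATTEN(myudfs.Swap(geo_region, geo_zip)) AS (geo_zip, geo_region);
C = FOREACH S GENERATE geo_zip;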

I am using Pig 0.8.0 from Cloudera's CDH3u0 package.

Thanks
Alex