Posted to user@pig.apache.org by Charles Menguy <cm...@proclivitymedia.com> on 2012/03/13 19:32:08 UTC

"Non-linear" data flow split with 1 leg

Hi all,

I have a question about Pig regarding non-linear data flows.

I'm using the SPLIT operator to process my data differently depending on
its content, but I noticed something unexpected.

When I do a SPLIT with only one leg, the statement fails to parse; the
parser seems to expect at least a second leg.
Something like
SPLIT X INTO X1 if event == 'E1';
will give me the following error:
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error
during parsing. Encountered " ";" "; "" at line ...

Given that it works fine with more than one leg, that the legs don't have
to cover the whole input (a record can go to no leg), and that it's
basically equivalent to doing multiple FILTER ... BY statements, I'm
wondering if this is a bug or if there is a good reason for requiring at
least two legs with SPLIT. I agree that a SPLIT with only one leg is not
really a non-linear data flow, but I find this behavior confusing and
inconsistent. Any thoughts?
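
For illustration, a minimal sketch of the equivalence described above. The
LOAD path, the schema, and the aliases Y1, Y2 and X1 used in the FILTER line
are assumptions made up for this example; only X, the event field and the
'E1' condition come from the statement quoted earlier.

```pig
-- Hypothetical input: the path and schema are assumptions for illustration.
X = LOAD 'events.tsv' AS (event:chararray, payload:chararray);

-- Two legs parse fine. A record whose event is neither 'E1' nor 'E2'
-- simply goes to no leg.
SPLIT X INTO Y1 IF event == 'E1', Y2 IF event == 'E2';

-- The single-leg case written as a FILTER instead. This should yield the
-- relation that "SPLIT X INTO X1 IF event == 'E1';" was meant to produce.
X1 = FILTER X BY event == 'E1';
```

Until the single-leg form parses, the FILTER line is probably the simplest
workaround.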

Thanks,

Charles

-- 
Proclivity® | We Value Your Customers™ 

This message is the property of Proclivity Systems, Inc. and is intended 
only for the use of the addressee(s), and may contain material that is 
confidential and privileged for the sole use of the intended recipient. If 
you are not the intended recipient, reliance or forwarding without express 
permission is strictly prohibited; please contact the sender and delete all 
copies.

Re: "Non-linear" data flow split with 1 leg

Posted by Alan Gates <ga...@hortonworks.com>.
This looks like a parser bug.  

Alan.