Posted to dev@pig.apache.org by "Joshua Hartman (JIRA)" <ji...@apache.org> on 2012/07/12 06:20:34 UTC

[jira] [Commented] (PIG-2751) Allow macros in FOREACH

    [ https://issues.apache.org/jira/browse/PIG-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412502#comment-13412502 ] 

Joshua Hartman commented on PIG-2751:
-------------------------------------

I have this exact use case: replacing nulls with a default value inside a FOREACH. It's a pain to write a ternary expression every time; it would be much nicer to have a getOrElse-style macro that can be called from within GENERATE.
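
For illustration, here is a minimal sketch of the closest thing that seems possible today, assuming the relation-level macros available since Pig 0.9, where a macro body consists of complete statements rather than a single expression. The macro name get_or_else, its parameter names, and the output field name val are hypothetical, and the schema is only assumed to match the example in the issue.

{code}
-- Hypothetical macro: wraps the whole FOREACH, because a macro body
-- must be made of full Pig statements, not a bare expression.
DEFINE get_or_else(rel, key, primary, fallback) RETURNS out {
    -- Use $fallback whenever $primary is null.
    $out = FOREACH $rel GENERATE $key, (($primary IS NULL) ? $fallback : $primary) AS val;
};

a = LOAD 'data' AS (id:int, val1:int, val2:int);
b = get_or_else(a, id, val1, val2);
DUMP b;
{code}

This keeps the ternary in one place, but the macro has to own the entire GENERATE list, so it does not compose with other generated columns the way an expression-level macro would.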
                
> Allow macros in FOREACH
> -----------------------
>
>                 Key: PIG-2751
>                 URL: https://issues.apache.org/jira/browse/PIG-2751
>             Project: Pig
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 0.10.0
>         Environment: Kubuntu 12.04 64Bit
>            Reporter: Johannes Schwenk
>
> I would like to be able to use macros within the GENERATE clause of a FOREACH.
> Example:
> {code}
> define test_macro(param1, param2) returns ret_val {
>   $ret_val = ($param1 == 0 ? $param2 : $param1);
> };
> a = LOAD 'data' AS (id, val1, val2);
> b = FOREACH a GENERATE id, test_macro(val1, val2);
> DUMP b;
> {code}
> This would be most useful for keeping a single point to edit (the macro) when the definition of a special computation changes. Let's say you have raw log data and several scripts loading it. All scripts need to filter out specific unused columns. Most (but not all) of the scripts deal with a field that needs to be handled in a special way. So I cannot just use two different LOAD functions (one with the special computation and one without), because that would require a second FOREACH ... GENERATE to filter out the unused columns.
