Posted to dev@pig.apache.org by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org> on 2011/07/15 16:01:00 UTC

[jira] [Updated] (PIG-1904) Default split destination

     [ https://issues.apache.org/jira/browse/PIG-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated PIG-1904:
------------------------------------------------

    Attachment: PIG-1904.1.patch

PIG-1904.1.patch contains the first working implementation of the feature.

The grammar now recognizes statements like:
    SPLIT a INTO b IF x1 < 0, c OTHERWISE;
but also like:
    SPLIT a INTO b IF x1 < 0;
This is a side effect of making the OTHERWISE branch optional, and it is a change from past behavior.
It shouldn't be a problem, since the Split maps to a Filter in any case.

The OTHERWISE branch is implemented by copying the plans of the other LOSplitOutput operators and building a negated disjunction (OR) of their expressions.
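
To illustrate the negated disjunction, here is a minimal sketch in Python rather than Pig's Java. It is not the code in the patch: the names otherwise_predicate and split are hypothetical, rows are plain dicts, and null handling is ignored. Each branch is evaluated as an independent filter over the same input, and the OTHERWISE branch gets NOT (c1 OR c2 OR ...).

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def otherwise_predicate(conditions: List[Predicate]) -> Predicate:
    """Build NOT (c1 OR c2 OR ...) from the conditions of the other branches."""
    def predicate(row: Row) -> bool:
        return not any(cond(row) for cond in conditions)
    return predicate


def split(rows: List[Row],
          branches: List[Tuple[str, Predicate]],
          otherwise: Optional[str] = None) -> Dict[str, List[Row]]:
    """Evaluate every branch as an independent filter over the same input."""
    filters = list(branches)
    if otherwise is not None:
        # The OTHERWISE branch reuses the conditions of all other branches.
        conditions = [cond for _, cond in branches]
        filters.append((otherwise, otherwise_predicate(conditions)))
    return {name: [row for row in rows if cond(row)] for name, cond in filters}


# Conditions taken from the example in the issue description.
rows = [
    {"f1": 1, "f2": 5, "f3": 6},  # matches X and Y
    {"f1": 8, "f2": 4, "f3": 6},  # matches no explicit branch
    {"f1": 8, "f2": 4, "f3": 7},  # matches Z
]
branches = [
    ("X", lambda r: r["f1"] < 7),
    ("Y", lambda r: r["f2"] == 5),
    ("Z", lambda r: r["f3"] < 6 or r["f3"] > 6),
]
result = split(rows, branches, otherwise="OTHER")
```

In this sketch only the second row lands in OTHER, and a row may still appear in several of the explicit outputs, as with a regular SPLIT.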

Added a unit test for Split-Otherwise.

TODO:
Disable the feature if the expression contains a @NonDeterministic UDF.
I plan to do it by spawning a visitor on the expression.
The visitor will throw an error and explain the reason in the error message.
Is this a reasonable approach?
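
A rough sketch of the visitor idea, again in Python rather than Pig's Java. Everything here is an assumption for illustration: the expression classes (Column, Constant, BinaryOp, UdfCall), the deterministic flag standing in for the @NonDeterministic annotation, and the error type are hypothetical and are not Pig's actual classes. The check matters presumably because the negated copy of an expression is evaluated separately from the original, so a non-deterministic UDF could give the two evaluations different results.

```python
from dataclasses import dataclass
from typing import Tuple


class NonDeterministicSplitError(Exception):
    """Raised when an OTHERWISE branch depends on a non-deterministic UDF."""


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Constant:
    value: object


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class UdfCall:
    name: str
    args: Tuple[object, ...]
    deterministic: bool = True  # stand-in for the @NonDeterministic annotation


def check_deterministic(expr) -> None:
    """Walk the expression tree and fail on the first non-deterministic UDF."""
    if isinstance(expr, UdfCall):
        if not expr.deterministic:
            raise NonDeterministicSplitError(
                f"OTHERWISE cannot be used: UDF '{expr.name}' is "
                "non-deterministic, so the negated condition could disagree "
                "with the original one"
            )
        for arg in expr.args:
            check_deterministic(arg)
    elif isinstance(expr, BinaryOp):
        check_deterministic(expr.left)
        check_deterministic(expr.right)
    # Column and Constant are leaves; nothing to check.


safe = BinaryOp("<", Column("x1"), Constant(0))
unsafe = BinaryOp("<", UdfCall("RANDOM", (), deterministic=False), Constant(0.5))

check_deterministic(safe)  # passes silently
try:
    check_deterministic(unsafe)
except NonDeterministicSplitError as err:
    message = str(err)  # names the offending UDF
```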

> Default split destination
> -------------------------
>
>                 Key: PIG-1904
>                 URL: https://issues.apache.org/jira/browse/PIG-1904
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Daniel Dai
>              Labels: gsoc2011
>             Fix For: 0.10
>
>         Attachments: PIG-1904.1.patch
>
>
> The "split" statement should have a default destination, e.g.:
> {code}
> SPLIT A INTO X IF f1<7, Y IF f2==5, Z IF (f3<6 OR f3>6), OTHER otherwise; -- OTHER has all tuples with f1>=7 && f2!=5 && f3==6
> {code}
> This is a candidate project for Google Summer of Code 2011. More information about the program can be found at http://wiki.apache.org/pig/GSoc2011

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira