Posted to dev@pig.apache.org by "Russell Jurney (JIRA)" <ji...@apache.org> on 2013/12/21 21:10:09 UTC

[jira] [Commented] (PIG-3269) In operator support

    [ https://issues.apache.org/jira/browse/PIG-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13854917#comment-13854917 ] 

Russell Jurney commented on PIG-3269:
-------------------------------------

Does this work for searching IN a bag of tuples?

> In operator support
> -------------------
>
>                 Key: PIG-3269
>                 URL: https://issues.apache.org/jira/browse/PIG-3269
>             Project: Pig
>          Issue Type: New Feature
>          Components: internal-udfs, parser
>    Affects Versions: 0.11
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12.0
>
>         Attachments: PIG-3269-2.patch, PIG-3269-3.patch, PIG-3269-4.patch, PIG-3269-5.patch, PIG-3269.patch
>
>
> This is another language improvement using the same approach as in PIG-3268.
> Currently, Pig has no support for an IN operator. To mimic it, users often have to chain several equality tests with OR.
> For example,
> {code}
> a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
> b = FILTER a BY 
>    (i == 1) OR
>    (i == 22) OR
>    (i == 333) OR
>    (i == 4444) OR
>    (i == 55555);
> {code}
> But this can be rewritten more compactly using an IN operator, as follows:
> {code}
> a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
> b = FILTER a BY i IN (1,22,333,4444,55555);
> {code}
> I propose that we implement the IN operator in the following manner:
> * Add built-in UDFs that take expressions as args. For the aforementioned IN operator, for example, we can define a UDF such as {{builtInUdf(i, 1, 22, 333, 4444, 55555)}}.
> * Add syntactic sugar for these built-in UDFs.
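
The first bullet of the proposal can be sketched in plain Java to show the intended semantics: the first argument is the value under test, and the remaining arguments are the candidates. This is only an illustration, not the code from the attached patches. The class name InSketch, the method name in, and the null handling are assumptions, and the sketch deliberately does not use Pig's UDF API.

```java
import java.util.Objects;

// Hypothetical stand-in for the proposed built-in UDF; not Pig's actual code.
final class InSketch {
    private InSketch() {}

    // Mirrors builtInUdf(i, 1, 22, 333, 4444, 55555): the first argument is
    // the value under test, the remaining arguments are the candidates.
    static boolean in(Object value, Object... candidates) {
        if (value == null) {
            return false; // assumption: a null value matches nothing
        }
        for (Object candidate : candidates) {
            if (Objects.equals(value, candidate)) {
                return true; // same result as one branch of the OR chain
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(in(22, 1, 22, 333, 4444, 55555));
        System.out.println(in(7, 1, 22, 333, 4444, 55555));
    }
}
```

Under the second bullet, the parser would presumably translate an expression such as i IN (1,22,333,4444,55555) into a call of this shape, so the filter behaves like the original OR chain.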



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)