Posted to dev@pig.apache.org by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org> on 2011/06/01 13:27:47 UTC

[jira] [Updated] (PIG-1926) Sample/Limit should take scalar

     [ https://issues.apache.org/jira/browse/PIG-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated PIG-1926:
------------------------------------------------

    Attachment: PIG-1926.patch

Addressed TestLogToPhyCompiler failure.
    TestLogToPhyCompiler.testLimit():
      Modified grammar file LogicalPlanGenerator.g to preserve previous AST structure.
      The previous modification disrupted tests that directly interacted with ASTs.
      The new grammar just adds expr() as an alternative within the original limit_clause() rule.

Addressed TestGrunt failure.
    Modified TypeCheckingRelVisitor to correctly cast the expression in LOLimit.
    The visitor now visits the expression plan only if one is present.
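
As a rough illustration of the guard described above (this is not the patch code): the sketch below uses hypothetical stand-in classes rather than Pig's LOLimit or TypeCheckingRelVisitor, and it assumes that the limit expression should end up typed as a long. A constant LIMIT carries no expression, so the check returns without visiting anything.

```java
// Hypothetical stand-ins for illustration only; these are NOT Pig's classes.
final class LimitTypeCheckSketch {

    /** Result types an expression can have in this toy model. */
    enum Type { INT, LONG, DOUBLE }

    /** Toy limit expression: a result type plus a flag set when a cast is added. */
    static final class Expr {
        Type type;
        boolean castInserted = false;

        Expr(Type type) { this.type = type; }
    }

    /** Toy LIMIT operator: a constant limit, or an expression when the limit is variable. */
    static final class Limit {
        final long constantLimit; // only meaningful when limitExpr is null
        final Expr limitExpr;     // null in the constant case

        private Limit(long constantLimit, Expr limitExpr) {
            this.constantLimit = constantLimit;
            this.limitExpr = limitExpr;
        }

        static Limit ofConstant(long n) { return new Limit(n, null); }

        static Limit ofExpression(Expr e) { return new Limit(-1L, e); }
    }

    /**
     * Type-checks a LIMIT operator. The expression is visited only if present;
     * if its type is not LONG (an assumption of this sketch), a cast is recorded.
     * Returns true when an expression was visited, false for a constant LIMIT.
     */
    static boolean typeCheck(Limit limit) {
        if (limit.limitExpr == null) {
            return false; // constant LIMIT: no expression plan to visit
        }
        if (limit.limitExpr.type != Type.LONG) {
            limit.limitExpr.type = Type.LONG;
            limit.limitExpr.castInserted = true;
        }
        return true;
    }
}
```

Returning early for the constant case keeps the existing behaviour untouched, which is the point of visiting the plan only when it exists.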


Disabled optimizer for variable LIMIT case.


Added simple test for variable LIMIT.


Attempted to fix things in the MR layer:
    Added a limitPlan to MapReduceOper (not sure it's useful).
    Adapted the check on the existence of a limit in MROper.
    Also copy the plan when creating a new POLimit in MRCompiler.
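
The three MR-layer changes above can be sketched roughly as follows. This is a toy model, not the patch: every class, field, and method name is a hypothetical stand-in for MapReduceOper, POLimit, and MRCompiler, and the -1 sentinel for "no constant limit" is an assumption made for the sketch.

```java
// Hypothetical stand-ins for illustration only; these are NOT Pig's classes.
final class MrLimitSketch {

    /** Assumed sentinel meaning "no constant limit set". */
    static final long NO_LIMIT = -1L;

    /** Toy expression plan; copy() returns an independent instance. */
    static final class Plan {
        final String text;

        Plan(String text) { this.text = text; }

        Plan copy() { return new Plan(text); }
    }

    /** Toy map-reduce operator carrying either a constant limit or a limit plan. */
    static final class MrOper {
        long limit = NO_LIMIT;
        Plan limitPlan; // null unless the limit is variable

        /** The adapted check: a limit exists if either form is set. */
        boolean hasLimit() {
            return limit != NO_LIMIT || limitPlan != null;
        }
    }

    /** Toy physical LIMIT operator. */
    static final class PhysLimit {
        long limit = NO_LIMIT;
        Plan limitPlan;
    }

    /** Creates a new physical limit for the operator, copying the plan as well. */
    static PhysLimit newLimitFor(MrOper op) {
        PhysLimit result = new PhysLimit();
        result.limit = op.limit;
        result.limitPlan = (op.limitPlan == null) ? null : op.limitPlan.copy();
        return result;
    }
}
```

Copying the plan rather than sharing the reference means the new operator does not alias the plan held by the original one.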


TODO:
Test whether it works in non-local mode (how should I do that without a cluster?)
Evaluate what else to change in the MR layer.

> Sample/Limit should take scalar
> -------------------------------
>
>                 Key: PIG-1926
>                 URL: https://issues.apache.org/jira/browse/PIG-1926
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Daniel Dai
>              Labels: gsoc2011
>         Attachments: PIG-1926.patch, PIG-1926.patch, PIG-1926.patch, PIG-1926.patch
>
>
> Currently, Limit and Sample only take a constant. It would be better if we could use a scalar in place of the constant. E.g.:
> {code}
> a = load 'a.txt';
> b = group a all;
> c = foreach b generate COUNT(a) as sum;
> d = order a by $0;
> e = limit d c.sum/100;
> {code}
> This is a candidate project for Google Summer of Code 2011. More information about the program can be found at http://wiki.apache.org/pig/GSoc2011

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira