Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2009/01/27 03:08:59 UTC

[jira] Created: (PIG-637) limit with order by is broken in local mode

limit with order by is broken in local mode
-------------------------------------------

                 Key: PIG-637
                 URL: https://issues.apache.org/jira/browse/PIG-637
             Project: Pig
          Issue Type: Bug
            Reporter: Olga Natkovich
            Assignee: Shubham Chopra


Shubham, could you take a look?

The following script, when run in local mode, ignores the limit and outputs the entire data set:

a = load 'studenttab10k' as (name, age, gpa);
b = order a by name;
c = limit b 10;
dump c;

The same script works correctly in MR (MapReduce) mode.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-637) limit with order by is broken in local mode

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671328#action_12671328 ] 

Olga Natkovich commented on PIG-637:
------------------------------------

I am reviewing this patch.


[jira] Updated: (PIG-637) limit with order by is broken in local mode

Posted by "Shubham Chopra (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shubham Chopra updated PIG-637:
-------------------------------

    Attachment: 637.patch

This happens because the optimizer eliminates a limit that follows a sort and sets a limit attribute on POSort/LOSort instead. Local-mode sorting does not use this attribute, since using it would adversely affect the MR sorting of the samples.

I have modified the code so that this optimization is skipped when executing in local mode. I have also added a couple of test cases that verify the plans in both local and MR mode.
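
For readers unfamiliar with the optimizer, the behavior described above can be illustrated with a small toy model. This is only a sketch in Python, not Pig's actual code: the `Sort` and `Limit` classes, the `optimize` and `run` functions, and the `skip_in_local` flag are all hypothetical names invented for illustration. It assumes only what the comment states: the limit is folded into the sort as an attribute, and the local-mode sort does not read that attribute.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sort:
    key: int                     # index of the column to sort on
    limit: Optional[int] = None  # set when the optimizer folds a limit in


@dataclass
class Limit:
    n: int


def optimize(plan, local_mode, skip_in_local=True):
    """Fold `Sort, Limit(n)` into `Sort(limit=n)`, except in local mode."""
    if local_mode and skip_in_local:
        return list(plan)  # the fix: keep the explicit Limit operator
    out = []
    for op in plan:
        if isinstance(op, Limit) and out and isinstance(out[-1], Sort):
            out[-1] = Sort(out[-1].key, limit=op.n)
        else:
            out.append(op)
    return out


def run(plan, rows, local_mode):
    """Execute the toy plan over a list of tuples."""
    for op in plan:
        if isinstance(op, Sort):
            rows = sorted(rows, key=lambda r: r[op.key])
            # Assumption: only the MR-style sort honors the folded-in limit.
            if op.limit is not None and not local_mode:
                rows = rows[:op.limit]
        elif isinstance(op, Limit):
            rows = rows[:op.n]
    return rows


# Made-up stand-in data for (name, age, gpa); not the real studenttab10k.
rows = [(f"name{i:03d}", 20 + i % 5, 3.0) for i in reversed(range(100))]
plan = [Sort(key=0), Limit(10)]

# Before the fix: limit folded into the sort, then ignored in local mode.
buggy = run(optimize(plan, local_mode=True, skip_in_local=False), rows, local_mode=True)
# After the fix: the optimization is skipped in local mode.
fixed = run(optimize(plan, local_mode=True), rows, local_mode=True)
# MR mode: the optimization is applied and the sort honors the limit.
mr = run(optimize(plan, local_mode=False), rows, local_mode=False)

print(len(buggy), len(fixed), len(mr))  # prints "100 10 10"
```

In this model the unfixed local run returns all 100 rows, while the fixed local run and the MR run both return the same first 10 sorted rows.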


[jira] Resolved: (PIG-637) limit with order by is broken in local mode

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-637.
--------------------------------

    Resolution: Fixed

Patch committed. Thanks, Shubham!
