Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2008/07/25 05:23:31 UTC

[jira] Created: (PIG-337) If limit size exceeds number of records in the file, a few records get dropped

If limit size exceeds number of records in the file, a few records get dropped
------------------------------------------------------------------------------

                 Key: PIG-337
                 URL: https://issues.apache.org/jira/browse/PIG-337
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
            Reporter: Alan Gates
             Fix For: types_branch


Given a file with 10k records, the following script returned 9996 records:

a = load 'studenttab10k';
b = limit a 100000;
dump b;

It looks like the limit operator may be failing to return its last record, or something similar.
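
For reference, the behavior expected from the script above can be written down as a small model. This is only a sketch in Python, not Pig's implementation: the function name `limit` and the generated sample data are made up for illustration, and it assumes a limit larger than the input should pass every record through unchanged.

```python
from itertools import islice

def limit(records, n):
    """Return at most n records, in input order, keeping duplicates."""
    return list(islice(records, n))

# Hypothetical stand-in for a 10k-record input file.
records = [("student%d" % i, i % 50) for i in range(10000)]

out = limit(records, 100000)
# Under this model the count is min(n, number of input records),
# so a limit of 100000 over 10000 records yields all 10000.
print(len(out))
```

Under this model, any count below 10000 for the script in the report indicates dropped records.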

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-337) If limit size exceeds number of records in the file, a few records get dropped

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616772#action_12616772 ] 

Daniel Dai commented on PIG-337:
--------------------------------

More generally, any input file with duplicate records is potentially affected, even if the limit size is within the number of records in the file. I will submit a patch shortly.
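
One way to check an output for this kind of loss is to compare the expected and actual records as multisets, so that a missing duplicate shows up even when every distinct record is still present. The following is a rough sketch in Python, not part of Pig or of the patch; the helper name `dropped_records` and the sample tuples are hypothetical.

```python
from collections import Counter

def dropped_records(expected, actual):
    """Return the records (with multiplicity) that are in expected but missing from actual."""
    # Counter subtraction keeps only positive counts.
    return Counter(expected) - Counter(actual)

expected = [("alice", 20), ("bob", 21), ("alice", 20), ("carol", 22)]
# Hypothetical faulty output in which one of the duplicate records was lost.
actual = [("alice", 20), ("bob", 21), ("carol", 22)]

missing = dropped_records(expected, actual)
for record, count in missing.items():
    print(record, "missing", count, "time(s)")
```

A comparison of distinct records alone would not catch this case, since ("alice", 20) still appears once in the faulty output.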


[jira] Commented: (PIG-337) If limit size exceeds number of records in the file, a few records get dropped

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616802#action_12616802 ] 

Daniel Dai commented on PIG-337:
--------------------------------

Please hold on for a while; there is still an issue in this patch.


[jira] Commented: (PIG-337) If limit size exceeds number of records in the file, a few records get dropped

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617006#action_12617006 ] 

Daniel Dai commented on PIG-337:
--------------------------------

Should be fine now.


[jira] Updated: (PIG-337) If limit size exceeds number of records in the file, a few records get dropped

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-337:
---------------------------

    Attachment: PIG-337.patch


[jira] Updated: (PIG-337) If limit size exceeds number of records in the file, a few records get dropped

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-337:
---------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

PIG-337.patch committed.


[jira] Updated: (PIG-337) If limit size exceeds number of records in the file, a few records get dropped

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-337:
---------------------------

    Status: Patch Available  (was: Open)