Posted to dev@pig.apache.org by "Yan Zhou (JIRA)" <ji...@apache.org> on 2010/02/25 01:17:28 UTC

[jira] Created: (PIG-1258) [zebra] Number of sorted input splits is unusually high

[zebra] Number of sorted input splits is unusually high
-------------------------------------------------------

                 Key: PIG-1258
                 URL: https://issues.apache.org/jira/browse/PIG-1258
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.6.0
            Reporter: Yan Zhou


The number of sorted input splits is unusually high when the projection spans multiple column groups, a union of tables, or column group(s) that hold many small tfiles. In one test, the number was about 100 times larger than the number of unsorted input splits on the same input tables.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1258:
--------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1258:
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.7.0
           Status: Resolved  (was: Patch Available)

Patch committed to the trunk.



[jira] Commented: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847875#action_12847875 ] 

Yan Zhou commented on PIG-1258:
-------------------------------

Hudson's rerun appears to be hanging. Here is the result from my private run:

     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.



[jira] Commented: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Gaurav Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847560#action_12847560 ] 

Gaurav Jain commented on PIG-1258:
----------------------------------


+1



[jira] Updated: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1258:
--------------------------

    Attachment: PIG-1258.patch



[jira] Closed: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai closed PIG-1258.
---------------------------




[jira] Updated: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1258:
--------------------------

    Status: Patch Available  (was: Open)

Resubmit so Hudson will rerun.



[jira] Updated: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1258:
--------------------------

    Status: Open  (was: Patch Available)

The test report page that is supposed to list the claimed core test failures is not available on the web. Will resubmit.



[jira] Commented: (PIG-1258) [zebra] Number of sorted input splits is unusually high

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847260#action_12847260 ] 

Hadoop QA commented on PIG-1258:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438944/PIG-1258.patch
  against trunk revision 925034.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/244/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/244/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/244/console

This message is automatically generated.
