Posted to dev@pig.apache.org by "Sriranjan Manjunath (JIRA)" <ji...@apache.org> on 2010/01/29 02:01:43 UTC

[jira] Created: (PIG-1209) Port POJoinPackage to proactively spill

Port POJoinPackage to proactively spill
---------------------------------------

                 Key: PIG-1209
                 URL: https://issues.apache.org/jira/browse/PIG-1209
             Project: Pig
          Issue Type: Bug
            Reporter: Sriranjan Manjunath


POPackage proactively spills the bag, whereas POJoinPackage still relies on the SpillableMemoryManager. We should port POJoinPackage to use InternalCachedBag, which spills proactively.
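
To illustrate what "proactively spills" means here, the following is a minimal, self-contained sketch of a container that keeps a fixed number of records on the heap and writes every further record straight to a temporary file, rather than waiting for a memory manager to ask it to spill under memory pressure. The class ProactiveSpillBag, its record-count budget, and its line-per-record spill format are all hypothetical and chosen for illustration; this is not Pig's InternalCachedBag, and it assumes records contain no line breaks.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of proactive spilling; not Pig's InternalCachedBag. */
class ProactiveSpillBag {
    private final int maxInMemory;                       // records kept on the heap
    private final List<String> inMemory = new ArrayList<>();
    private Path spillFile;                              // created lazily on first spill
    private BufferedWriter spillWriter;
    private long spilledCount;

    ProactiveSpillBag(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    /** Adds a record; once the in-memory budget is used up, writes directly to disk. */
    void add(String record) {
        if (inMemory.size() < maxInMemory) {
            inMemory.add(record);
            return;
        }
        try {
            if (spillWriter == null) {
                spillFile = Files.createTempFile("spillbag", ".txt");
                spillFile.toFile().deleteOnExit();
                spillWriter = Files.newBufferedWriter(spillFile, StandardCharsets.UTF_8);
            }
            spillWriter.write(record);
            spillWriter.newLine();
            spilledCount++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Total number of records, in memory plus on disk. */
    long size() {
        return inMemory.size() + spilledCount;
    }

    /** Number of records that went to the spill file. */
    long spilledCount() {
        return spilledCount;
    }

    /** Returns all records in insertion order: in-memory records first, then spilled ones. */
    List<String> readAll() {
        List<String> all = new ArrayList<>(inMemory);
        if (spillWriter != null) {
            try {
                spillWriter.flush();  // make buffered records visible to the reader
                try (BufferedReader reader =
                         Files.newBufferedReader(spillFile, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        all.add(line);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return all;
    }
}
```

With a budget of 2, adding five records leaves two on the heap and three in the spill file. readAll() loads everything back into memory only to keep the demonstration short; a real bag would stream the spilled records through an iterator instead.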

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-1209:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Patch checked-in.

> Port POJoinPackage to proactively spill
> ---------------------------------------
>
>                 Key: PIG-1209
>                 URL: https://issues.apache.org/jira/browse/PIG-1209
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Ashutosh Chauhan
>             Fix For: 0.7.0
>
>         Attachments: pig-1209.patch
>
>
> POPackage proactively spills the bag, whereas POJoinPackage still relies on the SpillableMemoryManager. We should port POJoinPackage to use InternalCachedBag, which spills proactively.



[jira] Commented: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829415#action_12829415 ] 

Ashutosh Chauhan commented on PIG-1209:
---------------------------------------

Did manual testing on this. With a large enough dataset, some reducers failed with "Error: GC overhead limit exceeded". After applying this patch, those failures no longer occurred. This patch is ready for review.



[jira] Commented: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806525#action_12806525 ] 

Pradeep Kamath commented on PIG-1209:
-------------------------------------

For testing, the data for the first input of the join can be a large number of records with the same join key, so that their size in memory exceeds 500 MB or so. This will hopefully spill and fail with the old code and not fail with the new code.
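
A test input of the kind described above could be produced with a small generator along these lines. This is only a sketch: the class name SkewedJoinInput, the "value-i" payload, and the tab-separated layout are assumptions made for illustration (tab is the default PigStorage delimiter). The number of rows needed to exceed the memory size mentioned above depends on the record width and is left to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical helper for building a skewed join input; not part of Pig. */
class SkewedJoinInput {
    /** Writes `rows` tab-separated lines of the form key, TAB, value-i, all sharing one join key. */
    static void write(Writer out, String joinKey, long rows) {
        try {
            for (long i = 0; i < rows; i++) {
                out.write(joinKey);
                out.write('\t');
                out.write("value-" + i);
                out.write('\n');
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing a buffered file writer and a large row count yields an input in which every record of the first relation lands in the same reducer-side bag, which is the condition the suggested test relies on.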



[jira] Commented: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828540#action_12828540 ] 

Hadoop QA commented on PIG-1209:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12434483/pig-1209.patch
  against trunk revision 905377.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/196/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/196/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/196/console

This message is automatically generated.



[jira] Updated: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-1209:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-1209:
----------------------------------

    Attachment: pig-1209.patch

Attached a patch that makes POJoinPackage use InternalCachedBag instead of DefaultBag. Will test it as Pradeep suggested. Running it through Hudson to make sure it doesn't break existing test cases.



[jira] Assigned: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-1209:
-----------------------------------

    Assignee: Ashutosh Chauhan



[jira] Commented: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831578#action_12831578 ] 

Olga Natkovich commented on PIG-1209:
-------------------------------------

The current unit tests adequately cover this internal change. Additionally, Ashutosh ran several e2e tests and verified that this change fixed the user's problem: the user's script no longer ran out of memory.



[jira] Updated: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1209:
--------------------------------

    Fix Version/s: 0.7.0



[jira] Closed: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai closed PIG-1209.
---------------------------




[jira] Commented: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829691#action_12829691 ] 

Olga Natkovich commented on PIG-1209:
-------------------------------------

+1. Changes look good



[jira] Commented: (PIG-1209) Port POJoinPackage to proactively spill

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806504#action_12806504 ] 

Olga Natkovich commented on PIG-1209:
-------------------------------------

The change would require switching from the old-style spillable bags to the new one implemented by Ying. Testing would be a bit tricky. I think Pradeep had some ideas.


