Posted to dev@pig.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2012/09/13 07:41:07 UTC

[jira] [Created] (PIG-2918) Avoid Spillable bag overhead where possible

Dmitriy V. Ryaboy created PIG-2918:
--------------------------------------

             Summary: Avoid Spillable bag overhead where possible
                 Key: PIG-2918
                 URL: https://issues.apache.org/jira/browse/PIG-2918
             Project: Pig
          Issue Type: Bug
            Reporter: Dmitriy V. Ryaboy


We use BagFactory.newDefaultBag() liberally, and pay a price -- each such bag registers with the spillable memory manager, and if we allocate a lot of tiny bags, we wind up paying for maintaining and cleaning up the internal linked list of weak references. 
In many cases, we know a priori that the bags are small, and should probably be creating non-spillable bags for those cases.
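
To make the cost described above concrete, here is a minimal, self-contained sketch. It does not use Pig's classes: Registry, TrackedBag, and UntrackedBag are hypothetical stand-ins for the spillable memory manager, a bag that registers itself, and a non-spillable bag. It only assumes what the description states, namely that each registering bag adds a weak reference to an internal linked list.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class BagOverheadSketch {

    /** Hypothetical stand-in for a memory manager that tracks spillable objects. */
    static final class Registry {
        // One WeakReference entry per registered bag, however small the bag is.
        private final LinkedList<WeakReference<Object>> tracked = new LinkedList<>();

        void register(Object bag) {
            tracked.add(new WeakReference<>(bag));
        }

        int size() {
            return tracked.size();
        }
    }

    /** Hypothetical bag that registers itself on construction. */
    static final class TrackedBag {
        final List<Object> items = new ArrayList<>();

        TrackedBag(Registry registry) {
            registry.register(this);
        }
    }

    /** Hypothetical bag that never touches the registry. */
    static final class UntrackedBag {
        final List<Object> items = new ArrayList<>();
    }

    /** Builds `count` single-element bags and returns how many registry entries they created. */
    public static int registryEntriesFor(int count, boolean tracked) {
        Registry registry = new Registry();
        List<Object> live = new ArrayList<>(count); // keeps every bag reachable
        for (int i = 0; i < count; i++) {
            if (tracked) {
                TrackedBag bag = new TrackedBag(registry);
                bag.items.add(i);
                live.add(bag);
            } else {
                UntrackedBag bag = new UntrackedBag();
                bag.items.add(i);
                live.add(bag);
            }
        }
        return registry.size();
    }

    public static void main(String[] args) {
        int count = 60_000; // arbitrary number of tiny bags
        System.out.println("tracked:   " + registryEntriesFor(count, true));
        System.out.println("untracked: " + registryEntriesFor(count, false));
    }
}
```

In this sketch the tracked variant leaves one registry entry per bag, while the untracked variant leaves none. The real manager also has to walk and prune that list, which the sketch does not model.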

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (PIG-2918) Avoid Spillable bag overhead where possible

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456125#comment-13456125 ] 

Alan Gates commented on PIG-2918:
---------------------------------

Patch looks good.  I'm running it through some of the e2e tests.
                


[jira] [Updated] (PIG-2918) Avoid Spillable bag overhead where possible

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2918:
-----------------------------------

    Attachment: PIG-2918.patch

Attaching a quick pass over builtin functions.
                


[jira] [Updated] (PIG-2918) Avoid Spillable bag overhead where possible

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2918:
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.11
           Status: Resolved  (was: Patch Available)

Committed to 0.11 (trunk). Thanks for the review, Alan.

Btw, I profiled this change -- prior to this patch, calling TOBAG() on about 60,000 tuples resulted in 530 KB worth of DefaultDataBags, and 2 MB of WeakReferences (!). After the patch, there are practically no WeakReferences, and only 175 KB of NonSpillableDataBags. Turns out this sort of thing adds up fast.
                


[jira] [Updated] (PIG-2918) Avoid Spillable bag overhead where possible

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2918:
-----------------------------------

    Status: Patch Available  (was: Open)
    


[jira] [Commented] (PIG-2918) Avoid Spillable bag overhead where possible

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456269#comment-13456269 ] 

Alan Gates commented on PIG-2918:
---------------------------------

+1, tests pass.
                


[jira] [Assigned] (PIG-2918) Avoid Spillable bag overhead where possible

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy reassigned PIG-2918:
--------------------------------------

    Assignee: Dmitriy V. Ryaboy
    
