Posted to dev@pig.apache.org by "Ashutosh Chauhan (JIRA)" <ji...@apache.org> on 2010/06/12 19:02:13 UTC

[jira] Created: (PIG-1448) Detach tuple from inner plans of physical operator

Detach tuple from inner plans of physical operator 
---------------------------------------------------

                 Key: PIG-1448
                 URL: https://issues.apache.org/jira/browse/PIG-1448
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.7.0, 0.6.0, 0.5.0, 0.4.0, 0.3.0, 0.2.0, 0.1.0
            Reporter: Ashutosh Chauhan
             Fix For: 0.8.0


This is a follow-up to PIG-1446, which addresses this general problem only for one specific instance, For Each. In general, every physical operator that can have inner plans is vulnerable to it; examples include POLocalRearrange, POFilter, and POCollectedGroup. All of these need to be fixed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898325#action_12898325 ] 

Thejas M Nair commented on PIG-1448:
------------------------------------

Pasting the result of test-patch. No new tests are included because the patch only changes the time at which memory is freed.

     [exec]
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]


> Detach tuple from inner plans of physical operator 
> ---------------------------------------------------
>
>                 Key: PIG-1448
>                 URL: https://issues.apache.org/jira/browse/PIG-1448
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.1.0, 0.2.0, 0.3.0, 0.4.0, 0.5.0, 0.6.0, 0.7.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: multi_oom_filt.pig, PIG-1448.1.patch
>
>
> This is a follow-up to PIG-1446, which addresses this general problem only for one specific instance, For Each. In general, every physical operator that can have inner plans is vulnerable to it; examples include POLocalRearrange, POFilter, and POCollectedGroup. All of these need to be fixed.



[jira] Commented: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878294#action_12878294 ] 

Ashutosh Chauhan commented on PIG-1448:
---------------------------------------

The problem here is not as bad as it may sound. All physical operators already detach the input tuple once they are done with it: in getNext(), a physical operator first calls processInput(), which first attaches the input tuple and then detaches it at the end. Physical operators contained within inner plans do the same.

The problem arises when there is a Bin Cond. Pig short-circuits one of the branches of the inner plan, so getNext() of the operator on that branch is never called and the tuple is never detached. Note that in these cases the tuple was already attached to all the roots of the plan by the operator that owns the inner plan. So in this particular use case the tuple was attached but never detached, leaving a stray reference that prevents it from being GC'ed.

This is still not a problem if there is only a single pipeline in the mapper or reducer: the next time a new key/value pair is read and run through the pipeline, the reference is overwritten, and the tuple that was not detached in the previous run can then be GC'ed. Only with a Multi Query optimized script may the same pipeline not be run when the next key/value pair is read in map() or reduce(), in which case the stray reference is not overwritten. If all of these conditions are met and the tuple itself is large or contains large bags, we may end up with an OOME.
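
The mechanism described in this comment can be illustrated with a small toy model. This is only a sketch: the classes Leaf and BinCond and the helpers process() and strayReferences() are hypothetical stand-ins invented for illustration, not Pig's real operator classes. It shows a short-circuited branch keeping a reference to the tuple, and a later record in the same pipeline overwriting that reference.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Pig's real operator classes.
class StrayTupleSketch {

    /** A root of an inner plan; holds a reference to its attached input. */
    static class Leaf {
        Object input; // while non-null, the tuple stays reachable
        void attachInput(Object tuple) { input = tuple; }
        void detachInput() { input = null; }
        Object getNext() { // consume the input, then release it
            Object result = input;
            detachInput();
            return result;
        }
    }

    /** Bincond-like expression: evaluates the condition and exactly one branch. */
    static class BinCond {
        final Leaf cond = new Leaf();
        final Leaf ifTrue = new Leaf();
        final Leaf ifFalse = new Leaf();
        List<Leaf> roots() { return Arrays.asList(cond, ifTrue, ifFalse); }
        Object getNext(boolean takeTrue) {
            cond.getNext();
            // The branch not taken is short-circuited: its getNext() never runs.
            return takeTrue ? ifTrue.getNext() : ifFalse.getNext();
        }
    }

    /** Owner of the inner plan: attaches the tuple to every root, then evaluates. */
    static void process(BinCond plan, Object tuple, boolean takeTrue) {
        for (Leaf root : plan.roots()) {
            root.attachInput(tuple);
        }
        plan.getNext(takeTrue);
    }

    /** Counts roots that still reference a tuple. */
    static int strayReferences(BinCond plan) {
        int stray = 0;
        for (Leaf root : plan.roots()) {
            if (root.input != null) stray++;
        }
        return stray;
    }

    public static void main(String[] args) {
        BinCond plan = new BinCond();
        Object first = new byte[1024];
        process(plan, first, true);
        System.out.println("stray references after first record: " + strayReferences(plan));
        System.out.println("skipped branch still holds first: " + (plan.ifFalse.input == first));

        // Single pipeline: the next record overwrites the stale reference.
        Object second = new byte[1024];
        process(plan, second, true);
        System.out.println("skipped branch still holds first: " + (plan.ifFalse.input == first));
    }
}
```

In this model the first tuple becomes unreachable only because process() runs again on the same plan. If the plan is never run again, which is the Multi Query situation described above, the skipped branch keeps the tuple alive.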



[jira] Updated: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1448:
-------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed

Patch committed to trunk.




[jira] Assigned: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-1448:
-----------------------------------

    Assignee: Thejas M Nair



[jira] Updated: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1448:
-------------------------------

    Attachment: PIG-1448.1.patch



[jira] Updated: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1448:
-------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898450#action_12898450 ] 

Richard Ding commented on PIG-1448:
-----------------------------------

+1. Looks good.



[jira] Commented: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898404#action_12898404 ] 

Thejas M Nair commented on PIG-1448:
------------------------------------

All tests are successful. Patch is ready for review.




[jira] Updated: (PIG-1448) Detach tuple from inner plans of physical operator

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1448:
-------------------------------

    Attachment: multi_oom_filt.pig

multi_oom_filt.pig is a query that reproduces this problem for a filter. There is a bincond in the filter condition, and one side of the bincond is neither evaluated nor detached. The query runs out of memory in reduce. With the fix to call detach from POFilter, the query succeeds.
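
As a rough illustration of what "call detach from the filter" could look like, here is a toy filter that detaches every root of its inner plan once the condition has been evaluated. The patch itself is not shown in this thread, so this is an assumption about the general shape of such a fix; ToyFilter, Leaf, the detachInnerPlan flag, and the Integer "tuple" are all hypothetical and are not taken from PIG-1448.1.patch.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not the contents of PIG-1448.1.patch.
class FilterDetachSketch {

    /** A root of the filter's inner plan; holds a reference to its input. */
    static class Leaf {
        Object input;
        void attachInput(Object tuple) { input = tuple; }
        void detachInput() { input = null; }
        Object getNext() {
            Object result = input;
            detachInput();
            return result;
        }
    }

    /** Filter whose condition is bincond-like: only one of left/right is evaluated. */
    static class ToyFilter {
        final Leaf cond = new Leaf();
        final Leaf left = new Leaf();
        final Leaf right = new Leaf();
        final List<Leaf> roots = Arrays.asList(cond, left, right);
        final boolean detachInnerPlan;

        ToyFilter(boolean detachInnerPlan) { this.detachInnerPlan = detachInnerPlan; }

        /** Returns the tuple if it passes the condition, or null if it is filtered out. */
        Integer getNext(Integer tuple) {
            for (Leaf root : roots) {
                root.attachInput(tuple);
            }
            try {
                cond.getNext();
                Leaf taken = tuple >= 0 ? left : right; // the other branch is skipped
                boolean pass = ((Integer) taken.getNext()) % 2 == 0;
                return pass ? tuple : null;
            } finally {
                if (detachInnerPlan) {
                    // Release the tuple from every root, including the skipped branch.
                    for (Leaf root : roots) {
                        root.detachInput();
                    }
                }
            }
        }

        /** Counts roots that still reference a tuple. */
        int attachedRoots() {
            int attached = 0;
            for (Leaf root : roots) {
                if (root.input != null) attached++;
            }
            return attached;
        }
    }

    public static void main(String[] args) {
        ToyFilter leaky = new ToyFilter(false);
        leaky.getNext(4);
        System.out.println("roots still attached without detach: " + leaky.attachedRoots());

        ToyFilter fixed = new ToyFilter(true);
        fixed.getNext(4);
        System.out.println("roots still attached with detach: " + fixed.attachedRoots());
    }
}
```

Detaching in a finally block is a design choice of this sketch: it releases the tuple whether the record passes, is filtered out, or evaluation throws.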

