Posted to dev@pig.apache.org by "Ashutosh Chauhan (JIRA)" <ji...@apache.org> on 2010/06/12 19:58:13 UTC

[jira] Commented: (PIG-1448) Detach tuple from inner plans of physical operator

    [ https://issues.apache.org/jira/browse/PIG-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878294#action_12878294 ] 

Ashutosh Chauhan commented on PIG-1448:
---------------------------------------

The problem here is not as bad as it may sound. All physical operators already detach the input tuple once they are done with it. In getNext(), a physical operator first calls processInput(), which attaches the input tuple, and then detaches the tuple at the end. Physical operators contained within inner plans do the same.

The problem arises with a Bin Cond: Pig short-circuits one of the branches of the inner plan, so getNext() is never called on the operators in that branch and the tuple is never detached from them. Note that in these cases the tuple was already attached to all the roots of the inner plan by the operator that contains the plan. So in this particular use case the tuple is attached but never detached, which leaves a stray reference that keeps it from being GC'ed.

This is still not a problem if there is only a single pipeline in the mapper or reducer: the next time a key/value pair is read and run through the pipeline, the reference is overwritten, and the tuple that was not detached in the previous run can then be GC'ed. Only with a Multi Query optimized script may the same pipeline not be run when the next key/value pair is read in map() or reduce(), in which case the stray reference is not overwritten. If all of these conditions are met, and the tuple itself is large or contains large bags, we may end up with an OOME.
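
To make the sequence above concrete, here is a minimal sketch in plain Java. The classes Op and BinCondLike, the eval() method, and the detachAllRoots flag are hypothetical stand-ins invented for illustration; they are not Pig's actual classes, and the explicit detach shown is only one possible shape of a fix, not necessarily what the eventual patch does.

```java
// Hypothetical, simplified stand-ins for illustration; NOT Pig's real classes.
class StrayTupleSketch {

    /** Minimal operator that holds an input reference between attach and detach. */
    static class Op {
        Object input; // stand-in for the attached tuple

        void attachInput(Object t) { input = t; }

        void detachInput() { input = null; }

        // Mirrors the behaviour described above: use the input, then detach it.
        Object getNext() {
            Object result = input;
            detachInput();
            return result;
        }
    }

    /** Bin-cond-like operator with two inner-plan roots; only one branch is run. */
    static class BinCondLike {
        final Op lhs = new Op();
        final Op rhs = new Op();

        Object eval(Object tuple, boolean cond, boolean detachAllRoots) {
            // The enclosing operator attaches the tuple to all roots of the inner plan.
            lhs.attachInput(tuple);
            rhs.attachInput(tuple);

            // Short circuit: getNext() is called on one branch only,
            // so the other branch never detaches its input by itself.
            Object result = cond ? lhs.getNext() : rhs.getNext();

            if (detachAllRoots) {
                // Assumed fix: explicitly detach every root, evaluated or not.
                lhs.detachInput();
                rhs.detachInput();
            }
            return result;
        }
    }

    public static void main(String[] args) {
        BinCondLike op = new BinCondLike();

        Object t1 = new Object();
        op.eval(t1, true, false);
        System.out.println("skipped branch still holds t1: " + (op.rhs.input == t1));

        // Same pipeline run again: the stray reference is overwritten.
        Object t2 = new Object();
        op.eval(t2, true, false);
        System.out.println("t1 no longer referenced: " + (op.rhs.input != t1));

        // With the explicit detach, nothing is left behind.
        op.eval(t2, true, true);
        System.out.println("skipped branch cleared: " + (op.rhs.input == null));
    }
}
```

The second eval() call corresponds to the single-pipeline case, where the next record overwrites the stray reference. In the Multi Query case that second call may not happen on the same pipeline, so the reference from the first call would simply stay in place.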

> Detach tuple from inner plans of physical operator 
> ---------------------------------------------------
>
>                 Key: PIG-1448
>                 URL: https://issues.apache.org/jira/browse/PIG-1448
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.1.0, 0.2.0, 0.3.0, 0.4.0, 0.5.0, 0.6.0, 0.7.0
>            Reporter: Ashutosh Chauhan
>             Fix For: 0.8.0
>
>
> This is a follow-up to PIG-1446, which addresses this general problem only for a specific instance of For Each. In general, all physical operators that can have inner plans are vulnerable to this; a few of them are POLocalRearrange, POFilter, and POCollectedGroup. All of these need to be fixed.
