Posted to dev@pig.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2019/07/05 16:17:00 UTC

[jira] [Updated] (PIG-5390) Avoid adding self-spilling bags to SpillableMemoryManager

     [ https://issues.apache.org/jira/browse/PIG-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5390:
------------------------------
      Priority: Minor  (was: Major)
       Summary: Avoid adding self-spilling bags to SpillableMemoryManager   (was: Possible race condition from Self-spilling bags registering with SpillableMemoryManager )
    Issue Type: Improvement  (was: Bug)

Given that synchronization was added in PIG-3212 and PIG-3466, I'm changing the summary of this Jira and lowering its priority. The question now is: should we stop adding InternalSortedBag and InternalDistinctBag to SpillableMemoryManager?

> Avoid adding self-spilling bags to SpillableMemoryManager 
> ----------------------------------------------------------
>
>                 Key: PIG-5390
>                 URL: https://issues.apache.org/jira/browse/PIG-5390
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>
> This is a follow-up to PIG-5380, where [~rohini] pointed out:
> {quote}
> I think same change is required in InternalSortedBag as well as code is exactly same and it can spill too - https://github.com/apache/pig/blob/trunk/src/org/apache/pig/data/InternalSortedBag.java#L133 . We most likely haven't seen issues with it as the probability could be very less as it will proactively spill if it exceeds cached memory limit.
> {quote}
> Looking at the history and the source, this is a critical bug given that all these self-spilling bags are designed on the premise that no other thread would touch them. The comment in the source clearly says:
> {code}
>  * This bag is not registered with SpillableMemoryManager. It calculates
>  * the number of tuples to hold in memory and spill pro-actively into files.
> {code}
>  
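For readers unfamiliar with the design premise quoted above, here is a minimal Java sketch of a bag that spills pro-actively and never registers with an external memory manager. It is not Pig's code: the class name SelfSpillingBagSketch, the nested Bag type, the fixed tuple budget, and the use of plain strings as tuples are all assumptions made for illustration. Because only the owning thread ever touches the bag, it needs no synchronization, which is the property that registering with a manager running on another thread would put at risk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SelfSpillingBagSketch {

    /** Hypothetical stand-in for a self-spilling bag; NOT Pig's InternalSortedBag. */
    static final class Bag {
        private final int maxInMemory;                        // assumed fixed tuple budget
        private final List<String> inMemory = new ArrayList<>();
        private final List<Path> spillFiles = new ArrayList<>();

        Bag(int maxInMemory) {
            if (maxInMemory < 1) {
                throw new IllegalArgumentException("maxInMemory must be >= 1");
            }
            this.maxInMemory = maxInMemory;
            // Deliberately no registration with any external memory manager:
            // only the thread that owns this bag ever triggers a spill.
        }

        /** Adds a tuple and spills pro-actively once the budget is reached. */
        void add(String tuple) {
            inMemory.add(tuple);
            if (inMemory.size() >= maxInMemory) {
                spill();
            }
        }

        /** Writes the in-memory tuples to a temp file, one per line, then clears them. */
        private void spill() {
            try {
                Path file = Files.createTempFile("bag-spill-", ".txt");
                file.toFile().deleteOnExit();
                Files.write(file, inMemory);   // assumes tuples contain no newlines
                spillFiles.add(file);
                inMemory.clear();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        /** Returns spilled tuples (in spill order) followed by in-memory tuples. */
        List<String> readAll() {
            List<String> all = new ArrayList<>();
            try {
                for (Path file : spillFiles) {
                    all.addAll(Files.readAllLines(file));
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            all.addAll(inMemory);
            return all;
        }

        int inMemoryCount() { return inMemory.size(); }

        int spillCount() { return spillFiles.size(); }
    }

    public static void main(String[] args) {
        Bag bag = new Bag(2);
        for (int i = 1; i <= 5; i++) {
            bag.add("t" + i);
        }
        System.out.println("spills=" + bag.spillCount()
                + " inMemory=" + bag.inMemoryCount()
                + " all=" + bag.readAll());
    }
}
```

In this sketch, if a second thread (for example a memory manager reacting to a low-memory notification) were allowed to call spill() while the owner is inside add(), the unsynchronized list could be cleared or written mid-update. That is the kind of interaction the issue proposes to avoid by not registering such bags in the first place.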



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)