Posted to dev@pig.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2019/07/03 21:36:00 UTC

[jira] [Commented] (PIG-5380) SortedDataBag hitting ConcurrentModificationException or producing incorrect output in a corner-case

    [ https://issues.apache.org/jira/browse/PIG-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878176#comment-16878176 ] 

Koji Noguchi commented on PIG-5380:
-----------------------------------

{quote}
 I think the same change is required in InternalSortedBag as well, since the code is exactly the same and it can spill too - https://github.com/apache/pig/blob/trunk/src/org/apache/pig/data/InternalSortedBag.java#L133 . We most likely haven't seen issues with it because the probability is very low: it proactively spills when it exceeds the cached memory limit.
{quote}

As we discussed offline, I was confused since the comment in the source code clearly says
{code}
 * This bag is not registered with SpillableMemoryManager. It calculates
 * the number of tuples to hold in memory and spill pro-actively into files.
{code}
Looking back at the changes, I think I now understand that this is a bigger bug than this jira covers, given that all these self-spilling bags are designed on the premise that no other thread would touch them (and thus the locking was dropped). Created PIG-5390 for follow-up.
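
For readers unfamiliar with the failure mode, here is a minimal sketch of how a java.util.ConcurrentModificationException can come out of an ArrayList iterator when the list is cleared while the iterator is still open. It is not Pig code: the class SpillDuringIterationSketch, its contents field and its spill() method are hypothetical stand-ins, and the "spill" is simulated on the same thread so the result is deterministic. It only assumes that a spill empties the in-memory list that a reader is iterating over.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a bag that keeps tuples in an ArrayList.
// This is an illustration only, not the SortedDataBag implementation.
class SpillDuringIterationSketch {
    private final List<String> contents = new ArrayList<>();

    void add(String tuple) {
        contents.add(tuple);
    }

    // Assumption: a spill writes the tuples elsewhere and then clears
    // the in-memory list. Only the clear() matters for this sketch.
    void spill() {
        contents.clear();
    }

    // Opens an iterator, simulates a spill in the middle of the read,
    // and reports whether the next read fails with a
    // ConcurrentModificationException.
    static boolean iterationFailsAfterSpill() {
        SpillDuringIterationSketch bag = new SpillDuringIterationSketch();
        bag.add("a");
        bag.add("b");
        bag.add("c");

        Iterator<String> it = bag.contents.iterator();
        it.next();      // reader has consumed one element
        bag.spill();    // list is structurally modified behind the iterator

        try {
            it.next();  // the fail-fast iterator detects the modification
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME after spill: " + iterationFailsAfterSpill());
    }
}
```

With a real second thread the outcome depends on timing, and the fail-fast check is only best-effort, so a reader could also continue with stale or missing data instead of getting an exception. Guarding both the read path and the spill path with the same lock is one way to rule that out.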

> SortedDataBag hitting ConcurrentModificationException or producing incorrect output in a corner-case 
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PIG-5380
>                 URL: https://issues.apache.org/jira/browse/PIG-5380
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5380-v01.patch
>
>
> A user had a UDF that created a large SortedDataBag. This UDF was failing with
> {noformat}
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
>   at java.util.ArrayList$Itr.next(ArrayList.java:851)
>   at org.apache.pig.data.SortedDataBag$SortedDataBagIterator.readFromPriorityQ(SortedDataBag.java:346)
>   at org.apache.pig.data.SortedDataBag$SortedDataBagIterator.next(SortedDataBag.java:322)
>   at org.apache.pig.data.SortedDataBag$SortedDataBagIterator.hasNext(SortedDataBag.java:235)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)