Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/06/01 01:08:17 UTC

[jira] [Commented] (FLINK-2076) Bug in re-openable hash join

    [ https://issues.apache.org/jira/browse/FLINK-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566817#comment-14566817 ] 

ASF GitHub Bot commented on FLINK-2076:
---------------------------------------

Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/751#issuecomment-107255722
  
    Thanks, this looks like some seriously great debugging! Very nice :-)
    
    It would be great if you could add a test that reproduces the error without the fix and validates that the fix resolves it. I would guess that you already have a setup that produces this error (from debugging). Can you add this as a test?
    
    Also, can we change the fix so that it adds a second memory segment if it is non-null? That would help maintain the performance characteristics of the current code. I vaguely remember that there was a reason to add two memory segments (that code was written quite a while ago, and I should have put more comments into it).
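
One possible reading of the "add a second memory segment, if it is non-null" suggestion, as a minimal sketch. Everything here is hypothetical: `Segment`, `returnSegments`, and the queue/list types are stand-ins for illustration, not Flink's `MemorySegment` or the actual code in `MutableHashTable`, and the real fix in the pull request may look different.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in classes; this is not Flink's hash table code.
class SegmentReturnSketch {

    /** Stand-in for a memory segment; only its identity matters here. */
    static final class Segment {
        final int id;
        Segment(int id) { this.id = id; }
    }

    /**
     * Moves up to two segments from the source into the free list.
     * The first segment is assumed to be mandatory; the second one is
     * added only if the source actually has one (poll() returns null
     * on an empty deque). Returns the number of segments added.
     */
    static int returnSegments(Deque<Segment> source, List<Segment> freeList) {
        Segment first = source.poll();
        if (first == null) {
            throw new IllegalStateException("expected at least one segment");
        }
        freeList.add(first);

        Segment second = source.poll();
        if (second != null) {
            freeList.add(second);
            return 2;
        }
        return 1;
    }

    public static void main(String[] args) {
        Deque<Segment> source = new ArrayDeque<>();
        source.add(new Segment(1));
        List<Segment> freeList = new ArrayList<>();

        // Only one segment is available, so only one is added.
        System.out.println("added: " + returnSegments(source, freeList));
    }
}
```

The point of the null check is that the two-segment fast path is kept whenever a second segment exists, while the single-segment case no longer adds a null entry to the free list.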



> Bug in re-openable hash join
> ----------------------------
>
>                 Key: FLINK-2076
>                 URL: https://issues.apache.org/jira/browse/FLINK-2076
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Chiwan Park
>
> It happens deterministically on my machine with the following setup:
> TaskManager:
>   - heap size: 512m
>   - network buffers: 4096
>   - slots: 32
> Job:
>   - ConnectedComponents
>   - 100k vertices
>   - 1.2m edges
> --> This gives around 260 MB of Flink managed memory; across 32 slots that is about 8 MB per slot. With several memory consumers in the job, this makes the iterative hash join go out-of-core.
> {code}
> java.lang.RuntimeException: Hash Join bug in memory management: Memory buffers leaked.
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:733)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
> 	at org.apache.flink.runtime.operators.hash.ReOpenableMutableHashTable.prepareNextPartition(ReOpenableMutableHashTable.java:167)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:541)
> 	at org.apache.flink.runtime.operators.hash.NonReusingBuildSecondHashMatchIterator.callWithNextKey(NonReusingBuildSecondHashMatchIterator.java:102)
> 	at org.apache.flink.runtime.operators.AbstractCachedBuildSideMatchDriver.run(AbstractCachedBuildSideMatchDriver.java:155)
> 	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
> 	at org.apache.flink.runtime.iterative.task.AbstractIterativePactTask.run(AbstractIterativePactTask.java:139)
> 	at org.apache.flink.runtime.iterative.task.IterationIntermediatePactTask.run(IterationIntermediatePactTask.java:92)
> 	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:560)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
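
The per-slot memory figure in the setup above can be checked with a few lines of arithmetic. This is only a sketch: the 260 MB figure is the approximate value from the report, and the 32 KB page size is an assumption made for illustration, not something stated in the issue. The class and method names are hypothetical.

```java
// Hypothetical helper for the arithmetic in the issue description.
class SlotMemorySketch {

    /** Managed memory divided evenly across task slots, in bytes. */
    static long bytesPerSlot(long managedBytes, int slots) {
        return managedBytes / slots;
    }

    /** Whole pages of the given size that fit into one slot's share. */
    static long pagesPerSlot(long managedBytes, int slots, int pageSizeBytes) {
        return bytesPerSlot(managedBytes, slots) / pageSizeBytes;
    }

    public static void main(String[] args) {
        long managed = 260L * 1024 * 1024; // ~260 MB, from the report
        int slots = 32;                    // from the report
        int pageSize = 32 * 1024;          // ASSUMPTION: 32 KB pages

        System.out.println("bytes per slot: " + bytesPerSlot(managed, slots));
        System.out.println("pages per slot: " + pagesPerSlot(managed, slots, pageSize));
    }
}
```

That share is then split again among the memory consumers in the job, which is consistent with the hash join spilling to disk in this setup.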


