Posted to issues@flink.apache.org by "Robert Waury (JIRA)" <ji...@apache.org> on 2014/10/16 13:47:33 UTC

[jira] [Comment Edited] (FLINK-1141) Selfjoin fails after DataSet exceeds certain size

    [ https://issues.apache.org/jira/browse/FLINK-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173637#comment-14173637 ] 

Robert Waury edited comment on FLINK-1141 at 10/16/14 11:46 AM:
----------------------------------------------------------------

I tried using the REPARTITION_SORT_MERGE hint, but it still doesn't work, even in a LocalExecutionEnvironment with DOP set to 1.

So far the only method that worked was creating two new data sets for every join (i.e. no data set is used for a join twice).

This workaround also caused issues on the cluster (YARN): the job ran fine with 14 machines but deadlocked with 16.

It also becomes less feasible the more joins I add (most of them are just broadcast joins that add data from different sources to my main data set).



> Selfjoin fails after DataSet exceeds certain size
> -------------------------------------------------
>
>                 Key: FLINK-1141
>                 URL: https://issues.apache.org/jira/browse/FLINK-1141
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 0.6.1-incubating
>         Environment: LocalExecutionEnvironment (dop=4)
>            Reporter: Robert Waury
>            Priority: Minor
>         Attachments: LargeSelfJoin.java
>
>
> As soon as a DataSet exceeds a certain size (1,000,000 tuples in my example), a self-join with a FlatJoinFunction no longer works. After about a second, the Join, DataSource, and DataSink threads are all in a wait state and perform no work (no output files are created), and the job never finishes.
> If I cut the input size in half it works fine.
> My current workaround is to create the DataSet twice and join the two identical DataSets.


