Posted to dev@flink.apache.org by "Robert Waury (JIRA)" <ji...@apache.org> on 2014/10/08 13:01:33 UTC
[jira] [Created] (FLINK-1141) Selfjoin fails after DataSet exceeds certain size
Robert Waury created FLINK-1141:
-----------------------------------
Summary: Selfjoin fails after DataSet exceeds certain size
Key: FLINK-1141
URL: https://issues.apache.org/jira/browse/FLINK-1141
Project: Flink
Issue Type: Bug
Components: Local Runtime
Affects Versions: 0.6.1-incubating
Environment: LocalExecutionEnvironment (dop=4)
Reporter: Robert Waury
Priority: Minor
As soon as a DataSet exceeds a certain size (1,000,000 tuples in my example), a self-join with a FlatJoinFunction no longer works. After about a second, the Join, DataSource and DataSink threads are all in a waiting state and perform no work: no output files are created and the job never finishes.
If I cut the input size in half, it works fine.
My current workaround is to create the DataSet twice and join the two identical DataSets.