You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by "Daniel Bali (JIRA)" <ji...@apache.org> on 2015/03/02 18:53:04 UTC
[jira] [Created] (FLINK-1628) Strange behavior of "where" function
during a join
Daniel Bali created FLINK-1628:
----------------------------------
Summary: Strange behavior of "where" function during a join
Key: FLINK-1628
URL: https://issues.apache.org/jira/browse/FLINK-1628
Project: Flink
Issue Type: Bug
Affects Versions: 0.9
Reporter: Daniel Bali
Hello!
If I use the `where` function with a field list during a join, it exhibits strange behavior.
Here is the sample code that triggers the error: https://gist.github.com/balidani/d9789b713e559d867d5c
This example joins a DataSet with itself, then counts the number of rows. If I use `.where(0, 1)` the result is (22), which is not correct. If I use `EdgeKeySelector`, I get the correct result (101).
When I pass a field list to the `equalTo` function (but not `where`), everything works again.
If I don't include the `groupBy` and `reduceGroup` parts, everything works.
Also, when working with large DataSets, passing a field list to `where` makes it incredibly slow, even though I don't see any exceptions in the log (in DEBUG mode).
Does anybody know what might cause this problem?
Thanks!
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)