Posted to issues@flink.apache.org by "Xingcan Cui (JIRA)" <ji...@apache.org> on 2017/10/01 14:20:00 UTC

[jira] [Commented] (FLINK-7730) TableFunction LEFT OUTER joins with ON predicates are broken

    [ https://issues.apache.org/jira/browse/FLINK-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187397#comment-16187397 ] 

Xingcan Cui commented on FLINK-7730:
------------------------------------

Hi [~fhueske], I've checked the implementation and found some clues at the SQL level. For instance, given a simple table {{WordCount(word:String, frequency:Int)}}, a table function {{split: word:String => (letter:String, length:Int)}}, and a SQL query like this
{code:sql}
SELECT word, name, length
FROM WordCount
LEFT JOIN LATERAL TABLE(split(word)) AS T (letter, length) ON frequency = length OR length < 5
{code}
, the query will be translated to the logical plan below.
{code}
LogicalProject(word=[$0], name=[$2], length=[$3])
  LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))])
    LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0}])
      LogicalTableScan(table=[[WordCount]])
      LogicalTableFunctionScan(invocation=[split($cor0.word)], rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class [Ljava.lang.Object;])
{code}
Apparently, this logical plan leads to an improper physical plan: it first correlates each row with its table function results in the {{CorrelateFlatMapRunner}} (much like a cartesian product) and then filters the joined rows in a {{FlatMapFunction}}. That is correct for an inner join but not for a left outer join, because the filter also discards left rows whose table function results all fail the predicate, instead of preserving them padded with nulls.
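
To make the difference concrete, here is a minimal plain-Python sketch of the two evaluation orders. It does not use any Flink API. The {{split}} body (one row per character, carrying the word length), the sample rows, and all helper names are assumptions made up for illustration.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Left = Tuple[str, int]                                   # (word, frequency)
Joined = Tuple[str, int, Optional[str], Optional[int]]   # (word, frequency, letter, length)


def split(word: str) -> List[Tuple[str, int]]:
    """Hypothetical table function: one (letter, length) row per character."""
    return [(ch, len(word)) for ch in word]


def predicate(row: Joined) -> bool:
    """ON frequency = length OR length < 5; a NULL length never evaluates to true."""
    _, frequency, _, length = row
    return length is not None and (frequency == length or length < 5)


def correlate_left(rows: List[Left], fn: Callable) -> Iterator[Joined]:
    """Left correlate without a predicate: pad only when fn returns no rows at all."""
    for word, frequency in rows:
        results = fn(word)
        if not results:
            yield (word, frequency, None, None)
        for letter, length in results:
            yield (word, frequency, letter, length)


def filter_after_correlate(rows: List[Left], fn: Callable, pred: Callable) -> List[Joined]:
    """Shape of the plan above: the filter sits on top of the correlate."""
    return [r for r in correlate_left(rows, fn) if pred(r)]


def left_outer_with_on(rows: List[Left], fn: Callable, pred: Callable) -> List[Joined]:
    """Expected semantics: apply the predicate per left row, pad if nothing matches."""
    out: List[Joined] = []
    for word, frequency in rows:
        matches = [(word, frequency, l, n) for l, n in fn(word)
                   if pred((word, frequency, l, n))]
        out.extend(matches or [(word, frequency, None, None)])
    return out


word_count = [("hi", 2), ("stream", 1)]   # assumed sample data
broken = filter_after_correlate(word_count, split, predicate)
correct = left_outer_with_on(word_count, split, predicate)
```

With this sample data, {{broken}} contains only the two rows for "hi", while {{correct}} additionally keeps "stream" as a null-padded row. The sketch suggests the predicate has to be evaluated inside the correlate (or the padding has to happen after filtering) rather than in a separate filter on top.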

Under these circumstances, could you give some suggestions on how to fix this? BTW, I find it hard to debug the dynamically generated functions (the IDE cannot locate the source). Are there any special techniques for dealing with them?

> TableFunction LEFT OUTER joins with ON predicates are broken
> ------------------------------------------------------------
>
>                 Key: FLINK-7730
>                 URL: https://issues.apache.org/jira/browse/FLINK-7730
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Fabian Hueske
>            Assignee: Xingcan Cui
>
> TableFunction left outer joins with predicates in the ON clause are broken. Apparently, there are no tests for this and it has never worked. I observed issues on several layers:
> - Table Function does not correctly validate equality predicate: {{leftOuterJoin(func1('c) as 'd,  'a.cast(Types.STRING) === 'd)}} is rejected because the predicate is not considered as an equality predicate (the cast needs to be pushed down).
> - Plans cannot be correctly translated: {{leftOuterJoin(func1('c) as 'd,  'c === 'd)}} gives an optimizer exception.
> - SQL queries get translated but produce incorrect results. For example {{SELECT a, b, c, d FROM MyTable LEFT OUTER JOIN LATERAL TABLE(tfunc(c)) AS T(d) ON d = c}} returns an empty result if the condition {{d = c}} never returns true. However, the outer side should be preserved and padded with nulls.
> So there seem to be many issues with table function outer joins. In particular, the wrong results produced by SQL queries need to be fixed quickly.


