
Posted to issues@flink.apache.org by "Runkang He (Jira)" <ji...@apache.org> on 2022/04/28 00:36:00 UTC

[jira] [Commented] (FLINK-26929) Introduce adaptive hash join for batch sql optimization

    [ https://issues.apache.org/jira/browse/FLINK-26929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529131#comment-17529131 ] 

Runkang He commented on FLINK-26929:
------------------------------------

Hi dalongliu, is this Jira trying to solve the problem that occurs in the hash join operator when the recursion depth exceeds the maximum?

The original log is:

_Hash join exceeded maximum number of recursions, without reducing partitions enough to be memory resident. Probably cause: Too many duplicate keys_

> Introduce adaptive hash join for batch sql optimization
> -------------------------------------------------------
>
>                 Key: FLINK-26929
>                 URL: https://issues.apache.org/jira/browse/FLINK-26929
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / Runtime
>            Reporter: dalongliu
>            Priority: Major
>             Fix For: 1.16.0
>
>
> We propose an optimization, adaptive hash join, for the batch join scenario. It aims to combine the advantages of sort-merge join and hash join based on the characteristics of the runtime data. Adaptive hash join first tries the hash join strategy; if that fails, it falls back to sort-merge join.
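
To make the proposed fallback concrete, here is a minimal in-memory sketch in Python. It is not Flink's implementation: the names `hash_join`, `sort_merge_join`, `adaptive_join`, `HashJoinOverflow`, `MAX_RECURSION`, `budget` and `fanout` are all hypothetical, the "memory budget" is simply a row count, and the fallback is applied to the whole input rather than to individual partitions, which is a simplifying assumption.

```python
from operator import itemgetter

MAX_RECURSION = 3  # hypothetical limit on repartitioning depth


class HashJoinOverflow(Exception):
    """Raised when repartitioning cannot shrink the build side enough."""


def hash_join(build, probe, budget, depth=0, fanout=4):
    """Join (key, value) rows; 'budget' is the max build rows held at once."""
    if len(build) <= budget:
        # Build side fits: build a hash table and probe it.
        table = {}
        for key, value in build:
            table.setdefault(key, []).append(value)
        return [(k, bv, pv) for k, pv in probe for bv in table.get(k, [])]
    if depth >= MAX_RECURSION:
        # Many duplicate keys always hash to the same partition, so
        # repartitioning never helps and the recursion limit is reached.
        raise HashJoinOverflow("Hash join exceeded maximum number of recursions")
    out = []
    for part in range(fanout):
        # Mix the depth into the hash so each level partitions differently.
        sub_build = [r for r in build if hash((r[0], depth)) % fanout == part]
        sub_probe = [r for r in probe if hash((r[0], depth)) % fanout == part]
        out.extend(hash_join(sub_build, sub_probe, budget, depth + 1, fanout))
    return out


def sort_merge_join(build, probe):
    """Sort both sides by key, then merge runs of equal keys.

    A real engine would sort externally and spill to disk; this sketch
    sorts in memory only to show the merge logic.
    """
    b = sorted(build, key=itemgetter(0))
    p = sorted(probe, key=itemgetter(0))
    out, i, j = [], 0, 0
    while i < len(b) and j < len(p):
        if b[i][0] < p[j][0]:
            i += 1
        elif b[i][0] > p[j][0]:
            j += 1
        else:
            key = b[i][0]
            i_end, j_end = i, j
            while i_end < len(b) and b[i_end][0] == key:
                i_end += 1
            while j_end < len(p) and p[j_end][0] == key:
                j_end += 1
            out.extend((key, bv, pv) for _, bv in b[i:i_end] for _, pv in p[j:j_end])
            i, j = i_end, j_end
    return out


def adaptive_join(build, probe, budget):
    """Try hash join first; fall back to sort-merge join if it overflows."""
    try:
        return hash_join(build, probe, budget), "hash"
    except HashJoinOverflow:
        return sort_merge_join(build, probe), "sort-merge"
```

With a build side such as `[(1, "b0"), (1, "b1"), ...]` that holds more rows for a single key than `budget` allows, `adaptive_join` exhausts the recursion limit and returns its result via the sort-merge path. This is the duplicate-key situation that the log message in the comment above describes.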



--
This message was sent by Atlassian Jira
(v8.20.7#820007)