Posted to issues@ignite.apache.org by "Konstantin Orlov (Jira)" <ji...@apache.org> on 2024/01/17 14:03:00 UTC

[jira] [Created] (IGNITE-21286) Sql. Enable correlated join

Konstantin Orlov created IGNITE-21286:
-----------------------------------------

Summary: Sql. Enable correlated join
Key: IGNITE-21286
URL: https://issues.apache.org/jira/browse/IGNITE-21286
Project: Ignite
Issue Type: Improvement
Components: sql
Reporter: Konstantin Orlov

As of now, the implementation of correlated join has a number of performance problems:
# Opening a cursor over the store is quite expensive. Given that a table is split into partitions, the actual number of lookups should be multiplied by the number of partitions. This should be accounted for by the cost function.
# Integration with storage (not strictly a problem of this particular implementation of correlated join, but it indirectly affects it): every lookup to storage actually schedules a task in a different thread pool. When the scan result is ready, it schedules a task in the sql query task executor. Given that we process only one correlate at a time, we currently schedule `partCount * 2` tasks for every row from the left side of the join. This is very inefficient for single-row lookups of a small table on the right side (we spend significantly more time on task coordination than on the actual job).
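
To make the arithmetic in the two points above concrete, here is a minimal sketch of the estimate. The class and method names are hypothetical and are not part of the Ignite code base. It only assumes what is stated above: one lookup per partition for every left-side row, and two scheduled tasks per lookup.

```java
/**
 * Back-of-the-envelope estimate of the scheduling overhead described above.
 * All names here are hypothetical; this is not Ignite's actual cost function.
 */
final class CorrelatedJoinTaskEstimate {
    /**
     * Tasks scheduled per storage lookup, as described in the issue:
     * one in the storage thread pool, one in the sql query task executor.
     */
    static final int TASKS_PER_LOOKUP = 2;

    /** Number of storage lookups: every left-side row probes every partition. */
    static long lookups(long leftRows, int partCount) {
        return Math.multiplyExact(leftRows, (long) partCount);
    }

    /** Total scheduled tasks: partCount * 2 for every left-side row. */
    static long scheduledTasks(long leftRows, int partCount) {
        return Math.multiplyExact(lookups(leftRows, partCount), (long) TASKS_PER_LOOKUP);
    }
}
```

For example, with an illustrative 1000 rows on the left side and 25 partitions, `scheduledTasks(1000, 25)` gives 50000 scheduled tasks for 25000 lookups, even if each lookup returns at most one row.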

We need to improve the performance of correlated join in general, or at least find the cases where it performs better than other types of joins and enable correlated join only for those cases.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)