Posted to dev@hive.apache.org by "Ádám Szita (Jira)" <ji...@apache.org> on 2021/10/26 09:24:00 UTC

[jira] [Created] (HIVE-25651) Enable LLAP cache affinity for Iceberg ORC splits

Ádám Szita created HIVE-25651:
---------------------------------

             Summary: Enable LLAP cache affinity for Iceberg ORC splits
                 Key: HIVE-25651
                 URL: https://issues.apache.org/jira/browse/HIVE-25651
             Project: Hive
          Issue Type: Improvement
            Reporter: Ádám Szita
            Assignee: Ádám Szita


Since HiveIcebergInputFormat doesn't implement any of the LLAP marker interfaces, cache affinity is never attempted. As a result, any split containing parts of an ORC file may be sent to a random LLAP daemon, which lowers the cache hit ratio on later reads.

So we should:
 * let HS2 know that cache affinity is required for this input format
 * prevent Iceberg from grouping separate files together into one combined split when running on LLAP
 * provide a proper getPath() result for HiveIcebergSplit, so that HostAffinitySplitLocationProvider calculates different hashes for different files (right now getPath() returns only the table location)
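
To illustrate the last point, here is a minimal sketch of hash-based split placement. It is not the actual HostAffinitySplitLocationProvider code: the class name AffinitySketch, the pickDaemon helper, the daemon names, and the paths are all hypothetical, and the hash function is a simple stand-in. It only shows why a getPath() that returns the table location for every split defeats affinity: identical inputs always hash to the same daemon.

```java
import java.util.List;
import java.util.Objects;

public class AffinitySketch {

  // Hypothetical helper: map a split, identified by (path, start offset),
  // onto one of the daemons. Objects.hash is a stand-in for whatever hash
  // the real location provider uses.
  static String pickDaemon(String path, long start, List<String> daemons) {
    int hash = Objects.hash(path, start);
    return daemons.get(Math.floorMod(hash, daemons.size()));
  }

  public static void main(String[] args) {
    // Hypothetical daemon names and file paths, for illustration only.
    List<String> daemons = List.of("llap-0", "llap-1", "llap-2");
    String tableLocation = "/warehouse/tbl";
    List<String> files = List.of(
        "/warehouse/tbl/data/f1.orc",
        "/warehouse/tbl/data/f2.orc",
        "/warehouse/tbl/data/f3.orc");

    // Case 1: every split reports the table location as its path, so the
    // hash input is identical and all splits land on one daemon.
    for (String file : files) {
      System.out.println(file + " (path = table location) -> "
          + pickDaemon(tableLocation, 0L, daemons));
    }

    // Case 2: every split reports its own file path, so the hash inputs
    // differ and the splits can be spread across daemons, while each file
    // still maps to the same daemon on every run.
    for (String file : files) {
      System.out.println(file + " (path = file path) -> "
          + pickDaemon(file, 0L, daemons));
    }
  }
}
```

The same reasoning motivates the second point: if several files are grouped into one combined split, there is no single file path to hash, so a per-file placement like the one sketched here is not possible.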



--
This message was sent by Atlassian Jira
(v8.3.4#803005)