Posted to dev@hive.apache.org by "Ádám Szita (Jira)" <ji...@apache.org> on 2021/10/26 09:24:00 UTC
[jira] [Created] (HIVE-25651) Enable LLAP cache affinity for Iceberg ORC splits
Ádám Szita created HIVE-25651:
---------------------------------
Summary: Enable LLAP cache affinity for Iceberg ORC splits
Key: HIVE-25651
URL: https://issues.apache.org/jira/browse/HIVE-25651
Project: Hive
Issue Type: Improvement
Reporter: Ádám Szita
Assignee: Ádám Szita
Since HiveIcebergInputFormat doesn't implement any of the LLAP marker interfaces, cache affinity is never attempted, so any split containing ORC file parts may go to an arbitrary LLAP daemon, resulting in a subpar cache hit ratio later.
So we should:
* let HS2 know that cache affinity is required for this input format
* prevent Iceberg from grouping separate files together into one combined split when executing on LLAP
* provide a proper getPath() result for HiveIcebergSplit, so that HostAffinitySplitLocationProvider calculates different hashes for different files (right now getPath() returns the table location only)
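
To illustrate the last point, here is a minimal sketch of why a per-file path matters for affinity. It is not Hive code: the class AffinitySketch, its methods, and the example paths are hypothetical, and the plain String.hashCode() modulo used here is only a stand-in for whatever hashing HostAffinitySplitLocationProvider actually performs. The point is only that a path-based hash can spread splits across daemons when the paths differ, and cannot when every split reports the same table location.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AffinitySketch {

    // Hypothetical stand-in for a location provider: maps a split's path
    // to a daemon index. The real provider's hashing is NOT reproduced here.
    static int pickDaemon(String splitPath, int daemonCount) {
        return Math.floorMod(splitPath.hashCode(), daemonCount);
    }

    // Collects the set of daemon indexes that a list of split paths maps to.
    static Set<Integer> daemonsUsed(List<String> splitPaths, int daemonCount) {
        Set<Integer> used = new HashSet<>();
        for (String path : splitPaths) {
            used.add(pickDaemon(path, daemonCount));
        }
        return used;
    }

    public static void main(String[] args) {
        int daemonCount = 4;

        // Assumed situation today: every split reports the table location.
        List<String> tableLocationOnly = Arrays.asList(
            "/warehouse/tbl", "/warehouse/tbl", "/warehouse/tbl", "/warehouse/tbl");

        // Assumed situation after the fix: every split reports its own data file.
        List<String> perFilePaths = Arrays.asList(
            "/warehouse/tbl/data/file-0.orc", "/warehouse/tbl/data/file-1.orc",
            "/warehouse/tbl/data/file-2.orc", "/warehouse/tbl/data/file-3.orc");

        System.out.println("table location only: "
            + daemonsUsed(tableLocationOnly, daemonCount).size() + " daemon(s) used");
        System.out.println("per-file paths: "
            + daemonsUsed(perFilePaths, daemonCount).size() + " daemon(s) used");
    }
}
```

In this sketch, identical paths always hash to the same index, so the input carries no per-file information and a path hash cannot distinguish one file from another. With distinct paths, the same file maps to the same daemon on every run, which is what lets a cache populated by an earlier query be hit again.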
--
This message was sent by Atlassian Jira
(v8.3.4#803005)