Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/11/29 01:25:00 UTC

[jira] [Commented] (IMPALA-11022) Impala uses wrong file descriptors for Iceberg tables in local catalog mode

    [ https://issues.apache.org/jira/browse/IMPALA-11022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450158#comment-17450158 ] 

ASF subversion and git services commented on IMPALA-11022:
----------------------------------------------------------

Commit da53428abc84ee351367258ff26d20fecd4c37c9 in impala's branch refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=da53428 ]

IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

When local catalog mode is used, Impala retrieves the Iceberg
snapshot from CatalogD. The response contains a map of the file
descriptors. The file descriptors contain block location information,
but the hosts are only referred to by indexes. In the Coordinator's
local catalog the host indexes might refer to different hosts than in
CatalogD. This might lead to unnecessary remote reads as scan ranges
are scheduled to random hosts.

This patch translates the host indexes into the coordinator's host
list, so block locations remain consistent.
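
The translation can be illustrated with a minimal sketch. This is not
the code from the patch: HostIndexSketch, HostList, getOrAddIndex and
translate are hypothetical names made up for illustration, and hosts
are plain strings here. It only assumes that each side keeps its own
append-only host list and that a replica location is an index into
the sender's list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these are not Impala's actual classes.
public class HostIndexSketch {
    /** Append-only host list in which every distinct host has a stable index. */
    static final class HostList {
        private final List<String> hosts = new ArrayList<>();
        private final Map<String, Integer> indexByHost = new HashMap<>();

        /** Returns the index of the host, appending it first if it is unknown. */
        int getOrAddIndex(String host) {
            Integer idx = indexByHost.get(host);
            if (idx != null) return idx;
            hosts.add(host);
            int newIdx = hosts.size() - 1;
            indexByHost.put(host, newIdx);
            return newIdx;
        }

        String get(int index) { return hosts.get(index); }
    }

    /**
     * Rewrites replica host indexes that are relative to senderHosts
     * (the catalogd side) so that they are relative to localHosts
     * (the coordinator side).
     */
    static int[] translate(int[] replicaIdxs, List<String> senderHosts,
                           HostList localHosts) {
        int[] out = new int[replicaIdxs.length];
        for (int i = 0; i < replicaIdxs.length; i++) {
            // Resolve the index to a host name on the sender's list first,
            // then look that host up (or add it) in the local list.
            String host = senderHosts.get(replicaIdxs[i]);
            out[i] = localHosts.getOrAddIndex(host);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> catalogdHosts = Arrays.asList("host-a", "host-b", "host-c");
        HostList coordinatorHosts = new HostList();
        coordinatorHosts.getOrAddIndex("host-c"); // local index 0
        coordinatorHosts.getOrAddIndex("host-a"); // local index 1

        int[] fromCatalogd = {0, 2}; // host-a and host-c on the catalogd side
        int[] local = translate(fromCatalogd, catalogdHosts, coordinatorHosts);
        for (int idx : local) {
            System.out.println(idx + " -> " + coordinatorHosts.get(idx));
        }
    }
}
```

Using the untranslated indexes {0, 2} against the coordinator's list
in this sketch would point at host-c and at a nonexistent entry, which
is the kind of mismatch described above.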

Testing:
 * tested manually on a 6-node cluster, and verified that the file
   locations are consistent with HDFS
 * added unit test to LocalCatalogTest

Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Reviewed-on: http://gerrit.cloudera.org:8080/18041
Reviewed-by: Qifan Chen <qc...@cloudera.com>
Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Impala uses wrong file descriptors for Iceberg tables in local catalog mode
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-11022
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11022
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> When local catalog mode is used, Impala retrieves the Iceberg snapshot from CatalogD. The response contains a map of the file descriptors.
> https://github.com/apache/impala/blob/b692a92fa2a2277a185fb5823592609b4603c0d8/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#L1006
> The file descriptors contain block location information, but the hosts are only referred to by indexes.
> https://github.com/apache/impala/blob/b692a92fa2a2277a185fb5823592609b4603c0d8/common/fbs/CatalogObjects.fbs#L50
> In the Coordinator's local catalog the host indexes might refer to different hosts than in CatalogD. We should translate the host indexes to the coordinator's host list, similarly to how it is done for LocalFsTable:
> https://github.com/apache/impala/blob/b692a92fa2a2277a185fb5823592609b4603c0d8/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#L983
> https://github.com/apache/impala/blob/b692a92fa2a2277a185fb5823592609b4603c0d8/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#L1020-L1024


