Posted to issues@impala.apache.org by "Quanlong Huang (JIRA)" <ji...@apache.org> on 2019/07/25 22:07:00 UTC
[jira] [Resolved] (IMPALA-8606) GET_TABLES performance in local catalog mode
[ https://issues.apache.org/jira/browse/IMPALA-8606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang resolved IMPALA-8606.
------------------------------------
Resolution: Fixed
> GET_TABLES performance in local catalog mode
> --------------------------------------------
>
> Key: IMPALA-8606
> URL: https://issues.apache.org/jira/browse/IMPALA-8606
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.2.0
> Reporter: Balazs Jeszenszky
> Assignee: Quanlong Huang
> Priority: Blocker
> Labels: catalog-v2
>
> With local catalog mode enabled, GET_TABLES JDBC requests return more than the table information that is always available. Any request for additional metadata about a table triggers a full load of that table on the catalogd side, so GET_TABLES ends up loading the entire catalog. Also, as far as I can see, the requests for additional metadata are made one table at a time.
> Once the tables are loaded on the catalogd-side, a coordinator needs 3 roundtrips to the catalog to fetch all the details about a single table. My test case had around 57k tables, 1700 DBs, and ~120k partitions.
> GET_TABLES on a cold catalog takes 18 minutes. With a warm catalog, but cold impalad, it still takes ~70 seconds.
> Many tools use GET_TABLES to populate dropdowns, etc., so this is bad for both end user experience and catalog memory usage.
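
To put the quoted figures in perspective, here is a rough back-of-envelope sketch. It assumes the three roundtrips per table are issued one after another and that the elapsed time is dominated by those requests; neither assumption is confirmed by the report. The function and constant names are illustrative only and are not part of Impala.

```python
# Figures taken from the issue description above.
TABLES = 57_000                  # "around 57k tables"
ROUNDTRIPS_PER_TABLE = 3         # roundtrips a coordinator needs per table
WARM_CATALOG_SECONDS = 70        # warm catalogd, cold impalad
COLD_CATALOG_SECONDS = 18 * 60   # cold catalog


def total_roundtrips(tables: int, per_table: int) -> int:
    """Requests sent to catalogd if every table is fetched on its own."""
    return tables * per_table


def avg_ms(total_seconds: float, count: int) -> float:
    """Average milliseconds per item, assuming time is spread evenly."""
    return total_seconds * 1000.0 / count


roundtrips = total_roundtrips(TABLES, ROUNDTRIPS_PER_TABLE)
print("catalog roundtrips:", roundtrips)
print("avg ms per roundtrip (warm catalog):",
      round(avg_ms(WARM_CATALOG_SECONDS, roundtrips), 2))
print("avg ms per table (cold catalog):",
      round(avg_ms(COLD_CATALOG_SECONDS, TABLES), 1))
```

Under these assumptions each individual request is cheap, and the total cost comes from the number of requests, which is why fetching metadata one table at a time scales poorly for GET_TABLES.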
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)