Posted to issues@impala.apache.org by "Thomas Tauber-Marshall (JIRA)" <ji...@apache.org> on 2017/07/24 14:37:00 UTC
[jira] [Resolved] (IMPALA-5167) Reduce number of Kudu clients that get created
[ https://issues.apache.org/jira/browse/IMPALA-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Tauber-Marshall resolved IMPALA-5167.
--------------------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.10.0
commit 399b184bbcf5a1fb06b5afbebf9062e69d02beed
Author: Thomas Tauber-Marshall <tm...@cloudera.com>
Date: Tue May 16 09:37:03 2017 -0700
IMPALA-5167: Reduce the number of Kudu clients created (FE)
Creating Kudu clients is very expensive as each will fetch
metadata from the Kudu master, so we should minimize the
number of Kudu clients that get created.
This patch stores a map from Kudu master addresses to Kudu
clients in KuduUtil to be used across the FE and catalog.
Another patch has already addressed the BE.
Future work will consider providing a way to invalidate
the stored Kudu clients in case something goes wrong
(IMPALA-5685)
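For illustration only, the idea of keeping one client per set of master addresses could be sketched as below. This is not the code from the patch: the class name KuduClientCacheSketch, the generic client type C, and the factory argument are all hypothetical stand-ins, so the sketch runs without the Kudu client library.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch (not the actual KuduUtil code): caches one client per
// master-address string so that repeated lookups reuse the same instance.
final class KuduClientCacheSketch<C> {
  // Key: the master addresses string; value: the shared client.
  private final ConcurrentHashMap<String, C> clients = new ConcurrentHashMap<>();
  // Assumed factory that builds a new client for the given master addresses.
  private final Function<String, C> factory;

  KuduClientCacheSketch(Function<String, C> factory) {
    this.factory = factory;
  }

  // Returns the cached client for these addresses, creating it on first use.
  // computeIfAbsent applies the factory at most once per key, even when
  // several threads ask for the same addresses concurrently.
  C getClient(String masterAddresses) {
    return clients.computeIfAbsent(masterAddresses, factory);
  }

  // Number of distinct clients created so far.
  int size() {
    return clients.size();
  }
}
```

With a cache like this, the expensive client construction happens once per distinct master address string rather than once per caller. The sketch has no way to drop a stored client, which is the gap that the future work mentioned above is meant to cover.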
This relies on two changes on the Kudu side: one that clears
non-covered range entries from the client's cache on table
open (d07ecd6ded01201c912d2e336611a6a941f48d98), and one
that automatically refreshes auth tokens when they expire
(603c1578c78c0377ffafdd9c427ebfd8a206bda3).
This patch disables some tests that no longer work as
they relied on Kudu metadata loading operations timing out,
but since we're reusing clients the metadata is already
loaded when the test is run.
Testing:
- Ran a stress test on a 10 node cluster: scan of a small
Kudu table, 1000 concurrent queries, load on the Kudu
master was reduced significantly, from ~50% CPU to ~5%.
(with the BE changes included)
- Ran the Kudu e2e tests.
- Manually ran a test with concurrent INSERTs and
'ALTER TABLE ADD PARTITION' (which is affected by the
Kudu-side change mentioned above) and verified
correctness.
Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e
Reviewed-on: http://gerrit.cloudera.org:8080/6898
Reviewed-by: Thomas Tauber-Marshall <tm...@cloudera.com>
Tested-by: Impala Public Jenkins
> Reduce number of Kudu clients that get created
> ----------------------------------------------
>
> Key: IMPALA-5167
> URL: https://issues.apache.org/jira/browse/IMPALA-5167
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.8.0
> Reporter: Matthew Jacobs
> Assignee: Thomas Tauber-Marshall
> Labels: kudu
> Fix For: Impala 2.10.0
>
>
> Creating Kudu clients is very expensive as each will fetch metadata from the Kudu master. We can reduce the load on the Kudu master by reusing Kudu clients when possible. To start, we can use a single client for the entire BE and another for the entire FE.
> This is dependent on a metadata invalidation improvement from Kudu (https://gerrit.cloudera.org/#/c/6719/)
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)