Posted to issues@impala.apache.org by "bharath v (JIRA)" <ji...@apache.org> on 2017/05/24 15:46:04 UTC
[jira] [Resolved] (IMPALA-1972) Queries that take a long time to
plan can cause webserver to block other queries
[ https://issues.apache.org/jira/browse/IMPALA-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
bharath v resolved IMPALA-1972.
-------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.8.0
Author: Bharath Vissapragada <bh...@cloudera.com>
Date: 2017-05-23 (Tue, 23 May 2017)
Changed paths:
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
A tests/custom_cluster/test_query_concurrency.py
M www/query_plan.tmpl
Log Message:
-----------
IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ contention
Holding client_request_state_map_lock_ and CRS::lock_ together in certain
paths could potentially block the impalad from registering new queries.
The most common occurrence of this is while loading the webpage of a
query while the query planning is still in progress. Since we hold the
CRS::lock_ during planning, it blocks the web page from loading which
in turn blocks incoming queries by holding client_request_state_map_lock_.
This patch makes client_request_state_map_lock_ a terminal lock so that
we don't have interleaving locking with CRS::lock_.
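For illustration, a minimal sketch of the terminal-lock idea, not the code from the patch. The names Registry, RequestState, Register and Get are hypothetical stand-ins; the only point is that the map lock is released before any per-query lock is taken, so a slow holder of the per-query lock cannot stall registration.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a per-query state object with its own lock.
struct RequestState {
  std::mutex lock;
};

// Hypothetical registry whose map lock is "terminal": while it is held,
// no other lock is ever acquired.
class Registry {
 public:
  void Register(const std::string& id, std::shared_ptr<RequestState> state) {
    std::lock_guard<std::mutex> l(map_lock_);
    map_[id] = std::move(state);
  }

  // Returns the entry, or nullptr if the id is unknown. The map lock is
  // released on return, so callers take RequestState::lock only afterwards.
  std::shared_ptr<RequestState> Get(const std::string& id) {
    std::lock_guard<std::mutex> l(map_lock_);
    auto it = map_.find(id);
    if (it == map_.end()) return nullptr;
    return it->second;
  }

 private:
  std::mutex map_lock_;
  std::unordered_map<std::string, std::shared_ptr<RequestState>> map_;
};
```

With this shape, a caller that needs the per-query lock does `auto s = registry.Get(id);` followed by `std::lock_guard<std::mutex> l(s->lock);`, and only that caller waits if planning is still in progress.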
Testing: Tested it locally by adding a long sleep in
JniFrontend.createExecRequest() and still was able to refresh the web UI
and run parallel queries. Also added a custom cluster test that does the
same sequence of actions by injecting a metadata loading pause.
Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
Reviewed-on: http://gerrit.cloudera.org:8080/6707
Reviewed-by: Bharath Vissapragada <bh...@cloudera.com>
Tested-by: Impala Public Jenkins
> Queries that take a long time to plan can cause webserver to block other queries
> --------------------------------------------------------------------------------
>
> Key: IMPALA-1972
> URL: https://issues.apache.org/jira/browse/IMPALA-1972
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.2, Impala 2.3.0
> Reporter: Henry Robinson
> Assignee: bharath v
> Labels: hang, performance
> Fix For: Impala 2.8.0
>
>
> h3. Summary
> Trying to get the details of a query through the debug web page while the query is planning will block other queries (and the UI itself), because {{query_exec_state_map_lock_}} will be held for the duration of planning.
> h3. Details
> While a query is planning, it holds onto its query exec state's lock:
> {code}
> lock_guard<mutex> l(*(*exec_state)->lock());
> // register exec state as early as possible so that queries that
> // take a long time to plan show up, and to handle incoming status
> // reports before execution starts.
> RETURN_IF_ERROR(RegisterQuery(session_state, *exec_state));
> *registered_exec_state = true;
> // GetExecRequest() does planning
> RETURN_IF_ERROR((*exec_state)->UpdateQueryStatus(
>     exec_env_->frontend()->GetExecRequest(query_ctx, &result)));
> {code}
> *Query details callback*
> {{ImpalaServer::QuerySummaryCallback}}, which handles {{/query_plan}}, tries to get the same exec state's lock (see the {{true}} argument to {{GetQueryExecState()}}).
> {code}
> shared_ptr<QueryExecState> exec_state = GetQueryExecState(query_id, true);
> {code}
> {{GetQueryExecState()}} holds {{query_exec_state_map_lock_}} while it waits to get the {{QueryExecState}}'s lock:
> {code}
> shared_ptr<ImpalaServer::QueryExecState> ImpalaServer::GetQueryExecState(
>     const TUniqueId& query_id, bool lock) {
>   lock_guard<mutex> l(query_exec_state_map_lock_);
>   QueryExecStateMap::iterator i = query_exec_state_map_.find(query_id);
>   if (i == query_exec_state_map_.end()) {
>     return shared_ptr<QueryExecState>();
>   } else {
>     if (lock) i->second->lock()->lock();
>     return i->second;
>   }
> }
> {code}
> So once the query's debug page has been requested, no other query can get {{query_exec_state_map_lock_}}, which it needs to execute, until planning is finished.
> h3. What can we do?
> In the short term, maybe we can add {{TryGetQueryExecState()}} which will indicate if the query exists but its lock can't be taken.
> Or we might be able to let go of {{query_exec_state_map_lock_}} as soon as we find the entry, and before taking its lock:
> {code}
> shared_ptr<ImpalaServer::QueryExecState> ImpalaServer::GetQueryExecState(
>     const TUniqueId& query_id, bool lock) {
>   shared_ptr<QueryExecState> ret;
>   {
>     lock_guard<mutex> l(query_exec_state_map_lock_);
>     QueryExecStateMap::iterator i = query_exec_state_map_.find(query_id);
>     if (i == query_exec_state_map_.end()) {
>       return shared_ptr<QueryExecState>();
>     } else {
>       ret = i->second;
>     }
>   } // give up query_exec_state_map_lock_
>   if (lock) ret->lock()->lock();
>   return ret;
> }
> {code}
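The short-term idea mentioned above, a {{TryGetQueryExecState()}} that reports when the query exists but its lock can't be taken, could look roughly like the sketch below. This is an assumption-laden illustration using standard C++ only: ExecState, ExecStateMap, Lookup and TryGet are hypothetical names, not Impala code. It also drops the map lock before touching the per-query lock, as in the second proposal.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a per-query exec state with its own lock.
struct ExecState {
  std::mutex lock;
};

// Result of a non-blocking lookup. `state` is declared before `guard`, so
// the guard is destroyed (and the lock released) before the state pointer.
struct Lookup {
  enum class Status { kNotFound, kBusy, kLocked };
  Status status;
  std::shared_ptr<ExecState> state;
  std::unique_lock<std::mutex> guard;
};

class ExecStateMap {
 public:
  void Register(const std::string& id, std::shared_ptr<ExecState> state) {
    std::lock_guard<std::mutex> l(map_lock_);
    map_[id] = std::move(state);
  }

  // Never blocks on the per-query lock: returns kBusy if it is held elsewhere.
  Lookup TryGet(const std::string& id) {
    std::shared_ptr<ExecState> found;
    {
      std::lock_guard<std::mutex> l(map_lock_);
      auto it = map_.find(id);
      if (it == map_.end()) return Lookup{Lookup::Status::kNotFound, nullptr, {}};
      found = it->second;
    }  // map lock released before the per-query lock is attempted
    std::unique_lock<std::mutex> g(found->lock, std::try_to_lock);
    if (!g.owns_lock()) return Lookup{Lookup::Status::kBusy, std::move(found), {}};
    return Lookup{Lookup::Status::kLocked, std::move(found), std::move(g)};
  }

 private:
  std::mutex map_lock_;
  std::unordered_map<std::string, std::shared_ptr<ExecState>> map_;
};
```

A web handler built this way could render a "query is still planning" message on kBusy instead of waiting. Note that a thread which already holds the per-query lock must not call TryGet for the same query, since calling try_lock on a std::mutex the thread already owns is undefined behavior.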