Posted to issues-all@impala.apache.org by "Pranay Singh (JIRA)" <ji...@apache.org> on 2018/06/13 17:46:00 UTC
[jira] [Updated] (IMPALA-7168) DML query may hang if CatalogUpdateCallback() encounters repeated error
[ https://issues.apache.org/jira/browse/IMPALA-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pranay Singh updated IMPALA-7168:
---------------------------------
Description:
DML queries (for example INSERT) will hang if exec_env_->frontend()->UpdateCatalogCache() in ImpalaServer::CatalogUpdateCallback repeatedly encounters an error such as ENOMEM.
This happens with SYNC_DDL set to 1, when the coordinator node is waiting for its catalog version to become current.
The scenario looks like this. Say there are two coordinator nodes, Node A and Node B,
and catalogd and statestored are running on Node C.
a) CREATE TABLE is executed on Node A with SYNC_DDL set to 1. The thread running the query blocks in impala::ImpalaServer::ProcessCatalogUpdateResult(), waiting for its catalog version to become current.
b) Meanwhile, statestored running on Node C calls ImpalaServer::CatalogUpdateCallback on Node B via Thrift RPC to apply a delta topic update. The update is not applied if Node B encounters repeated errors, for example when the frontend is low on memory (low JVM heap).
c) In that case Node A waits indefinitely for its catalog version to become current, until Node B is shut down voluntarily.
Note: This is a case where Node B is reachable (heartbeat is fine, but it is a bad node) since
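
The blocking behaviour in steps a) to c) can be illustrated with a toy model. This is only a sketch, not Impala code: CatalogVersionTracker, apply_update, wait_for_version and run_scenario are hypothetical names, and the model assumes (as a simplification of the scenario above) that the SYNC_DDL wait on Node A is satisfied only once every coordinator has applied the update. The timeout exists solely so the example terminates; per the description, the real wait has no such bound.

```python
import threading


class CatalogVersionTracker:
    """Toy stand-in for the state a coordinator waits on (hypothetical)."""

    def __init__(self, nodes):
        self._cond = threading.Condition()
        self._versions = {node: 0 for node in nodes}

    def apply_update(self, node, version, fail=False):
        """Model a topic update delivered to `node`.

        With fail=True the update is dropped, standing in for a
        repeated error such as ENOMEM in the update callback.
        """
        if fail:
            return False
        with self._cond:
            self._versions[node] = max(self._versions[node], version)
            self._cond.notify_all()
        return True

    def wait_for_version(self, version, timeout):
        """Block until every node has reached `version` or `timeout` expires.

        Returns True if the version was reached, False on timeout.
        """
        with self._cond:
            return self._cond.wait_for(
                lambda: min(self._versions.values()) >= version,
                timeout=timeout,
            )


def run_scenario(node_b_fails):
    """Node A applies version 1 and waits; Node B may never apply it."""
    tracker = CatalogVersionTracker(["A", "B"])
    tracker.apply_update("A", 1)
    updater = threading.Thread(
        target=tracker.apply_update, args=("B", 1, node_b_fails)
    )
    updater.start()
    reached = tracker.wait_for_version(1, timeout=0.5)
    updater.join()
    return reached


if __name__ == "__main__":
    print("healthy Node B, wait satisfied:", run_scenario(False))
    print("failing Node B, wait satisfied:", run_scenario(True))
```

In the failing case the waiter never observes the target version, which corresponds to the thread on Node A staying blocked while Node B remains reachable but unable to apply updates.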
was:
DML queries (for example INSERT) will hang if exec_env_->frontend()->UpdateCatalogCache() in ImpalaServer::CatalogUpdateCallback repeatedly encounters an error such as ENOMEM.
This happens with SYNC_DDL set to 1, when the coordinator node is waiting for its catalog version to become current.
The scenario looks like this. Say there are two coordinator nodes, Node A and Node B,
and catalogd and statestored are running on Node C.
a) CREATE TABLE is executed on Node A with SYNC_DDL set to 1. The thread running the query blocks in impala::ImpalaServer::ProcessCatalogUpdateResult(), waiting for its catalog version to become current.
b) Meanwhile, statestored running on Node C calls ImpalaServer::CatalogUpdateCallback on Node B via Thrift RPC to apply a delta topic update. The update is not applied if Node B encounters repeated errors, for example when the frontend is low on memory (low JVM heap).
c) In that case Node A waits indefinitely, until Node B is shut down voluntarily. Note: this is a case where Node B is reachable (heartbeat is fine, but it is a bad node).
> DML query may hang if CatalogUpdateCallback() encounters repeated error
> -----------------------------------------------------------------------
>
> Key: IMPALA-7168
> URL: https://issues.apache.org/jira/browse/IMPALA-7168
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
> Reporter: Pranay Singh
> Priority: Major