Posted to issues-all@impala.apache.org by "Joe McDonnell (JIRA)" <ji...@apache.org> on 2019/01/12 01:39:00 UTC

[jira] [Commented] (IMPALA-7738) Implement timeouts for hdfsOpenFile() calls

    [ https://issues.apache.org/jira/browse/IMPALA-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740948#comment-16740948 ] 

Joe McDonnell commented on IMPALA-7738:
---------------------------------------

Originally, this Jira covered implementing timeouts for all HDFS operations. Functionality to add timeouts for hdfsOpenFile() has been merged. To better track which parts end up in each Impala release, I'm splitting this into multiple Jiras. This Jira now covers only timeouts for hdfsOpenFile(). IMPALA-8074 covers timeouts for hdfsRead(). IMPALA-8075 tracks timeouts for the remaining HDFS operations (such as mkdir and move).

> Implement timeouts for hdfsOpenFile() calls
> -------------------------------------------
>
>                 Key: IMPALA-7738
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7738
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>            Reporter: Michael Ho
>            Assignee: Joe McDonnell
>            Priority: Critical
>
> Currently, there is no timeout on the various HDFS calls (e.g. hdfsOpenFile(), hdfsRead()) that we make into libhdfs.so from either the disk-io-mgr thread or the scanner thread context. Various users of Impala have complained in the past about hung queries, which eventually boiled down to stuck HDFS calls. HDFS maintainers have been slow to find the root cause of those hangs. To make this kind of stuck-query problem easier to identify in the future, we should enforce a timeout on the various HDFS calls so that a query fails when an HDFS call takes longer than a designated timeout period.
> There are multiple layers at which this timeout could be enforced:
>  * at the Impala level, we can have a fixed-size thread pool which handles all HDFS calls. The existing HDFS calls would be wrapped with a timeout.
>  * in libhdfs.so, enforce a timeout at the places in the HDFS client code which may block forever.
> The second option is probably beyond the charter of the Apache Impala project.
> cc'ing [~tarmstrong@cloudera.com], [~joemcdonnell]
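
For illustration, the first option in the description (a fixed-size thread pool whose callers wait with a timeout) could look roughly like the sketch below. This is not the code that was merged for this Jira; the class name BlockingCallPool and the method RunWithTimeout are hypothetical, and the sketch uses only the C++ standard library. It also assumes that a blocked call cannot be cancelled, so a timed-out call keeps occupying its worker thread until it returns.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical helper (not Impala's actual implementation): a fixed-size pool
// that runs potentially blocking calls so that callers can give up on them.
class BlockingCallPool {
 public:
  explicit BlockingCallPool(std::size_t num_threads) {
    assert(num_threads > 0);
    for (std::size_t i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  // Joins all workers. Note that this blocks for as long as any queued or
  // running call blocks, so a truly hung call also hangs the destructor.
  ~BlockingCallPool() {
    {
      std::lock_guard<std::mutex> l(lock_);
      shutdown_ = true;
    }
    cv_.notify_all();
    for (std::thread& t : workers_) t.join();
  }

  // Queues 'fn' for a pool thread and waits up to 'timeout' for its result.
  // Returns std::nullopt on timeout. The call itself is NOT cancelled: it
  // keeps running (or waiting in the queue) after the caller has given up.
  template <typename T>
  std::optional<T> RunWithTimeout(std::function<T()> fn,
                                  std::chrono::milliseconds timeout) {
    // shared_ptr keeps the task alive even if the caller times out first.
    auto task = std::make_shared<std::packaged_task<T()>>(std::move(fn));
    std::future<T> result = task->get_future();
    {
      std::lock_guard<std::mutex> l(lock_);
      queue_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    if (result.wait_for(timeout) != std::future_status::ready) {
      return std::nullopt;
    }
    return result.get();
  }

 private:
  void WorkerLoop() {
    while (true) {
      std::function<void()> work;
      {
        std::unique_lock<std::mutex> l(lock_);
        cv_.wait(l, [this] { return shutdown_ || !queue_.empty(); });
        if (queue_.empty()) return;  // Shutdown requested and queue drained.
        work = std::move(queue_.front());
        queue_.pop();
      }
      work();  // Run outside the lock so other workers can make progress.
    }
  }

  std::mutex lock_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  bool shutdown_ = false;
  std::vector<std::thread> workers_;
};
```

A caller would wrap the blocking call in a lambda, for example pool.RunWithTimeout<hdfsFile>(...) around hdfsOpenFile(), and fail the query when the result is std::nullopt. Because the pool is fixed-size, enough simultaneously stuck calls would exhaust it, so a real implementation would need a policy for that case.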



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
