Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/03/04 03:11:00 UTC

[jira] [Commented] (IMPALA-9458) Improve runtime profile counters for slow IO from remote stores

    [ https://issues.apache.org/jira/browse/IMPALA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050734#comment-17050734 ] 

Tim Armstrong commented on IMPALA-9458:
---------------------------------------

I did a couple of things recently that might cover or overlap with this. Linked the JIRAs.

> Improve runtime profile counters for slow IO from remote stores
> ---------------------------------------------------------------
>
>                 Key: IMPALA-9458
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9458
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Sahil Takiar
>            Priority: Major
>
> Remote storage systems (e.g. cloud stores like S3 and ABFS) often have long tail latencies. Most I/O finishes relatively quickly, but some calls may take significantly longer. Even for HDFS, this is an issue (e.g. hedged reads were developed to help mitigate tail latencies, although no such feature exists for cloud storage connectors).
> Currently, scan nodes just track the total amount of time spent reading data. It would be good to have a summary stats counter that tracks the min, avg, and max time spent reading data. This should at least allow us to identify when calls to remote storage services are taking longer than usual.
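
To illustrate the kind of counter described above, here is a minimal sketch in Python. It is not Impala's actual counter implementation; the class name ReadTimeStats, its fields, and the sample timings are all hypothetical and chosen only for illustration. It records each read's elapsed time and keeps the min, avg, and max alongside the running total.

```python
from dataclasses import dataclass


@dataclass
class ReadTimeStats:
    """Hypothetical min/avg/max tracker for per-call read times (illustrative only)."""
    count: int = 0
    total_ns: int = 0
    min_ns: int = 0
    max_ns: int = 0

    def update(self, elapsed_ns: int) -> None:
        # The first sample initializes both extremes; later samples widen them.
        if self.count == 0:
            self.min_ns = self.max_ns = elapsed_ns
        else:
            self.min_ns = min(self.min_ns, elapsed_ns)
            self.max_ns = max(self.max_ns, elapsed_ns)
        self.count += 1
        self.total_ns += elapsed_ns

    @property
    def avg_ns(self) -> float:
        # Average over all recorded reads; 0.0 when nothing was recorded.
        return self.total_ns / self.count if self.count else 0.0


# Made-up samples: two fast reads and one slow read in the tail.
stats = ReadTimeStats()
for elapsed_ns in (2_000_000, 3_000_000, 250_000_000):
    stats.update(elapsed_ns)

print(stats.min_ns, stats.avg_ns, stats.max_ns)
```

With only a total (255 ms here), the single 250 ms call is indistinguishable from many moderately slow calls. A max far above the avg is what points at tail latency.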


