Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2017/09/18 16:33:00 UTC

[jira] [Resolved] (IMPALA-5199) Impala may hang on empty row batch exchange

     [ https://issues.apache.org/jira/browse/IMPALA-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-5199.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.11.0


IMPALA-5199: prevent hang on empty row batch exchange

The error path where delivery of "eos" fails now behaves
the same as if delivery of a row batch fails.
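
As a rough illustration of that error path (not the actual Impala code), the sketch below models the status a receiving node might hand back to a sender. The function name, the status strings, and the eos_failure_is_error switch are hypothetical. It also assumes that, before the change, a failed "eos" delivery was not reported to the sender; the message above only says that the two paths now behave the same.

```python
def delivery_status(kind, recvr_found, eos_failure_is_error):
    """Toy model of the status returned to a sender for one delivery attempt.

    kind is "row_batch" or "eos". recvr_found says whether the receiver was
    located before the sender timeout expired. All names are hypothetical.
    """
    if recvr_found:
        return "OK"
    if kind == "row_batch":
        # A row batch that cannot be delivered is reported as an error.
        return "SENDER_TIMEOUT"
    # kind == "eos": assumed to be swallowed before the change, reported after.
    return "SENDER_TIMEOUT" if eos_failure_is_error else "OK"
```

With eos_failure_is_error=True, both kinds of failed delivery yield the same status, so the sender side can fail in the same way in either case.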

Testing:
Added a timeout test where the leaf fragments return 0 rows. Before
the change this reproduced the hang.

Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Reviewed-on: http://gerrit.cloudera.org:8080/8005
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins

> Impala may hang on empty row batch exchange
> -------------------------------------------
>
>                 Key: IMPALA-5199
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5199
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>    Affects Versions: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>            Reporter: Sailesh Mukil
>            Assignee: Sailesh Mukil
>            Priority: Critical
>              Labels: hang
>             Fix For: Impala 2.11.0
>
>
> Hang reproduction:
> Start Impala with the following stress timeouts (the same as those in test_exchange_delays):
> {code}
> bin/start-impala-cluster.py --impalad_args=--stress_datastream_recvr_delay_ms=10000 --impalad_args=--datastream_sender_timeout_ms=5000
> {code}
> Do a join where 0 rows will be exchanged:
> {code}
> select count(*) from tpch.lineitem inner join tpch.orders on l_orderkey=o_orderkey where o_orderkey < 0
> {code}
> When the data-stream-sender has 0 rows to send, it sends params.row_batch.num_rows=0 together with params.eos=true.
> The receiver hits the following case:
> https://github.com/apache/incubator-impala/blob/master/be/src/service/impala-server.cc#L1133
> The call ultimately times out in FindRecvrOrWait() after datastream_sender_timeout_ms.
> https://github.com/apache/incubator-impala/blob/0d0c93ec8c4949940ec113192731f2adb66a0c5e/be/src/runtime/data-stream-mgr.cc#L123
> Due to stress_datastream_recvr_delay_ms, the exchange node is Prepare()'d even later.
> https://github.com/apache/incubator-impala/blob/a50c344077f6c9bbea3d3cbaa2e9146ba20ac9a9/be/src/exec/exchange-node.cc#L74
> Although stress_datastream_recvr_delay_ms is a testing flag, this behavior is reasonable to expect in a real-world setting under heavy load. The flag was added for testing after some users saw nodes receive data from DataStreamSenders and time out after datastream_sender_timeout_ms, before the ExchangeNodes had been created. Thus, the CloseSender() function is called before the exchange node is created.
> After stress_datastream_recvr_delay_ms, the ExchangeNode is created and it calls CreateRecvr().
> https://github.com/apache/incubator-impala/blob/a50c344077f6c9bbea3d3cbaa2e9146ba20ac9a9/be/src/exec/exchange-node.cc#L80
> Then ExchangeNode::Open() calls DataStreamRecvr::CreateMerger() which makes a few threads wait on DataStreamRecvr::SenderQueue::GetBatch().
> https://github.com/apache/incubator-impala/blob/0d0c93ec8c4949940ec113192731f2adb66a0c5e/be/src/runtime/data-stream-recvr.cc#L124
> However, since the DataStreamSender already received the timeout error earlier, no "eos" will ever reach the receiver, and the receiver threads in GetBatch() wait indefinitely, until the user cancels the query.
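
The ordering described above can be reproduced in miniature. The following is a toy sketch only: ToyStreamMgr, run() and the status strings are hypothetical stand-ins, not Impala classes, and the timeouts are scaled down to fractions of a second. The receiver's wait is bounded here so that the sketch terminates, whereas the description says the real wait is indefinite.

```python
import threading


class ToyStreamMgr:
    """Hypothetical stand-in for a data stream manager (not Impala code)."""

    def __init__(self, sender_timeout_s):
        self.sender_timeout_s = sender_timeout_s
        self.recvr_registered = threading.Event()
        self.eos_delivered = threading.Event()

    def close_sender(self):
        # Wait for the receiver to appear, giving up after the sender timeout.
        if not self.recvr_registered.wait(self.sender_timeout_s):
            return "TIMEOUT"  # the eos is dropped; nothing records it for later
        self.eos_delivered.set()
        return "OK"

    def create_recvr(self):
        self.recvr_registered.set()

    def get_batch(self, wait_s):
        # Bounded wait so the sketch ends; the real receiver has no such bound.
        return "EOS" if self.eos_delivered.wait(wait_s) else "STILL_WAITING"


def run(sender_timeout_s, recvr_delay_s, recvr_wait_s=0.2):
    mgr = ToyStreamMgr(sender_timeout_s)
    result = {}
    sender = threading.Thread(
        target=lambda: result.update(sender=mgr.close_sender()))
    sender.start()
    # Receiver creation is delayed, as with stress_datastream_recvr_delay_ms.
    timer = threading.Timer(recvr_delay_s, mgr.create_recvr)
    timer.start()
    sender.join()
    timer.join()
    result["recvr"] = mgr.get_batch(recvr_wait_s)
    return result
```

Calling run(0.05, 0.3) mimics the reported setup, where the receiver delay exceeds the sender timeout: the sender gives up first and the receiver never sees an eos. Calling run(0.5, 0.05) reverses the ordering, and the eos is delivered normally.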


