Posted to issues-all@impala.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2019/01/30 22:46:00 UTC

[jira] [Commented] (IMPALA-6159) DataStreamSender should transparently handle some connection reset by peer

    [ https://issues.apache.org/jira/browse/IMPALA-6159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756632#comment-16756632 ] 

Todd Lipcon commented on IMPALA-6159:
-------------------------------------

Ran into a similar issue today on a cluster where a couple of nodes had been hard-rebooted (power-cycled rather than gracefully restarted). The Impala cluster had been idle for a couple of days, but once I started running queries against it, they failed with TransmitData RPC errors when trying to talk to the nodes that had been power-cycled. After some investigation we diagnosed the issue as follows:

- prior to the power cycle, every impalad had a connection to the target node
- when the power was cycled, the target node didn't send any TCP RST, because the process (and the kernel) never got a chance to shut down cleanly
- when the machine came back up, the other hosts still believed they were connected to it, but the sockets were no longer open on the cycled machine
- because impalad was idle with no RPCs, this state persisted indefinitely
- when I ran a query, eventually one node wanted to send some data to the cycled node. The sender thought an RPC connection was open, so it sent the packet on that existing connection. It immediately got an RST, which failed the query (because there's no retry, as observed in this JIRA). If there were a retry, the KRPC subsystem would happily establish a new connection and proceed with the query.
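
To make the failure and the missing retry concrete, here is a minimal simulation of the scenario above. It is only a sketch and not Impala or KRPC code: `Receiver`, `Connection`, and `Sender` are hypothetical names, the `stale` flag stands in for a socket the peer no longer has open, and the sequence number is an assumed way to let the receiver drop a duplicate if a retried send happens to arrive twice.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiving side; remembers the last sequence number applied."""
    last_seq: int = -1
    rows: list = field(default_factory=list)

    def transmit(self, seq, payload):
        # Anything at or below last_seq was already applied, so drop it.
        if seq <= self.last_seq:
            return "duplicate"
        self.last_seq = seq
        self.rows.append(payload)
        return "ok"


class Connection:
    """Hypothetical connection; stale=True models a socket the peer has lost."""

    def __init__(self, receiver, stale=False):
        self.receiver = receiver
        self.stale = stale

    def send(self, seq, payload):
        if self.stale:
            # What the sender sees when the rebooted peer answers with an RST.
            raise ConnectionResetError("Connection reset by peer")
        return self.receiver.transmit(seq, payload)


class Sender:
    """Hypothetical sender that reconnects and resends once after a reset."""

    def __init__(self, receiver, conn):
        self.receiver = receiver
        self.conn = conn
        self.next_seq = 0
        self.reconnects = 0

    def transmit(self, payload, max_attempts=2):
        seq = self.next_seq
        for attempt in range(max_attempts):
            try:
                result = self.conn.send(seq, payload)
                self.next_seq += 1
                return result
            except ConnectionResetError:
                if attempt + 1 == max_attempts:
                    raise  # Out of attempts: surface the error as today.
                # Open a fresh connection and resend the same sequence number.
                self.conn = Connection(self.receiver)
                self.reconnects += 1


receiver = Receiver()
sender = Sender(receiver, Connection(receiver, stale=True))
print(sender.transmit("batch-0"), sender.reconnects)
```

With `max_attempts=1` the same call raises `ConnectionResetError`, which corresponds to the current behavior where the query fails on the first reset.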

Unfortunately, each time I run a query, only one node gets as far as sending data to the cycled node and "realizes" it's been cycled. So, with 100 nodes in the cluster, we need to run 100 queries and let them fail before every node has realized that there's been a power-cycled node. Pretty bad stuff.

As for solutions:
- TransmitData should probably retry, and use sequence numbers to ensure we don't end up with a duplicate in odd failure scenarios.
- We should enable SO_KEEPALIVE and probably TCP_USER_TIMEOUT on the TCP streams, so that when a node is cycled, the other nodes figure it out within some bounded amount of time. We should probably also set the keepalive idle time with some kind of jitter so that we don't have thundering herds of keepalive packets at regular intervals across the cluster (since most of the connections likely go idle at the same time as each other).
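
As a rough illustration of the keepalive suggestion, the sketch below enables SO_KEEPALIVE on a socket and picks a jittered idle time. It uses Python's standard `socket` module rather than KRPC, and the helper names, the 60-second base, the 20% jitter, and the probe settings are all assumptions chosen for the example. The per-connection tuning options (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`, and the user timeout, which Linux exposes as `TCP_USER_TIMEOUT`) are platform-specific, so each is guarded with `hasattr`.

```python
import random
import socket


def jittered_idle_seconds(base, jitter_fraction=0.2, rng=random):
    """Return the base idle time spread uniformly over +/- jitter_fraction."""
    spread = base * jitter_fraction
    return max(1, round(base + rng.uniform(-spread, spread)))


def enable_keepalive(sock, idle_base=60, interval=10, probes=3,
                     user_timeout_ms=None):
    """Turn on TCP keepalive with a jittered idle time; returns the idle time."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    idle = jittered_idle_seconds(idle_base)
    # Idle seconds before the first probe (jittered per connection).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between unanswered probes.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Unanswered probes before the connection is declared dead.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    # Upper bound (ms) on how long sent data may remain unacknowledged.
    if user_timeout_ms is not None and hasattr(socket, "TCP_USER_TIMEOUT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT,
                        user_timeout_ms)
    return idle


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("keepalive idle seconds:", enable_keepalive(sock, user_timeout_ms=30000))
sock.close()
```

Because each connection draws its own idle time, connections that went idle at the same moment should start probing at slightly different times rather than all at once.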

> DataStreamSender should transparently handle some connection reset by peer
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-6159
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6159
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>            Reporter: Michael Ho
>            Priority: Major
>
> A client-to-server KRPC connection can become stale if the socket was closed on the server side for various reasons, such as idle connection removal or a remote Impalad restart. Currently, the KRPC code will invoke the callback of all RPCs using that stale connection with the failed status (e.g. "Connection reset by peer"). DataStreamSender should pattern match against certain error strings (as they are mostly output from strerror()) and retry the RPC transparently. This may also be useful for KUDU-2192, which tracks the effort to detect stuck connections and close them. In that case, we may also want to transparently retry the RPC.
> FWIW, KUDU-279 is tracking the effort to have a cleaner protocol for connection teardown due to idle client connection removal on the server side. However, Impala still needs to handle other reasons for a stale connection.


