You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Ralph Little <rl...@inetco.com> on 2017/06/08 23:47:23 UTC

Native C++ Drill client handshake recovery

Hi,
I have been looking into an issue where the native client hangs after 
failing to handshake during connection.
We have some boundary conditions where a connection is attempted while 
the drill service is starting resulting in handshake failures.

--

DrillClient::connect() ends up ultimately in 
DrillClientImpl::recvHandshake().

This function calls async_read() with a callback to 
DrillClientImpl::handleHandshake to handle the results of handshaking.
However, on error, DrillClientImpl::handleHandshake ends up calling 
handleConnError() which merely calls shutdownSocket() to kill the socket 
and set m_bIsConnected to false.
When all that unfolds,  DrillClientImpl::validateHandshake() ends up 
returning CONN_SUCCESS which is clearly wrong because the handshake failed.

The original caller to DrillClient::connect() thinks everything is 
hunky-dorey.

The following added to DrillClientImpl::recvHandshake() after the 
m_io_service.run() line to check the result seems to fix the problem:

     if (!m_bIsConnected)
     {
         return CONN_HANDSHAKE_FAILED;
     }

...but I wonder if there is a better solution.

Also, although there is the IsActive() member that applications can call 
to determine if the connection is up, I wonder if the front-line drill 
calls (such as submitQuery()) could check the connection status a matter 
of course.
Currently, if you attempt a submitQuery() call when the connection is 
down, it just hangs because m_io_service is not running and 
m_deadlineTimer never triggers as a fall back.

Opinions?

Cheers,
Ralph Little

Re: Native C++ Drill client handshake recovery

Posted by Parth Chandra <pa...@apache.org>.
see inline responses

On Thu, Jun 8, 2017 at 4:47 PM, Ralph Little <rl...@inetco.com> wrote:

> Hi,
> I have been looking into an issue where the native client hangs after
> failing to handshake during connection.
> We have some boundary conditions where a connection is attempted while the
> drill service is starting resulting in handshake failures.
>
> --
>
> DrillClient::connect() ends up ultimately in DrillClientImpl::recvHandshake
> ().
>
> This function calls async_read() with a callback to
> DrillClientImpl::handleHandshake to handle the results of handshaking.
> However, on error, DrillClientImpl::handleHandshake ends up calling
> handleConnError() which merely calls shutdownSocket() to kill the socket
> and set m_bIsConnected to false.
> When all that unfolds,  DrillClientImpl::validateHandshake() ends up
> returning CONN_SUCCESS which is clearly wrong because the handshake failed.
>
> The original caller to DrillClient::connect() thinks everything is
> hunky-dorey.
>

Yes, that would be a problem. From what I remember, the recvHandshake call
blocks in m_ioservice.run. On return from run, the recvHandshake checks if
the error object m_pError is not null. m_pError is not null iff there has
been an error. Do you see this not working correctly?



> The following added to DrillClientImpl::recvHandshake() after the
> m_io_service.run() line to check the result seems to fix the problem:
>
>     if (!m_bIsConnected)
>     {
>         return CONN_HANDSHAKE_FAILED;
>     }
>
>
Certainly if it addresses the problem, then this is a perfectly good fix.



> ...but I wonder if there is a better solution.
>
Also, although there is the IsActive() member that applications can call to
> determine if the connection is up, I wonder if the front-line drill calls
> (such as submitQuery()) could check the connection status a matter of
> course.
>

AFAIK, the IsActive method is an artifact of some old initial
implementation and I'm not sure if it is used much.


> Currently, if you attempt a submitQuery() call when the connection is
> down, it just hangs because m_io_service is not running and m_deadlineTimer
> never triggers as a fall back.
>
> Opinions?
>

It is a good idea to check connection status before sending any message to
the server. LMK if you want to submit a patch :), I can review and merge it
in.


>
> Cheers,
> Ralph Little
>