Posted to dev@impala.apache.org by Antoni Ivanov <ai...@vmware.com> on 2019/08/05 22:40:46 UTC

Query status "Session Closed"

Hi,

I am investigating the most common errors we see in our Impala cluster.
The most common one is query status = 'Session Closed'.

I can see from the code (https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L1435)
that it is set when the session is closed, and this happens when the connection is closed (ConnectionEnd<https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L2094>),
which is called when the Thrift transport is closed<https://github.com/apache/impala/blob/82f753e3044bd2482f35d137fbb28516fc0ef86c/be/src/rpc/TAcceptQueueServer.cpp>. If the query has not completed or failed in some other way by then, it is marked as "Session Closed".

Does this mean that the remote end has simply dropped the connection?
E.g. there has been a network interruption, or someone killed (SIGKILL) the remote process?
We have a (TCP) load balancer (HAProxy), and I am wondering whether, for example, a load balancer TCP timeout can cause such an error. Or can a client socket timeout cause it?

I'd be grateful for any insights into the semantics of when "Session Closed" is set.



Thanks,
Antoni

Re: Query status "Session Closed"

Posted by Thomas Tauber-Marshall <tm...@cloudera.com>.
Right, if the connection is being closed you should see that log message.

If the session is being closed due to the idle session timeout, you should
see a log message like "Expiring session: <session id>", and the status of
queries that are cancelled as a result should be something like "Session
expired due to inactivity".

You can also turn on verbose logging at the VLOG_QUERY level to see whether
CloseSession() is being explicitly called over HiveServer2.
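
As a rough illustration of telling the two cases apart from the logs, here is a
minimal Python sketch. The regular expressions are built only from the message
forms quoted in this thread; the exact wording in a given Impala version may
differ, and the function names are made up for this example.

```python
import re
from collections import Counter

# Patterns derived from the log messages quoted in this thread.
# Assumption: the real messages contain these substrings; adjust as needed.
CONNECTION_CLOSED = re.compile(
    r"Connection from client (?P<client>\S+) closed, "
    r"closing (?P<num>\d+) associated session"
)
SESSION_EXPIRED = re.compile(r"Expiring session: (?P<session>\S+)")


def classify_line(line):
    """Return 'connection_closed', 'idle_timeout', or None for one log line."""
    if CONNECTION_CLOSED.search(line):
        return "connection_closed"
    if SESSION_EXPIRED.search(line):
        return "idle_timeout"
    return None


def summarize(lines):
    """Count how many log lines point at each session-close cause."""
    counts = Counter()
    for line in lines:
        cause = classify_line(line)
        if cause is not None:
            counts[cause] += 1
    return counts
```

Feeding it the lines of an impalad log file (for example
summarize(open(path)), where path is wherever your logs live) gives a quick
count of dropped connections versus idle-session expirations.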

On Tue, Aug 6, 2019 at 11:51 AM Antoni Ivanov <ai...@vmware.com> wrote:

> Thanks
> This helps
>
> Is there any way to figure out which case is which ? Either from the query
> profile or summary ?
> I guess one pointer below was to look in the logs for "Connection from
> client <client hostname> closed, closing <num> associated sessions"
>
> Thanks,
> Antoni
>
> -----Original Message-----
> From: Tim Armstrong <ta...@cloudera.com>
> Sent: Monday, August 5, 2019 5:51 PM
> To: user@impala.apache.org
> Cc: dev@impala <de...@impala.apache.org>
> Subject: Re: Query status "Session Closed"
>
> You could also be hitting a timeout on an idle session - if the client
> doesn't perform any operations in a session, then you can configure Impala
> to close the session in this fashion. See
>
> https://impala.apache.org/docs/build/html/topics/impala_timeouts.html
>
> On Mon, Aug 5, 2019 at 5:15 PM Thomas Tauber-Marshall <
> tmarshall@cloudera.com> wrote:
>
> > Impala has two client interfaces with slightly different session
> behavior:
> >
> > Beewax (default port 21000) - sessions are created when the client
> > connects
> > Hiveserver2 (default port 21050) - sessions are created when
> > OpenSession() is called
> >
> > In both cases, sessions can be closed either if
> > - the connection ends
> > - someone presses "close" on the /sessions page of the debug webui
> >
> > For hiveserver2, sessions can also be closed if CloseSession() is
> > explicitly called by the client
> >
> > So probably the most likely cause of your issue is that the
> > connections are being dropped somehow, possibly due to the load balancer.
> >
> > If sessions are in fact getting closed due to connections being
> > dropped, you should see some lines of the form "Connection from client
> > <client
> > hostname> closed, closing <num> associated sessions"
> >
> > Fwiw, this behavior was recently changed so that hiveserver2 sessions
> > are not closed immediately when the connection ends but timeout if
> > there hasn't been an associated connection for a configurable timeout:
> > https://issues.apache.org/jira/browse/IMPALA-1653 though we haven't done a
> > release since that work went in
> >
> > On Mon, Aug 5, 2019 at 3:40 PM Antoni Ivanov <ai...@vmware.com> wrote:
> >
> >> Hi,
> >>
> >> I am investigating the most common errors we see in our Impala Cluster.
> >> The most common is with query status = 'Session Closed'
> >>
> >> I can see from the code (
> >> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L1435
> >> )
> >> that it is set when Session is closed and this happens when
> >> connection is closed (ConnectionEnd<
> >> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L2094
> >> >)
> >> and this is called when Thrift transport is closed<
> >> https://github.com/apache/impala/blob/82f753e3044bd2482f35d137fbb28516fc0ef86c/be/src/rpc/TAcceptQueueServer.cpp>
> >> (and query has not completed or failed in some way it would be marked
> >> as Session Closed
> >>
> >> Does this mean that the remote end has simply dropped the connection ?
> >> E.g there has been network interruption or someone killed (SIGKILL)
> >> the remote process ?
> >> We have (TCP) load balancer (HaProxy) and I am wondering if for
> >> example Load Balancer tcp timeout can cause such error. Or can client
> >> socket timeout cause it?
> >>
> >> I'd be grateful for any insides into the semantics of when "Session
> >> Closed" is set.
> >>
> >>
> >>
> >> Thanks,
> >> Antoni
> >>
> >
>

RE: Query status "Session Closed"

Posted by Antoni Ivanov <ai...@vmware.com>.
Thanks, this helps.

Is there any way to figure out which case is which, either from the query profile or the summary?
I guess one pointer below was to look in the logs for "Connection from client <client hostname> closed, closing <num> associated sessions".

Thanks,
Antoni 

-----Original Message-----
From: Tim Armstrong <ta...@cloudera.com> 
Sent: Monday, August 5, 2019 5:51 PM
To: user@impala.apache.org
Cc: dev@impala <de...@impala.apache.org>
Subject: Re: Query status "Session Closed"

You could also be hitting a timeout on an idle session - if the client doesn't perform any operations in a session, then you can configure Impala to close the session in this fashion. See
https://impala.apache.org/docs/build/html/topics/impala_timeouts.html

On Mon, Aug 5, 2019 at 5:15 PM Thomas Tauber-Marshall < tmarshall@cloudera.com> wrote:

> Impala has two client interfaces with slightly different session behavior:
>
> Beewax (default port 21000) - sessions are created when the client 
> connects
> Hiveserver2 (default port 21050) - sessions are created when 
> OpenSession() is called
>
> In both cases, sessions can be closed either if
> - the connection ends
> - someone presses "close" on the /sessions page of the debug webui
>
> For hiveserver2, sessions can also be closed if CloseSession() is 
> explicitly called by the client
>
> So probably the most likely cause of your issue is that the 
> connections are being dropped somehow, possibly due to the load balancer.
>
> If sessions are in fact getting closed due to connections being 
> dropped, you should see some lines of the form "Connection from client 
> <client
> hostname> closed, closing <num> associated sessions"
>
> Fwiw, this behavior was recently changed so that hiveserver2 sessions 
> are not closed immediately when the connection ends but timeout if 
> there hasn't been an associated connection for a configurable timeout:
> https://issues.apache.org/jira/browse/IMPALA-1653 though we haven't done a
> release since that work went in
>
> On Mon, Aug 5, 2019 at 3:40 PM Antoni Ivanov <ai...@vmware.com> wrote:
>
>> Hi,
>>
>> I am investigating the most common errors we see in our Impala Cluster.
>> The most common is with query status = 'Session Closed'
>>
>> I can see from the code (
>> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L1435
>> )
>> that it is set when Session is closed and this happens when 
>> connection is closed (ConnectionEnd<
>> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L2094
>> >)
>> and this is called when Thrift transport is closed< 
>> https://github.com/apache/impala/blob/82f753e3044bd2482f35d137fbb28516fc0ef86c/be/src/rpc/TAcceptQueueServer.cpp>
>> (and query has not completed or failed in some way it would be marked 
>> as Session Closed
>>
>> Does this mean that the remote end has simply dropped the connection ?
>> E.g there has been network interruption or someone killed (SIGKILL) 
>> the remote process ?
>> We have (TCP) load balancer (HaProxy) and I am wondering if for 
>> example Load Balancer tcp timeout can cause such error. Or can client 
>> socket timeout cause it?
>>
>> I'd be grateful for any insides into the semantics of when "Session 
>> Closed" is set.
>>
>>
>>
>> Thanks,
>> Antoni
>>
>

Re: Query status "Session Closed"

Posted by Tim Armstrong <ta...@cloudera.com>.
You could also be hitting a timeout on an idle session: Impala can be
configured to close a session once the client has not performed any
operations in it for some time. See
https://impala.apache.org/docs/build/html/topics/impala_timeouts.html

On Mon, Aug 5, 2019 at 5:15 PM Thomas Tauber-Marshall <
tmarshall@cloudera.com> wrote:

> Impala has two client interfaces with slightly different session behavior:
>
> Beewax (default port 21000) - sessions are created when the client connects
> Hiveserver2 (default port 21050) - sessions are created when OpenSession()
> is called
>
> In both cases, sessions can be closed either if
> - the connection ends
> - someone presses "close" on the /sessions page of the debug webui
>
> For hiveserver2, sessions can also be closed if CloseSession() is
> explicitly called by the client
>
> So probably the most likely cause of your issue is that the connections
> are being dropped somehow, possibly due to the load balancer.
>
> If sessions are in fact getting closed due to connections being dropped,
> you should see some lines of the form "Connection from client <client
> hostname> closed, closing <num> associated sessions"
>
> Fwiw, this behavior was recently changed so that hiveserver2 sessions are
> not closed immediately when the connection ends but timeout if there hasn't
> been an associated connection for a configurable timeout:
> https://issues.apache.org/jira/browse/IMPALA-1653 though we haven't done
> a release since that work went in
>
> On Mon, Aug 5, 2019 at 3:40 PM Antoni Ivanov <ai...@vmware.com> wrote:
>
>> Hi,
>>
>> I am investigating the most common errors we see in our Impala Cluster.
>> The most common is with query status = 'Session Closed'
>>
>> I can see from the code (
>> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L1435
>> )
>> that it is set when Session is closed and this happens when connection is
>> closed (ConnectionEnd<
>> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L2094
>> >)
>> and this is called when Thrift transport is closed<
>> https://github.com/apache/impala/blob/82f753e3044bd2482f35d137fbb28516fc0ef86c/be/src/rpc/TAcceptQueueServer.cpp>
>> (and query has not completed or failed in some way it would be marked as
>> Session Closed
>>
>> Does this mean that the remote end has simply dropped the connection ?
>> E.g there has been network interruption or someone killed (SIGKILL) the
>> remote process ?
>> We have (TCP) load balancer (HaProxy) and I am wondering if for example
>> Load Balancer tcp timeout can cause such error. Or can client socket
>> timeout cause it?
>>
>> I'd be grateful for any insides into the semantics of when "Session
>> Closed" is set.
>>
>>
>>
>> Thanks,
>> Antoni
>>
>

Re: Query status "Session Closed"

Posted by Thomas Tauber-Marshall <tm...@cloudera.com>.
Impala has two client interfaces with slightly different session behavior:

- Beeswax (default port 21000): sessions are created when the client connects
- HiveServer2 (default port 21050): sessions are created when OpenSession()
  is called

In both cases, a session can be closed if either
- the connection ends, or
- someone presses "close" on the /sessions page of the debug web UI

For HiveServer2, sessions can also be closed if CloseSession() is
explicitly called by the client.

So the most likely cause of your issue is that the connections are
being dropped somehow, possibly by the load balancer.

If sessions are in fact getting closed because connections are being dropped,
you should see log lines of the form "Connection from client <client
hostname> closed, closing <num> associated sessions".

FWIW, this behavior was recently changed so that HiveServer2 sessions are
not closed immediately when the connection ends, but instead time out if there
has been no associated connection for a configurable period:
https://issues.apache.org/jira/browse/IMPALA-1653 (though we haven't done a
release since that work went in).
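
To make the rules above concrete, here is a small Python sketch that just
encodes them as data. It is a toy model of the description in this message
(behavior before IMPALA-1653), not Impala code; the enum and function names
are invented for illustration.

```python
from enum import Enum


class Interface(Enum):
    BEESWAX = "beeswax"          # default port 21000, session starts on connect
    HIVESERVER2 = "hiveserver2"  # default port 21050, session starts on OpenSession()


def possible_close_causes(interface):
    """List the ways a session can be closed, per the description above."""
    causes = [
        "connection ended",
        "closed from the /sessions page of the debug web UI",
    ]
    # Only HiveServer2 clients have an explicit CloseSession() call.
    if interface is Interface.HIVESERVER2:
        causes.append("client called CloseSession()")
    return causes
```

For a Beeswax client the list has two entries; for HiveServer2 it has three,
so an unexpected close on Beeswax narrows down to a dropped connection or a
manual close in the web UI.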

On Mon, Aug 5, 2019 at 3:40 PM Antoni Ivanov <ai...@vmware.com> wrote:

> Hi,
>
> I am investigating the most common errors we see in our Impala Cluster.
> The most common is with query status = 'Session Closed'
>
> I can see from the code (
> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L1435
> )
> that it is set when Session is closed and this happens when connection is
> closed (ConnectionEnd<
> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L2094
> >)
> and this is called when Thrift transport is closed<
> https://github.com/apache/impala/blob/82f753e3044bd2482f35d137fbb28516fc0ef86c/be/src/rpc/TAcceptQueueServer.cpp>
> (and query has not completed or failed in some way it would be marked as
> Session Closed
>
> Does this mean that the remote end has simply dropped the connection ?
> E.g there has been network interruption or someone killed (SIGKILL) the
> remote process ?
> We have (TCP) load balancer (HaProxy) and I am wondering if for example
> Load Balancer tcp timeout can cause such error. Or can client socket
> timeout cause it?
>
> I'd be grateful for any insides into the semantics of when "Session
> Closed" is set.
>
>
>
> Thanks,
> Antoni
>

Re: Query status "Session Closed"

Posted by Hendry Suwanda <su...@gmail.com>.
Hi,

I previously ran into the same situation; the root cause was related to the
HAProxy config below:

> timeout client          1h
> timeout server          1h
>

Maybe it's the same in your case.
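
As a quick way to check whether such timeouts could be cutting off idle
connections, here is a minimal Python sketch. It assumes the HAProxy time
format (a number with an optional us/ms/s/m/h/d suffix, milliseconds when no
suffix is given), ignores lines with trailing comments, and uses made-up
function names.

```python
import re

# Unit suffixes HAProxy accepts on time values, expressed in milliseconds.
# A bare number without a suffix is interpreted as milliseconds.
_UNIT_MS = {
    "us": 0.001,
    "ms": 1,
    "s": 1000,
    "m": 60_000,
    "h": 3_600_000,
    "d": 86_400_000,
}

_TIMEOUT_LINE = re.compile(
    r"^\s*timeout\s+(client|server)\s+(\d+)(us|ms|s|m|h|d)?\s*$"
)


def haproxy_timeouts_ms(config_text):
    """Return e.g. {'client': ms, 'server': ms} for the timeout lines found."""
    found = {}
    for line in config_text.splitlines():
        match = _TIMEOUT_LINE.match(line)
        if match:
            side, value, unit = match.groups()
            found[side] = int(value) * _UNIT_MS[unit or "ms"]
    return found


def may_be_cut_off(idle_seconds, timeouts_ms):
    """True if a connection idle for idle_seconds reaches either timeout."""
    return any(idle_seconds * 1000 >= limit for limit in timeouts_ms.values())
```

With the two 1h lines above, a connection that stays silent for two hours
(say, a client waiting on a long query) would be flagged, while one that is
idle for a minute would not.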


On Tue, Aug 6, 2019 at 5:40 AM Antoni Ivanov <ai...@vmware.com> wrote:

> Hi,
>
>
>
> I am investigating the most common errors we see in our Impala Cluster.
>
> The most common is with query status = ‘Session Closed’
>
>
>
> I can see from the code (
> https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L1435)
>
>
> that it is set when Session is closed and this happens when connection is
> closed (ConnectionEnd
> <https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L2094>
> )
>
> and this is called when Thrift transport is closed
> <https://github.com/apache/impala/blob/82f753e3044bd2482f35d137fbb28516fc0ef86c/be/src/rpc/TAcceptQueueServer.cpp>
> (and query has not completed or failed in some way it would be marked as
> Session Closed
>
>
>
> Does this mean that the remote end has simply dropped the connection ?
>
> E.g there has been network interruption or someone killed (SIGKILL) the
> remote process ?
>
> We have (TCP) load balancer (HaProxy) and I am wondering if for example
> Load Balancer tcp timeout can cause such error. Or can client socket
> timeout cause it?
>
>
>
> I’d be grateful for any insides into the semantics of when “Session
> Closed” is set.
>
>
>
>
>
>
>
> Thanks,
>
> Antoni
>


-- 
Regards,


Hendry Suwanda

Github: https://github.com/hendrysuwanda
Blog: http://hendrysuwanda.github.io/
