
Posted to user@drill.apache.org by Wesley Chow <we...@chartbeat.com> on 2017/03/27 20:07:21 UTC

JDBC disconnections over remote networks

hi all,

I've been noticing that queries that return large numbers of rows (1M+,
each row maybe around 500 bytes) via the JDBC connector (and thus sqlline)
from our office to drillbits in EC2 consistently disconnect with a
connection error while streaming the results back. The same query initiated
from an EC2 machine works fine. Any thoughts on what I should be looking
at? When the disconnection occurs, none of my other network connections
such as ssh are affected, just the Drill JDBC connector.

Thanks,
Wes

Re: JDBC disconnections over remote networks

Posted by Kunal Khatua <kk...@mapr.com>.

Hi Wesley

I don't believe G1GC is the issue here. The client libraries have an internal buffer that is most likely filling up.

Your problem sounds similar to https://issues.apache.org/jira/browse/DRILL-5217, though yours involves JDBC rather than ODBC (the C++ client).

The problem is due to the large size of the result set. Assuming your result set has 50 columns and 1M rows, and each field is 20 bytes (lots of varchar columns), you're looking at about 1 GB of data being held by the JDBC client. This is not unreasonable, but it is still quite a bit.
It is likely that, because the JDBC client does not consume the results fast enough, the underlying result listener thread blocks until memory is available in the result buffer (I'm not sure how much memory is allotted to the application).
The same listener thread is also responsible for sending heartbeats, so while it is blocked it cannot send them, and the server closes the connection. This could result in the JDBC disconnection.
You could append a LIMIT clause to the query to find the 'sweet spot' below which disconnections don't happen.
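
The size estimate above can be reproduced with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are hypothetical, and it ignores per-row and per-batch overhead, so the real client footprint will be somewhat larger.

```java
// Rough estimate of the raw payload a client must hold for a result set.
// Class and method names are hypothetical, chosen for illustration only.
final class ResultSetSizeEstimate {

    // rows * columns * average bytes per field. Overhead is ignored,
    // so treat the result as a lower bound. Throws ArithmeticException
    // if the product overflows a long.
    static long estimateBytes(long rows, int columns, int avgBytesPerField) {
        long cells = Math.multiplyExact(rows, (long) columns);
        return Math.multiplyExact(cells, (long) avgBytesPerField);
    }

    // Convert bytes to decimal gigabytes (1 GB = 10^9 bytes).
    static double toGigabytes(long bytes) {
        return bytes / 1e9;
    }
}
```

With the figures above, estimateBytes(1_000_000L, 50, 20) gives 1,000,000,000 bytes, i.e. about 1 GB. With the numbers from the original post (1M rows at roughly 500 bytes per row), estimateBytes(1_000_000L, 1, 500) gives about 0.5 GB.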


Kunal Khatua

________________________________
From: rahul challapalli <ch...@gmail.com>
Sent: Thursday, March 30, 2017 9:05:20 AM
To: user
Subject: Re: JDBC disconnections over remote networks

I haven't used G1GC in any of my testing. So I cannot comment much on
whether it would be helpful or not.

Re: JDBC disconnections over remote networks

Posted by rahul challapalli <ch...@gmail.com>.

I haven't used G1GC in any of my testing. So I cannot comment much on
whether it would be helpful or not.

On Thu, Mar 30, 2017 at 8:35 AM, Wesley Chow <we...@chartbeat.com> wrote:

> Sorry I haven't had time to look into this much and fix our logging setup,
> but I did try explicitly setting JVM heap values in the client rather than
> relying on the default allocation and after a few runs it does seem that
> fixed it. I'm going to cautiously say that was the issue. Thanks!
>
> Would it be prudent to use G1GC for all our clients, since its pauses are
> supposed to be far less severe?
>
> Wes

Re: JDBC disconnections over remote networks

Posted by Wesley Chow <we...@chartbeat.com>.

Sorry I haven't had time to look into this much and fix our logging setup,
but I did try explicitly setting JVM heap values in the client rather than
relying on the default allocation and after a few runs it does seem that
fixed it. I'm going to cautiously say that was the issue. Thanks!

Would it be prudent to use G1GC for all our clients, since its pauses are
supposed to be far less severe?
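
For anyone checking the same thing: a minimal sketch for confirming how much heap the client JVM actually received. It assumes the heap is set with the standard HotSpot flag -Xmx (and G1 enabled with -XX:+UseG1GC); how those flags reach the JVM that sqlline starts is not covered in this thread and depends on your installation. The class and method names are hypothetical.

```java
// Reports the maximum heap of the JVM it runs in. Useful for confirming
// that an explicit -Xmx setting took effect instead of the default
// allocation. Names are hypothetical, for illustration only.
final class ClientHeapReport {

    // Maximum heap the running JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // True if the JVM's maximum heap is at least requiredMiB.
    static boolean hasAtLeastMiB(long requiredMiB) {
        return maxHeapMiB() >= requiredMiB;
    }
}
```

Calling ClientHeapReport.maxHeapMiB() at client startup and comparing it with the estimated result-set size gives a quick sanity check before running a large query.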

Wes


On Tue, Mar 28, 2017 at 1:42 PM, rahul challapalli <
challapallirahul@gmail.com> wrote:

> Also, how much memory did you configure your client to use? If the client
> does not have sufficient memory to run, the garbage collector could start
> running and cause the client to become unresponsive to heartbeats. Kindly
> check the sqlline logs as well for any exceptions.

Re: JDBC disconnections over remote networks

Posted by rahul challapalli <ch...@gmail.com>.

Also, how much memory did you configure your client to use? If the client
does not have sufficient memory to run, the garbage collector could start
running and cause the client to become unresponsive to heartbeats. Kindly
check the sqlline logs as well for any exceptions.
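
One way to check whether garbage collection is a factor is to read the JVM's own GC counters from inside the client process. The sketch below uses the standard java.lang.management API. The class name is hypothetical, and a growing total while fetching results only hints at GC pressure; it does not prove that a missed heartbeat was caused by a pause.

```java
// Sums the accumulated GC time reported by the running JVM.
// Class name is hypothetical, for illustration only.
final class GcTimeReport {

    // Total approximate collection time in milliseconds across all
    // collectors. getCollectionTime() returns -1 when the value is
    // undefined for a collector, so non-positive values are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (java.lang.management.GarbageCollectorMXBean gc :
                java.lang.management.ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }
}
```

Sampling totalGcMillis() before and after a large fetch shows how much of the elapsed time the client spent collecting garbage.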

On Mon, Mar 27, 2017 at 1:43 PM, Wesley Chow <we...@chartbeat.com> wrote:

> That's totally possible. The ErrorIds are stored on the drillbit machines,
> right? Our logging is configured incorrectly at the moment so I can't find
> the error. I will fix that and report back.
>
> If I limit to 100,000 rows the query consistently works. If I limit to 1M
> rows then the query consistently disconnects. If I CTAS on 1M rows then it
> works, so it does appear to be an issue only when returning results to the
> client. I don't know if there is some value between 100k and 1M for which
> it sometimes works and sometimes doesn't. Is that useful to know? I can do
> a little binary searching on values if that would help.
>
> Wes

Re: JDBC disconnections over remote networks

Posted by Wesley Chow <we...@chartbeat.com>.

That's totally possible. The ErrorIds are stored on the drillbit machines,
right? Our logging is configured incorrectly at the moment so I can't find
the error. I will fix that and report back.

If I limit to 100,000 rows the query consistently works. If I limit to 1M
rows then the query consistently disconnects. If I CTAS on 1M rows then it
works, so it does appear to be an issue only when returning results to the
client. I don't know if there is some value between 100k and 1M for which
it sometimes works and sometimes doesn't. Is that useful to know? I can do
a little binary searching on values if that would help.
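
The binary search mentioned above could look roughly like this. It is a sketch under a strong assumption: that success is monotonic in the LIMIT value (every limit up to some threshold works and every larger one fails). If there is a flaky band in between, the answer only marks one point inside it. The class and method names are hypothetical, and the probe that actually runs the query with a given LIMIT is left to the caller.

```java
// Bisects between a LIMIT known to work and a LIMIT known to fail.
// Names are hypothetical, for illustration only.
final class LimitBisect {

    // Returns the largest limit in [knownGood, knownBad) for which
    // queryWorks reports success, assuming monotonic behaviour.
    static long largestWorkingLimit(long knownGood, long knownBad,
            java.util.function.LongPredicate queryWorks) {
        if (knownGood >= knownBad) {
            throw new IllegalArgumentException("knownGood must be less than knownBad");
        }
        long lo = knownGood; // invariant: the query works at lo
        long hi = knownBad;  // invariant: the query fails at hi
        while (hi - lo > 1) {
            long mid = lo + (hi - lo) / 2;
            if (queryWorks.test(mid)) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

Starting from the two data points above (100,000 works, 1M fails), this needs about 20 probe queries to narrow the threshold to a single row count. In practice a much coarser stopping condition would be enough.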

Wes

On Mon, Mar 27, 2017 at 4:13 PM, rahul challapalli <
challapallirahul@gmail.com> wrote:

> Do you think that the error you are seeing is related to DRILL-4708
> <https://issues.apache.org/jira/browse/DRILL-4708>? If not, kindly provide
> more information about the error (message, stack trace, etc.). Also, does
> the connection error happen consistently after returning X number of
> records, or is it random?
>
> - Rahul

Re: JDBC disconnections over remote networks

Posted by rahul challapalli <ch...@gmail.com>.

Do you think that the error you are seeing is related to DRILL-4708
<https://issues.apache.org/jira/browse/DRILL-4708>? If not, kindly provide
more information about the error (message, stack trace, etc.). Also, does
the connection error happen consistently after returning X number of
records, or is it random?

- Rahul

On Mon, Mar 27, 2017 at 1:07 PM, Wesley Chow <we...@chartbeat.com> wrote:

> hi all,
>
> I've been noticing that queries that return large numbers of rows (1M+,
> each row maybe around 500 bytes) via the JDBC connector (and thus sqlline)
> from our office to drillbits in EC2 consistently disconnect with a
> connection error while streaming the results back. The same query initiated
> from an EC2 machine works fine. Any thoughts on what I should be looking
> at? When the disconnection occurs, none of my other network connections
> such as ssh are affected, just the Drill JDBC connector.
>
> Thanks,
> Wes
>