Posted to user@drill.apache.org by John Omernik <jo...@omernik.com> on 2016/05/23 16:31:21 UTC

Issue with Queries Hanging

Hey all, this is a separate, yet related, issue to my other posts RE Parquet;
however, I thought I'd post this to see if this is normal or should be
handled (and/or JIRAed).

I am running Drill 1.6. If you've read the other posts, I am trying to CTAS
a largish amount of data (120 GB) from Parquet to better Parquet.

As I am running, I sometimes get the Index Out of Bounds (as in the other
threads), but depending on source data and/or settings like using the new
Parquet reader, I get an odd situation.

When I refresh the profile in the Web UI, I get an error: "VALIDATION ERROR: no
profile with given query id '' exists"

I am running this in sqlline, and at this point, there is no error, but I
can't access my query profile.
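
To rule out the Web UI itself, the profile can also be requested directly over Drill's REST interface. This is only a rough sketch: it assumes the drillbit's web port is the default 8047 and that `/profiles/<queryid>.json` is reachable without authentication. The host name and the helper names are placeholders, not anything from my setup.

```python
import json
import urllib.parse
import urllib.request


def profile_url(host, query_id, port=8047):
    """Build the REST URL for one query profile; refuse an empty query id."""
    if not query_id:
        # An empty id is what the error message above shows (query id '').
        raise ValueError("query id is empty")
    quoted = urllib.parse.quote(query_id, safe="")
    return "http://%s:%d/profiles/%s.json" % (host, port, quoted)


def fetch_profile(host, query_id, port=8047, timeout=10):
    """Return the profile as parsed JSON, or None if the request fails."""
    url = profile_url(host, query_id, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except OSError:
        # Covers HTTP errors, refused connections, and timeouts.
        return None
```

Calling `fetch_profile("drillbit-host", "<query id from sqlline>")` against the foreman and getting `None` back would suggest the profile really is missing or the node is not answering, rather than the page just being slow.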

Other notes:

1. The Web UI is HORRIBLY slow.
2. If I cancel the query, it will show me some written Parquet, but obviously
it wasn't finished.
3. There are no errors in any of the drillbit log files (except the foreman,
which starts to get "WARN" messages: "Message of mode (REQUEST or RESPONSE) of
type 8 (or type 1) took longer than 500ms. Actual duration was (a high number
of ms, between 1900 and 3500 ms)").
4. Like I said, no errors, just everything appears to hang.
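
To put numbers on the slow-message warnings in note 3, something like the following could summarize them from a drillbit log. It is a sketch under assumptions: the regular expression is based on my paraphrase of the warning text, so the exact wording and spacing in a real log may differ, and the function names are made up for illustration.

```python
import re

# Assumed shape of the warning; adjust if the real log line differs.
SLOW_RPC = re.compile(
    r"Message of mode (?P<mode>\w+) of type (?P<type>\d+) took longer than "
    r"(?P<limit>\d+)\s*ms\.\s*Actual duration was (?P<actual>\d+)\s*ms"
)


def slow_rpc_durations(lines):
    """Yield (mode, rpc_type, actual_ms) for every line matching the warning."""
    for line in lines:
        match = SLOW_RPC.search(line)
        if match:
            yield (
                match.group("mode"),
                int(match.group("type")),
                int(match.group("actual")),
            )


def summarize(lines):
    """Return count/min/max of the reported durations, or None if none match."""
    durations = [ms for _, _, ms in slow_rpc_durations(lines)]
    if not durations:
        return None
    return {
        "count": len(durations),
        "min_ms": min(durations),
        "max_ms": max(durations),
    }
```

Passing an open log file to `summarize` (a file object iterates line by line) gives a quick view of how many warnings there were and how far past the 500 ms threshold they went.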

My memory and such seem good here: I have 96 GB of DIRECT RAM per node
and 12 GB of HEAP per node, across 5 nodes.

The cluster seems really sluggish and out of sorts until I restart the
drillbits... This seems like a very bad "error state".

Has anyone seen this? Any thoughts on this? Should I open a JIRA?


Thanks,
John

Re: Issue with Queries Hanging

Posted by John Omernik <jo...@omernik.com>.
Distributed.  (MapR FS, but via NFS)

On Mon, May 23, 2016 at 3:26 PM, Abdel Hakim Deneche <ad...@maprtech.com>
wrote:

> One question about the missing query profile: do you store the query
> profiles in the local file system or the distributed file system?

Re: Issue with Queries Hanging

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
One question about the missing query profile: do you store the query
profiles in the local file system or the distributed file system?

-- 

Abdelhakim Deneche

Software Engineer


Re: Issue with Queries Hanging

Posted by John Omernik <jo...@omernik.com>.
Note: after letting one just hang for a long time, I did see a message
about HEAP space... I was running with 12 GB of Heap and 96 GB of Direct; I
switched to 24 GB of Heap and 84 GB of Direct, and now my queries fail, but
all with the index out of bounds issue, and my drillbits stay
responsive.

Could this hanging issue be related to Heap?  I thought 12 GB would be
quite a bit of Heap, but I guess not? Could we handle this better and provide
better errors?  Maybe the unresponsiveness is due to some GC or other
work? (Just spitballing ideas here.)

Thoughts?

John
