Posted to dev@drill.apache.org by Parth Chandra <pc...@maprtech.com> on 2015/09/01 07:52:17 UTC

Re: Partition pruning inconsistency

A better idea would be to return a warning in the results and let JDBC/ODBC
show a warning with the result data.

On Wed, Aug 26, 2015 at 8:31 AM, Aman Sinha <as...@maprtech.com> wrote:

> We have had some issues where the same query, run at different times
> (possibly with other queries running concurrently; I am not sure about the
> concurrency level), either performed partition pruning or did not. Pruning
> failed for a couple of reasons:
>   (a) allocateNew() in the PruneScanRule failed with an out-of-memory
> condition
>   (b) the interpreter evaluator encountered an error while evaluating a
> particular expression type
>
> The PruneScanRule currently logs a warning message and does not fail the
> query, since pruning is a performance optimization. While we will address the
> root causes of (a) and (b) (there's a JIRA open for (b)), an important
> issue is the inconsistent behavior of a query.
>
> Should we provide a system setting that allows the query to fail in this
> situation?
> Note that other rules in the optimizer could also fail, and some rules log
> warnings, but those failures are very rare, while the PruneScan rule does
> more complex operations (creating value vectors, doing interpreter
> evaluation), so the chance of something failing increases.
>
> Aman
>
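
A minimal Java sketch of the "warn or fail" setting proposed in the quoted message. It is not Drill code: the class, the OnPruneFailure setting, and the prune() helper are hypothetical names invented for illustration, and partitions are modeled as plain strings. It only shows the two behaviors under discussion: fall back to scanning everything and record a warning, or fail the query.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names are Drill APIs.
final class PruneFailurePolicySketch {

    /** Hypothetical setting: what to do when the pruning optimization itself fails. */
    enum OnPruneFailure { WARN_AND_CONTINUE, FAIL_QUERY }

    /** Outcome of one pruning attempt: partitions to scan plus any warnings raised. */
    static final class PruneResult {
        final List<String> partitionsToScan;
        final List<String> warnings;

        PruneResult(List<String> partitionsToScan, List<String> warnings) {
            this.partitionsToScan = partitionsToScan;
            this.warnings = warnings;
        }
    }

    /**
     * Runs the supplied pruner. If it throws, either rethrows (FAIL_QUERY) or
     * falls back to the unpruned partition list and records a warning.
     */
    static PruneResult prune(List<String> allPartitions,
                             Supplier<List<String>> pruner,
                             OnPruneFailure policy) {
        try {
            return new PruneResult(pruner.get(), Collections.<String>emptyList());
        } catch (RuntimeException e) {
            if (policy == OnPruneFailure.FAIL_QUERY) {
                throw new IllegalStateException(
                        "Partition pruning failed: " + e.getMessage(), e);
            }
            // Pruning is only an optimization, so the query can still run unpruned.
            return new PruneResult(allPartitions,
                    Collections.singletonList(
                            "Partition pruning skipped: " + e.getMessage()));
        }
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("dir0=2014", "dir0=2015");
        // Simulates failure (a) from the message above.
        Supplier<List<String>> failing = () -> {
            throw new RuntimeException("out of memory in allocateNew()");
        };
        PruneResult r = prune(all, failing, OnPruneFailure.WARN_AND_CONTINUE);
        System.out.println("scan: " + r.partitionsToScan);
        System.out.println("warnings: " + r.warnings);
    }
}
```

With WARN_AND_CONTINUE the caller gets the full partition list and one warning; with FAIL_QUERY the same failure surfaces as an exception, which gives consistent behavior at the cost of failing a query that could have run.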

Re: Partition pruning inconsistency

Posted by Aman Sinha <as...@maprtech.com>.
I like the idea of showing an info or warning message with the returned
query result.  This can be leveraged for other things, for example
non-existent columns showing up as nulls.
Hopefully, the warnings are few enough that they don't 'accumulate' for a
single query...

Aman

On Mon, Aug 31, 2015 at 11:17 PM, Parth Chandra <pc...@maprtech.com>
wrote:

> Yes, we would need to enhance the protocol a bit. Depending on whether the
> warning is issued before results are sent (as in planning/optimization) or
> after the results are completed (for example, we could issue a warning if,
> in the future, Drill decides to drop bad rows and continue), we would send a
> QUERY_RUNNING_WITH_INFO or a QUERY_COMPLETED_WITH_INFO status message, or
> something similar. The status message can already carry a DrillPBError
> message with it.
> Of course, JDBC, the C++ client, and ODBC will all have to be updated.
> We could potentially add more information to send back to the client
> in the status message, including DDL status etc.

Re: Partition pruning inconsistency

Posted by Parth Chandra <pc...@maprtech.com>.
Yes, we would need to enhance the protocol a bit. Depending on whether the
warning is issued before results are sent (as in planning/optimization) or
after the results are completed (for example, we could issue a warning if,
in the future, Drill decides to drop bad rows and continue), we would send a
QUERY_RUNNING_WITH_INFO or a QUERY_COMPLETED_WITH_INFO status message, or
something similar. The status message can already carry a DrillPBError
message with it.
Of course, JDBC, the C++ client, and ODBC will all have to be updated.
We could potentially add more information to send back to the client
in the status message, including DDL status etc.
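
A rough Java sketch of how a client might consume the two proposed status values. Everything here is an assumption for illustration: only the names QUERY_RUNNING_WITH_INFO and QUERY_COMPLETED_WITH_INFO come from the message above (and even those are offered as "something similar"), the plain String payload stands in for the DrillPBError the status message would carry, and none of this reflects Drill's actual RPC classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not Drill's RPC protocol or client code.
final class QueryStatusSketch {

    /** Query states, including the two informational states proposed above. */
    enum QueryState {
        RUNNING, QUERY_RUNNING_WITH_INFO, COMPLETED, QUERY_COMPLETED_WITH_INFO;

        boolean carriesInfo() {
            return this == QUERY_RUNNING_WITH_INFO || this == QUERY_COMPLETED_WITH_INFO;
        }

        boolean isTerminal() {
            return this == COMPLETED || this == QUERY_COMPLETED_WITH_INFO;
        }
    }

    /** A status message; 'info' is a stand-in for the DrillPBError payload. */
    static final class StatusMessage {
        final QueryState state;
        final String info; // null when the message carries no warning

        StatusMessage(QueryState state, String info) {
            this.state = state;
            this.info = info;
        }
    }

    /** Client-side view of one query: collects warnings as status messages arrive. */
    static final class ClientSideQuery {
        private final List<String> warnings = new ArrayList<>();
        private boolean done;

        void onStatus(StatusMessage message) {
            if (message.state.carriesInfo() && message.info != null) {
                warnings.add(message.info);
            }
            if (message.state.isTerminal()) {
                done = true;
            }
        }

        List<String> warnings() {
            return Collections.unmodifiableList(warnings);
        }

        boolean isDone() {
            return done;
        }
    }

    public static void main(String[] args) {
        ClientSideQuery query = new ClientSideQuery();
        // A planning-time warning arrives before any results.
        query.onStatus(new StatusMessage(
                QueryState.QUERY_RUNNING_WITH_INFO, "Partition pruning was skipped"));
        // An execution-time warning arrives with completion.
        query.onStatus(new StatusMessage(
                QueryState.QUERY_COMPLETED_WITH_INFO, "Dropped bad rows"));
        System.out.println("done=" + query.isDone() + " warnings=" + query.warnings());
    }
}
```

Separating "carries info" from "is terminal" keeps the two cases in the message distinct: a warning raised during planning does not end the query, while one raised at completion does.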


On Mon, Aug 31, 2015 at 10:57 PM, Jacques Nadeau <ja...@dremio.com> wrote:

> I've been thinking that we need to add support for returning warnings.
> Have you looked at how to add them to JDBC or ODBC?  We'll need to update
> the RPC protocol, since I believe we don't currently have an accommodation
> for warnings. Maybe add this along with DDL queries?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio

Re: Partition pruning inconsistency

Posted by Jacques Nadeau <ja...@dremio.com>.
I've been thinking that we need to add support for returning warnings.
Have you looked at how to add them to JDBC or ODBC?  We'll need to update
the RPC protocol, since I believe we don't currently have an accommodation
for warnings. Maybe add this along with DDL queries?

--
Jacques Nadeau
CTO and Co-Founder, Dremio
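
On the JDBC side, the standard API already has a place for this: java.sql.SQLWarning objects form a chain that clients read via getWarnings() on a Connection, Statement, or ResultSet. The sketch below only walks such a chain; the warnings are built by hand and their texts are made up, since how (or whether) Drill's driver would populate them is exactly what is being discussed here.

```java
import java.sql.SQLWarning;
import java.util.ArrayList;
import java.util.List;

// Sketch of reading a JDBC warning chain; the warnings here are constructed
// by hand rather than returned by any driver.
final class JdbcWarningSketch {

    /** Collects the messages of a warning chain, starting at 'first' (may be null). */
    static List<String> messages(SQLWarning first) {
        List<String> out = new ArrayList<>();
        for (SQLWarning w = first; w != null; w = w.getNextWarning()) {
            out.add(w.getMessage());
        }
        return out;
    }

    public static void main(String[] args) {
        // In a real client, 'first' would come from statement.getWarnings()
        // or resultSet.getWarnings(); both return null when there are none.
        SQLWarning first = new SQLWarning("Partition pruning was skipped");
        first.setNextWarning(new SQLWarning("Column not found; returned as NULL"));

        for (String message : messages(first)) {
            System.out.println("WARNING: " + message);
        }
    }
}
```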

On Mon, Aug 31, 2015 at 10:52 PM, Parth Chandra <pc...@maprtech.com>
wrote:

> A better idea would be to return a warning in the results and let JDBC/ODBC
> show a warning with the result data.