Posted to dev@phoenix.apache.org by Josh Elser <el...@apache.org> on 2018/08/27 22:01:42 UTC

[DISCUSS] EXPLAIN'ing what we do well (was Re: [DISCUSS] Suggestions for Phoenix from HBaseCon Asia notes)

On 8/27/18 5:03 PM, Thomas D'Silva wrote:
>> 3. Better recommendations to users to not attempt certain queries.
>>
>> We definitively know that there are certain types of queries that Phoenix
>> cannot support well (compared to optimal Phoenix use-cases). Users very
>> commonly fall into such pitfalls on their own and this leaves a bad taste
>> in their mouth (thinking that the product "stinks").
>>
>> Can we do a better job of telling the user when and why it happened? What
>> would such a user-interaction model look like? Can we supplement the "why"
>> with instructions of what to do differently (even if in the abstract)?
>>
> Providing relevant feedback before/after a query is run in general is very
> hard to do. If stats are enabled we have an estimate of how many rows/bytes
> will be scanned.
> We could have an optional feature that prevents users from running queries
> if the rows/bytes scanned are above a certain threshold. We should also
> enhance our explain plan documentation
> http://phoenix.apache.org/explainplan.html with examples of queries so
> users know what kinds of queries Phoenix handles well.

Breaking this out..

Totally agree -- this is by no means "easy". I struggle very often 
trying to express just _why_ a query that someone is running in Phoenix 
doesn't run as well as they think it should.

Centralizing on the EXPLAIN plan is good. Making sure it's
consumable/thorough is probably the lowest hanging fruit. If we can give
concrete examples of the kinds of explain plans a user might see, I
think users/admins would actually make use of them.

Throwing a random idea out there: with stats and the query plan, can we 
give a thumbs-up/thumbs-down? If we can, is that useful?
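
To make the thumbs-up/thumbs-down idea a bit more concrete, here is a rough sketch of what such a check might look like. Everything in it is hypothetical: the PlanVerdict class, the judge method and the maxRows threshold are made up for illustration and are not part of Phoenix. It only assumes that the plan text says "FULL SCAN" when a whole table is scanned and that a row estimate is available when stats are enabled.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not a Phoenix API: turns plan text plus an optional
// row estimate into a coarse verdict.
final class PlanVerdict {
    public enum Verdict { THUMBS_UP, THUMBS_DOWN, UNKNOWN }

    // planLines: the lines of EXPLAIN output.
    // estimatedRows: row estimate from stats, or null when stats are unavailable.
    // maxRows: an assumed, user-chosen threshold for "too much to scan".
    public static Verdict judge(List<String> planLines, Long estimatedRows, long maxRows) {
        boolean fullScan = false;
        for (String line : planLines) {
            if (line.toUpperCase(Locale.ROOT).contains("FULL SCAN")) {
                fullScan = true;
            }
        }
        if (estimatedRows == null) {
            // Without stats only the plan shape is known: a full scan is a
            // warning sign, anything else cannot be judged.
            return fullScan ? Verdict.THUMBS_DOWN : Verdict.UNKNOWN;
        }
        // With stats, the estimate decides; a full scan over a small table is fine.
        return estimatedRows > maxRows ? Verdict.THUMBS_DOWN : Verdict.THUMBS_UP;
    }
}
```

Whether a single verdict like this is useful is exactly the open question; the sketch mostly shows that without stats there is very little to base it on.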

Re: [DISCUSS] EXPLAIN'ing what we do well (was Re: [DISCUSS] Suggestions for Phoenix from HBaseCon Asia notes)

Posted by Thomas D'Silva <td...@salesforce.com>.

I created PHOENIX-4881 to create a guardrail config property based on the
bytes scanned.
We already have PHOENIX-1481 to improve the explain plan documentation.

On Tue, Aug 28, 2018 at 1:40 PM, James Taylor <ja...@apache.org>
wrote:

> Thomas' idea is a good one. From the EXPLAIN plan ResultSet, you can
> directly get an estimate of the number of bytes that will be scanned. Take
> a look at this [1] documentation. We need to implement PHOENIX-4735 too (so
> that things are set up well out-of-the-box). We could have a kind of
> guardrail config property that would define the max bytes allowed
> to be read and fail a query that goes over this limit. That would cover 80%
> of the issues IMHO. Other guardrail config properties could cover other
> corner cases.
>
> [1] http://phoenix.apache.org/explainplan.html

Re: [DISCUSS] EXPLAIN'ing what we do well (was Re: [DISCUSS] Suggestions for Phoenix from HBaseCon Asia notes)

Posted by James Taylor <ja...@apache.org>.

Thomas' idea is a good one. From the EXPLAIN plan ResultSet, you can
directly get an estimate of the number of bytes that will be scanned. Take
a look at this [1] documentation. We need to implement PHOENIX-4735 too (so
that things are set up well out-of-the-box). We could have a kind of
guardrail config property that would define the max bytes allowed
to be read and fail a query that goes over this limit. That would cover 80%
of the issues IMHO. Other guardrail config properties could cover other
corner cases.

[1] http://phoenix.apache.org/explainplan.html
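
As a rough illustration of the guardrail, a client-side version could look something like the sketch below. The ScanGuardrail class, its method names and the maxBytes limit are hypothetical, and the column label is passed in by the caller because the exact label (EST_BYTES_READ is the one I would expect from [1]) is an assumption here. It uses only plain JDBC and assumes the estimate is null when stats are not available.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical client-side guardrail, not an existing Phoenix feature.
final class ScanGuardrail {

    // Reads the byte estimate from the first row of an EXPLAIN ResultSet.
    // bytesColumn is supplied by the caller; its exact label is an assumption.
    // Returns null when there is no row or no estimate (e.g. stats disabled).
    public static Long estimatedBytes(ResultSet explainRs, String bytesColumn)
            throws SQLException {
        if (!explainRs.next()) {
            return null;
        }
        Object value = explainRs.getObject(bytesColumn);
        return value == null ? null : Long.valueOf(((Number) value).longValue());
    }

    // Fails the query up front when the estimate exceeds the configured limit.
    // A missing estimate is let through, since there is nothing to compare.
    public static void check(Long estimatedBytes, long maxBytes) {
        if (estimatedBytes != null && estimatedBytes > maxBytes) {
            throw new IllegalStateException("Estimated scan of " + estimatedBytes
                    + " bytes exceeds the limit of " + maxBytes + " bytes");
        }
    }
}
```

A caller would run the EXPLAIN statement first, pass the result through estimatedBytes and check, and only then run the real query. Letting queries through when no estimate exists is a design choice in this sketch; a stricter setup could reject those too.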
