You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/11/11 17:50:28 UTC

[GitHub] [druid] abhishekagarwal87 opened a new pull request #11911: Human-readable and actionable SQL error messages

abhishekagarwal87 opened a new pull request #11911:
URL: https://github.com/apache/druid/pull/11911


   <!-- Thanks for trying to help us make Apache Druid be the best it can be! Please fill out as much of the following information as is possible (where relevant, and remove it when irrelevant) to help make the intention and scope of this PR clear in order to ease review. -->
   
   <!-- Please read the doc for contribution (https://github.com/apache/druid/blob/master/CONTRIBUTING.md) before making this PR. Also, once you open a PR, please _avoid using force pushes and rebasing_ since these make it difficult for reviewers to see what you've changed in response to their reviews. See [the 'If your pull request shows conflicts with master' section](https://github.com/apache/druid/blob/master/CONTRIBUTING.md#if-your-pull-request-shows-conflicts-with-master) for more details. -->
   
   <!-- Replace XXXX with the id of the issue fixed in this PR. Remove this section if there is no corresponding issue. Don't reference the issue in the title of this pull-request. -->
   
   <!-- If you are a committer, follow the PR action item checklist for committers:
   https://github.com/apache/druid/blob/master/dev/committer-instructions.md#pr-and-issue-action-item-checklist-for-committers. -->
   
   ### Description
   
   <!-- Describe the goal of this PR, what problem are you fixing. If there is a corresponding issue (referenced above), it's not necessary to repeat the description here, however, you may choose to keep one summary sentence. -->
   
   This PR does two things
    - It adds the capability to surface missing features in SQL to users - The calcite planner will explore through multiple rules to convert a logical SQL query to a druid native query. Some rules change the shape of the query itself, optimize it and some rules are responsible for translating the query into a druid native query. These are DruidQueryRule, DruidOuterQueryRule, DruidJoinRule, DruidUnionDataSourceRule, DruidUnionRule etc. These rules will look at SQL and will do the necessary transformation. But if the rule can't transform the query, it returns back the control to the calcite planner without recording why was it not able to transform. E.g. there is a join query with a non-equal join condition. DruidJoinRule will look at the condition, see that it is not supported, and return back the control. The reason can be that a query can be planned in many different ways so if one rule can't parse it, the query may still be parseable by other rules. In this PR, we are intercepting
  these gaps and passing them back to the user if the query could not be planned at all. 
    - The said capability has been used to generate actionable errors for some common unsupported SQL features. However, not all possible errors are covered and we can keep adding more in the future. 
    
   The code changes themselves are not intrusive and minimal. I have modified the `PlannerContext` to record any planning error and then use it when generating a user-facing exception. 
   
   
   ##### Key changed/added classes in this PR
    * `DruidPlanner`
    * `Druid*Rule`
    * `DruidUnionDataSourceRel`
   
   <hr>
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist below are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   This PR has:
   - [ ] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 closed pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 closed pull request #11911:
URL: https://github.com/apache/druid/pull/11911


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-986807735


   This pull request **introduces 1 alert** when merging 8d29d9750e7d0dbb44b6fa50bfea785b62b76d28 into 590cf993c0e9e9286008978e1ab282518d8dc78c - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-3cac1e70a186f932c142b55e079e8627ce737be1)
   
   **new alerts:**
   
   * 1 for Missing space in string literal


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 closed pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 closed pull request #11911:
URL: https://github.com/apache/druid/pull/11911


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] rohangarg commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
rohangarg commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-978006713


   Agreeing to what Gian has mentioned, I think it could be helpful to also ask Julian regarding the general problem and whether he has some more context, ideas or experience (from other project integrations) regarding this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe edited a comment on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
cryptoe edited a comment on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-967079150


   IMHO We can change the messaging a bit. Rather than calling it an error, we can say "QueryDiagnosis".
   `We tried to change the query shape to an intermediate "Union/Join/......" query but could not due to "reason".`
   
   Also, I think for the first cut `setPlannerError` should append errors rather than overwrite the error. This way, we can show all the possible "error" paths to the end-user. This way we are safeguarding against leading the user down the incorrect diagnosis path. Consider the scenario where the user wrote a 1000 line SQL query :) . Debugging such a query with an incorrect error message may not be the most pleasant experience :)
   
   If we want to show the message as an error I echo Gian's thought:
   
   > But, is there a potential for bizarre errors that relate to rules or query types that were explored, but didn't work out, and appear like nonsense to the user? We'd want to make sure that every time `setPlanningError` is called, it's something so fundamental that the overall query can't be handled by a different planner path. The idea is that we want to avoid a situation where there are two paths A and B, and each has an error, and the error from A is reported but doesn't make sense to the user because it is sort of path-A-specific.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 merged pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 merged pull request #11911:
URL: https://github.com/apache/druid/pull/11911


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 closed pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 closed pull request #11911:
URL: https://github.com/apache/druid/pull/11911


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-966584311


   The technique here is cool and an interesting way around the fact that these rules can be "exploratory" and therefore they can't simply throw exceptions.
   
   But, is there a potential for bizarre errors that relate to rules or query types that were explored, but didn't work out, and appear like nonsense to the user? We'd want to make sure that every time `setPlanningError` is called, it's something so fundamental that the overall query can't be handled by a different planner path. The idea is that we want to avoid a situation where there are two paths A and B, and each has an error, and the error from A is reported but doesn't make sense to the user because it is sort of path-A-specific.
   
   In other words: we should make sure the error messages always refer to things that the user actually wrote in their query, and that we know we don't support. I suggest adding something about this to the javadocs for the method so people don't add new ones that end up being confusing.
   
   Although, that also makes me think: if these errors are going to be really unrecoverable, maybe we _should_ actually make them regular exceptions that halt planning immediately…?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-977953480


   > This is a good call, but I think showing all potential errors won't help with debugging a 1000 line SQL query anyway since the query is too big. Ideally, we should be able to provide some context for each error where they are coming from in the query. I'm not sure how to do it though, maybe we can keep some context of the parent node while traversing the query plan tree.
   
   The best mechanism in Calcite for doing what you describe is post-parsing validation. It operates at an earlier stage: on the parse tree instead of the relational operator tree. Most of Calcite's builtin validations happen at that stage, and at that stage there is information available about line numbers, etc. Also it generally assumes that anything that makes it past the validator is a valid query, so I think that's why it has really "weird" errors from the planner when the planner can't plan something.
   
   I added some validations in another patch that operate on that parse tree (although they don't use line number info): https://github.com/apache/druid/blob/61bc7edd163b447fb7ec7a645f074f1e6320f36b/sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java#L542-L577
   
   Maybe we can use a similar technique for this stuff? i.e. whenever possible, look at the parse tree and identify certain patterns we cannot handle, before handing it off to the planner.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 closed pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 closed pull request #11911:
URL: https://github.com/apache/druid/pull/11911


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
cryptoe commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-967079150


   IMHO We can change the messaging a bit. Rather than calling it an error, we can say "QueryDiagnosis".
   `We tried to change the query shape to an intermediate "Union/Join/......" query but could not due to "reason".`
   
   Also, I think for the first cut `setPlannerError` should append errors rather than overwrite the error. This way, we can show all the possible "error" paths to the end-user. This way we are safeguarding against leading the user down the incorrect diagnosis path. Consider the scenario where the user wrote a 1000 line SQL query :) . Debugging such a query with an incorrect error message may not be the most pleasant experience :)
   
   If we want to show the error I echo Gian's thought:
   
   > But, is there a potential for bizarre errors that relate to rules or query types that were explored, but didn't work out, and appear like nonsense to the user? We'd want to make sure that every time `setPlanningError` is called, it's something so fundamental that the overall query can't be handled by a different planner path. The idea is that we want to avoid a situation where there are two paths A and B, and each has an error, and the error from A is reported but doesn't make sense to the user because it is sort of path-A-specific.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-967036681


   > The technique here is cool and an interesting way around the fact that these rules can be "exploratory" and therefore they can't simply throw exceptions.
   > 
   > But, is there a potential for bizarre errors that relate to rules or query types that were explored, but didn't work out, and appear like nonsense to the user? We'd want to make sure that every time `setPlanningError` is called, it's something so fundamental that the overall query can't be handled by a different planner path. The idea is that we want to avoid a situation where there are two paths A and B, and each has an error, and the error from A is reported but doesn't make sense to the user because it is sort of path-A-specific.
   
   Yes. It is possible. Here is one such example query
   ```
   SELECT dim2, COUNT(distinct dim1), COUNT(distinct dim2) FROM druid.foo GROUP BY dim2
   ```
   This query should ideally throw an error that suggests that multiple exact distinct count are not supported unless you set a custom flag. However, the error that the user gets is below
   ```
   Only equal conditions are supported in joins
   ```
   This would, of course, look bizarre to end users since they have not put a join in the query and it wouldn't be clear to them that a query with multiple exact distinct counts could be planned using a join. I am thinking of some ways of how we can make this experience better
   - Be smarter about setting these errors and improve on a case-by-case basis. E.g. in the above example, I might check if the join condition is of type `is-distinct-from` and if so, I add an additional note to the error message such as below
   ```
   if you are using multiple exact distinct counts, please make sure to turn flag X on.
   ```
   - Use error codes alongside error messages and document any such gotchas in our user documentation. So when user searches for an error code, they can get more details about the error message and possible query shapes that could fail with that error. 
   
   None of these are perfect solutions but will minimize the confusion to a good extent. Your suggestion about being conservative about setting the planning error makes sense so long we cover most of the common SQL errors that our users run into. 
   > 
   > In other words: we should make sure the error messages always refer to things that the user actually wrote in their query, and that we know we don't support. I suggest adding something about this to the javadocs for the method so people don't add new ones that end up being confusing.
   
   Yes. SGTM. 
    
   > Although, that also makes me think: if these errors are going to be really unrecoverable, maybe we _should_ actually make them regular exceptions that halt planning immediately…?
   
   I thought about it but the risk of a false positive (falsely declaring a query as unsupported) becomes very high with that approach. In the current scheme, setting the planning error is a no-op if the query does get planned eventually. If we are halting the planning immediately, we may end up killing genuine queries. The current approach, I feel, is safer in that way. And in the future, a developer, trying to add an error message, will not have to struggle with the dilemma of breaking some genuine query in exchange for an informative error message.  
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 edited a comment on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 edited a comment on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-967036681


   > The technique here is cool and an interesting way around the fact that these rules can be "exploratory" and therefore they can't simply throw exceptions.
   > 
   > But, is there a potential for bizarre errors that relate to rules or query types that were explored, but didn't work out, and appear like nonsense to the user? We'd want to make sure that every time `setPlanningError` is called, it's something so fundamental that the overall query can't be handled by a different planner path. The idea is that we want to avoid a situation where there are two paths A and B, and each has an error, and the error from A is reported but doesn't make sense to the user because it is sort of path-A-specific.
   
   Yes. It is possible. Here is one such example query
   ```
   SELECT dim2, COUNT(distinct dim1), COUNT(distinct dim2) FROM druid.foo GROUP BY dim2
   ```
   This query should ideally throw an error that suggests that multiple exact distinct count are not supported unless you set a custom flag. However, the error that the user gets is below
   ```
   Only equal conditions are supported in joins
   ```
   This would, of course, look bizarre to end users since they have not put a join in the query and it wouldn't be clear to them that a query with multiple exact distinct counts could be planned using a join. I am thinking of some ways of how we can make this experience better
   - Be smarter about setting these errors and improve on a case-by-case basis. E.g. in the above example, I might check if the join condition is of type `is-distinct-from` and if so, I add an additional note to the error message such as below
   ```
   if you are using multiple exact distinct counts, please make sure to turn flag X on.
   ```
   - Use error codes alongside error messages and document any such gotchas in our user documentation. So when user searches for an error code, they can get more details about the error message and possible query shapes that could fail with that error. 
   - Re-word the error such that the user knows it is a possible error but may not be the actual reason. E.g. an error such as below
   ```
   Possible error: Only equal conditions are supported in joins
   ```
   
   None of these are perfect solutions but will minimize the confusion to a good extent. Your suggestion about being conservative about setting the planning error makes sense so long we cover most of the common SQL errors that our users run into. any other suggestions will be helpful. 
   > 
   > In other words: we should make sure the error messages always refer to things that the user actually wrote in their query, and that we know we don't support. I suggest adding something about this to the javadocs for the method so people don't add new ones that end up being confusing.
   
   Yes. SGTM. 
    
   > Although, that also makes me think: if these errors are going to be really unrecoverable, maybe we _should_ actually make them regular exceptions that halt planning immediately…?
   
   I thought about it but the risk of a false positive (falsely declaring a query as unsupported) becomes very high with that approach. In the current scheme, setting the planning error is a no-op if the query does get planned eventually. If we are halting the planning immediately, we may end up killing genuine queries. The current approach, I feel, is safer in that way. And in the future, a developer, trying to add an error message, will not have to struggle with the dilemma of breaking some genuine query in exchange for an informative error message.  
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 edited a comment on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 edited a comment on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-967036681


   > The technique here is cool and an interesting way around the fact that these rules can be "exploratory" and therefore they can't simply throw exceptions.
   > 
   > But, is there a potential for bizarre errors that relate to rules or query types that were explored, but didn't work out, and appear like nonsense to the user? We'd want to make sure that every time `setPlanningError` is called, it's something so fundamental that the overall query can't be handled by a different planner path. The idea is that we want to avoid a situation where there are two paths A and B, and each has an error, and the error from A is reported but doesn't make sense to the user because it is sort of path-A-specific.
   
   Yes. It is possible. Here is one such example query
   ```
   SELECT dim2, COUNT(distinct dim1), COUNT(distinct dim2) FROM druid.foo GROUP BY dim2
   ```
   This query should ideally throw an error that suggests that multiple exact distinct count are not supported unless you set a custom flag. However, the error that the user gets is below
   ```
   Only equal conditions are supported in joins
   ```
   This would, of course, look bizarre to end users since they have not put a join in the query and it wouldn't be clear to them that a query with multiple exact distinct counts could be planned using a join. I am thinking of some ways of how we can make this experience better
   - Be smarter about setting these errors and improve on a case-by-case basis. E.g. in the above example, I might check if the join condition is of type `is-distinct-from` and if so, I add an additional note to the error message such as below
   ```
   if you are using multiple exact distinct counts, please make sure to turn flag X on.
   ```
   - Use error codes alongside error messages and document any such gotchas in our user documentation. So when user searches for an error code, they can get more details about the error message and possible query shapes that could fail with that error. 
   
   None of these are perfect solutions but will minimize the confusion to a good extent. Your suggestion about being conservative about setting the planning error makes sense so long we cover most of the common SQL errors that our users run into. any other suggestions will be helpful. 
   > 
   > In other words: we should make sure the error messages always refer to things that the user actually wrote in their query, and that we know we don't support. I suggest adding something about this to the javadocs for the method so people don't add new ones that end up being confusing.
   
   Yes. SGTM. 
    
   > Although, that also makes me think: if these errors are going to be really unrecoverable, maybe we _should_ actually make them regular exceptions that halt planning immediately…?
   
   I thought about it but the risk of a false positive (falsely declaring a query as unsupported) becomes very high with that approach. In the current scheme, setting the planning error is a no-op if the query does get planned eventually. If we are halting the planning immediately, we may end up killing genuine queries. The current approach, I feel, is safer in that way. And in the future, a developer, trying to add an error message, will not have to struggle with the dilemma of breaking some genuine query in exchange for an informative error message.  
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #11911:
URL: https://github.com/apache/druid/pull/11911#discussion_r762972511



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
##########
@@ -209,23 +212,27 @@ public PlannerResult plan() throws SqlParseException, ValidationException, RelCo
         // Not a CannotPlanningException, rethrow without trying with bindable
         throw e;
       }
+      if (parsed.getInsertNode() != null) {
+        // Cannot INSERT with BINDABLE.
+        throw e;
+      }
       // Try again with BINDABLE convention. Used for querying Values and metadata tables.
       try {
-        return planWithBindableConvention(explain, root);
+        return planWithBindableConvention(rootQueryRel, parsed.getExplainNode());
       }
       catch (Exception e2) {
         e.addSuppressed(e2);
-        log.noStackTrace().warn(e, "Failed to plan the query '%s'", sql);
+        log.noStackTrace().warn(e, "Failed to plan the query '%s'", plannerContext.getSql());

Review comment:
       I made the changes in `QueryLifecycle` to be consistent with logging here. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 closed pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 closed pull request #11911:
URL: https://github.com/apache/druid/pull/11911


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-984870933


   Thanks @jihoonson 
   Apart from addressing your comments, I am also going to re-word the errors a bit
   e.g. "Possible error: only equal conditions are supported in join" --> "Possible error: SQL query requires a join with condition that is not supported. Currently, only equal conditions are supported" 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on pull request #11911:
URL: https://github.com/apache/druid/pull/11911#issuecomment-978121310


   Surfacing the error in post-parsing validation can be done in some places e.g. in MaxSQLAggregator, druid can tell calcite that string operands are not supported. This can be done by returning a custom operator with supported types in `calciteFunction()` instead of returning a standard operator from the calcite library.  I do not know how would we do that at other places such as an unsupported join condition or union between incompatible data sources. 
   
   It is good to see these ideas coming in for how to surface these errors in a very useful. There are two major areas of improvement I can see based on the discussion so far
    - one is the new error framework itself that is added in this PR. Here the debate is about showcasing one error or multiple errors. There is also the question of do we need this framework at all. I think we can try to minimize using it but in some places, plannerContext is the only way to surface the error. 
    - second is improving each error or scenario itself. By improving, I mean making it precise and as informative as possible. I believe this will be an ongoing exercise. In this PR, I replaced `ISE/IAE` with a custom exception to surface many errors. Going one step ahead and using calcite to surface these errors is even better as that is possible in some places. And we can continue to improve these errors and make them more precise in future improvements. I do not think it will be possible to do all of that in this PR itself. :) 
   
   So here are the open questions 
    - retaining last planning error or last X number of planning errors - I am in the camp of former since I did want to see some real-world usage first before even trying to do the latter. since these are errors, we can always change them later on. That being said, I am not very strongly opposed to the idea of retaining multiple planning errors either. 
    - HTTP response code for an unplannable query - Right now, we are returning 500 but should we return 400 instead? In fact, we do return 400 as well for some planning errors. Should we treat CannotPlanException, not as an internal server exception but as a bad query request?
    - Delegating the operand validation to calcite in some of the aggregation functions - This I can do in this PR itself. 
    - scope of the initial PR - I have attempted to clarify that in the first few paragraphs. Happy to hear any other thoughts on it. 
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #11911:
URL: https://github.com/apache/druid/pull/11911#discussion_r755579773



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/UnsupportedQueryFeatureException.java
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.planner;
+
+import org.apache.calcite.plan.RelOptPlanner;
+import org.apache.druid.java.util.common.StringUtils;
+
+/**
+ * This class is different from {@link org.apache.calcite.plan.RelOptPlanner.CannotPlanException} in that the error
+ * messages are user-friendly unlike it's parent class. This exception class be used instead of

Review comment:
       ```suggestion
    * messages are user-friendly unlike its parent class. This exception class should be used instead of
   ```

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
##########
@@ -185,14 +187,29 @@ public PlannerResult plan(final String sql) throws SqlParseException, Validation
     try {
       return planWithDruidConvention(explain, root);
     }
-    catch (RelOptPlanner.CannotPlanException e) {
+    catch (Exception e) {
+      Throwable cannotPlanException = Throwables.getCauseOfType(e, RelOptPlanner.CannotPlanException.class);
+      if (null == cannotPlanException) {
+        // Not a CannotPlanningException, rethrow without trying with bindable
+        throw e;
+      }
       // Try again with BINDABLE convention. Used for querying Values and metadata tables.
       try {
         return planWithBindableConvention(explain, root);
       }
       catch (Exception e2) {
         e.addSuppressed(e2);
-        throw e;
+        log.noStackTrace().warn(e, "Failed to plan the query '%s'", sql);

Review comment:
       Should it omit stack trace only when `QueryContexts.isDebug() = true`?

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/UnsupportedQueryFeatureException.java
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.planner;
+
+import org.apache.calcite.plan.RelOptPlanner;
+import org.apache.druid.java.util.common.StringUtils;
+
+/**
+ * This class is different from {@link org.apache.calcite.plan.RelOptPlanner.CannotPlanException} in that the error
+ * messages are user-friendly unlike it's parent class. This exception class be used instead of
+ * {@link org.apache.druid.java.util.common.ISE} or {@link org.apache.druid.java.util.common.IAE} when processing is
+ * to be halted during planning. Similarly, Druid planner can catch these exception and know that the error

Review comment:
       ```suggestion
    * to be halted during planning. Similarly, Druid planner can catch this exception and know that the error
   ```

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/UnsupportedQueryFeatureException.java
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.planner;
+
+import org.apache.calcite.plan.RelOptPlanner;
+import org.apache.druid.java.util.common.StringUtils;
+
+/**
+ * This class is different from {@link org.apache.calcite.plan.RelOptPlanner.CannotPlanException} in that the error
+ * messages are user-friendly unlike it's parent class. This exception class be used instead of
+ * {@link org.apache.druid.java.util.common.ISE} or {@link org.apache.druid.java.util.common.IAE} when processing is
+ * to be halted during planning. Similarly, Druid planner can catch these exception and know that the error
+ * can be directly exposed to end-user.

Review comment:
       ```suggestion
    * can be directly exposed to end-users.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #11911: Human-readable and actionable SQL error messages

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #11911:
URL: https://github.com/apache/druid/pull/11911#discussion_r759744852



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/UnsupportedSQLQueryException.java
##########
@@ -24,14 +24,14 @@
 
 /**
  * This class is different from {@link org.apache.calcite.plan.RelOptPlanner.CannotPlanException} in that the error
- * messages are user-friendly unlike it's parent class. This exception class be used instead of
+ * messages are user-friendly unlike its parent class. This exception class should be used instead of
  * {@link org.apache.druid.java.util.common.ISE} or {@link org.apache.druid.java.util.common.IAE} when processing is
- * to be halted during planning. Similarly, Druid planner can catch these exception and know that the error
- * can be directly exposed to end-user.
+ * to be halted during planning. Similarly, Druid planner can catch this exception and know that the error
+ * can be directly exposed to end-users.
  */
-public class UnsupportedQueryFeatureException extends RelOptPlanner.CannotPlanException
+public class UnsupportedSQLQueryException extends RelOptPlanner.CannotPlanException

Review comment:
       :+1: 

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
##########
@@ -209,23 +212,27 @@ public PlannerResult plan() throws SqlParseException, ValidationException, RelCo
         // Not a CannotPlanningException, rethrow without trying with bindable
         throw e;
       }
+      if (parsed.getInsertNode() != null) {
+        // Cannot INSERT with BINDABLE.
+        throw e;
+      }
       // Try again with BINDABLE convention. Used for querying Values and metadata tables.
       try {
-        return planWithBindableConvention(explain, root);
+        return planWithBindableConvention(rootQueryRel, parsed.getExplainNode());
       }
       catch (Exception e2) {
         e.addSuppressed(e2);
-        log.noStackTrace().warn(e, "Failed to plan the query '%s'", sql);
+        log.noStackTrace().warn(e, "Failed to plan the query '%s'", plannerContext.getSql());

Review comment:
       https://github.com/apache/druid/pull/11911#discussion_r755584263

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
##########
@@ -209,23 +212,27 @@ public PlannerResult plan() throws SqlParseException, ValidationException, RelCo
         // Not a CannotPlanningException, rethrow without trying with bindable
         throw e;
       }
+      if (parsed.getInsertNode() != null) {
+        // Cannot INSERT with BINDABLE.
+        throw e;
+      }
       // Try again with BINDABLE convention. Used for querying Values and metadata tables.
       try {
-        return planWithBindableConvention(explain, root);
+        return planWithBindableConvention(rootQueryRel, parsed.getExplainNode());
       }
       catch (Exception e2) {
         e.addSuppressed(e2);
-        log.noStackTrace().warn(e, "Failed to plan the query '%s'", sql);
+        log.noStackTrace().warn(e, "Failed to plan the query '%s'", plannerContext.getSql());

Review comment:
       Also, warning seems reasonable to me as this is not a system failure but is inconsistent with https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/QueryLifecycle.java#L323




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org